Merge remote-tracking branch 'upstream/main' into astcons
JelleZijlstra committed Feb 27, 2024
2 parents ff0cc67 + 6087315 commit 54cf55f
Showing 84 changed files with 1,665 additions and 388 deletions.
2 changes: 1 addition & 1 deletion .gitattributes
@@ -95,7 +95,7 @@ Programs/test_frozenmain.h generated
Python/Python-ast.c generated
Python/executor_cases.c.h generated
Python/generated_cases.c.h generated
Python/tier2_redundancy_eliminator_cases.c.h generated
Python/optimizer_cases.c.h generated
Python/opcode_targets.h generated
Python/stdlib_module_names.h generated
Tools/peg_generator/pegen/grammar_parser.py generated
6 changes: 5 additions & 1 deletion .github/CODEOWNERS
@@ -38,7 +38,7 @@ Python/ast_opt.c @isidentical
Python/bytecodes.c @markshannon @gvanrossum
Python/optimizer*.c @markshannon @gvanrossum
Python/optimizer_analysis.c @Fidget-Spinner
Python/tier2_redundancy_eliminator_bytecodes.c @Fidget-Spinner
Python/optimizer_bytecodes.c @Fidget-Spinner
Lib/test/test_patma.py @brandtbucher
Lib/test/test_type_*.py @JelleZijlstra
Lib/test/test_capi/test_misc.py @markshannon @gvanrossum
@@ -249,3 +249,7 @@ Lib/test/test_interpreters/ @ericsnowcurrently
# SBOM
/Misc/sbom.spdx.json @sethmlarson
/Tools/build/generate_sbom.py @sethmlarson

# Config Parser
Lib/configparser.py @jaraco
Lib/test/test_configparser.py @jaraco
4 changes: 2 additions & 2 deletions .github/workflows/jit.yml
@@ -5,13 +5,13 @@ on:
- '**jit**'
- 'Python/bytecodes.c'
- 'Python/optimizer*.c'
- 'Python/tier2_redundancy_eliminator_bytecodes.c'
- 'Python/optimizer_bytecodes.c'
push:
paths:
- '**jit**'
- 'Python/bytecodes.c'
- 'Python/optimizer*.c'
- 'Python/tier2_redundancy_eliminator_bytecodes.c'
- 'Python/optimizer_bytecodes.c'
workflow_dispatch:

concurrency:
1 change: 1 addition & 0 deletions .gitignore
@@ -69,6 +69,7 @@ Lib/test/data/*
/_bootstrap_python
/Makefile
/Makefile.pre
iOS/Resources/Info.plist
Mac/Makefile
Mac/PythonLauncher/Info.plist
Mac/PythonLauncher/Makefile
4 changes: 0 additions & 4 deletions Doc/library/ctypes.rst
@@ -1117,10 +1117,6 @@ api::
>>> print(hex(version.value))
0x30c00a0

If the interpreter would have been started with :option:`-O`, the sample would
have printed ``c_long(1)``, or ``c_long(2)`` if :option:`-OO` would have been
specified.

An extended example which also demonstrates the use of pointers accesses the
:c:data:`PyImport_FrozenModules` pointer exported by Python.

159 changes: 103 additions & 56 deletions Doc/library/pathlib.rst
@@ -572,6 +572,9 @@ Pure paths provide the following methods and properties:
>>> PurePath('/a/b/c.py').full_match('**/*.py')
True

.. seealso::
:ref:`pathlib-pattern-language` documentation.

As with other methods, case-sensitivity follows platform defaults::

>>> PurePosixPath('b.py').full_match('*.PY')
@@ -991,25 +994,15 @@ call fails (for example because the path doesn't exist).
[PosixPath('pathlib.py'), PosixPath('setup.py'), PosixPath('test_pathlib.py')]
>>> sorted(Path('.').glob('*/*.py'))
[PosixPath('docs/conf.py')]

Patterns are the same as for :mod:`fnmatch`, with the addition of "``**``"
which means "this directory and all subdirectories, recursively". In other
words, it enables recursive globbing::

>>> sorted(Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]

.. note::
Using the "``**``" pattern in large directory trees may consume
an inordinate amount of time.

.. tip::
Set *follow_symlinks* to ``True`` or ``False`` to improve performance
of recursive globbing.
.. seealso::
:ref:`pathlib-pattern-language` documentation.

This method calls :meth:`Path.is_dir` on the top-level directory and
propagates any :exc:`OSError` exception that is raised. Subsequent
@@ -1025,11 +1018,11 @@ call fails (for example because the path doesn't exist).
wildcards. Set *follow_symlinks* to ``True`` to always follow symlinks, or
``False`` to treat all symlinks as files.

.. audit-event:: pathlib.Path.glob self,pattern pathlib.Path.glob
.. tip::
Set *follow_symlinks* to ``True`` or ``False`` to improve performance
of recursive globbing.

.. versionchanged:: 3.11
Return only directories if *pattern* ends with a pathname components
separator (:data:`~os.sep` or :data:`~os.altsep`).
.. audit-event:: pathlib.Path.glob self,pattern pathlib.Path.glob

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.
@@ -1038,12 +1031,29 @@ call fails (for example because the path doesn't exist).
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
Return files and directories if *pattern* ends with "``**``". In
previous versions, only directories were returned.
The *pattern* parameter accepts a :term:`path-like object`.


.. method:: Path.rglob(pattern, *, case_sensitive=None, follow_symlinks=None)

Glob the given relative *pattern* recursively. This is like calling
:func:`Path.glob` with "``**/``" added in front of the *pattern*.

.. seealso::
:ref:`pathlib-pattern-language` and :meth:`Path.glob` documentation.

.. audit-event:: pathlib.Path.rglob self,pattern pathlib.Path.rglob

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.

.. versionchanged:: 3.13
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
The *pattern* parameter accepts a :term:`path-like object`.


.. method:: Path.group(*, follow_symlinks=True)

Return the name of the group owning the file. :exc:`KeyError` is raised
@@ -1471,44 +1481,6 @@ call fails (for example because the path doesn't exist).
strict mode, and no exception is raised in non-strict mode. In previous
versions, :exc:`RuntimeError` is raised no matter the value of *strict*.

.. method:: Path.rglob(pattern, *, case_sensitive=None, follow_symlinks=None)

Glob the given relative *pattern* recursively. This is like calling
:func:`Path.glob` with "``**/``" added in front of the *pattern*, where
*patterns* are the same as for :mod:`fnmatch`::

>>> sorted(Path().rglob("*.py"))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]

By default, or when the *case_sensitive* keyword-only argument is set to
``None``, this method matches paths using platform-specific casing rules:
typically, case-sensitive on POSIX, and case-insensitive on Windows.
Set *case_sensitive* to ``True`` or ``False`` to override this behaviour.

By default, or when the *follow_symlinks* keyword-only argument is set to
``None``, this method follows symlinks except when expanding "``**``"
wildcards. Set *follow_symlinks* to ``True`` to always follow symlinks, or
``False`` to treat all symlinks as files.

.. audit-event:: pathlib.Path.rglob self,pattern pathlib.Path.rglob

.. versionchanged:: 3.11
Return only directories if *pattern* ends with a pathname components
separator (:data:`~os.sep` or :data:`~os.altsep`).

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.

.. versionchanged:: 3.13
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
The *pattern* parameter accepts a :term:`path-like object`.

.. method:: Path.rmdir()

Remove this directory. The directory must be empty.
@@ -1639,6 +1611,81 @@ call fails (for example because the path doesn't exist).
.. versionchanged:: 3.10
The *newline* parameter was added.


.. _pathlib-pattern-language:

Pattern language
----------------

The following wildcards are supported in patterns for
:meth:`~PurePath.full_match`, :meth:`~Path.glob` and :meth:`~Path.rglob`:

``**`` (entire segment)
Matches any number of file or directory segments, including zero.
``*`` (entire segment)
Matches one file or directory segment.
``*`` (part of a segment)
Matches any number of non-separator characters, including zero.
``?``
Matches one non-separator character.
``[seq]``
Matches one character in *seq*.
``[!seq]``
Matches one character not in *seq*.

For a literal match, wrap the meta-characters in brackets.
For example, ``"[?]"`` matches the character ``"?"``.

The "``**``" wildcard enables recursive globbing. A few examples:

========================= ===========================================
Pattern Meaning
========================= ===========================================
"``**/*``" Any path with at least one segment.
"``**/*.py``" Any path with a final segment ending "``.py``".
"``assets/**``" Any path starting with "``assets/``".
"``assets/**/*``" Any path starting with "``assets/``", excluding "``assets/``" itself.
========================= ===========================================

.. note::
Globbing with the "``**``" wildcard visits every directory in the tree.
Large directory trees may take a long time to search.

.. versionchanged:: 3.13
Globbing with a pattern that ends with "``**``" returns both files and
directories. In previous versions, only directories were returned.

In :meth:`Path.glob` and :meth:`~Path.rglob`, a trailing slash may be added to
the pattern to match only directories.

.. versionchanged:: 3.11
Globbing with a pattern that ends with a pathname components separator
(:data:`~os.sep` or :data:`~os.altsep`) returns only directories.
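
The wildcards described above can be tried against a throwaway directory tree. The following is a minimal sketch, not part of the commit: the file names and the helpers ``build_tree()`` and ``matches()`` are hypothetical and exist only for this illustration.

```python
import tempfile
from pathlib import Path

def build_tree(root):
    """Create a small hypothetical tree of empty files under root."""
    for rel in ("setup.py", "docs/conf.py", "docs/index.rst", "build/lib/mod.py"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

def matches(root, pattern):
    """Return sorted POSIX-style paths, relative to root, matched by pattern."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_tree(root)
    top_level = matches(root, "*.py")       # "*" as part of a single segment
    one_deep = matches(root, "*/*.py")      # "*" as an entire segment
    recursive = matches(root, "**/*.py")    # "**" spans zero or more segments
    not_c = matches(root, "docs/[!c]*")     # "[!seq]" excludes one character
    # rglob() behaves like glob() with "**/" prepended to the pattern.
    via_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))
```

In this tree, ``"**/*.py"`` picks up ``setup.py`` as well as the nested files, because ``**`` also matches zero segments.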


Comparison to the :mod:`glob` module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The patterns accepted and results generated by :meth:`Path.glob` and
:meth:`Path.rglob` differ slightly from those by the :mod:`glob` module:

1. Files beginning with a dot are not special in pathlib. This is
like passing ``include_hidden=True`` to :func:`glob.glob`.
2. "``**``" pattern components are always recursive in pathlib. This is like
passing ``recursive=True`` to :func:`glob.glob`.
3. "``**``" pattern components do not follow symlinks by default in pathlib.
This behaviour has no equivalent in :func:`glob.glob`, but you can pass
``follow_symlinks=True`` to :meth:`Path.glob` for compatible behaviour.
4. Like all :class:`PurePath` and :class:`Path` objects, the values returned
from :meth:`Path.glob` and :meth:`Path.rglob` don't include trailing
slashes.
5. The values returned from pathlib's ``path.glob()`` and ``path.rglob()``
include the *path* as a prefix, unlike the results of
``glob.glob(root_dir=path)``.
6. ``bytes``-based paths and :ref:`paths relative to directory descriptors
<dir_fd>` are not supported by pathlib.


Correspondence to tools in the :mod:`os` module
-----------------------------------------------

3 changes: 2 additions & 1 deletion Doc/library/random.rst
@@ -301,7 +301,8 @@ be found in any statistics text.
``a <= b`` and ``b <= N <= a`` for ``b < a``.

The end-point value ``b`` may or may not be included in the range
depending on floating-point rounding in the equation ``a + (b-a) * random()``.
depending on floating-point rounding in the expression
``a + (b-a) * random()``.
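
The rounding effect can be reproduced without any randomness. The sketch below assumes, as in CPython, that ``random()`` returns multiples of ``2**-53`` in the half-open range ``[0.0, 1.0)``, so its largest possible value is ``1 - 2**-53``. The names ``largest`` and ``uniform_expr()`` are introduced only for this illustration.

```python
# Largest value random() can return, assuming 53-bit floats in [0.0, 1.0).
largest = 1.0 - 2.0 ** -53

def uniform_expr(a, b, r):
    """Evaluate a + (b-a) * r for a given random() value r."""
    return a + (b - a) * r

# With a=0.0 and b=1.0 the result stays strictly below b.
below = uniform_expr(0.0, 1.0, largest)

# With a=1.0 and b=2.0 the exact sum 2 - 2**-53 is not representable
# as a float and rounds to 2.0, so the end point b is reached.
at_end = uniform_expr(1.0, 2.0, largest)
```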


.. function:: triangular(low, high, mode)
89 changes: 49 additions & 40 deletions Doc/library/statistics.rst
@@ -76,6 +76,7 @@ or sample.
:func:`fmean` Fast, floating point arithmetic mean, with optional weighting.
:func:`geometric_mean` Geometric mean of data.
:func:`harmonic_mean` Harmonic mean of data.
:func:`kde` Estimate the probability density distribution of the data.
:func:`median` Median (middle value) of data.
:func:`median_low` Low median of data.
:func:`median_high` High median of data.
@@ -259,6 +260,54 @@ However, for reading convenience, most of the examples show sorted sequences.
.. versionchanged:: 3.10
Added support for *weights*.


.. function:: kde(data, h, kernel='normal')

`Kernel Density Estimation (KDE)
<https://www.itm-conferences.org/articles/itmconf/pdf/2018/08/itmconf_sam2018_00037.pdf>`_:
Create a continuous probability density function from discrete samples.

The basic idea is to smooth the data using `a kernel function
<https://en.wikipedia.org/wiki/Kernel_(statistics)>`_
to help draw inferences about a population from a sample.

The degree of smoothing is controlled by the scaling parameter *h*
which is called the bandwidth. Smaller values emphasize local
features while larger values give smoother results.

The *kernel* determines the relative weights of the sample data
points. Generally, the choice of kernel shape does not matter
as much as the more influential bandwidth smoothing parameter.

Kernels that give some weight to every sample point include
*normal* or *gauss*, *logistic*, and *sigmoid*.

Kernels that only give weight to sample points within the bandwidth
include *rectangular* or *uniform*, *triangular*, *parabolic* or
*epanechnikov*, *quartic* or *biweight*, *triweight*, and *cosine*.

A :exc:`StatisticsError` will be raised if the *data* sequence is empty.

`Wikipedia has an example
<https://en.wikipedia.org/wiki/Kernel_density_estimation#Example>`_
where we can use :func:`kde` to generate and plot a probability
density function estimated from a small sample:

.. doctest::

>>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
>>> f_hat = kde(sample, h=1.5)
>>> xarr = [i/100 for i in range(-750, 1100)]
>>> yarr = [f_hat(x) for x in xarr]

The points in ``xarr`` and ``yarr`` can be used to make a PDF plot:

.. image:: kde_example.png
:alt: Scatter plot of the estimated probability density function.

.. versionadded:: 3.13
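
To make the roles of the bandwidth and the kernel concrete, here is a rough sketch of the underlying computation. It is an illustration only and not the standard library implementation: ``kde_sketch()`` is a hypothetical helper, it supports just two of the kernels named above, and it raises :exc:`ValueError` rather than :exc:`StatisticsError`.

```python
from math import exp, pi, sqrt

def kde_sketch(data, h, kernel="normal"):
    """Hypothetical illustration of kernel density estimation."""
    if not data:
        raise ValueError("data must not be empty")
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    kernels = {
        # Gives some weight to every sample point.
        "normal": lambda t: exp(-t * t / 2) / sqrt(2 * pi),
        # Gives weight only to sample points within one bandwidth.
        "rectangular": lambda t: 0.5 if abs(t) <= 1.0 else 0.0,
    }
    k = kernels[kernel]
    n = len(data)

    def pdf(x):
        # Average of the kernels centred on each sample, rescaled by h.
        return sum(k((x - x_i) / h) for x_i in data) / (n * h)

    return pdf

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
smooth = kde_sketch(sample, h=1.5)
boxy = kde_sketch(sample, h=1.5, kernel="rectangular")
```

Far from every sample point, ``boxy`` evaluates to exactly zero while ``smooth`` stays positive, which is the distinction drawn between the two groups of kernels.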


.. function:: median(data)

Return the median (middle value) of numeric data, using the common "mean of
@@ -1095,46 +1144,6 @@ The final prediction goes to the largest posterior. This is known as the
'female'


Kernel density estimation
*************************

It is possible to estimate a continuous probability density function
from a fixed number of discrete samples.

The basic idea is to smooth the data using `a kernel function such as a
normal distribution, triangular distribution, or uniform distribution
<https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use>`_.
The degree of smoothing is controlled by a scaling parameter, ``h``,
which is called the *bandwidth*.

.. testcode::

def kde_normal(sample, h):
"Create a continuous probability density function from a sample."
# Smooth the sample with a normal distribution kernel scaled by h.
kernel_h = NormalDist(0.0, h).pdf
n = len(sample)
def pdf(x):
return sum(kernel_h(x - x_i) for x_i in sample) / n
return pdf

`Wikipedia has an example
<https://en.wikipedia.org/wiki/Kernel_density_estimation#Example>`_
where we can use the ``kde_normal()`` recipe to generate and plot
a probability density function estimated from a small sample:

.. doctest::

>>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
>>> f_hat = kde_normal(sample, h=1.5)
>>> xarr = [i/100 for i in range(-750, 1100)]
>>> yarr = [f_hat(x) for x in xarr]

The points in ``xarr`` and ``yarr`` can be used to make a PDF plot:

.. image:: kde_example.png
:alt: Scatter plot of the estimated probability density function.

..
# This modelines must appear within the last ten lines of the file.
kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;