2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -1,5 +1,5 @@
{
"image": "ghcr.io/python/devcontainer:2025.05.29.15334414373",
"image": "ghcr.io/python/devcontainer:latest",
"onCreateCommand": [
// Install common tooling.
"dnf",
84 changes: 74 additions & 10 deletions Doc/library/locale.rst
@@ -34,12 +34,17 @@ The :mod:`locale` module defines the following exception and functions:

If *locale* is given and not ``None``, :func:`setlocale` modifies the locale
setting for the *category*. The available categories are listed in the data
description below. *locale* may be a string, or an iterable of two strings
(language code and encoding). If it's an iterable, it's converted to a locale
name using the locale aliasing engine. An empty string specifies the user's
description below. *locale* may be a :ref:`string <locale_name>`, or a pair,
language code and encoding. An empty string specifies the user's
default settings. If the modification of the locale fails, the exception
:exc:`Error` is raised. If successful, the new locale setting is returned.

If *locale* is a pair, it is converted to a locale name using
the locale aliasing engine.
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding can be ``None``.

If *locale* is omitted or ``None``, the current setting for *category* is
returned.

@@ -345,22 +350,26 @@ The :mod:`locale` module defines the following exception and functions:
``'LANG'``. The GNU gettext search path contains ``'LC_ALL'``,
``'LC_CTYPE'``, ``'LANG'`` and ``'LANGUAGE'``, in that order.

Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
*language code* and *encoding* may be ``None`` if their values cannot be
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding may be ``None`` if their values cannot be
determined.
The "C" locale is represented as ``(None, None)``.

.. deprecated-removed:: 3.11 3.15


.. function:: getlocale(category=LC_CTYPE)

Returns the current setting for the given locale category as sequence containing
*language code*, *encoding*. *category* may be one of the :const:`!LC_\*` values
except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.
Returns the current setting for the given locale category as a tuple containing
the language code and encoding. *category* may be one of the :const:`!LC_\*`
values except :const:`LC_ALL`. It defaults to :const:`LC_CTYPE`.

Except for the code ``'C'``, the language code corresponds to :rfc:`1766`.
*language code* and *encoding* may be ``None`` if their values cannot be
The language code has the same format as a :ref:`locale name <locale_name>`,
but without encoding and ``@``-modifier.
The language code and encoding may be ``None`` if their values cannot be
determined.
The "C" locale is represented as ``(None, None)``.
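
As a hedged illustration of the last point, the sketch below switches
:const:`LC_CTYPE` to the "C" locale, reads it back with :func:`getlocale`,
and then restores the previous setting. ``c_locale_setting`` is a hypothetical
helper name used only for this example; it is not part of the module.

```python
import locale


def c_locale_setting():
    """Return what getlocale() reports while LC_CTYPE is set to "C"."""
    # Calling setlocale() without a locale argument only queries the setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, "C")
        return locale.getlocale(locale.LC_CTYPE)
    finally:
        # Restore whatever was active before, so the caller is unaffected.
        locale.setlocale(locale.LC_CTYPE, saved)


language_code, encoding = c_locale_setting()
print(language_code, encoding)
```

Both elements of the returned pair are expected to be ``None`` here, which is
why callers should not assume that the language code is always a string.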


.. function:: getpreferredencoding(do_setlocale=True)
@@ -615,6 +624,61 @@ whose high bit is set (i.e., non-ASCII bytes) are never converted or considered
part of a character class such as letter or whitespace.


.. _locale_name:

Locale names
------------

The format of the locale name is platform dependent, and the set of supported
locales can depend on the system configuration.

On Posix platforms, it usually has the format [1]_:

.. productionlist:: locale_name
: language ["_" territory] ["." charset] ["@" modifier]

where *language* is a two- or three-letter language code from `ISO 639`_,
*territory* is a two-letter country or region code from `ISO 3166`_,
*charset* is a locale encoding, and *modifier* is a script name,
a language subtag, a sort order identifier, or other locale modifier
(for example, "latin", "valencia", "stroke" and "euro").

On Windows, several formats are supported. [2]_ [3]_
A subset of `IETF BCP 47`_ tags:

.. productionlist:: locale_name
: language ["-" script] ["-" territory] ["." charset]
: language ["-" script] "-" territory "-" modifier

where *language* and *territory* have the same meaning as in Posix,
*script* is a four-letter script code from `ISO 15924`_,
and *modifier* is a language subtag, a sort order identifier
or custom modifier (for example, "valencia", "stroke" or "x-python").
Both hyphen (``'-'``) and underscore (``'_'``) separators are supported.
Only UTF-8 encoding is allowed for BCP 47 tags.

Windows also supports locale names in the format:

.. productionlist:: locale_name
: language ["_" territory] ["." charset]

where *language* and *territory* are full names, such as "English" and
"United States", and *charset* is either a code page number (for example, "1252")
or UTF-8.
Only the underscore separator is supported in this format.

The "C" locale is supported on all platforms.
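
The Posix grammar above can be sketched as a small parser. This is a minimal
illustration under stated assumptions, not how the :mod:`locale` module
resolves names: ``LocaleName`` and ``parse_posix_locale_name`` are hypothetical
names, the pattern follows the production literally (two- or three-letter
language, two-letter territory), and names outside that grammar, including
"C", are rejected.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    charset: Optional[str]
    modifier: Optional[str]


# language ["_" territory] ["." charset] ["@" modifier]
_POSIX_LOCALE_RE = re.compile(
    r"""
    (?P<language>[A-Za-z]{2,3})
    (?:_(?P<territory>[A-Za-z]{2}))?
    (?:\.(?P<charset>[^@]+))?
    (?:@(?P<modifier>.+))?
    """,
    re.VERBOSE,
)


def parse_posix_locale_name(name: str) -> LocaleName:
    """Split a Posix-style locale name into its four components."""
    match = _POSIX_LOCALE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a Posix-style locale name: {name!r}")
    # Optional components that are absent come back as None.
    return LocaleName(**match.groupdict())


print(parse_posix_locale_name("ca_ES.UTF-8@valencia"))
print(parse_posix_locale_name("sr_RS@latin"))
```

In these terms, the language code returned by :func:`getlocale` corresponds to
the *language* and *territory* components only.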

.. _ISO 639: https://www.iso.org/iso-639-language-code
.. _ISO 3166: https://www.iso.org/iso-3166-country-codes.html
.. _IETF BCP 47: https://www.rfc-editor.org/info/bcp47
.. _ISO 15924: https://www.unicode.org/iso15924/

.. [1] `IEEE Std 1003.1-2024; 8.2 Internationalization Variables <https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap08.html#tag_08_02>`_
.. [2] `UCRT Locale names, Languages, and Country/Region strings <https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings>`_
.. [3] `Locale Names <https://learn.microsoft.com/en-us/windows/win32/intl/locale-names>`_


.. _embedding-locale:

For extension writers and programs that embed Python
2 changes: 2 additions & 0 deletions Include/internal/pycore_ceval.h
@@ -22,8 +22,10 @@ struct _ceval_runtime_state;

// Export for '_lsprof' shared extension
PyAPI_FUNC(int) _PyEval_SetProfile(PyThreadState *tstate, Py_tracefunc func, PyObject *arg);
extern int _PyEval_SetProfileAllThreads(PyInterpreterState *interp, Py_tracefunc func, PyObject *arg);

extern int _PyEval_SetTrace(PyThreadState *tstate, Py_tracefunc func, PyObject *arg);
extern int _PyEval_SetTraceAllThreads(PyInterpreterState *interp, Py_tracefunc func, PyObject *arg);

extern int _PyEval_SetOpcodeTrace(PyFrameObject *f, bool enable);

5 changes: 2 additions & 3 deletions Include/internal/pycore_interp_structs.h
@@ -99,7 +99,6 @@ struct _ceval_runtime_state {
// For example, we use a preallocated array
// for the list of pending calls.
struct _pending_calls pending_mainthread;
PyMutex sys_trace_profile_mutex;
};


@@ -951,8 +950,8 @@ struct _is {
PyDict_WatchCallback builtins_dict_watcher;

_Py_GlobalMonitors monitors;
bool sys_profile_initialized;
bool sys_trace_initialized;
_PyOnceFlag sys_profile_once_flag;
_PyOnceFlag sys_trace_once_flag;
Py_ssize_t sys_profiling_threads; /* Count of threads with c_profilefunc set */
Py_ssize_t sys_tracing_threads; /* Count of threads with c_tracefunc set */
PyObject *monitoring_callables[PY_MONITORING_TOOL_IDS][_PY_MONITORING_EVENTS];
2 changes: 2 additions & 0 deletions InternalDocs/README.md
@@ -44,6 +44,8 @@ Program Execution

- [Quiescent-State Based Reclamation (QSBR)](qsbr.md)

- [Stack protection](stack_protection.md)

Modules
---

61 changes: 61 additions & 0 deletions InternalDocs/stack_protection.md
@@ -0,0 +1,61 @@
# Stack Protection

CPython protects against stack overflow caused by runaway, or just very deep, recursion by raising a `RecursionError` instead of crashing.
Protection against pure Python stack recursion has existed since very early versions, but in 3.12 we added protection against stack overflow
in C code. This was initially implemented using a counter and was improved in 3.14 to use the actual stack depth.
On platforms that support it (Windows, macOS, and most Linux distributions), we query the operating system to find the stack bounds.
On other platforms we use conservative estimates.


The C stack looks like this:

```
  +-------+  <--- Top of machine stack
  |       |
  |       |

     ~~

  |       |
  |       |
  +-------+  <--- Soft limit
  |       |
  |       |  _PyOS_STACK_MARGIN_BYTES
  |       |
  +-------+  <--- Hard limit
  |       |
  |       |  _PyOS_STACK_MARGIN_BYTES
  |       |
  +-------+  <--- Bottom of machine stack
```


We get the current stack pointer using compiler intrinsics where available, or by taking the address of a C local variable. See `_Py_get_machine_stack_pointer()`.

The soft and hard limit pointers are set by calling `_Py_InitializeRecursionLimits()` during thread initialization.

Recursion checks are performed by `_Py_EnterRecursiveCall()` or `_Py_EnterRecursiveCallTstate()`, which compare the stack pointer to the soft limit. If the stack pointer is lower than the soft limit, `_Py_CheckRecursiveCall()` is called, which checks against both the hard and soft limits:

```python
kb_used = (stack_top - stack_pointer) >> 10
if stack_pointer < hard_limit:
    FatalError(f"Unrecoverable stack overflow (used {kb_used} kB)")
elif stack_pointer < soft_limit:
    raise RecursionError(f"Stack overflow (used {kb_used} kB)")
```
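
The layout and check above can be modeled in a few lines of runnable Python. This is only a sketch under stated assumptions: the stack grows downward, addresses are plain integers, `MARGIN_BYTES` is an illustrative stand-in for `_PyOS_STACK_MARGIN_BYTES` (not its real value), and `StackLimits`, `compute_limits()` and `classify()` are hypothetical names that do not exist in CPython.

```python
from dataclasses import dataclass

# Illustrative stand-in for _PyOS_STACK_MARGIN_BYTES; not the real value.
MARGIN_BYTES = 4096


@dataclass(frozen=True)
class StackLimits:
    top: int         # highest address of the machine stack
    soft_limit: int  # below this address: RecursionError
    hard_limit: int  # below this address: fatal error


def compute_limits(stack_bottom: int, stack_size: int) -> StackLimits:
    """Place the two limits one and two margins above the stack bottom."""
    return StackLimits(
        top=stack_bottom + stack_size,
        soft_limit=stack_bottom + 2 * MARGIN_BYTES,
        hard_limit=stack_bottom + MARGIN_BYTES,
    )


def classify(limits: StackLimits, stack_pointer: int) -> str:
    """Mirror the pseudocode check: the hard limit is tested first."""
    if stack_pointer < limits.hard_limit:
        return "fatal"
    if stack_pointer < limits.soft_limit:
        return "recursion_error"
    return "ok"


limits = compute_limits(stack_bottom=0x10000, stack_size=64 * 1024)
print(classify(limits, limits.top - 512))
print(classify(limits, limits.soft_limit - 1))
print(classify(limits, limits.hard_limit - 1))
```

The three calls probe a stack pointer near the top of the stack, just inside the soft margin, and just inside the hard margin.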

### Diagnosing and fixing stack overflows

For stack protection to work correctly the amount of stack consumed between calls to `_Py_EnterRecursiveCall()` must be less than `_PyOS_STACK_MARGIN_BYTES`.

If you see a traceback ending in: `RecursionError: Stack overflow (used ... kB)` then the stack protection is working as intended. If you don't expect to see the error, then check the amount of stack used. If it seems low then CPython may not be configured properly.

However, if you see a fatal error or crash, then something is not right.
Either a recursive call is not checking `_Py_EnterRecursiveCall()`, or the amount of C stack consumed by a single call exceeds `_PyOS_STACK_MARGIN_BYTES`. If a hard crash occurs, it probably means that the amount of C stack consumed is more than double `_PyOS_STACK_MARGIN_BYTES`.
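
To see why the per-call stack consumption matters, here is a toy simulation of the reasoning above. It assumes a downward-growing stack, a check on every call, and an illustrative margin of 4096 bytes; `first_failure()` is a hypothetical helper and the numbers are not CPython's real values.

```python
MARGIN = 4096  # illustrative stand-in for _PyOS_STACK_MARGIN_BYTES


def first_failure(frame_bytes: int, stack_size: int = 64 * 1024,
                  margin: int = MARGIN) -> str:
    """Recurse with a fixed frame size and report what stops the recursion."""
    bottom = 0
    hard_limit = bottom + margin
    soft_limit = bottom + 2 * margin
    stack_pointer = bottom + stack_size
    while True:
        stack_pointer -= frame_bytes  # one more call consumes one frame
        if stack_pointer < bottom:
            return "crash"            # ran off the stack before any check fired
        if stack_pointer < hard_limit:
            return "fatal"            # skipped the soft margin entirely
        if stack_pointer < soft_limit:
            return "recursion_error"  # protection working as intended


print(first_failure(1024))   # frame smaller than the margin
print(first_failure(7000))   # frame larger than one margin
print(first_failure(14000))  # frame larger than two margins
```

Frames smaller than the margin always land in the soft zone first. Larger frames may or may not skip a zone depending on alignment; the two larger sizes here were chosen so that they do.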

Likely causes:
* Recursive code is not calling `_Py_EnterRecursiveCall()`.
* `-O0` compilation flags, especially for Clang. With no optimization, C calls can consume a lot of stack space.
* Giant, complex functions in third-party C extensions. This is unlikely, as the function in question would need to be more complicated than the bytecode interpreter.
* `_PyOS_STACK_MARGIN_BYTES` is just too low.
* `_Py_InitializeRecursionLimits()` is not setting the soft and hard limits correctly for that platform.