Embedded and virtual reference counts #700

markshannon · 2024-10-08T10:55:39Z

This is a specification for implementing a version of Deferred reference counting

Deferred reference counting is a mechanism to reduce the overhead of naive reference counting by a large amount by deferring the tracking of references to objects from the stack, eagerly updating the reference count only for references from the heap.

This is great in terms of efficiency, but defers the reclamation of objects. Some users of Python need immediate reclamation.
Deferring the reclamation of just a few short lived, but very large tensors and arrays can cause a process to run out of memory or other limited resource.

What we want is a mechanism that gives us the performance of full deferred reference counting, but the immediate reclamation of naive reference counting.
That is impossible, in general, but we should be able to get fairly close.

Types of reference counts

First of all we need to mechanism to track the reference counts of objects that is much closer in cost to fully deferred reference counts, ie. zero, than to naive reference counting which involves multiple memory accesses and branches. To that end we propose three types of reference counts in increasing order of cost:

Virtual reference counts: The reference count is tracked virtually by an optimizer. If the optimizer can track the creation and destruction of the reference, it can be made virtual.
Embedded reference counts: The reference count is tracked by a tag in the reference.
Immediate reference counts: The reference count is tracked in the object's ob_refcnt field.

The reference count of an object is the sum of its virtual, embedded and immediate reference counts.
Because of virtual reference counts, it may be impossible to know the actual reference count of an object at runtime, just a lower bound.

Optimizing reference counts

References can be made virtual or embedded provided we are guaranteed that there is another reference that will outlive this reference. For references on the stack, this analysis is actually not too complex.

Converting reference counts into virtual reference counts is only possible if the whole lifetime of the reference can be seen. While this is theoretically possible in the bytecode compiler, the dynamic nature of Python and error handling make this impractical. We should be able to do this in tier 2, but the details are out of scope for this proposal.

Embedded reference counts are much easier. Because the reference count is explicit, we don't need to worry about errors or escaping calls. We do, however, need to be sure that the reference will be outlived by another reference or will eventually be handled by the GC.

A reference to an object can have an embedded count if:

we can prove that there another reference that outlives the reference with an embedded count, or
the object is immortal, or
the object allows deferred reclamation

Deferred reclamation of objects

Some objects can support deferred reclamation. This means that their immediate reference count is allowed to drop to zero, when their total reference count is not zero. These objects are no reclaimed when their immediate reference count is zero, but some time later after their total reference count drops to zero.

Because of the indefinite delay in reclamation, not all objects should support deferred reclamation.

Invariants

Virtual and embedded reference counting is only allowed for references on the stack. All references in the heap must be immediate.
All stack references to immortal objects must use embedded or virtual references. This avoids the need for additional immortality checks

Implementation

Using the low bit of the reference

We can use a low bit of the stack reference as the embedded reference count. The free-threading build already does this for deferred references, so this should be relatively easy to implement.

Code

#define Py_TAG_BITS 1
#define Py_TAG_REFCNT 1
#define TO_PTR(REF) ((PyObject *)((REF).bits)
#define TO_PTR_MASKED(REF) ((PyObject *)(((REF).bits) & (~Py_TAG_BITS))

static inline int
PyStackRef_HasCount(_PyStackRef ref)
{
    return ref.bits & Py_TAG_REFCNT;

static inline PyObject *
PyStackRef_AsPyObjectBorrow(_PyStackRef ref)
{
    return TO_PTR_MASKED(ref);
}

static inline PyObject *
PyStackRef_AsPyObjectSteal(_PyStackRef ref)
{
    if (PyStackRef_HasCount(ref)) {
        return Py_NewRef(TO_PTR_MASKED(ref));
    }
    else {
        return TO_PTR(ref);
    }
}

static inline _PyStackRef
_PyStackRef_FromPyObjectSteal(PyObject *obj)
{
    unsigned int tag = _Py_IsImmortal(obj) ? (Py_TAG_REFCNT) : 0;
    return ((_PyStackRef){.bits = ((uintptr_t)(obj)) | tag;
}

static inline _PyStackRef
PyStackRef_FromPyObjectNew(PyObject *obj)
{
    if (_Py_IsDeferrable(obj)) {
        return (_PyStackRef){ .bits = (uintptr_t)obj | Py_TAG_REFCNT};
    }
    Py_INCREF(obj);
    return (_PyStackRef){ .bits = (uintptr_t)obj };

}

static inline _PyStackRef
PyStackRef_FromPyObjectImmortal(PyObject *obj)
{
    assert(_Py_IsImmortal(obj));
    return (_PyStackRef){ .bits = (uintptr_t)obj | Py_TAG_REFCNT};
}

static inline _PyStackRef
PyStackRef_DUP(_PyStackRef ref)
{
    assert(!PyStackRef_IsNull(ref));
    if (!PyStackRef_IsDeferred(ref)) {
        Py_INCREF_MORTAL(TO_PTR(ref));
    }
    return ref;
}

static inline void
PyStackRef_CLOSE(_PyStackRef ref)
{
    assert(!PyStackRef_IsNull(ref));
    if (!PyStackRef_HasCount(ref)) {
        Py_DECREF_MORTAL(TO_PTR(ref));
    }
    return ref;
}

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Embedded and virtual reference counts #700

Embedded and virtual reference counts #700

markshannon commented Oct 8, 2024 •

edited

Loading

Embedded and virtual reference counts #700

Embedded and virtual reference counts #700

Comments

markshannon commented Oct 8, 2024 • edited Loading

Types of reference counts

Optimizing reference counts

Deferred reclamation of objects

Invariants

Implementation

Using the low bit of the reference

Code

markshannon commented Oct 8, 2024 •

edited

Loading