You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Deferred reference counting is a mechanism to reduce the overhead of naive reference counting by a large amount by deferring the tracking of references to objects from the stack, eagerly updating the reference count only for references from the heap.
This is great in terms of efficiency, but defers the reclamation of objects. Some users of Python need immediate reclamation.
Deferring the reclamation of just a few short lived, but very large tensors and arrays can cause a process to run out of memory or other limited resource.
What we want is a mechanism that gives us the performance of full deferred reference counting, but the immediate reclamation of naive reference counting.
That is impossible, in general, but we should be able to get fairly close.
Types of reference counts
First of all we need to mechanism to track the reference counts of objects that is much closer in cost to fully deferred reference counts, ie. zero, than to naive reference counting which involves multiple memory accesses and branches. To that end we propose three types of reference counts in increasing order of cost:
Virtual reference counts: The reference count is tracked virtually by an optimizer. If the optimizer can track the creation and destruction of the reference, it can be made virtual.
Embedded reference counts: The reference count is tracked by a tag in the reference.
Immediate reference counts: The reference count is tracked in the object's ob_refcnt field.
The reference count of an object is the sum of its virtual, embedded and immediate reference counts.
Because of virtual reference counts, it may be impossible to know the actual reference count of an object at runtime, just a lower bound.
Optimizing reference counts
References can be made virtual or embedded provided we are guaranteed that there is another reference that will outlive this reference. For references on the stack, this analysis is actually not too complex.
Converting reference counts into virtual reference counts is only possible if the whole lifetime of the reference can be seen. While this is theoretically possible in the bytecode compiler, the dynamic nature of Python and error handling make this impractical. We should be able to do this in tier 2, but the details are out of scope for this proposal.
Embedded reference counts are much easier. Because the reference count is explicit, we don't need to worry about errors or escaping calls. We do, however, need to be sure that the reference will be outlived by another reference or will eventually be handled by the GC.
A reference to an object can have an embedded count if:
we can prove that there another reference that outlives the reference with an embedded count, or
the object is immortal, or
the object allows deferred reclamation
Deferred reclamation of objects
Some objects can support deferred reclamation. This means that their immediate reference count is allowed to drop to zero, when their total reference count is not zero. These objects are no reclaimed when their immediate reference count is zero, but some time later after their total reference count drops to zero.
Because of the indefinite delay in reclamation, not all objects should support deferred reclamation.
Invariants
Virtual and embedded reference counting is only allowed for references on the stack. All references in the heap must be immediate.
All stack references to immortal objects must use embedded or virtual references. This avoids the need for additional immortality checks
Implementation
Using the low bit of the reference
We can use a low bit of the stack reference as the embedded reference count. The free-threading build already does this for deferred references, so this should be relatively easy to implement.
This is a specification for implementing a version of Deferred reference counting
Deferred reference counting is a mechanism to reduce the overhead of naive reference counting by a large amount by deferring the tracking of references to objects from the stack, eagerly updating the reference count only for references from the heap.
This is great in terms of efficiency, but defers the reclamation of objects. Some users of Python need immediate reclamation.
Deferring the reclamation of just a few short lived, but very large tensors and arrays can cause a process to run out of memory or other limited resource.
What we want is a mechanism that gives us the performance of full deferred reference counting, but the immediate reclamation of naive reference counting.
That is impossible, in general, but we should be able to get fairly close.
Types of reference counts
First of all we need to mechanism to track the reference counts of objects that is much closer in cost to fully deferred reference counts, ie. zero, than to naive reference counting which involves multiple memory accesses and branches. To that end we propose three types of reference counts in increasing order of cost:
ob_refcnt
field.The reference count of an object is the sum of its virtual, embedded and immediate reference counts.
Because of virtual reference counts, it may be impossible to know the actual reference count of an object at runtime, just a lower bound.
Optimizing reference counts
References can be made virtual or embedded provided we are guaranteed that there is another reference that will outlive this reference. For references on the stack, this analysis is actually not too complex.
Converting reference counts into virtual reference counts is only possible if the whole lifetime of the reference can be seen. While this is theoretically possible in the bytecode compiler, the dynamic nature of Python and error handling make this impractical. We should be able to do this in tier 2, but the details are out of scope for this proposal.
Embedded reference counts are much easier. Because the reference count is explicit, we don't need to worry about errors or escaping calls. We do, however, need to be sure that the reference will be outlived by another reference or will eventually be handled by the GC.
A reference to an object can have an embedded count if:
Deferred reclamation of objects
Some objects can support deferred reclamation. This means that their immediate reference count is allowed to drop to zero, when their total reference count is not zero. These objects are no reclaimed when their immediate reference count is zero, but some time later after their total reference count drops to zero.
Because of the indefinite delay in reclamation, not all objects should support deferred reclamation.
Invariants
Implementation
Using the low bit of the reference
We can use a low bit of the stack reference as the embedded reference count. The free-threading build already does this for deferred references, so this should be relatively easy to implement.
Code
The text was updated successfully, but these errors were encountered: