Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An object referenced only through it's own __dict__ can get collected too early. #130327

Open
ktosiek opened this issue Feb 19, 2025 · 5 comments
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@ktosiek
Copy link

ktosiek commented Feb 19, 2025

Bug report

Bug description:

After reusing one object's __dict__ for another object, and adding a reference to the original object in a dictionary on the target object, the original object might get collected too early (and cause a whole other dictionary to get cleared). I hope the code below describes it better than my words.

I've bisected this problem to c32dc47.

This is the smallest reproduction I've found so far:

import gc
import sys

print(sys.version_info)

class A:
    def __del__(self):
        print("del", hex(id(self)))

a = A()
a.attr_a = "test value"
b = A()
b.__dict__ = a.__dict__
b.other_attr = {"self reference": a, "other data": 42}
del a
print("before gc:", b.__dict__)
gc.collect()

print("after gc:", b.__dict__)

Results on 3.13:

sys.version_info(major=3, minor=13, micro=0, releaselevel='alpha', serial=5)
before gc: {'attr_a': 'test value', 'other_attr': {'self reference': <__main__.A object at 0x76cc6317a900>, 'other data': 42}}
del 0x76cc6317a900
after gc: {'attr_a': 'test value', 'other_attr': {}}
del 0x76cc6256ccd0

Results on 3.12, and what I expected to see:

sys.version_info(major=3, minor=12, micro=7, releaselevel='final', serial=0)
before gc: {'attr_a': 'test value', 'other_attr': {'self reference': <__main__.A object at 0x77496b531130>, 'other data': 42}}
after gc: {'attr_a': 'test value', 'other_attr': {'self reference': <__main__.A object at 0x77496b531130>, 'other data': 42}}
del 0x77496b531160
del 0x77496b531130

I've noticed this change in behaviour while working with Locust.io, the original issue for reference: locustio/locust#3050

CPython versions tested on:

3.13, CPython main branch

Operating systems tested on:

Linux

@ktosiek ktosiek added the type-bug An unexpected behavior, bug, or error label Feb 19, 2025
@ZeroIntensity ZeroIntensity added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Feb 20, 2025
@ZeroIntensity
Copy link
Member

I suspect it's probably something about __dict__ not getting properly overwritten for objects with inline values. I'm looking into it.

@ZeroIntensity
Copy link
Member

I haven't fully figured it out, but it seems the problem is (somewhat) here:

FT_ATOMIC_STORE_PTR(_PyObject_ManagedDictPointer(obj)->dict, new_dict);

The dictionary is being stored in the object's managed pointer, but the original object still assumes it has ownership, so it clears it upon being deallocated. We need to detach ownership upon doing that, or copy the inline values of a to b. I'll put up a PR (hopefully) later today, assuming that my analysis is correct.

@colesbury
Copy link
Contributor

I think the issue is in PyObject_VisitManagedDict. The problem is not any tp_clear or tp_dealloc handlers, because nothing should be getting cleared or deallocated during GC in the example. There are cycles, but no cyclic trash. All the relevant objects, a, b, and the shared dictionary should stay alive.

cpython/Objects/dictobject.c

Lines 7146 to 7164 in 6c450f4

int
PyObject_VisitManagedDict(PyObject *obj, visitproc visit, void *arg)
{
PyTypeObject *tp = Py_TYPE(obj);
if((tp->tp_flags & Py_TPFLAGS_MANAGED_DICT) == 0) {
return 0;
}
if (tp->tp_flags & Py_TPFLAGS_INLINE_VALUES) {
PyDictValues *values = _PyObject_InlineValues(obj);
if (values->valid) {
for (Py_ssize_t i = 0; i < values->capacity; i++) {
Py_VISIT(values->values[i]);
}
return 0;
}
}
Py_VISIT(_PyObject_ManagedDictPointer(obj)->dict);
return 0;
}

If there's a managed dictionary, we should visit the dictionary, not the inline values, even if the values are "valid".

@ZeroIntensity
Copy link
Member

Thanks for the insight, Sam. I haven't dived into these parts of the codebase before, so bear with me here. I'm not sure it's possible for an object to have valid inline values while also having a non-null managed dictionary.

The relevant code that sets the managed dictionary pointer also immediately invalidates the inline values:

cpython/Objects/dictobject.c

Lines 7174 to 7181 in 5a13faa

FT_ATOMIC_STORE_PTR(_PyObject_ManagedDictPointer(obj)->dict, new_dict);
if (values->valid) {
FT_ATOMIC_STORE_UINT8(values->valid, 0);
for (Py_ssize_t i = 0; i < values->capacity; i++) {
Py_CLEAR(values->values[i]);
}
}

The problem is not any tp_clear or tp_dealloc handlers, because nothing should be getting cleared or deallocated during GC in the example

Hmm. If not tp_clear (or subtype_clear, I guess), what might clear a managed dictionary?

@colesbury
Copy link
Contributor

I'm not sure it's possible for an object to have valid inline values while also having a non-null managed dictionary

Yes, this is allowed and fairly common. It allows fast access to the attributes via LOAD_ATTR_INSTANCE_VALUE even if there are also accesses via obj.__dict__. For example:

class MyObj: pass

a = MyObj()
a.attr = "value"
d = a.__dict__  # a now has a managed dictionary and still has valid inline values
tmp = a.attr # this will be specialized to `LOAD_ATTR_INSTANCE_VALUE` if executed enough times

The relevant code that sets the managed dictionary pointer also immediately invalidates the inline values

That's the code path for changing the managed dictionary. For example, obj.__dict__ = new_dict will invalidate any previous inline values of obj. The most common way way to create a manage dictionary is just the (read) access obj.__dict__, which will call ensure_managed_dict and doesn't invalidate inline values.

So when we have:

a = A()
b = A()
b.__dict__ = A.__dict__
  • Both a and b have the same (non-NULL) managed dictionary
  • That dictionary's values point to the inline values in a
  • a's inline values are valid
  • b's inline values are not valid

Hmm. If not tp_clear (or subtype_clear, I guess), what might clear a managed dictionary

Yes, subtype_clear is clearing the managed dictionary, but the bug is not in subtype_clear. The gc.collect() call shouldn't be calling subtype_clear() or any other tp_clear hooks. Although there are cycles in the repro, there isn't any cyclic trash. Even though there's a del a statement, that object is still reachable via b. The bug is that the GC incorrectly thinks it's cyclic trash and calls subtype_clear on it.

The bugs are in PyObject_VisitManagedDict and also dict_traverse.

Concretely, what's happening is:

  • The GC determines the b is reachable (good)
  • It then tries to transitively mark objects as reachable via b (this is in move_unreachable in gc.c or mark_reachable in gc_free_threading.c), which eventually calls PyObject_VisitManagedDict on b, which visits b's dictionary (good)
  • The dictionary is traversed, but this doesn't traverse the values because they're embedded (in a). (bad!)

So now a (and any other values in the dictionary) incorrectly look like cyclic trash to the GC because they're not marked as reachable.

What should happen?

There's ambiguity here because for objects like a, the dictionary and the inline values share a reference count for each object in the inline values. So it's not clear who "owns" the objects in the inline values and who should traverse them.

I think the logic should be:

  • If there's a non-NULL managed dictionary, PyObject_VisitManagedDict should traverse the dictionary and NOT traverse the inline values.
  • dict_traverse should traverse its values even if they are embedded

Basically, we should treat it as if the dictionary owns the values, not the object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants