Skip to content

gh-115649: Copy the filename into main interpreter before intern in import.c#120315

Merged
kumaraditya303 merged 5 commits intopython:mainfrom
aisk:copy-interned-string
Jun 17, 2024
Merged

gh-115649: Copy the filename into main interpreter before intern in import.c#120315
kumaraditya303 merged 5 commits intopython:mainfrom
aisk:copy-interned-string

Conversation

@aisk
Copy link
Member

@aisk aisk commented Jun 10, 2024

Test code:

import _interpreters

iid = _interpreters.create()
_interpreters.exec(iid, "import _tkinter")
_interpreters.destroy(iid)

Execute result:

$ ./python c.py
Debug memory block at address p=0x7899b89e3520: API '�'
    15987178197214944733 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0xdd *** OUCH
        at p-6: 0xdd *** OUCH
        at p-5: 0xdd *** OUCH
        at p-4: 0xdd *** OUCH
        at p-3: 0xdd *** OUCH
        at p-2: 0x00 *** OUCH
        at p-1: 0x00 *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0xddde5677967c12fd are fish: Job 1, './python c.py' terminated by signal SIGSEGV (Address boundary error)

In current main branch, when a single-phase extension (_tkinter is an example) got imported, it's init function will be execute in the main interpreter, although the import statement is executed in the sub-interpreter.

And there is a workaround code to intern the filename object to the main interpreter's interned string dict:

cpython/Python/import.c

Lines 1970 to 1973 in c3b6dbf

// XXX There's a refleak somewhere with the filename.
// Until we can track it down, we intern it.
PyObject *filename = Py_NewRef(info->filename);
PyUnicode_InternInPlace(&filename);

I think the filename is created in the sub-interpreter in this test code, and it's allocated with the sub-interpreter's obmalloc state, which will be freed when the sub-interpreter is destroyed by statement _interpreters.destroy(iid) in the test code.

When the main interpreter shutdown, it will try to free the interned string dict, this will cause issues and crash the interpreter.

Using gdb to dump the call stack:

(gdb) bt
#0  0x00005555556c9414 in _PyObject_DebugDumpAddress (p=p@entry=0x7ffff731b520) at Objects/obmalloc.c:3003
#1  0x00005555556c9705 in _PyMem_DebugCheckAddress (func=func@entry=0x5555559106a0 <__func__.8> "_PyMem_DebugRawFree", api=114 'r', p=p@entry=0x7ffff731b520) at Objects/obmalloc.c:2921
#2  0x00005555556c97aa in _PyMem_DebugRawFree (ctx=0x555555afecb0 <_PyRuntime+560>, p=0x7ffff731b520) at Objects/obmalloc.c:2749
#3  0x00005555556ddee4 in PyMem_RawFree (ptr=ptr@entry=0x7ffff731b520) at Objects/obmalloc.c:963
#4  0x00005555556ddfe8 in _PyObject_Free (ctx=0x0, p=0x7ffff731b520) at Objects/obmalloc.c:2416
#5  0x00005555556c97eb in _PyMem_DebugRawFree (ctx=ctx@entry=0x555555afed10 <_PyRuntime+656>, p=p@entry=0x7ffff731b530) at Objects/obmalloc.c:2754
#6  0x00005555556c9b57 in _PyMem_DebugFree (ctx=0x555555afed10 <_PyRuntime+656>, ptr=0x7ffff731b530) at Objects/obmalloc.c:2891
#7  0x00005555556de3dc in PyObject_Free (ptr=<optimized out>) at Objects/obmalloc.c:1323
#8  0x00005555557107fb in unicode_dealloc (unicode=0x7ffff731b530) at Objects/unicodeobject.c:1608
#9  0x00005555556c4735 in _Py_Dealloc (op=op@entry=0x7ffff731b530) at Objects/object.c:2854
#10 0x00005555556ac8da in Py_DECREF (filename=filename@entry=0x55555589e242 "./Include/refcount.h", lineno=lineno@entry=459, op=0x7ffff731b530) at ./Include/refcount.h:351
#11 0x00005555556ac8f9 in Py_XDECREF (op=<optimized out>) at ./Include/refcount.h:459
#12 0x00005555556ac9ad in dictkeys_decref (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>, dk=dk@entry=0x555555c51800, use_qsbr=use_qsbr@entry=false) at Objects/dictobject.c:492
#13 0x00005555556b489d in clear_lock_held (op=op@entry=0x7ffff7500050) at Objects/dictobject.c:2766
#14 0x00005555556b4977 in PyDict_Clear (op=op@entry=0x7ffff7500050) at Objects/dictobject.c:2790
#15 0x000055555571313b in clear_interned_dict (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>) at Objects/unicodeobject.c:295
#16 0x000055555573c1a6 in _PyUnicode_ClearInterned (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>) at Objects/unicodeobject.c:15098
#17 0x0000555555805f17 in finalize_interp_types (interp=0x555555b17798 <_PyRuntime+101656>) at Python/pylifecycle.c:1839
#18 0x0000555555805f98 in finalize_interp_clear (tstate=tstate@entry=0x555555b4efe8 <_PyRuntime+329064>) at Python/pylifecycle.c:1884
#19 0x0000555555806102 in Py_FinalizeEx () at Python/pylifecycle.c:2107
#20 0x0000555555832ecc in Py_RunMain () at Modules/main.c:720
#21 0x0000555555832f3e in pymain_main (args=args@entry=0x7fffffffe4e0) at Modules/main.c:748
#22 0x0000555555833004 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:772
#23 0x00005555555d4842 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Please note that the #12 frame #12 0x00005555556ac9ad in dictkeys_decref (interp=interp@entry=0x555555b17798 <_PyRuntime+101656>, dk=dk@entry=0x555555c51800, use_qsbr=use_qsbr@entry=false) at Objects/dictobject.c:492, it's code is:

Py_XDECREF(entries[i].me_value);

We can't inspect the value here because it's memory got destroyed already. But adding these codes to L492:

                if (n == 3408) {  // 3408 is the size of the interned string dict, you may need to adjust it on your environment
                    PyObject_Print(entries[i].me_value, stderr, 0);
                    fprintf(stderr, "\n");
                }

We can see that when the interpreter crashed, it's calling _Py_Dealloc on filename string object:

...
'TkttType'
'typename'
'Tcl_Obj'
'/home/xxxxx/Codes/cpython/build/lib.linux-x86_64-3.14-pydebug/_tkinter.cpython-314d-x86_64-linux-gnu.so'  <- this

Clone the filename and re-create it on the main interpreter can avoid the crash.

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants