Using internal tokenize module's TokenizerIter in multiple threads crashes #120317

@lysnikolaou

Description

Crash report

What happened?

Because the tokenizer is not thread-safe, using the same TokenizerIter from multiple threads under the free-threaded build leads to unpredictable behavior. It sometimes succeeds, sometimes raises a SyntaxError even though the source contains no syntax error, and sometimes crashes with the following.

Example error backtrace
    Fatal Python error: tok_backup: tok_backup: wrong character
    Python runtime state: initialized

    Current thread 0x0000000172e1b000 (most recent call first):
      File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 9 in next_token
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

    Thread 0x0000000171e0f000 (most recent call first):
      File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

    Thread 0x0000000170e03000 (most recent call first):
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/_base.py", line 550 in set_exception
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 60 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

    Thread 0x000000016fdf7000 (most recent call first):
      File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in Assertion failed: (tok->done != E_ERROR), function _syntaxerror__workerrange, file helpers.c, line 17.
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
      File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

    Thread 0x000000016edeb000 (most recent call first):
      File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
      File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.pyzsh: abort ./python.exe tmp/t1.py

A minimal reproducer is the following:

    import concurrent.futures
    import io
    import time
    import tokenize

    def next_token(it):
        while True:
            try:
                r = next(it)
                print(tokenize.TokenInfo._make(r))
                time.sleep(1)
            except StopIteration:
                return

    for _ in range(20):
        with concurrent.futures.ThreadPoolExecutor() as executor:
            source = io.StringIO("a = 'abc'\nprint(b)\nfor _ in a: do_something()")
            it = tokenize._tokenize.TokenizerIter(source.readline, extra_tokens=False)
            threads = (executor.submit(next_token, it) for _ in range(5))
            for t in concurrent.futures.as_completed(threads):
                t.result()

        print("######################################################")
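The reproducer hands one iterator to five workers with no synchronization. As a caller-side illustration only (not a description of how CPython itself addresses the issue), the sketch below serializes `next()` calls with a lock. `LockedIterator`, `drain`, and `consume_in_threads` are hypothetical names introduced here, and the demo uses a plain `range` iterator rather than the private `TokenizerIter` so that it runs on any build.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedIterator:
    """Hypothetical helper: serialize next() calls on a shared iterator."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        with self._lock:
            return next(self._iterator)


def drain(shared):
    # Each worker collects the items it happens to receive into its own list.
    return list(shared)


def consume_in_threads(iterator, workers=5):
    # Wrap once, then share the wrapper (not the raw iterator) between workers.
    shared = LockedIterator(iterator)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(drain, shared) for _ in range(workers)]
        return [item for future in futures for item in future.result()]


if __name__ == "__main__":
    items = consume_in_threads(iter(range(1000)))
    # Every item is delivered exactly once, though not in a fixed order.
    print(len(items), sorted(items) == list(range(1000)))
```

In the reproducer, the same idea would mean passing `LockedIterator(it)` to `executor.submit` instead of `it`. This is an untested assumption for `TokenizerIter` specifically, and it only decides which thread receives each token; it does not preserve token order across threads.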

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Output from running 'python -VV' on the command line:

Python 3.14.0a0 experimental free-threading build (heads/main:c3b6dbff2c8, Jun 10 2024, 14:33:07) [Clang 15.0.0 (clang-1500.3.9.4)]

Labels

3.13 (bugs and security fixes)
3.14 (bugs and security fixes)
topic-free-threading
type-crash (A hard crash of the interpreter, possibly with a core dump)
