barneygale commented Apr 10, 2024

Don't bother calling `os.scandir()` to scan for literal pattern segments, like `foo` in `foo/*.py`. Instead, append the segment(s) as-is and call through to the next selector with `exists=False`, which signals that the path might not exist. Subsequent selectors will call `os.scandir()` or `os.lstat()` to filter out missing paths as needed.
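To illustrate the idea, here is a minimal sketch of selector chaining with an `exists` flag. It is not the code from this PR: the names `literal_selector`, `wildcard_selector`, `terminal_selector` and `glob_foo_py` are hypothetical and exist only for this example. The literal selector never touches the filesystem; later selectors filter out missing paths via `os.scandir()` or `os.lstat()`.

```python
import os


def literal_selector(segment, next_selector):
    """Append a literal segment without touching the filesystem (hypothetical helper)."""
    def select(path, exists=False):
        # No os.scandir() here: join the literal part and tell the next
        # selector that the resulting path has not been verified.
        yield from next_selector(os.path.join(path, segment), exists=False)
    return select


def wildcard_selector(match, next_selector):
    """Scan a directory and keep entries accepted by match(name) (hypothetical helper)."""
    def select(path, exists=False):
        try:
            with os.scandir(path) as entries:
                names = sorted(entry.name for entry in entries)
        except OSError:
            # A missing or unreadable directory yields nothing, which is
            # how an unverified literal prefix gets filtered out.
            return
        for name in names:
            if match(name):
                # Entries returned by scandir are known to exist.
                yield from next_selector(os.path.join(path, name), exists=True)
    return select


def terminal_selector(path, exists=False):
    """Yield the path, calling os.lstat() only if it is still unverified."""
    if not exists:
        try:
            os.lstat(path)
        except OSError:
            return
    yield path


def glob_foo_py(root):
    """Rough equivalent of the pattern foo/*.py, relative to root."""
    is_py = lambda name: name.endswith(".py")
    select = literal_selector("foo", wildcard_selector(is_py, terminal_selector))
    return list(select(root))
```

With this layout, a pattern such as `foo/*.py` costs one `os.scandir()` call (on `foo`), and a purely literal pattern costs one `os.lstat()` call, instead of scanning the parent directory to look for `foo`.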

Timings:

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib'))"
    5000 loops, best of 5: 69.4 usec per loop
    20000 loops, best of 5: 13.6 usec per loop
    # --> 5.1x faster

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib/'))"
    5000 loops, best of 5: 73.3 usec per loop
    20000 loops, best of 5: 14.2 usec per loop
    # --> 5.16x faster

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib/*'))"
    1000 loops, best of 5: 362 usec per loop
    1000 loops, best of 5: 301 usec per loop
    # --> 1.2x faster

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib/*/__init__.py'))"
    200 loops, best of 5: 1.18 msec per loop
    1000 loops, best of 5: 273 usec per loop
    # --> 4.32x faster

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib/**/__init__.py'))"
    50 loops, best of 5: 9.46 msec per loop
    50 loops, best of 5: 5.72 msec per loop
    # --> 1.65x faster

    $ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.glob('Lib/pathlib/__init__.py'))"
    1000 loops, best of 5: 210 usec per loop
    20000 loops, best of 5: 14.9 usec per loop
    # --> 14.1x faster
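For anyone wanting to repeat these measurements in one go, the following is a rough sketch using the standard `timeit` module. The helper name `time_glob` and the `repeat`/`number` defaults are assumptions made for this example, and it reports best-of-N microseconds per call rather than reproducing the `python -m timeit` output format. It assumes it is run from a CPython checkout so that `Lib` exists.

```python
import timeit
from pathlib import Path

# The patterns timed in this PR.
PATTERNS = [
    "Lib",
    "Lib/",
    "Lib/*",
    "Lib/*/__init__.py",
    "Lib/**/__init__.py",
    "Lib/pathlib/__init__.py",
]


def time_glob(root, pattern, repeat=5, number=100):
    """Return the best time per call, in microseconds, of list(root.glob(pattern))."""
    timer = timeit.Timer(lambda: list(root.glob(pattern)))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    root = Path.cwd()
    for pattern in PATTERNS:
        print(f"{pattern!r}: {time_glob(root, pattern):.1f} usec per loop")
```

Running the script once on each interpreter build (before and after the change) gives a pair of numbers per pattern that can be compared directly.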

barneygale merged commit 0eb52f5 into python:main Apr 12, 2024
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024
…al parts (python#117732)

Labels: performance, topic-pathlib
