gh-117841: Add C implementation of ntpath.lexists #117842
nineteendo commented Apr 13, 2024 • edited
Co-authored-by: Eryk Sun <eryksun@gmail.com>
…lexists()` Use `os.path.lexists()` rather than `os.lstat()` to test whether paths exist. This is equivalent on POSIX, but faster on Windows.
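A minimal sketch of the substitution this commit message describes. The helper names `exists_via_lstat` and `exists_via_lexists` are made up for illustration and do not come from the PR; the point is only that both checks should give the same answer for a given path.

```python
import os
import tempfile


def exists_via_lstat(path):
    """Older pattern: call os.lstat() and treat any failure as 'missing'."""
    try:
        os.lstat(path)
    except (OSError, ValueError):
        return False
    return True


def exists_via_lexists(path):
    """Newer pattern: let os.path.lexists() answer directly."""
    return os.path.lexists(path)


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.txt")
    missing = os.path.join(tmp, "missing.txt")
    with open(present, "w") as f:
        f.write("x")
    # One (lstat, lexists) pair per path; the two entries should match.
    results = [
        (exists_via_lstat(p), exists_via_lexists(p))
        for p in (present, missing)
    ]
```

Any speed difference on Windows would have to be measured separately; this sketch only demonstrates that the two checks agree.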
eryksun commented Apr 14, 2024 • edited
I'm working on what I hope will be an improved version compared to the first draft. AFAIK, the |
eryksun commented Apr 14, 2024 • edited
Here's the revised implementation of `nt_exists()`:

```c
static PyObject *
nt_exists(PyObject *path, int follow_symlinks)
{
    path_t _path = PATH_T_INITIALIZE("exists", "path", 0, 1);
    HANDLE hfile;
    BOOL traverse = follow_symlinks;
    int result = 0;

    if (!path_converter(path, &_path)) {
        path_cleanup(&_path);
        if (PyErr_ExceptionMatches(PyExc_ValueError)) {
            PyErr_Clear();
            Py_RETURN_FALSE;
        }
        return NULL;
    }

    Py_BEGIN_ALLOW_THREADS
    if (_path.fd != -1) {
        hfile = _Py_get_osfhandle_noraise(_path.fd);
        if (hfile != INVALID_HANDLE_VALUE) {
            result = 1;
        }
    }
    else if (_path.wide) {
        BOOL slow_path = TRUE;
        FILE_STAT_BASIC_INFORMATION statInfo;
        if (_Py_GetFileInformationByName(_path.wide, FileStatBasicByNameInfo,
                                         &statInfo, sizeof(statInfo)))
        {
            if (!(statInfo.FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) ||
                !follow_symlinks &&
                IsReparseTagNameSurrogate(statInfo.ReparseTag))
            {
                slow_path = FALSE;
                result = 1;
            }
            else {
                // reparse point but not name-surrogate
                traverse = TRUE;
            }
        }
        else if (_Py_GetFileInformationByName_ErrorIsTrustworthy(
                     GetLastError()))
        {
            slow_path = FALSE;
        }
        if (slow_path) {
            BOOL traverse = follow_symlinks;
            if (!traverse) {
                hfile = CreateFileW(_path.wide, FILE_READ_ATTRIBUTES, 0, NULL,
                                    OPEN_EXISTING,
                                    FILE_FLAG_OPEN_REPARSE_POINT |
                                    FILE_FLAG_BACKUP_SEMANTICS, NULL);
                if (hfile != INVALID_HANDLE_VALUE) {
                    FILE_ATTRIBUTE_TAG_INFO info;
                    if (GetFileInformationByHandleEx(hfile,
                                                     FileAttributeTagInfo,
                                                     &info, sizeof(info)))
                    {
                        if (!(info.FileAttributes &
                              FILE_ATTRIBUTE_REPARSE_POINT) ||
                            IsReparseTagNameSurrogate(info.ReparseTag))
                        {
                            result = 1;
                        }
                        else {
                            // reparse point but not name-surrogate
                            traverse = TRUE;
                        }
                    }
                    else {
                        // device or legacy filesystem
                        result = 1;
                    }
                    CloseHandle(hfile);
                }
                else {
                    STRUCT_STAT st;
                    switch (GetLastError()) {
                    case ERROR_ACCESS_DENIED:
                    case ERROR_SHARING_VIOLATION:
                    case ERROR_CANT_ACCESS_FILE:
                    case ERROR_INVALID_PARAMETER:
                        if (!LSTAT(_path.wide, &st)) {
                            result = 1;
                        }
                    }
                }
            }
            if (traverse) {
                hfile = CreateFileW(_path.wide, FILE_READ_ATTRIBUTES, 0, NULL,
                                    OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS,
                                    NULL);
                if (hfile != INVALID_HANDLE_VALUE) {
                    CloseHandle(hfile);
                    result = 1;
                }
                else {
                    STRUCT_STAT st;
                    switch (GetLastError()) {
                    case ERROR_ACCESS_DENIED:
                    case ERROR_SHARING_VIOLATION:
                    case ERROR_CANT_ACCESS_FILE:
                    case ERROR_INVALID_PARAMETER:
                        if (!STAT(_path.wide, &st)) {
                            result = 1;
                        }
                    }
                }
            }
        }
    }
    Py_END_ALLOW_THREADS

    path_cleanup(&_path);
    if (result) {
        Py_RETURN_TRUE;
    }
    Py_RETURN_FALSE;
}
```
Co-authored-by: Eryk Sun <eryksun@gmail.com>
nineteendo commented Apr 15, 2024
Are you happy with the new implementation, or do you have more ideas?
nineteendo commented Apr 15, 2024
The performance for existent files has already greatly improved. Nice job!
eryksun commented Apr 15, 2024
That's all I have for now.
serhiy-storchaka left a comment
At first glance looks correct, although I am not happy that we added so much complicated code for pure optimization of functions which should not be used in performance critical code (os.scandir() should be used instead).
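For readers unfamiliar with the `os.scandir()` approach mentioned here, a minimal sketch follows. It assumes the paths to check all live in one directory, and the helper name `existing_names` is hypothetical, not part of the PR or the standard library. Names are compared exactly, so on a case-insensitive file system the caller would need to normalize case first.

```python
import os
import tempfile


def existing_names(directory, candidates):
    """Return the candidate names that are present in `directory`.

    A single os.scandir() pass replaces one existence check per name.
    """
    wanted = set(candidates)
    with os.scandir(directory) as entries:
        return {entry.name for entry in entries if entry.name in wanted}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("x")
    # "c.txt" was never created, so only "a.txt" is reported.
    found = existing_names(tmp, ["a.txt", "c.txt"])
```

Because the directory listing includes symlinks themselves, this behaves like `lexists()` rather than `exists()` for dangling links.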
eryksun commented Apr 15, 2024
The performance of A disappointment with these builtin functions is that they can't be leveraged in Regarding Footnotes
barneygale commented Apr 15, 2024
We can at least use it from I wouldn't mind if we made the |
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
nineteendo commented Apr 26, 2024
cc @zooba
zooba commented May 1, 2024
So I'm hesitant to take this for three reasons (and these do apply to previous enhancements as well, but didn't exist at that time):
If someone can show a scenario where you would have a significant (hundreds+) list of paths, need to check whether they exist, but couldn't use one of I don't think we can really reduce the amount of code. If it happened to be shorter and easier to follow then I'd be less concerned about long-term maintenance, but I'm pretty sure it's as good as it gets (without adding indirection and hurting the performance again - same tradeoff we made with the earlier |
barneygale commented May 1, 2024
On this specifically, one example would be globbing for Lines 508 to 520 in a7711a2
Note that glob results can include dangling symlinks, hence |
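To make the dangling-symlink point concrete, here is a small sketch (not taken from the PR) that contrasts `os.path.exists()` with `os.path.lexists()`. It assumes symlink creation is permitted; where it is not (for example on Windows without the required privilege), the check is skipped and `link_checks` stays `None`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link")
    with open(target, "w") as f:
        f.write("x")
    try:
        os.symlink(target, link)
    except (OSError, NotImplementedError):
        # Symlinks are unavailable here; nothing to compare.
        link_checks = None
    else:
        os.remove(target)  # the link now points at nothing
        # exists() follows the link; lexists() looks at the link itself.
        link_checks = (os.path.exists(link), os.path.lexists(link))
```

When symlinks are available, the pair is `(False, True)`: the link is present even though its target is gone.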
eryksun commented May 2, 2024
The implementation was consolidated with |
nineteendo commented May 2, 2024 • edited
Could we simplify this? It already seems to know whether it's a directory or a file:
Lines 5123 to 5129 in a6b610a
Lines 5220 to 5226 in a6b610a
eryksun commented May 2, 2024 • edited
Yes, the fast path can be simplified. If it's not a reparse point, |
nineteendo commented May 9, 2024
Tracking further in #118755.

ntpath.lexists#117841