Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
WIP: bpo-34990: year 2038 problem in compileall.py#9892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
matrixise commented Oct 15, 2018 • edited by bedevere-bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-bot
Uh oh!
There was an error while loading. Please reload this page.
Use 'L' as an unsigned long for the timestamp
bmwiedemann commented Oct 15, 2018
|
matrixise commented Oct 15, 2018
@bmwiedemann yep, I was in the plane from PySS and I just commited my code, and now, I just saw the test of @tirkarthi . I am going to include it. |
…-01-20 Co-authored-by: Karthikeyan Singaravelan <tir.karthi@gmail.com>
tirkarthi commented Oct 15, 2018
@matrixise Thanks for the PR. I think it's a NEWS entry worthy fix to add a short note that |
tirkarthi commented Oct 15, 2018
@matrixise Sorry, I just noticed you have added one :) |
matrixise commented Oct 15, 2018
@tirkarthi please, could you check my PR with the last commits. |
bmwiedemann commented Oct 15, 2018
btw: the first bad mtime is 0x80000000 or 2147483648 or 2038-01-19T03:14:08+00:00 |
| deftest_compile_all_2038(self): | ||
| withopen(self.source_path, 'r') asf: | ||
| os.utime(f.name, (2147558400, 2147558400)) # Jan 20, 2038 as touch | ||
| self.assertTrue(compileall.compile_file(pathlib.Path(self.source_path))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure, but maybe this is not a good place in this file to add the test in, because SOURCE_DATE_EPOCH changes the default .pyc type to be hash-based, which would not hit the error condition. See #9607
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure, just copy the code of @tirkarthi . I will wait for a review from @vstinner or an other core-dev.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be moved to CompileallTestsBase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tirkarthi commented Oct 15, 2018
@matrixise Thanks, Looks good to me from the context of the issue reported but I have limited understanding about the code so I hope someone else might have a better look. Seems there was some discussion about handling 2038 problem as part of the original PR where this code was added : #4575 (comment) |
bmwiedemann commented Oct 15, 2018
I think the discussion in PR 4575 was about future-proofing the new .pyc format, but it failed to notice that by 2038, only 31 bits are exhausted and bit 32 (which was erroneously interpreted as the sign-bit here), allows to keep the format alive until 2106 - which we can safely assume out-of-scope for today. |
matrixise commented Oct 15, 2018
This is the reason why I move to 'L' instead of 'l' |
serhiy-storchaka left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check also other parts of the stdlib that work with mtime in pyc files. For example zipimport and checkpyc.py need updates. Maybe there are other parts.
Uh oh!
There was an error while loading. Please reload this page.
bedevere-bot commented Oct 16, 2018
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be poked with soft cushions! |
Lib/compileall.py Outdated
| try: | ||
| mtime=int(os.stat(fullname).st_mtime) | ||
| expect=struct.pack('<4sll', importlib.util.MAGIC_NUMBER, | ||
| expect=struct.pack('<4slL', importlib.util.MAGIC_NUMBER, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it would be better to handle the second field as an unsigned integer too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
importlib uses int(mtime) & 0xFFFFFFFF, I suggest... int(mtime) & 0xFFFF_FFFF for readability :-D
matrixise commented Oct 17, 2018
@serhiy-storchaka last update for In https://github.com/python/cpython/blob/master/Lib/zipimport.py#L602 you convert the dword as a uint32 and LGTM. |
matrixise commented Oct 17, 2018
@vstinner and @serhiy-storchaka I have fixed mtime with 0xFFFF_FFFF and use an unsigned long, where I found |
Tools/scripts/checkpyc.py Outdated
| print(repr(name_c)) | ||
| continue | ||
| mtime=get_long(mtime_str) | ||
| mtime=get_long(mtime_str)&0xFFFF_FFFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a NOP, because mtime_str is just 4 bytes, that cannot encode anything larger than 0xFFFF_FFFF
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I am not familiar with the operations on the bits.
serhiy-storchaka left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zipimport needs truncating the timestamp to 32 bit for comparison.
Also update _code_to_timestamp_pyc() in _bootstrap_external.
| continue | ||
| mtime=get_long(mtime_str) | ||
| mtime=get_long(mtime_str)&0xFFFF_FFFF | ||
| ifmtimein{0, -1}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mtime can't be -1 after adding & 0xFFFF_FFFF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add the 0xFFFF_FFFF on the line 57?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
| mtime=int(mtime) | ||
| else: | ||
| mtime=int(-0x100000000+int(mtime)) | ||
| pyc= (importlib.util.MAGIC_NUMBER+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above few lines can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the lines from 38->41?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. They are roughly equivalent to mtime = int(mtime) & 0xFFFF_FFFF if use an unsigned integer format.
| else: | ||
| mtime=int(-0x100000000+int(mtime)) | ||
| pyc= (importlib.util.MAGIC_NUMBER+ | ||
| struct.pack("<iii", 0, int(mtime), size&0xFFFFFFFF) +data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed format units are used.
Tools/scripts/checkpyc.py Outdated
| print(repr(name_c)) | ||
| continue | ||
| mtime=get_long(mtime_str) | ||
| mtime=get_long(mtime_str)&0xFFFF_FFFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sees the definition of get_long(). It returns an unsigned 32-bit integer, and returns -1 only on error. You need to convert not it, but the file timestamp.
Lib/importlib/_bootstrap_external.py Outdated
| data=bytearray(MAGIC_NUMBER) | ||
| data.extend(_pack_uint32(0)) | ||
| data.extend(_pack_uint32(mtime)) | ||
| data.extend(_pack_uint32(mtime)&0xFFFF_FFFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is useless, _pack_uint32() already does "& 0xFFFFFFFF". importlib can be left unchanged.
Lib/test/test_compileall.py Outdated
| data=file.read(12) | ||
| mtime=int(os.stat(self.source_path).st_mtime) | ||
| compare=struct.pack('<4sll', importlib.util.MAGIC_NUMBER, 0, mtime) | ||
| mtime=int(os.stat(self.source_path).st_mtime)&0xFFFF_FFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I like the underscore: it's easier to spot bugs. You missed something :-)
| deftest_compile_all_2038(self): | ||
| withopen(self.source_path, 'r') asf: | ||
| os.utime(f.name, (2147558400, 2147558400)) # Jan 20, 2038 as touch | ||
| self.assertTrue(compileall.compile_file(pathlib.Path(self.source_path))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be moved to CompileallTestsBase.
| @@ -0,0 +1 @@ | |||
| Use an unsigned long for the timestamp for the .pyc file in compileall.py | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You describe the fix, but you should describe the fixed bug instead: compileall now supports timestamps newer than year 2038.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| mtime=get_long(mtime_str) | ||
| ifmtimein{0, -1}: | ||
| print('Bad ".pyc" file', repr(name_c)) | ||
| elifmtime!=st[ST_MTIME]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this code should be modified, but it is modified: should we also cast to int()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like that?
elifint(mtime) !=int(st[ST_MTIME] &0xFFFF_FFFF)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mtime is already int. But st[ST_MTIME] is float.
matrixise commented Oct 17, 2018
@serhiy-storchaka can you continue the review with me, because I am a little bit like a chicken without head ;-) I try to understand with |
_pack_uint32() already does "& 0xFFFFFFFF"
| mtime=int(-0x100000000+int(mtime)) | ||
| pyc= (importlib.util.MAGIC_NUMBER+ | ||
| struct.pack("<iii", 0, int(mtime)&0xFFFF_FFFF, size&0xFFFF_FFFF) +data) | ||
| struct.pack("<iii", 0, int(mtime), size&0xFFFF_FFFF) +data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that you need to update the format string.
| continue | ||
| mtime=get_long(mtime_str) | ||
| mtime=get_long(mtime_str)&0xFFFF_FFFF | ||
| ifmtimein{0, -1}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
| mtime=get_long(mtime_str) | ||
| ifmtimein{0, -1}: | ||
| print('Bad ".pyc" file', repr(name_c)) | ||
| elifmtime!=st[ST_MTIME]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mtime is already int. But st[ST_MTIME] is float.
| @@ -0,0 +1 @@ | |||
| compileall now supports timestamps newer than year 2038 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use punctuation.
| mtime=int(mtime) | ||
| else: | ||
| mtime=int(-0x100000000+int(mtime)) | ||
| pyc= (importlib.util.MAGIC_NUMBER+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. They are roughly equivalent to mtime = int(mtime) & 0xFFFF_FFFF if use an unsigned integer format.
| source_date_epoch=True): | ||
| pass | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change.
brettcannon commented Nov 2, 2018
Due to the WIP in the title I added the DO-NOT-MERGE label. |
matrixise commented Nov 3, 2018
@brettcannon Thank you, I will continue to work on this issue on the next week, this week I am at the hospital with my daughter. |
bmwiedemann commented Jan 21, 2019
Only 19 years to go... Can we get this moving again? |
vstinner commented Jan 22, 2019
Ping @matrixise: they are still some comments of @serhiy-storchaka that you didn't address. |
matrixise commented Feb 4, 2019
@vstinner pong, ok thanks for the reminder. |
matrixise commented Mar 25, 2019
So, I am not serious with this bug, I am going to continue on this bug, I would like to close all my PRs. |
vstinner commented Mar 26, 2019
FYI I ignore this issue since it's still tagged as "WIP". I prefer to wait until it's no longer a WIP to review it. |
brettcannon commented Apr 17, 2019
Removing the import team from reviewing as this is still a WIP. We can be added back when it's ready to go. |
bmwiedemann commented Aug 22, 2019
@matrixise any chance to get this finished this year? |
vstinner commented Aug 22, 2019
It's ok as soon as it's fixed before 2038 :-D |
bmwiedemann commented Aug 22, 2019
I am afraid there are way too many places where software is used unchanged for a long time. Even 18 years. And of course this is not the only timebomb waiting to trigger. |
vstinner commented Aug 22, 2019
It was a joke. Hint: see the emoticon. |
ammaraskar commented Apr 22, 2020
Since Stéphane said:
I opened up #19651 with the more permanent solution of using a 64-bit timestamp value as Benjamin originally suggested in their PEP 552 PR. |
ammaraskar commented Apr 25, 2020
I pinged @matrixise about this and they said I can feel free to take this on since they're too busy to move forward with it. #19708 is a continuation of this PR to make It should be possible to compare the code for both the solutions now. |
tirkarthi commented Aug 24, 2021
I guess this can be close since #19708 is merged |
encukou commented Aug 24, 2021
Indeed. Thanks for noticing. |
Use 'L' as an unsigned long for the timestamp
https://bugs.python.org/issue34990