Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
bpo-34033: distutils: byte_compile() sort files#8057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
vstinner commented Jul 3, 2018 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
vstinner commented Jul 3, 2018
I'm not sure if this change should be documented in What's New in Python 3.8. IMHO it's fine to backport this change to Python 3.7. But I don't think that it's worth it to backport the change to 2.7 and/or 3.6. |
vstinner commented Jul 3, 2018
I created this PR as a follow-up of https://bugs.python.org/issue34022 |
methane left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have some credits on NEWS entry?
serhiy-storchaka commented Jul 3, 2018
Perhaps it is worth to open an issue for discussing this and for reference in the NEWS file. |
vstinner commented Jul 3, 2018
Hum, it changes at least the creation date of the .pyc files, no?
commands/build_py.py can call byte_compile() with more than 1 filename.
A single sort() seems simpler and more reliable than modifying all calling site, especially because distutils is commonly monkey-patched and custom commands are written in setup.py files.
This change is linked to https://bugs.python.org/issue29708 "support reproducible Python builds" which is still open. |
methane commented Jul 3, 2018 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I found source of this patch. https://bugzilla.opensuse.org/show_bug.cgi?id=1049186
Currently, marshal uses refcnt to determine using w_ref or not. Some immutable objects I think we should use more deterministic way instead of refcnt. Maybe, count all constants in the module before marshal, like we did in compiling function for co_consts and co_names. |
mcepl commented Jul 3, 2018 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
@vstinner I think the very original source of this patch is master...distropatches:pycrb36 (i.e., it was created by @bmwiedemann). |
To get reproducible builds, byte_compile() of distutils.util now sorts filenames. Patch coming from OpenSUSE: distutils-reproducible-compile.patch. Co-Authored-By: Bernhard M. Wiedemann <[email protected]>
vstinner commented Jul 3, 2018 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Thank you, I credit him in my commit message and the NEWS entry. |
vstinner commented Jul 3, 2018
Serhiy: How changing the order of compiling Python files affects the reproducibility of the build? Oh, https://bugzilla.opensuse.org/show_bug.cgi?id=1049186 contains the full rationale ;-) |
vstinner commented Jul 3, 2018
@serhiy-storchaka: Do you want me to open a new issue, or is it ok to use https://bugs.python.org/issue29708 ? |
vstinner commented Jul 3, 2018
@methane: "I think we should use more deterministic way instead of refcnt. Maybe, count all constants in the module before marshal, like we did in compiling function for co_consts and co_names." Right, we can enhance marshal, but I don't think that this change is exclusive with this idea. This change is simple, tiny, safe, and cannot make Python less deterministic, no? :-) |
methane commented Jul 3, 2018
Yes. We can mitigate the issue before fixing it. That's why I approved this. |
serhiy-storchaka commented Jul 3, 2018
I think we need to understand the issue better before committing changes. When found the source of unstability of file names, we can find other similar sources and make them stable too. For example if the source is listdir() or glob(), we can consider sorting results of all listdir() or glob() in distutils and related methods. On other side, if the problem is with reference counters in marshal, we can change the marshal module instead. |
vstinner commented Jul 3, 2018
Ok, I created https://bugs.python.org/issue34033 "distutils is not reproducible". |
vstinner commented Jul 4, 2018
I added the scary red DO-NOT-MERGE label since @benjaminp and @serhiy-storchaka want to fix the issue differently: https://bugs.python.org/issue34033#msg320991 I keep the PR open until a different (better) fix is designed. |
vstinner commented Jul 16, 2018
I'm now confused by the issue and I'm no longer sure that this PR adds anything. @methane and others are working on better solutions, so I close my PR. At least, it seems like this PR and https://bugs.python.org/issue34033 helped to highlight the issue ;-) |
https://build.opensuse.org/request/show/994298 by user dirkmueller + dimstar_suse - skip subversion tests, not that relevant to pull in dozens of dependencies into small bootstrap - Add distutils-reproducible-compile.patch to make installed files ordered correctly and thus builds reproducible again (port of the fix for bpo#29708 and gh#python/cpython#8057).
To get reproducible builds, byte_compile() of distutils.util now
sorts filenames.
Patch coming from OpenSUSE: distutils-reproducible-compile.patch.
https://bugs.python.org/issue34033