Skip to content

Conversation

@methane
Copy link
Member

@methanemethane commented Jul 10, 2018

marshal.dumps() tests refcnt(obj)==1 to decide use FLAG_REF or not.

But refcnt of interned string is very unstable.
When compiling same source, refcnt of interned string in the output
may be 1 or >1. It makes FLAG_REF usage unstable.

To help reproducible build, use FLAG_REF for interned string even if
refcnt(obj)==1.

https://bugs.python.org/issue34033

@methanemethane changed the title bpo-34033: marshal: Use FLAG_REF stablybpo-34093: marshal: Use FLAG_REF stablyJul 11, 2018
@serhiy-storchakaserhiy-storchaka self-requested a review July 12, 2018 05:45
@serhiy-storchaka
Copy link
Member

I'm in process of reviewing.

@methanemethane changed the title bpo-34093: marshal: Use FLAG_REF stablybpo-34093: marshal: Stabilize FLAG_REF usageJul 14, 2018
@serhiy-storchaka
Copy link
Member

Do you mind to provide also your initial PR? I want to try to optimize or simplify it.

Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't using FLAG_REF for all interned strings slows down unmarshalling and increases the memory consumption for both marshalling and unmarshalling?

@methane
Copy link
MemberAuthor

Isn't using FLAG_REF for all interned strings slows down unmarshalling and increases the memory consumption for both marshalling and unmarshalling?

For unmarshaling speed, it's possible. But interned string has cost of interning; create temoporary string and calling PyDict_SetDefault(). Overhead of FLAG_REF (PyList_Append) is much smaller than it.

For marshaling memory overhead, it will increase hashtable size for each interned string with refcnt==1.
For unmarshaling memory overhead, it will increase list length too.

I don't expect it can be pragmatic problem though.

@mcepl
Copy link
Contributor

@bmwiedemann Do you know about this one?

@bmwiedemann
Copy link
Contributor

I might have seen this patch 2 years ago. Chances are, I didnt try it, because it did not apply cleanly to the python we had in Tumbleweed.

@mcepl
Copy link
Contributor

mcepl commented Aug 6, 2020

@bmwiedemann Yes, you are right, this is basically non-applicable (especially the monstrosity in these two conflicting importlib*.h files).

@methane Do you think, you would be willing and able to port this PR to something more recent (e.g., 3.8 or master), please?

@methane
Copy link
MemberAuthor

methane commented Aug 7, 2020

importlib.h and importlib_external.h are generated files. You can apply 6c8ea7c cleanly and "make regen-importlib".

@mcepl
Copy link
Contributor

mcepl commented Aug 7, 2020

@bmwiedemann
Copy link
Contributor

@mcepl I tried my multibuildrbk script for a python38-base build there and got:

--- old /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)+++ new /usr/lib64/python3.8/test/test_json/__pycache__/test_scanstring.cpython-38.pyc (hex)@@ -1,6 +1,6 @@ 000002c0 00 00 54 29 02 f5 06 00 00 00 7a f0 9d 84 a0 78 |..T)......z....x| 000002d0 e9 05 00 00 00 7a 08 22 5c 75 30 30 37 62 22 29 |.....z."\u007b")| -000002e0 02 fa 01 7b e9 08 00 00 00 7a 3c 22 41 20 4a 53 |...{.....z<"A JS|+000002e0 02 da 01 7b e9 08 00 00 00 7a 3c 22 41 20 4a 53 |...{.....z<"A JS| 000002f0 4f 4e 20 70 61 79 6c 6f 61 64 20 73 68 6f 75 6c |ON payload shoul| 00000300 64 20 62 65 20 61 6e 20 6f 62 6a 65 63 74 20 6f |d be an object o| 00000310 72 20 61 72 72 61 79 2c 20 6e 6f 74 20 61 20 73 |r array, not a s| --- old /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)+++ new /usr/lib64/python3.8/__pycache__/netrc.cpython-38.pyc (hex)@@ -1,7 +1,7 @@ 000007c0 63 64 65 66 7a 02 20 09 da 01 0a 7a 04 20 09 0d |cdefz. ....z. ..| 000007d0 0a 7a 15 62 61 64 20 74 6f 70 6c 65 76 65 6c 20 |.z.bad toplevel | 000007e0 74 6f 6b 65 6e 20 25 72 3e 04 00 00 00 72 1e 00 |token %r>....r..| -000007f0 00 00 72 21 00 00 00 72 20 00 00 00 72 22 00 00 |..r!...r ...r"..|+000007f0 00 00 72 22 00 00 00 72 20 00 00 00 72 21 00 00 |..r"...r ...r!..| 00000800 00 7a 26 6d 61 6c 66 6f 72 6d 65 64 20 25 73 20 |.z&malformed %s | 00000810 65 6e 74 72 79 20 25 73 20 74 65 72 6d 69 6e 61 |entry %s termina| 00000820 74 65 64 20 62 79 20 25 73 da 05 6c 6f 67 69 6e |ted by %s..login| --- old /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)+++ new /usr/lib64/python3.8/__pycache__/pathlib.cpython-38.pyc (hex)@@ -1,5 +1,5 @@ 000079c0 64 20 27 2e 2e 27 2e 0a 20 20 20 20 20 20 20 20 |d '..'.. | -000079d0 3e 02 00 00 00 72 32 00 00 00 72 a9 00 00 00 4e |>....r2...r....N|+000079d0 3e 02 00 00 00 72 a9 00 00 00 72 32 00 00 00 4e |>....r....r2...N| 000079e0 29 05 72 64 01 00 00 72 69 01 00 00 72 b5 00 00 |).rd...ri...r...| 000079f0 00 72 d5 00 00 00 72 f7 00 00 00 72 44 01 00 00 |.r....r....rD...| 00007a00 72 22 00 00 00 72 22 00 00 00 72 23 00 00 00 da |r"...r"...r#....|

The last 2 could be a different issue of ordering.

@methanemethane requested a review from tiran as a code ownerMay 3, 2022 09:15
@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented May 3, 2022

We started freezing a bunch of stdlib modules last year, which definitely impacts the story here.

Regardless, treating all interned strings as "ref" objects makes sense, so +1 from me.

Copy link
Member

@ericsnowcurrentlyericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@methanemethane changed the title bpo-34093: marshal: Stabilize FLAG_REF usagegh-78214: marshal: Stabilize FLAG_REF usageMay 4, 2022
@methanemethane merged commit 6dcfd6c into python:mainMay 4, 2022
@methanemethane deleted the stable-marshal branch May 4, 2022 01:01
@methane
Copy link
MemberAuthor

This is temporal fix and will be reverted after more complete fix is landed.

Millak pushed a commit to Millak/guix that referenced this pull request Jan 22, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request May 7, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 9, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 9, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 11, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 17, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 27, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jun 29, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Jul 11, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
Z572 pushed a commit to Z572/guix that referenced this pull request Jul 22, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
Z572 pushed a commit to Z572/guix that referenced this pull request Jul 23, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Aug 10, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Aug 28, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
github-actionsbot pushed a commit to guix-ru/guix that referenced this pull request Aug 31, 2024
While Python build was reproducible on a single machine, once multiple file systems entered the picture, it was no longer true. The solution adopted by the upstream (and Debian) was cherry-picked. More info: <python/cpython#8226>. * gnu/packages/python.scm (python-3.10) [source]: Apply reproducibility patch. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Modified-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Change-Id: I0273dc0f8511a7acdcc2b462a26cc29a9756c801
SomberNight added a commit to SomberNight/python-for-android that referenced this pull request Oct 15, 2024
f321x pushed a commit to f321x/spesmilo-python-for-android-fork-fork that referenced this pull request Aug 25, 2025
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-featureA feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants

@methane@serhiy-storchaka@mcepl@bmwiedemann@ericsnowcurrently@gpshead@the-knights-who-say-ni@bedevere-bot