Skip to content

Conversation

@skirpichev
Copy link
Member

@skirpichevskirpichev commented Oct 11, 2024

>>> f"{123_456.123_456:_._f}"# Whole and fractional '123_456.123_456' >>> f"{123_456.123_456:_f}"# Integer component only '123_456.123456' >>> f"{123_456.123_456:._f}"# Fractional component only '123456.123_456' >>> f"{123_456.123_456:.4_f}"# with precision (._4f is illegal now) '123456.123_5' >>> f"{123_456.123_456:.4,f}"# with comma '123456.123,5'

Notes for reviewers.

  • This is a simplest version, using _PyUnicode_InsertThousandsGrouping() like for the integer part. But perhaps we should insert digits in the fractional part, starting from the least significand digit instead, i.e. last example will print 123456.123_5 instead. Second commit puts separators, iterating from the least significand digit of fractional part.
  • The issue thread has several suggestions for grouping: e.g. 3 or 5. 3 was chosen.
  • Maybe grouping length should be configurable (e.g. ._3f - for grouping in three digits).
  • Heh, just for completeness: maybe break compatibility (see this)? I.e. _f will be "use underscore's both in the integer and the fractional part".
  • Also comma as a separator? (Perhaps, it's better to do in a separate pr.)

📚 Documentation preview 📚: https://cpython-previews--125304.org.readthedocs.build/

…floats ```pycon >>> f"{123_456.123_456:_._f}" # Whole and fractional '123_456.123_456' >>> f"{123_456.123_456:_f}" # Integer component only '123_456.123456' >>> f"{123_456.123_456:._f}" # Fractional component only '123456.123_456' >>> f"{123_456.123_456:.4_f}" # with precision '123456.1_235' ```
@skirpichevskirpichevforce-pushed the fmt-frac-grouping-87790 branch from b648865 to 5858d8cCompareOctober 11, 2024 09:04
@skirpichevskirpichev marked this pull request as ready for review October 11, 2024 09:46
Copy link
Contributor

@rruuaanngrruuaanng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@skirpichevskirpichev changed the title gh-87790: support underscore for formatting fractional part of floatsgh-87790: support underscore/comma for formatting fractional part of floatsNov 16, 2024
@skirpichevskirpichev changed the title gh-87790: support underscore/comma for formatting fractional part of floatsgh-87790: support thousands separators for formatting fractional part of floatsNov 16, 2024
@skirpichev
Copy link
MemberAuthor

(JFR, review request on d.p.o: https://discuss.python.org/t/71229)

/* Determine the grouping, separator, and decimal point, if any. */
if (get_locale_info(format->type == 'n' ? LT_CURRENT_LOCALE :
format->thousands_separators,
format->frac_thousands_separator,
Copy link
Contributor

@nineteendonineteendoNov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we going to do with the locale? At the very least I would reject specifying the separator manually:

>>>importlocale>>>locale.setlocale(locale.LC_ALL, 'af_za') >>>print(f"{123_456.123_456:.12,n}") 123.456,123,456# <- 2 decimal points

Would it be a breaking change to do this?

>>>print(f"{123_456.123_456:.12n}") 123.456,123.456

Copy link
MemberAuthor

@skirpichevskirpichevNov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just reject separators with the n type.

Edit:

Would it be a breaking change to do this?

Yes, this will be a compatibility break.

IMO, without a legacy - it would be better to use thousands separators both for integer and fractional part. Then we could do same with n type formatting.

Edit2:

Maybe we should try something more closer to the original implementation? I.e. with a distinct optional frac_grouping option, which can have _ value. If present:

  1. for floating-point format types that means: "apply current settings for thousands separators both to integer and fractional part";
  2. for n type that means: "apply current locale settings for thousands separators both to integer and fractional part".

This is less flexible (you can't put separators to the fractional part only), but will more consistently interact with locale.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not possible to use the underscore separator with n format:

>>> format(1234, "_n") ValueError: Cannot specify '_' with 'n'. 

So I agree that we should reject separators with the n type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See invalid_thousands_separator_type() in parse_internal_render_format_spec().

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vstinner, this check already added. Unfortunately, that means with n type we can have locale-dependent separators only for integer part.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, should we allow different separators? So far it's permitted:

>>> f"{122222.22222:_.7,f}" '122_222.222,220,0'

/* Determine the grouping, separator, and decimal point, if any. */
if (get_locale_info(format->type == 'n' ? LT_CURRENT_LOCALE :
format->thousands_separators,
format->frac_thousands_separator,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not possible to use the underscore separator with n format:

>>> format(1234, "_n") ValueError: Cannot specify '_' with 'n'. 

So I agree that we should reject separators with the n type.

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the multiple updates.

@vstinner
Copy link
Member

@serhiy-storchaka: Would you mind to review this change?


.. productionlist:: format-spec
format_spec: [[`fill`]`align`][`sign`]["z"]["#"]["0"][`width`][`grouping_option`]["." `precision`][`type`]
format_spec: [[`fill`]`align`][`sign`]["z"]["#"]["0"][`width`][`grouping_option`]["." `precision` [`grouping_option`]][`type`]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not accurate. More accurate would be ["." `precision` [`grouping_option`] | "." `grouping_option`] or ["." (`precision` [`grouping_option`] | `grouping_option`)] or ["." [`precision`][`grouping_option`]] with addition that either precision or grouping_option should be specified after ".".

Copy link
MemberAuthor

@skirpichevskirpichevNov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

.. productionlist:: format-spec format_spec: [`options`][`width_and_precision`][`type`] options: [[`fill`]`align`][`sign`]["z"]["#"]["0"] fill: <any character> align: "<" | ">" | "=" | "^" sign: "+" | "-" | " " width_and_precision: [`width_with_grouping`][`precision_with_grouping`] width_with_grouping: [`width`][`grouping_option`] precision_with_grouping: "." (`precision`[`grouping_option`] | `grouping_option`) width: `~python-grammar:digit`+ grouping_option: "_" | "," precision: `~python-grammar:digit`+ type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" : | "G" | "n" | "o" | "s" | "x" | "X" | "%" 

?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant [...] in precision_with_grouping.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

pos++;
}

while (pos<end && Py_ISDIGIT(PyUnicode_READ(kind, data, pos))){

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in the if block.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, which if?

Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this the way we will go, the implementation LGTM.

@skirpichev
Copy link
MemberAuthor

@vstinner, @serhiy-storchaka should we ask someone else to review this pr?

@vstinnervstinner merged commit f39a07b into python:mainFeb 25, 2025
42 checks passed
@vstinner
Copy link
Member

I merged your PR, thanks.

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

@skirpichev@vstinner@serhiy-storchaka@nineteendo@rruuaanng