Skip to content

Fix: Unexpected behaviour: parse_numbers() doesn't handle space as a grouping symbol, however, parse_decimal() does #1061#1062

Closed
sebas-inf wants to merge 0 commit intopython-babel:masterfrom
sebas-inf:master
Closed

Fix: Unexpected behaviour: parse_numbers() doesn't handle space as a grouping symbol, however, parse_decimal() does #1061#1062
sebas-inf wants to merge 0 commit intopython-babel:masterfrom
sebas-inf:master

Conversation

@sebas-inf
Copy link
Copy Markdown

Cleaned the string to ensure that it still works if there is a non-breakable space, \xa0, in the number and separed the logic so that it is easeir to understand.

Copy link
Copy Markdown
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've committed and pushed your entire .venv – please clean that up first.

@sebas-inf
Copy link
Copy Markdown
Author

You've committed and pushed your entire .venv – please clean that up first.

Should I just push the file I changed or everything except the .venv?

@akx
Copy link
Copy Markdown
Member

akx commented Jan 31, 2024

Should I just push the file I changed or everything except the .venv?

Your commit(s) should only have relevant changes.

@sebas-inf
Copy link
Copy Markdown
Author

Should I just push the file I changed or everything except the .venv?

Your commit(s) should only have relevant changes.

Ok, I will push the file with the fix only then.

@akx
Copy link
Copy Markdown
Member

akx commented Jan 31, 2024

You still have 755 files in this pull request. You'll probably need to git reset 40e60a1f6cf178d9f57fcc14f157ea1b2ab77361 (the current master), then carefully git add only the changes you want, git commit them, and git push --force-with-lease them to your repository.

@sebas-inf
Copy link
Copy Markdown
Author

You still have 755 files in this pull request. You'll probably need to git reset 40e60a1f6cf178d9f57fcc14f157ea1b2ab77361 (the current master), then carefully git add only the changes you want, git commit them, and git push --force-with-lease them to your repository.

should i do all of this within the terminal in vscode where i was working?

@akx
Copy link
Copy Markdown
Member

akx commented Jan 31, 2024

should i do all of this within the terminal in vscode where i was working?

Maybe. I don't use Visual Studio Code much. Visual Studio Code might have other tools for backing out commits.

@sebas-inf
Copy link
Copy Markdown
Author

sebas-inf commented Jan 31, 2024

should i do all of this within the terminal in vscode where i was working?

Maybe. I don't use Visual Studio Code much. Visual Studio Code might have other tools for backing out commits.

It worked, I made it so i only pushed the numbers.py file.

Copy link
Copy Markdown
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to err on the side of caution here, if there is the remotest chance this could cause mis-parsing of numbers (e.g. something that couldn't be parsed before being now accidentally parsed as something it's not intended to be parsed as).

To that end, maybe this function should also have a strict=True parameter like parse_decimal does, and then maybe the grouping character logic could be shared between the two in non-strict mode?

babel/numbers.py Outdated
group_symbol = get_group_symbol(locale, numbering_system=numbering_system)

# Replace non-breakable spaces with the group symbol
cleaned_string = string.replace('\xa0', group_symbol)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test case (either a doctest or a real test in tests/) that uses an \xa0 character?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be a valid test case for that: >>> parse_number('1 099', locale='ru')

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My human eyes can't tell if that space is a \xA0. You could spell it out there, if it really is one.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made some changes applying some of the logic from parse_decimal and it says its passing the test cases on my end

babel/numbers.py Outdated
def parse_number(
string: str,
locale: Locale | str | None = LC_NUMERIC,
strict: bool = False,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new argument into the docstring. strict mode should probably be the default here, and the new behavior (which allows more loose strings to be parsed) should be opt-in.

babel/numbers.py Outdated
1099
>>> parse_number('1.099', locale='de_DE')
1099
>>> parse_number('1 099', locale='ru')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent is to test a string that has a non-breaking space character, make it obvious:

Suggested change
>>> parse_number('1 099', locale='ru')
>>> parse_number('1\xa0099', locale='ru')

babel/numbers.py Outdated
Comment on lines +1044 to +1085
decimal_symbol = get_decimal_symbol(locale, numbering_system=numbering_system)

try:
parsed = decimal.Decimal(string.replace(group_symbol, '')
.replace(decimal_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(f"{string!r} is not a valid number") from exc

proper = format_decimal(parsed, locale=locale, decimal_quantization=False, numbering_system=numbering_system)

if string != proper and proper != _remove_trailing_zeros_after_decimal(string, decimal_symbol):
try:
parsed_alt = decimal.Decimal(string.replace(decimal_symbol, '')
.replace(group_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
) from exc
else:
proper_alt = format_decimal(
parsed_alt,
locale=locale,
decimal_quantization=False,
numbering_system=numbering_system,
)
if proper_alt == proper:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
)
else:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}? Or maybe {proper_alt!r}?",
suggestions=[proper, proper_alt],
)

# Remove other spaces and replace the group symbol with an empty string
cleaned_string = string.replace(' ', '').replace(group_symbol, '')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_number is not meant to parse decimal numbers, only ever integers, anything decimal-related shouldn't be here at all.

If anything, we should check for the existence of a decimal separator in the input in strict mode, and fail if it is there.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the changes in the latest commit

babel/numbers.py Outdated

# If strict mode is enabled, handle irregular formatting
if strict and group_symbol in string:
# Your additional logic for strict mode here
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a strange comment... 😁

@sebas-inf sebas-inf requested a review from akx February 6, 2024 02:41
@akx
Copy link
Copy Markdown
Member

akx commented Apr 24, 2024

This will require rebasing, and there are still unresolved comments.

@akx
Copy link
Copy Markdown
Member

akx commented Jul 11, 2024

@sebas-inf If you're still around, please see the comments and rebase this to resolve conflicts. Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants