Fix: Unexpected behaviour: parse_numbers() doesn't handle space as a grouping symbol, however, parse_decimal() does #1061#1062
Conversation
akx
left a comment
There was a problem hiding this comment.
You've committed and pushed your entire .venv – please clean that up first.
Should I just push the file I changed or everything except the .venv? |
Your commit(s) should only have relevant changes. |
Ok, I will push the file with the fix only then. |
|
You still have 755 files in this pull request. You'll probably need to |
should i do all of this within the terminal in vscode where i was working? |
Maybe. I don't use Visual Studio Code much. Visual Studio Code might have other tools for backing out commits. |
It worked, I made it so i only pushed the numbers.py file. |
akx
left a comment
There was a problem hiding this comment.
I would like to err on the side of caution here, if there is the remotest chance this could cause mis-parsing of numbers (e.g. something that couldn't be parsed before being now accidentally parsed as something it's not intended to be parsed as).
To that end, maybe this function should also have a strict=True parameter like parse_decimal does, and then maybe the grouping character logic could be shared between the two in non-strict mode?
babel/numbers.py
Outdated
| group_symbol = get_group_symbol(locale, numbering_system=numbering_system) | ||
|
|
||
| # Replace non-breakable spaces with the group symbol | ||
| cleaned_string = string.replace('\xa0', group_symbol) |
There was a problem hiding this comment.
Could you add a test case (either a doctest or a real test in tests/) that uses an \xa0 character?
There was a problem hiding this comment.
would this be a valid test case for that: >>> parse_number('1 099', locale='ru')
There was a problem hiding this comment.
My human eyes can't tell if that space is a \xA0. You could spell it out there, if it really is one.
There was a problem hiding this comment.
made some changes applying some of the logic from parse_decimal and it says its passing the test cases on my end
babel/numbers.py
Outdated
| def parse_number( | ||
| string: str, | ||
| locale: Locale | str | None = LC_NUMERIC, | ||
| strict: bool = False, |
There was a problem hiding this comment.
Please add the new argument into the docstring. strict mode should probably be the default here, and the new behavior (which allows more loose strings to be parsed) should be opt-in.
babel/numbers.py
Outdated
| 1099 | ||
| >>> parse_number('1.099', locale='de_DE') | ||
| 1099 | ||
| >>> parse_number('1 099', locale='ru') |
There was a problem hiding this comment.
If the intent is to test a string that has a non-breaking space character, make it obvious:
| >>> parse_number('1 099', locale='ru') | |
| >>> parse_number('1\xa0099', locale='ru') |
babel/numbers.py
Outdated
| decimal_symbol = get_decimal_symbol(locale, numbering_system=numbering_system) | ||
|
|
||
| try: | ||
| parsed = decimal.Decimal(string.replace(group_symbol, '') | ||
| .replace(decimal_symbol, '.')) | ||
| except decimal.InvalidOperation as exc: | ||
| raise NumberFormatError(f"{string!r} is not a valid number") from exc | ||
|
|
||
| proper = format_decimal(parsed, locale=locale, decimal_quantization=False, numbering_system=numbering_system) | ||
|
|
||
| if string != proper and proper != _remove_trailing_zeros_after_decimal(string, decimal_symbol): | ||
| try: | ||
| parsed_alt = decimal.Decimal(string.replace(decimal_symbol, '') | ||
| .replace(group_symbol, '.')) | ||
| except decimal.InvalidOperation as exc: | ||
| raise NumberFormatError( | ||
| f"{string!r} is not a properly formatted decimal number. " | ||
| f"Did you mean {proper!r}?", | ||
| suggestions=[proper], | ||
| ) from exc | ||
| else: | ||
| proper_alt = format_decimal( | ||
| parsed_alt, | ||
| locale=locale, | ||
| decimal_quantization=False, | ||
| numbering_system=numbering_system, | ||
| ) | ||
| if proper_alt == proper: | ||
| raise NumberFormatError( | ||
| f"{string!r} is not a properly formatted decimal number. " | ||
| f"Did you mean {proper!r}?", | ||
| suggestions=[proper], | ||
| ) | ||
| else: | ||
| raise NumberFormatError( | ||
| f"{string!r} is not a properly formatted decimal number. " | ||
| f"Did you mean {proper!r}? Or maybe {proper_alt!r}?", | ||
| suggestions=[proper, proper_alt], | ||
| ) | ||
|
|
||
| # Remove other spaces and replace the group symbol with an empty string | ||
| cleaned_string = string.replace(' ', '').replace(group_symbol, '') |
There was a problem hiding this comment.
format_number is not meant to parse decimal numbers, only ever integers, anything decimal-related shouldn't be here at all.
If anything, we should check for the existence of a decimal separator in the input in strict mode, and fail if it is there.
There was a problem hiding this comment.
made the changes in the latest commit
babel/numbers.py
Outdated
|
|
||
| # If strict mode is enabled, handle irregular formatting | ||
| if strict and group_symbol in string: | ||
| # Your additional logic for strict mode here |
|
This will require rebasing, and there are still unresolved comments. |
|
@sebas-inf If you're still around, please see the comments and rebase this to resolve conflicts. Thanks :) |
Cleaned the string to ensure that it still works if there is a non-breakable space, \xa0, in the number and separed the logic so that it is easeir to understand.