Fix #75035: Fix 5+ digit years in ISO 8601 date strings#2672
Fix #75035: Fix 5+ digit years in ISO 8601 date strings#2672adsr wants to merge 1 commit intophp:masterfrom
Conversation
This patch adds support for parsing (unserializing, constructing) ISO 8601 date strings with 5+ digit years, consistent with our already existing support for serializing and formatting these types of dates. The standard allows implementations to use 5+ digit years so long as the sender and receiver agree on the format. The patch is at odds with tests for bugfix #62852, which asserts that a 5 digit year should fail to parse. Reviewing the original bug report, it seems to be more related to invalid timezone data. I didn't dig into it very far. As a bonus, the patch also ensures that ISO 8601 years -1 thru -999 are formatted with 4 digits (instead of 3 digits) prefixed by a minus sign. I believe this is required by the standard. Additionally it fixes an overflow error that causes bad year formatting. I focused only on ISO 8601 for this patch, but the same `long long -> int` overflow exists in other date formats, e.g., RFC 2822 dates. I regenerated `ext/date/lib/parse_date.c` manually with `$ re2c -d -b -o ...`. I'm not sure if that's correct.
derickr
left a comment
There was a problem hiding this comment.
Changes to timelib need to go here: https://github.com/derickr/timelib
| break; | ||
| case 'c': { | ||
| const char *fmt = t->y < 0 && t->y > -1000 | ||
| /* Ensure 4 digits for years -1 thru -999 */ |
There was a problem hiding this comment.
Is there a specific rubric for distinguishing between bugfix and bc break? I view this more as a bugfix than a bc break as the docs advertise ISO 8601 support. Ditto for the overflow fix. Thanks.
There was a problem hiding this comment.
It's also against the ISO standard, which says a minimum of four digits optionally prefixed by a - (or +): https://en.wikipedia.org/wiki/ISO_8601#Years
|
@derickr should this be closed ? Is it the kind of BC break you will never be happy with, or can we have it in 7.3 ? |
|
The BC break is in the formatting, and the datex rules need to go to timelib upstream, so yes, this PR should be closed (for now). I'll have a look at whether the datex rules breaks any tests on any other integration that uses timelib. |
| date.timezone=GMT | ||
| --FILE-- | ||
| <?php | ||
| $s2 = 'O:3:"Foo":3:{s:4:"date";s:20:"10007-06-07 03:51:49";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}'; |
There was a problem hiding this comment.
This test should not be removed, but instead, a different wrong date string should be used that is unparsable.
|
@derickr Let me know if you'd like me to assist / file a pr on timelib. Thanks. |
|
@adsr Sure - just make sure to add tests and that all existing tests pass. |
|
We can fix this bug in 7.3 if derickr/timelib#16 makes it into timelib 2018.01. Can we re-open? @krakjoe @derickr |
This patch adds support for parsing (unserializing, constructing)
ISO 8601 date strings with 5+ digit years, consistent with our
already existing support for serializing and formatting these
types of dates. The standard allows implementations to use 5+ digit
years so long as the sender and receiver agree on the format.
The patch is at odds with tests for bugfix #62852, which asserts that
a 5 digit year should fail to parse. Reviewing the original bug
report, it seems to be more related to invalid timezone data. I
didn't dig into it very far.
As a bonus, the patch also ensures that ISO 8601 years -1 thru -999
are formatted with 4 digits (instead of 3 digits) prefixed by a minus
sign. I believe this is required by the standard.
Additionally it fixes an overflow error that causes bad year
formatting. I focused only on ISO 8601 for this patch, but the same
long long -> intoverflow exists in other date formats, e.g., RFC2822 dates.
I regenerated
ext/date/lib/parse_date.cmanually with$ re2c -d -b -o .... I'm not sure if that's correct.