Fix #77565: Incorrect locator detection in ZIP-based phars #6507
cmb69 wants to merge 4 commits into php:PHP-7.4
Conversation
We must not assume that the first end of central dir signature in a ZIP archive actually designates the end of central directory record, since the data in the archive may contain arbitrary byte patterns. Thus, we should search from the end of the data, which is also slightly more efficient.
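To illustrate the difference between the two search directions, here is a minimal sketch. The helpers `find_sig_first` and `find_sig_last` and the constant `EOCD_SIG` are hypothetical names made up for this example; they are not the functions touched by the patch. The sketch only locates the 4-byte signature `PK\5\6` and does no further validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not part of the patch. */
static const char EOCD_SIG[4] = {'P', 'K', 5, 6};

/* Forward scan: returns the FIRST occurrence of the signature, or NULL.
 * This is the assumption the description argues against. */
static const char *find_sig_first(const char *s, size_t n)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i + sizeof(EOCD_SIG) <= n; i++) {
        if (memcmp(s + i, EOCD_SIG, sizeof(EOCD_SIG)) == 0) {
            return s + i;
        }
    }
    return NULL;
}

/* Backward scan: returns the LAST occurrence of the signature, or NULL. */
static const char *find_sig_last(const char *s, size_t n)
{
    size_t i;

    assert(s != NULL);
    if (n < sizeof(EOCD_SIG)) {
        return NULL;
    }
    /* i runs from n - 4 down to 0 without unsigned underflow. */
    for (i = n - sizeof(EOCD_SIG) + 1; i-- > 0; ) {
        if (memcmp(s + i, EOCD_SIG, sizeof(EOCD_SIG)) == 0) {
            return s + i;
        }
    }
    return NULL;
}
```

If file data earlier in the archive happens to contain the same four bytes, the forward scan stops there, while the backward scan reaches the occurrence closest to the end of the data.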
    static char *phar_find_eocd(const char *s, size_t n)
    {
        const char *end = s + n + sizeof("PK\5\6") - 1 - sizeof(phar_zip_dir_end);
Because that marker string might (theoretically) be part of the directory record. This code makes sure that we really get the start of the end of central directory record.
I don't think this really guarantees it either ... say it's at the start of a 255 byte trailing comment. The -sizeof(phar_zip_dir_end) won't skip over that. Or am I misunderstanding what you mean here?
Ugh, you're right! It seems to me that the only way to reliably detect the end of central directory header would be to read through all headers and data from the beginning of the file. Anyhow, I'm going to commit a mitigation for the current approach; maybe this is reasonably sufficient? With that change, two tests fail due to different errors; these would need to be fixed if we're going that route.
This approach looks okay to me. Personally I'd start at end = s + n and then check eocd_start + sizeof(phar_zip_dir_end) <= p + n before accessing comment_len ... your current code is safe, but it took me a moment to understand that this is guaranteed by the chosen start position.
Yeah, I see that might be confusing; I added a corresponding assertion, and also adapted the tests.
There is no way to detect the end of central directory signature by searching from the end of the ZIP archive with absolute certainty, since the signature could be part of the trailing comment. To mitigate this, we check that the comment length is consistent with the position found, although in rare cases that might still not be the correct position.
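The mitigation described here can be sketched roughly as follows. This is not the code from the patch: `find_eocd_checked` and the two `EOCD_*` constants are hypothetical names, and the sketch assumes the check is that the fixed 22-byte record plus its comment ends exactly at the end of the data. The 16-bit little-endian comment length at offset 20 of the record follows the ZIP format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants for this sketch (values follow the ZIP format). */
#define EOCD_FIXED_SIZE         22  /* record size without the comment */
#define EOCD_COMMENT_LEN_OFFSET 20  /* 16-bit little-endian field */

/* Search backwards for a signature whose comment length is consistent
 * with its position; returns NULL if no candidate qualifies. */
static const char *find_eocd_checked(const char *s, size_t n)
{
    static const char sig[4] = {'P', 'K', 5, 6};
    size_t i;

    assert(s != NULL);
    if (n < EOCD_FIXED_SIZE) {
        return NULL;
    }
    /* Starting at n - 22 guarantees that the whole fixed record, and thus
     * the comment length field, lies inside the buffer. */
    for (i = n - EOCD_FIXED_SIZE + 1; i-- > 0; ) {
        const unsigned char *p = (const unsigned char *)s + i;
        size_t comment_len;

        if (memcmp(p, sig, sizeof(sig)) != 0) {
            continue;
        }
        comment_len = (size_t)p[EOCD_COMMENT_LEN_OFFSET]
                    | ((size_t)p[EOCD_COMMENT_LEN_OFFSET + 1] << 8);
        if (i + EOCD_FIXED_SIZE + comment_len == n) {
            /* Plausible, but still not certain (see above). */
            return s + i;
        }
    }
    return NULL;
}
```

A signature embedded in the trailing comment is skipped unless the two bytes at its offset 20 happen to encode a length that also ends exactly at the end of the data, which is the rare case the commit message refers to.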