Skip to content

gh-121650: Encode newlines in headers, and verify headers are sound#122233

Merged
encukou merged 12 commits intopython:mainfrom
encukou:gh-121650-email-newlines-in-headers
Jul 30, 2024
Merged

gh-121650: Encode newlines in headers, and verify headers are sound#122233
encukou merged 12 commits intopython:mainfrom
encukou:gh-121650-email-newlines-in-headers

Conversation

@encukou
Copy link
Member

@encukou encukou commented Jul 24, 2024

Re: #121812

Hello @basbloemsaat,

I've spent the day reading through the email module, and RFCs, and I believe I found a better place to fix the issue.
This involved lots of experimentation, so I'm sending an alternative PR rather than a review on yours.

  • The generator (writer) verifies that the representation of each header is sound (a parser won't treat it as multiple headers, start-of-body, or part of another header). That should cover custom fold() implementations or Header subclasses.

    • However, some user out there is probably misusing such header injection in working code, so, I added a policy attribute to turn it back.
  • Newlines are encoded in fold(), just like undecodable bytes and other special characters.

Overall, this means that we treat newlines as valid content of headers, but “escape” them when such a header is serialized to text.

This PR is a proof of concept. It needs tests and documentation, but I'm out of time for today, and I wanted to share what I have.

Does this look reasonable to you?

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants