Modernize util/strencodings and util/string: string_view and optional#24764
Modernize util/strencodings and util/string: string_view and optional#24764sipa wants to merge 10 commits intobitcoin:masterfrom
string_view and optional#24764Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Concept ACK
Returning a string_view can be somewhat dangerous (in combination with auto? also when the original string goes out of scope, especially if it's a temporary), so giving these functions new names is a good idea. |
maflcko
left a comment
There was a problem hiding this comment.
Returning a string_view can be somewhat dangerous (in combination with auto? also when the original string goes out of scope, especially if it's a temporary), so giving these functions new names is a good idea.
A new name is good, but doesn't solve the problem. I think this can be fixed with an attribute: #20493
I think this can be merged, but I left a few nits and questions.
review ACK 6e1f14780452d956503f1dbc3c91b17b44287f77 💢
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
review ACK 6e1f14780452d956503f1dbc3c91b17b44287f77 💢
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUghZAv+JPOBXiEZ9L1jCF9HPLDO1vGLEuvq6hw6EkaMSKd0Zn3UaiYvP98jPBMZ
HzTidM3yPWwDfLw3QVTlPCdt0J8wBAwOg5ZgjGRZ7fu1hn0SZ93q960giU68DbeJ
ZZCF4fAYPOkTqf1a06DnV7anglhEIG/JXMg+XESTvUGmtdKnP48sdwek4NZOw1h7
PcA+Ck38qlb68W38HQMwsQYr/aO2Ftc2Utw2GMv3Kkr7nD6kIwtuH9WOTWc8Kw2H
vr5NeRfQS1smm9rP+VrixKmCaC1h37r6NMuXDqh/MRNj77ONwmLDLWES/4bP9ioc
w814fQled+vIue+Mv5yBUNeGZmlOryQsC7wJwNkiRNKxG5H+bSn2KUKEtZAGjHpv
rfNI9X9nbDCvDQdHemrYCessHaCfGFCMeUDC4PWDZ0uNaZSC6RrU0y8pESeMOzxe
BJJxSpq0sHCOAVt8LKGyumBlma/k2+n9d6UNo3++K/am9Uxei4kFgLJ0QebpXKqm
NqMH+EA3
=/7qh
-----END PGP SIGNATURE-----
Agree, I just meant that putting a string_view-returning function as a drop-in replacement would have been a very bad idea, a new function with a new name means that people using it can reason about the risks anew. Sure, if an attribute can help avoid such problems more widely that's of course even better. |
|
review ACK e4092c5 🐤 Only changes:
Show signatureSignature: |
|
Concept ACK - nits:
|
Yeah, I'll separate them (now that we figured out the issue with ignoring the pfInvalid flag in the http auth logic).
That would be incorrect. Hex numbers don't need to be a multiple of 2 characters (0xF is a valid hex number, e.g.). |
In addition, to make sure that no call site ignores the invalid decoding status, make the pf_invalid argument mandatory.
Base32/base64 are mechanisms for encoding binary data. That they'd decode to a string is just bizarre. The fact that they'd do that based on the type of input arguments even more so.
maflcko
left a comment
There was a problem hiding this comment.
re-ACK 5ea072e only effective change is adding a "return false" 🌷
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
re-ACK 5ea072ed08b43cdfdf8554597dcf8c1a51ef4dae only effective change is adding a "return false" 🌷
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUgUyQwAhIAVoAgFscYNynT/fORwnmC2JJczY8Zne+7IKHyQadrQGB3uGARyJceI
WDRjN7aLHHJwFmFZ/Lz4HGa5sZYwAUC/Vz9y2eX0nu2PInfZgdtNbx8JT0cx5wkX
03LWXiek/qiy6BeEctpE1nP37iSAqB4E18yEhCMqdLe+m7KCq/rTZN2XdF/R3DHK
j9mqyyy7M5dYQUE2dViBdqTydSrhGIhw2HqHXytP8/urfFB04wLXeJHNDZ6lRfZ1
zq28KVL2FOuFXdYTIAtdeCAFAXFgwiZ2lYik1+aftCBBO9BdBnmKqRD0/ERAC2u8
TfVPFMQECHbWxcxGwZGBjhNhcXGI2fVUF2g4SqJVssqPnG53GcSRvN6H0tcKDAH7
g1x6fqDibKBIUXcrTp8l5tlZYknN2wU0wq9M7U6uJIt5gGWOTlpYpbQsMIlzg6Ls
omRKFXjTXHvFmO8eilHMaMVk6qY/12MjoTyHZ1JwN7TWHRxH6Q9XNTFeq1VJE/qd
uqaqLwJe
=bejp
-----END PGP SIGNATURE-----
| * Check if a string does not contain any embedded NUL (\0) characters | ||
| */ | ||
| [[nodiscard]] inline bool ValidAsCString(const std::string& str) noexcept | ||
| [[nodiscard]] inline bool ValidAsCString(std::string_view str) noexcept |
There was a problem hiding this comment.
I think the name ValidAsCString is now a bit dangerous. I'd assume that it returns true when the argument is 0 terminated at the end (which was the case for the const std::string& str argument), but with a std::string_view argument this might not be the case.
So now even when ValidAsCString(str) returns true, it is not allowed to use e.g. strlen(str.data()).
I would rename the method to something like ContainsNoNUL to make it more clear what's checked, or keep the const std::string& argument for now
There was a problem hiding this comment.
According to the documentation, it only checks if a string has embedded \0 characters. That std::string is zero-terminated internally is an implementation detail, the caller is still responsible for passing c_str() (and never .data()) when passing it to a function that takes a C string.
it is not allowed to use e.g. strlen(str.data()).
I'd really frown on code like that in any case.
Edit: The only time it'd be remotely acceptable to pass .data() as a C string would be if there is an embedded \0 in the array, which is exactly what this function avoids 😄
There was a problem hiding this comment.
well, maybe it's just me, but from the interface ValidAsCString(std::string_view str) I'd guess that it returns true when it actually finds a \0 somewhere, but it's exactly the opposite 🙂
There was a problem hiding this comment.
I have no objection to adding a scripted-diff to rename ValidAsCString to ContainsNoNUL. Either in this pull or as a follow-up.
|
🐙 This pull request conflicts with the target branch and needs rebase. Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft". |
|
I am happy to pick this up if you are no longer working on it |
|
@MarcoFalke Go for it. |
Make use of
std::string_viewandstd::optionalin the util/{strencodings, string} files.This avoids many temporary string/vector objects being created, while making the interface easier to read. Changes include:
std::string_viewinstead ofstd::string.RemovePrefixViewandTrimStringViewwhich also returnstd::string_viewobjects (the correspondingRemovePrefixandTrimStringkeep returning anstd::string, as that's needed in many call sites still).std::stringfromDecodeBase32andDecodeBase64, but return vectors. Base32/64 are fundamentally algorithms for encoding bytes as strings; returningstd::stringfrom those (especially doing it conditionally based on the input arguments/types) is just bizarre.bool* pf_invalidoutput argument pointer inDecodeBase32andDecodeBase64; return anstd::optionalinstead.DecodeBase32andDecodeBase64more efficient by doing the conversion from characters to integer symbols on-the-fly rather than through a temporary vector.