optionally warn when encountering tabs in whitespace#1589
optionally warn when encountering tabs in whitespace#1589parsonsmatt merged 10 commits intoyesodweb:masterfrom
Conversation
7eee714 to
e4cc2bf
Compare
| let (warnings, res) = runConfiguredParser defaultParserSettings initialExtraState (some anyToken) "" s | ||
| case res of | ||
| Left peb -> | ||
| (warnings, Left peb) | ||
| Right (tokens, _acc) -> (warnings, Right tokens) |
There was a problem hiding this comment.
tbh I think I would have this discard warnings for this function and another function tokenizeWithWarnings that returns warnings
This allows you to keep most of the tests the same, but then have new tests with new behavior that should have warnings
There was a problem hiding this comment.
Hmm — yeah, we could do that, but I'd advocate for keeping it the way it is now, because IMO not having warnings is part of the "successful parse" behavior that the tests are intended to exercise.
i.e., if we were to in the future make a change that inadvertently caused the parse task in one of the existing tests to generate a warning, we would want that test to start failing.
| let cpr = parse ps s | ||
| case cpr of | ||
| let (warnings, res) = parse ps s | ||
| _ <- reportWarning $ renderWarnings warnings |
There was a problem hiding this comment.
would suggest reporting as multiple warnings
| _ <- reportWarning $ renderWarnings warnings | |
| _ <- traverse (reportWarning . renderWarning) warnings |
| ( ParserSettings (..) | ||
| , defaultParserSettings |
There was a problem hiding this comment.
We'll need to expose a safe interface for ParserSettings from Database.Persist.Quasi
Suggest creating new module Database.Persist.Quasi.ParserSettings which defines the interface + Database.Persist.Quasi.ParserSettings.Internal which exports everything
| renderWarnings :: [ParserWarning] -> String | ||
| renderWarnings warnings = intercalate "\n" $ fmap parserWarningMessage warnings | ||
|
|
||
| -- Attempts to parse with a provided parser. If it fails with an error matching |
There was a problem hiding this comment.
| -- Attempts to parse with a provided parser. If it fails with an error matching | |
| -- | Attempts to parse with a provided parser. If it fails with an error matching |
| -> Parser a | ||
| -> Parser a | ||
| -> Parser a | ||
| tryOrRegisterError msg f l r = do |
There was a problem hiding this comment.
If using single variable names, it is idiomatic to use p with a predicate (ie a -> Bool)
| -- Due to backtracking, it is possible that we have accumulated duplicate warnings. | ||
| -- For example, if two parsers which depend on the same subparser both attempt to | ||
| -- parse the same span of input, any warnings generated by that subparser will be | ||
| -- registered twice. | ||
| filteredWarnings = Set.toList $ Set.fromList warnings |
There was a problem hiding this comment.
Perhaps instead of using [Warning] in the WRiterT you use Set Warning? Then you get uniqueness by construction and more efficient additions anyway
There was a problem hiding this comment.
Oh yeah, of course — that's way better
There was a problem hiding this comment.
I'll still convert it back to a list at this point I think? since we'll want it to be a functor later
There was a problem hiding this comment.
Do we need Functor specificlaly? We can do Set.map
There was a problem hiding this comment.
Is that order-preserving?
There was a problem hiding this comment.
well, I guess it doesn't matter — we've thrown away the order by using Set in the first place
There was a problem hiding this comment.
So I guess the right thing to do is to sort them by source position after retrieving them from the set and before presenting them
| import Database.Persist.Quasi.Internal | ||
| import Database.Persist.Quasi.Internal.ModelParser |
There was a problem hiding this comment.
so these Internal imports are a bit smelly - can you make them explicit + ensure that anything we really need is available from Database.Persist.Quasi ?
There was a problem hiding this comment.
So at minimum we need to import ModelParser here in order to test it. We could expose ModelParser from Database.Persist.Quasi simply in order to avoid importing Internal, but I'm a bit reluctant to do that because I can't think of any legitimate reason that a consumer of this library would need to interact with ModelParser.
It must be exported from somewhere or it can't be used in these tests, but leaving it in an Internal module is a way to signal that it shouldn't be used. The library already seems to have the pattern of exposing some things needed in tests from Internal modules; see import Database.Persist.EntityDef.Internal on line 17.
I think at minimum explicitly importing the stuff we need from Internal is a good idea, but I have a slight preference for not exposing ModelParser from Database.Persist.Quasi.
There was a problem hiding this comment.
Yes - as discussed elsewhere, I'm fine importing internal modules, but I think we should try to use explicit imports and ensure that we're able to get what we need for a public API from the Database.Persist.Quasi
parsonsmatt
left a comment
There was a problem hiding this comment.
Code and imlpementation look great! A few comments on module organization
| @@ -0,0 +1,91 @@ | |||
| module Database.Persist.Quasi.PersistSettings where | |||
There was a problem hiding this comment.
ok naming conventions: without Internal in the name this whole module is public API
suggest doing, like,
module Database.Persist.Quasi.PersistSettings
( public
, api
, goes
, here
) where
...
module Database.Persist.Quasi.PersistSettings.Internal where
-- export everything
There was a problem hiding this comment.
also love putting this in it's own module
| let cpr = parse ps s | ||
| case cpr of | ||
| let (warnings, res) = parse ps s | ||
| _ <- traverse_ (reportWarning . parserWarningMessage) $ warnings |
There was a problem hiding this comment.
| _ <- traverse_ (reportWarning . parserWarningMessage) $ warnings | |
| traverse_ (reportWarning . parserWarningMessage) $ warnings |
don't need to <- out - the traverse_ function returns f () already, and you can safely not bind out of a f ()
| ( Set ParserWarning | ||
| , Either (ParseErrorBundle String Void) (a, ExtraState) | ||
| ) |
There was a problem hiding this comment.
| ( Set ParserWarning | |
| , Either (ParseErrorBundle String Void) (a, ExtraState) | |
| ) | |
| ParseResult (a, ExtraState) |
| :: (Monoid a) => [ParseResult a] -> CumulativeParseResult a | ||
| toCumulativeParseResult prs = do | ||
| let | ||
| (warnings, eithers) = sequence prs |
There was a problem hiding this comment.
very nice - sequence :: Monoid a => [(a, b)] -> (a, [b])
| case parseEntities ps filepath (Text.unpack source) of | ||
| (warnings, Right blocks) -> (warnings, Right (toParsedEntityDef mSourceLoc <$> blocks)) | ||
| (warnings, Left peb) -> (warnings, Left peb) |
There was a problem hiding this comment.
instance Functor ((,) r) where
fmap f (r, a) = (r, f a) We also have:
instance Functor (Either l) where
fmap f (Left l) = Left l
fmap f (Right r) = Right (f r)This means we can use fmap to not bother with the tuple structure, and fmap to not bother with the Either structure:
| case parseEntities ps filepath (Text.unpack source) of | |
| (warnings, Right blocks) -> (warnings, Right (toParsedEntityDef mSourceLoc <$> blocks)) | |
| (warnings, Left peb) -> (warnings, Left peb) | |
| fmap (fmap (toParsedEntityDef mSourceLoc )) <$> parseEntities ps filepath (Text.unpack source) |
This is a bit golfy, but it also reduces noise.
parsonsmatt
left a comment
There was a problem hiding this comment.
hell yeah let's roll
| , upperCaseSettings | ||
| , lowerCaseSettings | ||
| -- ** Getters and Setters | ||
| , module Database.Persist.Quasi |
There was a problem hiding this comment.
hmm does this reduce the interface of the module? may be good to look at a diff of the generated haddocks. stack haddock --open for here + the haddocks on persistent hackage
There was a problem hiding this comment.
Looks like the new interface is a superset of the old one, as it should be.
| then parserWarningMessage l <= parserWarningMessage r | ||
| else warningPos l <= warningPos r | ||
|
|
||
| -- @since 2.16.0.0 |
There was a problem hiding this comment.
| -- @since 2.16.0.0 | |
| -- | | |
| -- | |
| -- @since 2.16.0.0 |
these need to be in haddock comment blocks i think to render out
efc251b to
43fbd05
Compare
1c35d82 to
a18232c
Compare
parsonsmatt
left a comment
There was a problem hiding this comment.
nice 😎 let's get this in
This PR's branch is based on #1584. Most of the changes in this branch are also present in that PR's branch. I strongly recommend looking at #1584 first, and I've marked this PR as a draft until that one has been reviewed.
Per a conversation with @parsonsmatt, we would like the ability to optionally emit an error or warning when the entity DSL parser encounters tab characters in whitespace. This PR adds that, with the error level (error, warning, or none) configured in
PersistSettings.To maximize compatibility with the old parser, each tab is treated as a single character of whitespace.
After submitting your PR:
(unreleased)on the Changelog