Skip to content

optionally warn when encountering tabs in whitespace#1589

Merged
parsonsmatt merged 10 commits intoyesodweb:masterfrom
dtpowl:warnings
May 19, 2025
Merged

optionally warn when encountering tabs in whitespace#1589
parsonsmatt merged 10 commits intoyesodweb:masterfrom
dtpowl:warnings

Conversation

@dtpowl
Copy link
Contributor

@dtpowl dtpowl commented Apr 22, 2025

This PR's branch is based on #1584. Most of the changes in this branch are also present in that PR's branch. I strongly recommend looking at #1584 first, and I've marked this PR as a draft until that one has been reviewed.

Per a conversation with @parsonsmatt, we would like the ability to optionally emit an error or warning when the entity DSL parser encounters tab characters in whitespace. This PR adds that, with the error level (error, warning, or none) configured in PersistSettings.

To maximize compatibility with the old parser, each tab is treated as a single character of whitespace.


After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Bumped the version number if there isn't an (unreleased) on the Changelog
  • Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts)

@dtpowl dtpowl changed the title Warnings optionally warn when encountering tabs in whitespace Apr 22, 2025
@dtpowl dtpowl force-pushed the warnings branch 10 times, most recently from 7eee714 to e4cc2bf Compare May 11, 2025 22:57
Comment on lines +103 to +107
let (warnings, res) = runConfiguredParser defaultParserSettings initialExtraState (some anyToken) "" s
case res of
Left peb ->
(warnings, Left peb)
Right (tokens, _acc) -> (warnings, Right tokens)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I think I would have this discard warnings for this function and another function tokenizeWithWarnings that returns warnings

This allows you to keep most of the tests the same, but then have new tests with new behavior that should have warnings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm — yeah, we could do that, but I'd advocate for keeping it the way it is now, because IMO not having warnings is part of the "successful parse" behavior that the tests are intended to exercise.

i.e., if we were to in the future make a change that inadvertently caused the parse task in one of the existing tests to generate a warning, we would want that test to start failing.

let cpr = parse ps s
case cpr of
let (warnings, res) = parse ps s
_ <- reportWarning $ renderWarnings warnings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would suggest reporting as multiple warnings

Suggested change
_ <- reportWarning $ renderWarnings warnings
_ <- traverse (reportWarning . renderWarning) warnings

Comment on lines +10 to +11
( ParserSettings (..)
, defaultParserSettings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to expose a safe interface for ParserSettings from Database.Persist.Quasi

Suggest creating new module Database.Persist.Quasi.ParserSettings which defines the interface + Database.Persist.Quasi.ParserSettings.Internal which exports everything

renderWarnings :: [ParserWarning] -> String
renderWarnings warnings = intercalate "\n" $ fmap parserWarningMessage warnings

-- Attempts to parse with a provided parser. If it fails with an error matching
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- Attempts to parse with a provided parser. If it fails with an error matching
-- | Attempts to parse with a provided parser. If it fails with an error matching

-> Parser a
-> Parser a
-> Parser a
tryOrRegisterError msg f l r = do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If using single variable names, it is idiomatic to use p with a predicate (ie a -> Bool)

Comment on lines +179 to +183
-- Due to backtracking, it is possible that we have accumulated duplicate warnings.
-- For example, if two parsers which depend on the same subparser both attempt to
-- parse the same span of input, any warnings generated by that subparser will be
-- registered twice.
filteredWarnings = Set.toList $ Set.fromList warnings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps instead of using [Warning] in the WRiterT you use Set Warning? Then you get uniqueness by construction and more efficient additions anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, of course — that's way better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll still convert it back to a list at this point I think? since we'll want it to be a functor later

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need Functor specificlaly? We can do Set.map

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that order-preserving?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, I guess it doesn't matter — we've thrown away the order by using Set in the first place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess the right thing to do is to sort them by source position after retrieving them from the set and before presenting them

Comment on lines +28 to +29
import Database.Persist.Quasi.Internal
import Database.Persist.Quasi.Internal.ModelParser
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so these Internal imports are a bit smelly - can you make them explicit + ensure that anything we really need is available from Database.Persist.Quasi ?

Copy link
Contributor Author

@dtpowl dtpowl May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at minimum we need to import ModelParser here in order to test it. We could expose ModelParser from Database.Persist.Quasi simply in order to avoid importing Internal, but I'm a bit reluctant to do that because I can't think of any legitimate reason that a consumer of this library would need to interact with ModelParser.

It must be exported from somewhere or it can't be used in these tests, but leaving it in an Internal module is a way to signal that it shouldn't be used. The library already seems to have the pattern of exposing some things needed in tests from Internal modules; see import Database.Persist.EntityDef.Internal on line 17.

I think at minimum explicitly importing the stuff we need from Internal is a good idea, but I have a slight preference for not exposing ModelParser from Database.Persist.Quasi.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - as discussed elsewhere, I'm fine importing internal modules, but I think we should try to use explicit imports and ensure that we're able to get what we need for a public API from the Database.Persist.Quasi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 💯

@dtpowl dtpowl marked this pull request as ready for review May 16, 2025 18:28
@dtpowl dtpowl requested a review from parsonsmatt May 16, 2025 18:29
Copy link
Collaborator

@parsonsmatt parsonsmatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code and imlpementation look great! A few comments on module organization

@@ -0,0 +1,91 @@
module Database.Persist.Quasi.PersistSettings where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok naming conventions: without Internal in the name this whole module is public API

suggest doing, like,

module Database.Persist.Quasi.PersistSettings 
    ( public
    , api
    , goes
    , here
    ) where
   
...

module Database.Persist.Quasi.PersistSettings.Internal where
-- export everything

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also love putting this in it's own module

let cpr = parse ps s
case cpr of
let (warnings, res) = parse ps s
_ <- traverse_ (reportWarning . parserWarningMessage) $ warnings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_ <- traverse_ (reportWarning . parserWarningMessage) $ warnings
traverse_ (reportWarning . parserWarningMessage) $ warnings

don't need to <- out - the traverse_ function returns f () already, and you can safely not bind out of a f ()

Comment on lines +110 to +112
( Set ParserWarning
, Either (ParseErrorBundle String Void) (a, ExtraState)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
( Set ParserWarning
, Either (ParseErrorBundle String Void) (a, ExtraState)
)
ParseResult (a, ExtraState)

:: (Monoid a) => [ParseResult a] -> CumulativeParseResult a
toCumulativeParseResult prs = do
let
(warnings, eithers) = sequence prs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice - sequence :: Monoid a => [(a, b)] -> (a, [b])

Comment on lines +778 to +780
case parseEntities ps filepath (Text.unpack source) of
(warnings, Right blocks) -> (warnings, Right (toParsedEntityDef mSourceLoc <$> blocks))
(warnings, Left peb) -> (warnings, Left peb)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instance Functor ((,) r) where
  fmap f (r, a) = (r, f a) 

We also have:

instance Functor (Either l) where
  fmap f (Left l) = Left l
  fmap f (Right r) = Right (f r)

This means we can use fmap to not bother with the tuple structure, and fmap to not bother with the Either structure:

Suggested change
case parseEntities ps filepath (Text.unpack source) of
(warnings, Right blocks) -> (warnings, Right (toParsedEntityDef mSourceLoc <$> blocks))
(warnings, Left peb) -> (warnings, Left peb)
fmap (fmap (toParsedEntityDef mSourceLoc )) <$> parseEntities ps filepath (Text.unpack source)

This is a bit golfy, but it also reduces noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

Copy link
Collaborator

@parsonsmatt parsonsmatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hell yeah let's roll

, upperCaseSettings
, lowerCaseSettings
-- ** Getters and Setters
, module Database.Persist.Quasi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm does this reduce the interface of the module? may be good to look at a diff of the generated haddocks. stack haddock --open for here + the haddocks on persistent hackage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the new interface is a superset of the old one, as it should be.

then parserWarningMessage l <= parserWarningMessage r
else warningPos l <= warningPos r

-- @since 2.16.0.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- @since 2.16.0.0
-- |
--
-- @since 2.16.0.0

these need to be in haddock comment blocks i think to render out

@dtpowl dtpowl force-pushed the warnings branch 3 times, most recently from efc251b to 43fbd05 Compare May 17, 2025 16:33
@dtpowl dtpowl force-pushed the warnings branch 10 times, most recently from 1c35d82 to a18232c Compare May 19, 2025 18:49
Copy link
Collaborator

@parsonsmatt parsonsmatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 😎 let's get this in

@parsonsmatt parsonsmatt merged commit 55e6d53 into yesodweb:master May 19, 2025
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants