Refactoring token parsing in quasi module#1206
Conversation
parsonsmatt
left a comment
There was a problem hiding this comment.
There's one behavior change here, but otherwise this looks great.
| * Add `createRawSqlitePoolFromInfo`, `createRawSqlitePoolFromInfo_`, | ||
| `withRawSqlitePoolInfo`, and `withRawSqlitePoolInfo_` to match the existing | ||
| pool functions for regular `SqlBackend`. [#983](https://github.com/yesodweb/persistent/pull/983) | ||
| >>>>>>> master |
persistent/Database/Persist/Quasi.hs
Outdated
| tokenize t | ||
| | T.null t = [] | ||
| | "-- | " `T.isPrefixOf` t = [DocComment t] | ||
| | "-- | " `T.isPrefixOf` t = [DocComment (T.drop 5 t)] |
There was a problem hiding this comment.
I'm not a fan of the magic number - is that T.length "-- | " == 5 ?
If so, may be better to change this to a pattern guard:
| | "-- | " `T.isPrefixOf` t = [DocComment (T.drop 5 t)] | |
| | Just txt <- T.stripPrefix "-- | " t = [DocComment t] |
There was a problem hiding this comment.
Oh. this is a breaking change, actually - the original code has the documentation comments unmodified, eg still carrying the -- | stuff.
There was a problem hiding this comment.
Yeah, this isn't so great but this function here returns the -- | to the Text value representation of DocComment:
persistent/persistent/Database/Persist/Quasi.hs
Lines 560 to 564 in 10bc6c5
IE when it gets sent back into the EntityDef here
lineText ultimately calls tokenText to convert the line into text for those EntityDef fields - there is a bit more detail on the PR description about this. I don't think it is amazing, but the state of this currently isn't too great either, and I think this change is a step in the right direction and certainly is making use of the types more rather than passing around Text values.
I'm happy to update/write any tests to assert nothing has broken, as this should not break anything, my intention is to maintain the existing functionality here. I have obviously had to remove/update some tests due to changing types, and removing some functions, and I've tried to update them but unfortunately it isn't the cleanest diff so that may not be so obvious (and there may be some flaws in how I've changed the tests possibly?).
Looking through the tests again, I guess we have nothing that asserts what should appear on the EntityDef at the end of this process (AFAICT), perhaps I could add some along with this PR to assert that that behaviour has been maintained?
There was a problem hiding this comment.
That's a good idea :)
There was a problem hiding this comment.
Ok so I've added some tests for parse, I've tried to cover off most variations of comments in there, but if there is a particular entity definition you want me to check (or some use case I've missed out or something), then I can always add that to these tests. You can see now though that it is retaining the -- | in the values passed back to the entity def in these tests:
persistent/persistent/test/main.hs
Lines 283 to 286 in ddb7b0a
persistent/persistent/test/main.hs
Lines 351 to 354 in ddb7b0a
This field here, however, does strip out the -- | :
persistent/persistent/test/main.hs
Lines 361 to 364 in ddb7b0a
But this is the case in master, this function here is loading the comment into LinesWithComments:
persistent/persistent/Database/Persist/Quasi.hs
Lines 698 to 701 in 03e794f
Where isComment is stripping out the -- | portion:
persistent/persistent/Database/Persist/Quasi.hs
Lines 661 to 663 in 03e794f
As a way of testing the tests, I cherry picked these onto master and ran them to assert that they assert the same functionality as what is in master currently. Hope this is sufficient to cover off that this change is not having any detrimental impact, if there is something I have missed from the above that would be good to check then I can always add more checks here 👍
There was a problem hiding this comment.
oh man i never even thought to test that the comments weren't being picked up by attrs and extras! That feels like a silly mistake on my part.
Great investigation, thanks! I'm happy with this.
persistent/Database/Persist/Quasi.hs
Outdated
| lns <- NEL.nonEmpty (T.lines txt) | ||
| NEL.nonEmpty $ mapMaybe parseLine (NEL.toList lns) | ||
|
|
||
| -- TODO: refactor to return (Line' NonEmpty) |
There was a problem hiding this comment.
I noticed I'd left this in, but there is no real explanation here... sorry! This is something I wanted to change, but was worried that it may inflate this initial PR, I will address this following this PR I think. Effectively we can parse a Line' NonEmpty upfront here, which means we can greatly simplify the implementation of Line' as it no longer needs to be polymorphic as the the type used for the 'collection' of tokens. I will do this as another change though following this one.
If you don't want TODOs lying around I can remove this 👍
There was a problem hiding this comment.
Could you leave this PR comment in the note? That'd be great for whoever sees this TODO later on. THanks!
|
This is ready for another look now :) With the comments thing, is it that the comments that appear in I don't really use this functionality in persistent so unsure how exactly it is meant to work. If that is something that needs addressing I can open an issue for this to follow this PR 👍 (or if you can open one with some detail about the changes required I'd be happy to work on this) |
|
Yeah, the comments should only be in |
Before submitting your PR, check that you've:
@sincedeclarations to the HaddockAfter submitting your PR:
(unreleased)on the ChangelogFurther refactoring of the
Quasimodule. Currently thetokenizefunction is collectingSpacetokens, however spaces are largely immaterial to the process of parsingEntityDefs, besides parsing the amount of indentation into theLinetype.The existence of space between tokens is implicit in the result of the token parsing, and there are no rules around the amount of spacing that comes between tokens, so this PR changes the
Tokentype so it does not include spaces at all, maintaining existing functionality, and parsing directly into theLinetype which contains all the necessary tokens and indentation. This enables us to get rid of some redundant functions (empty,removeSpaces) which retrospectively look back through the parsed space tokens in order to remove them.This change also allows us to differentiate between
Tokens andDocCommentvalues, where currently we are re-parsingDocComments(viaisComment), this function has been re-written to make use of the initial tokenization to detectDocComments.Getting a list of
Textvalues has been recovered with the functionlineTextwhich effectively gets whattokensfromLinepreviously got you. I think this is mainly becauseEntityDef's are afterTextvalues here (entityDerivesandentityExtraI think this applies to), I stopped at this point as I didn't want to change theEntityDefdefinition as part of this change. There is still plenty of room for more refactoring following this change though.