Conversation
In 4828b5d, we moved `Language.PureScript.Errors.SimpleErrorMessage` to `Language.PureScript.AST.Declarations`. Per the comment in the Pull Request: purescript#2230 (comment), we did this because we wanted to move `Language.PureScript.Errors.ErrorMessageHint` into `Language.PureScript.AST.Declarations.Expr`. Thing is, we didn't have to move everything just to move `Language.PureScript.Errors.ErrorMessageHint`. In fact, by making this move, we forced the AST to require knowledge about every type of error that can be created. For example, the AST has to know about parse errors (something that should happen before creating an AST) and code generation errors (something that should happen after creating an AST). In an ideal world, the AST would only know about errors that come about from modifications to the AST itself. The real issue that this led to was that the AST had to know about the errors that came about from the CST. Given that the CST is the surface syntax of the AST, that requirement seems a bit off. With the dependencies inverted this way, it's hard to decouple the AST from the CST. We take a step towards the ideal world by moving `Language.PureScript.AST.Declarations.SimpleErrorMessage` back into `Language.PureScript.Errors`. This should allow us to extract the AST from the `purescript` package itself such that it can be built upon in isolation from the rest of the package. We should revisit why `Language.PureScript.AST.Declarations.SimpleErrorMessage` is so massive in the future. It will probably make sense to separate the data type into the different categories of errors so those errors can live closer to where they should be constructed.
Prim stuff is part of PureScript the language, Prelude is part of the PureScript conventions. PureScript the language doesn't require an existence of Prelude in order to work properly. Instead of mixing these two levels, we separate the constants so we can pull the Prim constants out with the AST. In order to not break everything, we re-export the constants from the original module. We should put a timer on this and remove the re-exports after that point so we don't wind up with a bunch of cruft like the other re-export modules.
We've gone quite a ways with a sinle package for everything PureScript related. This has led to incredibly long compile times if you want to consume PureScript the library, as you have to compile everything no matter how much you want to really use. To help with downstream consumers, we start splitting into packages of useful functionality. We start with the AST as this _should_ be the very basis of everything PureScript related. From the AST, someone ought to be able to build packages atop it that add more functionality to the PureScript language. For instance, we can split out the CST as a package that builds atop the AST and adds more functionality (namely, parsing). It's maybe a little disheartening that there don't appear to be any tests around the AST. Maybe with things separated out a bit more, it'll be more inviting to write some tests. The modules we move here might not be all that intuitive, but they seem to be the core of what the PureScript AST is all about. The only questionable modules are the `Control.Monad.Supply` and `Control.Monad.Supply.Class` modules. We could flip the dependency in `Language.PureScript.Names` and move those values into the `Control.Monad.Supply.Class` module. It's unclear which way to go. Since the `Control.Monad.Supply.Class` module isn't in the `Language.PureScript` hierarchy, it might make sense for it to stay in the top-level. However, it might be too general for that and should ideally live in its own package (maybe even outside the project?). We also use PackageImports judiciously. When trying to figure out which dependencies went where, it was much harder than it should have been to figure this stuff out. When trying to figure out which dependencies were used and necessary, it was even harder to figure what the situation was. Finally, when trying to figure out where `Control.Monad.Supply.Class` came from, it took a frustratingly long amount of time to realize it was a module we had defined in the project. For all of these reasons, we use PackageImports to help future us know which dependencies are used, and where. Of course there are tools like weeder (https://github.com/ndmitchell/weeder) to help with finding unused dependencies and whatnot. But, it's also good to know statically what is being used and to allow other tools to be used. Finally, we update the license-generator to work with this split off package. We might have to do more work before we're done, but we can at least generate the license file in a similar manner to how we do at the top-level.
We extract out the CST to its own package that builds atop the AST. This should make it a bit easier to build other tooling atop the CST. In particular, anything that might need to parse PureScript files can now depend on this package to do so. Since this new package only depends on `purescript-ast` (and its transitive dependencies), it can be built in a reasonable amount of time. It's important to note that we don't pull out `Language.PureScript.CST`. That module is straddling the line between a re-exporting "prelude" and an adapter between `purescript-ast` and `purescript`. Since it deals with errors, and errors currently exist at the top-level, we cannot yet move this module down into `purescript-cst`. Probably what needs to happen is that we make a split between `Language.PureScript.CST` being a re-exporter and `Language.PureScript.CST` doing adapter work. Not clear which way to go on this one as both seem viable avenues. It's really great to see a test suite around some of this deep down functionality. Maybe we can use this example to feed back into the AST and make a test suite there.
|
I have no objection to splitting things up like this, it'll perhaps benefit compile/test times for non-syntax-related situations when working locally too 👍 |
joneshf
left a comment
There was a problem hiding this comment.
Okay, I think that's my review.
Some non-contextual questions:
- Did we get the package boundaries correct? The names of the modules might not all be quite right, but more the semantics behind them.
- Did we miss any modules that should be in one of these packages that isn't?
| synopsis: PureScript Programming Language Abstract Syntax Tree | ||
| description: Defines the underlying syntax of the PureScript Programming Language. |
There was a problem hiding this comment.
Wasn't entirely sure what to put here. Feel free to suggest any changes if you'd like.
| synopsis: PureScript Programming Language Concrete Syntax Tree | ||
| description: The surface syntax of the PureScript Programming Language. |
There was a problem hiding this comment.
Also not sure what to put here. Most of the wording is from the PR to add the CST. Down to change it to whatever.
|
Given that @natefaubion opened the original issue, requesting a review from him explicitly. You don't have to review this (feel free to re-request), nor do you have to review it in isolation (anybody is welcome to review). Just wanted to make sure there was at least one explicit reviewer. |
|
Thanks for tackling this! I think ideally purescript-cst would not need to depend on purescript-ast, but I don't know if it's a big deal. You addressed the double-duty aspect in a comment, and I agree, I don't think that conversions/adapters really needs to live with CST. If the primary tool that downstream authors want from this is to utilize the surface parser, then I'd think we'd want it to be as light as possible. I think aside from the conversions and reexports, CST only depends on What prompted the addition of package name imports? I'm not necessarily against it, but it doesn't seem like it's something that's commonly utilized in the Haskell ecosystem. |
|
Yeah, I wasn't entirely sure which way to go. At first I went that way and pulled out those two modules ( I might be reading too much into the wording, but it seems like the CST should depend on the AST. I don't have a strong argument against breaking the dependency, but why would we want to? A very thin parser package would be nice, but it might not be worth the effort if it's only for the sake of downstream consumers. From the perspective of As far as how to break the dependency, maybe we follow suit with what CST did for source positions and define something equivalent to The main motivator for It's a real shame that more people don't use this extension. I started using it a couple years back, found it to be incredibly useful with learning a new section of code, and never looked back. If we don't want it, I've no problem removing it. But if I'm maintaining this code as a fork, I'm keeping it. |
|
My reasoning for potentially inverting the dependency:
I'm not necessarily saying those changes are necessary, I'm just trying to explain choices that aren't apparent from what was actually executed. For package imports, is there a way to enforce usage? The compiler is not rigid stylistically, and code reflects the particular authors. I'm just concerned it will bit rot quickly, as there isn't a very large (Haskell) community drive to use this feature. |
|
Oh, I see what you're saying. I was coming at it from the perspective of CST being a parser for a particular syntax of PureScript the language. The idea being you could use a different syntax parser (maybe Algol-based syntax, maybe Lisp-based syntax, maybe another ISWIM-based syntax), convert it into the AST, and have the whole rest of the compiler work. As mentioned above, the difference between AST and CST isn't large enough for me personally to care about, so whichever way is right I'm okay with. Shall I switch things around? I don't think there's anything built into GHC to force |
Interesting! I hadn't considered that. |
|
I'm not particularly fussed about which modules go where at this stage. In general this seems okay; there are only two things I would like to address before merging:
|
I dunno, but this seems confusing to me. It's to implement tooling, which is almost always bound by what version of the language you support. It uses 0.1.3.4, what version of PureScript does that map to? I can see it both ways though. I haven't implemented tooling, so I'll defer to @joneshf on what is most helpful. |
|
Right, if we did separate the version numbers we'd need to keep a table in the README of each split-off package saying which package versions map to which compiler versions. I see your point too, though. It's just that the Hackage PVP is explicit about what you should and should not do as a package maintainer, and we would definitely be contravening it if we used the same versions across all of our packages. On second thoughts I'm happy to go either way to start with, we can wait and see if it actually turns out to be a problem in practice. |
|
FWIW I think that hackage supports some sort of metadata that your package doesn't follow the PVP. |
|
Just a side note on difference syntax frontends. Lately, I was actually thinking about creating a Swift/Rust/Go based syntax for PureScript. I was triggered by this list of languages that compile to WebAssembly. IMHO, the languages that emerged in the last two years are semantic wise very similar. That is, language theoretically, they do not add mind blowing features. They differ in the syntax they choose (Python based, C based, Haskell based, Ruby based, Lisp based, etc.) and if they are object-first of functional-first. One of the languages on the list is based on the idea of "functional, agree on semantics, disagree on syntax". But why make another one when PureScript has all the basic ML functionality extended with type classes, row types, higher-kinded types and now even poly kinds? ReasonML does something similar for OCaml. I think this could work for PureScript too. Whatever we're going to decide on this issue, it's worth keeping these possibilities in mind! |
Honestly, the reason I didn't use different versions to start with was because I thought it would be rejected. I think the answer to this question largely comes down to how we want these different packages to evolve. There are surely bugs we could fix in If we do want to allow I don't know how much external tooling should be factoring into this decision. It's good feedback to have, but it shouldn't be deciding our versioning scheme. A table of which versions work together seems like a fine way to deal with compatibility. My personal preference (not tooling-based) would be whatever allows more iteration in PureScript the language. If we don't have to always think about the entire language when making a change, but can focus on doing what's the next improvement for a part of the language, we ought to end up in a better place. The bigger decisions of how does a change affect the entire language will still happen. But they will happen in the |
|
Your reasoning sounds good to me @joneshf! |
Per review, we want our extension strategy to be that we have a uniform
language across the codebase with the exception of questionable
extensions like `TemplateHaskell`. We move as many sensible extensions
to the `default-extensions` field in `package.yaml` so we start building
up our uniform language.
There are three other extensions we don't move at the moment: `CPP`,
`DeriveAnyClass`, and `ImplicitParams`.
* `CPP` - This extension is just as questionable (if not more so) as
`TemplateHaskell`. Following a similar reasoning, we force people to
turn it on explicitly.
It's only used in the `Language.PureScript.Crash` module. That
module might be better off if we stopped casing on the compiler
version. It doesn't seem like we support older versions of GHC
anymore, so we might be better off dumping the extension outright.
* `DeriveAnyClass` - This extension is known to interact weirdly with
`GeneralizedNewtypeDeriving`:
https://gitlab.haskell.org/ghc/ghc/wikis/commentary/compiler/deriving-strategies.
Since we use `GeneralizedNewtypeDeriving` in more places, and
`DerivingStrategies` exists to address the issues with
`DeriveAnyClass`, we don't turn this on across the entire codebase.
* `ImplicitParams` - Similar to `CPP`, this extension is only used in
the `Language.PureScript.Crash` module. And similarly, if we don't
support older versions of GHC, we can drop this extension outright.
If any of this is the wrong decision, we can always change it.
Similar to the motivation behind moving `purescript-ast`s extensions to the `default-extensions` field of its `package.yaml`, we follow suit and move the sensible extensions to `purescript-cst`s `default-extensions` field of its `package.yaml`. We ended up dropping `MonoLocalBinds`, as the module compiles without the extension.
The license generator exists for the sake of distributing the licenses that the `purescript` binary uses. We don't need to use the generator for the other packages. We want the other packages to have a normal license file. We also roll back the changes to the `Makefile`.
|
@hdgarrood @natefaubion shall we go with differing versions and a compatibility table? That's in line with what @hdgarrood was saying about packages being good |
|
@joneshf I don't have any problem with that. |
kritzcreek
left a comment
There was a problem hiding this comment.
Changes look good to me, one thing to note is that this failed to build for me until I ran stack clean (Just in case someone has the same issue). Thanks for this @joneshf!
|
The tests are failing locally with |
Per review, this seems like a way to go that should foster each package being able to evolve in a way that isn't hindered by the rest of the packages at large.
Per review, this seems like a way to go that should foster each package being able to evolve in a way that isn't hindered by the rest of the packages at large.
|
I think we might want to occasionally release versions of |
We only kept `Language.PureScript.Constants` around so we could figure out the changes while we were splitting out `purescript-ast`. Now that things have settled down a bit, we remove the module and use the other imports directly.
|
Thanks for all the reviews! I pushed the last couple of commits to move to separate versions, add a compatibility table, and merge This was failing locally for me with: Looks like the line is |
Now that we have more than one package, we have to do things a little different. `stack sdist` will create an sdist for each package. This means we end up with a directory that looks like: ```console $ tree sdist-test sdist-test ├── purescript-0.13.6.tar.gz ├── purescript-ast-0.1.0.0.tar.gz └── purescript-cst-0.1.0.0.tar.gz 0 directories, 3 files ``` If we try to use `tar -xzf sdist-test/purescript-*.tar.gz -C sdist-test --strip-components=1`, the glob will expand to all three sdists and fail with a cryptic error: ```console $ tar -xzf sdist-test/purescript-0.13.6.tar.gz sdist-test/purescript-ast-0.1.0.0.tar.gz sdist-test/purescript-cst-0.1.0.0.tar.gz -C sdist-test --strip-components=1 tar: sdist-test/purescript-ast-0.1.0.0.tar.gz: Not found in archive tar: sdist-test/purescript-cst-0.1.0.0.tar.gz: Not found in archive tar: Exiting with failure status due to previous errors ``` We change to mirroring the directory structure within the `sdist-test` directory, and running that way. It's a bit unfortunate that the top-level package has to be handled differently, but this should help us move forward.
|
I think the reason this is passing on TravisCI but failing locally is that there's no golden file to compare it to. Indeed, if you remove the file, it seems to pass fine because it regenerates the golden file. It's the same situation on I added a few commits since the approval. Anybody want to give it another once over, or should we merge? |
|
I'm going to assume the lack of response here but activity elsewhere is saying another once over is not necessary. I'll merge shortly. |
|
I don't have any problem merging this I would just like to consider the conflict effects on the polykinds pr. |
|
Like merge conflicts, or something semantic? Do you want to merge the polykinds PR first? |
|
Yeah, merge conflicts. I don't know if it's a big deal, but I might need your help. |
|
Gotcha. So there's a few options here:
Any of those sound amenable to you? |
|
I don't think separate PRs are necessary. It's probably not nearly as big of a deal as I was imagining, since most of this is file relocations and import changes. Maybe I will just attempt a merge locally and see what the damage is. |
|
FWIW, I tried merging the Polykinds branch into this one last night. The conflicts were deleting any |
|
@joneshf It doesn't sound too bad. Of course, I'm happy to accept any help if you offer it 😄 |
|
K. I'll merge this tomorrow then. |
|
Thanks for the reviews all! |
This is a first crack at splitting into multiple packages. This follows some of what @jmackie did, and does some stuff differently.
My primary motivation for doing this is to make building
purtyless time-consuming. For reference, a clean TravisCI build barely makes it in the time allotted on Linux and always timesout by a few minutes on macOS. An older build that built all of thepurescriptpackage (andpurtytests) on macOS took 1,718s. With the packages split, the equivalent step on macOS took 691s. The entire clean build on macOS took 34min 25s, which is well under the timeout. Secondary motivations are the things laid out in the multiple packages issue.Since my primary motivation is to make building
purtyless time-consuming, I will end up supporting this fork for the indefinite future if this PR is not accepted and an alternative is not proposed. Sincepurtyis the motivation, I'll only sync it on released versions. I have no problem making changes to get this PR accepted, but I only have so much motivation to do so. Which is to say, I'm not going to be mad if this PR is rejected, but I'm also not going to do a ton of work to get this PR accepted.This is currently branched off of v0.13.6 due to the motivations above. Will get up to date if it's accepted.