Skip to content

Corefn Json representation#3049

Merged
paf31 merged 18 commits intopurescript:masterfrom
coot:corefn-json
Oct 2, 2017
Merged

Corefn Json representation#3049
paf31 merged 18 commits intopurescript:masterfrom
coot:corefn-json

Conversation

@coot
Copy link
Copy Markdown
Contributor

@coot coot commented Aug 19, 2017

Modify corefn json representation in a way to be able reproduce CoreFn from it.

Marcin Szamotulski added 11 commits August 19, 2017 07:57
@coot
Copy link
Copy Markdown
Contributor Author

coot commented Aug 20, 2017

I think this is ready for a review.

@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Aug 20, 2017

I'm wondering if we should add a purs codegen command, which runs the JS codegen on a corefn file. That way, we could implement DCE and other optimizer passes outside the compiler.

Also, @joneshf, do you have suggestions for the corefn JSON format, since we should break them here (or in a separate PR, but before 0.12)?

@joneshf
Copy link
Copy Markdown
Member

joneshf commented Aug 20, 2017

I'd suggest a separate PR. I don't really know what's happening here and I don't want to do a drive-by on this. I put some comments in the other PR

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Aug 20, 2017

This adds

  • annotations to curefn representation
  • represents binds as lists rather than objects so the order of recursive binds is preserved
  • adds moduleFromJSON that reconstructs CoreFn from its json representation (up to the type of foreign declaration.

This allows to pick up CoreFn from its json rep and turn it into javascript code.

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Aug 21, 2017

purs codegen would be really useful for writting tools that work with CoreFn in purescript where you just purs compile --dump-corefn; run-some-tool; purs codegen.

@coot coot mentioned this pull request Aug 22, 2017
Marcin Szamotulski added 4 commits August 24, 2017 08:51
- change the json representation of CoreFn
- store module name on the module object
- `buildMakeAction`: output single module json object, rather an object
with a single key (module name) pointing to module json object.
- read the new json CoreFn representation
- extend tests so that all the CoreFn structures are tested.
- `moduleToJSON`: include module comments inside `comments` field.
- `moduleFromJSON`
- add tests
@coot
Copy link
Copy Markdown
Contributor Author

coot commented Aug 24, 2017

I update the CoreFn json representation, taking into account comments in #3039
Here is a dump of Data.Newtype.

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Aug 24, 2017

I run the zephyr's test suite which compiles quite a few libraries. So the new json representation seems to work quite fine.

Maybe some things could be named better or represented in a better way.
@joneshf, @rightfold any comments?

Copy link
Copy Markdown
Member

@joneshf joneshf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for making these changes! I think it looks great!

literalToJSON t (ArrayLiteral xs) = toJSON ("ArrayLiteral", map t xs)
literalToJSON t (ObjectLiteral xs) = toJSON ("ObjectLiteral", recordToJSON t xs)
literalToJSON _ (NumericLiteral (Left n))
= object
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to go through and replace all of this now, but there's an alternative approach which skips over creating the intermediate data structure: https://hackage.haskell.org/package/aeson-1.2.1.0/docs/Data-Aeson.html#g:7

You'd change expressions from

object
  [ T.pack "foo" .= foo
  , T.pack "bar" .= bar
  ]

to expressions like

pairs
  ( T.pack "foo" .= foo
  <> T.pack "bar" .= bar
  )

It's supposedly faster and uses less memory. So given that the json blobs have gotten bigger, it might be a change that makes sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joneshf I think we'd need to add ToJSON instances to make it work (since the right hand side of .= expects an instance of ToJSON. This is a bit more work so I'd like to leave it for later.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just encode to Value on the right-hand-side which picks the trivial ToJSON instance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kritzcreek you're right, but let's do that in another PR.

, T.pack "foreign" .= map (identToJSON . fst) (moduleForeign m)
, T.pack "decls" .= map bindToJSON (moduleDecls m)
, T.pack "builtWith" .= toJSON (showVersion v)
, T.pack "comments" .= map toJSON (moduleComments m)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@no-longer-on-githu-b
Copy link
Copy Markdown
Contributor

I am very happy with this. 👍

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Sep 9, 2017

I have more, experimenting in zephyr. I'd like to add type annotations to corefn, would that be fine? (but this can be in another PR).

@no-longer-on-githu-b
Copy link
Copy Markdown
Contributor

no-longer-on-githu-b commented Sep 9, 2017

Yeah, that's fine, if they're available. I'd like to use them in my Go backend after finishing the purescript-purescript-compiler-backend-utils library. But I don't need them per se (can use interface{} for everything), so I'm fine with them not being there. But maybe other people find them more important.

@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Sep 9, 2017

I'm a little worried about the size of the generated JSON. Part of the reason for the unconventional format before was that I wanted to reduce the size and keep the output somewhat human-readable. Could you please see how many lines are generated for a bigger file? Maybe something like Thermite's main module? 4000+ lines for Data.Newtype seems quite a lot.

@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Sep 14, 2017

@coot Could you please try moving the source file info to the top level?

@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Sep 14, 2017

Was it not possible to just change the JSON renderer for SourcePos?

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Sep 15, 2017

I think it was, if you prefer that I'll update the PR. The only reason for having it exposed on the corefn module, is for tooling that might use it - if it's written in haskell and one would liket use the comipler moduleFromJSON. There is no tool for the moment that needs that though and it's very easy to get access to the path anyway - so maybe that's not a good enough reason to keep this.

@metaleap
Copy link
Copy Markdown

metaleap commented Sep 25, 2017

Looking forward to a more complete (incl types) JSON dump-corefn format! Can't wait for this to land, will it be in 0.12? (The old --dump-imperative-core would be great too but seems it's been postponed edit: just added that one, #3097) Not sure why anyone cares about the file size of an intermediate format to be further machine-processed. @rightfold where's your "Go backend" to be found? I was just about to embark on exactly such hackery, but if it's FOSS I'd rather contribute to yours

@no-longer-on-githu-b
Copy link
Copy Markdown
Contributor

no-longer-on-githu-b commented Sep 25, 2017 via email

@metaleap
Copy link
Copy Markdown

metaleap commented Sep 28, 2017

The latest commit in this PR from 2 weeks ago doesn't build at my end, anything I'm missing? stack is up to date. Well it passed your CI so the prob must be at my end, but I've applied all changes to a fresh unmodified fork of the current purescript/purescript master..

purescript-0.11.6: unregistering (local file changes: purescript.cabal src/Language/PureScript/CodeGen/JS.hs src/Language/PureScript/CoreFn/Desugar.hs ...)
purescript-0.11.6: configure (lib + exe)
Configuring purescript-0.11.6...
purescript-0.11.6: build (lib + exe)
Preprocessing library for purescript-0.11.6..
Building library for purescript-0.11.6..
[100 of 151] Compiling Language.PureScript.CoreFn.Module ( src/Language/PureScript/CoreFn/Module.hs, .stack-work/dist/x86_64-linux/Cabal-2.0.0.2/build/Language/PureScript/CoreFn/Module.o )
[101 of 151] Compiling Language.PureScript.CoreFn.Desugar ( src/Language/PureScript/CoreFn/Desugar.hs, .stack-work/dist/x86_64-linux/Cabal-2.0.0.2/build/Language/PureScript/CoreFn/Desugar.o )

/home/rox/g/purescript-nu-corefn/src/Language/PureScript/CoreFn/Desugar.hs:66:17: error:
    • The constructor ‘A.ValueDeclaration’ should have 1 argument, but has been given 5
    • In the pattern:
        A.ValueDeclaration (ss, com) name _ _ [A.MkUnguarded e]
      In an equation for ‘declToCoreFn’:
          declToCoreFn
            (A.ValueDeclaration (ss, com) name _ _ [A.MkUnguarded e])
            = [NonRec (ssA ss) name (exprToCoreFn ss com Nothing e)]
      In an equation for ‘moduleToCoreFn’:
          moduleToCoreFn env (A.Module modSS coms mn decls (Just exps))
            = let
                imports
                  = mapMaybe importToCoreFn decls ++ fmap ... (findQualModules decls)
                imports' = dedupeImports imports
                ....
              in Module coms mn (spanName modSS) imports' exps' externs decls'
            where
                dedupeImports :: [(Ann, ModuleName)] -> [(Ann, ModuleName)]
                dedupeImports
                  = fmap swap . M.toList . M.fromListWith const . fmap swap
                ssA :: SourceSpan -> Ann
                ssA ss = (ss, [], Nothing, Nothing)
                ....
   |
66 |   declToCoreFn (A.ValueDeclaration (ss, com) name _ _ [A.MkUnguarded e]) =
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

/home/rox/g/purescript-nu-corefn/src/Language/PureScript/CoreFn/Desugar.hs:195:12: error:
    • The constructor ‘A.TypeInstanceDeclaration’ should have 8 arguments, but has been given 6
    • In the pattern: A.TypeInstanceDeclaration _ _ _ q _ _
      In an equation for ‘fqDecls’:
          fqDecls (A.TypeInstanceDeclaration _ _ _ q _ _) = getQual' q
      In an equation for ‘findQualModules’:
          findQualModules decls
            = let
                (f, _, _, _, _)
                  = everythingOnValues
                      (++) fqDecls fqValues fqBinders (const ...) (const ...)
              in f `concatMap` decls
            where
                fqDecls :: A.Declaration -> [ModuleName]
                fqDecls (A.TypeInstanceDeclaration _ _ _ q _ _) = getQual' q
                fqDecls (A.ValueFixityDeclaration _ _ q _) = getQual' q
                fqDecls (A.TypeFixityDeclaration _ _ q _) = getQual' q
                fqDecls _ = []
                fqValues :: A.Expr -> [ModuleName]
                fqValues (A.Var q) = getQual' q
                fqValues (A.Constructor q) = getQual' q
                fqValues (A.TypeClassDictionaryConstructorApp c _) = getQual' c
                fqValues _ = []
                ....
    |
195 |   fqDecls (A.TypeInstanceDeclaration _ _ _ q _ _) = getQual' q
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

--  While building package purescript-0.11.6 using:
      /home/rox/.stack/setup-exe-cache/x86_64-linux/Cabal-simple_mPHDZzAJ_2.0.0.2_ghc-8.2.1 --builddir=.stack-work/dist/x86_64-linux/Cabal-2.0.0.2 build lib:purescript exe:purs --ghc-options " -ddump-hi -ddump-to-file"

Just tried to begin using this new --dump-corefn JSON representation as this PR has been just sitting here for some 2-5 weeks depending on how one counts 😀 and I really need to dump more type-related meta-data than the current version does!

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Sep 28, 2017

@metaleap I merged master. It should be compile now for you. Try this branch rather than applying a patch to master.

@metaleap
Copy link
Copy Markdown

That was the ticket for me to build at my end, thanks, now I can begin toying with this =)

, moduleImports :: [(a, ModuleName)]
, moduleExports :: [Ident]
, moduleForeign :: [ForeignDecl]
, moduleForeign :: [ForeignDeclT t]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a comment explaining why this type argument is needed?

@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Oct 1, 2017

@coot @rightfold @joneshf @kritzcreek @natefaubion Any more comments or shall I merge this now?

Copy link
Copy Markdown
Contributor

@paf31 paf31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! Thanks for all of the work on this.

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Oct 1, 2017

My pleasure!

Copy link
Copy Markdown
Member

@kritzcreek kritzcreek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, I made a few comments, but they are all optional suggestions.

case t of
"ProductType" -> return ProductType
"SumType" -> return SumType
_ -> fail ("not regonized ConstructorType: " ++ T.unpack t)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typo here "regonized"

bindFromObj o = do
type_ <- o .: "bindType" :: Parser Text
case type_ of
"NonRec" -> (uncurry . uncurry $ NonRec) <$> bindFromObj' o
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(uncurry . uncurry) NonRec

Would be a little less noisy

literalToJSON t (ArrayLiteral xs) = toJSON ("ArrayLiteral", map t xs)
literalToJSON t (ObjectLiteral xs) = toJSON ("ObjectLiteral", recordToJSON t xs)
literalToJSON _ (NumericLiteral (Left n))
= object
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just encode to Value on the right-hand-side which picks the trivial ToJSON instance.

-- The CoreFn module representation
--
-- The json CoreFn representation does not contain type information. When
-- parsing it one gets back `ModuleT () Ann` rathern than `ModuleT Type Ann`,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: rathern

-- |
-- The CoreFn module representation
--
-- The json CoreFn representation does not contain type information. When
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we don't even use the types in moduleForeign. If that's the case, I'd actually rather just get rid of the types completely and lose the type argument.

Copy link
Copy Markdown

@metaleap metaleap Oct 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as they're kept around in externs, don't drop them there! (Just chiming in as I'm wrestling with an alternative backend not targeting a dynamic/untyped language like JS/Py/Erl etc ... from what I can tell I'll have to reconstruct a lot from literals & externs & such as I'm working off CoreImp that I dump from the final rawJS --- to get all the optimizations and the basic fn-to-imp transforms.) Just a quick reminder to not forget the awesome use-case for alternative backends, some of which may want to keep around as much type info as possible, to minimize usage of boxed-variants (ie what is object in Java or interface{} in Go, etc) being passed around.

In my fork I experimentally added all Anns to your --dump-corefn module, to see what I can get beyond what's in externs. Not that much more but last I checked, it did have a bit more detail. A bit exploratory all right now, to write a backend based on --dump-corefoo outputs rather than right in Haskell in the fork, but it's fun for now. All right I'm done blogging & derailing the thread 😀

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, we don't have any type information in the generated JSON right now.

@joneshf
Copy link
Copy Markdown
Member

joneshf commented Oct 2, 2017

This is great! I'm with it.

@paf31 paf31 merged commit d31638f into purescript:master Oct 2, 2017
@paf31
Copy link
Copy Markdown
Contributor

paf31 commented Oct 2, 2017

Let's merge this and fix up any remaining issues in master if we find any. Thanks!

@coot coot deleted the corefn-json branch October 2, 2017 18:29
coot pushed a commit to coot/purescript that referenced this pull request Oct 2, 2017
* CoreFn.FromJSON: parse json representation of CoreFn

* CoreFn.Module: ModuleT

Polymorphic ModuleT type: when parsing foreign declarations we don't have
access to their type, and thus we need a `ModuleT ()` type.  The
original `Module` type becomes an alias for `ModuleT Type`.

* Language.PureScript.CoreFn.DCE module

* CoreFn: include `SourceSpan`s in json representation

* CoreFn: add Meta to json representation

It is required when runnings `moduleToJs`.

* Update TestCoreFn

* Change json encoding of CoreFn Bind

* Remove `purs dce` command.

* Remove CoreFN.DCE module

* clean

* remove white space in TestCoreFn

* Object CoreFn representation.

* CoreFn moduleToJSON: new representation

- change the json representation of CoreFn
- store module name on the module object
- `buildMakeAction`: output single module json object, rather an object
with a single key (module name) pointing to module json object.

* CoreFn.FromJSON: update `moduleFromJson`

- read the new json CoreFn representation
- extend tests so that all the CoreFn structures are tested.

* Add Module Comments to json CoreFn representation

- `moduleToJSON`: include module comments inside `comments` field.
- `moduleFromJSON`
- add tests

* CoreFn json repr: modulePath to the top level

* Applied @paf31 & @kritzcreek review
@paulyoung
Copy link
Copy Markdown
Contributor

#3049 (comment)

I'm wondering if we should add a purs codegen command, which runs the JS codegen on a corefn file.

I was just wondering if it was possible to use --dump-corefn without producing JavaScript.

@coot
Copy link
Copy Markdown
Contributor Author

coot commented Oct 22, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants