Commit 212aec2

Add CompilerInternals.md
1 parent 95e3c30 commit 212aec2

3 files changed

+54
-3
lines changed

.tool-versions

Lines changed: 2 additions & 2 deletions
@@ -1,3 +1,3 @@
1-
erlang 20.1
2-
elixir 1.6.0-otp-20
1+
erlang 20.3
2+
elixir 1.6.5-otp-20
33
nodejs 8.9.1

CompilerInternals.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# Compiler Internals
2+
3+
This document describes how ElixirScript works. It is intended for those who would like to contribute to ElixirScript or who are curious about how it works.
4+
5+
## Input
6+
7+
[ElixirScript.Compiler](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/compiler.ex) is the entry point of the compiler. It takes in either a module or a list of modules. These are called the `entry modules`, or the entry points into your application. They are where ElixirScript starts its compilation process. It traverses what is used from there and compiles only those things. The first step in the compilation process is therefore finding the used modules to compile.
8+
9+
## Finding Used Modules
10+
11+
[ElixirScript.FindUsedModules](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/find_used_modules.ex) looks at our entry modules and recursively crawls them to find all the modules used. It first extracts the Abstract Syntax Tree (AST) from the Beam file and then looks for references to modules that haven't been crawled yet. This information is stored in [ElixirScript.State](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/state.ex).
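
The crawl described above is a reachability search over module references. The following is a minimal sketch of that idea in JavaScript, not ElixirScript's implementation: the `references` table and the `findUsedModules` name are hypothetical, and in the real compiler the references would come from each module's AST.

```javascript
// Hypothetical dependency table: module name -> modules it references.
// In ElixirScript this information would come from each module's AST.
const references = {
  Main: ["Greeter", "Formatter"],
  Greeter: ["Formatter"],
  Formatter: [],
  Unused: ["Formatter"],
};

// Crawl from the entry modules, visiting each module at most once.
function findUsedModules(entryModules, refs) {
  const used = new Set();
  const pending = [...entryModules];
  while (pending.length > 0) {
    const current = pending.pop();
    if (used.has(current)) continue; // already crawled
    used.add(current);
    for (const dep of refs[current] || []) {
      if (!used.has(dep)) pending.push(dep);
    }
  }
  return used;
}

const usedModules = findUsedModules(["Main"], references);
console.log([...usedModules].sort()); // "Unused" is never reached
```

Starting from `Main`, only `Main`, `Greeter`, and `Formatter` are collected, so `Unused` would never be compiled.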
12+
13+
## AST Extraction from Beam Files
14+
15+
ElixirScript requires Erlang 20+ and Elixir 1.6+. The reason is that Erlang 20 introduced a feature that allows debug information to be stored in beam files. Any of the beam languages can use this. Elixir uses it to store the AST for the module. This is a special version of the AST in which all of the macros are expanded, so ElixirScript does not have to worry about macro expansion itself. This AST is what ElixirScript works with.
16+
17+
The code for this is in the [ElixirScript.Beam](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/beam.ex) module.
18+
19+
`ElixirScript.Beam.debug_info/1` takes in a module name and returns the AST for that module. For a normal module, `{:ok, map}` is returned. If a protocol is given, `{:ok, atom, map, list}` is returned. The `atom` is the name of the protocol, the `map` is the protocol's AST, and the `list` is the list of all of the implementation modules.
20+
21+
This module handles the `String` and `Agent` modules a little bit differently. Because of how Elixir compiles the Unicode library, ElixirScript has to be careful not to compile the entire Unicode library into JavaScript. So here, `debug_info` will get the AST from `String`, but replace some functions with the AST from `ElixirScript.String`. This ensures ElixirScript uses versions of functions in the standard lib that won't bring in the Unicode module. The same thing happens for `Agent`, for different reasons. `Agent` is the only OTP module ElixirScript supports. ElixirScript hacks together a version of `Agent` that stores state in a way that allows ElixirScript users to use `Agent` just like they would with Elixir.
22+
23+
## Finding Used Functions
24+
25+
[ElixirScript.FindUsedFunctions](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/find_used_functions.ex) is our second pass for shrinking our compilation surface. In this pass, we crawl through the modules we have found for compilation and see which functions are actually being called. This information is also stored in [ElixirScript.State](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/state.ex) for each module.
26+
27+
**Note**: Because of the way protocols work, it is impossible to know what is used and what isn't. So for protocols and their implementations, we have to take in everything.
28+
29+
Now we have what we need to compile to the JavaScript AST.
30+
31+
## JavaScript AST (ESTree)
32+
33+
Before going further, here is a brief introduction to the JavaScript AST we use. The [ESTree spec](https://github.com/estree/estree) is a specification based on SpiderMonkey's JavaScript AST. It is used by several tools in the JavaScript ecosystem. There are many other JavaScript ASTs, but ElixirScript uses this one because popular tools in the JavaScript ecosystem understand it. ElixirScript uses the [ESTree](https://github.com/elixirscript/elixir-estree) Hex package. This package has structs that represent ESTree nodes. It can also turn those into JavaScript code.
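
To make the shape of an ESTree tree concrete, here is a small sketch in plain JavaScript. The node types `BinaryExpression` and `Literal` and their fields come from the ESTree spec; the `generate` function is a toy written for this example that handles only those two node types, and it is not the code generator from the Hex package.

```javascript
// ESTree nodes are plain objects tagged with a `type` field.
// This is the tree for the expression `1 + 2`.
const ast = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Literal", value: 1 },
  right: { type: "Literal", value: 2 },
};

// Toy generator covering only the two node types used above.
function generate(node) {
  switch (node.type) {
    case "Literal":
      return JSON.stringify(node.value);
    case "BinaryExpression":
      return `${generate(node.left)} ${node.operator} ${generate(node.right)}`;
    default:
      throw new Error(`unsupported node type: ${node.type}`);
  }
}

console.log(generate(ast)); // prints "1 + 2"
```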
34+
35+
## Translation
36+
37+
[ElixirScript.Translate](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/translate.ex) starts off the translation process. All this module does, though, is call [ElixirScript.Translate.Module](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/translate/module.ex) on each of our modules. That module takes in the module info for each module and starts translating to JavaScript AST. It compiles the function definitions into JavaScript, using the information gained from `ElixirScript.FindUsedFunctions` to remove any unused functions. In Elixir, a function is identified by its name and its arity. In JavaScript, that is not the case, so ElixirScript combines the arities of a function into one definition here. From here, ElixirScript compiles each function and places the translated AST back into `ElixirScript.State`.
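
As an illustration of what combining arities means, the sketch below folds a hypothetical `greet/0` and `greet/1` into a single JavaScript function that dispatches on the number of arguments. It is hand-written for this document and is not the code ElixirScript actually emits.

```javascript
// Elixir source being imagined here (hypothetical):
//   def greet(), do: greet("world")
//   def greet(name), do: "Hello, " <> name
//
// Both arities share one JavaScript function that picks a body by
// the number of arguments it received.
function greet(...args) {
  switch (args.length) {
    case 0:
      return greet("world");
    case 1:
      return "Hello, " + args[0];
    default:
      throw new Error(`no greet/${args.length} clause`);
  }
}

console.log(greet());      // prints "Hello, world"
console.log(greet("Ada")); // prints "Hello, Ada"
```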
38+
39+
Functions are made up of clauses. Each clause has guards and a block; the block is the code that makes up the implementation.
40+
41+
[ElixirScript.Translate.Function](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/translate/function.ex) handles function translation. `ElixirScript.Translate.Function.compile_block/2` handles compilation of blocks. For each item in the block, `ElixirScript.Translate.Form.compile/2` is called. This is what is responsible for the bulk of the translation.
42+
43+
One more aside about function translation: Elixir supports tail call recursion, while most JavaScript engines do not. To allow our ElixirScript-translated functions to recurse this way, we use a technique called `trampolining`. ElixirScript's implementation still has some bugs, but it works for the most part.
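
For readers unfamiliar with trampolining, here is a minimal sketch of the general technique in JavaScript. The `trampoline` and `sumTo` names are made up for this example, and ElixirScript's own implementation may be structured differently.

```javascript
// Instead of calling itself directly, a tail-recursive function returns
// a zero-argument function (a thunk) describing the next call. The
// trampoline runs thunks in a loop, so the call stack never grows.
function trampoline(fn, ...args) {
  let result = fn(...args);
  while (typeof result === "function") {
    result = result();
  }
  return result;
}

// Sums 1..n with an accumulator; the recursive call is in tail position.
function sumTo(n, acc = 0) {
  if (n === 0) return acc;
  return () => sumTo(n - 1, acc + n);
}

console.log(trampoline(sumTo, 100000)); // prints 5000050000
```

Note that this simple version treats any returned function as a pending call, so it cannot be used as-is for functions whose real result is a function; wrapping the pending call in a dedicated marker object avoids that ambiguity.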
44+
45+
## Pattern Matching Translation
46+
47+
Patterns are processed using [ElixirScript.Translate.Forms.Pattern](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/translate/forms/pattern.ex). It takes all the forms of patterns and compiles them into JavaScript AST. The AST represents calls to the [Tailored](https://github.com/elixirscript/tailored) JavaScript library. This library is responsible for pattern matching at run time.
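
To give a feel for what pattern matching at run time involves, the sketch below implements a tiny matcher by hand. The `variable` and `match` helpers are hypothetical and are not intended to mirror Tailored's actual API; tuples are modelled as arrays and atoms as strings purely for the example.

```javascript
// Hypothetical pattern helper: marks a position that binds a name.
const variable = (name) => ({ kind: "variable", name });

// Returns an object of bindings on success, or null when the value
// does not match the pattern.
function match(pattern, value, bindings = {}) {
  if (pattern !== null && typeof pattern === "object" && pattern.kind === "variable") {
    return { ...bindings, [pattern.name]: value };
  }
  if (Array.isArray(pattern)) {
    if (!Array.isArray(value) || value.length !== pattern.length) return null;
    let current = bindings;
    for (let i = 0; i < pattern.length; i++) {
      current = match(pattern[i], value[i], current);
      if (current === null) return null;
    }
    return current;
  }
  // Anything else is treated as a literal and compared by identity.
  return pattern === value ? bindings : null;
}

// Roughly `{:ok, result} = {:ok, 42}`.
console.log(match(["ok", variable("result")], ["ok", 42]));   // binds result to 42
console.log(match(["ok", variable("result")], ["error", 1])); // null, no match
```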
48+
49+
## Output
50+
51+
[ElixirScript.Output](https://github.com/elixirscript/elixirscript/blob/master/lib/elixir_script/passes/output.ex) is the last step in compilation. This module is responsible for creating JavaScript modules and writing them to the file system. Each Elixir module is translated into a JavaScript module.

mix.exs

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ defmodule ElixirScript.Mixfile do
1414
test_coverage: [tool: ExCoveralls],
1515
docs: [
1616
main: "ElixirScript",
17-
extras: ["JavaScriptInterop.md"]
17+
extras: ["JavaScriptInterop.md", "CompilerInternals.md"]
1818
]
1919
]
2020
end
