Skip to content

Latest commit

 

History

History
401 lines (297 loc) · 12.4 KB

File metadata and controls

401 lines (297 loc) · 12.4 KB

FunctionalScript Language

Two main FunctionalScript principles:

  1. if FS code passes validation/compilation, then it doesn't have side-effects,
  2. the code that passed validation/compilation should behave on FunctionalScript VM the same way as on any other modern JavaScript engine.

When we implement features of FunctionalScript, the first priority is a simplification of the VM.

File Types:

File Type Extension Notes
JSON .json Tree.
FJS .f.js, .f.ts Graph with functions.

Note: An FJS value can't be serialized without additional run-time infrastructure.

1. JSON

VM:

We are introducing new commands in such a way that every new command depends only on previous commands.

format any Tag
JSON undefined 00
null 01
false 02
true 03
number 04 u64
string 05 String
array 06 Array
object 07 Object

2. DJS

The DJS form a graph of values. It can be serialized without additional run-time information.

format any Tag Notes
DJS const_ref 80 u32 const
bigint_plus 08 Array bigint
bigint_minus 0A Array bigint
undefined 0B undefined
own_property 0C property-accessor
instance_property 0E property-accessor
instance_method_call 0F property-accessor
at 10 property-accessor
operators operators

2.1. Required

  1. default-export,
  2. const,
  3. default-import.

2.2. Priority 1

We need it to use JSDoc and TypeScript.

  1. block-comment,
  2. namespace-import.

2.3. Priority 2

  1. undefined,
  2. bigint,
  3. property-accessor,
  4. operators,
  5. grouping,
  6. built-in.

2.4. Syntactic Sugar

  1. identifier-property,
  2. line-comment,
  3. trailing-comma,
  4. shorthand,
  5. destructuring.

3. FJS

The FJS can have functions. The format requires additional run-time information for serialization.

format any Tag Notes
FJS function Func function

3.1. Required

  1. function
  2. parameters
  3. body-const

3.2. Priority 2

  1. if. See https://developer.mozilla.org/en-US/docs/Glossary/Falsy
  2. let
  3. while
  4. export
  5. Ownership of Mutable Objects (Singletons)

3.3. Priority 3

  1. Regular Expressions.
  2. type inference
  3. promise
  4. class
  5. Temporal classes.

3.4. Syntactic Sugar

  1. expression
  2. one-parameter
  3. assignments
  4. async/await. Depends on the implementation of promises.

4. ECMAScript Proposals

  1. Type Annotations, Stage 1:
    • Node.js,
    • Deno supports TypeScript,
    • Bun supports TypeScript,
    • most browsers don't support the feature.
  2. Pipe Operator |>, Stage 2.
  3. Records and Tuples, Stage 2: One problem with such records and tuples is that they can't hold safe, immutable functions. Maybe we need something like #(a) => a * 2.
  4. Pattern Matching, Stage 1.
  5. Safe Assignment Operator.
  6. Temporal.

Wish list:

  1. Utf8 String. Something like u8"Hello, world".

5. I/O

5.1. Isolated I/O

Using dependency injection.

This implementation of VM requires the implementation of external functions.

5.2 Isolated Asynchronous I/O

It requires a promise implementation.

5.3. State Machine with Asynchronous Requests

VM doesn't need to implement external functions or promises.

type RequestType = ...;
type Request = readonly[Input, Continuation];
type Continuation = (_: Output) => Request;
type Main = Request

6. Content-Addressable VM

See also Unison, ScrapScript. And ZK: Lurk.

The main target is run-time performance.

Hash function: most likely SHA256 because there is a lot of hardware support from modern processors.

Hash structure: we will use several initial hashes for a compress function.

We may use CDT for huge arrays, objects, strings, and BigInts.

The first bit of a hash is reserved for a tag. If the tag is 0, we have raw data with 1 at the end. A hash with all zeroes is used for undefined. If the first bit is 0, then the value is a hash. So, we have only 255 bits for a hash.

Because we use tagged hash, we can keep small values in a nanenum. So it may reuse a lot from non-content addressable VM and ref-values can keep a hash value inside.

Instead of an address, we can use a prefix, hash. 48 bits should be enough for most cases. However, we also need a mechanism to resolve collisions (even if they are rare). For example, our value can be an enum like this

enum Value {
   Data(...),
   Hash(u48),
   Ref(u48),
}

However, while the === operation can be faster, Value::Hash is slower when we need to access the object's internals because it requires two dereference operations. So, we may come back to using only references.

enum Value {
   Data(...)
   Ref(u48)
}

The collision probability for 48 bits is 50% for 16777216 = 2^24 hashes (birthday attack).

7. Object Identity

To build custom dictionaries when using functions as a key, we need either an object identifier (for hash map O(1)) or a proper comparison operator (for BTree map O(log(n))). The best option now is to use < and then use an array for items that satisfy (a !== b) && !(a < b) && !(b > a).

One of the options is to use Map. The Map type is mutable and requires an object ownership tracking, similar to Rust.

7.1. Hack For Map. Add ReadonlyMap

type ImmutableMap<K, V> = {
    readonly set(k: K, v: V): ImmutableMap<K, V>
    readonly delete(k: K): ImmutableMap<K, V>
}

const immutableMap = <K, V>(map: ReadonlyMap<K, V>) => ({
    set: (...kv: readonly[K, V]) => new Map([...map, kv])
    delete: (k: K) => new Map([...map.filter([k] => k !== k)])
})

7.2. Hack For Map. Special Instructions

// a special expression which is converted into one command.
new Map(a).set(k, v)

// a special expression which is converted into one command.
const b = new Map(a)
b.delete(k)
type ImmutableMap<K, V> = {
    readonly set(k: K, v: V): ImmutableMap<K, V>
    readonly delete(k: K): ImmutableMap<K, V>
}

const immutableMap = <K, V>(map: ReadonlyMap<K, V>) => ({
    set: (k: K, v: V) => new Map(map).set(k, v)
    delete: (k: K, v: V) => {
        const x = new Map(map)
        x.delete(k)
        return x
    }
})

8. Mutable Objects and Ownership Tracking

The zero stage is to support let.

The first stage is to support mutable objects only as a local variable:

const my = () => {
   const map = new Map()
   map.set("hello", "world!") // we can change `map` here because we've never pass `map` to anything else.
   f(map) // now `map` is immutable.
   return map // returns an immutable Map.
}

Other stages may include passing mutable objects as parameters.

Implementing IO using mutable objects with ownership tracking:

type Io<S> = {
    readonly consoleLog: (s: S, msg: string) => void
    // ...
}

Or immutable

type Io<S> = {
    readonly consoleLog: (s: S, msg: string) => S
    // ...
}

With class supports, mutability state can be encapsulate with methods into one class:

class VirtualIo {
   #buffer = []
   function log(s: string) {
      this.#buffer.push(s)
   }
   function reset(x: readonly string[]) {
      // note, that we have to do a deep clone.
      this.#buffer = []
      for (const i of x) {
         this.#buffer.push(i)
      }
   }
}

8.1. Local mutability

const ar = () => {
   const a = [] // a is mutable
   a.push('hello')
   return a // now it's immutable
}

8.2. Pass Mutability

const ar = a => { // a is marked as mutable because we don't use the object anywhere else.
   a.push('hello')
}

const f = () => {
   const a = []
   ar(a)
   return a
}

Here we consider an object is mutable if it has only one path.

const f = a => { // a is not mutable because we return a
   return a
}

const f1 = a => {
   const x = () => {
      // a is mutable
      a.push('x')
   }
   x()
}

const f2 = a => {
   const x = () => {
      a.push('x')
   }
   return x // compilation error.
}

Should we have a global analysis?

8.3. Async in Tests

type Fs = {
   readonly readFile: (name: string) => Promise<string>
   readonly writeFile: (name: string, text: string) => Promise<void>
}

// Should every test receive a state object, similar to Map?

type AsyncMap<K, V> = {
   readonly asyncGet: (k: K) => Promise<V|undefined>
   readonly asyncSet: (k: K, v: V|undefined) => Promise<void>
}

const fs = (state: AsyncMap<string, string>) => ({
   readFile: async(name: string): Promise<string> => { /* ? */ }
   writeFile: async(name: string, text: string): Promise<void> => { /* ? */ }
})

// Another option is to allow access to `let` in `async` functions.

const test = async(f: (fs: Fs) => Promise<void>): Promise<void> => {
    let x = new Map()
    const fs = {
        readFile: async() => {
            x.get() /* ... */
        }
        writeFile: async(k: string, v: string) => { // should writeFile
            x = new Map(concat(x, [[k, v]]))
        }
    }
    await f(fs)
}

9. Bytecode

Let's have a concise and relatively stable "external" bytecode model that is fit for representation of processed FS code upon serialization (saving in-memory function object) / deserialization (loading previously saved function objects). A VM implementation has an option to transform that external bytecode into its internal representation on loading.

Looking from that perspective we should design external bytecode in most flexible manner, allowing for various kinds of optimizations in VM implementations. For example, bytecode that always copies arguments of a function call to devoted stack slots (before the proper call instruction) disallow optimization opportunities for calling well-known host (built-in) functions that can be implemented without excessive copying / slot allocations.

On the other hand we target for simplicity and speed of bytecode loader, for example, we represent number literals as an LSB double floating point (8 bytes - LSB picked here since it's an in-memory representation more commonly used in popular processor architectures).

  1. Call-like instructions