Bytecode Notes

The bytecode is also a serialization format, able to represent any value that can ever exist within ValueScript (and in ValueScript, everything is a value).

Serialization is 'batteries included': serializing a value includes everything that value contains.

Notably, when it comes to functions, this means that the format must provide everything needed to execute that function in another context without any further communication with the original context. This contrasts with JavaScript's Function.prototype.toString, which returns only the source code; that source usually contains unresolved references to other entities.
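To make the contrast concrete, here is a small TypeScript sketch. The names factor and triple are made up for illustration; only Function.prototype.toString is a real API. The returned text mentions factor but carries no binding for it, so it cannot be executed elsewhere on its own:

```typescript
// Hypothetical example names; nothing here comes from ValueScript itself.
const factor = 3;

function triple(x: number): number {
  return x * factor; // `factor` is resolved from the enclosing scope
}

// toString yields the function's source text only. The text names `factor`,
// but the value 3 is not part of it.
const tripleSource: string = triple.toString();
```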

Interpreting bytecode always starts with parsing a value, and the value parsed at the beginning is the overall value being represented.

Values

These are the values that can currently be represented, along with the leading byte (in hexadecimal) that indicates each one:

01 void
02 undefined
03 null
04 false
05 true
06 signed byte (as a number)
07 number
08 string
09 array
0a object
0b function
0c instance

The following bytes can also be encountered when decoding a value.

00 end

0d pointer
0e register
0f external

These aren't valid in all contexts. For example, when decoding an array, we keep decoding values until encountering end, but it would not be valid to have end immediately when decoding a top-level value.
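The decoding loop described above can be sketched in TypeScript as follows. This is an illustration under assumptions, not the actual decoder: it only covers tags whose payload these notes make clear, it assumes that 06 is followed by a single byte read as a two's-complement integer, and it models void as a marker object. All identifiers (VsSketchValue, vsDecodeValue, VS_END) are hypothetical.

```typescript
// Sketch only. `void` is modeled as a marker object because JavaScript has
// no such value of its own.
type VsSketchValue =
  | { kind: "void" }
  | undefined
  | null
  | boolean
  | number
  | VsSketchValue[];

const VS_END = 0x00;

// Decodes one value starting at `pos`; returns it with the next position.
function vsDecodeValue(
  bytes: number[],
  pos: number
): { value: VsSketchValue; next: number } {
  if (pos >= bytes.length) {
    throw new Error("unexpected end of input");
  }
  const tag = bytes[pos];
  switch (tag) {
    case 0x01:
      return { value: { kind: "void" }, next: pos + 1 };
    case 0x02:
      return { value: undefined, next: pos + 1 };
    case 0x03:
      return { value: null, next: pos + 1 };
    case 0x04:
      return { value: false, next: pos + 1 };
    case 0x05:
      return { value: true, next: pos + 1 };
    case 0x06: {
      // Assumption: one payload byte, interpreted as int8.
      if (pos + 1 >= bytes.length) {
        throw new Error("missing signed byte payload");
      }
      const raw = bytes[pos + 1];
      return { value: raw >= 0x80 ? raw - 0x100 : raw, next: pos + 2 };
    }
    case 0x09: {
      // Arrays: keep decoding values until `end` is encountered.
      const items: VsSketchValue[] = [];
      let cursor = pos + 1;
      while (true) {
        if (cursor >= bytes.length) {
          throw new Error("unterminated array");
        }
        if (bytes[cursor] === VS_END) {
          break;
        }
        const item = vsDecodeValue(bytes, cursor);
        items.push(item.value);
        cursor = item.next;
      }
      return { value: items, next: cursor + 1 };
    }
    case VS_END:
      // `end` only terminates a container; it is never a value by itself.
      throw new Error("end is not valid where a value is expected");
    default:
      throw new Error("tag not covered by this sketch: 0x" + tag.toString(16));
  }
}
```

Under these assumptions, vsDecodeValue([0x09, 0x06, 0x01, 0x06, 0xff, 0x05, 0x00], 0) produces the array [1, -1, true], while vsDecodeValue([0x00], 0) throws because end is not valid at the top level.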

Instructions

00 end
01 mov

02 op++
03 op--

04 op+
05 op-
06 op*
07 op/
08 op%
09 op**
0a op==
0b op!=
0c op===
0d op!==

0e op&&
0f op||
10 op!

11 op<
12 op<=
13 op>
14 op>=

15 op??
16 op?.

17 op&
18 op|
19 op~
1a op^
1b op<<
1c op>>
1d op>>>

1e typeof
1f instanceof
20 in

21 call
22 apply
23 bind
24 sub
25 submov
26 subcall

27 jmp
28 jmpif

Examples

undefined is serialized as the single byte 0x02. Deserialization begins at the start of the file and reads 0x02, which means undefined. No further bytes are needed, so decoding ends there.

Usually bytecode will represent a function, so the first byte will usually be 0x0b. For example:

function(a: number, b: number) {
  return a + b;
}

is serialized as:

00: 0b05 0204 0e02 0e03 0000

with comments:

00: 0b   // function
01: 05   // 5 registers
02: 02   // 2 parameters
03: 04   // op+
04: 0e02 // read register %a
06: 0e03 // read register %b
08: 00   // write the op+ output to register %return (0e is not needed in this
         // context)
09: 00   // end
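As a cross-check of the annotated bytes, the following TypeScript sketch walks the same ten bytes and prints one line of assembly. The layout it assumes (0b, register count, parameter count, instructions, 00) is inferred from this example only, it handles just op+ with register operands, and every identifier in it is hypothetical. Registers are printed by index, so %2, %3 and %0 correspond to %a, %b and %return above.

```typescript
// Sketch only. Assumed layout, inferred from the annotated example:
//   0b <register count> <parameter count> <instructions...> 00
interface VsFunctionSketch {
  registerCount: number;
  parameterCount: number;
  instructions: string[];
}

// Reads an operand of the form 0e <index> (a register read).
function vsReadRegisterOperand(
  bytes: number[],
  pos: number
): { text: string; next: number } {
  if (bytes[pos] !== 0x0e) {
    throw new Error("this sketch only handles register operands, offset " + pos);
  }
  return { text: "%" + bytes[pos + 1], next: pos + 2 };
}

function vsDisassembleFunction(bytes: number[]): VsFunctionSketch {
  if (bytes[0] !== 0x0b) {
    throw new Error("not a function");
  }
  const result: VsFunctionSketch = {
    registerCount: bytes[1],
    parameterCount: bytes[2],
    instructions: [],
  };
  let pos = 3;
  while (pos < bytes.length && bytes[pos] !== 0x00) {
    if (bytes[pos] !== 0x04) {
      throw new Error("opcode not covered by this sketch");
    }
    const left = vsReadRegisterOperand(bytes, pos + 1);
    const right = vsReadRegisterOperand(bytes, left.next);
    // The destination is a bare register index; no 0e prefix is used here.
    const destination = bytes[right.next];
    result.instructions.push(
      "op+ " + left.text + " " + right.text + " %" + destination
    );
    pos = right.next + 1;
  }
  if (pos >= bytes.length) {
    throw new Error("missing end");
  }
  return result;
}

const vsAddExample = [0x0b, 0x05, 0x02, 0x04, 0x0e, 0x02, 0x0e, 0x03, 0x00, 0x00];
const vsAddListing = vsDisassembleFunction(vsAddExample);
// vsAddListing.instructions holds the single line "op+ %2 %3 %0".
```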

Pointers

Pointers allow referencing other values. They aren't exposed to the program, though; they're just an implementation detail.

For example, the following two assemblies are equivalent:

@main = [1, 2]

@main = [1, @two]
@two = 2
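The equivalence can be illustrated with a small TypeScript sketch that inlines pointers in an in-memory model of the assembly. The model (VsAsmValue, a pointer written as an object with a label) and the function name are assumptions made for this illustration; they say nothing about how pointers are encoded in bytecode. The resolver is deliberately naive and assumes the definitions contain no cycles.

```typescript
// Sketch only: models assembly values, not the bytecode encoding.
type VsPlainValue = number | VsPlainValue[];
type VsAsmValue = number | VsAsmValue[] | { pointer: string };

// Replaces every pointer with the value its label refers to.
function vsInlinePointers(
  value: VsAsmValue,
  definitions: { [label: string]: VsAsmValue }
): VsPlainValue {
  if (typeof value === "number") {
    return value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => vsInlinePointers(item, definitions));
  }
  const target = definitions[value.pointer];
  if (target === undefined) {
    throw new Error("undefined label @" + value.pointer);
  }
  return vsInlinePointers(target, definitions);
}

// @main = [1, 2]
const vsDirect = vsInlinePointers([1, 2], {});

// @main = [1, @two]
// @two = 2
const vsViaPointer = vsInlinePointers(
  { pointer: "main" },
  { main: [1, { pointer: "two" }], two: 2 }
);
// Both results are the array [1, 2].
```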

In ValueScript, it is impossible to construct a circular object. However, if the following assembly were allowed, it would represent a circular object:

@main = [@other]
@other = [@main]

In fact, the following assembly might cause the runtime to loop infinitely when trying to load it:

@main = @main

To avoid these problems, non-function pointers must point to a location greater than the current location. This forces pointer chains to always progress through the bytecode and never form loops.
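A validator for this rule could look like the TypeScript sketch below. It is an illustration under assumptions: definitions are kept in a list whose order stands in for bytecode location, each definition records which labels it references, and all identifiers are hypothetical. A pointer is reported when its target is not a function and does not come strictly later than the definition containing the pointer.

```typescript
// Sketch only: list order stands in for location in the bytecode.
interface VsDefinitionSketch {
  label: string;
  isFunction: boolean;
  pointsTo: string[]; // labels referenced from inside this definition
}

// Returns one message per pointer that breaks the rule; empty means valid.
function vsFindBackwardPointers(definitions: VsDefinitionSketch[]): string[] {
  const problems: string[] = [];
  for (let sourceIndex = 0; sourceIndex < definitions.length; sourceIndex++) {
    const source = definitions[sourceIndex];
    for (let p = 0; p < source.pointsTo.length; p++) {
      const targetLabel = source.pointsTo[p];
      let targetIndex = -1;
      for (let i = 0; i < definitions.length; i++) {
        if (definitions[i].label === targetLabel) {
          targetIndex = i;
          break;
        }
      }
      if (targetIndex === -1) {
        problems.push("@" + targetLabel + " is not defined");
      } else if (
        !definitions[targetIndex].isFunction &&
        targetIndex <= sourceIndex
      ) {
        // Pointers to non-functions must move strictly forward.
        problems.push("@" + source.label + " -> @" + targetLabel);
      }
    }
  }
  return problems;
}

// @main = [1, @two] / @two = 2: forward only, so no problems.
const vsForwardOnly = vsFindBackwardPointers([
  { label: "main", isFunction: false, pointsTo: ["two"] },
  { label: "two", isFunction: false, pointsTo: [] },
]);

// @main = [@other] / @other = [@main]: the second pointer goes backward.
const vsCircular = vsFindBackwardPointers([
  { label: "main", isFunction: false, pointsTo: ["other"] },
  { label: "other", isFunction: false, pointsTo: ["main"] },
]);

// @main = @main: the target is not strictly later, so it is rejected.
const vsSelfReference = vsFindBackwardPointers([
  { label: "main", isFunction: false, pointsTo: ["main"] },
]);

// A function that refers to itself is allowed.
const vsRecursiveFunction = vsFindBackwardPointers([
  { label: "factorial", isFunction: true, pointsTo: ["factorial"] },
]);
```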

Function pointers are an important exception here to allow for recursion. For example, here's an assembly for the factorial function:

@factorial = function(%n) {
  op!== %n 0 %notZero
  jmpif %notZero :recurse
  mov 1 %return
  end
recurse:
  op- %n 1 %nMinus1
  call @factorial [%nMinus1] %previousFactorial
  op* %n %previousFactorial %return
}
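For orientation, here is one TypeScript function that matches the control flow of the assembly above. It is only a plausible source: these notes do not show the actual compiler input, and other source shapes could produce the same assembly.

```typescript
// One possible source for the assembly above (an assumption, not the
// compiler's actual input).
function factorial(n: number): number {
  if (n !== 0) {
    // Corresponds to the `recurse:` branch.
    const previousFactorial = factorial(n - 1);
    return n * previousFactorial;
  }
  // Corresponds to `mov 1 %return`.
  return 1;
}
```

The self-call inside the body is what the function pointer @factorial expresses in the assembly.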
