Beam has two different instructions sets, an internal instructions set, called specific, and an external instruction set, called generic.
The generic instruction set is what could be called the official instruction set, this is the set of instructions used by both the compiler and the Beam interpreter. If there was an official Erlang Virtual Machine specification it would specify this instruction set. If you want to write your own compiler to the Beam, this is the instruction set you should target. If you want to write your own EVM this is the instruction set you should handle.
The external instruction set is quite stable, but it does change between Erlang versions, especially between major versions.
This is the instruction set which we will cover in this chapter.
The other instruction set, the specific, is an optimized instruction set used by the Beam to implement the external instruction set. To give you an understanding of how the Beam works we will cover this instruction set in [CH-Internal_instructions]. The internal instruction set can change without warning between minor version or even in patch releases. Basing any tool on the internal instruction set is risky.
In this chapter I will go through the general syntax for the instructions and some instruction groups in detail, a complete list of instructions with short descriptions can be found in [AP-Instructions].
The names and opcodes of the generic instructions are defined
in lib/compiler/src/genop.tab
.
The file contains a version number for the Beam instruction format, which
also is written to .beam
files. This number has so far never changed
and is still at version 0. If the external format would be changed in a
non backwards compatible way this number would be changed.
The file genop.tab is used as input by beam_makeops
which is a perl script
which generate code from the ops tabs. The generator is used both to generate
Erlang code for the compiler (beam_opcodes.hrl and beam_opcodes.erl) and to
generate C code for the emulator ( TODO: Filenames).
Any line in the file starting with "#" is a comment and ignored by
beam_makeops
. The file can contain definitions, which turns into a
binding in the perl script, of the form:
NAME=EXPR
Like, e.g.:
BEAM_FORMAT_NUMBER=0
The Beam format number is the same as the instructionset
field in
the external beam format. It is only bumped when a backwards
incompatible change to the instruction set is made.
The main content of the file are the opcode definitions of the form:
OPNUM: [-]NAME/ARITY
Where OPNUM and ARITY are integers, NAME is an identifier starting with a lowercase letter (a-z), and :, -, and / are literals.
For example:
1: label/1
The minus sign (-) indicates a deprecated function. A deprecated function keeps its opcode in order for the loader to be sort of backwards compatible (it will recognize deprecated instructions and refuse to load the code).
In the rest of this Chapter we will go through some BEAM instructions in detail. For a full list with brief descriptions see: [AP-Instructions].
As we saw in [CH-Compiler] we can give the option 'S' to the
Erlang compiler to get a .S
file with the BEAM code for the module
in a human and machine readable format (actually as Erlang terms).
Given the file beamexample1.erl:
-module(beamexample1).
-export([id/1]).
id(I) when is_integer(I) -> I.
When compiled with erlc -S beamexample.erl
we get the following
beamexmaple.S file:
{module, beamexample1}. %% version = 0
{exports, [{id,1},{module_info,0},{module_info,1}]}.
{attributes, []}.
{labels, 7}.
{function, id, 1, 2}.
{label,1}.
{line,[{location,"beamexample1.erl",5}]}.
{func_info,{atom,beamexample1},{atom,id},1}.
{label,2}.
{test,is_integer,{f,1},[{x,0}]}.
return.
{function, module_info, 0, 4}.
{label,3}.
{line,[]}.
{func_info,{atom,beamexample1},{atom,module_info},0}.
{label,4}.
{move,{atom,beamexample1},{x,0}}.
{line,[]}.
{call_ext_only,1,{extfunc,erlang,get_module_info,1}}.
{function, module_info, 1, 6}.
{label,5}.
{line,[]}.
{func_info,{atom,beamexample1},{atom,module_info},1}.
{label,6}.
{move,{x,0},{x,1}}.
{move,{atom,beamexample1},{x,0}}.
{line,[]}.
{call_ext_only,2,{extfunc,erlang,get_module_info,2}}.
In addition to the actual beam code for the integer identity function we also get some meta instructions.
The first line {module, beamexample1}. %% version = 0
tells
us the module name beamexample1
and the version number for
the instruction set 0
.
Then we get a list of exported functions id/1
, module_info/0
,
module_info/1
. As we can see the compiler has added two auto
generated functions to the code. These two functions are just
dispatchers to the generic module info BIFs (erlang:module_info/1
and
erlang:module_info/2
) with the name of the module added as the first
argument.
The line {attributes, []}
list all defined compiler attributes, none in
our case.
Then we get to know that there are less than 7 labels in the module,
{labels, 7}
, which makes it easy to do code loading in one pass.
The last type of meta instruction is the function
instruction on
the format {function, Name, Arity, StartLabel}
. As we can see with
the id
function the start label is actually the second label in the
code of the function.
The instruction {label, N}
is not really an instruction, it does not
take up any space in memory when loaded. It is just to give a local
name (or number) to a position in the code. Each label potentially
marks the beginning of a basic block since it is a potential
destination of a jump.
The first two instructions following the first label ({label,1}
)
are actually error generating code which adds the line number and
module, function and arity information and throws an exception.
That are the instructions line
and func_info
.
The meat of the function is after {label,2}
, the instruction
{test,is_integer,{f,1},[{x,0}]}
. The test instruction tests if its
arguments (in the list at the end, that is variable {x,0}
in this
case) fulfills the test, in this case is an integer (is_integer
).
If the test succeeds the next instruction (return
) is executed.
Otherwise the functions fails to label 1 ({f,1}
), that is,
execution continues at label one where a function clause exception
is thrown.
The other two functions in the file are auto generated. If we look at
the second function the instruction {move,{x,0},{x,1}}
moves the
argument in register x0
to the second argument register x1
. Then the
instruction {move,{atom,beamexample1},{x,0}}
moves the module name
atom to the first argument register x0
. Finally a tail call is made to
erlang:get_module_info/2
({call_ext_only,2,{extfunc,erlang,get_module_info,2}}
). As we will
see in the next section there are several different call instructions.
As we have seen in [CH-Calls] there are several different types
of calls in Erlang. To distinguish between local and remote calls
in the instruction set, remote calls have _ext
in their instruction
names. Local calls just have a label in the code of the module, while
remote calls takes a destination of the form {extfunc, Module, Function,
Arity}
.
To distinguish between ordinary (stack building) calls and
tail-recursive calls, the latter have either _only
or _last
in
their name. The variant with _last
will also deallocate as many
stack slots as given by the last argument.
There is also a call_fun Arity
instruction that calls the closure
stored in register {x, Arity}
. The arguments are stored in x0
to {x,
Arity-1}
.
For a full listing of all types of call instructions see [AP-Instructions].
The stack and the heap of an Erlang process on Beam share the same memory area see [CH-Processes] and [CH-Memory] for a full discussion. The stack grows toward lower addresses and the heap toward higher addresses. Beam will do a garbage collection if more space than what is available is needed on either the stack or the heap.
- A leaf function
-
A leaf function is a function which doesn’t call any other function.
- A non leaf function
-
A non leaf function is a function which may call another function.
On entry to a non leaf function the continuation pointer (CP) is saved on
the stack, and on exit it is read back from the stack. This is done by the
allocate
and deallocate
instructions, which are used for setting up
and tearing down the stack frame for the current instruction.
A function skeleton for a leaf function looks like this:
{function, Name, Arity, StartLabel}.
{label,L1}.
{func_info,{atom,Module},{atom,Name},Arity}.
{label,L2}.
...
return.
A function skeleton for a non leaf function looks like this:
{function, Name, Arity, StartLabel}.
{label,L1}.
{func_info,{atom,Module},{atom,Name},Arity}.
{label,L2}.
{allocate,Need,Live}.
...
call ...
...
{deallocate,Need}.
return.
The instruction allocate StackNeed Live
saves the continuation
pointer (CP) and allocate space for StackNeed
extra words on the
stack. If a GC is needed during allocation save Live
number of X
registers. E.g. if Live
is 2 then registers x0
and x1
are saved.
When allocating on the stack, the stack pointer (E) is decreased.
Before After | xxx | | xxx | E -> | xxx | | xxx | | | | ??? | caller save slot ... E -> | CP | ... ... HTOP -> | | HTOP -> | | | xxx | | xxx |
For a full listing of all types of allocate and deallocate instructions see [AP-Instructions].
Sending a message is straight forward in beam code. You just use the
send
instruction. Note though that the send instruction does not
take any arguments, it is more like a function call. It assumes that
the arguments (the destination and the message) are in the argument
registers x0
and x1
. The message is also copied from x1
to x0
.
Receiving a message is a bit more complicated since it involves both selective receive with pattern matching and introduces a yield/resume point within a function body. (There is also a special feature to minimize message queue scanning using refs, more on that later.)
A minimal receive loop, which accepts any message and has no timeout
(e.g. receive _ -> ok end
) looks like this in BEAM code:
{label,1}.
{loop_rec,{f,2},{x,0}}.
remove_message.
{jump,{f,3}}.
{label,2}.
{wait,{f,1}}.
{label,3}.
...
The loop_rec L2 x0
instruction first checks if there is any message
in the message queue. If there are no messages execution jumps to L2,
where the process will be suspended waiting for a message to arrive.
If there is a message in the message queue the loop_rec
instruction
also moves the message from the m-buf to the process heap. See
[CH-Memory] and [CH-Processes] for details of the m-buf
handling.
For code like receive _ -> ok end
, where we accept any messages,
there is no pattern matching needed, we just do a remove_message
which unlinks the next message from the message queue. (It also
removes any timeout, more on this soon.)
For a selective receive like e.g. receive [] -> ok end
we will
loop over the message queue to check if any message in the queue
matches.
{label,1}.
{loop_rec,{f,3},{x,0}}.
{test,is_nil,{f,2},[{x,0}]}.
remove_message.
{jump,{f,4}}.
{label,2}.
{loop_rec_end,{f,1}}.
{label,3}.
{wait,{f,1}}.
{label,4}.
...
In this case we do a pattern match for Nil after the loop_rec
instruction if there was a message in the mailbox. If the message
doesn’t match we end up at L3 where the instruction loop_rec_end
advances the save pointer to the next message (p->msg.save =
&(*p->msg.save)->next
) and jumps back to L2.
If there are no more messages in the message queue the process is
suspended by the wait
instruction at L4 with the save pointer pointing
to the end of the message queue. When the processes is rescheduled
it will only look at new messages in the message queue (after the save
point).
If we add a timeout to our selective receive the wait instruction is replaced by a wait_timeout instruction followed by a timeout instruction and the code following the timeout.
{label,1}.
{loop_rec,{f,3},{x,0}}.
{test,is_nil,{f,2},[{x,0}]}.
remove_message.
{jump,{f,4}}.
{label,2}.
{loop_rec_end,{f,1}}.
{label,3}.
{wait_timeout,{f,1},{integer,1000}}.
timeout.
{label,4}.
...
The wait_timeout
instructions sets up a timeout timer with the given
time (1000 ms in our example) and it also saves the address of the
next instruction (the timeout
) in p->def_arg_reg[0]
and then
when the timer is set, p->i
is set to point to def_arg_reg
.
This means that if no matching message arrives while the process is suspended a timeout will be triggered after 1 second and execution for the process will continue at the timeout instruction.
Note that if a message that doesn’t match arrives in the mailbox, the
process is scheduled for execution and will run the pattern matching
code in the receive loop, but the timeout will not be canceled. It is
the remove_message
code which also removes any timeout timer.
The timeout
instruction resets the save point of the mailbox to the
first element in the queue, and clears the timeout flag (F_TIMO) from
the PCB.
We have now come to the last version of our receive loop, where we use the ref trick alluded to earlier to avoid a long message box scan.
A common pattern in Erlang code is to implement a type of "remote
call" with send and a receive between two processes. This is for
example used by gen_server. This code is often hidden behind a library
of ordinary function calls. E.g., you call the function
counter:increment(Counter)
and behind the scene this turns into
something like Counter ! {self(), inc}, receive {Counter, Count} ->
Count end
.
This is usually a nice abstraction to encapsulate state in a process. There is a slight problem though when the mailbox of the calling process has many messages in it. In this case the receive will have to check each message in the mailbox to find out that no message except the last matches the return message.
This can quite often happen if you have a server that receives many messages and for each message does a number of such remote calls, if there is no back throttle in place the servers message queue will fill up.
To remedy this there is a hack in ERTS to recognize this pattern and avoid scanning the whole message queue for the return message.
The compiler recognizes code that uses a newly created reference (ref) in a receive (see [ref_trick_code]), and emits code to avoid the long inbox scan since the new ref can not already be in the inbox.
Ref = make_ref(),
Counter ! {self(), inc, Ref},
receive
{Ref, Count} -> Count
end.
This gives us the following skeleton for a complete receive, see [ref_receive].
{recv_mark,{f,3}}.
{call_ext,0,{extfunc,erlang,make_ref,0}}.
...
send.
{recv_set,{f,3}}.
{label,3}.
{loop_rec,{f,5},{x,0}}.
{test,is_tuple,{f,4},[{x,0}]}.
...
{test,is_eq_exact,{f,4},[{x,1},{y,0}]}.
...
remove_message.
...
{jump,{f,6}}.
{label,4}.
{loop_rec_end,{f,3}}.
{label,5}.
{wait,{f,3}}.
{label,6}.
The recv_mark
instruction saves the current position (the end
msg.last
) in msg.saved_last
and the address of the label
in msg.mark
The recv_set
instruction checks that msg.mark
points to the next
instruction and in that case moves the save point (msg.save
) to the
last message received before the creation of the ref
(msg.saved_last
). If the mark is invalid ( i.e. not equal to
msg.save
) the instruction does nothing.