declarations
There are 4 of them so far
There are two namespaces in Helium: one for types and one for variables and functions. Every time
you create a function you define two new namespaces, one for types and one for the rest. At the very
top there is a global namespace where main (and other functions and types) live.
To define a function you will write something like this:
fn sum(a: int, b: int): int
{
    ret a + b;
}
So, each function definition starts with the fn keyword followed by an alphanumeric name, then a list
of function parameters in parentheses. Each parameter MUST have a type. The function also has a type
that goes after the formal parameters and defines the type of the function's result. The function
body is enclosed in curly braces. To return a value from a function, use the ret statement followed
by an expression.
Some parts of a function definition can be omitted. For example, if the type of the function is omitted, it will be derived from the type of the function's result:
fn sum(a: int, b: int)
{
    ret a + b;
}
The two functions above are equivalent.
//TODO:
The ret statement and the ; at the function edge can be omitted, giving the compiler a hint that the
last expression is the result of the function:
fn sum(a: int, b: int)
{
    a + b
}
If a function does not have any formal parameters, the parentheses can be omitted. These two functions are equivalent:
fn blah() { 42 }
fn blah { 42 }
You can define a function inside a function, effectively creating a new namespace for types and variables:
def Point = { x: int, y: int }
fn first
{
    let a = Point { x = 10, y = 20 };
    fn blah { 42 }
    fn second
    {
        let a = Point { x = 100 };
        let b = Point { y = 200 };
        ret a.x + b.y + blah();
    }
    second()
}
In this example there are 3 namespace levels: the global level, the first function level and the
second function level. The function first opens a child namespace relative to global, which allows
it to access the type Point. The function second opens a child namespace relative to the first
function, which allows it to access the type Point and the variable a from that level, but it also
defines a new variable a, which shadows the original variable at the first function level. It also
gets access to the function blah defined one level above; if you defined a function named blah
again, it would shadow the original one. At any point you use the most recent definition of a
symbol in the program. You CANNOT define two variables or functions with the same name on the same
level. You CANNOT access symbols from a sibling level or its children.
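The lookup rules above can be modelled in a few lines. The following is only a sketch in Python, not how the Helium compiler is implemented: the Scope class, its methods, and the extra sibling level are hypothetical names introduced for illustration, and only the variable/function namespace is modelled (types would get a second, identical chain).

```python
class Scope:
    """One namespace level for variables and functions (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}

    def define(self, symbol, value):
        # Two definitions with the same name on the same level are rejected.
        if symbol in self.symbols:
            raise NameError(f"'{symbol}' is already defined in {self.name}")
        self.symbols[symbol] = value

    def lookup(self, symbol):
        # Walk from the current level towards global; the nearest definition wins.
        scope = self
        while scope is not None:
            if symbol in scope.symbols:
                return scope.symbols[symbol]
            scope = scope.parent
        raise NameError(f"'{symbol}' is not visible from {self.name}")


global_scope = Scope("global")
first = Scope("first", parent=global_scope)
second = Scope("second", parent=first)
sibling = Scope("sibling", parent=global_scope)  # hypothetical sibling of first

first.define("a", "a@first")
first.define("blah", "blah@first")
second.define("a", "a@second")  # shadows a from the first level

print(second.lookup("a"))     # a@second
print(second.lookup("blah"))  # blah@first
```

Because lookup only ever follows parent links, sibling.lookup("a") raises NameError, which mirrors the rule that symbols of a sibling level or its children are not accessible.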
TBD You can call a function before declaring it:
fn main
{
    let a = sum(40, 2);
}
fn sum(a: int, b: int) { a + b }
The main function is a very special function: it is the entry point for the whole program. It has three important differences from a regular function (thus far): it always returns an integer result; if no return value is provided, it returns 0; and when the last statement (or expression) of the function is reached, the program terminates. Here is a minimal program:
fn main {}
There are two ways to define (declare) a variable: locally inside a function or globally outside of any function. Global variables reside in the .data segment and are accessible as long as the program runs. Local variables reside inside a function definition (technically either in a register or on the stack); they come to life when you execute a function and die when execution falls through the function end.
Global variables that are not provided with an initialization value are initialized to a default value based on their type. Local variables are always uninitialized unless provided with an initialization expression.
You declare a local variable like this:
fn main
{
    let a: int;
}
Here the variable declaration starts with the keyword let followed by an alphanumeric variable name,
followed by : and the integral type name int, ended with ; to close the statement. The symbol a is
an uninitialized variable of type int, which means its value is garbage (whatever was on the stack
or in the register before entering this function). To initialize a variable you need to provide an
initialization expression:
fn main
{
    let a: int = 10;
}
Here the variable a is initialized with the integer literal expression 10, which is of a similar
(actually the same in this case) type as the variable a. This brings the cool possibility of local
type inference, which allows us to reduce tautology in code by omitting types in places where they
can be inferred by the compiler:
fn main
{
    let a = 10;
}
The code above is the same as the previous one but with the variable type omitted and inferred from the initialization expression. An initialization expression is virtually any expression that yields a value.
Not implemented, tbd
A type declaration is a way to assign a name to another type; most of the time it will be an anonymous record type:
def Point = { x:int, y:int }
Here the record type definition starts with the keyword def followed by an alphanumeric type name,
then the binding symbol = followed by a list of typed names enclosed in braces. Each name in a
compound type MUST have a type. The order of typed names does not really matter, but it does define
the actual name of the type within its namespace.
Unlike variables, there is no special handling for global types; they are just top-level symbols. But they follow the general namespace rules outlined at the top of the page.
(currently in progress)
Inline assembly allows you to embed assembler instructions within Helium code. There are two forms
of the asm statement. The simple form allows you to write direct assembler statements, define
labels, use registers etc. At the top level of your program you can use only this form. The extended
form allows you to use advanced features such as meta labels and registers, variable interpolation,
function declaration (TBD), Helium expression interpolation (TBD), and function calls in both
directions.
The simple form is just plain assembler instructions:
asm
{
    addi $t0, $t1, 42
    li $t0, 42
}
In this example there are two instructions: the first one is a real MIPS instruction, the second is
a pseudo-instruction (or macro) that the assembler expands into one or more real instructions. At
this point you cannot do much with the simple version of asm since no data definitions are allowed
yet.
Direct usage (naming) of registers and labels inside an asm that is itself declared inside a
function is a BAD idea. Using exact register names greatly restricts the register allocation pass,
which can lead to unneeded spills and slow down the program; using explicit labels might (and
probably will) lead to label collisions. The solution to this problem is to use meta symbols: from
the programmer's perspective not much changes, but for the compiler there is no longer any
restriction on which register to use and which label name to generate.
A meta register is an alphanumeric name preceded by a backtick `:
asm
{
    addi `t0, `t1, 42
    li `t0, 42
}
It is the same example as above but with all $ symbols replaced with backticks, allowing the
compiler to use any registers available for the statement. Meta registers with the same name will
receive the same real register no matter what. Meta names are, of course, not limited to real
register names:
asm
{
    addi `banana, `blah, 42
    li `banana, 42
}
A meta label is an alphanumeric name preceded by two backticks:
asm
{
    addi `cnt, $0, 0
    addi `tst, $0, 5
    ``repeat:
    addi `cnt, `cnt, 1
    bne `cnt, `tst, ``repeat
}
In this example we define a counter in `cnt and a test value in `tst. On each iteration we
increment the counter by 1 and check whether it has become equal to the test value: if not, we
branch to the repeat label; once it reaches the value, we fall through to the exit. The label
repeat is a meta label; in the actual code the compiler will use a generated name like L1, L2, L3
etc.
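To make the renaming concrete, here is a rough Python sketch of how meta symbols could be turned into real ones. It is an illustration only: resolve_meta and its free_registers argument are hypothetical, and handing out registers in order of first use is an assumption standing in for the compiler's real register allocation pass, which is free to pick registers differently or spill.

```python
import re


def resolve_meta(lines, free_registers):
    """Replace ``label with generated labels and `reg with real registers."""
    registers = {}
    labels = {}
    pool = iter(free_registers)

    def label(match):
        name = match.group(1)
        if name not in labels:
            labels[name] = f"L{len(labels) + 1}"  # generated names: L1, L2, ...
        return labels[name]

    def register(match):
        name = match.group(1)
        if name not in registers:
            registers[name] = next(pool)  # same meta name -> same real register
        return registers[name]

    out = []
    for line in lines:
        # Labels first: they start with two backticks, registers with one.
        line = re.sub(r"``(\w+)", label, line)
        line = re.sub(r"`(\w+)", register, line)
        out.append(line)
    return out


source = [
    "addi `cnt, $0, 0",
    "addi `tst, $0, 5",
    "``repeat:",
    "addi `cnt, `cnt, 1",
    "bne `cnt, `tst, ``repeat",
]
for line in resolve_meta(source, ["$t0", "$t1"]):
    print(line)
```

With the two free registers given here, `cnt becomes $t0, `tst becomes $t1 and ``repeat becomes L1 everywhere it appears, while the explicitly named $0 is left untouched.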
This feature allows you to use Helium variables instead of registers or meta registers inside an inline assembly statement. Here is the previous example with small changes:
fn main
{
    let tst: int = 5;
    let cnt: int = 0;
    asm
    {
        ``repeat:
        addi cnt, cnt, 1
        bne cnt, tst, ``repeat
    }
}
This is the same example as above, but instead of registers for the counter and test value we use Helium variables. Variable interpolation is not limited to simple variables: you can also use record fields and array subscripts (TBD):
def Blah = { tst: int, cnt: int }
fn main
{
    let b = Blah { tst = 10 };
    asm
    {
        ``repeat:
        addi b.cnt, b.cnt, 1
        bne b.cnt, b.tst, ``repeat
    }
}
(currently in development)
Some instructions are (will be) allowed to use Helium literal interpolation to simplify the
definition of data. For example, the la macro loads a 32-bit value into a register; it also accepts
a label as an argument, which will be replaced with a 32-bit address. Inline assembly makes this
macro also accept a Helium literal, for example a string:
fn main
{
    let len: int;
    asm
    {
        la `str, "Hello, World!"
        lw len, 0(`str)
    }
}
This code will emit a length-prefixed string into the .data segment and replace its occurrence with
a generated label name; the second instruction, a load, simply reads the string's length into the
len variable.
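As an illustration of what "length-prefixed" could mean for the bytes placed in .data, here is a small Python sketch. The layout is an assumption, not something this page specifies: a 32-bit length word (suggested by the lw at offset 0), little-endian byte order, and the raw characters immediately after it with no terminator or padding. The function name length_prefixed is hypothetical.

```python
import struct


def length_prefixed(text):
    """Encode text as a 32-bit length word followed by the raw bytes."""
    data = text.encode("ascii")
    # "<I" = little-endian unsigned 32-bit; width and byte order are assumptions.
    return struct.pack("<I", len(data)) + data


blob = length_prefixed("Hello, World!")

# Reading the word at offset 0 is what the lw in the example above would do.
(length,) = struct.unpack_from("<I", blob, 0)
print(length)    # 13
print(blob[4:])  # b'Hello, World!'
```

Under these assumptions the characters start 4 bytes after the generated label, so code that wants the text itself would address it at offset 4.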
- all statements are limited to instructions and labels now
- all further extensions to asm will be limited to the SPIM simulator ISA since it is my test bench, at least for now
- there is no distinction between assembler ISA sets; the compiler will accept any instruction defined in the instruction table I took from GNU binutils.