C3 language
Introduction
As an example of designing and implementing a custom language within the PPCI framework, the C3 language was created. As pointed out in c2lang, the C language is widely used, but has some strange contraptions. These include the following:
- The include system. This results in lots of code duplication and file creation. Why would you need filenames in source code?
- The comma operator: x = (a(), 2); assigns 2 to x, after calling function a.
- C is difficult to parse with a simple parser. The parser has to know whether a symbol names a type or a variable at the moment it is parsed. The usual workaround is referred to as the lexer hack.
In part for these reasons (and of course, for fun), C3 was created.
The hello world example in C3 is:
module hello;
import io;
function void main()
{
    io.println("Hello world");
}
Language reference
Modules
Modules in C3 live in files, and a single module can be spread over multiple files. Modules can import each other by using the import statement.
For example:
pkg1.c3:
module pkg1;
import pkg2;
pkg2.c3:
module pkg2;
import pkg1;
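Because pkg1 and pkg2 import each other, an import cannot be resolved while its file is still being parsed. One way to cope with this, sketched here in Python, is to register every module first and link the imports in a second pass. The `Module` and `ModuleRegistry` classes are hypothetical; they only loosely echo the `get_module` and `link_imports` methods listed in the module reference and are not the PPCI implementation.

```python
class Module:
    """Hypothetical stand-in for a parsed C3 module."""

    def __init__(self, name):
        self.name = name
        self.import_names = []  # names collected from `import` statements
        self.imports = {}       # resolved name -> Module


class ModuleRegistry:
    """Hypothetical container that owns all modules."""

    def __init__(self):
        self.modules = {}

    def get_module(self, name):
        # Get or create, so one module may be spread over several files.
        if name not in self.modules:
            self.modules[name] = Module(name)
        return self.modules[name]

    def link_imports(self):
        # Second pass: every module is known, so import cycles are harmless.
        for module in self.modules.values():
            for name in module.import_names:
                if name not in self.modules:
                    raise KeyError(f"Cannot import {name}")
                module.imports[name] = self.modules[name]


registry = ModuleRegistry()
registry.get_module("pkg1").import_names.append("pkg2")
registry.get_module("pkg2").import_names.append("pkg1")
registry.link_imports()
```

After linking, each module holds a direct reference to the other, even though neither existed when the other one was first mentioned.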
Functions
Functions are defined by using the function keyword, followed by the return type and the function name.
module example;
function void compute()
{
}
function void main()
{
    compute();
}
Variables
Variables require the var keyword, and can be either global or function-local.
module example;
var int global_var;
function void compute()
{
    var int x = global_var + 13;
    global_var = 200 - x;
}
Types
Types can be specified when a variable is declared, and can also be typedef’ed using the type keyword.
module example;
var int number;
var int* ptr_num;
type int* ptr_num_t;
var ptr_num_t number2;
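The typedef in the example above can be read as an alias table that is expanded until a built-in type is reached. The Python sketch below illustrates only that idea. The `resolve` function, the `BASE_TYPES` set, and the encoding of pointer types as strings with a trailing `*` are all hypothetical and do not reflect how PPCI represents types internally.

```python
# Assumed set of built-in type names, for illustration only.
BASE_TYPES = {"byte", "int", "float"}


def resolve(type_name, aliases):
    """Expand type aliases until only a base type and '*' suffixes remain."""
    seen = set()
    stars = ""
    while True:
        # Peel off pointer suffixes and remember them.
        while type_name.endswith("*"):
            stars += "*"
            type_name = type_name[:-1]
        if type_name in BASE_TYPES:
            return type_name + stars
        if type_name in seen:
            raise ValueError(f"Recursive type definition: {type_name}")
        if type_name not in aliases:
            raise KeyError(f"Unknown type: {type_name}")
        seen.add(type_name)
        type_name = aliases[type_name]


# Mirrors `type int* ptr_num_t;` from the example above.
aliases = {"ptr_num_t": "int*"}
number2_type = resolve("ptr_num_t", aliases)
```

Under these assumptions, `number2` and `ptr_num` from the C3 example end up with the same underlying type.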
If statement
The following code example demonstrates the if statement. The else part is optional.
module example;
function void compute(int a)
{
    var int b = 10;
    if (a > 100)
    {
        b += a;
    }
    if (b > 50)
    {
        b += 1000;
    }
    else
    {
        b = 2;
    }
}
While statement
The while statement can be used as follows:
module example;
function void compute(int a)
{
    var int b = 10;
    while (b > a)
    {
        b -= 1;
    }
}
For statement
The for statement works like in C. The first part is the initializer, which is executed once before the loop starts. The second part is the loop condition, which is checked before every iteration. The third part is executed at the end of each iteration.
module example;
function void compute(int a)
{
    var int b = 0;
    for (b = 100; b > a; b -= 1)
    {
        // Do something here!
    }
}
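The order in which the three parts of a for statement run can be spelled out as a while loop. The Python sketch below does this with four callables. `run_for` is a hypothetical helper written for this illustration, and the sketch ignores early exits from the loop body.

```python
def run_for(init, condition, step, body):
    """Run a C-style for loop built from four callables."""
    init()              # first part: runs once, before the loop
    while condition():  # second part: checked before every iteration
        body()
        step()          # third part: runs after each iteration


# Mirrors `for (b = 100; b > a; b -= 1)` from the example, with a = 97.
a = 97
state = {"b": 0, "iterations": 0}


def body():
    state["iterations"] += 1


run_for(
    init=lambda: state.update(b=100),
    condition=lambda: state["b"] > a,
    step=lambda: state.update(b=state["b"] - 1),
    body=body,
)
```

With `a = 97` the body runs three times, and the loop stops as soon as `b` is no longer greater than `a`.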
Other
C3 does not contain a preprocessor. For this kind of thing it might be better to use a templating engine such as Jinja2.
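As a rough illustration of generating C3 source from a template, the sketch below fills in two constants before the source would be handed to the compiler. To stay dependency-free it uses `string.Template` from the Python standard library instead of Jinja2. The template text and the placeholder names `offset` and `limit` are made up for this example.

```python
from string import Template

# Hypothetical C3 template; the $placeholders play the role of C #defines.
C3_TEMPLATE = Template("""\
module example;
var int global_var;
function void compute()
{
    var int x = global_var + $offset;
    global_var = $limit - x;
}
""")

# Substitute the values to obtain plain C3 source text.
source = C3_TEMPLATE.substitute(offset=13, limit=200)
print(source)
```

A full templating engine adds loops and conditionals on top of this, which covers most uses of conditional compilation.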
Module reference
This is the c3 language front end.
The front end uses a recursive descent parser.
class ppci.lang.c3.AstPrinter
    Prints an AST as text
class ppci.lang.c3.C3Builder(diag, arch_info)
    Generates IR-code from c3 source.
    Reports errors to the diagnostics system.
    build(sources, imps=())
        Create IR-code from sources.
        Returns: A context in which the modules live, and an ir-module. Raises compiler error when something goes wrong.
    do_parse(src, context)
        Lexing and parsing stage (phase 1)
class ppci.lang.c3.CodeGenerator(diag)
    Generates intermediate (IR) code from a package.
    The entry function is ‘genModule’. The main task of this part is to rewrite complex control structures, such as while and for loops, into simple conditional jump statements. Complex conditional statements are simplified as well: ‘and’ and ‘or’ expressions are rewritten into conditional jumps. Structured datatypes are also rewritten.
    Type checking is done in one run with code generation.
    emit(instruction, loc=None)
        Emits the given instruction to the builder.
    error(msg, loc=None)
        Emit error to diagnostic system and mark package as invalid
    gen(context)
        Generate code for a whole context
    gen_assignment_stmt(code)
        Generate code for assignment statement
    gen_binop(expr: ppci.lang.c3.astnodes.Binop)
        Generate code for binary operation
    gen_bool_expr(expr)
        Generate code for cases where a boolean value is assigned
    gen_cond_code(expr, bbtrue, bbfalse)
        Generate conditional logic. Implement sequential logical operators.
    gen_dereference(expr: ppci.lang.c3.astnodes.Deref)
        Dereference pointer type, which means *(expr)
    gen_expr_at(ptr, expr)
        Generate code at a pointer in memory
    gen_expr_code(expr: ppci.lang.c3.astnodes.Expression, rvalue=False) → ppci.ir.Value
        Generate code for an expression. Return the generated ir-value
    gen_external_function(function)
        Generate external function
    gen_for_stmt(code)
        Generate for-loop code
    gen_function(function)
        Generate code for a function. This involves creating room for parameters on the stack, and generating code for the function body.
    gen_function_call(expr)
        Generate code for a function call
    gen_global_ival(ival, typ)
        Create memory image for initial value
    gen_globals(module)
        Generate global variables and modules
    gen_identifier(expr)
        Generate code for when an identifier was referenced
    gen_if_stmt(code)
        Generate code for if statement
    gen_index_expr(expr)
        Array indexing
    gen_literal_expr(expr)
        Generate code for literal
    gen_local_var_init(var)
        Initialize a local variable
    gen_member_expr(expr)
        Generate code for a member expression such as struc.mem = 2. This could also be a module deref!
    gen_module(mod: ppci.lang.c3.astnodes.Module)
        Generate code for a single module
    gen_return_stmt(code)
        Generate code for return statement
    gen_stmt(code: ppci.lang.c3.astnodes.Statement)
        Generate code for a statement
    gen_switch_stmt(switch)
        Generate code for a switch statement
    gen_type_cast(expr)
        Generate code for type casting
    gen_unop(expr)
        Generate code for unary operator
    gen_while(code)
        Generate code for while statement
    get_debug_type(typ)
        Get or create debug type info in the debug information
    get_ir_function(function)
        Get the proper IR function for the given function.
        A new function will be created if required.
    get_ir_type(cty)
        Given a certain type, get the corresponding ir-type
    is_module_ref(expr)
        Determine whether a module is referenced
    new_block()
        Create a new basic block in the current function
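The CodeGenerator description says that loops and ‘and’/‘or’ conditions are rewritten into conditional jumps. The Python sketch below shows one common way to do that for a while loop with a short-circuit condition. The `Emitter` class, the tuple encoding of expressions, and the textual pseudo-IR (`jmp`, `cjmp`) are hypothetical. The method names are borrowed from the listing above for readability, but the bodies are not taken from PPCI.

```python
import itertools


class Emitter:
    """Hypothetical mini code generator that emits labelled pseudo-IR."""

    def __init__(self):
        self.lines = []
        self._ids = itertools.count()

    def new_block(self, hint):
        # Return a fresh, unique label name.
        return f"{hint}{next(self._ids)}"

    def label(self, name):
        self.lines.append(f"{name}:")

    def emit(self, text):
        self.lines.append(f"  {text}")

    def gen_cond_code(self, expr, bbtrue, bbfalse):
        """Jump to bbtrue or bbfalse, short-circuiting 'and' and 'or'."""
        op = expr[0]
        if op == "and":
            # The right operand is only evaluated when the left one is true.
            rhs = self.new_block("and_rhs")
            self.gen_cond_code(expr[1], rhs, bbfalse)
            self.label(rhs)
            self.gen_cond_code(expr[2], bbtrue, bbfalse)
        elif op == "or":
            # The right operand is only evaluated when the left one is false.
            rhs = self.new_block("or_rhs")
            self.gen_cond_code(expr[1], bbtrue, rhs)
            self.label(rhs)
            self.gen_cond_code(expr[2], bbtrue, bbfalse)
        else:
            # Leaf comparison such as (">", "b", "a").
            self.emit(f"cjmp {expr[1]} {op} {expr[2]} ? {bbtrue} : {bbfalse}")

    def gen_while(self, condition, body):
        test = self.new_block("while_test")
        loop = self.new_block("while_body")
        done = self.new_block("while_done")
        self.emit(f"jmp {test}")
        self.label(test)
        self.gen_cond_code(condition, loop, done)
        self.label(loop)
        for text in body:
            self.emit(text)
        self.emit(f"jmp {test}")  # back edge: re-evaluate the condition
        self.label(done)


# Lowers: while (b > a and b != 0) { b = b - 1; }
gen = Emitter()
gen.gen_while(("and", (">", "b", "a"), ("!=", "b", "0")), ["b = b - 1"])
print("\n".join(gen.lines))
```

Passing the two target blocks into `gen_cond_code` means no boolean value has to be materialised: each comparison jumps straight to the block that should run next.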
class ppci.lang.c3.Context(arch_info)
    A context is the space in which all modules live.
    It is the container of modules and the top level scope.
    equal_types(a, b, byname=False)
        Compare types a and b for structural equivalence.
        If byname is True, stop on defined types.
    eval_const(expr)
        Evaluates a constant expression.
    get_common_type(a, b, loc)
        Determine the greatest common type.
        This is used for coercing binary operators.
        For example:
        - int + float -> float
        - byte + int -> int
        - byte + byte -> byte
        - pointer to x + int -> pointer to x
    get_constant_value(const)
        Get the constant value, calculate if required
    get_module(name, create=True)
        Gets or creates the module with the given name
    get_type(typ, reveil_defined=True)
        Get type given by str, identifier or type.
        When reveil_defined is True, defined types are resolved to their backing types.
    has_module(name)
        Check if a module with the given name exists
    is_simple_type(typ)
        Determines if the given type is a simple type
    link_imports()
        Resolve all modules referenced by other modules
    modules
        Get all the modules in this context
    pack_string(txt)
        Pack a string as an int holding the length, followed by the text data
    resolve_symbol(ref)
        Find out what is designated by the given reference
    size_of(typ)
        Determine the byte size of a type
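Two of the Context helpers above lend themselves to a small Python sketch: the coercion rules listed for get_common_type and the length-prefixed layout described for pack_string. Both functions below are hypothetical re-creations, not the PPCI code. The type ranking, the string encoding of pointer types, the 4-byte little-endian length, and the ASCII encoding are assumptions made for this illustration.

```python
import struct

# Assumed ranking of simple types, lowest first.
RANK = {"byte": 0, "int": 1, "float": 2}


def get_common_type(a, b):
    """Return the type that both operands of a binary operator coerce to."""
    a_ptr, b_ptr = a.endswith("*"), b.endswith("*")
    if a_ptr and b == "int":
        return a  # pointer to x + int -> pointer to x
    if a_ptr or b_ptr:
        # Assumption: any other pointer combination is rejected.
        raise TypeError(f"Cannot combine {a} and {b}")
    return a if RANK[a] >= RANK[b] else b


def pack_string(txt):
    """Encode txt as a length field followed by the raw text bytes."""
    data = txt.encode("ascii")
    return struct.pack("<I", len(data)) + data


common = get_common_type("byte", "int")
packed = pack_string("hi")
```

Storing the length up front lets the consumer find the end of the string without scanning for a terminator.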
class ppci.lang.c3.Lexer(diag)
    Generates a sequence of tokens from an input stream
    tokenize(text)
        Keeps track of the long comments
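Tracking long comments matters because such a comment can span several lines, so the lexer has to carry state from one line to the next. The Python sketch below shows that idea under the assumption that long comments are written `/* ... */` as in C. The `tokenize` function here is a hypothetical stand-in, handles no string literals, and is not the PPCI lexer.

```python
import re

# Identifiers, integers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(text):
    """Yield (line_number, token) pairs, skipping comments."""
    in_long_comment = False
    for row, line in enumerate(text.splitlines(), start=1):
        pos = 0
        while pos < len(line):
            if in_long_comment:
                end = line.find("*/", pos)
                if end < 0:
                    break  # the comment continues on the next line
                pos = end + 2
                in_long_comment = False
            elif line.startswith("/*", pos):
                in_long_comment = True
                pos += 2
            elif line.startswith("//", pos):
                break  # the rest of the line is a comment
            else:
                match = TOKEN.match(line, pos)
                if match is None:  # whitespace: skip one character
                    pos += 1
                    continue
                yield row, match.group()
                pos = match.end()


src = "var int a; /* long\ncomment */ var int b; // tail\n"
tokens = list(tokenize(src))
```

The `in_long_comment` flag is the only state that survives a line break, which is what lets the comment end on a later line.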
class ppci.lang.c3.Parser(diag)
    Parses sourcecode into an abstract syntax tree (AST)
    add_symbol(sym)
        Add a symbol to the current scope
    parse_cast_expression() → ppci.lang.c3.astnodes.Expression
        Parse a cast expression.
        The C-style type cast conflicts with ‘(‘ expr ‘)’, so an extra keyword ‘cast’ is introduced.
    parse_compound()
        Parse a compound statement, which is bounded by ‘{‘ and ‘}’
    parse_const_def()
        Parse a constant definition
    parse_const_expression()
        Parse array initializers and other constant values
    parse_designator()
        A designator designates an object with a name.
    parse_expression(rbp=0) → ppci.lang.c3.astnodes.Expression
        Process expressions with precedence climbing.
        See also:
        http://eli.thegreenplace.net/2012/08/02/parsing-expressions-by-precedence-climbing
    parse_for() → ppci.lang.c3.astnodes.For
        Parse a for statement
    parse_function_def(public=True)
        Parse function definition
    parse_id_sequence()
        Parse a sequence of id’s
    parse_if()
        Parse if statement
    parse_import()
        Parse import construct
    parse_module(context)
        Parse a module definition
    parse_postfix_expression() → ppci.lang.c3.astnodes.Expression
        Parse postfix expression
    parse_primary_expression() → ppci.lang.c3.astnodes.Expression
        Literal and parenthesis expression parsing
    parse_return() → ppci.lang.c3.astnodes.Return
        Parse a return statement
    parse_source(tokens, context)
        Parse a module from tokens
    parse_statement() → ppci.lang.c3.astnodes.Statement
        Determine statement type based on the pending token
    parse_switch() → ppci.lang.c3.astnodes.Switch
        Parse switch statement
    parse_top_level()
        Parse toplevel declaration
    parse_type_def(public=True)
        Parse a type definition
    parse_type_spec()
        Parse type specification. Type specs are read from right to left.
        A variable spec is given by: var [typeSpec] [modifiers] [pointer/array suffix] variable_name
        For example: var int volatile * ptr; creates a pointer to a volatile integer.
    parse_unary_expression()
        Handle unary plus, minus and pointer magic
    parse_variable_def(public=True)
        Parse variable declaration, optionally with initialization.
    parse_while() → ppci.lang.c3.astnodes.While
        Parses a while statement
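The precedence climbing technique mentioned for parse_expression can be sketched in a few lines of Python. The `ExprParser` class, the `BINDING_POWER` table, and the tuple-based tree are hypothetical and only cover four left-associative binary operators and parentheses. The `rbp` parameter mirrors the name in the signature above, but the body is not the PPCI implementation.

```python
# Assumed binding powers; a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}


class ExprParser:
    """Hypothetical expression parser over a list of string tokens."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_primary(self):
        token = self.next()
        if token == "(":
            expr = self.parse_expression()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return expr
        return token

    def parse_expression(self, rbp=0):
        lhs = self.parse_primary()
        # Keep consuming operators that bind tighter than the caller's.
        while self.peek() in BINDING_POWER and BINDING_POWER[self.peek()] > rbp:
            op = self.next()
            # Left-associative: the right operand may only contain
            # operators that bind tighter than `op` itself.
            rhs = self.parse_expression(BINDING_POWER[op])
            lhs = (op, lhs, rhs)
        return lhs


tree = ExprParser("a + b * c - d".split()).parse_expression()
```

A single recursive function handles every precedence level, so adding an operator means adding one table entry rather than one grammar rule per level.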
class ppci.lang.c3.Visitor(pre=None, post=None)
    Visitor that can visit all nodes in the AST and run pre and post functions.
    do(node)
        Visit a single node
    visit(node)
        Visit a node and all its descendants
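The effect of the pre and post functions is easiest to see on a small tree. The Python sketch below is a hypothetical re-creation of the pattern. `Node` and `SketchVisitor` are made up for this example, and the real Visitor works on `ppci.lang.c3.astnodes` objects rather than on this toy node class.

```python
class Node:
    """Hypothetical AST node with a name and child nodes."""

    def __init__(self, name, *children):
        self.name = name
        self.children = children


class SketchVisitor:
    """Calls `pre` before and `post` after visiting a node's children."""

    def __init__(self, pre=None, post=None):
        self.pre = pre
        self.post = post

    def visit(self, node):
        if self.pre:
            self.pre(node)
        for child in node.children:
            self.visit(child)
        if self.post:
            self.post(node)


tree = Node("module", Node("function", Node("if"), Node("while")))
order = []
visitor = SketchVisitor(
    pre=lambda n: order.append(f"enter {n.name}"),
    post=lambda n: order.append(f"leave {n.name}"),
)
visitor.visit(tree)
```

The pre function sees a parent before its children, which suits tasks such as opening a scope. The post function sees it after them, which suits tasks that need results from the children.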
ppci.lang.c3.c3_to_ir(sources, includes, march, reporter=None)
    Compile c3 sources to ir-code for the given architecture.