Project 3: Compiling to Intermediate Code

Description

This project finishes the front-end of your simplified compiler by translating the source file into intermediate code. Your compiler must have a function named "generate_bad_code_from_string" that takes a single argument, a multiline string of Good Code. The multiline string from this function must be a legal Bad Code, as described below

We recommend that you implement this project in three stages:

  1. Make YACC generate an abstract syntax tree (AST) that contains all of the information from the source file.
  2. Generate routines to traverse the AST in the correct order (postfix traversal). Start at the root and recursively call its child sub-trees in order. For most nodes, when it is reached you should process all of its child sub-trees first (assuming it has any) and then process that node itself. For debugging purposes, I recommend that the "processing" just consist of printing the name of each node (you must turn off these debug statements before handing in your project!)
  3. Finish the project by having each node output the appropriate Bad Code when it is processed, and pass up through the tree any information that the parent node might need in order to make use of its results.

Operators

You have a range of operators that are already in your grammar, but may not have ensured that they are processed in the correct order. The precedence (from lowest to highest) of the operators thus far is:

  • Assignment Operators (right associative)
  • Logical OR (left associative)
  • Logical AND (left associative)
  • Relationship Operators (non-associative)
  • Add/Subtract (left associative)
  • Multiply/Divide (left associative)
  • Unary Minus (non-associative)

Note that logical operators normally short-circuit such that if their result is fixed based on the first term of the operator, the second term does not execute (e.g. "a = (b=3) || (b=4)" would result in a=b=3.) While proper short-circuiting is not required in Project 3, it will be part of Project 4.

Error Reporting

Your compiler must still catch errors from previous projects. And in the event of an error, your compiler should raise the associated exceptions. For reference the errors are as follows:

  • raise SyntaxError for lex failures
  • raise SyntaxError for yacc failures
  • raise NameError for variable usage before declaration
  • raise NameError for variable redeclaration

The print command (in Good Code) should output all values on a single line consecutively, and then place a newline at the end of the line. For example, this code:

val a = 2;
print(1, a);
print(13, a*5);

should produce Bad Code (see below) that that looks something like:

val_copy 2 s6
val_copy s6 s0
val_copy 1 s7
out_val s7
out_val s0
out_char '\n'
val_copy 13 s8
out_val s8
val_copy 5 s9
mult s0 s9 s10
out_val s10
out_char '\n'

The specific intermediate code will not be graded as long as it produces the correct output when executed with "bad_interpreter.py". In the case of the code above, the output should be:

12
1310

Bad Code (Intermediate Code)

Below is a subset of the available instructions in Bad Code. Each instruction requires a specific number of arguments (at most three). An argument can be a variable name, a literal integer, a literal character (in single quotes), or a label name (which must be defined elsewhere in the code). The two categories of arguments are:

  • NUM: These arguments use the value of a literal floating point number or character, position of a label, or the current state of a variable.
  • VAR: These arguments can only be the name of a variable (which will be written to) (must be an s followed by a non-negative integer).

For simplicity, all arguments are required.

Name Arguments Description
val_copy [NUM: from] [VAR: to] Copy the value from into the variable to.
add [NUM: num1] [NUM: num2] [VAR: result] Calculate " num1 + num2" and place the answer in result.
sub [NUM: num1] [NUM: num2] [VAR: result] Calculate " num1 - num2" and place the answer in result.
mult [NUM: num1] [NUM: num2] [VAR: result] Calculate " num1 * num2" and place the answer in result.
div [NUM: num1] [NUM: num2] [VAR: result] Calculate " num1 / num2" and place the answer in result.
test_less [NUM: num1] [NUM: num2] [VAR: result] If ( num1 < num2) set result to 1, else set result to 0.
test_gtr [NUM: num1] [NUM: num2] [VAR: result] If ( num1 > num2) set result to 1, else set result to 0.
test_equ [NUM: num1] [NUM: num2] [VAR: result] If ( num1 == num2) set result to 1, else set result to 0.
test_nequ [NUM: num1] [NUM: num2] [VAR: result] If ( num1 != num2) set result to 1, else set result to 0.
test_gte [NUM: num1] [NUM: num2] [VAR: result] If ( num1 >= num2) set result to 1, else set result to 0.
test_lte [NUM: num1] [NUM: num2] [VAR: result] If ( num1 <=< font color="blue">num2) set result to 1, else set result to 0.
jump [NUM: num1] Jump Instruction Pointer (IP) to position designated by num (typically a label).
jump_if_0 [NUM: num1] [NUM: num2] If num1 == 0, jump IP to position designated by num2.
jump_if_n0 [NUM: num1] [NUM: num2] If num1 != 0, jump IP to position designated by num2.
random [NUM: num] [VAR: result] Pick a random integer x, where 0 <=x < num, and put it in result.
out_val [NUM: num] Write a floating-point value of num to standard out.
out_char [NUM: num] Write num as char (with given ASCII code) to standard out.

Additional aspects of Bad Code are detailed in the Bad Code Specification, but at this point you don't need to pay attention to any of the other available instructions.

Testing Intermediate Code Output

The Bad Interpreter (made available as "bad_interpreter.py") will allow you to directly run Bad Code files so that you can test them. "bad_interpreter.py" takes bad_code from stdin and writes the executed output to stdout. We recommend that you first create an input file with a .good extension. For example, if you wanted to compile and test the source program "mytest.good" in your Project3/ directory, you would type:

$ python3 project.py <mytest.good >mytest.bad
$ python3 bad_interpreter.py <mytest.bad >mytest.out

Note: Your generated intermediate code does not have to match the sample code provided (though any large differences are probably a sign of a problem), but your final output when you run the intermediate code MUST match the expected output.

Handing In

When you hand in your program, you must include:

  • a project.py with the "generate_bad_code_from_string" function.
  • A README.txt file that includes and a summary of what does and does not work in your code. Your README may also include any notes that you want us to read before grading your program, such as any external resources you used in its development.