This page is not finished. Please disregard for now.

Project 7: Functions

Early and Full Projects

Unlike previous projects, Project 7 is broken into two seperate projects: Early and Full. The Early project tests the lexer and parser, and is due earlier (duh) than the full project. The Early project is also worth less points, and won't have a late period.


For this project you will further upgrade Good Lang by adding the capacity for user-defined functions. You will need to make modifications to all five stages of your compiler (lexical analysis, syntax analysis, semantic analysis, conversion to Bad Lang, and final output to Good Lang.)


You must implement two main additions to the language: function definitions, and function calls (next project we'll add in function pre-declarations).

A function definition in Good Lang begins with the keyword define, followed by the return type of the function, the name of the function, an open-parenthesis, zero or more argument declarations (separated by commas), and finally a statement that is the function body. As with while- and if-commands, the statement following a function definition can (and typically will) be a block.

Anywhere inside a function body can be a return command, followed by an expression. When a return is reached, the function should evaluate the expression and return its value to the location from which the function was called. Note that a function must have a return command (you do not need to implement void functions). A return command is not legal outside of a function. Here is an example function definition:

define val max(val value1, val value2) {
    if (value1 > value2) return value1;
    return value2;

Function definitions can occur only in the global scope, and an error should be thrown if one is defined in a deeper scope. Likewise, you should throw a "function redefinition" error if a function name is defined twice.

Function calls look just like calls to commands. In addition to incorporating your functions into your symbol table, you also need to make sure that all aspects of the function calls are semantically correct (i.e., that the return value of the function is legal in the current context, and that the arguments being passed in match the argument types in the function definition).

For example, if you were writing a program where you were trying to calculate the high score between two players, you might have something like:

val high_score = max(score1, score2);
print("And the high score is: ", high_score);

Recursive Functions

Your functions are required to be able to run recursively, so plan ahead carefully with your memory management. For example:

define val fib(val position) {
    if (position <= 1) return position;
    return fib(position - 1) + fib(position - 2);

print("The 7th Fibonacci number is: ", fib(7));

This program should print "The 7th Fibonacci number is 13".

Implementing Functions

We recommend that you take the following steps to implement functions in your compiler:

  1. Update the lexer. Your lexer should be able to recognize the "define" keyword and the "return" keyword.
  2. Prepare your symbol table to make sure that it can properly store functions (note, you will probably implement this as effectively a second symbol table for functions). The information that you want to keep track of includes the name of the function, its return type, the argument types and names, and a pointer to an AST representing the function body. In the steps below you should be prepared to add more information here, like a label name that appears before the function body in your intermediate code.
  3. Update the parser.

    Your parser should be able to recognize function definitions, as described above, as well as return statements. Note that the contents of a function should be in a new scope, including any function arguments. It is fine to have any number of additional scopes inside the function body. I recommend that you do not have a function definition as part of the main AST, just include a pointer in the symbol table to the AST associated with this function, and then make sure to generate intermediate code for them as well.

    Additionally you should allow function calls within expressions. These consist of an identifier, a '(', zero or more arguments (each an expression) separated by commas, and a ')'. The AST node that represents a function call should contain a pointer to that function's entry in the symbol table, and it should have one sub-tree for each argument.

  4. Submit the project 7 (Early) At this point you should be able to pass the Project 7 early test cases.
  5. Perform any semantic checking and print appropriate errors. Specifically:
    • function calls should be treated like their return type within expressions.
    • arguments of a function call should match the corresponding types in the definition.
    • 'return' statements should return the appropriate type for the function.
    • function definitions should only occur in the global scope.
    • 'return' statements should never be used outside of a function definition.
  6. Setup conversion to Bad Lang. Start each function with a start label and then write out its body, ending with code that will jump back to where the call was made with the appropriate return value. Once you have this piece working, you can test basic functionality. Next, you need to make sure that any time a function is called, you backup all non-global variables onto a stack, restoring them after you return from this function. This added ability will allow you to run recursive functions in Good Lang.

    Four new commands in the intermediate code will facilitate this implementation: "push", "pop", "ar_push", and "ar_pop". Push will save any scalar variable on an internal control stack. For example "push s5" will store the value from s5 onto the stack. Likewise pop will remove a scalar value from the stack. So, "pop s27" will grab the top value on the stack and put it in the s27 variable. The ar_push and ar_pop instructions work identically, but with array variables.

    The real trick here is to know when to backup variables on to the stack, and which variables to pay attention to. When to backup variables: you only need to back up variables if you are in a function definition and you are calling a function. If a new iteration of THIS function gets called again before the current one finishes, the values of all the currently live variables might change. Which variables to backup: all active variables in your symbol table above scope 0. You know that the define must happen in scope 0, so all active variables above scope zero must have been created inside this function. Back up these variables (including temporaries!) immediately before calling a new function and restore them immediately after.

  7. Setup conversion of the four push and pop instructions to Ugly Lang. You should designate a portion of your memory as a stack (with at least 10000 memory positions). If, for example, you decided to use 10000 through 19999 for your stack (and started your free memory pointer at 20000) you could put the first available position in the stack into regH (initially 10000) and update it to track the top of the stack as pushes and pops are performed.

New Errors for Project 7

  • ERROR(line 6): Attempting to re-define function 'max'.
  • ERROR(line 1): unknown function 'max'
  • ERROR(line 7): Incorrect argument type provided for function 'max'
  • ERROR(line 1): 'return' run outside of any function.
  • ERROR(line 3): Attempting to define function 'max' outside of global scope!
  • ERROR(line 5): incorrect return type for function 'max'. Expected: 'val', but found 'char')
  • ERROR(line 7): Incorrect number of arguments provided for function 'Fail6' (expected 2, received 1).

As with previous projects, we are only testing if an exceptions was raised, rather than the exact exception class. But, more informative error messages are more useful in debugging.

Testing Output

See Project 6.


When you submit your program, you must include:

  • A with the "generate_bad_code_from_string" function and the "generate_ugly_code_from_string" function.
  • The file unmodified.
  • A README.txt file that includes the your name, a brief summary of what does and does not work in your code. Your README may also include any notes that you want us to read before grading your program, such as any external resources you used in its development.
  • You may divide your project code into multiple files to do so you will need to use relative imports (if that doesn't mean anything to you, just keep everything in the file).

All graded material must be in the folder named "Project7/".