Project 4: Language Enhancements

Description

Now that you have a fully working front-end for a limited version of Good Lang, you will extend many of its features. These additions include the type "char", the flow-control commands "if", "else", "while", and "break", code blocks using braces ('{' and '}'), the Boolean NOT operator ('!'), and proper short-circuiting for the other Boolean logic operators (&& and ||). In addition to modifying your lexer, parser, and intermediate code output, these new features will require you to add semantic checking for types and scope.

As with Project 3, your program must provide the function "generate_bad_code_from_string", which accepts a multiline string as its argument and returns a multiline string of Bad Lang code. This function must be found in project.py. You are welcome to split your code across multiple other Python files, but you are responsible for ensuring that they work together so that "project.generate_bad_code_from_string" functions as needed.

Error Reporting

Your compiler must still catch errors from previous projects as well as some new errors. For reference, the previous errors are:

  • ERROR(line #): unknown token '@' SyntaxError
  • ERROR(line #): yacc parse error SyntaxError
  • ERROR(line #): unknown variable 'var_name' NameError
  • ERROR(line #): redeclaration of variable 'var_name' NameError

And the new errors associated with this project are:

  • ERROR(line #): 'break' command used outside of any loops SyntaxError
  • ERROR(line #): types do not match for assignment (lhs='type1', rhs='type2') TypeError
  • ERROR(line #): types do not match for relationship operator (lhs='type1', rhs='type2') TypeError
  • ERROR(line #): cannot use type 'type' in mathematical expressions TypeError

    (note that the above error should be triggered for all of +, -, *, /, &&, ||, !, +=, -=, *=, and /=)

  • ERROR(line #): condition for if statements must evaluate to type val TypeError
  • ERROR(line #): condition for while statements must evaluate to type val TypeError
  • ERROR(line #): cannot use type 'type' as an argument to random TypeError

Of course, in all error cases you should raise the appropriate exception.
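
One way to keep the message format consistent is to route every error through a single helper. The sketch below is only an illustration: the name compile_error is hypothetical, it assumes that "line #" expands to something like "line 12", and it assumes the message travels inside the exception rather than being printed separately.

```python
def compile_error(line, message, exc_type):
    """Raise exc_type with a message in the 'ERROR(line #): ...' format.

    Hypothetical helper, not part of the required interface.
    Assumption: 'line #' is rendered as e.g. 'line 12'.
    """
    raise exc_type(f"ERROR(line {line}): {message}")


# Example use: reporting an unknown variable found on line 3.
try:
    compile_error(3, "unknown variable 'x'", NameError)
except NameError as err:
    reported = str(err)
```

With a helper like this, each check in your compiler only has to supply the line number, the message text, and the exception class listed above.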

We recommend that you implement this project one piece at a time. Most of the new features are independent of each other, so you can add them in any order that you find easiest. They are broken down into four groups below.

Intermediate Code Generation (Bad Lang)

In addition to the Bad Lang instructions that you used for Project 3, you will also need to use labels and the jump instructions (jump, jump_if_0, and jump_if_n0) to successfully complete Project 4. Read through the Bad Code Specification to get more information about the proper use of jump instructions.

Group 1: Boolean "Not" and Short Circuiting

You must implement the Boolean Logic operator NOT ('!'). The NOT operator has the same precedence as unary minus. And, like unary minus, it affects only the expression to its right. NOT will return a 1 if that expression evaluates to a zero, otherwise it will return a 0.

You must also make sure that logical AND ("&&") and OR ("||") short-circuit. This means that when two expressions are combined by a Boolean operator, if the first expression guarantees the outcome, the second expression should not be executed.

For example:

val a = 1;
val b = 1;
val c = 1;
7 || a = 2;   # 7 is true, so "or" must be true; no need to run a=2.
0 && b = 2;   # 0 is false, so "and" must be false; no need to run b=2.
7 && c = 2;   # 7 is true, so we need to test the second part and we set c=2.
print(a, b, c);

This should output 112 when the resulting Bad Lang code is executed with bad_interpreter.py.
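
The intended behavior can be modeled in plain Python before you worry about jump instructions. This is a sketch of the semantics only, not of the generated Bad Lang: the function names are hypothetical, the right-hand side is passed as a callable so that it runs only when needed, and it assumes that && and || yield 0 or 1 (the text above states this only for NOT).

```python
def logical_not(value):
    """NOT yields 1 when the operand is zero, otherwise 0."""
    return 1 if value == 0 else 0


def logical_and(lhs, rhs_thunk):
    """Run rhs_thunk only when lhs is non-zero."""
    if lhs == 0:
        return 0                      # outcome decided; right side skipped
    return 1 if rhs_thunk() != 0 else 0


def logical_or(lhs, rhs_thunk):
    """Run rhs_thunk only when lhs is zero."""
    if lhs != 0:
        return 1                      # outcome decided; right side skipped
    return 1 if rhs_thunk() != 0 else 0


env = {"a": 1, "b": 1, "c": 1}


def assign(name, value):
    """Model an assignment expression: store the value and return it."""
    env[name] = value
    return value


logical_or(7, lambda: assign("a", 2))    # a = 2 is never run
logical_and(0, lambda: assign("b", 2))   # b = 2 is never run
logical_and(7, lambda: assign("c", 2))   # c = 2 is run
print(env["a"], env["b"], env["c"], sep="")   # prints 112
```

In your generated code, the "skip" would presumably become a conditional jump (jump_if_0 or jump_if_n0) over the instructions for the right-hand side.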

Group 2: The "char" Type

Up until this point Good Lang has had only one variable type called "val", but for this project you must implement a second type called "char" (which handles a single character at a time). This means that you now need to pay attention to Type Checking.

Your symbol table should keep track of the type of each variable. The 'char' type is more limited than the 'val' type, and cannot be used in most expressions. The 'char' type can only be used with assignment operators and relationship operators, and then only in the case that both sides of the operator are of the 'char' type. In all other cases a type-checking error should be thrown.

You must also implement literal characters, represented in the source code by a character inside of single quotes, such as 'P'. Literal characters should behave similarly to literal integers, but are of type char. Note that in addition to basic printable characters, you must support the newline character literal '\n', the tab literal '\t', the single quote literal '\'', and the backslash literal '\\'. No other escape characters need to be supported. Also note that a literal pound sign '#' should be allowed in the code and NOT treated like a comment (this should work automatically if you are lexing everything correctly).
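
As a rough guide, a character literal can be recognized with one regular expression and decoded with a small lookup table. The sketch below uses only Python's standard re module; the names CHAR_LITERAL, ESCAPES, and char_literal_value are hypothetical, and you would need to adapt the pattern to whatever lexer tool you are using.

```python
import re

# A quote, then either one character that is not a quote, backslash, or
# newline, or one of the four supported escapes (\n \t \' \\), then a quote.
CHAR_LITERAL = re.compile(r"'(?:[^'\\\n]|\\[nt'\\])'")

# Maps the two-character escape text to the character it stands for.
ESCAPES = {"\\n": "\n", "\\t": "\t", "\\'": "'", "\\\\": "\\"}


def char_literal_value(text):
    """Return the single character that a literal such as 'P' denotes."""
    if CHAR_LITERAL.fullmatch(text) is None:
        raise ValueError(f"not a char literal: {text!r}")
    body = text[1:-1]                 # strip the surrounding quotes
    return ESCAPES.get(body, body)


plain = char_literal_value("'P'")
pound = char_literal_value("'#'")     # '#' is an ordinary character here
newline = char_literal_value(r"'\n'")
```

Because the whole literal, quotes included, is matched as one token, a '#' inside the quotes never reaches a comment rule.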

Characters can be used in assignment and comparisons, but not in mathematical operations.
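
The type rules above can be collected into one checking function. The following is a minimal sketch, not a required design: check_types and the two operator sets are hypothetical names, the set of relationship operators is an assumption (use whatever your Project 3 grammar defines), and "line #" is assumed to render as e.g. "line 12". Comparisons are taken to produce type val, since their results are used as conditions.

```python
# Operators listed in the error note for mathematical expressions.
MATH_OPS = {"+", "-", "*", "/", "&&", "||", "!", "+=", "-=", "*=", "/="}
# Assumed set of relationship operators; adjust to match your grammar.
RELATION_OPS = {"==", "!=", "<", "<=", ">", ">="}


def check_types(op, lhs_type, rhs_type, line):
    """Return the result type of 'lhs op rhs', or raise TypeError.

    For a unary operator, pass None for the missing side.
    """
    if op in MATH_OPS:
        for t in (lhs_type, rhs_type):
            if t is not None and t != "val":
                raise TypeError(
                    f"ERROR(line {line}): cannot use type '{t}' "
                    "in mathematical expressions")
        return "val"
    if op == "=":
        if lhs_type != rhs_type:
            raise TypeError(
                f"ERROR(line {line}): types do not match for assignment "
                f"(lhs='{lhs_type}', rhs='{rhs_type}')")
        return lhs_type
    if op in RELATION_OPS:
        if lhs_type != rhs_type:
            raise TypeError(
                f"ERROR(line {line}): types do not match for relationship "
                f"operator (lhs='{lhs_type}', rhs='{rhs_type}')")
        return "val"
    raise ValueError(f"unknown operator {op!r}")


compare_result = check_types("==", "char", "char", 1)   # "val"
```

Calling a function like this from each expression rule keeps all of the type-related error messages in one place.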

Group 3: Code Blocks and Scope

You must allow braces ('{' and '}') to create a block of code. These code blocks should be treated as a single statement outside of the block. Thus, for example, if a code block follows an "if" the whole block should be executed conditionally (see next section). Additionally, code blocks should define scope. Any variable defined inside of a code block will not be available after that code block ends (it will "fall out of scope"). You are not allowed to re-use a variable name within the same scope, but you may re-use it in a different scope.

For example, the following is legal code:

char abc = 'q';
{   val abc = 17;
    abc = abc + 4;
}

However, this is not legal:

char abc = 'q';
val abc = 1;     # Defining abc again in the same scope!

Note that even though code blocks should be considered a single statement by the code around them, they do NOT need to terminate in a semicolon after the close-brace.

When two variables at different scope levels have the same name, the most recent one should take precedence. For example the code:

val my_var = 10; {
    val my_var = 20;
    print(my_var);
}
print(my_var);

Should generate intermediate code that outputs:

20
10
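
A common way to get this behavior is a symbol table built as a stack of dictionaries, one per open scope. The class below is a sketch under that assumption; the class and method names are hypothetical, and it stores only each variable's type, whereas your real table will presumably also record where the variable lives in the Bad Lang output so that an inner my_var does not overwrite the outer one.

```python
class SymbolTable:
    """Stack of scopes; the last dictionary is the innermost scope."""

    def __init__(self):
        self.scopes = [{}]                  # start with the global scope

    def push_scope(self):
        """Call when a '{' opens a block."""
        self.scopes.append({})

    def pop_scope(self):
        """Call when the matching '}' closes the block."""
        self.scopes.pop()

    def declare(self, name, var_type, line):
        """Add a variable to the innermost scope only."""
        current = self.scopes[-1]
        if name in current:
            raise NameError(
                f"ERROR(line {line}): redeclaration of variable '{name}'")
        current[name] = var_type

    def lookup(self, name, line):
        """Search from the innermost scope outward."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"ERROR(line {line}): unknown variable '{name}'")


table = SymbolTable()
table.declare("abc", "char", 1)
table.push_scope()
table.declare("abc", "val", 2)        # legal: different scope
inner = table.lookup("abc", 3)        # "val"
table.pop_scope()
outer = table.lookup("abc", 5)        # "char"
```

Searching from the innermost scope outward is what makes the most recent declaration take precedence.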

Group 4: Flow Control Commands

You must implement two new flow commands: "if" (followed by an optional "else") and "while" (associated with an optional "break").

The if command must be followed by a set of parentheses containing an expression. If the enclosed expression evaluates to be true (non-zero), the statement that follows it should be executed. If the expression is false (zero), the following statement should be skipped. Remember that a block always counts as a single statement, so a programmer has the option of enclosing a series of statements within braces to make them all conditionally executed based on the if.

An if-statement may optionally be followed by an else command and a second statement. The second statement is executed if and only if the condition expression evaluated to zero and the first statement was skipped. The else must bind to the most recent unpaired if statement. See http://epaperpress.com/lexandyacc/if.html for an example.

The while command must also be followed by an expression in parentheses and a statement. The statement (possibly a block) is continuously executed as long as the condition expression remains true. The expression is tested once before each execution of the statement.

Finally, the break command is legal only inside of a while loop. When executed, the break command unconditionally exits the innermost enclosing loop.

For example:

val x = 3;
val z = 5;
while (x) {
    val y = 0;
    while (z) {
        if (y == z) break;
        y = y + 1;
    }
    print(y);
    x -= 1;
}

The "break" command in the above code will cause execution to jump to the "print(y)" statement when y equals the value of z.

The output from the execution of the above code is:

5
5
5
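
To support break, it helps to track which loops are currently open while you generate code. The sketch below keeps a stack of loop-exit labels; LoopTracker and the "while_end_N" label names are hypothetical, and "line #" is assumed to render as e.g. "line 12". It returns only the label that a break should jump to and leaves the actual instruction text to the Bad Code Specification.

```python
class LoopTracker:
    """Tracks the exit label of every while loop currently being compiled."""

    def __init__(self):
        self.end_labels = []      # innermost loop is last
        self.counter = 0

    def enter_loop(self):
        """Call when code generation enters a while loop."""
        self.counter += 1
        label = f"while_end_{self.counter}"   # hypothetical label format
        self.end_labels.append(label)
        return label

    def exit_loop(self):
        """Call once the loop body has been generated."""
        self.end_labels.pop()

    def break_target(self, line):
        """Return the label a break should jump to, or raise SyntaxError."""
        if not self.end_labels:
            raise SyntaxError(
                f"ERROR(line {line}): 'break' command used outside of any loops")
        return self.end_labels[-1]


loops = LoopTracker()
outer_end = loops.enter_loop()        # while (x)
inner_end = loops.enter_loop()        # while (z)
target = loops.break_target(6)        # break exits the inner loop
loops.exit_loop()
loops.exit_loop()
```

Because the innermost loop is always on top of the stack, a break inside nested loops picks the exit label of the loop that directly encloses it.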

Testing Intermediate Code Output

As with the previous project, bad_interpreter.py will allow you to run Bad Lang code directly so that you can test it.

Example workflow:

  1. Make a test.good file with Good Lang contents.
  2. Direct the contents of that file to your project (presuming it can read stdin and write to stdout) with cat test.good | python3 project.py >test.bad
  3. Examine test.bad to ensure the bad output looks alright.
  4. Run the bad code with bad_interpreter.py like so: cat test.bad | python3 bad_interpreter.py
  5. Alternately, you can do it all in one step: cat test.good | python3 project.py | python3 bad_interpreter.py
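
The workflow above presumes that project.py can read from stdin and write to stdout. One possible way to arrange that is sketched here; main and its parameters are hypothetical, and the body of generate_bad_code_from_string is only a placeholder for your compiler.

```python
import sys


def generate_bad_code_from_string(source):
    # Placeholder only: your real function compiles Good Lang to Bad Lang.
    raise NotImplementedError


def main(in_stream=None, out_stream=None,
         compile_fn=generate_bad_code_from_string):
    """Read all Good Lang source from in_stream and write Bad Lang output.

    The streams default to stdin and stdout; compile_fn is a parameter
    only so that the wiring can be tried without a finished compiler.
    """
    if in_stream is None:
        in_stream = sys.stdin
    if out_stream is None:
        out_stream = sys.stdout
    out_stream.write(compile_fn(in_stream.read()))


# In project.py this would typically be invoked as:
#     if __name__ == "__main__":
#         main()
```

Keeping the stream handling separate from generate_bad_code_from_string means the required function still takes a string and returns a string.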

Submission

When you submit your program, you must include:

  • A project.py with the "generate_bad_code_from_string" function.
  • A README.txt file that includes your name and a brief summary of what does and does not work in your code. Your README may also include any notes that you want us to read before grading your program, such as any external resources you used in its development.
  • You may divide your project code into multiple files; to do so, you will need to use relative imports (if that doesn't mean anything to you, just keep everything in the project.py file).