This page is not finished. Please disregard for now.

Project 2: Syntax Analysis

Description

In the previous project, you each wrote a lexical analyzer that took an input file and divided it into tokens. Now, you will write a parser that will test those tokens for syntactic correctness. Using BNF (Backus-Naur Form), you will establish a set of grammar rules that define the syntax of our language.

You must design and implement a grammar that can determine if a LOLCode program is syntactically correct. For this program, you need to write a function called "build_parser", that returns a parser object (generated using RPLY). The function should take a single arugment, a list of the names of the tokens (this is needed for the ParserGenerator to construct a parser). The object returned will be used to parse LOLcode. If the code is syntactically valid, the its parse method should sucessfully complete with no exceptions raised. If the string cannot be parsed (for example if there is a problem on line 3 of the LOLcode string), your compiler should raise an exception.

Implementation Instructions

RPLY

Your starter code contains a folder named rply, that is subtly different than the one from the previous project (there was a small change needed to make it compatible with the Mimir test environment). Please use the folder that came with Project 2, not the previous Project's folder. (Don't worry, if you try to submit with the wrong folder, you'll get a interpreter error from "parser.py" in the rply folder).

Testing your code

I've provided a parse_LOLcode function that invokes build_lexer (Project 1) and build_parser (Project 2). The Mimir test cases will call this function to determine if your parser is functional. You shouldn't modify this function. Also, there is a file named "mimir_testcase.py" which is located adjacent the Project2 folder. If you invoke this file with "python3 mimir_testcase.py", it will run your code in an identical manner to Mimir. Feel free to edit this file to create your own test cases.

Shift-Reduce and Reduce-Reduce Conflicts

Conflicts in your parser means there's ambiguity in your grammar. This is bad. You compiler may correctly parse some input, but fail unexpected on others. The starter code checks that the parser you build doesn't have any conflicts. If you introduce a conflict, look carefully at your most recent changes a fix it immediately. It is extremely difficulty to remove conflicts from a complex/completed parser.

LOLcode v. 1.450

For this course (and this project) will be implementing parts of a version of LOLcode (described here: LOLcode Spec). There are numerous variants of LOLcode, so please refer to the spec when implementing the projects. For this project, we are only implementing part of the language (more features to be added later).

Recommend Order Of Implementation

  1. File Creation
  2. Variables Declaration and Initialization (Note: we ill only be concerning ourselves with NUMBR variables in Project 2).
  3. Comments
  4. Variable usage and symbol table for usage and declaration error detection (see details later in the spec).
  5. Assignment Expressions
  6. More Operators!!! (Assignment, Math, Logical, and Comparison Operator)
  7. Input and Output
  8. Random
  9. Lastly, ensure that you only allow variable ids to be on the LHS of assignments.

While NUMBR, NUMBAR, LETTR, YARN, and TROOF are all legal types, this project will only be testing the use of NUMBR. In Project 4 (when we learn type checking) other types will be used as well.

Symbol Table

You must also implement a symbol table, a simple data structure that will track all of the identifiers that have been declared in the source program that you are compiling. Every time a variable is declared, it should be entered into the symbol table, and every time a variable is used, it should be verified with the symbol table. There are two types of errors that your program needs to identify related to symbol tables:

  1. If a variable is being used in an expression, but has never been declared. For example, if the variable my_var is used without being declared, your compiler must raise an exception.
  2. If a variable id is being declared for a second time. For example, if var1 was declared on line 2, but it's being (re)declared again on line 5, your compiler must raise an exception.

Turning In

Your submission must be in a folder named "Project2" and must include:

  • A README.txt file that includes a summary of what does and does not work in your code. Your README may also include any notes that you want us to read before grading your program, such as any external resources you used in its development.
  • A project.py Python3 file that is invoked by the test cases.
    • This file must include three functions: "build_lexer" (no parameters, Project 1), "build_parser" (one parameter, Project 2), "parse_LOLcode" (included in starter code).
    • Note: we don't check the messages nor type of the exceptions, just that they were raised appropriately.