Assignment title: Information


Principles of Programming Languages winter 2017 Assignment 6 - Programming Part Due: 11 April 2017 This is the final programming assignment, in which you will build a symbol table and an interpreter. It is due on the last day of classes. A brief overview of the interpreting process you will complete building with this assignment: (1) First you will need the symbol table: build a library of predicates for managing a symbol table. The symbol table will be built using association lists and stored in global memory space. Write the symbol table in Prolog, using predicates (no grammars) and save as a file named symbol table.pl. The file should contain the following predicates: • create empty table/0: Creates an empty association list and stores it in a global variable for use as a symbol table. • initialize functions/1: Accepts a list of functions as the argument and stores them in the symbol table. • add symbol/2: Accepts a key, value pair as arguments and adds a symbol to the table in the form: key ! [value|Rest]. • add symbol list/2 Accepts a list of keys as the first argument and a list of values as the second argument and adds every (key, value) pair to the symbol table as required by add symbol/2 • remove symbol/1: Accepts a key as an argument and removes the most recently added value for that key such that the symbol table is changed from key ! [value|Rest] to key ! Rest. get symbol/2: This predicate accepts a key as the first argument and unifies a value, as the second argument, to the first value associated with key in the symbol table. The symbol table entry should be structured: key ! [valuejRest] 12 (2) Finally, the Interpreter: The source file is a program, with a single function. int main (int input ) = input + 3 This is input into the Tokenizer, which outputs a list of tokens, stored as: [TYPE INT,IDENTIFIER,OPEN P,TYPE INT,IDENTIFIER,CLOSE P,ASSIGN,IDENTIFIER,ARITH ADD,INTEGER] This label list is used as the input to the parser (the grammar). Since the source tokens are converted into labels, it is now easy to check if the token list is grammatically correct i.e. the tokens are in the right places). The parser matches tokens in a left-to-right recursive manner. As input reduces to known terminal labels, Prolog unifies and produces an output list. If you designed your grammar?s output to have the format suggested in the assignments and tutorials, the structured list your parser outputs might look something like: [ [ [ int,id ],(,[ [ int,id], [ ] ],),=,[ [ id,[ ] ],[ +,num] ] ] ] The above list represents a list of functions, and each list within a function represents a production. Seeing which list corresponds to which production should be fairly clear. In the structured list provided by the parser, instead of variable/function names and numerical values, the list still contains the placeholders id and num. This means you have to refer back to your original code and copy the original values into the current output. Since the values stored in the original token list are known, they can be retrieved from there and place them in the appropriate places in the structured list. Some observations about this process: • The number of atoms in our structured list is equal to the number of elements in the token list: The parser does not add or remove tokens, it just verifies that the tokens are in the correct positions. • The left-to-right order of the tokens in the token list is the same as the left-toright order of the atoms in the structured list: The parser does not change any token positions. • If you created a structured list using the tokens from the language specification (so that the structured list tokens match the source file tokens) then you can copy every token from the token list into the structured list as you find them. For example, if the first atom you find in the structured list, starting from the left is int, then the first element of the token list is also an int. This means that instead of checking for the id and num tokens, you can simply copy every token as its found and the structured list will be correct.3 • As you replace terms from left-to-right you will have to recursively move into nested lists. This means you will need to pass your token list downward through every recursion and whenever you reach the end of a list you must return the remaining tokens to the level above. If you do not do this, Prolog will just continue using whatever the current token list is at higher levels, even if you have removed tokens from the current token list at the lower levels. Once you have an algorithm for infusing token list values into your structured list, you should have a new structured list with the correct names and values inserted, such as: [ [ [ int,main ],(,[ [ int, input ], [ ] ],),=,[ [ input, [ ] ],[ +,3] ] ] ] Since this is known to be a list of functions, you can use your symbol table predicates to move each function (in this case just one) into memory (See Tutorial 9 for details). Once this is done you can run the interpreter and obtain results. If the previous components are properly constructed, according to the specifications, the interpreter is quite straight forward to build. Once the functions are stored in the symbol table, the interpreter will accept a list of integers as starting parameters and use them to call the main predicate. The body of the main function will then be executed and it will perform a series of actions which reduces the body to a single integer value, which is then returned as the result. The interpreter should then verify that a proper return value is found and then return it as the final result. When constructing the interpreter, the following language rules apply: • Every source file must contain at least once function and one of those functions must be called main with a return type of int. The main function may contain any number of parameters. • The body of each function will be calculated until a single value is retrieved. This is the result for the function body. Once a result is reached, you must verify that it matches the return type of the function. A function can return either a bool or int as an integer value. If the return type is bool, the result must be 0 (false) or 1 (true). If the return type is int, the value may be any integer. Steps for calling a function: (a) You must accept a list of integers as parameters (value list) and the name of a function. (b) Retrieve the function?s data from the symbol table (return value, parameter list, function body).4 (c) Assume the value list elements will be in the same order as the parameter list entries. The parameter list entries should be in the form of [Type, Name]. You will need to: (i) Verify that for each value in the value list, the integer value is of the same type (bool or int) as the corresponding parameter type from the function data. (d) If the types match, store the value in the symbol table as a (Name ! Value) pair. (e) Once the expression produces a value, ensure its type matches the function's return type. (f) If the types match, remove each parameter in the parameter list from the symbol table. (g) Return the result of expression. When constructing the interpreter's main execution logic, note that the body of a function is an Expression (from the language grammar), which is stored as a list with a specific number of elements. For example, examine the main function body, used above: [ [ input, [ ] ],[ +,3] ] ] ] For this particular body, use the production: Expression ! Value Extra Expression. Observe that for this particular production is a list of two elements, a predicate that handles an expression of two components and produces a result is needed: expressionHandler( [ Value, ExtraExpression ], Result ) :- valueHandler( Value, ValueResult ), extraExpressionHandler( ExtraExpression, ExtraResult ), evaluate( ValueResult, ExtraResult, Result ). Since a predicate can be written for each production, the valueHandler will take in a Value production list and break that down to an integer (a value is either a number or an identifier, which is a number stored in a symbol table). An ExtraExpression can be an arithmetic function or nothing. With these in hand, you can evaluate the outcome of all of our components to produce a result. For this, the evaluate clause needed to handle the code would be: evaluate( Value1, [ '+', Value2 ], Result ) :- Result is Value1 + Value2. Writing the interpreter in this fashion means that every predicate is a black box that always returns a result. If you are in expressionHandler and you know what kind of result the decomposition predicates will produce, you should be able to handle the values and produce results with little trouble.5 The deliverables for this assignment: (a) The symbol table, saved as a file named symbol table.pl. (b) Create a file named interpreter.pl to contain all the function calling and code execution predicates. The predicates may be named however you choose and can accept any number of parameters. Ensure your interpreter follows the requirements outlined on pages 2-4 of this document. Any other additional work is at your own discretion. (c) Finally, create a language executor.pl file which includes a run program/3 predicate as follows: run program( FileName, Arguments, Result ) The first argument accepts an atom as a file-name to a source code file, the second argument is a list of integers to be passed to the main function, and the third argument should be the result of the code execution in the interpreter. This predicate should be the main logic of the pipeline as outlined in the figure on page 1.Via this predicate, tokenize the source file, move the tokens to the lexer, move the label list to the parser, etc., to finally unify Result to the result of the interpreted code provided by the file and the arguments. Note: When you submit your code, be sure to include ALL files required for execution (tokenizer, lexer, etc.)..