Assignment title: Information

Principles of Programming Languages winter 2017 Assignment 5 - Programming Part Due: (via svn), 20 March 11:59pm Programming Part: The task is to build a parser. A parser analyses the source code, using the grammar defined for the language (included in the \Language Definition" document alongside), to ensure the code is grammatically (syntactically) correct. You will need to use the using the output of the lexer from A4 as the input to the parser.Below you will find instructions and specifications for building a parser The Parser: Written using a definite clause grammar in Prolog (make sure you have read and understood the Tutorials 6 and 7, on definite clause grammars (DCG’s)), and saved in a file named grammar.pl . The parser should be capable of of verifying a list of tokens created by the lexer and produce a list of structured productions, generated by parsing the list of atoms (see below). For this, you will need the language specification, available alongside.The predicate for retrieving the structured production list should be: parse list( LexedList, StructuredList ) . This parser will require you to augment your grammar using parameterized headers as shown in tutorial 7. The StructuredList output by the grammar file does not contain any actual identifier names or numerical values. This means you will have to refer back to the token list and replace all of the ‘number’ and ‘identifier’ terminals in your structured list with the appropriate values from the token list. The number of terminal tokens will not change between the token list and the structured list. i.e. if ‘number’ is the fourth terminal in the structured list, then its value is the fourth token in the token list. You will need to write a predicate that inserts the correct integers and identifiers into your parsed list. Some comments on constructing the Structured List of Productions: It is advisable to construct the grammar in a way that the language unifies a list of terminals to be processed in the interpreter (in the next assignment). There are many ways to organize these terminals when parsing the lexed tokens. However, it is strongly recommend you use the following output style to create a list of lists, where each list represents a single production in your grammar. 12 Consider the following simple grammar for adding values: Math ! Number Operator Number Number ! number Operator ! + j− This grammar applies to any simple arithmetic expression like: 1 + 3; 14 − 2; or 0 + 11. One way of parsing using such a grammar in Prolog would be to simply return a list of terminals such as [9, +, 4]. Since the grammar is simple, such solution would be acceptable. However, if the grammar were more complex, containing =; ∗; (; ), you may end-up with more difficult to process lists like: [ (, 15, +, 1, ), *, 3, -, (, 4, /, 3, ) ], leading to confusing and disorganised code. Instead of storing the parsed list as a single, flat list, it is better store every production as a list. In the example above the parsed list should instead be stored as: [[9], [+], [4]] . This will make the code easier to organise. Notice that there are three lists inside a list, the structure of which matches the Math production in the grammar. The list is easy to calculate, because everything is broken down in terms of the grammar productions and lists, when writing predicates for the interpreter. Then the following four predicates can be written to calculate with the above parsed list: math( [ Number1, Operator, Number2 ], Result ) :- number( Number1, NumValue1 ), number( Number2, NumValue2 ), operator( Operator, OpType ), calculate( NumValue1, OpType, NumValue2, Result ). number( [ Number ], Value ) :- atom number( Number, Value ). operator( [ ‘+’ ], ‘+’ ). operator( [ ‘-’ ], ‘-’ ). calculate( Value1, ‘+’, Value2, Result ) :- Result is Value1 + Value2. calculate( Value1, ‘-’, Value2, Result ) :- Result is Value1 - Value2. The abstract quality of the above code means that the predicates are modular and changes rarely propagate across the entire program affecting other parts. The most beneficial part of processing parsed code in this fashion is that if you know how to write your production rules, then you also know how to write the predicates to process them.