CSC 446 C– Compiler ASSIGNMENT #1

This project consists of writing a Lexical Analyzer for a subset of the C programming language.  The Lexical Analyzer is to be a module written in the language of your choice that exports the following:

 

procedure GetNextToken: a parameterless procedure

global variables:

    Token
    Lexeme
    Value      {for integer tokens}
    ValueR     {for real tokens}
    Literal    {for string literals}

 

The following are the reserved words in the language (must be lower case):

 

if, else, while, float, int, char, break, continue, void.
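
One common way to handle reserved words is to scan every word as an identifier first and then look it up in a table. The sketch below assumes that approach; the function name classify_word and the token names it returns (such as "IFT" and "IDT") are hypothetical and not part of the assignment.

```python
# Reserved words from the list above. Matching is case sensitive,
# so "If" or "WHILE" fall through to plain identifiers.
RESERVED_WORDS = frozenset(
    ["if", "else", "while", "float", "int", "char", "break", "continue", "void"]
)


def classify_word(lexeme):
    """Return a hypothetical token name for an already-scanned word."""
    if lexeme in RESERVED_WORDS:
        # Naming scheme (upper-case word plus "T") is an assumption.
        return lexeme.upper() + "T"
    return "IDT"
```

With this table, classify_word("while") yields "WHILET", while classify_word("While") is treated as an ordinary identifier.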

 

The notation for specifying tokens is as follows:

 

Comments begin with the symbol /* and continue to the closing delimiter */.

Comments may appear after any token.
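
Because comments may follow any token, a scanner can skip them in the same loop that skips white space. The following is a minimal sketch under that assumption; the name skip_whitespace_and_comments and the handling of an unterminated comment (it consumes the rest of the input) are choices made here, not requirements from the assignment.

```python
def skip_whitespace_and_comments(src, pos):
    """Return the index of the next character that can start a token."""
    n = len(src)
    while pos < n:
        if src[pos].isspace():
            pos += 1
        elif src.startswith("/*", pos):
            # Search for the closing delimiter after the opening "/*".
            end = src.find("*/", pos + 2)
            # Assumption: an unterminated comment swallows the rest of the input.
            pos = n if end == -1 else end + 2
        else:
            break
    return pos
```

Searching from pos + 2 keeps the opening "/*" from being mistaken for part of the closing delimiter in input such as "/*/ x */".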

 

Blanks between tokens are optional, with the exception of reserved words.  Reserved words must be separated by blanks, newlines, or another token.

 

Token id for identifiers matches a letter followed by letters, digits and/or underscores, with a maximum length of 27 characters.  C identifiers are case sensitive.

 

letter -> [a-z,A-Z]

digit  -> [0-9]

underscore -> _

id     -> letter(letter | digit | underscore )*
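
The id rule above translates directly into a regular expression. This sketch assumes the scanner reports an over-long identifier through a flag rather than truncating it; that policy, along with the names scan_identifier and MAX_ID_LENGTH, is hypothetical.

```python
import re

MAX_ID_LENGTH = 27  # maximum identifier length stated in the assignment
# id -> letter (letter | digit | underscore)*
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def scan_identifier(src, pos):
    """Return (lexeme, next_pos, too_long), or None if no identifier starts at pos."""
    match = ID_PATTERN.match(src, pos)
    if match is None:
        return None
    lexeme = match.group()
    return lexeme, match.end(), len(lexeme) > MAX_ID_LENGTH
```

Note that an identifier may not begin with an underscore or a digit under this grammar, so scan_identifier("_x", 0) returns None.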

 

 

Token num matches unsigned integers or real numbers and has attribute Value for integers and ValueR for real numbers.

 

digits                        ->         digit digit*

optional_fraction     ->         . digits |   ε

num                          ->         digits optional_fraction
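
The num rule can be sketched as a regular expression with an optional fractional part. The function below is one possible shape, not a prescribed one: the name scan_number and the four-part return value are assumptions, with the second and third fields standing in for the Value and ValueR attributes.

```python
import re

# num -> digits optional_fraction, where the fraction is ". digits" or empty
NUM_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?")


def scan_number(src, pos):
    """Return (lexeme, value, value_r, next_pos), or None if no number starts at pos."""
    match = NUM_PATTERN.match(src, pos)
    if match is None:
        return None
    lexeme = match.group()
    if match.group(1) is None:
        # No fraction: integer token, only the integer attribute is set.
        return lexeme, int(lexeme), None, match.end()
    # Fraction present: real token, only the real attribute is set.
    return lexeme, None, float(lexeme), match.end()
```

Since a fraction requires at least one digit after the period, input such as "7.x" scans as the integer 7 and leaves the period for the next token.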

 

String literals begin with a " and end with a " and should be stored in the Literal variable.  Strings must begin and end on the same line.
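
A string scanner has to stop at either the closing quote or the end of the line, whichever comes first. The sketch below makes several assumptions the assignment does not state: the stored literal excludes the surrounding quotes, there are no escape sequences, and an unterminated string is reported through a flag. The name scan_string is hypothetical.

```python
def scan_string(src, pos):
    """Scan a string literal whose opening quote is at src[pos].

    Returns (literal, next_pos, terminated).
    """
    end = pos + 1
    # Advance until the closing quote, a newline, or the end of input.
    while end < len(src) and src[end] not in '"\n':
        end += 1
    if end < len(src) and src[end] == '"':
        return src[pos + 1:end], end + 1, True
    # Unterminated: stop before the newline so the caller can resume there.
    return src[pos + 1:end], end, False
```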

 

The relational operators (Token relop) are:

 

==, !=, <, <=, >, >=

 

The addop’s are: +, -, ||

 

The mulop’s are: *, /, %, &&

 

The assignop is:  =
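
Several of the operators above share a first character (for example = and ==, or < and <=), so the scanner should try the longest match first. A minimal sketch, assuming comments have already been skipped so that / is never the start of /*; the names OPERATORS and scan_operator and the token class strings are hypothetical.

```python
# Operator lexemes mapped to their token classes, as listed above.
OPERATORS = {
    "==": "RELOP", "!=": "RELOP", "<=": "RELOP", ">=": "RELOP",
    "<": "RELOP", ">": "RELOP",
    "+": "ADDOP", "-": "ADDOP", "||": "ADDOP",
    "*": "MULOP", "/": "MULOP", "%": "MULOP", "&&": "MULOP",
    "=": "ASSIGNOP",
}


def scan_operator(src, pos):
    """Return (token, lexeme, next_pos), or None if no operator starts at pos."""
    for length in (2, 1):  # try two-character operators before one-character ones
        lexeme = src[pos:pos + length]
        if len(lexeme) == length and lexeme in OPERATORS:
            return OPERATORS[lexeme], lexeme, pos + length
    return None
```

A lone !, | or & is not an operator in this subset, so the function returns None for those and the caller can report an unknown token.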

 

The following symbols are also allowed in the language:

 

( ) { } [ ] , ; . "

 

The C subset has the following rules:

 

Variable declarations start with one of the reserved words int, float or char, followed by a list of one or more (possibly initialized) variable names, and end with a semicolon.

 

Function declarations start with one of the reserved words void, int, float or char, followed by a function name, a parameter list and then the body of the function.

 

The body of a function begins with a { and is terminated by a }.

 

The tokens for each possible symbol (or type of symbol) should be declared as an enumerated data type.
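
As an illustration of the enumerated type, here is a sketch using Python's standard enum module. Every member name is an assumption made for this example, as are the EOFT and UNKNOWNT members for end of input and unrecognized characters; the assignment only requires that some enumerated type exist.

```python
from enum import Enum, auto


class Token(Enum):
    # One member per reserved word
    IFT = auto()
    ELSET = auto()
    WHILET = auto()
    FLOATT = auto()
    INTT = auto()
    CHART = auto()
    BREAKT = auto()
    CONTINUET = auto()
    VOIDT = auto()
    # Identifiers, numbers and string literals
    IDT = auto()
    NUMT = auto()
    LITERALT = auto()
    # Operator classes
    RELOPT = auto()
    ADDOPT = auto()
    MULOPT = auto()
    ASSIGNOPT = auto()
    # Remaining symbols: ( ) { } [ ] , ; . "
    LPARENT = auto()
    RPARENT = auto()
    LBRACET = auto()
    RBRACET = auto()
    LBRACKETT = auto()
    RBRACKETT = auto()
    COMMAT = auto()
    SEMIT = auto()
    PERIODT = auto()
    QUOTET = auto()
    # Assumed extras: end of input and unrecognized characters
    EOFT = auto()
    UNKNOWNT = auto()
```

Grouping the operators into classes (RELOPT, ADDOPT, MULOPT) mirrors the token names used in the specification, with the lexeme distinguishing the individual operator.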

 

To test your project, write a short program that imports (uses) module LexicalAnalyzer to read a source program and output the tokens encountered and their associated attributes (the lexeme for identifiers and reserved words, the numeric value for token num, and the symbol itself for all others). Source code for this and all other assignments must be submitted in a single zip file to the appropriate D2L dropbox on or before the due date.