Description
Part 1: Reading Compilers – Principles, Techniques, & Tools, 2nd Edition, Sections 4.5, 4.6, 4.9 Homework Exercises – problem 4.5.3 (b) and laboratory assignment.
Introduction An example of a math calculator program (interpreter) using lex (first.l) and yacc (second.y) is found on page 295-296 of your book. You implement and test the program using the commands: lex first.l yacc second.y cc y.tab.c -ly –ll ./a.out ./a.out < math0.txt The first command creates a lex.yy.c file. The second command creates a y.tab.c file. The third line compiles the y.tab.c file into an executable file (a.out). You run the executable program with the forth command and enter a string from the console. If you enter a string (1e1+1e2) and hit return, you will get an output result (110) printed to the screen. To exit the program you must use CTRL-C to terminal the program. The fifth line shows redirecting a file to the executable to test the program and terminates when it see the end of file (EOF) character.
The way the source code from the book is formatted, lex.yy.c file from the first line is included in the bottom of the yacc (second.y) file. This is why it is not included in the third line as a separate file to compile and link. If you change the first.l file you must execute the first three commands in sequence again. It is easier to use a make file (Makefile) to compile and link each file independently. We also would like to include a main program (calc.c) so we can initialize the compiler/interpreter before calling the yacc parser (yyparse).
In this assignment you change the implementation of the math operators (+ - * /) from using a double (8 bytes) type to using a long double (16 bytes) type. Then you add the bitwise operators (& | ^ % ~), which require integer types (long long). You update the YYSTYPE structure to handle integer and floating-point numbers. You will need to handle constant type conversion for the bitwise operators. For extra credit you remember the constant format (floating point, decimal, octal, hexadecimal) for correctly displaying the output result (0xff & 0x0f - 0x0f).
For encoding integer and floating-point numbers you replace the (first.l) lex regular expression for number to use the expressions from the ansi_c compiler (shown below). number [0-9]+\e.?|[0-9]*\e.[0-9]+ %% {number} { sscanf( yytext, "%lf", &yylval); return NUMBER; } 0[xX]{H}+{IS}? // hex 0{D}+{IS}? // octal {D}+{IS}? // decimal {D}+{E}{FS}? // float {D}*"."{D}+({E})?{FS}? // float {D}+"."{D}*({E})?{FS}? // float
Laboratory Assignment
In this assignment you encode the numbers using lex regular expressions and the C sscanf() function into a static YYSTYPE data structure. You update the current lex scanner to add lex regular expressions to encode integers (decimal, octal, hexadecimal) and floating-point numbers. You then expand the math operators (+ - * /) to include the bitwise instructions (| ^ & % ~) operators. You add action code to manually handle the operand types in the parser to maintain the correct operand type.
In the calculator program the parser uses the yacc precedence operators (%left %right %prec). The precedence statements are normally located in the top of the yacc file (second.y) just below the %token statements. The precedence is defined from lower to higher precedence going down the page. The binary operators are referenced to the left (a op b) and the unary operators are referenced to the right (op b). For example: (-2+3) where minus (unary) is applied first and then plus (binary) is used next ((-2)+3).
The book version of the calculator program supports double as the yylval data type and only returns floating point numbers (1e1). The parser is designed to handle a syntax error and to resynchronize at the end of line (‘\n’). Look at the second.y file and see “lines” is defined as an expression, error or empty line. The action yyerrok clears the error and allows the parser to continue processing the input stream. The version shown below is slightly different than the version shown on page 296 and removes the extra productions. We would like one place to handle the output result. lines : lines expr '\n' { printf("%g\n", $2); } | lines error '\n' { yyerror(" reenter previous line: "); yyerrok; } | /* empty */ ; In this assignment you redefined YYSTPE to using a structure definition to hold type information, INTEGER and FLOAT values. This changes the size of the yacc stack from a single scalar value to a larger structure definition. So while the size of the yacc stack is bigger in this implementation it does provide direct memory references for increased speed compared to dynamic address pointer implementations (indirect addresses). The math operators (op) already included are: '+' addition '-' subtraction '*' multiple '/' divide '-' unary minus The new logical bit-wise operators are: '|' IOR '^' XOR '&' AND '%' MOD '~' NOT (unary)
You code the parser actions to support the correct data types. For the math instructions, if all the input operands are INTEGER then the result is an INTEGER. If any of the operands are a FLOAT then the result is a FLOAT. The data conversion methods are shown below for the math operators. INTEGER = INTEGER op INTEGER FLOAT = FLOAT op INTEGER FLOAT = INTEGER op FLOAT FLOAT = FLOAT op FLOAT INTEGER = op INTEGER FLOAT = op FLOAT
For the logical bit-wise operators all the FLOAT input operands need to be converted into INTEGER and the results is always an INTEGER. The data conversion methods are shown below for the bit-wise operators.
INTEGER = INTEGER op INTEGER INTEGER = FLOAT op INTEGER INTEGER = INTEGER op FLOAT INTEGER = FLOAT op FLOAT INTEGER = op INTEGER INTEGER = op FLOAT
To achieve maximum execution speed the YYSTYPE data structure is defined using a static structure definition and the math and logic operations are coded directly in-line. The actions could be coded into just a few functions to optimize on space. The trade-off for increased speed is more source code to maintain and gives a larger executable image. This model allows yacc programs to be used in soft real time applications such as interpreters and processing stages in routers and communication systems. In this assignment you shall implement the functions to encode constant values in the scanner. You should be able to support: integers (ddd), octal (0ooo), hexadecimal (0xhhh), and floatingpoint numbers (d.de-d). The current version defined in the code supports only floating-point numbers (math0.txt). Part 2: Laboratory From a console window, make a directory on your computer in your EECS337 directory under your Case ID and call it hw06. mkdir ~/EECS337/caseid/hw06/ ; where caseid is YOUR Case ID, enter all in lower case
Change directory to the hw06 directory. cd ~/EECS337/caseid/hw06/
Download a copy of: hw06_caseid.tar file to the hw06 directory from https://blackboard.case.edu/ from the EECS337 homework assignment area. To untar the tar file type the command: tar xvf hw06_caseid.tar
The following files will be created in the current working directory. book/book.sh book/first.l book/second.y calc.c first.l hw06_test.sh Makefile math0.txt math1.txt math2.txt math3.txt math4.txt math5.txt math6.txt math7.txt math8.txt math9.txt second.y yystype.h
The book subdirectory contains the original software from the book. In the current directory the software has been updated to use a make file, moves the YYSTYPE definition into the yystype.h file, adds a main function and adds the yyerrror and yywrap functions. Type the following commands to see the differences between the starting templates and original book version. diff first.l book/ diff second.y book/
[Optional] You can change directory into the book directory and run the book.sh file to compile, link and test the original version of the software (math0.txt). cd book/ ./book.sh cd ..
Look at the test files (math*.txt) and examine the input strings. The first file uses expressions with strings that should work with the original software. The others create syntax errors.
Look at the calc.c file. This is the main program and calls into yyparse. This version is not the same as used before in the previous assignments. Only the +yydebug flag works for debugging.
Look at the yystype.h file. Notice the book defines YYSTYPE as a double at the top of the yacc file. This version defines YYSTPE as a double inside the yystype.h file.
Look at the first.l file and notice the include files at the top of the file. The yywrap function is included at the bottom of the file.
Look at the second.y file and notice the include files at the top of the file. The yyerror function is included at the bottom of the file.
Build the calculator compiler (interpreter) calc by using the commands: make clean and make
To test this version of the calculator program from the command line type: ./calc and enter a math expression: 13e1+1e2 You should get the result (230) printed to the standard output. Notice on a syntax error the software does not exit: 2+3 To exit, type on the command line: $
To test with yydebug statements type: ./calc +yydebug and enter a math expression as before.
To test with the first example file type: ./calc < math0.txt Your output should be: for caseid start time: Mon Sep 30 09:54:13 2013 SIZEOF int: 4 INTEGER: 8 INTEGER INTEGER: 8 float: 4 FLOAT: 8, INTEGER FLOAT: 16 Enter calculator expression and $ to exit 230 6000 -170 0.15 170 -230 6000 -6000 6000.15 On $ or EOF the program exits. You are now ready to solve the laboratory assignment.
Part 3: Laboratory Assignment You will edit the current version that supports double numbers with math operations to include long long and long double values. You use type in the structure to know which value is encoded.
Edit calc.c and change all caseid to your Case ID and save the calc.c file.
Edit yystype.h and change the YYSTYPE to a structure definition to support long long and long double. /* * define yystype */ #define YYSTYPE struct yycalc YYSTYPE { int type; // holds INTEGER or FLOAT from %token long long lvalue; // holds INTEGER value long double dvalue; // holds FLOAT value }; Save the yystype.h file. Edit the lex.l file and remove the number definition. Then add to the top the lex macro definitions from the ansi_c compiler (hw03 scan.l). Insert this code before the first %{ as shown below. D [0-9] L [a-zA-Z_] H [a-fA-F0-9] E [Ee][+-]?{D}+ FS (f|F|l|L) IS (u|U|l|L)* Then between the %%s remove the lex expression {number} and action for NUMBER and insert the code below. This action encodes a hexadecimal string into the lvalue of the yylval structure. 0[xX]{H}+{IS}? { sscanf( yytext, "%Lx", &yylval.lvalue); yylval.type = INTEGER; return( INTEGER); } Then add action code to the regular expression shown below to support octal (%Lo), decimal (%Ld) and the three floating-point (%Lf) numbers. For the integer types use INTEGER for the type and return INTEGER. Encode the floating-point numbers into the dvalue field (&yylval.dvalue) and set the type to FLOAT and return the FLOAT value. 0{D}+{IS}? {D}+{IS}? {D}+{E}{FS}? {D}*"."{D}+({E})?{FS}? {D}+"."{D}*({E})?{FS}? Save the first.l file.
Edit the second.y file and remove the %token NUMBER and replace it with two new tokens. %token INTEGER %token FLOAT Change the precedence operators to include the logical bitwise operators. %left '|' %left '^' %left '&' %left '+' '-' %left '*' '/' '%' %right UMINUS '~' /* supplies precedence for unary minus */
Change the number production with NUMBER to now be INTEGER or FLOAT. number : INTEGER | FLOAT ; Then change all the production actions to handle the new operand data types. For example replace the old line to print the output result with the code shown below. Notice you need to switch on the operand type to determine the format string to use in the printf statement. lines : lines expr '\n' { switch( $2.type) { case INTEGER: printf("%Ld\n", $2.lvalue); break; case FLOAT: printf("%Lg\n", $2.dvalue); break; } } | lines error '\n' { yyerror(" reenter previous line: "); yyerrok; } | /* empty */ ; Remove and replace all the actions for the math operators and add the bitwise operators to handle the INTEGER and FLOAT operand types. One method is shown below. Notice the action is to switch on both of the operand types and type cast the values before the operations. Save the second.y file. | expr '-' expr { switch( $1.type) { case INTEGER: switch( $3.type) { case INTEGER: $$.type = INTEGER; $$.lvalue = $1.lvalue - $3.lvalue; break; case FLOAT: $$.type = FLOAT; $$.dvalue = (long double)$1.lvalue - $3.dvalue; break; } break; case FLOAT: switch( $3.type) { case INTEGER: $$.type = FLOAT; $$.dvalue = $1.dvalue - (long double)$3.lvalue; break; case FLOAT: $$.type = FLOAT; $$.dvalue = $1.dvalue - $3.dvalue; break; } break; } }
Test your calculator program using each of the test files by typing the commands:
make clean make ./calc < ../hw06/math0.txt ./calc < ../hw06/math1.txt ./calc < ../hw06/math2.txt ./calc < ../hw06/math3.txt ./calc < ../hw06/math4.txt ./calc < ../hw06/math5.txt ./calc < ../hw06/math6.txt ./calc < ../hw06/math7.txt ./calc < ../hw06/math8.txt ./calc < ../hw06/math9.txt Part 4: Extra Credit (5 points) Edit the files again and change the code to remember the format of the constants (decimal, octal, hexadecimal, floating-point). Decide on a method to determine the output results when the inputs are not using the same format (0x0ff & 1.3 - 0x01).
Generate your output the same as in Part 5 and upload ONLY your first.l file to blackboard in the assignment area. Submit it with the comments hw06 with your CaseID on the title line. The extra credit will be graded on-line and your score will be added to your assignment score. When you generate your laboratory output file in Part 5, show the improved type format handling. Mark your output file with EXTRA CREDIT and turn in with your assignment. Part 5: Laboratory Output Generation When all your lab assignments have been completed execute the homework script file “./hw06_test.sh”. To redirect the standard error (stderr) and standard output (stdout) to a file use the following command. ./hw06_test.sh & hw06_test.txt
Print out the hw06_test.txt and put your name, assignment number, date on it and turn it in with your homework exercises.
Your final directory structure for the calc compiler should be as below (using your Case ID): EECS337/caseid/hw06/Makefile EECS337/caseid/hw06/calc EECS337/caseid/hw06/calc.c EECS337/caseid/hw06/first.l EECS337/caseid/hw06/hw06_caseid.tar EECS337/caseid/hw06/hw06_test.sh EECS337/caseid/hw06/hw06_test.txt EECS337/caseid/hw06/math0.txt EECS337/caseid/hw06/math1.txt EECS337/caseid/hw06/math2.txt EECS337/caseid/hw06/math3.txt EECS337/caseid/hw06/math4.txt EECS337/caseid/hw06/math5.txt EECS337/caseid/hw06/math6.txt EECS337/caseid/hw06/math7.txt EECS337/caseid/hw06/math8.txt EECS337/caseid/hw06/math9.txt EECS337/caseid/hw06/second.y EECS337/caseid/hw06/yystype.h

