Description
Introduction
A compiler and linker translate a high-level programming language into an executable image for a target processor (text to binary). A compiler can also translate one programming language into another programming language (text to text). This is one type of cross compiler.
For this project you design and implement an ANSI C compiler that targets a PIC 16F1827 processor. The input is ANSI C source and the output is PIC 16F1827 assembler code. You take the compiler from Assignment 3, implement the declarations and functions, and update the code generator. You can update the code generator provided or you can use your code generator from Assignment 12. The PIC assembler code shall compile and run under MPLAB. The execution of the PIC instructions shall correctly implement the C source programs. For extra credit you update the code generator to handle the mustang.c file and target a PIC16F1827 development board. The mustang.c program implements the right side of the turning signal for the Ford Mustang.
Requirements
The name of the compiler shall be ansi_c and shall run as a console application under Linux or Unix. It shall input the three C source programs (../hw0[1|2]/Code*.c[pp]) and output a correct PIC assembler program (Code*.asm) file. You shall run the project_test.sh script file (./project_test.sh & project_test.txt) and the output file shall be printed out as part of the project report.
Each student shall design and implement a version of the C to PIC cross compiler. At the end of the project, each student shall compress the project code (tar) and submit the file to blackboard. Please follow the submission instructions to make grading the assignment easy on the professor. Answer the questions at the end of the project. The student shall turn in the project document on the last day of class. When you print the source code, please print multiple pages on each sheet.
• Cover page
• Introduction
• Answer questions
• project_test.txt
• Code*.asm
• [codegen.c] // If you wrote your own version, please include the source
Review
In the previous assignment the calculator program translated the input source code into three address codes and then into PIC assembler code. The scanner reads and converts the input into tokens and attributes. The scanner installed the identifiers and constants into the symbol table and returned the index value. The parser uses the index to find the symbol table entry. The parser updates or uses the symbol table depending on a declaration or statement production. In a declaration the type and identifier are inserted into the symbol table to create a variable. In an assignment statement an identifier can be on both sides (left = right) of the production.
The output of the parser was a QUAD with an operator and up to three operands. The linked list was then passed to the code generator. The code generator translated the operator and operands into multiple PIC assembler instructions (TUPLE). The final instructions were optimized and generated (prefix, code, postfix) using an absolute address PIC memory model.
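As a rough illustration of the QUAD to TUPLE step described above, the sketch below translates a single three-address quad into PIC instructions. The names quad_t, QUAD_ASSIGN, QUAD_ADD and emit_quad are hypothetical and are not taken from the assignment sources; only the mnemonics (MOVF, ADDWF, MOVWF) are standard PIC16 instructions.

```c
#include <stdio.h>

/* Hypothetical three-address quad: result = left op right. */
typedef enum { QUAD_ASSIGN, QUAD_ADD } quad_op_t;

typedef struct quad {
    quad_op_t    op;
    const char  *result;  /* destination symbol */
    const char  *left;    /* first source symbol */
    const char  *right;   /* second source symbol, unused for QUAD_ASSIGN */
    struct quad *next;    /* quads are chained into a linked list */
} quad_t;

/* Write the PIC instructions for one quad into buf.
 * Returns the snprintf() result, or -1 for an unknown operator. */
int emit_quad(const quad_t *q, char *buf, size_t size)
{
    switch (q->op) {
    case QUAD_ASSIGN:
        /* result = left: load left into W, store W into result */
        return snprintf(buf, size, "\tmovf\t%s,w\n\tmovwf\t%s\n",
                        q->left, q->result);
    case QUAD_ADD:
        /* result = left + right: W = left, W = W + right, store W */
        return snprintf(buf, size,
                        "\tmovf\t%s,w\n\taddwf\t%s,w\n\tmovwf\t%s\n",
                        q->left, q->right, q->result);
    }
    return -1;
}
```

One quad expands into two or three instructions here, which is why the code generator output is a longer list than the parser output.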
The project target machine is the PIC 16F1827 architecture. Therefore an optimized version of a compiler can be designed. For example the immediate values for the PIC are char or integer and can range from 0 to 255. It is then possible to put static checking in the scanner to audit the source code values before passing the values to the parser. The scanner packages the input values into dynamic memory and returns the values to the parser. It leaves the parser the job of installing the identifiers into the symbol table. A C program also has levels and functions. The level information must be part of the symbol table and used in the code generation. The PIC does not allow two symbols with the same names so the code generator must handle this. The output could be in physical addresses only (0x20 to 0x7f) and the symbols sections would not need to be generated (symbol EQU address). The address only function has been included in the source code and is turned on and off with the address flag on the command line.
Usage: ansi_c [[+|-]echo] [[+|-]debug] [[+|-]yydebug] [[+|-]symbol] [[+|-]address] [+test] [filename] […]
The echo flag repeats the input characters to the console window. The debug flag prints the TUPLE linked list at the top of the parser and again after post processing. The yydebug flag puts the yacc parser into verbose mode. The symbol flag prints the symbol table and the free symbol table. The address flag generates the output using only physical addresses. The test flag prints the example code generator functions. Any other input is treated as an input file.
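The [+|-] flag convention in the usage line can be handled with a small lookup table. The sketch below is only an assumption about how such parsing might look; the FLAG_* names and apply_flag() are hypothetical and are not the definitions in the provided main.c. The +test flag is left out to keep the example short.

```c
#include <string.h>

/* Hypothetical flag bits; the real definitions live in the project sources. */
enum {
    FLAG_ECHO    = 1 << 0,
    FLAG_DEBUG   = 1 << 1,
    FLAG_YYDEBUG = 1 << 2,
    FLAG_SYMBOL  = 1 << 3,
    FLAG_ADDRESS = 1 << 4
};

/* Apply one command-line argument to *flags.
 * Returns 1 if arg was a recognized [+|-]flag, or 0 if the caller
 * should treat it as an input file name. */
int apply_flag(const char *arg, unsigned *flags)
{
    static const struct { const char *name; unsigned bit; } table[] = {
        { "echo",    FLAG_ECHO    },
        { "debug",   FLAG_DEBUG   },
        { "yydebug", FLAG_YYDEBUG },
        { "symbol",  FLAG_SYMBOL  },
        { "address", FLAG_ADDRESS }
    };
    size_t i;

    if (arg[0] != '+' && arg[0] != '-')
        return 0;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(arg + 1, table[i].name) == 0) {
            if (arg[0] == '+')
                *flags |= table[i].bit;   /* +flag turns the option on */
            else
                *flags &= ~table[i].bit;  /* -flag turns the option off */
            return 1;
        }
    }
    return 0;
}
```

Calling apply_flag() on each argv entry in order lets a later -debug cancel an earlier +debug, and anything it rejects falls through as a file name.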
Instructions
From a console window, make a directory on your computer in your EECS337 directory under your Case ID and call it project.
    mkdir ~/EECS337/caseid/project/ ; where caseid is YOUR Case ID, in lower case
Change directory to the project directory.
    cd ~/EECS337/caseid/project/
Download a copy of the file: project_caseid.tar to the project directory from https://blackboard.case.edu/ in the EECS337 homework assignment area. To untar the tar file type the command:
    tar xvf project_caseid.tar
The following files will be created in the current working directory.
    Makefile project_test.sh main.c scan.l gram.y yystype.h tokens.h audit.c mustang.c test1.c test2.c test3.c test3.txt tuple.c symbols.c codegen.c
The original C scanner (scan.l) and grammar (gram.y) files from assignment 03 have been updated to include a main program (main.c), an include file (yystype.h), a Makefile and extra source files (audit.c symbols.c tuples.c codegen.c tokens.h). The scan.l and gram.y files are the only files you need to edit to complete the first part of the project. Then update or rewrite the codegen.c file to complete the project. A number of other files have been included to perform the scanner, parser, symbol table and code generator functions. You need to update the scanner file to return tokens and attributes to the parser, and insert the parser functions into the yacc file to use the symbol table and generate the PIC assembler code.
The ansi_c compiler from Assignment 03 has a scanner (scan.l) and parser (gram.y) to handle the C programming language. The scanner has no actions for the regular expressions and the parser has no actions. You update the scanner and the parser using the files included with the project. To build and test the initial ansi_c compiler type the commands:
    make clean
    make
    ./ansi_c +debug +symbol +yydebug test1.c
Implementation Details
Implement the source code to perform the cross compiler operations. Use the files provided or write your own. You can reuse the source code from any of the assignments. An outline of the steps to complete the project is shown below.
• Update the scanner to include passing attributes to the parser.
• Update the parser to implement declarations for the symbol table (create and find).
• Update the parser to translate functions into PIC assembler code.
• Update or rewrite the PIC code generator to support all the PIC16F1827 instructions.
• [Extra] Update the code generator to support the embedded development board.
Part 1: Scanner Update
Open the scan.l file and find the regular expression for IDENTIFIER that returns the check_type() function. Change the action to the action shown below. The scanner functions are provided in the audit.c file.
    { count(); yylval.tuple = identifier( yytext, yyleng); return(check_type()); }
Find the hexadecimal regular expression that returns a CONSTANT. Change the hexadecimal action to the action shown below.
    { count(); yylval.tuple = constant_hex( yytext, yyleng); return(CONSTANT); }
Repeat this process for the octal, decimal, char, float and string_literal actions, replacing the function word "hex" with each of the names. Then change all the other regular expressions that return a token to assign that token to the attribute variable (yylval.token). (The first and last regular expressions are shown below.)
    "auto" { count(); yylval.token = AUTO; return(AUTO); }
    …
    "?" { count(); yylval.token = '?'; return('?'); }
Save the scan.l file.
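Because PIC immediate values range from 0 to 255, the scanner actions for constants are a natural place for a range audit. The sketch below shows one way such a check could be written for hexadecimal text. It is not the constant_hex() function from audit.c; audit_hex_text() is a hypothetical helper, and rejecting suffixes such as u or l is a simplifying assumption.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not the project's constant_hex()): convert
 * hexadecimal text such as "0x7f" and check that it fits a PIC
 * immediate value (0 to 255).
 * Returns 0 and stores the value on success, -1 otherwise.
 * Text with a trailing suffix (for example "0x7fu") is rejected. */
int audit_hex_text(const char *text, int *value)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 16);     /* base 16 accepts an optional 0x prefix */
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;                  /* not a clean hexadecimal number */
    if (v < 0 || v > 255)
        return -1;                  /* does not fit in one PIC byte */
    *value = (int)v;
    return 0;
}
```

A scanner action could call a check like this on yytext and report an error before the value ever reaches the parser.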
Open the gram.y file. Find the declaration_specifiers production and the function of the same name in the audit.c file. The comment above that function in audit.c lists the parser functions. Copy and paste them into each of the four productions in the gram.y file, and be sure to enclose each action in curly brackets inside the production. Save the gram.y file and close the audit.c file. The audit (static checking) allows INT, CHAR and VOID types.
Part 2: Parser Declarations Update
For the symbol table each set of curly brackets (block or function body) increments the level count. At the end of each block the symbol table is saved onto the free symbol table list and the level is decremented. The free symbol table information is needed by the code generator for generating the symbol table. In this case when the level is 0 then it is a global variable. Above level 0 is a block statement or function body and the symbol can be re-declared. Therefore the code generator uses the level information in the symbol name generation. The symbol name is appended with the underscore character '_' and the level number during code generation. For example the code fragment below has one global variable and three stack variables:
    int i = 0;
    main()
    {
        int i = 1;
        {
            int i = 2;
            {
                int i = 3;
            }
        }
    }
The identifier is looked up in the symbol table to find the physical address and level information. This is printed for each symbol. If the symbol is greater than level 0 the code generator generates the level number appended to the symbol separated by the underscore character '_'. Else if the symbol or label is only one character long then the underscore '_' symbol is also appended to the symbol name.
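The naming rule above can be sketched as a small helper. The function name pic_symbol_name() is hypothetical and is not part of the provided codegen.c; it only restates the two rules in code form.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the assembler name for a symbol declared
 * at the given block level.
 *   level > 0             -> name_level   (for example i_1)
 *   level 0, one char     -> name_        (for example i_)
 *   level 0, longer name  -> name unchanged
 * Returns the snprintf() result. */
int pic_symbol_name(const char *name, int level, char *buf, size_t size)
{
    if (level > 0)
        return snprintf(buf, size, "%s_%d", name, level);
    if (strlen(name) == 1)
        return snprintf(buf, size, "%s_", name);
    return snprintf(buf, size, "%s", name);
}
```

For the fragment above, the global i at level 0 becomes i_, and the three nested declarations become i_1, i_2 and i_3, so each gets a distinct assembler symbol.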
The symbols are used in assignment operations (left = right) and converted into "MOVWF symbol" or "MOVF symbol, W" PIC instructions. Constants (right only) are converted into "MOVLW value" PIC instructions. The code generator output for the symbol table and code fragment for the example is:
    ;symbol table:
    i_      EQU     0x20
    ;symbol table free:
    i_1     EQU     0x21
    i_2     EQU     0x22
    i_3     EQU     0x23
    mloop:                      ; here begins the main program
            movlw   0x00
            movwf   i_
            movf    PORTA,w
            call    main
            goto    mloop
            return
    main:
            movlw   0x01
            movwf   i_1
            movlw   0x02
            movwf   i_2
            movlw   0x03
            movwf   i_3
            return              ; if main does not have a return
Open the gram.y file, change the %start symbol from file to code, and add the production below.
    code
        : file
        {
    #ifdef YYDEBUG
            if( IS_FLAGS_DEBUG( data.flags))
            {
                printf( "Debug: yacc tuples\n");
                print_tuple_list( $1.tuple);
            }
    #endif
            code_generator_pic16f1827( $1.tuple);
        }
        ;
In the gram.y file find the compound_statement production and replace it with the code below. Also open the symbols.c file and find the same function names there. Notice that the comments above those functions show the function calls used in the code pasted below.
    left_bracket
        : '{' { symbol_left_bracket(); }
        ;
    right_bracket
        : '}' { symbol_right_bracket(); }
        ;
    compound_statement
        : left_bracket right_bracket { $$.tuple = 0; }
        | left_bracket statement_list right_bracket { $$.tuple = $2.tuple; }
        | left_bracket declaration_list right_bracket { $$.tuple = $2.tuple; }
        | left_bracket declaration_list statement_list right_bracket { $$.tuple = tuple_tail_to_head( $2.tuple, $3.tuple); }
        ;
In gram.y find the declaration production. Change the two productions to the code below.
    declaration
        : declaration_specifiers ';' { $$.tuple = 0; }
        | declaration_specifiers init_declarator_list ';'
        {
            $$.tuple = symbol_declaration( $1.token, $2.tuple);
            $$.tuple = tuple_declaration( $1.token, $$.tuple);
        }
        ;
In gram.y find the init_declarator_list production. Insert the action below into the production with the comma.
    $$.tuple = symbol_init_declarator_list( $1.tuple, $3.tuple);
In gram.y find the init_declarator production. Insert the action below into the production with the equal sign.
    $$.tuple = symbol_init_declarator( $1.tuple, $3.tuple);
Save the gram.y file. To build and test the symbol table version type the commands:
    make clean
    make
    ./ansi_c +debug +symbol test1.c
    ; automatic code generation for PIC16F1827
    ; EECS337 Compiler Design
    ; by: caseid, date: Fall 2013
    ; for PIC16F1827 processor
    ; CPU configuration
            list p=16f1827      ; list directive to define processor
    #include

