Description
CSC209 A1
Introduction
In this assignment we will investigate an image processing application.
Analyzing blood smear images, and extracting features of the cells is an important application in the field
of medicine to help detect and monitor various types of diseases.
In this assignment, you will implement
a simple form of red blood cell counting. This assignment was inspired by an article
(https://www.sciencedirect.com/science/article/pii/S2352340915001626?via%3Dihub#bib1) about a data set
and algorithm for counting red blood corpuscles.
The images we originally started with were stored in BMP format.
We have already taken care of the first steps, so you will be working with a text file that has already been
segmented. We have taken the above image, and produced black and white images where all pixels
have the values of 0 (black) or 255 (white).
We then converted the file to ASCII (plain text) format so that you can read it using fscanf(). Read the
man page for fscanf() by typing man fscanf at the command line.
The file contains only integers separated by a space, and you can assume that the file format is correct
when writing your code. The first two integers (first line) are the number of rows and columns of the
image in pixels. There is a newline character after each row of pixels to make it slightly easier to visually
inspect the file.
Step 0: Set up
Your program will consist of two files: count_cells.c and image.c
These files have already been added to you repo. You should be able to compile your program using the
following gcc line:
gcc -Wall -g -std=gnu99 -o count_cells count_cells.c image.c
Step 1: Reading and writing an image file
The main function in count_cells.c takes one or two command-line arguments:
The first argument the name of blood cell image in the text file format described above.
The second argument, ” -p ” is optional. If present, this argument indicates that the pixel matrix
should be printed before other program output. In image.c you will implement two functions. Their
function prototypes are given below:
/* Reads the image from the open file fp into the two-dimensional array arr
* num_rows and num_cols specify the dimensions of arr
*/
void read_image(int num_rows, int num_cols,
int arr[num_rows][num_cols], FILE *fp);
void print_image(int num_rows, int num_cols, int arr[num_rows][num_cols]);
Because we are passing in a variable-length, multi-dimensional array as an argument to a function, the
first two arguments must be the dimensions of the array, and the third the array itself. (See King pages
197-199) We could call read_image as
int p, q; // assume p and q have valid values
int a[p][q];
FILE *fp; // assume fp holds a valid open file pointer
read_image(p, q, a, fp);
This means that we need to read the dimensions of the array before calling read image. Write these two
functions and call them from the main function in count_cells.c code and test it before moving on.
Remember to add the prototypes to count_cells.c . This might also be a good time to learn the basics
of gdb.
It would also be a great idea to make sure to commit and push your work at this point.
Step 2: Count cells
Write a function count_cells to count the number of blood cells in the image. One cell is a region of
adjacent white pixels. The function will be implemented in image.c
A pixel is adjacent to another pixel if there is a pixel of the same colour above, below, left or right of the
pixel. In the example image below there are 6 cells. The “blue cell” is composed of 5 pixels, the red cell
is composed of 6 pixels, the green cell is composed of 3 pixels (and has no pixels adjacent to the red
cell), and there are three pink cells of one pixel each.
The main function will call count_cells and will print a message to stdout using the following format
string:
“Number of Cells is %d\n”
You must include a statement using this exact format string to receive credit for your cell count. Be sure
to print this only once.
Exactly how you count these cells is up to you, but I’d like to direct your attention to the flood fill
algorithm, which is a recursive algorithm that recursively visits and somehow marks all adjacent pixels
starting at some seed pixel within (in our case) a cell. Recursion terminates whenever a pixel visited
does not have the same value as the seed pixel. This is how the paint-bucket works in any basic image
creation software.
If you use any other algorithm (e.g. authored and developed yourself) be sure to add comments. If you
decide to use some other algorithm described by someone else (note, you cannot copy any code in
any of these cases, but you can look up algorithms) be sure to include a clear citation in your code with
the author(s) names and how you accessed it.
Input
You probably want to create a few smaller image files so that you can verify that your algorithms are
working correctly, and so that you can test some edge cases. For example, here is one that I wrote: (I
will leave it to you to figure out how many cells are in this image).
4 7
0 0 255 255 0 0 0
0 255 0 0 255 255 0
0 255 0 0 0 0 0
255 255 0 0 0 0 0
There are a number of .bmp files and the resulting .txt files in /u/csc209h/fall/pub/a1 on teach.cs that
you can also use as input. Note that these aren’t that useful for testing, unless you want to count by hand
the number of cells in them.
What to submit
Add, commit, and push your image.c and count_cells.c files to the a1 directory in your repository.
Please refer to the Syllabus (https://q.utoronto.ca/courses/309775/pages/syllabus?wrap=1) to make sure
you are following the expectations for assignment submissions, mainly, never pushing your program’s
binaries to the repo and ensuring that it compiles.
CSC209 A2
1. Introduction
Archive managers are an important part of the overall *NIX and computer systems infrastructure.
Most of you have probably encountered zip , or perhaps rar and 7z files. They are the modern result
of programs that integrate multiple files into a single file known as an archive, and (in the case of the
examples I mentioned) also compress the resulting file into a smaller form than the starting files. The
program known as tar can trace its roots back to the original UNIX operating system, and stands for
tape-archive when magnetic tape (things like cassettes, sometimes in very large rolls) were used to
store backups instead of solid-state or spinning plate (hard) drives.
Unlike zip , rar and 7z files, tar files are not necessarily compressed, but they do allow you to store
files and directories (directory trees – as in directories with directories inside them, and so on), while also
preserving the usual metadata for files managed by a *NIX system, like permissions, owner and creation
dates. Check out the manual for tar if you are interested. Recall this means $ man tar – as usual.
You will be implementing a less fully featured program called kar that:
1. Creates an archive of the specified files/directories
2. Extracts files and directories from an existing kar file into the current working directory (overwriting
any existing files with the same names)
The challenge of the assignment will be maintaining a data-structure for multiple possible trees, not only
in memory, but to and from a single, linear file.
2. Submission files
Your final submission will consist of the same 6 files you received in your starter code, and nothing more.
You should not change the make file called Makefile the header kar_tree.h . Otherwise, you are free to
add as you please to kar.c , kar_tree.c , archive.c and extract.c .
Update, commit and push your repository with all of your work contained in the a2 folder. You should not
push any executables (including .o files) or testing files.
3. Feature sets
This assignment can be understood through progressively more difficult features that your final program
implements. This list is in the order we suggest that you implement things and includes a rough
percentage of the overall grade for the assignment:
1. 10% – Program features three commands: create , extract , and –help and a correct 8 message
( –help or error).
2. 15% – .kar archives storing a single file, and recovering the contents of this file e.g.
1. $ kar create a2.kar f1.txt successfully adds f1.txt to a (possibly new) archive a2.kar
2. $ kar extract a2.kar , if created with the previous example, will create the file f1.txt using the
archive a2.kar , overwriting any file with the same name in the current directory.
3. 10% – kar archive with multiple regular (non-directory) files.
1. $ kar create a2.kar f1.txt f2.txt successfully adds f1.txt and f2.txt to a (possibly new)
archive a2.kar
2. $ kar extract a2.kar , if created with the previous example, will create the files f1.txt and
f2.txt using the archive a2.kar , overwriting any files with the same name in the current
directory.
4. 15% – Creating a kar archive with a single directory, containing multiple regular files.
1. $ kar create a2.kar D1/ successfully adds the (non-empty) directory D1/ to a (possibly new)
archive a2.kar .
2. $ kar extract a2.kar , if created with the previous example, will create the directory D1/ using
the archive a2.kar , overwriting any folder with the same name in the current directory. Note that
the files contained within D1/ should also be extracted.
5. 20% – Creating a kar archive with a single directory with files and potential subdirectories (which
recursively could contain more subdirectories). Example is the same as previous feature, but may
also include directories within D1/
6. 10% – Creating a kar archive with multiple files and/or directories, such that any directory may have
recursive subdirectories.
1. This is the most general case of the previous two features. Creation might look something like
this: $ kar create a2.kar f1.txt D1/ , where D1/ contains f2.txt and D2/ , and D2/ contains
f3.txt .
2. $ kar extract a2.kar , if created with the previous example, will create the files f1.txt , f2.txt
and f3.txt using the archive a2.kar . Note that the directories D1/ and D2/ should also be
extracted, f1.txt should be in the current directory, f2.txt should be in D1/ , and f3.txt
should be in D2/ .
Notice that for each stage, we expect that you should be able to archive and extract the file(s) stored in
the archive. Furthermore, your kar program should be able to extract from archives created by another
solution that also implements the file format below and the solution should be able to extract from your
archives.
4. kar archive file format
Consider the following directory tree where each file starting with a capital D is a directory, and each file
starting with f is a regular file.
[dee@sputnik test]$ ls -R *
f0.txt
D2F5:
f1.txt D1
D2F5/D1:
f5.txt D2 f2.txt
D2F5/D1/D2:
f4.txt f3.txt
Suppose now we executed kar as follows within this directory:
$ kar create archive.kar f0.txt D2F5/
The resulting file’s bytes should represent the following structure to be a valid kar archive (we’ve
assumed left-right ordering within each directory has been preserved, this is not necessarily the case,
see 5).
Beginning of file (0x0)
Header for f0.txt
All of the contents of f0.txt
Header for D2F5
Header for f1.txt
Contents of f1.txt
Header for D1
Header for f5.txt
Contents of f5.txt
Header for D2
Header for f4.txt
Contents of f4.txt
Header for f3.txt
Contents of f3.txt
Header for f2.txt
Contents of f2.txt
You can find this exact file for you to download HERE
(https://q.utoronto.ca/courses/309775/files/27853248?wrap=1)
(https://q.utoronto.ca/courses/309775/files/27853248/download?download_frd=1) . Consider using hexdump
-C <archive.kar> to investigate the contents of your archives.
A more general description of the format is as follows: regular files feature a header immediately followed
by their contents. Directories only feature a header, but are immediately followed by the appropriate
representations of the files contained within them. This can include directories, which will follow their
format recursively i.e. to complete archiving a directory, means it should have a header followed by all of
the representations of the files contained within it, before moving on to the next file in the same directory
as the original directory (that we started with). Writing recursive information can be confusing, make sure
you understand this format early on! Talk with your peers about it.
5. The “headers”
The headers in each kar file are the structs created to represent each file in our actual program. In
other words, the literal bytes that comprise the data-structures in the programwill be stored in the archive
file.
The data-structure you will build consists of a linked list to represent all the files a user would like to
archive (the command line arguments), and then, for every file that is a directory, that node will
separately also link to another linked list for the contents of that directory… and so on… recursively.
This
should naturally lend itself to creating the file format we require. The order of the files within each
directory will not affect our programs’ operation or the correctness overall, so we will just use whatever
order the operating system provides the filenames to us in. This means, that the order from the example
in 4 may be slightly different (files in the same directory, might be in a different order), but the structure of
directory headers, followed by the file headers and data for that directory must be maintained.
Each node in our archive tree will represent one file (whether it is a directory or not), and will store some
properties of this file, but not the actual contents. The contents are written after a regular file’s header,
remembering that a directory has no contents in our scheme.
Look at the arch_tree_node struct . These structs can be connected to existing linked-lists through their
next_file pointer member, but also become the parent to a new linked list (that may branch into a tree)
when the particular node in question is a directory. You can do this through the dir_contents pointer.
6. Freeing memory
Grade percentage: 10% By the end of using kar for any purpose, you will need to free all memory that
you have allocated on the heap, anything else will be considered a memory leak and will not result in full
marks. Using valgrind will allow you to test for this particular problem.
Your main challenge here, will be freeing all the arch_tree_nodes successfully, and not losing track of
any of them. You might like to review PCRS Week 3: Dynamic Memory Video 5: Nested Data Structures
for a general review of what this means.
7. Error Handling
7.1. In general
You should check for errors resulting from system calls, like opening files or any memory you are
allocating. It is sufficient to simply check if the pointer returned by these functions is 0x0, in which case
there was an error. Your program should ultimately exit if malloc ever fails, but if file I/O fails it depends
on which command is being performed. See the example code in read_node for how this might look.
7.2. Creating archives
Failing to open a file (e.g. if it doesn’t exist) means you should just skip the file. If nothing was ultimately
added to the archive, some message (your choice) should be printed to stderr and kar should exit
with a value of 1.
7.3. Extracting archives
For the sake of simplicity we will only test extracting valid archives. A valid archive is one that conforms
to the 4 and has at least a single file archived within it. Thus, you do not need to handle empty files.
We will assume that you have the appropriate permissions for the current working directory, and do not
have to gracefully handle such an error. Feel free to do this if you like, but it is not required.
8. Usage
Your program should print the following usage string if the first argument to the program is –help , or
there was some sort of error processing arguments – this includes when the command specified was
none of the three expected options. This also specifies how your program should behave more generally.
You may print this message also at other times you deem reasonable.
Usage: kar [COMMAND] [ARCHIVE]
Creates or extracts files from the kar-formatted ARCHIVE.
COMMAND may be one of the following:
create [ARCHIVE] [FILES]
extract [ARCHIVE]
–help
create:
Creates an archive with the name specified in [ARCHIVE] consisting of the listed [FILES] which c
an include directories. Paths to the files are not preserved, each listed file is part of the top-le
vel of the archive. If [ARCHIVE] already exists, it is overwritten. Directories are added recursivel
y, such that all files within the directory are added to the archive.
extract:
Extracts the files from the [ARCHIVE] file into the current directory.
–help:
Prints this message and exits.
Note that the indents are constructed using 4 sequential spaces.
9. File I/O and Helpful system calls
Don’t forget to close files when you are done with them. If you have an open file pointer called file_ptr ,
it simply means adding close(file_ptr); .
9.1. Write Buffer
When archiving and extracting, we will need to be reading from one file and writing to another. We
expect you to do this in chunks, so that at the very least, the memory used by your program doesn’t fill
the entirety of the RAM – if you are archiving a large file. We have provided the constant
WRITE_BUFFER_SIZE for this exact purpose. You should read from the incoming file at most this much, then
write what was read to the outgoing file, and then repeat if necessary.
9.2. mkdir – making directories
We will use a new system call called mkdir available to us if we #include <sys/stat.h> . This function is
similar to saying $ mkdir <folder-name> , but differs insofar as it requires two arguments:
1. The name of the folder (as used at the command line)
2. A mode_t structure that describes the permissions for the file
The first argument should be straightforward, and the second can simply be the integer 750 . Recall that
permissions can be specified using an numeric format, see man chmod for more. Or look into Kerrisk Ch.
18 section 6 for more on mkdir .
9.3. opendir – getting a (DIR *)
Closing a directory is slightly different than other files, use closedir to make sure you’ve closed it. This
system call is largely like open for files, but is for directories.
9.3.1. readdir
Once you use opendir(<dir-name>); to get a DIR * . You can use the function readdir(<open-dirpointer-variable>) to get the name of each file in that directory. Each call to readdir will return a dirent
* , yet another type, representing the next file in the directory. When readdir returns 0x0 , there are no
more files to list. One thing you need to watch out for, is that whenever you read from a directory like
this, there will be two special files named ’.’ (for a link to the current directory) and ’..’ for a link to the
parent directory. You will need to skip these files when making your archive!
You do not need to learn anything about dirent structs besides that it has a member called d_name for
the name of each file. You can see much more about this topic in Kerrisk Ch. 18 section 8.
9.4. struct stat
We have provided you with part of a function called create_tree_node in archive.c . This little snippet
includes the use of a function called stat and it populates a struct with information about the open
file. Some things you can retrieve from this include the size of the file in bytes through the member
st_size , and whether or not it is a directory, using S_ISDIR(st.st_mode); (assuming your stat variable is
called st).
You can find more information in Kerrisk Ch. 15 section 1.
10. Some notes on when you’ll see some of this in class
As you progress through the weeks of the course, topics from class will help you complete more of the
assignment.
In week 4, we will be introducing the C struct , this will go a long way towards understanding how we
will represent and create the linked list. In the subsequent week, we will consider how you might write a
struct to a file.
11. Notes on style
Total grade percentage: 10%
The style grade is ultimately at the discretion of the TA responsible for grading your work. They will be
grading you according to the following principles:
1. Variable names are clear
2. Style such as indentation and bracketing is consistent throughout (caveat, the starter code need not
change)
3. Functions are never exceedingly long (as in long strings of code, performing multiple functions in a
script-like fashion)
4. Helper functions are used when needed
Note that comments are not a necessity, but a human will be reading your work.
12. P.S. A Final Note (perhaps once you’re done)
Notice how we’ve managed to really efficiently use the space of our archive in this program. There is
minimal wasted space (the filename buffer is just about it). Some food for thought though: how difficult
would it be to add a new file to this program? We very well might have to re-write most if not all of the
bytes of the archive file, since we’ve imposed such a strict ordering to where everything goes.
Think a bit about how you might create an archive that would be more amenable to contents that
change. From this point, you aren’t very far from creating your own filesystem – where the OS manages
the tree structure, and accepts system calls like fopen and mkdir to update the directory structure.
Ideally, the directory tree structure is stored on the memory device (e.g. disk), rather than taking up
space in memory (i.e. RAM).
Somehow, we would need to be much more flexible about the creation of
new files and making good use of space that might have been freed up by deleting other files. Whenever
you are reading the Kerrisk book, you might now have a better understanding of what motivates a topic
that frequently comes up: Inodes .
Author: Demetres (dee) Kostas, PhD
CSC209 A3
1. Learning Objectives
At the end of this assignment students will be able to
write a program that creates new processes
manage communication between processes using pipes
read and add to a medium-sized C project
2. Introduction
In this assignment, you’ll build a simple pipeline for manipulating bitmap image files by applying a set of
common filters on the images, which will allow the user to do things like turn the image to grayscale, blur
the image, and increase the image size.
Each image filter will be a separate program that reads in a bitmap image from its standard input,
calculates some transformation on the image’s pixels and possibly its dimensions, and then writes a
transformed bitmap image to its standard output.
Because we want to provide a convenient interface to the user to run several filters on a single image,
you will also create a “master” program that spawns a separate process for each filter the user specifies.
This program will also perform some basic process management, waiting for all of its child processes to
complete, and reporting any errors that occur.
3. Preparation
Do a git pull in your repository to find the starter code for assignment 3.
Make sure to read the entirety of the instructions as soon as possible, and note that the code also
contains TODOs for you to follow.
Also, we strongly suggest first Lab 5 if you haven’t yet done so. This will make sure you are familiar with
the bitmap file format we are using for this assignment.
3.1. Starter code overview
Here is a brief description of the files found in the starter code for this assignment. Please note that
unlike previous assignments, you will be creating brand-new files, not just adding to existing ones!
bitmap.h and bitmap.c : definitions of the main types and functions used in this assignment. Look for
TODOs here to fill in the required functionality for processing bitmap files.
copy.c : a skeleton file for the simplest image filter you’re writing. You’ll be completing this file, and
adding four more similar ones.
image_filter.c : a very incomplete “master” program that manages a pipeline of image filters. You
don’t have to worry about this file until Part 2.
Makefile : an incomplete makefile that you (and we) will use to compile the different executables in
this assignment. We’ve also provided a basic test target that you should extend to automate some
basic tests of your filters as you work through this assignment.
images/ : a directory containing a sample bitmap image file you can use for testing. Feel free to add
your own!
4. General C code guidelines
We have the following expectations for all source code you submit for this assignment:
1. Perform error-checking on system and library calls and use perror to print good error messages.
2. Explicitly free all dynamically-allocated memory before a program terminates.
3. Explicitly close all unused pipe ends, for every process.
4. Don’t modify/delete any of the provided functions/structs in the starter code unless we say that you
can.
5. Part 1: Image filters
Your first task is to complete a set of image filter programs, described below. Each filter will have the
same base structure, but will differ in the actual pixel value calculations they perform, and how many
pixels they need to buffer for their calculations.
5.1. Bitmap file format
We are using the same standard for bitmap images as Lab 5: 24-bit bitmap files whose width and height
are divisible by 4. From the bitmap header section, you will again need to access the header size and
image width and height. In addition to these three integers, you will also need to access the image file
size, which is an integer stored at offset 2 in the bitmap header.
Your first task should be to complete the read_header function in bitmap.c . This reads in the bitmap
metadata from standard input, which is necessary to complete all of the filters for this part of the
assignment.
5.1.1. Suggestion
Even without implementing the copy filter, you should be able to run the starter program and have the
header data be written to stdout. You can use this to check your read_header by redirecting this output to
a file, and comparing it against the given bitmap’s header (they should be the same).
You can use od (https://linux.die.net/man/1/od) to inspect each file separately (use -N 54 to inspect
just the first 54 bytes). Or, you can use cmp (https://linux.die.net/man/1/cmp) to compare the two files
byte-by-byte (use -n 54 ) to compare just the first 54 bytes.
5.2. Image filter structure
Unlike Lab 5, in which you read in a bitmap image from a file, here you will read in the data from
standard input. And because each program must write out a complete bitmap file, you must write out all
of the header data as well.
Because we expect these programs to be chained together with pipes, it would be rather inefficient to
always read in the entire image file (including all its pixel data), and then perform some computations on
the pixels, and the write out the entire output file. All of the filters you’ll be implementing are local filters,
meaning each computed pixel value will be based on at most a few surrounding pixels, and will not
require reading in more than a few pixel rows of the image before being able to start computing. To take
advantage of this, you’ll use implement simple buffering approach for each filter, which is loosely
described as follows:
1. Read in all of the header data, and extract the necessary image metadata described above.
2. Write out all of the header data for the transformed image file. In most cases, the output header will
be exactly the same as the input header; for the scaling transformations, the file size and image
dimensions will need to be updated.
3. Read in a number of pixels (specified for each filter below).
4. Compute the values for the transformed pixel(s), and write out these new pixels.
5. Repeat steps 3-4 until the entire input image has been processed, and all the new pixels written out.
Of course, the key questions are:
How do we compute the values of the transformed pixels?
How many pixels do we need to read in step 3 before computing the transformed pixel values?
These questions depend on the exact filter being used, and we’ll describe each one separately. Note that
we specify the exact executable name in each case; for all of them besides copy , you should create and
submit a new C source file that compiles to the required executable, and add the compilation recipe to
your Makefile.
5.3. Individual pixel filters: copy and greyscale
The two simplest filters operate on one pixel at a time. Your loop in this case can read in and write out
exactly one pixel in each iteration.
The copy filter leaves the pixels unchanged; this is a good one to get started, because you simply need
to read in one pixel and then immediately write it back out. Running
$ ./copy < images/dog.bmp > dog_copy.bmp
should product an exact copy of the original bitmap file.
The greyscale filter turns the image into a black-and-white version by transforming each pixel p into a
new pixel q , where the blue, green, and red values of the new pixel are all equal to the average of the
blue, green, and red values of the original pixel, rounded down to the nearest integer. You should do
normal integer arithmetic on the individual pixel fields to compute the average: (p.blue + p.green +
p.red) / 3 works just fine.
5.4. Basic image convolutions: gaussian_blur and edge_detection
The next two filters compute a transformation on a pixel p from the pixel values of p and the eight
pixels surrounding p , i.e., the 3-by-3 grid of pixels centred on p . For example, the pixel at position (2,
3) in the image is transformed based on the following nine pixel values:
(1, 2) (1, 3) (1, 4)
(2, 2) (2, 3) (2, 4)
(3, 2) (3, 3) (3, 4)
But because the pixels in a bitmap file are stored in rows, we require a few rows of pixel data to be read
in before any transformation can begin. The main loop for these two filters reads in pixels by row: at each
iteration, your program should be store exactly three rows of pixels from the original image, and using
these pixels to calculate and print the transformed values for every pixel in the middle row.
We have given you the code that computes the pixel transformations for you; they involve some
technical calculations that are beyond the scope of this course, but you can learn about in a course like
CSC320 (Introduction to Visual Computing). Instead, your main job here will be to manage the buffering
of the pixel data, and figuring out how to invoke the provided functions to compute the correct pixel
transformations. Read about what these functions expect in bitmap.h .
Note about boundaries: because each of these filters compute transformations based on each pixel’s
neighbours, the pixels at the very edge of the image pose a problem (they don’t have neighbours on one
or more sides). For simplicity on this assignment, resolve this simply by making every “edge” pixel have
the exact same transformed value as its “inner” neighbour. For example, the pixel at position (0, 5)
should have the same transformed value as the pixel (1, 5), and the pixel at position (0, 0) should have
the same transformed value as the pixel (1, 1).
Consider this as, using the pixel in the ring of values
immediately inside the edge, so, if the image width is 200, the pixel at position (199, 3) should have the
same transformed value as the pixel (198, 3). You may find the provided min and max macros helpful
for handling these boundary pixels elegantly.
If you’re interested in reading more about image convolutions, try the Wikipedia page
(https://en.wikipedia.org/wiki/Kernel_(image_processing)#Convolution) .
5.5. Scaling
The last filter is used to scale an existing image. It takes one command-line argument scale_factor ,
which must be an integer greater than 0. The image produced by this filter should be the same as the
original image, except its width and height are multiplied by the provided scale factor. You may assume
this command-line argument is always provided, and is always valid (no error-checking is required for
this).
The pixel at position (i, j) in the scaled image should equal the pixel at position (i / scale_factor, j
/ scale_factor) in the original image. Note that this is integer division, which rounds down. (This is the
simplest form of scaling, and it doesn’t produce very smooth-looking images. There are more complex
algorithms for scaling you’ll learn about in courses like CSC320.)
In order for the output bitmap to be valid, you’ll need to update three parts of the bitmap header before
writing it to stdout: the image width, image height, and the total file size. Note that the header’s size does
not change with this transformation, and so you can compute the new file size by adding the header size
to the total space required to store all the pixels in the image.
Note: due to the structure of the provided run_filter function, you’ll need some way to access the input
scale factor when calculating and writing the transformed pixels. There are a few different ways to do this
(feel free to choose from the following):
1. Save this value in a global variable in your scale.c source file.
2. Modify the Bitmap struct to store this data.
3. Do not use run_filter at all, and instead use your own modified version that makes the scale factor
more accessible to the filter itself. This might require updating bitmap.h as well.
5.5.1. Suggestion
We strongly encourage you to test each image filter separately as you complete it. At the bottom of the
starter Makefile , we’ve provided a target test with a simple example command of one of the filters. If
you add more commands to this rule, you’ll have a convenient way to run simple tests on different
executables, just by running make test !
You should even be able to run multiple filters in a pipeline using standard shell syntax, e.g.
$ ./gaussian_blur < dog.bmp | ./gaussian_blur | ./gaussian_blur | ./greyscale | scale 2 > images/dog
_piped.bmp
which should produce a big, blurry, grey dog. This is the ultimate goal of Part 1.
Here are sample outputs for each of the non- copy filters run on the provided bitmap file, greyscale, blur,
edges and scaling (by 2) respectively:
(a3/images/dog_grey.bmp)
(a3/images/dog_edges.bmp)
You should be able to right click and save the original images.
6. Part 2: Managing multiple filter processes
The second part of this assignment is to complete the program image_filter , which takes at least two
command-line arguments:
1. Its first two command-line arguments specify filename for the input and output images, respectively.
2. The remaining arguments each specify one of the five image filter programs you developed in Part 1.
If the user wants to run the scaling filter, this must be specified in a command-line argument that
contains “scale” and a scale factor, e.g. “scale 2” .
If no arguments are provided, a single copy filter is performed.
image_filter is responsible for doing the following:
1. Create one new process for each filter specified by the command-line arguments.
2. Connect the processes properly to each other (and to the input and output files) using pipes.
Remember that each filter process can only communicate using its stdin/stdout, meaning that dup2
must be used to redirect each of their stdins and stdouts.
The filters must process the image in the same order the command-line arguments are specified by
the user.
3. Wait for all processes to complete, and print a success message to the user. If one or more of the
processes failed (exit with non-zero status), print a warning message to the user. We’ve provided
string constants for these messages for you in the starter code.
For example, the command
$ ./gaussian_blur < dog.bmp | ./gaussian_blur | ./gaussian_blur | ./greyscale | ./scale 2 > images/d
og_piped.bmp
should result in the same behaviour (except the output message) as
$ ./image_filter dog.bmp images/dog_piped.bmp ./gaussian_blur ./gaussian_blur ./gaussian_blur ./grey
scale “./scale 2”
7. Submission
Commit all your code to your repository. Running make should produce six executables: the five image
filters copy , greyscale , gaussian_blur , edge_detection , and scale (check their names carefully!) and
the master image_filter executable. You are welcome to commit sample test files that you used.
Note: it is generally not good practice to commit the .o or executable files to your repository, as these
files should be automatically generated from your source code. If you accidentally committed such files,
please remove them from your repo ( git rm ) before your final submission.
Original author credit to: David Liu
CSC209 A4
In this assignment, you will build a small webserver that allows users to run the image filters you created in Assignment 3 on custom images using only their web browser.
Learning Objectives
At the end of this assignment students will be able to write a program that communicates with other processes across a network using sockets parse parts of HTTP requests write a program that creates new processes read, write, and manipulate plaintext and binary data read and add to a medium-sized C program
Introduction
A web server is a program that uses sockets to listen for HTTP requests, and constructs an appropriate HTTP response. The response often contains a HTML document that the browser renders on the screen, althought you will also see other types of responses in this assignment. Take a look at main.html in the starter code to see an example of data that the server will send back to the client when the client sends a request. You can see how your web browser will render this data by using the Open File option in your web browser (E.g. Chrome, Firefox) to open main.html .
when you open the file like this, the dropdown menu beside “Select an image” is empty. That’s because the server fills in this image list based on the current contents of the images directory using a bit of JavaScript. For this assignment, we are simplifying (and hard-coding) some aspects of a web server and the HTTP protocol. In particular, your the server only considers the first line of the HTTP request that contains the type of request (GET or POST), and the resource requested.
NOTE 1: We strongly recommend that you only use Chrome or Firefox to test your work on this assignment. We tested the starter code on these two browsers. NOTE 2: It is important that you kill image_server program (especially on the lab machines) when you are not working on it. Although we have tried to make the code safe, we cannot guarantee that there are no security holes in this application.
Setup and starter code
Do a git pull in your repository; you will see the starter code under the new a4/ directory.
Like previous assignments, there’s quite a lot of provided code to read through and understand. Here is an overview of the starter code to help you get started. Source code: image_server.c is the main program. It contains the central handle_client function, and a main to actually run the program. request.h , request.c : functions to handle the parsing of requests. request.h also defines the key structs used to store data for these requests. response.h , response.c : functions to handle how the server should respond to requests (i.e., what data the server should write back to the client). socket.h and socket.c contain the basic server socket code from lecture. You shouldn’t need to change these files. Makefile : a sample makefile.
Running make will create and populate the images/ and filters/ directories required by the server, in addition to compiling the image_server executable. Other files: copy : this is an executable in case you did not complete Assignment 3 successfully. You’ll be able to use this executable to test your work on the Teaching Lab servers, but it likely won’t run on your machine. Of course, you can ignore this file completely and simply use your own version from Assignment 3. dog.bmp : a sample bitmap file. main.html : an HTML file that acts as the “welcome” page for the server. Users can use this page to submit other types of request to the server.
NOTE: Be careful which files you commit and push to your repo. In particular, you should not commit any executable files, image files, or .o files to your repo. Lab 10 and port number setup We strongly recommend working on this assignment after completing Lab 10 (https://www.teach.cs.toronto.edu/~csc209h/fall/labs/lab10-sockets.html) .
You can use any code you wrote on Lab 10 for this assignment. The first thing you should do is change the PORT number in the Makefile in the same way as Lab 10; this will help you avoid port conflicts with other students on the Teaching Lab machines.
Part 1: Parsing the first line of an HTTP request
When you type a URL into your browser, it will send a request to the server. The URL to request that the main.html page be displayed is https://localhost:/main.html if you are running the web browser on the same machine as the server, and if you replace with the port number that your server is listening on. If you want to run the server on wolf.teach.cs.toronto.edu and the web browser on your own laptop or desktop, you can use the URL https://wolf.teach.cs.toronto.edu:/main.html .
This causes your browser to send a request to the server that looks something like the following: (Note that all lines use the CRLF line endings (“\r\n”).) GET /main.html HTTP/1.1 Host: localhost:3000 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:56.0) Gecko/20100101 Firefox/56.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Connection: keep-alive Upgrade-Insecure-Requests: 1 Cache-Control: max-age=0 For most of this assignment, we only care about the first line of the HTTP request and will ignore all of the other data in the HTTP request.
Understanding the start line of an HTTP request
The image server we are building can respond to different kinds of requests: an initial one to load the main HTML page, a request to upload a bitmap file, and a request to run a filter. This data is encoded in the first line (called start line) of an HTTP request in the format HTTP/1.1 where:
A4 https://q.utoronto.ca/courses/309775/assignments/1187795 4/8 is either GET or POST consists of a forward slash ( / ) and resource name (e.g., main.html ). If the method is GET , the resource name can be optionally followed immediately by a ? and then zero or more name-value pairs (called query parameters), each separated by & . Each name-value pair is written in the form = ; name and values are always string. does not have any spaces. Moreover, the resource name, and query parameter names and values all do not contain ? , & , and = .
Some examples of HTTP request start lines are: GET / HTTP/1.1 GET /main.html HTTP/1.1 GET /my-resource HTTP/1.1 POST /david HTTP/1.1 GET /my-resource?name1=value1&name2=value2 HTTP/1.1 (Note: this is a simplified version of the full format of this line. Read more about the full format here (https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html) .)
Your first task is to complete the implementation of handle_client up to and including the parsing of the first line of the HTTP request according to the format above, including the implementation of parse_req_start_line and some helper functions in request.c .
Note that even though the client buffer is not null-terminated automatically, all values stored in the ReqData struct must be null-terminated strings. We’ve included a function log_request that you can use to check the values of these strings. To debug your work, you can run your server and then visit the url localhost:/main.html? name1=value1&name2=value2 in a web browser of your choice (where you should replace with your port number, configured above).
Even though you won’t see anything in the web browser, if you go to your running server, you should see the following output from log_request : Request parsed: [GET] [/main.html] name1 -> value1 name2 -> value2 Try out different URLs to ensure your parsing is working properly.
Part 2: String responses
Now that you have parsed the request start line, you can begin writing the code that will allow the server to respond to different requests. Your next task is to add to the implementation of handle_client according to its comments so that it does the following: When given a GET request with resource name MAIN_HTML (this is a defined constant), it renders the provided main.html page. Any query parameters and the rest of the HTTP request are ignored. On any other request, it renders the provided “Not Found” string.
Note that you aren’t responsible for writing any of the actual response text yourself; instead, read the provided starter code carefully to understand the functions we’ve provided, and call them appropriately to complete this part. To check your work, you should try visiting localhost:/main.html and localhost:/randompage.html in your web browser. In the first case, you should see a plain-looking webpage with the title “CSC209: Image Filter Server”; in the second, you should see the message “Page not found”.
Note: the provided code illustrates two types of responses: main.html is rendered as an HTML response, while the “Page not found” message is rendered as a plain text response. Your web browser probably displays them quite differently; cool, eh?
Part 3: Running an image filter
The main.html page has two different forms the user may use to interact with the server. The first form allows the user to select an image filter executable and a file name. When the user presses the “Run filter” button, a GET request is sent with target with resource name /image-filter , and two query parameters filter , whose value is the name of the filter to apply, and image , whose value is the filename of the bitmap image to process.
For example: GET /image-filter?filter=copy&image=dog.bmp Your task here is to implement the image_filter_response function and call it appropriately in handle_client when given a request. Once you have this working, you should be able to submit a request by pressing the “Run filter” button and download the filtered bitmap image! Input validation The “filter” dropdown in the form uses preset options for the different filters from Assignment 3 (except scale , which we avoided because it takes a command-line argument).
To make them all accessible, you can simply copy-and-paste your executables from Assignment 3 into the a4/filters/ directory. (But even if you didn’t complete Assignment 3, you can test your work here by using the provided copy executable.) The “image” dropdown is populated dynamically with the names of the files in the a4/images/ directory, which the provided Makefile populates with dog.bmp . If users always stick to using the form, then the server always receives valid requests to image-filter ; the problem lies in the fact that query parameters in a GET request are very easy to change.
For example, there’s nothing stopping the user from making the following request (typing it directly into their browser’s address bar): GET /image-filter?filter=hahaha&image=nodog.bmp
To get you thinking about this, we’re asking you to make the following checks for the query parameters in your image_filter_response implementation: 1. Both query parameters “filter” and “image” must be present. 2. Neither value can contain a slash character ‘/’ . 3. The filter value must refer to an executable file under a4/filters/ . 4. The image value must refer to a readable file under a4/images/ . If any of these conditions are violated, you should send a “bad request error” response back to the client.
Part 4: Uploading files
The second form on the main.html page enables users to upload their own bitmap files directly to the server. The previous two types of requests could be completed just by looking at their start lines, and ignoring the rest of the request data. However, for uploading files the entire request must be read in, as the actual bitmap file data sent by the client is contained in the body of the request, not just its first line. Here is an example of the format of this request if we upload toronto.bmp (toronto.bmp) .
Remember that all line endings use network newlines, \r\n . POST /image-upload HTTP 1.1 Content-Type: multipart/form-data; boundary=—7353 —–7353 Content-Disposition: form-data; name=”bitmap”; filename=”toronto.bmp” Content-Type: image/bmp —–7353– The important components are: The Content-Type: multipart/form-data header line ends with the boundary string (“—7353” in the above example), which immediately follows “boundary=”. This is used to separate sections of the request.
The characters “—–7353” (comprised of “–” plus the boundary string), which always occurs on its own line, begins the start of the form data storing the uploaded bitmap file. We’ll call this the bitmap data section. The first line of the bitmap data section contains name-value pairs. The last name-value pair is always of the form filename=”” , where is the filename of the uploaded file. The actual bitmap file data begins of the fourth line of the bitmap data section.
Note: because the file data itself may contain bytes corresponding to \r\n , we can’t can’t just search for the next “\r\n” to find the end boundary line. The sequence of characters “\r\n—–7353–\r\n” marks the end of the bitmap data (and in fact the end of the request data).
This is quite a bit, and in fact we’ve given you code that does a lot of parsing of this already. First, read through image_upload_response in response.c , which is the main function that handles this type of request. It makes use of three main helpers in request.c to parse the request and actually save the file: get_boundary and get_bitmap_filename extract the boundary string and filename of the bitmap file uploaded, respectively. save_file_upload is responsible for actually reading in the image data from the request body, and saving it into a file on the server.
Your task is to implement this function. More concretely, when this function is called, the bitmap filename has just been parsed, and the line containing it removed from the buffer. The remaining part of the request has the following structure (don’t forget about the network newlines \r\n ): Content-Type: image/bmp —–7353– Your task is to extract just the bitmap data from this request and write it to a file.
Buffering the data The tricky part here is that the entire bitmap data does not fit into the buffer! This means that you’ll need to use a loop that continually reads new data from the socket into the buffer, and then writes that data to the file. (This is similar to the copy filter from A3 except that you should read the data in chunks rather than one byte at a time.) The challenge is that you need to determine where the bitmap data stops.
You can take either of the following approaches for this assignment: 1. You may assume that the uploaded bitmap file is valid, and so extract the file size from the data to determine the exact number of bytes to read. 2. You may assume that the HTTP request is valid, and so keep reading until the sequence of characters representing the final boundary string (“\r\n—–7353–\r\n” in our above example) is detected. Submission Commit all your code to your repository. Running make image_server should produce the single executable image_server .
Our tests will be responsible for providing correct image filter executables and setting up the images/ and filters/ directories. Note: it is generally not good practice to commit the .o or executable files to your repository, as these files should be automatically generated from your source code. If you accidentally committed such files, please remove them from your repo ( git rm ) before your final submission.
Finally, give yourself a pat on the back for completing the last assignment in CSC209! And show your non-CS friend or your parents what you accomplished!

