CSC369 Assignment 1 to 4 solutions


CSC369 A1: Cooperative Threads

 Cooperative, User-Level Thread Package

In this assignment, you will create a library of functions that define a user-level threads package. Using your library, a program will be able to create threads, destroy them, and run them concurrently on a single processor.

1. Background on Threads
2. Using Threads
3. Cooperative Threads API
4. Solution Requirements
5. Implementation Details
6. Hints and Tips
7. Setup and Submission

Background on Threads

Threads and processes are key abstractions for enabling concurrency in operating systems. To gain a deeper understanding of how these abstractions are constructed, you will build the core of a user-level threads package. Implementing kernel-level threads and processes is not much different, but we’ll do this assignment at user level since (a) installing new OS kernels on the teach.cs machines is problematic and (b) debugging user-level code is much easier than debugging kernel-level code.

Threads provide the illusion that different parts of your program are executing concurrently. In the de facto standard model of multithreaded execution, threads share the code, heap, and the runtime system. Each thread, however, has a separate stack and, naturally, a separate set of CPU registers.

This programming model also provides synchronization primitives so that different threads can coordinate access to shared resources, but we will leave those concerns for the next assignment.

User-Level vs. Kernel Threads

For practical reasons, this assignment is done at user level: you will construct user threads by implementing a set of functions that your program will call directly to provide the illusion of concurrency. In contrast, modern operating systems provide kernel threads, and a user program invokes the corresponding kernel thread functions via system calls. Both types of threads use the same core techniques for providing the concurrency abstraction; you would build kernel threads in essentially the same way you build user threads in this assignment.

Kernel processes are also built using these techniques. However, there are a few differences between kernel and user threads:

Multiprocessing

User-level threads provide the illusion of concurrency, but on machines with multiple processors kernel-level threads can provide actual concurrency. With user-level threads, the OS schedules the user process on one CPU, and the user-level threads package multiplexes the kernel thread associated with the process between one or more user-level threads. With kernel-level threads, the OS is aware of the different (kernel) threads, and it can simultaneously schedule these threads from the same process on different processors. A key simplifying assumption for this assignment is that you will allow programs to multiplex some number (e.g., m) of user-level threads on one kernel thread. This means that at most one user-level thread is running at a time and that your runtime system has complete control over the interleaving of user-level threads with each other. More sophisticated systems implement m-on-n threads packages, where m user-level threads are multiplexed across n kernel threads.

Asynchronous I/O

When a user-level thread makes a system call that blocks (e.g., reading a file from disk), the OS scheduler moves the process to the Blocked state and will not schedule it until the I/O has completed. Thus, even if there are other user-level threads within that process, they also have to wait. In contrast, when a kernel thread blocks for a system call, the OS scheduler may choose another kernel thread from the same process to run. Thus, some kernel threads may be running while others are waiting for I/O.

Timer interrupts

The OS scheduler uses timer interrupts to preempt a running kernel thread and switch the CPU to a different runnable kernel thread. Similar to blocking system calls, this stops the execution of all user-level threads in the process until the kernel thread is scheduled to run again. However, to switch between user-level threads that are multiplexed on a single kernel thread, we cannot rely on timer interrupts (those are delivered to the OS, and not to our thread library runtime).

Instead, you will implement a cooperative threading model, where the currently running user-level thread must explicitly yield the processor to another user-level thread by calling a function provided by your library. In the next assignment, we will simulate timer interrupts that cause the scheduler to switch from one thread or process to another by using POSIX signals.

In your implementation, the threads library will “turn off interrupts” by blocking delivery of these signals using system calls. However, there is nothing to prevent the threads, themselves, from “turning off interrupts” the same way. Thus, even though we will implement “preemptive” threads, a “malicious” thread could turn off interrupts and not be preempted until it calls yield, thus hogging the CPU. Note that kernel schedulers don’t have this problem. Only the privileged code in the kernel can turn off the real timer interrupts.

Using Threads

With your threads library, a typical program will look something like this:

```c
int main(int argc, char **argv)
{
        // Some initialization
        // Create some threads
        // Wait for threads to finish
        // Exit
}

// "main" function for thread i
thread_main_i(...)
{
        // Do some work
        // Yield
        // Do some more work
        // Return (implicit thread exit)
}
```

Here thread_main_i is a programmer-supplied "main" function that the i-th thread starts executing when it is first scheduled to run after its creation. (Note that different threads may have the same "main" function.) The thread can perform useful work by calling any other functions in the program, or voluntarily yielding to other threads. A thread exits either explicitly or implicitly. It exits explicitly when it calls the thread_exit function in the thread library. It exits implicitly when its thread_main function returns. Additionally, to add more control to the program, a thread may call thread_kill to force other threads to exit as well.

Cooperative Threads API

A key simplifying assumption in this assignment is that the threads are cooperative, i.e., each thread runs until it explicitly releases the CPU to another thread by yielding or by exiting. In contrast, preemptive threading systems allow a scheduler to interrupt a running thread at any time and switch the CPU to running a different thread. The thread package provides several functions to allow application programs to perform thread management. In addition, there are a few conventions that application programs must follow to ensure proper and safe operation.

A list of the functions that constitute the user-level threads API can be found in the thread.h file. The functions that you will be implementing for this assignment are summarized here:

void thread_init(void): You can use this function to perform any initialization that is needed by your threading system. Here, you should also hand-craft the first user thread in the system. To do so, you should configure your thread state data structures so that the (kernel) thread that is running when your program begins (before any calls to thread_create) will appear as the first user thread in the system (with tid = 0). You will not need to allocate a stack for this thread, because it will run on the (user) stack allocated for this kernel thread by the OS.

Tid thread_id(): This function returns the thread identifier of the currently running thread. The return value should lie between 0 and THREAD_MAX_THREADS-1 (inclusive). See Solution Requirements below.

Tid thread_yield(Tid to_tid): This function suspends the caller and activates the thread given by the identifier to_tid. The caller is put on the ready queue and can be run later in a similar fashion. A reasonable policy is to add the caller to the tail of the ready queue to ensure fairness (so all other threads are run before this thread is scheduled again – see the THREAD_ANY argument below).

The value of to_tid may take the identifier of any available thread. It can also take either of the following constants:

THREAD_ANY: tells the thread system to run any thread in the ready queue. A reasonable policy is to run the thread at the head of the ready queue, which ensures fairness. This policy is called FIFO (first-in, first-out), since the thread that first entered the ready queue (among the threads that are currently in the ready queue) is scheduled first.

THREAD_SELF: tells the thread system to continue the execution of the caller.

This function could be implemented as a no-op, but it may be useful to explicitly switch to the current thread for debugging purposes. The thread_yield function returns the identifier of the thread that took control as a result of the function call. Note that the caller does not get to see the result until it gets its turn to run (later).

The function may also fail, in which case the caller continues execution immediately. To indicate the reason for failure, the call returns one of these constants:

THREAD_INVALID: alerts the caller that the identifier to_tid does not correspond to a valid thread.

THREAD_NONE: alerts the caller that there are no more threads, other than the caller, that are available to run, in response to a call with to_tid set to THREAD_ANY.

Tid thread_create(void (*fn)(void *), void *arg): This function creates a thread whose starting point is the function fn.

The second argument, arg, is a pointer that will be passed to the function fn when the thread starts executing. The created thread is put on a ready queue but does not start execution yet. The caller of the thread_create function continues to execute after thread_create returns. Upon success, thread_create returns a thread identifier of type Tid.

If thread_create fails, it returns a value that indicates the reason for failure as follows:

THREAD_NOMORE: alerts the caller that the thread package cannot create more threads. See Solution Requirements below.

THREAD_NOMEMORY: alerts the caller that the thread package could not allocate memory to create a stack of the desired size. See Solution Requirements below.

void thread_exit(int exit_code): This function ensures that the current thread does not run after this call, i.e., this function should never return. If there are other threads in the system, one of them should be run. If there are no other threads

(this is the last thread invoking thread_exit), then the program should exit with the supplied exit_code. A thread that is created later should be able to reuse this thread’s identifier, but only after this thread has been destroyed. (Note that we will be making more use of the exit_code in the next assignment.)

Tid thread_kill(Tid victim): This function kills another thread whose identifier is victim.

The victim can be the identifier of any available thread. The killed thread should not run any further and the calling thread continues to execute. Upon success, this function returns the identifier of the thread that was killed. Upon failure, it returns the following: THREAD_INVALID: alerts the caller that the identifier victim does not correspond to a valid thread, or is the current thread.

Solution Requirements

The first thread in the system (before the first call to thread_create) should have a thread identifier of 0. Your threads system should support the creation of a maximum of THREAD_MAX_THREADS concurrent threads by a program (including the initial main thread).

Thus, the maximum value of the thread identifier should be THREAD_MAX_THREADS - 1 (since thread identifiers start from 0). Note that when a thread exits, its thread identifier can be reused by another thread created later. The thread_id() function should run in constant time (i.e., the time to find the TID of the current thread should not depend on the number of threads that have been, or could be, created). Your library must maintain a “thread control block” (a thread structure) for each thread that is running in the system.

This is similar to the process control block that an operating system implements to support process management. In addition, your library must maintain a queue of the threads that are ready to run, so that when the current thread yields, the next thread in the ready queue can be run. Your library allows running a fixed number of threads (THREAD_MAX_THREADS threads), so if it is helpful, you could allocate these structures statically (e.g., as a global array).

Each thread should have a stack of at least THREAD_MIN_STACK bytes. Your implementation must not statically allocate all stacks at initialization time (e.g., using a global data structure). Instead, you must dynamically allocate a stack (e.g., using malloc()) whenever a new thread is created (and delete one each time a thread is destroyed.) Your library must use getcontext and setcontext to save and restore thread context state (see Implementation Details below), but it may not use makecontext or swapcontext or any other existing C library code to manipulate a thread’s context; you need to write the code to do that yourself.

Your code must not make calls to any existing thread libraries (e.g., Linux pthreads), or borrow code from these libraries for this assignment. Do not use any code from other students, or any code available on the Internet. When in doubt, please ask us.

Implementation Details

Thread Context

Each thread has per-thread state that represents the working state of the thread — the thread’s program counter, local variables, stack, etc. A thread context is a subset of this state that must be saved/restored from the processor when switching threads. (To avoid copying the entire stack, the thread context includes a pointer to the stack, not the entire stack.)

Your library will store the thread context in a per-thread data structure (this structure is sometimes called the “thread control block”). Think carefully about what you need to include in your thread control block structure, and how these structures will be used to create and manage your threads. Consider how you will implement your ready queue. Remember that the thread_yield function allows a thread to yield the CPU to a specified thread, so you may need to remove a thread from the middle of the ready queue.

Saving and Restoring Thread Context

When a thread yields the CPU, the threads library must save the current thread’s context, which contains the processor register values at the time the thread yields the CPU. The library restores the saved context later when the thread gets its turn to run on the processor again. Additionally, the library creates a fresh context and a new stack when it creates a thread.

Fortunately, the C runtime system allows an application to retrieve its current context and store it in a memory location, and to set its current context to a predetermined value from a memory location. Your library will make use of these two existing C library calls: getcontext and setcontext. Study the manual pages (http://linux.die.net/man/2/setcontext) of these two calls. Notice that getcontext saves the current context into a structure of type struct ucontext_t, which is typedef’d as type ucontext_t. So, if you allocate a ucontext_t and pass a pointer to that memory to a call to getcontext, the current registers and other context will be stored to that memory.

Later, you can call setcontext to copy that state from that memory to the processor, restoring the saved state. (Hint: You almost certainly want a ‘ucontext_t’ as part of your thread control block data structure.) The struct ucontext_t is defined in /usr/include/x86_64-linux-gnu/sys/ucontext.h on the teach.cs servers. Look at the fields of this struct in detail, especially the uc_mcontext and the uc_sigmask fields. You will use getcontext and setcontext in two ways.

First, to suspend a currently running thread (to run another one), you will use getcontext to save its state and later use setcontext to restore its state. Second, to create a new thread, you will use getcontext to create a valid context, but you will leave the current thread running; you (the current thread, actually) will then change a few registers in this valid context to initialize it as a new thread, and put this new thread into the ready queue; at some point,

the new thread will be chosen by the scheduler, and it will run when setcontext is called on this new thread’s context.

Modifying Thread Context

As noted above, when creating a thread, you can’t just make a copy of the current thread’s context (using getcontext). You also need to make a few changes to initialize the new thread: You need to change the saved program counter register in the context to point to a stub function, described below, which should be the first function the newly created thread runs. You need to change the saved argument registers, described below, in the context to hold the arguments that are to be passed to the stub function.

You need to allocate a new per-thread stack using malloc. You need to change the saved stack pointer register in the context to point to the top of the new stack. (Warning: in x86-64, stacks grow down!) In the real world, you would take advantage of an existing library function, makecontext, to make these changes to the copy of the current thread’s context. The advantage of using this function is that it abstracts away the details of how a context is saved in memory, which simplifies things and helps portability. The disadvantage is that it abstracts away the details of how a context is saved in memory, which might leave you unclear about exactly what’s going on.

In the spirit of “there is no magic”, for this assignment you should not use makecontext or swapcontext. Instead, you must manipulate the fields in the saved ucontext_t directly. The T2 tutorial exercise will help you understand the ucontext structure.

The Stub Function

When you create a new thread, you want it to run the thread_main function that defines the work you want the thread to do. A thread exits implicitly when it returns from its thread_main function, much like the main program thread is destroyed by the OS when it returns from its main function in C, even when the main function doesn’t invoke the exit system call.

To implement a similar implicit thread exit, rather than having your thread begin by running the thread_main function directly, you should start the thread in a “stub” function that calls the thread_main function of the thread (much like main is actually called from the crt0 stub function in UNIX). In other words, thread_create should initialize a thread so that it starts in the thread_stub function shown below. When the thread runs for the first time, it will execute thread_stub, which will call thread_main. If the thread_main function returns, it will return to the stub function, which will call thread_exit to terminate the thread.

```c
/* thread starts by calling thread_stub. The arguments to thread_stub are the
 * thread_main() function, and one argument to the thread_main() function. */
void
thread_stub(void (*thread_main)(void *), void *arg)
{
        thread_main(arg); // call thread_main() function with arg
        thread_exit(0);
}
```

In the above code, the argument thread_main is a pointer to the thread_main function that describes the real work the thread should do. Notice that in C, a function’s name refers to the address of its code in memory.

The second argument to thread_stub (arg) is the argument to pass to the thread_main function. We’ll have the thread_main function take an argument that is a pointer to an arbitrary type so that you can pass it whatever you want.

Contexts and Calling Conventions

The ucontext_t structure contains many data fields, but you only need to deal with four of them when creating new threads: the stack pointer, the program counter, and two argument registers. Other than that, you don’t need to worry about the fields within the context variable, as long as you do not tamper with them.

Also, it is a good idea to initialize any context variable with a getcontext call before modifying it, so that the fields you do not touch hold sensible values; otherwise you may see bizarre behavior. Under the POSIX C calling conventions in x86-64 (http://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI) , here’s what things look like while any given function is executing:

Notice that while a procedure executes, it can allocate stack space by moving the stack pointer down (stack grows downwards). However, it can find local variables, parameters, return addresses, and the old frame pointer (old %rbp) by indexing relative to the frame pointer (%rbp) register because its value does not change during the lifetime of a function call. When a function needs to make a function call, it copies the arguments of the “callee” function (the function to be called) into the registers shown on the right in the x86-64 architecture.

For example, the %rdi register will contain the first argument, the %rsi register will contain the second argument, etc. Then the caller saves the current instruction pointer (%rip) into the stack (shown as “return address” in the figure), and changes the instruction pointer to the callee function. At this point, the stack pointer (%rsp) points to the return address (shown in the figure). Note that the stack pointer points to the last pushed value in the stack.

The callee function then starts executing. It first pushes the frame pointer value of the caller function (shown as old %rbp) onto the stack, and then sets the frame pointer (%rbp) to the current stack pointer (%rbp = %rsp), so that it points to the old frame pointer value. Then the callee function decrements the stack pointer (shown as %rsp), and uses the space between the frame pointer and the stack pointer for its local variables, for saving or spilling other registers, etc.

As an example, these three steps are performed by the first three instructions (push, mov and sub) in the main function shown below. The callee locates its variables, parameters, and the return address, by using addresses relative to the fixed frame pointer (%rbp). To return to the caller, a procedure simply copies the frame pointer (%rbp) to the stack pointer (%rsp = %rbp), effectively releasing the current frame. Then it pops the top stack item into %rbp to restore the %rbp of the caller function, and then uses the ret instruction to pop the old instruction pointer off the stack into the instruction register (%rip), returning control to the caller function.

These steps are performed by the last two instructions (leaveq, retq) in the main function shown below. This is a gdb listing for the test_basic program that we will provide you for this assignment. Run gdb on it (or on any program that you have compiled on the lab machines), as shown below, to see the instructions at the start and the end of a function (e.g., main function). Make sure that you understand what these instructions are doing, and are able to answer the questions in the listing below. $ cd threads $ gdb test_basic … (gdb) b main Breakpoint 1 at 0x4009ab: file test_basic.c, line 7. (gdb) run … Breakpoint 1, main (argc=1, argv=0x7fffffffe868) at test_basic.c:7 7 thread_init(0); (gdb) disassemble main Dump of assembler code for function main: 0x000000000040099c <+0>: push %rbp 0x000000000040099d <+1>: mov %rsp,%rbp 0x00000000004009a0 <+4>: sub $0x10,%rsp 0x00000000004009a4 <+8>: mov %edi,-0x4(%rbp) // what is happening here? 9/20/23, 6:06 PM A1: Cooperative Threads https://q.utoronto.ca/courses/315157/assignments/1136988#background 10/12 0x00000000004009a7 <+11>: mov %rsi,-0x10(%rbp) // what is happening here? => 0x00000000004009ab <+15>: mov $0x0,%edi 0x00000000004009b0 <+20>: callq 0x401edc 0x00000000004009b5 <+25>: mov $0x0,%eax 0x00000000004009ba <+30>: callq 0x4009dc 0x00000000004009bf <+35>: mov $0x0,%eax 0x00000000004009c4 <+40>: leaveq 0x00000000004009c5 <+41>: retq End of assembler dump. One complication with the Posix C x86-64

(http://en.wikipedia.org/wiki/X86_calling_conventions#List_of_x86_calling_conventions) calling convention is that it requires the frame pointer %rbp to be aligned to 16 bytes. This byte alignment (http://en.wikipedia.org/wiki/Data_structure_alignment) means that the value of %rbp, or the stack location (stack address) to which %rbp points, must be a multiple of 16. Otherwise, system libraries may crash (http://web.cecs.pdx.edu/~apt/cs322/x86-64.pdf) .

When you are creating a new thread that will execute the thread_stub function, you will need to set up the stack so that this calling convention is followed. In particular, you will need to think about the byte alignment for the stack pointer and the frame pointer when control is transferred to thread_stub.

Setup and Submission

Log in to MarkUs (https://markus.teach.cs.toronto.edu/2022-09/) and go to CSC369. You will find the starter files for this assignment on MarkUs. Click the button to ‘Add starter files to repository’, then clone your repository where you want to do your work. You should find the starter code below the A1/ subdirectory of your cloned repository.

Build the starter code by typing make in the A1/ directory. You should only modify the thread.c file in this assignment. You can find the files you have modified by running the git status command. You can commit your modified files to your local repository as follows:

```
git add thread.c
git commit -m "Committing changes for Assignment 1"
```

We suggest committing your changes frequently by re-running the commands above (with different meaningful messages to the commit command), so that you can go back to see the changes you have made over time, if needed.

Once you have tested your code, and committed it locally (check that by running git status), you can git push it back to MarkUs. We will collect and grade the last version pushed to MarkUs when the grace period expires. Grace tokens are consumed automatically based on the timestamp of your last push.

Hints and Advice

This assignment does not require writing a large number of lines of code. It does require you to think carefully about the code you write. Before you dive into writing code, it will pay to spend time planning and understanding the code you are going to write. If you think the problem through from beginning to end, this assignment will not be too hard. If you try to hack your way out of trouble, you will spend many frustrating nights in the lab. As a start, here are some questions you should answer before you write code.

What fields will you need in your thread structure? Perhaps the most important is the thread state (e.g., running, etc.). Think about all the states that you will need to support. getcontext “returns” twice. When it is called directly, it initializes the context structure that is passed as an argument, and then execution continues after the getcontext() call. Then, when setcontext() is called later, execution returns to the instruction following the getcontext() call, which appears to be a second “return”, since the code continues running from the instruction following getcontext().

For this assignment, you will use this behavior, once when you create a context, and again when you switch to that context. What action will you take in each case? How will you tell which case you are in? (Hint: Look at the T2 tutorial exercise’s use of getcontext() and setcontext().) Most threads are created with thread_create, but the initial thread is there before your library is invoked. Nonetheless, the original thread must be able to thread_yield to let other threads run, and other threads must be able to call thread_yield and let the original thread run. How is this going to work? A hard bug to find would be an overflow or underflow of the stack you allocate.

How might such a bug manifest itself? What defensive programming strategies can you use to detect stack overflow in a more controlled manner as the system runs? Note that when the initial thread in a C process returns, it calls the exit system call, which causes the OS to destroy the process, even if you have other user-level threads in the process that want to run. It is possible for the initial thread (TID == 0) to call thread_exit() while other threads are still running.

How will you ensure that the program exits only when the last thread in your system exits? Be careful. It is dangerous to use memory once it has been freed. In particular, you should not free the stack of the currently running thread in thread_exit while it is still running. So how will you make sure that the thread stack is eventually deallocated? How will you make sure that another thread that is created in between does not start using this stack (and then you inadvertently deallocate it)? You should convince yourself that your program would work even if you used a debugging malloc library that overwrites a block with dummy data when that block is free()’d.

Note that the stack of the initial thread (TID == 0) was not dynamically allocated by the threads library, and cannot be freed when this thread calls thread_exit. How will you detect and handle this case?

Be careful. If you destroy a thread that is holding or waiting on a resource such as a lock (we will be implementing locks in the next assignment), problems can occur. For example, deadlock may occur because the thread holding the lock may not have a chance to release the lock. Similarly, if the thread waiting on a lock is destroyed, then the thread releasing the lock may wake up some other thread incorrectly (e.g., due to reusing thread id).

For this reason, it is important to ensure that when thread_kill is invoked on a thread, the target thread should not exit immediately. Instead, the next time the target thread runs, it should notice that it has been killed and exit. How will you implement this functionality? In practice, operating systems provide a signal handler mechanism that allows threads to clean up their resources (e.g., locks) before they exit. What are the similarities and differences between thread_yield and thread_exit? Think carefully. It will be useful to encapsulate all that is similar in a common function, which will help reduce bugs, and possibly make your code simpler.

We strongly recommend that your first milestone be to make thread_yield(THREAD_SELF) work for the initial thread (where your implementation stores and then restores the caller’s state). Get this working before you try to implement thread_create or thread_exit. Use a debugger. As an exercise, put a breakpoint at the instruction after you copy the current thread’s state using getcontext. You can print the current values of the registers (in gdb, type info registers).

You can print the values stored in your thread struct and the thread context. For example, say current is a pointer to the thread structure associated with the currently running thread, and context is a field in this structure that stores the thread context. Then, in gdb, you can use p/x current->context to print the context stored by a call to getcontext.

You may find this particularly useful in making sure that the state you “restore” when you run a newly created thread for the first time makes sense. Start early, we mean it!

Testing Your Code

We have provided the program test_basic.c for testing this assignment. Use it to test your code. You can also test your code by using our auto-tester program at any time by following the testing instructions (https://q.utoronto.ca/courses/315157/pages/running-the-autotester) .

CSC369 A2: Preemptive Threads

Concurrency – A Preemptive User-Level Threads Package

In the previous assignment, you implemented a cooperative, user-level thread package in which
thread_yield causes control to be passed from one thread to the next one.

This assignment has four goals:
1. Implement preemptive threading, in which simulated “timer interrupts” cause the system to switch
from one thread to another.
2. Implement the thread_sleep and thread_wakeup scheduling functions. These functions will enable
you to implement blocking mutual exclusion and synchronization.

3. Use the thread_sleep and thread_wakeup functions to implement thread_wait , which will block a
thread until the target thread has finished executing.
4. Implement blocking locks for mutual exclusion, and condition variables for synchronization.

For this assignment, we are not providing any additional code aside from new test cases. You will be
working with your thread.c code that you implemented in Assignment 1.
Outline
1. Background: Timer Signals
2. Setup
3. Task 1: Preemptive Threading
4. Task 2: Sleep and Wakeup
5. Task 3: Waiting for Threads to Exit
6. Task 4: Mutex Locks and Condition Variables
7. Hints and Advice
8. Frequently Asked Questions
9. Using Git and Submitting Your Assignment

Background: Timer Signals

User-level code cannot use hardware timer interrupts directly. Instead, POSIX operating systems provide
a software mechanism called signals that can be used to simulate “interrupts” at the user level. These
signals interrupt your program and give you a chance to handle that interrupt.

For example, when you hit
Ctrl-C to kill a program, that causes the OS to send the SIGINT signal to your program. Most programs
don’t handle this signal, so by default, the OS kills the process. However, if you wanted to save the state
of your program before it was killed (e.g., a text editor could save any unsaved files), you can register a
handler with the OS for SIGINT. Then, when the user hits Ctrl-C , the OS calls your handler; in your
handler, you could write out the state of your program and then exit.
More generally, signals are a form of asynchronous, inter-process communication mechanism. A signal
can be sent from one process to another, or from a process to itself. We will use the latter method to
have the process that invokes your user-level scheduler, i.e., your thread library functions, deliver timer
signals to itself.

The operating system delivers a signal to a target (recipient) process by interrupting the normal flow of
execution of the process. Execution can be interrupted after any instruction. If a process has registered a
signal handler, that handler function is invoked when the signal is delivered. After the signal handler
finishes executing, the normal flow of execution of the process is resumed. Notice the similarities
between signals and hardware interrupts.

Because signals can be delivered after any instruction, signal handling is prone to race conditions. For
example, if you increment a counter ( counter = counter + 1 ), either during normal execution or in your
signal handler code, the increment operation may not work correctly because it is not atomic. The signal
may be delivered between the instructions implementing the increment operation. To avoid this problem,
you should disable signal delivery while the counter is being updated. Note that this is essentially the old
OS strategy of disabling interrupts to protect critical sections on a uniprocessor.

Please read a short introduction to signals (http://en.wikipedia.org/wiki/Unix_signal) to understand
how they work in more detail. Make sure to read the “Risks” section or else you may not be able to
answer some questions below.

Now go over the code in the files interrupt.h and interrupt.c . (This is the same code we provide for
Tutorial 4). You do not need to understand all the code, but it will be helpful to know how these functions
should be used. We will use the terms “interrupts” and “signals” interchangeably below.
void register_interrupt_handler(bool verbose):
This function installs a timer signal handler in the calling program using the sigaction
(http://linux.die.net/man/2/sigaction) system call. When a timer signal fires, the function
interrupt_handler in the interrupt.c file is invoked. With the verbose flag, a message is printed
when the handler function runs.
bool interrupts_set(bool enable):

This function enables timer signals when enable is 1, and disables (or blocks) them when enable is
0. We call the current enabled or disabled state of the signal the signal state. This function also returns
whether the signals were previously enabled or not (i.e., the previous signal state). Notice that the
operating system ensures that these two operations (reading previous state, and updating it) are
performed atomically when the sigprocmask (http://linux.die.net/man/2/sigprocmask) system call is
issued. Your code should use this function to disable signals when running any code that is a critical
section (http://en.wikipedia.org/wiki/Critical_section) (i.e., code that accesses data that is shared by
multiple threads).

Why does this function return the previous signal state? The reason is that it allows “nesting” calls to
this function. The typical usage of this function is as follows:
fn() {
    /* disable signals, store the previous signal state in "enabled" */
    int enabled = interrupts_set(false);
    /* critical section */
    interrupts_set(enabled);
}

The first call to interrupts_set disables signals. The second call restores the signal state to its
previous state, i.e., the signal state before the first call to interrupts_set , rather than unconditionally
enabling signals. This is useful because the caller of the function fn may be expecting signals to
remain disabled after the function fn finishes executing. For example:
fn_caller() {
    int enabled = interrupts_set(false);
    /* begin critical section */
    fn();
    /* code expects signals are still disabled */
    /* end critical section */
    interrupts_set(enabled);
}

Notice how signal disabling and enabling are performed in “stack” order, so that the signal state
remains disabled after fn returns.
The functions interrupts_on and interrupts_off are simple wrappers for the interrupts_set function.
bool interrupts_enabled():
This function returns whether signals are enabled or disabled currently. You can use this function to
check (i.e., assert) whether your assumptions about the signal state are correct.
void interrupts_quiet():

This function turns off printing signal handler messages.
void interrupts_loud():
This function turns on printing signal handler messages.
To help you understand how this code works, you should work through Tutorial 4
(https://q.utoronto.ca/courses/315157/assignments/1166253) .

Setup

You will be doing this assignment within the A2 directory of your MarkUs repository.
Log in to MarkUs (https://markus.teach.cs.toronto.edu/2022-01/) and go to CSC369. Select A2, and
then click the button to add the starter files for this assignment. These will be the same base files as
for Assignment 1, with an updated malloc.cpp, additional test programs, and an updated Makefile.
The thread.c file is not included in the starter code.

Clone your MarkUs repo, or use git pull to update your existing checkout. The files for this
assignment will be in the userid/A2 subdirectory.
Copy your thread.c solution from your A1 directory to your A2 directory. Then add it to your repo ( git
add thread.c ), commit locally, and push the changes back upstream to MarkUs.

Task 1: Preemptive Threading

Now you are ready to implement preemptive threading using the timer signals described above.
However, before you start writing any code, make sure that you can answer the following questions:
1. What is the name of the signal that you will be using to implement preemptive threading?
2. Which system call is used by the process to deliver signals to itself?
3. How often is this signal delivered?

4. When this signal is delivered, which function in thread.c is invoked? What would this function do
when it is invoked in a program that uses the thread library?
5. Is the signal state enabled or disabled when the function in thread.c above is invoked? If signals
are enabled, could that cause problems? If they are disabled, what code will enable them? Hint: look
for sa_mask in interrupt.c .

6. What does unintr_printf do? Why is it needed? Will you need other similar functions? Reread a
short introduction to signals (http://en.wikipedia.org/wiki/Unix_signal) to find out.

Signals can be sent to the process at any time, even when a thread is in the middle of a thread_yield ,
thread_create , or thread_exit call. It is a very bad idea to allow multiple threads to access shared
variables (such as your ready queue) at the same time. You should therefore ensure mutual exclusion
(http://en.wikipedia.org/wiki/Mutual_exclusion) , i.e., only one thread can be in a critical section
(accessing the shared variables) in your thread library at a time.

A simple way to ensure mutual exclusion is to disable signals when you enter procedures of the thread
library and restore the signal state when you leave.
Hint: think carefully about the invariants you want to maintain in your thread functions about when
signals are enabled and when they are disabled. Make sure to use the interrupts_enabled function to
check your assumptions.

Note that as a result of thread context switches, the thread that disables signals may not be the one
that enables them. In particular, recall that setcontext restores the register state
(https://q.utoronto.ca/courses/315157/file_contents/course%20files/assignment1.html#context-switch)
saved by getcontext .

The signal state is saved when getcontext is called and restored by setcontext
(look at the save_interrupt function in the understand_interrupt.c file in Tutorial 4). As a result, if you
would like your code to be running with a specific signal state (i.e., disabled or enabled) when
setcontext is called, make sure that getcontext is called with the same signal state. Maintain the right
invariants, and you’ll have no trouble dealing with context switches.

It will be helpful to go over the manual pages (http://linux.die.net/man/2/setcontext) of the context save
and restore calls again.
Implement preemptive threading by adding the necessary initialization, signal disabling and
signal enabling code in your thread library in thread.c .

After you implement preemptive threading, you can test your code by running the test_preemptive
program. To check whether this program worked correctly, you can run the following tester script:
/u/csc369h/fall/pub/tester/scripts/a2-01-preemptive.py
This script is run as part of testing Assignment 2. Adding the -v option to the script above will provide
more information about what output is expected by the tester.

Task 2: Sleep and Wakeup

Now that you have implemented preemptive threading, you will extend your threading library to
implement the thread_sleep and thread_wakeup functions. These functions will allow you to implement
mutual exclusion and synchronization primitives. In real operating systems, these functions would also
be used to suspend and wake up a thread that performs IO with slow devices, such as disks and
networks.

The thread_sleep primitive blocks or suspends a thread when it is waiting on an event, such
as a mutex lock becoming available or the arrival of a network packet. The thread_wakeup primitive
awakens one or more threads that are waiting for the corresponding event.

The thread_sleep and thread_wakeup functions that you will be implementing for this assignment are
summarized here:
Tid thread_sleep(struct wait_queue *queue):
This function suspends the caller and then runs some other thread. The calling thread is put in a wait
queue passed as a parameter to the function.

The wait_queue data structure is similar to the run
queue, but there can be many wait queues in the system, one per type of event or condition. Upon
success, this function returns the identifier of the thread that took control as a result of the function call.
The calling thread does not see this result until it runs later. Upon failure, the calling thread continues
running, and returns one of these constants:
THREAD_INVALID: alerts the caller that the queue is invalid, e.g., it is NULL.
THREAD_NONE: alerts the caller that there are no more threads, other than the caller, that are ready to
run. Note that if the thread were to sleep in this case, then your program would hang because there
would be no runnable thread.
int thread_wakeup(struct wait_queue *queue, int all):

This function wakes up one or more threads that are suspended in the wait queue. The awoken threads
are put in the ready queue. The calling thread continues to execute and receives the result of the call.
When “all” is 0 (false), then one thread is woken up. In this case, you should wake up threads in FIFO
order, i.e., first thread to sleep must be woken up first. When “all” is 1 (true), all suspended threads are
woken up. The function returns the number of threads that were woken up. It should return zero if the
queue is invalid, or there were no suspended threads in the wait queue.
queue is invalid, or there were no suspended threads in the wait queue.

You will need to implement a wait_queue data structure before implementing the functions above. The
thread.h file provides the interface for this data structure. Note that each thread can be in only one
queue at a time (a run queue or any one wait queue).

When implementing thread_sleep , it will help to think about the similarities and differences between this
function and thread_yield and thread_exit . Make sure that thread_sleep suspends (blocks) the
current thread rather than spinning (running) in a tight loop. This would defeat the purpose of invoking
thread_sleep because the thread would still be using the CPU.

All the thought that you put into ensuring that thread preemption works correctly previously will apply to
these functions as well. In particular, these functions access shared data structures (which ones?), so be
sure to enforce mutual exclusion.

Implement the thread_sleep and thread_wakeup functions in your thread library in thread.c .
After you implement the sleep and wakeup functions, you can test your code by running the test_wakeup
and the test_wakeup_all programs. To check whether these programs worked correctly, you can run the
following tester commands:
/u/csc369h/fall/pub/tester/scripts/a2-02-wakeup.py
/u/csc369h/fall/pub/tester/scripts/a2-03-wakeupall.py

Recall that in the previous assignment, you had implemented thread_kill(tid) , which ensured that the
target thread (whose identifier is tid ) did not run any further, and this thread would eventually exit when
it ran the next time. If another thread invokes thread_kill on a sleeping thread, you should
immediately remove the thread from the associated wait queue and wake it up (i.e., make it runnable).
Then, the thread can exit when it runs the next time.

Task 3: Waiting for Threads to Exit

Now that you have implemented the thread_sleep and thread_wakeup functions for suspending and
waking up threads, you can use them to implement blocking synchronization primitives in your threads
library. You should start by implementing the thread_wait function, which will block or suspend a thread
until a target thread terminates (or exits). Once the target thread exits, the thread that invokes
thread_wait should continue operation. As an example, this synchronization mechanism can be used to
ensure that a program (using a master thread) exits only after all its worker threads have completed their
operations.

The thread_wait function is summarized below:
int thread_wait(Tid tid, int *exit_code):

This function suspends the calling thread until the thread whose identifier is tid terminates. A thread
terminates when it invokes thread_exit . Upon success, this function returns the identifier of the thread
that exited. If exit_code is not NULL, the exit status of the thread that exited (i.e., the value it passed to
thread_exit) will be copied into the location pointed to by exit_code . Upon failure, the calling thread
continues running, and returns the constant THREAD_INVALID .

Failure can occur for the following reasons:
Identifier tid is not a feasible thread id (e.g., any negative value of tid or tid larger than the maximum
possible tid)
No thread with the identifier tid could be found.
The identifier tid refers to the calling thread.
Another thread is already waiting for the identifier tid.
You will need to associate a wait_queue with each thread. When a thread invokes thread_wait , it should
sleep on the wait_queue of the target thread. When the target thread invokes exit, and is about to be
destroyed, it should wake up the threads in its wait_queue .

There are a couple of technicalities that you need to consider:
Exactly one caller of thread_wait(tid,…) can succeed for a target thread tid . All subsequent
callers should fail immediately with the return value THREAD_INVALID.
If a thread with id tid exits voluntarily (i.e., calls thread_exit without being killed by another thread)
before it is waited for, a subsequent call to thread_wait(tid, &tid_status) must still be able to
retrieve the exit status that thread tid passed to thread_exit .

If a thread with id tid is killed by another thread via thread_kill , set its exit code to -SIGKILL . In
this case there are two possibilities:
If tid has already been waited on at the time it is killed, the waiting thread must be woken up. If
the waiting thread provided a non-NULL pointer for the exit code, then the killed thread’s exit code
(-SIGKILL) must be stored into the location it points to.

If tid has not yet been waited on before it is killed, a subsequent call to thread_wait(tid, …)
should return THREAD_INVALID. That is, a thread cannot wait for a killed thread. (You do not
need to detect if the thread id is recycled between the kill and the wait calls. If that happens, the
thread_wait should succeed. It is a good idea to delay recycling thread ids as long as possible to
avoid having a thread accidentally wait on the wrong target thread.)

Threads are all peers. A thread can wait for the thread that created it, for the initial thread, or for any
other thread in the process. One issue this creates for implementing thread_wait is that a
deadlock may occur. For example, if Thread A waits on Thread B, and then Thread B waits on Thread
A, both threads will deadlock. We do not expect you to handle this condition for the assignment, but it
will be helpful to think about how you could implement thread_wait to avoid any deadlocks.
Implement thread_wait in your thread library and update any other relevant functions. After you
implement this functionality, you can test your code by running the test_wait , test_wait_kill ,
test_wait_exited and the test_wait_parent programs. To check whether these programs worked
correctly, you can run the following tester command:
/u/csc369h/fall/pub/tester/scripts/a2-04-wait.py

Task 4: Mutex Locks and Condition Variables

The final task is to implement mutual exclusion and synchronization primitives in your threads library.
Recall that these primitives form the basis for managing concurrency, which is a core concern for
operating systems, so your library would not really be complete without them.
For mutual exclusion, you will implement blocking locks, and for synchronization, you will implement
condition variables.

The API for the lock functions are described below:
struct lock *lock_create():
Create a blocking lock. Initially, the lock should be available. Your code should associate a wait queue
with the lock so that threads that need to acquire the lock can wait in this queue.
void lock_destroy(struct lock *lock):
Destroy the lock. Be sure to check that the lock is available when it is being destroyed.
void lock_acquire(struct lock *lock):

Acquire the lock. Threads should be suspended until they can acquire the lock, after which this function
should return.
void lock_release(struct lock *lock):
Release the lock. Be sure to check that the lock had been acquired by the calling thread, before it is
released. Wake up all threads that are waiting to acquire the lock.
The API for the condition variable functions are described below:
struct cv *cv_create():

Create a condition variable. Your code should associate a wait queue with the condition variable so that
threads can wait in this queue.
void cv_destroy(struct cv *cv):
Destroy the condition variable. Be sure to check that no threads are waiting on the condition variable.
void cv_wait(struct cv *cv, struct lock *lock):
Suspend the calling thread on the condition variable cv . Be sure to check that the calling thread had
acquired lock when this call is made. You will need to release the lock before waiting, and reacquire it
before returning from this function.

void cv_signal(struct cv *cv, struct lock *lock):
Wake up one thread that is waiting on the condition variable cv . Be sure to check that the calling
thread had acquired lock when this call is made.

void cv_broadcast(struct cv *cv, struct lock *lock):
Wake up all threads that are waiting on the condition variable cv . Be sure to check that the calling
thread had acquired lock when this call is made.
The lock_acquire , lock_release functions, and the cv_wait , cv_signal and cv_broadcast functions
access shared data structures (which ones?), so be sure to enforce mutual exclusion.
Implement these functions in your thread library in thread.c .

After you implement these functions, you can test your code by running the test_lock , test_cv_signal
and test_cv_broadcast programs. To check whether these programs worked correctly, you can run the
following tester commands:
/u/csc369h/fall/pub/tester/scripts/a2-05-lock.py
/u/csc369h/fall/pub/tester/scripts/a2-06-cv-signal.py
/u/csc369h/fall/pub/tester/scripts/a2-07-cv-broadcast.py

Hints and Advice
Start early, we mean it!
You are encouraged to reuse your own code that you might have developed in the first assignment or in
previous courses for common data structures and operations such as queues, sorting, etc. Make sure to
document the sources of anything you did not write specifically for this course.
You may not use code that subsumes the heart of this project (e.g., you should not base your solution
on wrappers of or code taken from the POSIX thread library). If in doubt, ask.

This project does not require you to write a large number of lines of code. It does require you to think
carefully about the code you write. Before you dive into writing code, it will pay to spend time planning
and understanding the code you are going to write. If you think the problem through from beginning to
end, this project will not be too hard. If you try to hack your way out of trouble, you will spend many
frustrating nights in the assignment. This project’s main difficulty is in conceptualizing the solution.

Once you overcome that hurdle, you will be surprised at the simplicity of the implementation!
All the general hints and advice (https://q.utoronto.ca/courses/315157/assignments/1136988) from
Assignment 1 apply here as well.
Frequently Asked Questions
We have provided answers to various Frequently Asked Questions
(https://q.utoronto.ca/courses/315157/pages/a2-faq) about the assignment. Make sure to go over them. We
have provided answers to many questions that students have asked in previous years, so you will save
time by going over these answers as you start working on the assignment.

Testing Your Code

You can test your entire code by using our auto-tester program at any time by following the testing
instructions (https://q.utoronto.ca/courses/315157/pages/running-the-autotester) . Use the argument ‘a2’
to tell the tester to run the tests for this assignment.
% cd ~/csc369/userid/a2
% csc369-tester a2
Using Git and Submitting Your Assignment
You should only modify the following file in this assignment.
thread.c
You can find the files you have modified by running the git status command.
You can commit your modified files to your local repository as follows:
git add thread.c
git commit -m "Committing changes for Assignment 2"

We suggest committing your changes frequently by rerunning the commands above (with different
meaningful messages to the commit command), so that you can go back to see the changes you have
made over time, if needed.

Once you have tested your code, and committed it (check that by running git status ), you can push
the assignment to MarkUs.
git push
We suggest pushing your code back to MarkUs frequently, as you complete each part of the assignment.
Grace tokens are used automatically based on the time of the last push to MarkUs within the grace period.

CSC369 A3: Virtual Memory

Introduction

In this assignment, you will investigate memory access patterns, simulate the operation of page tables and implement several page replacement algorithms. This will give you some practice working with the algorithms we have been talking about in class.

This assignment is based on a virtual memory simulator that uses the simvaddr-*.ref memory reference traces located at /u/csc369h/fall/pub/a4/traces . The first task is to implement virtual-to-physical address translation and demand paging using a page table design of your choice. Then you will implement two different page replacement algorithms: exact LRU (Least Recently Used) and Clock. Before you start work, you should complete the set of readings about memory, if you haven’t done so already: Paging: Introduction (http://pages.cs.wisc.edu/~remzi/OSTEP/vm-paging.pdf)

Tutorial 7 Exercise: Memory reference traces

The Tutorial 7 Exercise (%24CANVAS_OBJECT_REFERENCE%24/assignments/g99295e10ed471a400741e73aef899c8e) provides an introduction to the memory address traces used in the simulator.

Part 1: Virtual to physical translation

Intro video (https://web.microsoftstream.com/video/697981d9-04de-406f-9aeed38107682433)

Setup

Log into MarkUs to create or update your repo and get the starter code. Remember that you cannot manually create a new a3 directory in your repo or MarkUs won’t see it.

The traces from our sample programs at /u/csc369h/fall/pub/a4/traces will be interesting to run once you have some confidence that your program is working, but you will definitely want to create small traces by hand for testing. The format of the traces is reftype vaddr value as shown in the sample below. Note that the page offset part of the addresses is always between 0 and 15 (0xf) to fit in the reduced simulated physical page frames.

For a write reference type (S or M), the value will be written to the virtual address. For a read reference type (L or I), the value is the expected value that should be read from the virtual address. It should always be the same as the value in the most recent preceding write reference to the same virtual address. We use this to check that the address translations and pagein/pageout operations are working correctly. A sample trace snippet is shown below:

S 309001 182
S 1fff000000 55
I 108005 0
S 308008 122
L 1fff000000 55
L 308008 122
I 4cc5000 0
L 5018008 0

Note that in our traces, the Instruction reference type is likely to always have a value of 0 because these addresses are not written to after the program starts executing. You will also see Load references with a value of 0 when the trimmed trace includes a Load from an address that has not yet been written to.

Task 1 – Address Translation and Paging

Implement virtual-to-physical address translation and demand paging using a pagetable design of your choice. The main driver for the memory simulator, sim.c , reads memory reference traces in the format produced by the simify-trace.py tool from trimmed, reduced valgrind memory traces (refer to the Tutorial 7 exercise (%24CANVAS_OBJECT_REFERENCE%24/assignments/g99295e10ed471a400741e73aef899c8e) for more information on how the traces are generated).

For each line in the trace, the program asks for the simulated physical page frame that corresponds to the given virtual address by calling find_physpage, and then reads from the simulated physical memory at the location given by the physical frame number and the page offset. If the access type is a write (‘M’ for modify or ‘S’ for store), it will also write the value from the trace to the location.

You should read sim.c so that you understand how it works, but you should not modify it. The simulator is executed as ./sim -f <tracefile> -m <memory size> -s <swapfile size> -a <algorithm>, where memory size and swapfile size are the number of frames of simulated physical memory and the number of pages that can be stored in the swapfile, respectively.

The swapfile size should be as large as the number of unique virtual pages in the trace, which you should be able to determine easily based on your analysis from Tutorial 7. There are four main data structures that are used:

1. unsigned char *physmem : This is the space for our simulated physical memory. We define a simulated page frame size of SIMPAGESIZE and allocate SIMPAGESIZE * “memory size” bytes for physmem.

2. struct frame *coremap : The coremap array represents the state of (simulated) physical memory. Each element of the array represents a physical page frame. It records whether the physical frame is in use and, if so, a pointer to the page table entry for the virtual page that is using it.

3. struct pt_entry : A page table entry. The format of a page table entry is up to you, but at a minimum it must record the frame number if the virtual page is in (simulated) physical memory and an offset into the swap file if the page has been written out to swap. It must also contain flags to represent whether the entry is Valid, Dirty, and Referenced.

4. swap.c : The swapfile functions are all implemented in this file, along with bitmap functions to track free and used space in the swap file, and to move virtual pages between the swapfile and (simulated) physical memory.

The swap_pagein and swap_pageout functions take a frame number and a swap offset as arguments. The simulator code creates a temporary file in the current directory where it is executed to use as the swapfile, and removes this file as part of the cleanup when it completes. It does not, however, remove the temporary file if the simulator crashes or exits early due to a detected error. You must manually remove the swapfile.XXXXXX files in this case.

To complete this task, you will have to write code in pagetable.c and pagetable.h . Read the code and comments in these files — it should be clear where implementation work is needed and what it needs to do. Basic round-robin and random replacement algorithms are already implemented for you, so you can test your translation and paging functionality independently of implementing the replacement algorithms.

Efficiency: In a real operating system implementation, the memory space taken up by your page tables reduces the memory space available to store the pages of processes’ virtual address spaces. Hence, keeping page tables small is desirable.

Reducing the time complexity of page table lookups is also important. Your solution will be evaluated on correctness as well as space and time efficiency.

Task 2 – Replacement Algorithms

Using the starter code, implement exact LRU and CLOCK (with one ref-bit) replacement algorithms. You may find that you want to add fields to the struct frame for the different page replacement algorithms.

You can add them in pagetable_generic.h , but please label them clearly. Note: to test your page replacement algorithms, we will replace your pagetable.c with a solution version, so your page replacement algorithm must be contained to the provided functions. Once you are done implementing the algorithms you can use the provided simvaddr-*.ref traces and the autotester to check the results. For each algorithm, the tester will run the programs on memory sizes 50 and 100 and check the output against the expected results.

Efficiency: Page replacement algorithms must be fast, since page replacement operations can be critical to performance. Consequently, you must implement these policies with efficiency in mind. For example, we will give you the expected complexities for some of the policies:

RR: init, evict, ref: O(1) in time and space
CLOCK: init, ref: O(1) in time and space; evict: O(M) in time, O(1) in space, where M = size of memory
LRU and MRU: evict, ref: O(1) in time and space; init: O(M) in time and space, where M = size of memory

Important notes

When we run the autotests on your code, your page replacement algorithms will be compiled with a different pagetable.c file (the one from the solution). All the code of the page replacement algorithms must be in their separate .c files, not in pagetable.c (except for additions to struct frame in pagetable.h ).

When a page is being evicted, there are only two possibilities: (i) the page is dirty and needs to be written to the swap; or (ii) the page is clean and already has a copy in the swap. A newly initialized page (zero-filled) should be marked dirty on the very first access. CLOCK must use the "Referenced" flag stored in the page table entry. All the algorithms must use their ref() functions (if necessary) instead of adding any algorithm-specific code to pagetable.c.
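As a rough illustration of the CLOCK sweep, the sketch below uses a local array of ref-bits in place of the real Referenced flag; an actual implementation must read and clear that flag through the page table entry, as required above.

```c
#include <assert.h>
#include <stdbool.h>

#define MEMSIZE 4

/* Stand-in for the Referenced bit that a real implementation must
 * read and clear via the page table entry, not a private array. */
static bool referenced[MEMSIZE];
static int hand;                       /* clock hand, O(1) state */

void clock_ref(int frame) {            /* O(1) */
    referenced[frame] = true;
}

/* Sweep frames, giving each referenced frame a second chance.
 * Worst case O(M) when every frame has its bit set. */
int clock_evict(void) {
    for (;;) {
        if (!referenced[hand]) {
            int victim = hand;
            hand = (hand + 1) % MEMSIZE;
            return victim;
        }
        referenced[hand] = false;      /* clear and move on */
        hand = (hand + 1) % MEMSIZE;
    }
}
```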

pagetable.c declares functions that return the values of the Valid, Dirty, and Referenced flags for a given page table entry; you should implement them. Use these functions in your replacement algorithm implementations if you need to check any of these flags. Do not assume a particular format for page table entries, or your replacement algorithms are unlikely to work with our solution version of pagetable.c.
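To see why callers must go through accessors, consider a sketch with an assumed bit layout. The layout below is invented for illustration; the solution's pagetable.c may use a completely different format, which is exactly why replacement code must call functions like these rather than hard-code masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An ASSUMED page table entry layout, for illustration only. */
typedef uint32_t pte_t;
#define PTE_VALID 0x1u
#define PTE_DIRTY 0x2u
#define PTE_REF   0x4u

/* Accessors hide the layout from replacement-algorithm code. */
bool is_valid(pte_t pte)      { return (pte & PTE_VALID) != 0; }
bool is_dirty(pte_t pte)      { return (pte & PTE_DIRTY) != 0; }
bool is_referenced(pte_t pte) { return (pte & PTE_REF)   != 0; }
```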

The simulator and the page replacement algorithms must not produce any output that is additional to, or different from, the starter code's output (except for errors, which should be printed to stderr); otherwise the tests will fail. The simulator writes a value into the simulated physical memory pages for Store or Modify references, and checks that simulated physical memory contains the last written value on Load or Instruction references. If there is a mismatch, the simulator prints an error message.

These errors indicate that there is something wrong with the address translation implementation. For debugging, you will find it useful to implement the print_pagetable() function in pagetable.c. The function should print (at least) one line for each in-use page in the pagetable (valid in memory, or currently evicted to swap).

Other than that, what information it displays is up to you; we will only be testing that it produces at least the expected number of output lines. You can use the debug flag in sim.c to control extra output during development. (NOTE: remember to set it to false in your final submission.)

CSC369 A4: FUSE File Systems

Introduction

You will be implementing a version of the Very Simple File System (https://q.utoronto.ca/courses/315157/file_contents/course%20files/tutorials/A5-vsfs-dumptool.pdf?canvas_=1&canvas_qs_wrap=1) (VSFS) from the OSTEP text and lectures.

We will be using FUSE to interact with your file system. FUSE allows you to implement a file system in user space by implementing the callback functions that the libfuse library will call. The Tutorial 7 Exercise (https://q.utoronto.ca/courses/315157/assignments/1146491) should give you some practice with using FUSE.

Your tasks include: Implement the code to format an empty disk into a VSFS file system by completing mkfs.c (this will be compiled into the mkfs.vsfs executable). This part of the assignment does not need FUSE at all. Implement the FUSE functions to list the root directory, and to create, remove, read, write, and resize files, as well as get status information about files, directories, or the overall file system, by completing vsfs.c .

We are providing a set of formatted VSFS disk images so that you can work on these two parts of the assignment independently.

Using FUSE

Refer to the Tutorial 7 Exercise (https://q.utoronto.ca/courses/315157/assignments/1146491) handout for instructions on getting started with FUSE. If you would like to learn more about FUSE:

libfuse GitHub repository: https://github.com/libfuse/libfuse
FUSE wiki: https://github.com/libfuse/libfuse/wiki
FUSE API header file for the version we're using: https://github.com/libfuse/libfuse/blob/fuse_2_9_bugfix/include/fuse.h

Additional Setup

Unlike the passthrough file system of the tutorial exercise, your VSFS file system will operate on a disk image file. A disk image (https://en.wikipedia.org/wiki/Disk_image) is simply an ordinary file that holds the content of a disk partition or storage device.

To allow you to test your file system operations independently of your file system format code (mkfs.vsfs), we have provided some simple VSFS-formatted disk images in the course pub directory on teach.cs at /u/csc369h/fall/pub/a4/images:

vsfs-empty.disk – Small, empty file system (64 inodes, 1 MB size). Contains just the root directory with '.' and '..' entries.
vsfs-empty2.disk – Another small, empty file system (256 inodes, 1 MB size). Contains just the root directory with '.' and '..' entries.
vsfs-maxfs.disk – Maximum size VSFS file system (512 inodes, 128 MB size). Contains just the root directory with '.' and '..' entries.
vsfs-1file.disk – Small file system (64 inodes, 1 MB size) containing a single small file (only 1 data block) in the root directory.
vsfs-3file.disk – Medium file system (128 inodes, 16 MB size) containing 3 files (small – only direct blocks, medium – some indirect blocks, and maximum VSFS file size).
vsfs-42file.disk – Medium file system (128 inodes, 16 MB size) containing 42 small files (the root directory inode uses multiple direct blocks).
vsfs-many.disk – Small file system (256 inodes, 2 MB size) containing lots of small files (the root directory inode uses the indirect block pointer).

You will need to make your own copies of these disk images to use them, since you will need to be able to write to them. You will also need to create your own empty disk images that you can format using your mkfs program. To do so, you will run the following commands (angle brackets denote arguments you supply):

truncate -s <size> <image file>
./mkfs.vsfs -i <number of inodes> <image file>

The truncate command will create the image file if it doesn't exist and will set its size; mkfs.vsfs will format it into your vsfs file system (after you have completed the implementation).

Once you have a formatted vsfs disk image (one of ours, or your own), the next step is to mount your file system. We assume that you will be using /tmp/userid as the mount point, as in the Tutorial 7 exercise, and that you will want to keep vsfs running in the foreground so that you can see its output as it runs:

./vsfs <image file> /tmp/userid -f

The image file is the disk image formatted by mkfs.vsfs. Not only does vsfs mount the disk image into the local file system, it also sets up callbacks and then calls fuse_main() so that FUSE can do its work. Both vsfs and mkfs.vsfs have additional options – run them with -h to see their descriptions.

After the file system is mounted, you can access it using standard tools (ls, cp, rm, etc.). To unmount the file system, run:

fusermount -u /tmp/userid

Note that you should be able to unmount the file system after any sequence of operations, such that when it is mounted again, it has the same contents.

Consistency Checkers

The name fsck comes from the common tool (https://en.wikipedia.org/wiki/Fsck) for checking the consistency of file systems in Unix-like operating systems. We provide two executables on the teach.cs servers for checking the consistency of images, in the /u/csc369h/fall/pub/a4/tools/ directory:

fsck.mkfs checks that your mkfs.vsfs implementation correctly formats the disk.
fsck.vsfs checks that your code that performs various file system operations (written in vsfs.c) has not corrupted the file system.

Simplifying Assumptions

For this assignment, we make a number of simplifying assumptions:

VSFS file systems are always small enough that they can be entirely mmap'd into the vsfs process's virtual address space.
The underlying operating system will handle all write-back of dirty pages to the vsfs disk image. If the file system crashes, the disk image may be inconsistent. Your code should not crash, but it does not need to make any special effort to maintain crash consistency.

There is a flat namespace. All files are located in the root directory and there are no subdirectories. You do not need to implement mkdir or rmdir.
All paths are absolute (they all start with '/'). If you see a path that is not absolute, or that has more than one component, you can return an error.
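A helper that enforces these path rules might look like the following sketch. The function name is hypothetical; the 252-byte name limit matches the VSFS_NAME_MAX value described later in this handout.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Return 0 if `path` is "/" or "/name" (flat namespace), or a
 * negative errno otherwise. Hypothetical helper, not starter code. */
int check_flat_path(const char *path) {
    if (path[0] != '/')
        return -EINVAL;                 /* not absolute */
    if (strchr(path + 1, '/'))
        return -ENOTDIR;                /* more than one component */
    if (strlen(path + 1) + 1 > 252)     /* VSFS_NAME_MAX incl. NUL */
        return -ENAMETOOLONG;
    return 0;
}
```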

Understanding the starter code

First read through all of the starter code to understand how it fits together, and which files contain helper functions that will be useful in your implementation.

mkfs.c – contains the program to format your disk image. You need to write part of this program. You will also find it helpful to read the code to see how we access parts of the file system after using mmap() to map the entire disk image into the process virtual address space.

vsfs.h – contains the data structure definitions and constants needed for the file system. You may add other definitions or constants that you find useful, but you should not change the file system metadata. That is, do not add or modify fields in the superblock, inode, or direntry structures, and do not change the existing definitions.

vsfs.c – contains the program used to mount your file system. This includes the callback functions that will implement the underlying file system operations.

Each function that you will implement is preceded by detailed comments and has a “TODO” in it. Please read this file carefully. NOTE: It is very important to return the correct error codes (or 0 on success) from all the FUSE callback functions, according to the “Errors” section in the comment above the function. The FUSE library, the kernel, and the user-space tools used to access the file system all rely on these return codes for correctness of operation.

Note: You will see many lines like (void)fs; . Their purpose is to prevent the compiler from warning about unused variables. You should delete these lines as you make use of the variables.

fs_ctx.h and fs_ctx.c – The fs_ctx struct contains the runtime state of your mounted file system. Any time you think you need a global variable, it should go in this struct instead. We have cached some useful global state in this structure already (e.g. pointers to the superblock, bitmaps, and inode table), but you may find there is additional state that you want to add, instead of recomputing it on every operation.

map.h and map.c – contain the map_file() function used by vsfs and mkfs.vsfs to map the image file into memory and determine its size. You should not need to change anything here, or make any additional calls to map_file() beyond what is in the starter code.

options.h and options.c – contain the code to parse command line arguments for the vsfs program. You should not need to change anything here.

util.h – contains some handy functions:
is_powerof2(x) – returns true if x is a power of two.
is_aligned(x, alignment) – returns true if x is a multiple of alignment (which must be a power of 2).
align_up(x, alignment) – returns the next multiple of alignment that is greater than or equal to x.
div_round_up(x, y) – returns the integer ceiling of x divided by y.

bitmap.h and bitmap.c – contain code to initialize bitmaps, and to allocate or free items tracked by the bitmaps. You will use these to allocate and free inodes and data blocks, so make sure you read the functions and understand how to use them. You may notice that the bitmap_alloc function can be slow, since it always starts the search for a 0 bit from the start of the bitmap. You are free to improve on this if you wish, but you do not need to do so.

You are welcome to put some of the helper functions in separate files instead of keeping everything in vsfs.c. Make sure to update the Makefile to compile those files and add/commit/push them to your git repository.
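For reference, the behaviour described for the util.h helpers can be implemented with integer arithmetic alone, along these lines (a sketch matching the descriptions above, not the starter code itself):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketches of the integer-only helpers described for util.h. */
bool is_powerof2(unsigned long x) {
    return x != 0 && (x & (x - 1)) == 0;
}
bool is_aligned(unsigned long x, unsigned long alignment) {
    return (x & (alignment - 1)) == 0;   /* alignment: power of 2 */
}
unsigned long align_up(unsigned long x, unsigned long alignment) {
    return (x + alignment - 1) & ~(alignment - 1);
}
unsigned long div_round_up(unsigned long x, unsigned long y) {
    return (x + y - 1) / y;              /* integer ceiling */
}
```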

Recommended progression of your work

You should tackle this project in stages so that you can be confident that each piece works before moving on to the next step. The creation of a new file system (mkfs.c) and operations on a formatted file system (vsfs.c) can be handled independently, however, so you can do Steps A and B in either order.

Step A1: Write enough of mkfs.vsfs so that you can mount the file system and check the superblock.

We have implemented vsfs_statfs() in vsfs.c so that you can mount the file system in your disk image and then run stat -f on the root directory to check that the superblock is initialized correctly by your mkfs.

Step A2: Complete the implementation of mkfs.vsfs. Use the provided fsck.mkfs tool to check the correctness of the file system as you proceed.

Step B1: Write vsfs_getattr() . You have probably seen from the tutorial exercise that FUSE calls getattr() a lot. Implementing this function is the key to the rest of the operations. You will want to write a helper function that takes a path and returns a pointer to the inode (or the inode number) for the last component in the path. Remember that you only need to handle paths that are of the form “/” or “/somefile” – all paths are absolute and there are no subdirectories in our vsfs file systems.
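The directory scan at the heart of such a lookup helper might look like the sketch below. The struct here is an assumed stand-in for the real directory entry in vsfs.h, whose field names may differ, and the convention that inode number 0 marks a free slot is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* ASSUMED shape of a vsfs directory entry; check vsfs.h for the
 * real structure and field names. */
struct dentry_sketch {
    unsigned int ino;                /* assumed: 0 means free slot */
    char name[252];                  /* null-terminated name */
};

/* Scan one block's worth of root-directory entries for `name`.
 * Returns the inode number, or -ENOENT if not found. */
int lookup_in_block(const struct dentry_sketch *entries, int nentries,
                    const char *name) {
    for (int i = 0; i < nentries; i++) {
        if (entries[i].ino != 0 &&
            strcmp(entries[i].name, name) == 0)
            return (int)entries[i].ino;
    }
    return -ENOENT;
}
```

A full helper would strip the leading '/' from the path, handle "/" as the root inode, and repeat this scan over each directory data block.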

Step B2: Write vsfs_readdir() so that you can run ls -la on the root directory when the root directory entries fit within a single data block. You should be able to mount vsfs-empty.disk, vsfs-maxfs.disk, vsfs-1file.disk and vsfs-3file.disk and list their root directories on completion of this step.

Step B3: Add the ability to create and remove empty files by implementing vsfs_create() and vsfs_unlink(). On completion of this step, you should be able to mount vsfs-empty.disk and use 'touch /tmp/userid/anewfile' to create a new empty file.

The new file should be visible, and the mode and timestamps should be appropriate, when you run 'ls -l' on the root directory. You should also be able to delete the new file you created (e.g. 'rm /tmp/userid/anewfile').

Step B4: Add the ability to grow a file up to the limit of the inode's direct block pointers, or shrink a file to empty, using truncate. Implement vsfs_truncate(). This operation shares functionality with writing to a file (increasing the file size) and removing a file (freeing all the blocks allocated to the file), so think about how you can avoid duplicating code.

Step B5: Add the ability to write to, and read from, small files, first where the data is stored in a single data block, and then where the data can be stored using only the direct block pointers in the inode. Implement vsfs_write() and vsfs_read().

Step B6: Add the ability to remove small files (where the file data uses only the direct block pointers in the inode).

Step B7: Enhance your implementation of vsfs_readdir() to list larger directories, first using just the direct blocks in the directory inode, and then using the directory inode's indirect block to read all of the directory data blocks. You should be able to mount vsfs-42file.disk (direct blocks only) and vsfs-many.disk (direct and indirect blocks) and list the root directory on completion of this step.

Step B8: Enhance your implementations of vsfs_truncate(), vsfs_write(), vsfs_read(), and vsfs_unlink() to support large files, where the indirect block in the file's inode is used to locate some of the file's data blocks.

Tip: Comment your code well. It will help you keep track of what is implemented and your understanding of how things work. Refactor your code during development (not after) and keep your functions short and well-structured.

Tip: Check that there is enough space before making any changes to the file system. This will save you from having to roll back changes if you discover that an operation cannot be completed due to lack of space.

Tip: Remember to update fields in the superblock (e.g. free_inodes, free_blocks) as you operate on the file system.

Testing and debugging recommendations

You can use standard Unix tools to manipulate directories and files in a mounted vsfs image in order to test your implementation.

System call tracing with strace can help you understand which syscalls these tools invoke to access the file system. In general, you can use the behaviour of the host file system (ext4) as a reference: your vsfs should have the same observable behaviour for the operations that vsfs needs to support. You can also write your own C programs that invoke the relevant syscalls directly.

You will find it useful to run vsfs under gdb:

gdb --args ./vsfs -d <image file> <mount point>

You can then run file system operations in a separate terminal window. You can set breakpoints at the start of your FUSE callback functions (e.g. break vsfs_getattr) to help you understand which callbacks are invoked when you execute a file system operation (e.g. ls), in what order, and with what arguments. The debugger is also helpful for investigating crashes (e.g. segfaults) and for stepping through the execution of the callback functions so that you can check the state of the file system as the operations execute.

Off-by-one errors are common but can be catastrophic when they lead to accessing the wrong block of file system metadata. You might also find it useful to view the binary contents of your vsfs image files using xxd . See man 1 xxd for documentation. To avoid errors when mounting the file system, make sure that the mount point is not in use (e.g. by a previous vsfs mount that didn’t finish cleanly). If fusermount fails to unmount because the mount point directory is “busy”, you can use the lsof command (see man lsof ) to identify the process that keeps it open.

One common error message you might see when running operations on the mounted file system is "transport endpoint is not connected". This error usually means that the file system is still mounted, but the vsfs program has terminated (e.g. crashed). In this case, you need to manually unmount it with fusermount -u. One of the most common errors you might see at the early stages of the implementation is ls -la reporting an "I/O error" and displaying "???" entries.

This error usually means that your getattr() callback returns invalid data in the stat structure and/or an invalid return value. To test reads at a given file offset, you can use the tail -c command (see man 1 tail ). To test either reads or writes at a given file offset, you can write your own C programs that use pread() and pwrite() .
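A small self-contained C function along these lines exercises reads and writes at a non-zero offset. The scratch path in the test is hypothetical; pointing it at a file under your mount point drives vsfs_write() and vsfs_read() at that offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write 5 bytes at offset 100 of `path`, then read them back with
 * pread(). Returns 0 on success, -1 on any failure. */
int test_offset_io(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    char buf[6] = {0};
    int ok = pwrite(fd, "hello", 5, 100) == 5 &&
             pread(fd, buf, 5, 100) == 5 &&
             strcmp(buf, "hello") == 0;
    close(fd);
    return ok ? 0 : -1;
}
```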

Limits and details

The maximum number of inodes in the system is a parameter to mkfs.vsfs; the image size is also known to it; and the block size is VSFS_BLOCK_SIZE (4096 bytes, declared in vsfs.h). Many parameters of your file system can be computed from these three values. We will not test your code on an image smaller than 64 KiB (16 blocks) with 4 inodes.

You should be able to fit the root directory and a non-empty file in an image of this size and configuration. You shouldn't pre-allocate additional space for metadata in your mkfs.vsfs implementation, beyond the fixed metadata defined for VSFS: the space needed to store the inode table and the root directory. Indirect blocks should only be allocated on demand, when a file or directory grows large enough to need one. The maximum path component length is VSFS_NAME_MAX (252 bytes including the null terminator). This value is chosen to fit the directory entry structure into 256 bytes (see vsfs.h).

Names stored in directory entries are null-terminated strings so that you can use standard C string functions on them. The maximum full path length is _POSIX_PATH_MAX (4096 bytes including the null terminator). This allows you to use fixed-size buffers for operations like splitting a path into a directory name and a file name. The maximum file size is dictated by the number of direct block pointers in a vsfs inode (VSFS_NUM_DIRECT) and the number of block pointers in an indirect block (VSFS_BLOCK_SIZE / sizeof(vsfs_blk_t)).
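As a sketch of the maximum-file-size calculation, with assumed values for VSFS_NUM_DIRECT and vsfs_blk_t (check vsfs.h for the real definitions):

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED values for illustration; the real constants are in vsfs.h. */
#define VSFS_BLOCK_SIZE 4096u
#define VSFS_NUM_DIRECT 5u               /* assumption */
typedef uint32_t vsfs_blk_t;             /* assumption */

/* Max file size = (direct blocks + pointers in one indirect block)
 * times the block size. */
uint64_t vsfs_max_file_size(void) {
    uint64_t indirect_ptrs = VSFS_BLOCK_SIZE / sizeof(vsfs_blk_t);
    return (VSFS_NUM_DIRECT + indirect_ptrs) * (uint64_t)VSFS_BLOCK_SIZE;
}
```

With these assumed values, that is (5 + 1024) blocks of 4096 bytes each.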

The number of directory entries is limited by the maximum number of directory entry data blocks (same as the limit on file blocks). The number of blocks in your file system is limited by the number of bits in a single VSFS block, since we use only 1 block for the data bitmap. You can assume that read and write operations are performed one block at a time. Each read() and write() call your file system receives will only cover a range within a single block.

NOTE: this does not apply to truncate() – a single call needs to be able to extend or shrink a file by an arbitrary number of blocks.

Sample disk configurations that must work include:

64 KiB size and 4 inodes
64 KiB size and 16 inodes
1 MiB size and 64 inodes
128 MiB size and 512 inodes

We will not be testing your code under extreme circumstances, so don't get carried away thinking about corner cases. However, we do expect you to properly handle "out of space" conditions in your code. Any operation that cannot be completed because there are not enough free blocks or inodes must be cleanly aborted – no blocks or inodes can "leak" in the process. The simplest way to ensure this is to check that there is enough space to complete the operation before modifying any file system metadata.

The formatting program (mkfs) must also check that the image file is large enough to accommodate the requested number of inodes.

Other implementation notes:

Although the "." and ".." directory entries can be manually listed by the vsfs_readdir() callback (as in the starter code), you should create actual entries for these when you initialize the root directory in mkfs. The only timestamp you need to store for each file and directory is mtime (modification time) – you don't need to store atime or ctime.

You can use the touch command to set the modification timestamp of a file or directory to the current time.
Any data and metadata blocks (other than the fixed metadata) should only be allocated on demand.
Read and write I/O should be performed by reading/writing the virtual memory where the disk image is mmap'd. It should NOT be performed byte-by-byte (which is very inefficient); use memcpy().
Your implementation shouldn't use any floating point arithmetic.

See the helper functions in util.h – if you need other, similar, functions (like floor), they can also be easily implemented using integer arithmetic.
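The memcpy-based block I/O recommended above can be sketched as follows, with an in-memory buffer standing in for the mmap'd image and hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096u

/* `image` stands in for the mmap'd disk image. Copy `len` bytes at
 * byte offset `off` within block `blk` with one memcpy call, never
 * a byte-by-byte loop. Helper names are hypothetical. */
void block_write(uint8_t *image, uint32_t blk, uint32_t off,
                 const void *src, size_t len) {
    memcpy(image + (size_t)blk * BLOCK_SIZE + off, src, len);
}

void block_read(const uint8_t *image, uint32_t blk, uint32_t off,
                void *dst, size_t len) {
    memcpy(dst, image + (size_t)blk * BLOCK_SIZE + off, len);
}
```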

Documentation

It is recommended that you include a README.txt file that describes any aspects of your code that do not work well. Code that works well and implements a subset of the functionality will get a higher mark than code that attempts to implement more functionality but doesn’t work.

What to submit

Add all the starter files from MarkUs to your a4 repository.

Also add to your repository all additional source code files that you create as part of your implementation. Your a4 repository must contain all the files necessary to compile and run mkfs.vsfs and vsfs . It may include a README file as described above.

Do NOT add and commit virtual machine images, executables, .o or .d files, disk image files, or any other unnecessary files – you will lose code style marks if you do submit those. You are welcome to commit test code and other text files. You should use a .gitignore file to help ensure you only commit and push files you should.