Description
CS 352 Network Time Protocol Assignment
In this assignment, you will develop a Python3 client that uses the Network Time Protocol (NTP)
to compute the current time from a server.
You can work either alone or with 1 other CS 352 student.
We will use Gradescope’s programming project hand-in feature to both submit and grade the
project.
1. Background on the Network Time Protocol
The Network Time Protocol is a protocol that allows computers to compute the current time over
the Internet. NTP clients can then set their local clocks or supply the time to other functions that
need it. A description of the protocol is at the link below; it is reproduced here to make the
document self-contained.
See:
https://techhub.hpe.com/eginfolib/networking/docs/switches/5820x-5800/5998-7395r_nmm_cg/content/441755722.htm
The figure below shows the messaging protocol of NTP. Device A and Device B are connected
over a network. They have their own independent system clocks, which need to be
automatically synchronized through NTP.
Assume that:
● Prior to system clock synchronization between Device A and Device B, the clock of
Device A is set to 10:00:00 am while that of Device B is set to 11:00:00 am.
● Device B is used as the NTP time server, so Device A synchronizes to Device B.
● It takes 1 second for an NTP message to travel from one device to the other.
The time synchronization process is as follows:
● Device A sends Device B an NTP message, which is timestamped when it leaves Device
A. The timestamp is 10:00:00 am (T1).
● When this NTP message arrives at Device B, it is timestamped by Device B. The
timestamp is 11:00:01 am (T2).
● When the NTP message leaves Device B, Device B timestamps it. The timestamp is
11:00:02 am (T3).
● When Device A receives the NTP message, the local time of Device A is 10:00:03 am
(T4).
Up to now, Device A can calculate the following parameters based on the timestamps:
● The roundtrip delay of NTP message: Delay = (T4-T1) - (T3-T2) = 2 seconds.
● Time difference (the offset) between Device A and Device B: Offset = ((T2-T1) +
(T3-T4))/2 = 1 hour.
Based on these parameters, Device A can synchronize its own clock to the clock of Device B.
For more information, see RFC 1305.
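The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the helper hms is a name made up here, and each clock reading is expressed as seconds since midnight on that device's own clock.

```python
def hms(hours, minutes, seconds):
    """Convert a wall-clock reading to seconds since midnight."""
    return hours * 3600 + minutes * 60 + seconds

T1 = hms(10, 0, 0)   # request leaves Device A (A's clock)
T2 = hms(11, 0, 1)   # request arrives at Device B (B's clock)
T3 = hms(11, 0, 2)   # reply leaves Device B (B's clock)
T4 = hms(10, 0, 3)   # reply arrives at Device A (A's clock)

delay = (T4 - T1) - (T3 - T2)          # time spent on the network
offset = ((T2 - T1) + (T3 - T4)) / 2   # how far A's clock is behind B's

print(delay)    # 2 (seconds)
print(offset)   # 3600.0 (seconds, i.e. 1 hour)
```

Note that the one-second travel time cancels out of the offset because it is assumed to be the same in both directions.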
1.2 What is Unix time?
The Unix operating system and its derivatives, such as Linux, represent time as the
number of seconds since Jan. 1, 1970 (UTC). This assignment will represent Unix time as a Python
floating point number. NTP, however, uses a 64-bit fixed-point timestamp: the first 32 bits are an
unsigned integer number of seconds, and the second 32 bits are the fraction of a second. Note that
RFC 5905 counts NTP seconds from Jan. 1, 1900 rather than 1970, so a timestamp from a real NTP
server is 2,208,988,800 seconds larger than the corresponding Unix time.
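As an illustration of the fixed-point layout, the sketch below converts between an 8-byte NTP timestamp and a Python float. The function names and the constant name are assumptions made for this example. The constant reflects RFC 5905, which counts NTP seconds from Jan. 1, 1900; it applies to timestamps from real NTP servers.

```python
import struct

# Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch)
NTP_UNIX_DELTA = 2208988800

def ntp_to_unix(raw8: bytes) -> float:
    """Convert 8 bytes (32-bit seconds, 32-bit fraction) to Unix time."""
    seconds, fraction = struct.unpack("!II", raw8)
    return (seconds - NTP_UNIX_DELTA) + fraction / 2**32

def unix_to_ntp(unix_time: float) -> bytes:
    """Convert a non-negative Unix time to the 8-byte NTP representation."""
    seconds = int(unix_time)
    fraction = int((unix_time - seconds) * 2**32)
    return struct.pack("!II", seconds + NTP_UNIX_DELTA, fraction)

raw = unix_to_ntp(1700000000.5)
print(ntp_to_unix(raw))   # 1700000000.5
```

The fraction field is a count of 1/2**32-second units, so dividing it by 2**32 gives the fractional seconds.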
1.3 How to get the local computer’s current time
Here is some example Python code to get the number of seconds since 1970-01-01 from
the local clock:
from datetime import datetime
time_difference = datetime.utcnow() - datetime(1970, 1, 1, 0, 0, 0)
secs = time_difference.days*24.0*60.0*60.0 + time_difference.seconds
timestamp_float = secs + float(time_difference.microseconds / 1000000.0)
print("The number of seconds since Jan. 1, 1970 is: %f" % (timestamp_float))
1.4 NTP Request Packet format
The format from the NTP client to the server for a time synchronization packet is defined in RFC
5905: https://www.rfc-editor.org/rfc/rfc5905.html
<--------------------------- 32 bits --------------------------->
|    byte 0     |    byte 1     |    byte 2     |    byte 3     |
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|LI | VN  |Mode |    Stratum    |     Poll      |   Precision   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Root Delay                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Root Dispersion                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Reference ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                   Reference Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     Origin Timestamp (64)                     +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Receive Timestamp (64)                     +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Transmit Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
.                                                               .
.                 Extension Field 1 (variable)                  .
.                                                               .
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
.                                                               .
.                 Extension Field 2 (variable)                  .
.                                                               .
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Key Identifier                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|                          dgst (128)                           |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure: Packet Header Format
1.5 Packet Fields:
Note: Set the first byte’s bits to binary 00,011,011 for LI = 0, VN = 3, and Mode = 3.
● LI (Leap Indicator)—A 2-bit leap indicator. When set to 11, it warns of an alarm
condition (clock unsynchronized); when set to any other value, it is not to be processed
by NTP.
● VN (Version Number)—A 3-bit version number that indicates the version of NTP. The
latest version is version 4.
● Mode—A 3-bit code that indicates the work mode of NTP. This field can be set to these
values:
○ 0—reserved
○ 1—symmetric active
○ 2—symmetric passive
○ 3—client
○ 4—server
○ 5—broadcast or multicast
○ 6—NTP control message
○ 7—reserved for private use.
● Stratum—An 8-bit integer that indicates the stratum level of the local clock, with the
value ranging from 1 to 16. Clock precision decreases from stratum 1 through stratum
16. A stratum 1 clock has the highest precision, and a stratum 16 clock is not
synchronized and cannot be used as a reference clock.
● Poll—An 8-bit signed integer that indicates the maximum interval between successive
messages, which is called the poll interval.
● Precision—An 8-bit signed integer that indicates the precision of the local clock.
● Root Delay—Roundtrip delay to the primary reference source.
● Root Dispersion—The maximum error of the local clock relative to the primary
reference source.
● Reference Identifier—Identifier of the particular reference source.
● Reference Timestamp—The local time at which the local clock was last set or
corrected.
● Originate Timestamp—The local time at which the request departed from the client for
the service host.
● Receive Timestamp—The local time at which the request arrived at the service host.
● Transmit Timestamp—The local time at which the reply departed from the service host
for the client.
● Authenticator—Authentication information.
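To make the note about the first byte concrete, here is one possible way to build a minimal client request. It is a sketch under the assumption that a 48-byte header with every field after the first byte left at zero, and no extension fields or authenticator, is enough for a basic query. The variable names are chosen for this example only.

```python
import struct

LI, VN, MODE = 0, 3, 3

# LI occupies the top 2 bits, VN the next 3 bits, Mode the low 3 bits
first_byte = (LI << 6) | (VN << 3) | MODE   # binary 00 011 011

# "!B47x": one unsigned byte followed by 47 zero pad bytes (48 bytes total)
request = struct.pack("!B47x", first_byte)

print(hex(first_byte), len(request))   # 0x1b 48
```

The 48 bytes correspond to the fixed part of the header: four 32-bit words followed by four 64-bit timestamps.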
1.6 Servers you can use:
These are time servers that provide good throughput for many consecutive requests:
time.cloudflare.com
time.facebook.com
time.apple.com
clock.nyc.he.net
1.7 Packing and unpacking packets
There are two strategies for getting the integers out of the packet. Recall the packet is a
byte array (or an immutable byte sequence), and certain parts must be interpreted as 32 or 64
bit integers. To “pack” a byte array (or packet) means to fill in the bytes corresponding to the
fields in the packet. To unpack a packet means to extract the field values from the byte array.
One approach is the “C-style” where your code pulls out each byte of the integer and constructs
an int from the four bytes. The most significant byte would be shifted left by 24, then the next
byte would be shifted by 16, etc.
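The “C-style” approach might look like the following sketch. The helper name be32 is made up for this example; it reads four bytes in network (big-endian) order starting at a given position.

```python
def be32(buf: bytes, offset: int) -> int:
    """Assemble an unsigned 32-bit integer from 4 bytes, most significant first."""
    return (
        (buf[offset] << 24)
        | (buf[offset + 1] << 16)
        | (buf[offset + 2] << 8)
        | buf[offset + 3]
    )

data = bytes([0x00, 0x00, 0x02, 0x8E])
print(be32(data, 0))   # 654
```

Indexing a bytes object yields an int between 0 and 255, so no conversion is needed before shifting.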
The second approach is to use the struct library in Python. The struct library uses format
strings, in the style of C structs, to pack or unpack the data into a byte array. See:
https://docs.python.org/3/library/struct.html
For example:
#!/usr/bin/python3
import struct
# this format is 4 8-bit characters (bytes), 2 integers and a 64-bit long long integer
# This is the example C structure
# struct packet {
# char b0; // a single byte
# char b1;
# char b2;
# char b3;
# int i0; // 32 bit integer
# int i1;
# long long q0; // 64 bit integer
# };
# this is the format string which describes the data types and their order
# ! = network byte order, c = char, i = int, q = long long int (64-bit)
format_string = '!cccciiq'
# get the total number of bytes in this format
bytesInFormat = struct.calcsize(format_string)
# create a byte sequence with the format and elements of the format
new_packet = struct.pack(format_string, b'3', b'a', b'4', b'6', 7841, 89154, 9897765654)
print(new_packet)
# given a byte sequence, extract the values given the interpretation of the data types in the
# sequence
# python returns a tuple containing the values
(c0,c1,c2,c3,i0,i1,q0) = struct.unpack(format_string,new_packet)
print("got data ", c0,c1,c2,c3,i0,i1,q0)
# create a mutable array from the packed bytes:
modified_packet = bytearray(new_packet)
modified_packet[0] = 6
modified_packet[1] = 8
# change the first 32-bit integer to a 1
modified_packet[4] = 0
modified_packet[5] = 0
modified_packet[6] = 0
modified_packet[7] = 1
# print the bytearray that has been modified
print("the modified packet is", modified_packet)
(c0,c1,c2,c3,i0,i1,q0) = struct.unpack(format_string,modified_packet)
print("Modified data ", c0,c1,c2,c3,i0,i1,q0)
Create a format string that describes a sequence of 3 bytes, followed by 2 32-bit integers:
>>> from struct import *
>>> # fs is the format string
>>> # the format string has a list of types
>>> # ! = network byte order, '3c' = 3 characters, '2I' = 2 unsigned integers
>>> fs = '!3c2I' # 3 characters and 2 integers in network byte order
>>> pack(fs, b'a', b'b', b'c', 63, 654)
b'abc\x00\x00\x00?\x00\x00\x02\x8e'
>>> unpack(fs, b'abc\x00\x00\x00?\x00\x00\x02\x8e')
(b'a', b'b', b'c', 63, 654)
2. Functions that must be defined and their pseudocode
Your code must be in a single file called ntpclient.py. It must have the following 3 functions,
which are defined as below:
(1) ntpPktToRTTandOffset(pkt,T1,T4)
(2) getNTPTimeValue(server, port)
(3) getCurrentTime(server,port,iters=20)
Your client will be imported into a tester program, also written in Python3. The tester program
will call the 3 assignment functions with different inputs and check the outputs.
Your code cannot use the ntplib Python library. If the tester program finds that this library,
or any of its functions, is imported, your client will be marked as failing all tests. Your code
must use sockets to communicate with a remote NTP server.
2.1 Basic Packet Parsing
This function takes a completed NTP data packet, as Python bytes, and two Unix
timestamps (T1 and T4), as floating point numbers, and returns the Round Trip Time (RTT) and
offset as a Python two-tuple, with the first element being the round-trip time and the second
element being the offset. Both are floating point numbers in seconds.
Test 1: (40 points)
This test will create an NTP packet with the time values filled in, and two timestamps
from the local client, and call your function, which will return the RTT and offset. This test sees if
your code correctly parses an NTP packet.
def ntpPktToRTTandOffset(pkt,T1,T4):
    # for each of the 2 timestamps (T2,T3) in the packet do:
    #     get the bytes for the seconds part and convert to a
    #     floating point number
    #     get the bytes for the fraction part and convert to
    #     a floating point number
    #     combine the seconds and fraction into 1 number
    # compute the RTT by: (T4-T1) - (T3-T2)
    # compute the offset by: ((T2-T1) + (T3-T4))/2
    # return a 2-tuple containing the RTT and offset as Python
    # floats
    # return (RTT, offset)
2.2 Communication and Setting the Time
This function tests whether you can communicate with a remote NTP server and take the
local timestamps needed to set the clock. This test will call your function and make sure the time
values in the packet are close to the ones sent by the tester code. Return T1 and T4 as Unix time
using Python floating point numbers.
Test 2: (40 points)
def getNTPTimeValue(server, port):
    # make an NTP packet
    # take a timestamp, T1 = current_time
    # send packet to the server,port address
    # receive the response packet
    # take a timestamp, T4 = current_time
    # return a 3-tuple:
    # return (pkt, T1, T4)
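The send/receive part of these steps could be sketched with a UDP socket as below. This is not the required implementation: the helper name udp_exchange, the 2-second timeout, and the 1024-byte receive buffer are all assumptions made for illustration, and time.time() is used here simply as one way of reading the local clock as a Unix float.

```python
import socket
import time

def udp_exchange(server: str, port: int, request: bytes, timeout: float = 2.0):
    """Send one datagram, wait for one reply; return (reply, t_send, t_recv)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)            # do not block forever if a packet is lost
        t_send = time.time()                # local Unix time just before sending
        sock.sendto(request, (server, port))
        reply, _addr = sock.recvfrom(1024)  # raises socket.timeout if nothing arrives
        t_recv = time.time()                # local Unix time just after receiving
    return reply, t_send, t_recv
```

A call such as udp_exchange("time.cloudflare.com", 123, request) would then yield values playing the roles of pkt, T1, and T4. Taking the timestamps immediately around the send and receive calls keeps local processing time out of the measured round trip.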
2.3 Setting the clock
This function combines the previous 2: communicating with the remote server, parsing a
packet, and computing the current time. The function must return the current time, computed
with an average of offset values.
Test 3: (20 points)
# Computing the current time in Unix time format (seconds with
# microsecond fractions since 00:00:00 UTC on 1 January 1970)
def getCurrentTime(server,port,iters=20):
    # offsets = empty list
    for _ in range(iters):
        # call (pkt,T1,T4) = getNTPTimeValue(server, port)
        # call (RTT,offset) = ntpPktToRTTandOffset(pkt,T1,T4)
        # append offset to offsets
    # currentTime = average of offsets + current time with
    # microsecond granularity
    # return currentTime in Unix time as a Python float
3. Stub code for the client:
#!/usr/bin/env python
'''
CS352 Assignment 1: Network Time Protocol
You can work with 1 other CS352 student
DO NOT CHANGE ANY OF THE FUNCTION SIGNATURES BELOW
'''
from socket import socket, AF_INET, SOCK_DGRAM
import struct
from datetime import datetime

def getNTPTimeValue(server="time.apple.com", port=123) -> (bytes, float, float):
    # fill in your code here
    return (pkt, T1, T4)

def ntpPktToRTTandOffset(pkt: bytes, T1: float, T4: float) -> (float, float):
    # fill in your code here
    return (rtt, offset)

def getCurrentTime(server="time.apple.com", port=123, iters=20) -> float:
    # fill in your code here
    return currentTime

if __name__ == "__main__":
    print(getCurrentTime())
4. How to hand in your client
Your client must be a single file named ntpclient.py
You need to upload your client in Gradescope. Log into Canvas and use the Gradescope tool on
the left. The assignment is called “NTP Client”.
Upload a Python file called “ntpclient.py”
How to add group members in Gradescope:
https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members
CS 352 Assignment 2: Message Validating Client and Server
In this assignment you will write a client and server that validates the integrity of text messages
using a protocol based on secret keys.
In this scenario, a client has downloaded a file from a
3rd party, untrusted source that contains several text messages from the server, for example,
emails. The client also has a file of associated “signatures”, or hashes, of the messages, also
from the untrusted source. The signatures prove that the server originated the messages;
however, the client must verify these signatures.
In this project, the client will validate that the messages came from the server by assuming a
secure channel, and then contact the server and have it provide the signatures. If the signatures
match, the server must have originated the message. Note that in a more realistic scenario the
server would have a pre-computed database of messages and hashes, however, for this
assignment we’ll assume there is a trusted connection between the client and server so the
server can compute the hashes on the fly.
Figure 1 shows an overview and example protocol exchange between the client and server.
Overview:
Figure 1: Example Protocol Exchange
One way hash functions:
A one-way hash function takes a long stream of input, say a text file, and produces a fixed sized
hash value. For example, in SHA 256 it is a 256 bit number, and in MD5 it is a 128-bit number.
The server takes a string of text (potentially long) and a secret key, and hashes them together.
The resulting one-way hash validates that only the owner of the secret key produced the hash
value. If the owner produced such a hash value, it “signed” the message.
You can find examples of using SHA 256 to generate hash codes at the links below:
https://docs.python.org/3/library/hashlib.html
https://www.geeksforgeeks.org/sha-in-python/
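A keyed hash of this kind could be computed with hashlib roughly as follows. This is a sketch: the function name sign is made up, the message lines are fed to the hash before the key (the order used in the server pseudocode of this handout), and whether line terminators are included in the hashed data is an assumption left to the caller.

```python
import hashlib

def sign(message_lines, key: str) -> str:
    """Hash the message lines followed by the secret key; return hex digits."""
    digest = hashlib.sha256()
    for line in message_lines:
        digest.update(line.encode("ascii"))   # add each line of the message
    digest.update(key.encode("ascii"))        # then add the secret key
    return digest.hexdigest()                 # 64 hexadecimal characters

print(sign(["hello\n", "world\n"], "secret-key"))
```

Because update() simply appends data to the hash input, feeding the message line by line gives the same result as hashing the concatenated message and key in one call.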
The protocol:
The client and server communicate via TCP with a port defined as a command line argument.
The protocol works by sending ASCII characters — not UTF-8 or unicode strings.
The client connects to the server and issues a “HELLO” on a single line.
The server responds with a “260 OK” string if it receives the “HELLO” message.
Then the client sends a Command. There are only two commands: “DATA” or “QUIT”.
Each time that the client wants to send a message to the server it will send the DATA command
first. That means the client sends the DATA command separately and Msg 1 after that. Then for
sending the next message it will send another DATA command.
When the server receives the DATA command and the message that follows, it computes the hash
using the key and the message.
The server then sends a “270 SIG” string, followed by the computed hash value, to the client.
The client compares the computed hash value with the signature value from the signature file
and sends PASS or FAIL to the server.
The server sends the “260 OK” string once again when it receives the PASS or FAIL
result from the client.
When the server receives the QUIT command, it closes the connection.
A DATA Message:
The client sends the DATA command. Then it sends the ASCII text of the message.
The end-of-message is a single “.” (dot) on a line by itself.
At the end-of-message line, the server responds to the client with the response “270 SIG” on
one line, then the SHA256 value as a hexadecimal number in ASCII on the next line.
A QUIT Message:
After the client sends a quit message, it closes the TCP socket and exits the protocol.
Escape codes:
When using in-band signaling, that is, sending control information on the same stream as data,
escape codes are needed. Recall that a ‘.’ on a line by itself signals the end of the message; it is not
part of the message data. We need a way to distinguish between a line with a single dot as part
of the message, and the control signal that the data has ended.
To differentiate a dot on a line in the data of the message from the end-of-message code, a
single dot in the message is “escaped” with a “\”. So a text message with a single dot would be
sent as: “ \.\r\n ”. Further backslashes are escaped with a “\” as well. So a message with a “\.” in
the data would be sent as “\\.”. Thus, when your code reads a line that consists only of
backslashes followed by a ‘.’, it should remove one leading backslash to recover the original line.
Three message files have been provided to you. The messages in the advanced message file
have multiple lines of message and dots in between. In those messages you have to
differentiate between a dot in between and at the end of the message. Therefore, you will
escape the dots in order to keep them as a regular character with no function (end of the
message). That means you will escape the dots in the middle of the message and specify the
end of the message with “ \.\r\n ”.
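One reading of the escaping rule can be sketched as follows. The helper names are hypothetical, lines are assumed to be handled without their trailing “\r\n”, and the rule applied is: a line made up only of zero or more backslashes followed by one dot gains one leading backslash when sent and loses one when received.

```python
def _is_dot_line(line: str) -> bool:
    """True for '.', '\\.', '\\\\.', ... (only backslashes, then a single dot)."""
    return line.endswith(".") and line[:-1].strip("\\") == ""

def escape_line(line: str) -> str:
    """Sender side: protect a line that could be mistaken for end-of-message."""
    return "\\" + line if _is_dot_line(line) else line

def unescape_line(line: str) -> str:
    """Receiver side: undo escape_line on an already-escaped line."""
    if line.startswith("\\") and _is_dot_line(line):
        return line[1:]
    return line

print(escape_line("."))       # \.
print(unescape_line("\\."))   # .
```

Under this reading, a receiver would test for a bare “.” (end of message) before unescaping, since an unescaped bare dot never appears inside message data.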
Message file format:
A message file is a sequence of message-lengths and messages in ASCII format.
A message length is an unsigned integer string followed by a line feed character (end of line in
Unix).
A message is a sequence of ASCII characters of the length of the message.
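Reading this format might look like the sketch below. The function name read_messages and the sample data are made up for illustration, and the sketch assumes that each message is immediately followed by the next length line, with no extra separator beyond the counted bytes.

```python
import io

def read_messages(stream):
    """Read (length line, message bytes) pairs from a binary stream."""
    messages = []
    while True:
        length_line = stream.readline()       # e.g. b"5\n"
        if not length_line.strip():           # end of file (or a blank line)
            break
        size = int(length_line.decode("ascii"))
        messages.append(stream.read(size))    # exactly `size` bytes of message
    return messages

sample = io.BytesIO(b"5\nhello12\nline1\nline2\n")
print(read_messages(sample))   # [b'hello', b'line1\nline2\n']
```

With a real file, the stream would come from open(filename, "rb") so that lengths are counted in bytes rather than decoded characters.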
Signature file format:
The signatures, one for each message, are stored as hexadecimal numbers in a string format,
one per line.
Key file format:
The secret keys, one for each message, are stored as strings, one per line.
Pseudo code for the client:
Start the program with the following arguments in order:
<server-name> <server-port> <message-filename> <signature-filename>
For example: python3 client.py localhost 7894 message1.txt sig1.txt
Open the message file from the name in the command line
While there is still more data in the message file:
    Read in one line
    Convert the string into the number of bytes
    Read in the number of bytes from the message file into a byte string or byte array
    Append the bytes of the message into an array of messages
Open the signature file from the name in the command line
While there is still more data in the signature file:
    Read in one line
    Append the string of the signature into an array of signatures
Open a TCP socket to the server using the name and port from the command line
Send a “HELLO” message to the server
Read the response
If the response is not “260 OK”, print an error and end the program.
Set a message counter variable to zero
For each message in the array of messages:
    Send the DATA command in one line on the TCP socket
    Send the message on the TCP socket
    Read a line from the server from the TCP socket
    If the response is not “270 SIG”, print an error and end the program.
    Read another line from the server
    Compare the string from the server with the signature string stored in the array
    of signatures for this message at the message counter number
    If the strings match:
        send a PASS message to server
    Else:
        send a FAIL message to the server
    Read a line from the server
    If the response is not “260 OK”, print an error and end the program.
    Increment the message counter
Send a QUIT message to the server
Close the TCP socket.
Pseudo code for the server:
Start the server with the following arguments in order:
<listen-port> <key-file>
For example: python3 server.py 7894 key.txt
Read in all the keys from the key-file.
Open a TCP socket on the port
When the connection completes, read a line from the returned connected socket
If the line is not “HELLO”, print an error, close the socket, and exit the program
Send a “260 OK” string on the socket
While there are still more messages to validate:
    Read the next line from the socket
    Case statement on the string in the line # use case statements, one case for each command
    DATA:
        Start a new SHA 256 hash
        While there are more lines in the message:
            Read a line from the TCP socket
            Unescape the line
            If the line is “.”, or its escaped equivalent, break from the loop
            Add the line to the SHA 256 hash
        Add the key to the hash, and finish the hash
        Send the 270 SIG status code back to the client on one line.
        Send a hexadecimal string value on the TCP socket of the signature on one line.
        Read a line from the socket
        If the line is not either “PASS” or “FAIL”, print an error, close the socket and exit
        the program
        Send a “260 OK” string on the socket
    QUIT:
        Close the socket and end the program
    Default:
        Print an error, close the socket, and end the program
Grading:
Your program will be auto-graded. Please note that your code should work for all three
message files that have been provided.
Program names:
You must upload a zip file called submission.zip containing 2 files, client.py
and server.py. Your code must run from a main function, not the start of the script. E.g.,
something like:
def main():
    print("Your code goes here")

if __name__ == "__main__":
    main()
Test 1:
We will run your server against our client. The server must follow the protocol and produce
the correct hashes for all the messages.
Test 2:
We will run your client against our server. The client must follow the protocol and produce the
PASS/FAIL responses for each of the messages.
Client and Server Outputs for Grading:
The client and server should print out every message they receive using normal print
statements (to stdout). Each message should be on one line.
For example:
– Every time you receive a message print it out on its own line
– So for example if you receive a “270 SIG” code from the server followed by its computed
signature these should both be printed, each on their own lines on the client script.
CS 352 Assignment 3 HTTP Server with User Authentication and File Serving
Disclaimer: This project is meant for educational purposes only. A real HTTP server
(such as Apache) has been developed over decades by hundreds of experienced
developers.
A note: Blindly using GPT and friends to produce code will probably get you flagged for
plagiarism. The plagiarism detector will check against code structures generated by GPT.
Project description:
You will create a small HTTP server that provides basic user authentication and supports
returning contents of files. Users can log in with a username and password, and upon
successful authentication, they can query the contents of files from the user’s directory. The
server is designed to use cookies to manage user sessions and to ensure the security of file
retrieval.
Server functionality:
● User Authentication:
○ Users can log in with a username and password.
○ Passwords are stored as SHA-256 salted-hashed values in an “accounts.json”
file.
○ Successful logins are authenticated by hashing the plaintext password with the
salt stored in “accounts.json”.
○ User sessions are tracked using a cookie with a timeout.
○ Multiple users can be logged in at the same time.
● File Download:
○ Authenticated users can view the contents of files from a specified directory.
○ File access is restricted to the authenticated user’s directory only.
○ Unauthorized access to files is denied.
● Session Management:
○ User sessions are managed using randomly generated session IDs stored as a
cookie.
■ Sessions can be tracked using a Python dictionary, which maps session
IDs to usernames and login times.
○ Sessions expire after a configurable timeout period.
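A session table of this kind could be sketched as below, using only modules from the allowed list. The function names, the tuple layout, and the optional now parameter (there to make the behaviour easy to test) are assumptions for illustration. Refreshing the timestamp on each successful lookup is one possible policy.

```python
import random
from datetime import datetime, timedelta

sessions = {}   # session ID -> (username, time of last activity)

def create_session(username, now=None):
    """Register a new session and return its ID as a hex string."""
    session_id = hex(random.getrandbits(64))   # e.g. '0x68938897ef8fdfc8'
    sessions[session_id] = (username, now or datetime.now())
    return session_id

def lookup_session(session_id, timeout_seconds, now=None):
    """Return the username for a live session, or None if unknown or expired."""
    now = now or datetime.now()
    entry = sessions.get(session_id)
    if entry is None:
        return None
    username, last_seen = entry
    if now - last_seen > timedelta(seconds=timeout_seconds):
        return None                            # session has timed out
    sessions[session_id] = (username, now)     # refresh the activity time
    return username
```

Keeping one entry per session ID, rather than per user, is what allows several users to be logged in at the same time.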
Server implementation:
The server should be implemented in Python3, using TCP sockets to handle incoming
HTTP requests. HTTP requests should be parsed manually, without the assistance of any
external modules (see recitation 9 slides). Use of external modules may cause the autograder to
give 0 points.
Allowed Python modules:
● socket, json, random, datetime, hashlib, sys
The server is executed with command-line arguments, specifying the IP address, port,
accounts file, session timeout, and the root directory for file downloads.
Usage:
Your server should process only the following command-line arguments:
python3 server.py [IP] [PORT] [ACCOUNTS_FILE] [SESSION_TIMEOUT] [ROOT_DIRECTORY]
Example: python3 server.py 127.0.0.1 8080 accounts.json 5 accounts/
● IP: The IP address to which the server will bind.
● PORT: The port on which the server will listen for incoming connections.
● ACCOUNTS_FILE: A JSON file containing user accounts and their hashed passwords
along with the corresponding salt.
● SESSION_TIMEOUT: The session timeout duration (in seconds).
● ROOT_DIRECTORY: The root directory containing user directories.
Included project files and scripts:
You are provided with the following goodies:
● server.py
○ This is where you will implement all of your code for the HTTP server.
● passwords.json
○ This json file contains key-value pairs of username-plaintext_password combos
for all existing user accounts. This file will be used by the client to login to an
account. The username and plaintext passwords should be interpreted as strings.
● accounts.json
○ This json file contains key-value pairs of username-[hashed_password,salt]
combos for all existing user accounts. Note that the value for a username key is a
json array where the first element is the hashed salted password and the second
element is the salt itself. This file should be used by the server to validate login
credentials. The username and password with the salt are stored as strings.
● sample.sh
○ This script contains some CURL commands for local testing of your HTTP server.
● accounts/
○ This directory contains user accounts and files accessible by the server.
Server logs:
Your server should log (print) login attempts, file downloads, and session expirations,
including timestamps. The format is as follows:
SERVER LOG: [current time as Year-month-day-hour-minute-second] [MESSAGE]
The MESSAGE field of the logged output is specified in the pseudocode for the server. Here is
an example of what is printed on a successful login for the user “Jerry”, in pseudocode we refer
to this as: log with MESSAGE “LOGIN SUCCESSFUL: {username} : {password}”
SERVER LOG: 2023-10-31-10-13-45 LOGIN SUCCESSFUL: Jerry : 4W61E0D8P37GLLX
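The timestamp layout in the log line corresponds to the strftime pattern shown in this sketch. The helper name server_log and its optional now parameter are made up for the example.

```python
from datetime import datetime

def server_log(message, now=None):
    """Print and return one log line in the SERVER LOG format."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    line = f"SERVER LOG: {stamp} {message}"
    print(line)
    return line

server_log("LOGIN SUCCESSFUL: Jerry : 4W61E0D8P37GLLX",
           now=datetime(2023, 10, 31, 10, 13, 45))
# prints: SERVER LOG: 2023-10-31-10-13-45 LOGIN SUCCESSFUL: Jerry : 4W61E0D8P37GLLX
```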
Server Specifications:
HTTP and Requests:
● The server can use HTTP version 1.0, but the server should not care what HTTP version
the client uses.
● A POST request is used for logging in. The request must have a request target of “/”
● A GET request is used for retrieving files after logging in. The request must have a
request target of the filename from the root “/”, such as “/file.txt”.
○ The server then finds the “file.txt” file in the directory for that user only (via the
username)
○ The server must read the contents of the file as text and insert it into the HTTP
body of the response
● A sample POST request with response
○ Client sends:
POST / HTTP/1.0
Host: 127.0.0.1:8080
User-Agent: curl/7.68.0
Accept: */*
username: Jerry
password: 4W61E0D8P37GLLX
○ Server replies with:
HTTP/1.0 200 OK
Set-Cookie: sessionID=0x68938897ef8fdfc8

Logged in!
○ Server logs:
SERVER LOG: 2023-11-02-15-16-46 LOGIN SUCCESSFUL: Jerry : 4W61E0D8P37GLLX
● A corresponding sample GET request for the file “file.txt” with response. Note: This is
after logging in with the previous POST request and still within the session timeout limit.
○ Client sends:
GET /file.txt HTTP/1.0
Host: 127.0.0.1:8080
User-Agent: curl/7.68.0
Accept: */*
Cookie: sessionID=0x68938897ef8fdfc8
○ Server replies with:
HTTP/1.0 200 OK

The different snowstorm exhibits fee.
○ The server also logs:
SERVER LOG: 2023-11-02-15-23-21 GET SUCCEEDED: Jerry : /file.txt
Cookies:
HTTP is a stateless protocol, and thus some data needs to be sent on each request to
maintain a state (i.e. user session). For this, HTTP has a header field called “Cookie” which
contains a list of key-value pairs for each cookie. When using a browser, cookies are
automatically sent and maintained per website domain.
Since we are not using a browser to
access the HTTP server in this project, we will manually send cookies on each request. To first
obtain a cookie, the server must send back the header “Set-cookie” which tells the client (i.e.
browser) to set a cookie in the “Cookie” header.
Cookie Format:
The cookie should be called “sessionID” and should consist of a random 64-bit integer in
hexadecimal string format. Here is an example of the “Set-Cookie” header the server may return:
Set-Cookie: sessionID=0xfff3c577d6381f1d
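Producing this header and reading the cookie back from a later request could be sketched as follows. Both helper names are hypothetical. The parser assumes the usual convention that several cookies in one “Cookie” header are separated by semicolons.

```python
def set_cookie_line(session_id: str) -> str:
    """Build the response header line that hands the session ID to the client."""
    return f"Set-Cookie: sessionID={session_id}"

def parse_cookie_header(value: str) -> dict:
    """Split the value of a Cookie header into a {name: value} dictionary."""
    cookies = {}
    for pair in value.split(";"):
        name, separator, cookie_value = pair.strip().partition("=")
        if separator:                     # skip fragments without an '='
            cookies[name] = cookie_value
    return cookies

print(set_cookie_line("0xfff3c577d6381f1d"))
# Set-Cookie: sessionID=0xfff3c577d6381f1d
print(parse_cookie_header("sessionID=0xfff3c577d6381f1d"))
# {'sessionID': '0xfff3c577d6381f1d'}
```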
How to handle passwords and salts in the password file:
Storing a plaintext password leads to security flaws in systems. What happens if the
password store is stolen, such as the case of a database breach? In order to authenticate
passwords without actually storing them, servers can hash the password and store the hash.
Authentication then works as follows:
1. The client sends the username and plaintext password to the server using a POST
request.
2. The server hashes the plaintext password.
3. The server performs a lookup into its database to check if the hashed password
corresponds to the username supplied.
4. The server logs the user in if they match, and rejects the login otherwise.
But this method of storing passwords is still insecure. What happens if an attacker steals the
database of hashed passwords? It may appear our plaintext passwords are safe, as the
attacker cannot reverse the hashes. Although this is true, the attacker can perform a dictionary
attack; this is when the attacker hashes a list of commonly used words and checks if these hash
to any of the hashes in the stolen database.
For example, many people may have “apple” as
a password. As such, many of the hashes in the database would be hash(“apple”), and the
attacker could find these. On the other hand, if a user chooses a random string, such as
“EDVcS” as a password, then it is unlikely the attacker would choose such a plaintext password
to hash for a dictionary attack.
Of course, we would like this property to hold for regular
passwords such as “apple”, and for this we will use a cryptographic salt.
A salt is a random piece of data appended to the password string by the server before it
is hashed. Since a small change in input leads to vast changes in output for cryptographic hash
functions, a salt that is appended to a plaintext password will hash as if the password itself was
random.
This is what we want; if many people pick the password “apple” we can generate a
random salt for each user, append it to the end of “apple”, and hash this instead. The
plaintext salt is then stored alongside the hash of the password + salt combo. A dictionary attack
now requires appending each stored salt to every candidate phrase and hashing the result to match
the stored hashes, so hash(“apple”) is no longer a common hash among the leaked data.
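Checking a login against salted hashes could be sketched as below. The function names and the sample salt are made up. The sketch assumes the stored value is the hexadecimal SHA-256 digest of the password with the salt appended, and that accounts maps each username to a [hashed_password, salt] pair.

```python
import hashlib

def hash_password(password: str, salt: str) -> str:
    """SHA-256 of the password with the salt appended, as hex digits."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()

def check_login(accounts: dict, username: str, password: str) -> bool:
    """True if the password hashes to the stored value for this user."""
    entry = accounts.get(username)
    if entry is None:
        return False                       # unknown user
    stored_hash, salt = entry
    return hash_password(password, salt) == stored_hash

# Made-up account for illustration only
accounts = {"Jerry": [hash_password("apple", "s4lt"), "s4lt"]}
print(check_login(accounts, "Jerry", "apple"))   # True
print(check_login(accounts, "Jerry", "pear"))    # False
```

Two users who both choose “apple” end up with different stored hashes as long as their salts differ.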
HTTP Format
HTTP is an application layer protocol whose messages usually consist of UTF-8 characters. HTTP
messages consist of a series of lines, each ending in a CRLF (“\r\n”). An HTTP message consists of a
start-line, zero or more header fields (also known as “headers”), an empty line (i.e., a line with
nothing preceding the CRLF) indicating the end of the header fields, and possibly a
message-body.
This means you can parse an HTTP message in Python by just reading the
lines from the TCP socket. Detailed information about the formatting of an HTTP message can
be found here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages
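The message layout described above can be parsed along these lines. This is a minimal sketch that assumes the whole message has already been read from the socket into one bytes object; the function name parse_http_message and the sample request are hypothetical.

```python
def parse_http_message(raw):
    """Split a raw HTTP message into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the empty line ends the headers
    lines = head.decode("utf-8").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A made-up request for illustration.
request = (
    b"GET /file.txt HTTP/1.1\r\n"
    b"Host: localhost:8080\r\n"
    b"Cookie: sessionID=1a2b3c4d5e6f7081\r\n"
    b"\r\n"
)
start_line, headers, body = parse_http_message(request)
method, target, version = start_line.split(" ")
print(method, target, version)
print(headers["cookie"])
```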
Parsing JSON
A JSON file can be turned into a Python dictionary with a couple lines of code if you use
the json module. A simple example can be found here:
https://www.geeksforgeeks.org/convert-json-to-dictionary-in-python/
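As a small illustration of the json module, the sketch below parses a JSON string into a dictionary. The field names and the file name accounts.json are made up for this example and are not part of the assignment.

```python
import json

# Hypothetical JSON text; the field names are invented for illustration.
text = '{"alice": {"salt": "a1b2", "hash": "deadbeef"}, "timeout": 5}'
config = json.loads(text)  # parse a JSON string into a Python dict
print(config["alice"]["salt"])
print(config["timeout"] + 1)

# Reading from a file works the same way with json.load:
# with open("accounts.json") as f:   # hypothetical file name
#     accounts = json.load(f)
```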
Pseudocode for Server:
____________________________________________________
Import necessary libraries
Function to handle a POST request for user login:
    Obtain “username” and “password” from request headers
    If 1 or both fields missing:
        Return HTTP status code “501 Not Implemented”, log with
        MESSAGE “LOGIN FAILED”
    If “username” and “password” are valid:
        Set a cookie called “sessionID” to a random 64-bit
        hexadecimal value
        Create a session with required info for validation using the
        cookie
        Log with MESSAGE “LOGIN SUCCESSFUL: {username} : {password}”
        Return HTTP 200 OK response with body “Logged in!”
    Else:
        Log with MESSAGE “LOGIN FAILED: {username} : {password}”
        Return HTTP 200 OK response with body “Login failed!”
Function to handle GET requests for file downloads:
    Obtain cookies from HTTP request
    If cookies are missing return HTTP status code “401 Unauthorized”
    If the “sessionID” cookie exists:
        Get username and timestamp information for that sessionID
        If timestamp within timeout period:
            Update sessionID timestamp for the user to current time
            If file “{root}{username}{target}” exists:
                Log with MESSAGE “GET SUCCEEDED: {username} : {target}”
                Return HTTP status “200 OK” with body containing
                the contents of the file
            Else:
                Log with MESSAGE “GET FAILED: {username} : {target}”
                Return HTTP status “404 NOT FOUND”
        Else:
            Log with MESSAGE “SESSION EXPIRED: {username} : {target}”
            Return HTTP status “401 Unauthorized”
    Else:
        Log with MESSAGE “COOKIE INVALID: {target}”
        Return HTTP status “401 Unauthorized”
Function to start the server:
    Create and bind a TCP socket
    Start listening for incoming connections
    While True:
        Accept an incoming connection
        Receive an HTTP request from the client
        Extract the HTTP method, request target, and HTTP version
        If HTTP method is “POST” and request target is “/”:
            Handle POST request and send response
        Elif HTTP method is “GET”:
            Handle GET request and send response
        Else:
            Send HTTP status “501 Not Implemented”
        Close the connection
Function Main:
    Call the start server function and pass command-line arguments
____________________________________________________
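The session bookkeeping in the pseudocode could look roughly like the sketch below. It is only an illustration under stated assumptions: the timeout value, the in-memory sessions dictionary, and the function names create_session, check_session, and login_response are all hypothetical, and logging and file handling are left out.

```python
import secrets
import time

SESSION_TIMEOUT = 5.0  # seconds; assumed value for illustration
sessions = {}          # sessionID -> {"username": ..., "timestamp": ...}


def create_session(username):
    """Create a session and return its random 64-bit hexadecimal ID."""
    session_id = secrets.token_hex(8)  # 8 random bytes = 64 bits
    sessions[session_id] = {"username": username, "timestamp": time.time()}
    return session_id


def check_session(session_id, now=None):
    """Return (status, username) with status OK, EXPIRED, or INVALID."""
    now = time.time() if now is None else now
    session = sessions.get(session_id)
    if session is None:
        return ("INVALID", None)
    if now - session["timestamp"] > SESSION_TIMEOUT:
        return ("EXPIRED", session["username"])
    session["timestamp"] = now  # refresh the timestamp on each valid use
    return ("OK", session["username"])


def login_response(session_id):
    """Build a 200 OK response that sets the sessionID cookie."""
    body = "Logged in!"
    return (
        "HTTP/1.1 200 OK\r\n"
        f"Set-Cookie: sessionID={session_id}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    ).encode("utf-8")
```

The three status values correspond to the “GET SUCCEEDED”/“GET FAILED”, “SESSION EXPIRED”, and “COOKIE INVALID” branches of the GET handler.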
Using Curl as the HTTP client
Test cases used by autograder
1. No Username (POST at the root)
2. No Password (POST at the root)
3. Username incorrect (POST at the root)
4. Password incorrect (POST at the root)
5. Username (1st username) correct/password correct (POST at the root)
6. Username (1st username) correct/password correct (POST at the root) -> Generate a
new cookie
7. Invalid cookie (GET)
8. Username (1st username) correct (GET filename for user 1)
9. Username (2nd username) correct/password correct (POST)
10. GET file successful (GET filename for user 2)
11. GET file not found (GET FAIL)
Sleep for 6 seconds
12. Expired cookie with username 2 (GET filename for user 2)
Every test is worth 10 points for 120 points total.
Sample Curl commands:
Provided in sample.sh
Handing in Assignment:
You need to upload your code to Gradescope. Log into Canvas and use the Gradescope tool on
the left. The assignment is called “HTTP Server”. Upload a Python file called server.py.
How to add group members in Gradescope:
https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members
References:
● About HTTP: https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages
● HTTP Format: https://docs.netscaler.com/en-us/citrix-adc/current-release/appexpert/http-callout/http-request-response-notes-format.html#format-of-an-http-request
● Python Socket Programming: https://docs.python.org/3/library/socket.html
● Python hashlib Module: https://docs.python.org/3/library/hashlib.html
● Cookies: https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies
● Curl tutorial: https://curl.se/docs/tutorial.html
CS 352 Assignment 4 Web Server Characterization using Wireshark Traces
In this project you will write a Python3 program that measures web-server performance. Your program will read saved packet traces in the pcapng (packet-capture – next generation) format.
This format is used by many networking analysis programs, including the popular Wireshark and its terminal-mode version, tshark. These programs can observe live network traffic, as well as record network traffic to files for further offline analysis.
Your program will filter HTTP request packets from a trace, match them to the corresponding HTTP response packets, and then record the round-trip latency of each HTTP request-response pair. It will report the average latency, as well as the 25th, 50th, 75th, 95th and 99th percentiles of latency. Recall that the Nth percentile is the latency value found N percent of the way through the sorted list of latencies.
For example, if we had 1000 response observations sorted by their values, the 25th percentile is the 250th value in the sorted list, and the 99th percentile would be the 990th value. Note that the 50th percentile is the median value. Web site performance is often characterized by “long tails”; that is, most web requests are fast, but a significant fraction are slow.
When plotted on a graph of response time vs. frequency, a long “tail” becomes visible. Thus, the average, median, 95th and 99th percentiles are often included when describing a web server’s response time performance.
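The percentile definition above can be sketched in a few lines. This is a minimal sketch assuming the nearest-rank rule implied by the 1000-observation example; the function name percentile and the latency values are made up, and the convention used by the autograder may differ slightly.

```python
def percentile(sorted_values, p):
    """Return the p-th percentile (p an integer from 1 to 100), nearest-rank rule."""
    n = len(sorted_values)
    rank = max(1, (p * n + 99) // 100)  # ceil(p * n / 100) in integer arithmetic
    return sorted_values[rank - 1]      # ranks are 1-based, list indices 0-based


# Made-up data: 1000 latencies from 0.001 to 1.000 seconds.
latencies = sorted(i / 1000 for i in range(1, 1001))
average = sum(latencies) / len(latencies)
print("AVERAGE LATENCY: %.5f" % average)
print("PERCENTILES:", " ".join("%.5f" % percentile(latencies, p)
                               for p in (25, 50, 75, 95, 99)))
```

With 1000 sorted samples, the 25th percentile picks the 250th value and the 99th percentile picks the 990th value, matching the example in the text.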
The scapy Python library
The scapy Python library is used for processing packets, both ones captured live from a network interface and ones read from stored trace files. The library represents a packet as a series of nested layers that can be used much like nested Python dictionaries. In general, a field name in the packet, for example the destination address or port number, acts as a named key, and the packet’s field value is the value of that key.
The different encapsulation levels of the packet are represented by nesting. So, for example, the layer holding the values for a TCP packet would be a child of the layer holding the Ethernet information.
You can see the dictionary structure using the show() method. For example, call packet.show() on a packet to see what keys are available.
Installing:
You can install the scapy library by running pip. For example:
pip3 install scapy
The ilab machines support installing scapy using pip3 as above.
Example program showing how to use scapy:
Below is an example program to get you started using the library.
#!/usr/bin/python3
from scapy.all import *
import sys
import time
import math
# make sure to load the HTTP layer
load_layer("http")
pcap_filename = "pcap1.pcap"
# example counters
number_of_packets_total = 0
number_of_tcp_packets = 0
number_of_udp_packets = 0
processed_file = rdpcap(pcap_filename)  # read in the pcap file
sessions = processed_file.sessions()    # get the list of sessions/TCP connections
for session in sessions:
    for packet in sessions[session]:    # for each packet in each session
        number_of_packets_total = number_of_packets_total + 1  # increment total packet count
        if packet.haslayer(TCP):        # check if the packet is a TCP packet
            number_of_tcp_packets = number_of_tcp_packets + 1  # count TCP packets
            # note that a packet is represented as a python hash table with keys corresponding to
            # layer field names and the values of the hash table as the packet field values
            source_ip = packet[IP].src
            dest_ip = packet[IP].dst
            if packet.haslayer(HTTP):   # test for an HTTP packet
                if HTTPRequest in packet:
                    arrival_time = packet.time  # get unix time of the packet
                    print("Got a TCP packet part of an HTTP request at time: %0.4f for server IP %s" % (arrival_time, dest_ip))
        else:
            if packet.haslayer(UDP):
                number_of_udp_packets = number_of_udp_packets + 1
print("Got %d packets total, %d TCP packets and %d UDP packets" % (number_of_packets_total, number_of_tcp_packets, number_of_udp_packets))
Approach: Your program needs to match the HTTP requests with the responses. One approach is to find an HTTP request and remember its arrival time, TCP ports, and IP addresses. The code would then look for the matching HTTP response using the TCP ports and IP addresses (recall these are swapped on the response). When a match is found, the time difference between the request and the response is computed as the server’s response time.
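The matching approach can be sketched without scapy by using simplified records. This is only an illustration: the Record tuple, its field names, and the function match_latencies are hypothetical, and a real program would fill in these values from the scapy packets (for example from packet.time and the IP and TCP layer fields).

```python
from collections import namedtuple

# Hypothetical simplified record; a real program would build these from scapy packets.
Record = namedtuple("Record", "time src sport dst dport is_request")


def match_latencies(records, server_ip, server_port):
    """Pair each request to the server with the next response on the same connection."""
    pending = {}    # (client_ip, client_port) -> request arrival time
    latencies = []
    for r in sorted(records, key=lambda rec: rec.time):
        if r.is_request and r.dst == server_ip and r.dport == server_port:
            pending[(r.src, r.sport)] = r.time
        elif not r.is_request and r.src == server_ip and r.sport == server_port:
            # Addresses and ports are swapped on the response.
            start = pending.pop((r.dst, r.dport), None)
            if start is not None:
                latencies.append(r.time - start)
    return latencies


# Made-up trace: two requests and their responses.
trace = [
    Record(1.000, "10.0.0.5", 50000, "93.184.216.34", 80, True),
    Record(1.003, "93.184.216.34", 80, "10.0.0.5", 50000, False),
    Record(2.000, "10.0.0.5", 50001, "93.184.216.34", 80, True),
    Record(2.004, "93.184.216.34", 80, "10.0.0.5", 50001, False),
]
print(match_latencies(trace, "93.184.216.34", 80))
```

Keying the pending requests by the client address and port keeps the lookup for each response to a single dictionary access.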
Running the program:
Usage: Your program must be named: measure-webserver.py
Input Arguments: a pcapng file, a destination server IP address and a destination port, as:
measure-webserver.py
The code must produce 2 lines of output:
AVERAGE LATENCY:
PERCENTILES: <float-25%>, <float-50%>, <float-75%>, <float-95%>, <float-99%>
Note: values are to be reported in seconds. Use at least 5 digits of accuracy to the right of the decimal point. The autograder will give a +/- 5% tolerance on the answer.
Included project files and scripts:
You are provided with the following goodies:
● example-scapy.py
○ The example code above.
● pcap1.pcap
● pcap2.pcap
● pcap3.pcap
These are test input files, described below.
Test Input Files:
The following input files are provided to test your program.
The output is also provided.
pcap1.pcap:
Website: https://example.com
Destination address: 93.184.216.34
Port number: 80
AVERAGE LATENCY: 0.00290
PERCENTILES: 0.00271 0.00283 0.00300 0.00355 0.00372
pcap2.pcap:
Website 1: https://example.com
Destination address: 93.184.216.34
Port number: 80
AVERAGE LATENCY: 0.00292
PERCENTILES: 0.00267 0.00278 0.00293 0.00358 0.00570
Website 2: https://apple.com
Destination address: 17.253.144.10
Port number: 80
AVERAGE LATENCY: 0.00287
PERCENTILES: 0.00204 0.00218 0.00248 0.00806 0.01423
pcap3.pcap:
Website 1: https://info.cern.ch
Destination address: 188.184.100.182
Port number: 80
AVERAGE LATENCY: 0.08570
PERCENTILES: 0.08550 0.08564 0.08579 0.08603 0.08891
Website 2: https://neverssl.com
Destination address: 34.223.124.45
Port number: 80
AVERAGE LATENCY: 0.08284
PERCENTILES: 0.08193 0.08267 0.08340 0.08534 0.08583
Handing in Assignment:
You need to upload your code to Gradescope. Log into Canvas and use the Gradescope tool on the left. The assignment is called “HTTP Server”. Upload a Python file called measure-webserver.py.
How to add group members in Gradescope:
https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members
Extra Credit (+50%): Computing the Kullback–Leibler divergence
We could model the time it takes an HTTP server to process a request as an exponentially distributed service time with a mean service rate of λ. The exponential distribution in this scenario models the probability of the time it takes to service the request.
The rate parameter λ can be estimated from the observed average of a real web server. However, the true distribution of server responses might not be well modeled by an exponential distribution. The Kullback-Leibler divergence is a method of comparing two distributions, so we can use the KL-divergence as a metric to evaluate how closely the exponential distribution matches the observed distribution of the time it takes to service web requests. If the distributions are “close”, the exponential distribution is a good approximation of real servers.
If the distributions are “far” apart, that is, the KL-divergence is large, another distribution would be a preferable model. Recall we can model a discrete probability distribution as a number of “buckets”, where each bucket holds the probability of some outcome. In our case, a bucket is the probability that the time of an HTTP request falls in some range. For example, the probability that a web request takes between 0.02 and 0.03 seconds might be 25%.
The sum of all the bucket probabilities must add to 1. Given two discrete probability distributions, the KL-divergence is easy to compute. One distribution will come from the measured responses, which we will call the “measured distribution”. We will get the second distribution from the exponential function, which we will call the “modeled distribution”.
In this assignment we will use 10 buckets. For the measured distribution, compute a range from zero seconds to the maximum observed latency and divide this time range into 10 equal pieces. Next, assign each observed latency sample to a bucket. Dividing each bucket count by the total number of observations gives the measured distribution. Use the mean response time to create an exponential model distribution. Note that the exponential parameter is a rate, so you need to set the rate to λ = 1.0 / (mean response time).
To get the probability mass of each bucket, compute the exponential distribution’s CDF at each of the bucket’s two boundaries, and subtract the lesser value (at the lower boundary) from the greater value (at the upper boundary); this difference in the CDF values gives the probability mass over the time range. Note that for the last bucket, you may have to make the time range at the edge of the distribution very large so that the total mass adds up close to 1.
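The bucket computation described above can be sketched as follows. This is a minimal sketch under assumptions the text does not fix: it uses the natural logarithm, computes the divergence of the measured distribution relative to the modeled one as the sum of p_i * log(p_i / q_i), skips empty measured buckets, and extends the last bucket to infinity. The function names and sample latencies are made up.

```python
import math


def measured_distribution(latencies, buckets=10):
    """Fraction of samples in each of `buckets` equal-width ranges from 0 to the maximum."""
    width = max(latencies) / buckets
    counts = [0] * buckets
    for x in latencies:
        index = min(int(x / width), buckets - 1)  # the maximum goes in the last bucket
        counts[index] += 1
    return [c / len(latencies) for c in counts]


def modeled_distribution(latencies, buckets=10):
    """Exponential probability mass in the same buckets, with rate = 1 / mean."""
    rate = 1.0 / (sum(latencies) / len(latencies))
    width = max(latencies) / buckets

    def cdf(t):
        return 1.0 - math.exp(-rate * t)

    edges = [i * width for i in range(buckets + 1)]
    edges[-1] = math.inf  # assumption: the last bucket absorbs the whole tail
    return [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]


def kl_divergence(p, q):
    """Sum of p_i * log(p_i / q_i), skipping buckets where p_i is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up latencies in seconds.
samples = [0.1, 0.2, 0.3, 0.4, 1.0]
p = measured_distribution(samples)
q = modeled_distribution(samples)
print("KL DIVERGENCE: %.5f" % kl_divergence(p, q))
```

Switching math.log to math.log2 would report the divergence in bits instead of nats; which base the reference outputs use is not stated here.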
Pcap1:
Website: https://example.com
Destination address: 93.184.216.34
Port number: 80
AVERAGE LATENCY: 0.00290
PERCENTILES: 0.00271 0.00283 0.00300 0.00355 0.00372
KL DIVERGENCE: 2.87489
Pcap2:
Website 1: https://example.com
Destination address: 93.184.216.34
Port number: 80
AVERAGE LATENCY: 0.00292
PERCENTILES: 0.00267 0.00278 0.00293 0.00358 0.00570
KL DIVERGENCE: 2.87489
Website 2: https://apple.com
Destination address: 17.253.144.10
Port number: 80
AVERAGE LATENCY: 0.00287
PERCENTILES: 0.00204 0.00218 0.00248 0.00806 0.01423
KL DIVERGENCE: 1.45393
Pcap3:
Website 1: https://info.cern.ch
Destination address: 188.184.100.182
Port number: 80
AVERAGE LATENCY: 0.08570
PERCENTILES: 0.08550 0.08564 0.08579 0.08603 0.08891
KL DIVERGENCE: 1.34722
Website 2: https://neverssl.com
Destination address: 34.223.124.45
Port number: 80
AVERAGE LATENCY: 0.08284
PERCENTILES: 0.08193 0.08267 0.08340 0.08534 0.08583
KL DIVERGENCE: 1.3453
References:
● Scapy Main Site: https://scapy.net/
● Scapy documentation: https://scapy.readthedocs.io/en/latest/
● More example code using scapy that uses trace files: https://medium.com/a-bit-off/scapy-ways-of-reading-pcaps-1367a05e98a8
● KL-divergence definition: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
● Another KL-divergence description: https://towardsdatascience.com/understanding-kl-divergence-f3ddc8dff254
● Computing the KL-divergence in Python: https://machinelearningmastery.com/divergence-between-probability-distributions/
● Modeling service delay with the exponential distribution: https://courses.lumenlearning.com/introstats1/chapter/the-exponential-distribution