CPSC 418 / MATH 318 Introduction to Cryptography ASSIGNMENTS 1 to 3 solutions

$75.00

Original Work ?
Category: You will Instantly receive a download link for .ZIP solution file upon Payment

Description

5/5 - (1 vote)

CPSC 418 / MATH 318 Introduction to Cryptography ASSIGNMENT 1

Problems 1-4 ,
Problems 7 and 8 only
Written Problems for CPSC 418 and MATH 318
Problem 1 — Password length and entropy (20 marks)
ASCII, short for American Standard Code for Information Interchange, is a method for
encoding any character into a string of 7 bits. For example, the ASCII codes for ‘A’, ‘a’,
and ‘?’ are 1000001, 1100001, and 0111111, respectively1
. ASCII has long been superseded
by the much more modern and extensive UTF-8 encoding method; ASCII codes constitute a
very small subset of UTF-8 codes.
A password can only contain printable characters that appear on a standard US English
keyboard. These include 26 upper case letters, 26 lower case letters, 10 numerical digits
and the 32 special characters `'”.,;:!?~@#$%^&*_-+=(){}[]<>\/|. Not all ASCII codes
correspond to printable characters.
a. (2 marks) What is the total number of ASCII encodings of 8-character strings?
b. What is the number of passwords of length 8 consisting of
(i) (2 marks) any printable characters?
(ii) (2 marks) lower case letters only?
c. Approximately what percentage of 8-character ASCII encodings are passwords consisting
of
(i) (2 marks) any printable characters?
(ii) (2 marks) lower case letters only?
d. Assuming that each character in a password is chosen equally likely, what is the entropy
of the space of passwords consisting of
(i) (2 marks) 8 printable characters?
(ii) (2 marks) 8 lower case letters?
e. Suppose we want a password space with entropy 128, i.e. a total of 2
128 passwords assuming that all passwords are equally likely.2 What is the minimum password length, i.e.
the minimum number of characters in a password, to achieve this entropy, assuming that
passwords consist of
(i) (3 marks) any printable characters?
(ii) (3 marks) lower case letters only?
1Source: https://www.ascii-code.com.
2For modern cryptosystems, 2
128 is a typical key space size. Of course the assumption that all passwords
are equally likely is a stretch: most of us use only a small handful of our favourite passwords.
2
Problem 2 — One-time pad without the all-zeros key (6 marks)
Recall that in the one-time pad, plaintexts, ciphertexts and keys are n-bit blocks for some
n ∈ N. For any key K, plaintext block M and ciphertext block C, the encryption of M with
key K is EK(M) = M ⊕ K and decryption of C with key K is DK(C) = C ⊕ K, where
⊕ denotes bitwise x-or (or equivalently, bitwise addition modulo 2). Assume as always that
keys are chosen with equal likelihood.
When using the one-time pad with the key K = 0n
consisting of n zeros, we have EK(M) =
M⊕K = M for every paintext M, i.e. M is left unencrypted. It has been suggested to improve
the security of the one-time pad by only encrypting with non-zero keys, i.e. removing 0
n
from
the key space.
a. (4 marks) Use either the definition of perfect secrecy,
p(M) = p(M|C) for all plaintexts M and ciphertexts C with p(C) > 0
or its equivalent characterization
p(C) = p(C|M) for all plaintexts M with p(M) > 0 and ciphertexts C with p(C) > 0
to formally prove that this modification of the one-time pad does not provide perfect
secrecy. Specifically, you will need to give a counterexample, i.e. exhibit a pair (M, C)
that violates the definition or the above characterization of perfect secrecy. Answers that
argue informally or do not use this characterization will receive very little if any credit.
b. (2 marks) Explain (informally) why allowing K = 0n does not weaken the security of the
one-time pad, even though using that key does not hide the plaintext when encrypting.
(Note: the assertion that choosing K = 0n happens so rarely that we don’t have to worry
about it is not a valid answer and will not receive any credit.)
Problem 3 — Weak collisions (16 marks)
The cryptographic relevance of this problem will become evident when we cover hash functions
in class.
For each question below, provide a brief explanation and a compact formula for your answer.
For part (d), give a numerical value that is an integer.
Let n be positive integer. Consider an experiment involving a group of participants, where
we assign each participant a number that is randomly chosen from the set {1, 2, . . . , n} (so all
these assignments are independent events). Note that we allow for the possibility of assigning
the same number to two different participants.
Now pick your favourite number N between 1 and n. When any one of the participants is
assigned the number N, we refer to this as a weak collision (with N). In this problem, we
determine how to ensure at least a 50% chance of a weak collision in our experiment.
3
a. (2 marks) What is the probability that a given participant is assigned your favourite
number N?
b. (2 marks) What is the probability that a given participant is not assigned the number N?
c. (3 marks) Suppose k people participate in the experiment (for some positive integer k).
What is the probability that none of them is assigned the number N, i.e. that there is no
weak collision?
d. (4 marks) Intuitively, the more people participate, the likelier we encounter a weak collision. We wish to find the minimum number K of participants required to ensure at least
a 50% chance of a weak collision.
Suppose n = 10. What is the threshold K in this case?
e. (5 marks) Generalizing part (d) from n = 10 to arbitrary n, prove that if the number
of participants is above log(2)n ≈ 0.69n, then there is at least a 50% chance of a weak
collision. Use (without proof) the inequality
1 − x < exp(−x) for x > 0 . (1)
(This inequality comes from the Taylor series
exp(−x) = 1 − x +
x
2
2! −
x
3
3! +
x
4
4! − · · · + · · · ,
and the sum of the terms from x
2/2 onwards is positive.)
Concluding Remark. In (1), when x is small, exp(−x) is very close to 1 − x; for example,
exp(0.01) and 1−0.01 are within 4 decimal places of each other. This implies that for large n,
the threshold 0.69n of part (e) is close to optimal, i.e. we expect the necessary and sufficient
number of participants for ensuring a weak collision with at least 50% probability to be on
the order of n. Compare this to the corresponding result for (strong) collisions derived in the
next problem.
Problem 4 — (Strong) collisions (20 marks)
The results of this problem bear relevance to Problem 7 below. Further cryptographic significance will become evident when we cover hash functions in class.
For each question below, provide a brief explanation and a compact formula for your answer.
For part (c), give a numerical value that is an integer.
We consider the same experiment as in the previous problem, although here, you don’t need
to pick a favourite number N. When at least two of the participants are assigned identical
numbers, we refer to this as a strong collision or just a collision. In this problem, we determine
how to ensure at least a 50% chance of a collision in our experiment.
4
a. (4 marks) What is the probability that among k participants, no collision occurs?
b. (2 marks) What is the probability of a collision among k participants?
c. (4 marks) Once again, intuitively, the more people participate, the likelier we encounter
a collision. We wish to find the minimum number K of participants required to ensure
at least a 50% chance of a collision.
Suppose n = 10. What is the the threshold K in this case?
d. (5 marks) Let P be the probability of no collisions as computed in part (a). Prove that
P ≤ exp 

k(k − 1
2n

.
First, if your expression obtained for P in part (a) is not already in this form, rewrite P
as
P =
k
Y−1
i=1
(1 − zi) ,
where 0 < zi < 1 for 1 ≤ i ≤ k − 1 (you’ll have to figure out the appropriate expression
for zi). Then use inequality (1) from the previous problem.
e. (5 marks) Similarly to the previous problem, when n is much larger than the number k
of participants, the upper bound on P in part (d) is a very close approximation to P.
When in addition k is not too small (but still much smaller than n), k is very close to
k − 1. This means that the quantity exp(−k
2/2n) is a very close approximation to the
probability P of part (a).
Generalizing part (c) from n = 10 to arbitrary n, prove that if the number of participants
is above p
log(4)n ≈ 1.177√
n, then there is at least a 50% chance of a collision. You
may replace the actual expression for P derived in part (a) by exp(−k
2/2n).
Note: you can solve this question even if you didn’t attempt parts (a)-(d).
Concluding Remark. As in the previous problem, for large n, the threshold 1.177√
n of part
(e) is close to optimal, i.e. we expect the necessary and sufficient number of participants for
ensuring a collision with at least 50% probability to be on the order of √
n. Compare this to
the corresponding result for weak collisions derived in the previous problem.
5
Written Problems for MATH 318 only
Submit your answers to this problem in a separate PDF file. Do not include the solution
to this problem in the PDF file containing your solutions to Problems 1-4.
Let F2 = {0, 1} with the usual arithmetic modulo 2. The set F
n
2 = {0, 1}
n
consisting of all
n-bit vectors (vectors of length n with entries 0 or 1) is an n-dimensional vector space over
F2, with the standard basis for example. Linear algebra in F
n
2
as a vector space over F2 works
just like linear algebra in R
n as a vector space over R, except that linear combinations of
vectors in F
n
2 have coefficients 0 and 1, and all arithmetic is done modulo 2.
For any n ∈ N, let Gln(F2) denote the set of invertible n × n matrices with zeros and ones
as entries. Calculations involving such matrices again work exactly the same as the familiar
matrix arithmetic over R that you learned in first year linear algebra, except that arithmetic
using real numbers is again replaced by arithmetic modulo 2.
Now fix n ∈ N and consider the class of linear ciphers with M = C = F
n
2
, K = Gln(F2), and
for all plaintexts ~m (interpreted as n-bit column vectors with entries 0 and 1), encryption
under a key matrix K is
EK(~m) = K ~m . (2)
Then for all ciphertexts ~c, decryption under K is obviously
DK(~c) = K−1
~c .
Problem 5 — Transposition ciphers are linear (8 marks)
Recall that a transposition cipher produces a ciphertext by permuting the plaintext symbols.
For a transposition cipher operating on bit strings of some length n ∈ N, encryption of a bit
string (m1, m2, . . . , mn) can thus be written as
Eσ(m1, m2, . . . , mn) = (mσ(1), mσ(2), . . . , mσ(n)
) ,
where σ is a secret key permutation on n symbols.
a. (3 marks) Let n = 4 and let σ be the permutation sending 1 to 2, 2 to 4, 3 to 3 and 4 to
1. Then transposition encryption of a 4-bit string using the key permutation σ is given
by
Eσ(m1, m2, m3, m4) = (m2, m4, m1, m3) . (3)
Now write 4-bit strings (x1, x2, x3, x4) as column vectors ~x =
 x1
x2
x3
x4

. Prove that the
cipher given in (3) is a linear cipher by explicitly writing down a permutation matrix K
that satisfies (2) for this cipher. A permutation matrix is a a square matrix whose entries
consist of zeros and ones such that there is exactly one one and n − 1 zeros in each row
and in each column.
b. (5 marks) Generalizing part (a), formally prove that a transposition cipher operating on
bit strings of any length n is a special case of a linear cipher where the matrix K in (2)
is a permutation matrix. Use formal reasoning and good notation for your proof.
6
Problem 6 — Cryptanalysis of linear ciphers (30 marks)
In this problem, we explore attacks on linear ciphers as given in (2).
a. (3 marks) If each key matrix K is chosen equally likely, does this class of linear ciphers
provide perfect secrecy? Formally prove your claim.
b. (3 marks) Explain how an attacker Eve can mount a chosen plaintext attack on a cipher
of the form (2). The goal of this attack is to chose one or more plaintexts, obtain their
encryptions under some unknown key matrix K, and derive K. How should Eve choose
her plaintexts, and how many does she need to choose in order to be successful?
c. (3 marks) Let ~m1, ~m2, . . . , ~mi be any collection of i linearly independent vectors in F
n
2
for
some i with 1 ≤ i ≤ n. Prove that there are 2
i vectors ~mi+1 ∈ F
n
2
such that the vectors
~m1, ~m2, . . . , ~mi
, ~mi+1 are linearly dependent.
d. (2 marks) As in part (c), let ~m1, ~m2, . . . , ~mi be linearly independent vectors in F
n
2
. How
many vectors ~mi+1 ∈ F
n
2
are there so that the vectors ~m1, ~m2, . . . , ~mi
, ~mi+1 are linearly
independent?
e. (5 marks) Prove that the number of sets consisting of n linearly independent vectors in
F
n
2
is
1
n!
nY−1
i=0
(2n − 2
i
) .
f. (5 marks) Prove that the probability that any set of n vectors in F
n
2
is linearly independent
is
Pn =
nY−1
i=0
(2n − 2
i
)
nY−1
i=0
(2n − i)
.
g. (2 marks) Suppose n = 4. What is the probability that any set of 4 vectors in F
4
2
is
linearly dependent? Express the result as a fraction.
h. (6 marks) Part (b) considered a chosen plaintext attack, whereas we will now consider the
scenario of mounting a known plaintext attack on a linear cipher as given in (2). Given a
set of known plaintext/ciphertext pairs where all the plaintexts were encrypted to their
respective corresponding ciphertexts using the same unknown key matrix K, the goal of
this attack is to find K.
Assume that linear dependence of any collection of n plaintexts in F
n
2
represents independent events, i.e. if p is the probability that any n plaintexts are linearly dependent,
then p
2
is the probability that any two collections of n plaintexts are linearly dependent.
Explain how Eve can use multiple attempts at a known plaintext attack to find the key
matrix. For n = 4, how often does Eve have to try to achieve a 99 percent chance of
success? In other words, what is the maximum number k of failed attempts such that
Eve has a 99% chance of finding K on the (k + 1)-st attempt?
7
Programming Problems for CPSC 418 only
Specifications. Design and implement your solutions using Python 3 and the template
programs provided on the Piazza “Resources” tab. Please do not to alter the filenames,
functions, or their input and output types. Submissions that do not comply with these
specifications will be penalized or not marked at all. Use the latest version of the Python
cryptography library found at
https://cryptography.io/en/latest/
This link can also be found on the “references” page of our course website. Use this library
for all required cryptographic primitives. This includes encryption, decryption, hashing and
the PKCS7 padding module.
The cryptography library has its own interface to which you are expected to adhere. You
must make use of the hazardous materials layer, not the recipes layer. Make sure to use
good coding practices. You can split your program up into multiple source files, so long as
the main executable retains the same name as the template program.
The testing of your program will largely be done on Gradescope using python3. An autograder will be made available to test component functions, while the overall problem solutions
will be tested by the TAs separately.
You may assume that for any input file requirements, Gradescope will use text files of at
most 1 MB in size and that all byte encoding is done using UTF-8.
Submission. Submit a description of your implementation for both problems in a separate
README file in text format. Do not include any written portions of the programming problems in the PDF file containing your solutions to any of the written
problems. Your description must include the following:
• A list of the files you have submitted that pertain to the problem, and a short description
of each file.
• A list of what is implemented in the event that you are submitting a partial solution,
or a statement that the problem is solved in full.
• A list of what is not implemented in the event that you are submitting a partial solution.
• A list of known bugs, or a statement that there are no known bugs.
• Any other answers to questions specified in the problem.
8
Problem 7 — Prefix collision finder (12 marks)
Overview. Bob is interested in storing the passwords for users of his website in hashed
form using SHA-2 224 to produce the hashed outputs. However, for space reasons, he is
considering storing only the first 5 bytes of each SHA-2 224 hash tag. As his friend, you want
to convince Bob that this is a terrible idea by demonstrating that you can easily find two
inputs whose SHA2 224 outputs have the same first 5 bytes.
Problem. Finish the template program named
prefix_collision.py
which finds two distinct bytes objects whose images under SHA-2 224 have the same 5-byte
(40-bit) prefix.
Specifications. You will need to complete three separate functions:
1. hash_bytes(), which takes a sequence of bytes and returns a SHA-2 224 hash value.
2. compare_bytes(), which compares the first n bytes of two sequences and checks if they
match.
3. find_collision(), which uses the other functions to find two byte sequences with hash
values that match for the first n bytes.3
In order for your code to run efficiently you must utilize memory in a smart way. The number
of SHA-2 224 tags that you compute and store should be on the order of √
2
b = 2b/2
, where b
is the number of bits in the prefix. You should also make use of an appropriate data structure
for quickly detecting duplicates.
Submission. Please submit a completed version of the template program, with the filename
prefix_collision.py
If you’ve spread your code across multiple source files, submit all of them. If your code takes
more than 2 minutes on Gradescope’s servers for a 5-byte prefix (n = 5), it will be timed
out, so you will need to do something smarter than hashing every single input one-by-one.
Problem 8 — Password attack on authenticated encryption (26 marks)
Overview. Suppose Bob has designed his own authenticated encryption scheme, using the
hash-then-encrypt paradigm, with AES-128 in CBC mode for encryption, and SHA-2 224 for
the key derivation function and message hash tag. On input the name of a plaintext file and
a password p, Bob’s program does the following:
1. Converts the plaintext file to a byte array B.
3
If you’re an experienced coder, you might be happy to learn that we won’t verify you actually used those
prior functions to help create this one. They exist to provide guidance and partial marks for beginning coders.
9
2. Computes a hash tag t on the plaintext by applying SHA-2 224 to the byte array B, then
appends t to B to obtain an extended byte array B0 = B||t (here, as always, “||” denotes
concatenation).
3. Derives an encryption key by applying SHA-2 224 to the password p and truncating the
result to the appropriate length for use in AES-128.
4. Generates a random 16-byte initial value IV (for use in CBC mode) and writes it to a
file F.
5. Pads the extended byte array B0 using the PKCS7 format, then encrypts the padded array
with AES-128-CBC and appends the resulting ciphertext to the file F.
Out of laziness, Bob is known to use strings of the form YYYYMMDD, which mark certain
dates from his life, as his passwords. Bob was born in the year 1984, and always includes the
string FOXHOUND in his communications.
Problem. Create a Python 3 program that takes as input a file produced by Bob’s hashthen-encrypt routine and performs the following tasks:
1. Determines the password used to derive the encryption key, and prints it out.
2. Decrypts the ciphertext using this key.
3. Checks the resulting plaintext for the phrase CODE-RED. If present, the program replaces
this phrase with the phrase CODE-BLUE and writes the modified plaintext to a new file.
If not present, the plaintext is left unchanged.
4. Computes a new hash tag on the possibly-modified plaintext.
5. Generate a new 16-byte IV.
6. Re-encrypt the possibly-modified plaintext plus new tag using the same password and
exactly the same process as Bob’s hash-then-encrypt program.
7. Writes the result to a new file. Recall that AES-128-CBC has a block size of 16 bytes.
Specifications. Fill in the empty functions in the template program encrypt_modify.py.
Once complete, you should be able to replicate Bob’s encryption scheme by running
python3 encrypt_modify.py –encrypt [PLAINTEXT] –password “YYYYMMDD”
[CYPHERTEXT]
where PLAINTEXT is the name of the file to by encrypted, YYYYMMDD is a number Bob would
use for a password, and CYPHERTEXT is name of the file to store the output. Cracking the
password, doing the text substitution, and reencrypting the result can be performed by
running
python3 encrypt_modify.py –modify [ORIGINAL CYPHERTEXT] [NEW CYPHERTEXT]
where the input file ORIGINAL CYPHERTEXT is the output of the encryption process, and NEW
CYPHERTEXT is an encrypted version of the possibly-modified plaintext. Don’t forget to print
out the password to the terminal.
In detail, the functions you’ll need to fill in are:
10
1. create_iv( length ), which creates an IV that’s length bytes long.
2. derive_key( input ), which takes a string and hashes it to form a key for AES-128.
3. pad_bytes( input ), which pads input so it can be encoded by AES-128-CBC.
4. encrypt_bytes( input, key, iv ), which encrypts input with the given key and IV.
5. hash_pad_then_encrypt( input, string, iv ), which uses the above functions to perform Bob’s encryption algorithm.
6. check_tag( input ), which verifies the tag appended to input matches the hash of the
remaining bytes.
7. decrypt_unpad_check( input, string ), which reverses Bob’s encryption algorithm,
and both verifies and strips off the tag.
8. generate_passwords( year, month, day ), which generates a list of passwords to try,
according to Bob’s format, with the earliest starting on year/month/day.
9. determine_password( input ), which tries to brute-force decrypt input using generate_passwords(
year, month, day ) and decrypt_unpad_check( input, string ).
10. attempt_substitute( input, codeword, target, substitute ), which combines all
of the above to do the decryption, substitution, and re-encryption.
Additional details are in the template program encrypt_modify.py, on the Piazza resources
page.
Submission. Please submit a completed version of the template program, with the filename
encrypt_modify.py
If you’ve spread your code across multiple source files, submit all of them.
11

CPSC 418 / MATH 318 — Introduction to Cryptography ASSIGNMENT 2

Problems 1-5 and
Problem 9 only
Written Problems for CPSC 418 and MATH 318
Problem 1 — Arithmetic in the AES MixColumns operation (20 marks)
Recall that the MixColumns operation in AES performs arithmetic on 4-byte vectors using the
polynomial M(y) = y
4 + 1. In this arithmetic, we have M(y) = 0, so y
4 = 1.
a. In this part of the problem, we consider multiplication of 4-byte vectors (viewed as polynomials
of degree ≤ 3 whose coefficients are bytes) by powers of y.
(i) (2 marks) Formally prove that in this arithmetic, multiplication of any 4-byte vector by
y is a circular left shift of the vector by one byte.
(ii) (4 marks) Generalizing part (i) to other powers of y, formally prove that multiplication
of any 4-byte vector by y
i
(0 ≤ i ≤ 3) is a circular left shift of the vector by i bytes.
b. Next, we consider arithmetic involving the coefficients of the polynomial
c(y) = (03)y
3 + (01)y
2 + (01)y + (02) ,
that appears in MixColumns, where the coefficients of c(y) are bytes written in hexadecimal
(i.e. base 16) notation. Arithmetic involving this polynomial requires the computation of products involving the bytes (01), (02) and (03) in the Rijndahl field GF(28
). Recall that in this
field, arithmetic is done modulo m(x) = x
8 + x
4 + x
3 + x + 1.
(i) (2 marks) Write the bytes (01), (02), (03) as their respective polynomial representatives
c1(x), c2(x) and c3(x) in the Rijndahl field GF(28
).
(ii) (4 marks) Let b = (b7 b6 · · · b1 b0) be any byte, and let d = (02)b be the product of
the bytes (02) and b in the Rijndahl field GF(28
). Then d is again a byte of the form
d = (d7 d6 · · · d1 d0). Provide formulas for the bits di
, 0 ≤ i ≤ 7, in terms of the bits bi
.
(iii) (3 marks) Provide analogous expressions as in part (b) (ii) for the byte product e = (03)b,
where b = (b7 b6 · · · b1 b0) is any byte.
c. The MixColumns operation performs multiplication of 4-byte vectors by the polynomial c(y)
of part (b). In this part of the problem, you will evaluate such products symbolically.
(i) (3 marks) Let s(y) = s3y
3 + s2y
2 + s1y + s0 be a polynomial whose coefficients are bytes.
Symbolically compute the product t(y) = s(y)c(y) mod y
4 + 1. The result should be a
polynomial of the form t(y) = t3y
3 + t2y
2 + t1y + t0 where t3, t2, t1, t0 are bytes. Provide
formulas for the bytes ti
, 0 ≤ i ≤ 3, in terms of the bytes si
. The equations should consist
of sums of byte products of the form 01si
, 02si
, 03si
, 0 ≤ i ≤ 3. You need not compute
these individual byte products as you did in part (b).
(ii) (2 marks) Write your solution of part (c) (i) in matrix form; i.e. give a 4 × 4 matrix C
whose entries are bytes such that


t0
t1
t2
t3

 = C


s0
s1
s2
s3

 .
Note that this yields the matrix representation of MixColumns presented (without
proof) in class.
2
Problem 2 — Error propagation in block cipher modes (12 marks)
Error propagation is often an important consideration when choosing a mode of operation in practice. In this problem, you will analyze the error propagation properties of an arbitrary block cipher
in various such modes; note that these properties are independent of the cipher used.
a. Suppose Alice wants to send a sequence of message blocks M0, M1, M2, . . . to Bob, encrypted
to ciphertext blocks C0, C1, C2, . . . using some fixed key K. Assume that an error occurs during
transmission of a particular block of ciphertext Ci
. Justifying all your answers, explain which
plaintext blocks are affected after Bob decrypts this (faulty) ciphertext block Ci when the cipher
is used in
(i) (2 marks) ECB mode?
(ii) (2 marks) CBC mode?
(iii) (2 marks) OFB mode?
(iv) (2 marks) CFB mode with one register?
(v) (2 marks) CTR mode?
b. (2 marks) Suppose now that an error occurs in a particular plaintext block Mi before Alice
encrypts it and sends the corresponding ciphertext Ci to Bob. Upon decryption of Ci
, which
plaintext blocks Mj are affected when using the cipher in CBC mode?
Problem 3 — Binary exponentiation (13 marks)
Recall the exponentiation algorithm given in class for evaluating a
n
(mod m) (a ∈ Z, m, n ∈ N):
1. Compute the binary representation of n:
n = b02
k + b12
k−1 + · · · + bk−12 + bk ,
with b0 = 1, bi ∈ {0, 1} for 1 ≤ i ≤ k, and k = blog2 nc.
2. Initialize r0 ≡ a (mod m).
3. For 0 ≤ i ≤ k − 1 compute ri+1 =
(
r
2
i
(mod m) if bi+1 = 0 ,
r
2
i
a (mod m) if bi+1 = 1 .
4. Output rk.
a. (3 marks) To warm up with a toy example, compute 1711 (mod 77) using the procedure above;
answers that don’t use the binary exponentiation algorithm will receive no credit, even if they
are correct. Show all your work, and write down all your intermediate quantities bi and ri
.
Your answer should be an integer between 0 and 76.
b. In this problem, you will formally prove that the binary exponentiation algorithm is correct.
(i) (4 marks) Define s0 = 1 and si+1 = 2si + bi+1 for 0 ≤ i ≤ k − 1. Use induction on i to
prove that
si =
X
i
j=0
bj2
i−j
for 0 ≤ i ≤ k .
3
(ii) (4 marks) Let ri
, 0 ≤ i ≤ k, be defined as in steps 2 and 3 of the exponentiation algorithm.
Use induction on i to prove that ri ≡ a
si (mod m) for 0 ≤ i ≤ k.
(iii) (2 marks) Prove that a
n ≡ rk (mod m), so the algorithm above does indeed compute a
n
(mod m) as claimed.
Problem 4 — A modified man-in-the-middle attack on Diffie-Hellman (10 marks)
Suppose Alice and Bob wish to generate a shared cryptographic key using the Diffie-Hellman
protocol. As usual, they agree on a large prime p and a primitive root g of p. Suppose also that
p = mq + 1 where q is prime and m is very small (so p − 1 = mq has a large prime factor, as is
generally required). Since g and p are public, it is easy for anyone to deduce m and q; for example
by successively trial-dividing p − 1 by m = 2, 4, 6, . . . and running a primality test such as the
Fermat test on the quotient q = (p − 1)/m until primality of q is established.
Suppose an active attacker Mallory1
intercepts g
a
(mod p) from Alice and g
b
(mod p) from Bob.
She sends (g
a
)
q
(mod p) to Bob and (g
b
)
q
(mod p) to Alice.
a. (2 marks) Show that Alice and Bob compute the same shared key K under this attack.
b. (4 marks) Show that there are m possible values for K, and that Mallory can compute them
all and hence easily guess the correct key K among them.
c. (4 marks) What is the advantage of this variation of the man-in-the-middle attack over the
version we discussed in class? Recall that for the attack from class, Mallory simply suppresses
the messages g
a
(mod p) and g
b
(mod p) between Alice and Bob and replaces them with her
own number g
e
(mod p), which results in the shared key g
ae (mod p) between Mallory and
Alice and the shared key g
be (mod p) between Mallory and Bob.
Problem 5 — A simplified password-based key agreement protocol (8 marks)
The following is a simplified (and hence problematic) version of the key generation phase of the
password-based key agreement protocol that you are being asked to implement in Problem 9 (the
programming problem). Here, a client first performs a one-time registration of their authentication
credentials with a server. These credentials can then be used to generate authenticated session keys
between server and client via communication over an insecure channel.
All participants agree on a large public prime2 N = 2q + 1, with q prime, and a public primitive
root g of N. Each client has their own password p. To register with the server, a client computes
v ≡ g
p
(mod N) and provides the server with the pair (I, v) where I is the client’s user id.3
Protocol:
1. Client generates a random value a with 0 ≤ a ≤ N − 1, computes A ≡ g
a
(mod N), and sends
(I, A) to server, where I is the Client’s user id.
1This is a standard name for active attackers and is meant to be reminiscent of the word “malicious”.
2We denote this prime by N, rather than p, because the letter p is reserved for the client’s password.
3
In practice, this needs to be done in a secure and tamper-proof manner. Also, in the computation of v, the client
would use a hash of their password p rather than just p. For details, see the protocol description in Problem 9.
4
Server generates a random value b with 0 ≤ b ≤ N − 1, computes B ≡ g
b
(mod N), and sends
B to client.
2. Client computes Kclient ≡ Ba+p
(mod N).
Server retrieves client’s authentication data (I, v) and computes Kserver ≡ (Av)
b
(mod N).
Note that this protocol is similar to Diffie-Hellman, except that the client’s password p and authentication credential v are incorporated in the key computation.
a. [2 marks] Prove that Client and Server have a shared key after executing this protocol, i.e.
prove that Kserver = Kclient.
b. [3 marks] Suppose an adversary Mallory obtains client Ian’s authentication data (I, v) (by
intercepting Ian’s transmission to the server upon his registration or by hacking into the server’s
database). Show how Mallory can masquerade as Ian, i.e. execute the protocol with the server
(using a value A of her choice) and generate a valid key Kclient that the server believes is shared
with Ian.
c. [3 marks] Consider the following two problems:
ˆ Key Recovery Problem: Given any values A ≡ g
a
(mod N) and B ≡ g
b
(mod N) and any
v ∈ Z

N , find a key K produced by the protocol above.
ˆ Diffie-Hellman Problem: Given any values A ≡ g
a
(mod N) and B ≡ g
b
(mod N), find a
Diffie-Hellman key K ≡ g
ab (mod N).
Note that the exponents a and b are assumed to be unknown for both these problems. Show
how an attacker Mallory who can solve any instance of the key recovery problem can solve any
instance of the Diffie-Hellman problem. (So informally, breaking the key agreement protocol
above is at least as hard as breaking Diffie-Hellman.)
Written Problems for MATH 318 only
Problem 6 — Discrete logs with respect to different primitive roots (6 marks)
Let p be a prime and g a primitive root of p. Recall that for any a ∈ Z

p
, the discrete logarithm of
a with respect to g is unique integer x with 0 ≤ x ≤ p − 2 and g
x ≡ a (mod p).
Recall that the discrete problem is asserted to be computationally intractable. This raises the
natural question of whether this problem might be easier to solve for some primitive roots than for
others. In this problem you will prove that that the difficulty of the discrete logarithm problem is
independent of the choice of primitive root.
Specifically, let g, h be primitive roots of p and assume that for any element in Z

p
, computing its
discrete logarithm with respect to g is easy. Give an algorithm for easily computing its discrete
logarithm with respect to h.
5
Problem 7 — Primitive roots for safe primes (6 marks)
Let q ≥ 3 be a prime such that p = 2q+1 is also prime. Let g be any primitive root of p. Prove that
with the exception of g
q
(mod p), all the odd powers of g (i.e. g, g3
(mod p), g5
(mod p), . . . , gp−2
(mod p)), are primitive roots of p.
(Hint: the following fact about divisibility, which you may use without proof, might come in handy:
for any three nonzero integers a, b, c, if a is a divisor of the product bc and gcd(a, b) = 1, then a is
a divisor of c.)
Problem 8 — An algorithm for extracting discrete logarithms (25 marks)
Let p be a large prime and g a fixed primitive root of p. Let h ∈ Z

p be the modular inverse of g,
so gh ≡ 1 (mod p). Let a ∈ Z

p
. Define the following lists of elements in Z

p
:
yi ≡ ahi
(mod p), 0 ≤ i ≤ m − 1;
zj ≡ (g
m)
j
(mod p), 0 ≤ j ≤ k − 1.
Here, m is a positive integer (an as yet unspecified parameter) and k is the smallest integer with
k ≥ (p − 1)/m, so k ≥ (p − 1)/m > k − 1.
a. (4 marks) Prove that there exist indices i, j with 0 ≤ i ≤ m − 1 and 0 ≤ j ≤ k − 1 such that
yi = zj .
(Hint: Division with remainder of x by m where x is the (unknown) discrete logarithm of a
with respect to g.)
b. (5 marks) Consider the following procedure for computing the discrete logarithm of a with
respect to g.
1. Compute the list Y = (y0, y1, . . . , ym−1)
2. If yi ≡ 1 (mod p) for some i ∈ {0, 1, . . . , m − 1}, then output i and quit.
3. Compute the list Z = (z0, z1, z2, . . . ,) until an element zj is found that appears in Y.
4. Upon finding such a match, say zj = yi
, output x ≡ jm + i (mod p − 1).
Prove that this procedure terminates and outputs the discrete logarithm of a.
c. (4 marks) The computational cost of this algorithm is determined by the number of modular
multiplications required to generate the lists Y and Z (we may ignore the cost of step 2 and
of searching the list Y for a match to an element in Z in step 3 as these tasks take negligible
time).
Assume the worst case scenario where the entire list Z = (z0, z1, . . . , zk−1) needs to be generated
before a match with an element in the list Y is found. How many modular multiplications are
required to extract the discrete logarithm of a using the procedure above? Your count should
be as accurate as possible (i.e. don’t count modular multiplications that aren’t needed). You
may assume that k and g
m (mod p) have been precomputed as they are independent of a.
6
d. (5 marks) How should m be chosen to minimize the number of modular multiplication determined in part (c)? Explain your choice.
(Hint: Your answer should be a function of p that’s close to √p.)
e. Let p = 107.
(i) (2 marks) Use the primitive root test from class to verify that 2 is a primitive root of p.
Show your work.
(ii) (4 marks) Use the procedure from part (b) and your choice of m from part (d) to compute
the discrete logarithm of 6 with respect to 2 in Z

107. Show your work. You may want to
verify that your final answer is correct.
Programming Problem for CPSC 418 only
Problem 9 — Secure password based authentication and key exchange (37 marks)
Overview. This problem considers the full, secure version of the password-based key agreement
protocol introduced in Problem 5. This protocol, executed by a Client and a Server, allows the
Client to demonstrate to the Server knowledge of a password, but neither the password nor any
other information that could be used to derive the password need to be transmitted. Additionally,
the Server does not store password-equivalent data, so someone who intercepts authentication data
or steals them from the Server’s database is unable to masquerade as the Client without bruteforcing the Client’s password.
To execute the protocol, there is an initialization and registration phase between the Client and
Server. Doing this securely is in itself a complex problem, and so we instead perform a simplified
version as described below.
The Server initializes in the following manner:
1. Generates a 512-bit safe prime N by first generating a random 511-bit prime q and checking
whether N = 2q + 1 is prime. Repeat until a valid N is found.
2. Finds a primitive root g of N, which may be up to 512 bits in size.
3. Computes the quantity k = H(N||g), where ‘||’ denotes concatenation and H is a cryptographically secure hash function.
4. Opens a socket connection on a specified IP and port and attempts to receive a single byte,
whose value indicates the Server response. This byte is limited to the UTF-8 encoding of the
characters ‘r’ indicating Client wishes to initiate registration; ‘p’ indicating the Client wishes
to initiate the protocol; and ‘q’ indicating the Client wishes the Server to shut down.4
The Server performs registration as follows:
1. Server sends the values N, g to the Client.
2. Server receives a single byte containing the length of the Client’s username, followed by username, s, v.
Server stores these values.
3. Server resumes listening on the socket, for another command.
4A true implementation would not have this last feature, but it makes debugging much easier.
7
The Client performs registration as follows:
1. The username username and a password pw are retrieved from the command line.5
2. Generates a random 16-byte salt6
s.
3. Connect to the Server’s socket and sends a one-byte character ‘r’;
4. Receives the values N, g;
5. Computes x = H(s||pw) and v ≡ g
x
(mod N);
6. Computes and sends a single byte containing the byte-length of username, followed by username
encoded in UTF-8, s, and v.
In a similar manner, if a Client connects to the Server and wishes to initiate the protocol, they will
first send a single byte corresponding to ‘p’ before commencing with the protocol described below.
Protocol.
To generate and verify a shared authenticated session key, the Server and Client perform the
following steps:
1. Client generates a random value a with 0 ≤ a ≤ N − 1, computes A ≡ g
a
(mod N), and sends
the length of the username as a single byte, followed by username, A.
Server generates a random value b with 0 ≤ b ≤ N − 1, computes B ≡ kv + g
b
(mod N), and
sends s, B, where s is the Client’s salt.
2. Client and Server compute u ≡ H(A||B) (mod N).
3. Client computes Kclient ≡ (B − kv)
a+ux (mod N).
Server computes Kserver ≡ (Avu
)
b
(mod N).
4. Client computes and sends M1 = H(A||B||Kclient).
5. Server computes H(A||B||Kserver) and compares the result to M1. If they do not match, the
authentication has failed and the socket is closed.
6. Server computes and sends M2 = H(A||M1||Kserver).
7. Client computes H(A||M1||Kclient) and compares the result to M2. If they do not match, the
authentication has failed.
8. The Server closes the socket and waits for further connections.
Steps 1-3 generate the authenticated key shared between the Client and the Server. Steps 4-7 verify
that both parties have computed the same shared key. If executed honestly, Kclient and Kserver are
equal and the Server and Client were able to authenticate each other and establish a shared session
key.
ˆ At the end of either registration or the protocol, the Server should resume listening on the
socket.
5The program template will do this for you, as well as check that the length of the username encodes to less than
255 bytes.
6
In cryptography, a salt is a random piece of data used as an additional input to a one-way function that hashes
data, a password or passphrase.
8
ˆ Note that in order to reliably receive information from the socket, the byte length of the data
must typically be known. Unless otherwise specified, the size of any value taken modulo N
should be the same length as N (up to 512 bits). Since SHA2-256 is used, any hashed value
that is not taken modulo N is therefore 256 bits in size. The encoded username may be up
to 255 bytes in length.
ˆ Please note the importance of the order that values are sent over the socket.
Problem. Your task is to complete the template program named
basic auth.py
which performs the above password-based key agreement protocol. The program consists of functions corresponding to establishing a connection and transmitting data through a socket, parameter
generation, and the actions performed by the Client and the Server.
All messages over the socket should be echoed to standard output by both the sending and receiving
party. Each echoed message should clearly indicate its sender and intended receiver. Use a template
like the following:7
Server: N = (integer value of N)
Client: Received <(hex representation of incoming data)>
Server: Authentication was successful.
Client: ERROR, the socket was closed.
For this exercise the hash function H will be SHA2-256 as implemented in the cryptography
library. Additionally, when randomness is required use either os.urandom or the secrets library.
We will allow use of the sympy library, as it is useful for handling prime numbers; however, you
are expected to implement the function prim root on your own, and you may not use sympy’s
primitive root() method in your implementation.
Specifications. Fill in the empty functions in the template program basic auth.py. Once complete, you should be able to perform both registration and key agreement between two parties, one
acting as the Client and the other as Server, by running
python3 basic auth.py –server 127.0.4.18:3180
python3 basic auth.py –client 127.0.4.18:3180
python3 basic auth.py –quit 127.0.4.18:3180
You may also combine these actions with one invocation of basic auth.py, although this makes
debugging substantially more difficult. There will be additional command-line options to change
the username and password from the defaults.
In detail, the functions you’ll need to fill in fall into roughly three categories:
a. Networking Functions:
(i) create socket(. . . ), which creates a socket connection on the specified IP and port.
7The printed text is primarily for human eyes, not computer parsing, so there is no need to match these examples
precisely.
9
(ii) send(. . . ), which sends bytes through the specified socket.
(iii) receive(. . . ), which receives a specified bytes through the socket.
b. Low-level Functions:
(i) safe prime(. . . ), which creates a random safe prime N of a specified bit-length and
whose most significant bit is 1.
(ii) prim root(. . . ), which finds a primitive root g of N for some prime N.
Note: You may not use sympy’s primitive root method for your implementation.
(iii) calc x(. . . ), which computes the value x from the registration step using the salt and
password.
(iv) calc u(. . . ), which computes the value u using the provided inputs.
(v) calc A(. . . ), which computes the value A using the provided inputs.
(vi) calc B(. . . ), which computes the value B using the provided inputs.
(vii) calc K client(. . . ), which computes the value Kclient.
(viii) calc K server(. . . ), which computes the value Kserver.
(ix) calc M1(. . . ), which computes the value M1.
(x) calc M2(. . . ), which computes the value M2.
c. High-level Functions:
(i) client register(. . . ), which sends and receives information to and from the Server as
described in the Client registration.
(ii) server register(. . . ), which receives and sends information from and to the Client
through the socket connection, as outlined in the Server registration.
(iii) client protocol(. . . ), which performs the Client’s side of the protocol.
(iv) server protocol(. . . ), which performs the Server’s side of the protocol.
(v) client prepare(. . . ), which computes the Client’s registration tuple (username, s, v).
(vi) server prepare(. . . ), which computes the Server’s registration values N, g, k.
Note that some values may be supplied as an integer or as a bytes object; your functions will need
to translate between them as the context requires. All numbers are converted to bytes via network
byte order, which is big-endian.8 Additional details and documentation of these functions can be
found in the template program found on the Piazza resources page.
Submission. Submit a completed version of the template program with filename
basic auth.py
If you’ve spread your code across multiple source files, submit all of them.
Provide a description of your implementation in a separate README file in text format. Do not
include the written portion of the programming problem in the PDF file containing
your solutions to the written problems. Your description should include the following:
ˆ The procedures you used for generating your prime N and your primitive root g of N.
ˆ A list of the files you have submitted that pertain to the problem, and a short description of
each file.
ˆ A list of what is implemented in the event that you are submitting a partial solution, or a
statement that the problem is solved in full.
ˆ A list of what is not implemented in the event that you are submitting a partial solution.
ˆ A list of known bugs, or a statement that there are no known bugs.
8Functions will be supplied with the template to help with conversion.
10

CPSC 418 / MATH 318 — Introduction to Cryptography ASSIGNMENT 3

Problems 1-5,
Problem 8 only
Written Problems for CPSC 418 and MATH 318
Problem 1 — Flawed MAC designs (11 marks)
For this problem, you need to carefully trace through the given MAC algorithms, and specifically
through the underlying iterated hash function, to understand the given attacks and explain how
computation resistance is defeated.
Recall that iterated hash functions employ a compression function f in multiple rounds; SHA-1
is an example of such a hash function. Here, f takes as input pairs of n-bit blocks and outputs
n-bit blocks, for some n ∈ N. The compression function f is assumed to be public, i.e. anyone
can compute values of f. As a result, the hash function is also public, so any one can compute
hash values. For simplicity, we assume that the bit length of any message to be hashed is a
multiple of n, i.e. messages have been padded appropriately prior to hashing. Then the input to an
iterated hash function is a message M consisting of L blocks P1, P2, . . . , PL, each of length n. An
algorithmic description of an n-bit iterated hash function is given as follows (as usual. “k” denotes
concatenation of strings).
Algorithm ItHash
Input: A message M = P1kP2k · · · kPL
Output: An n-bit hash of M
1: Initialize H := 0n (n zeros)
2: for i = 1 to L do
3: H := f(H, Pi)
4: end for
5: Output H
This problem demonstrates that the two “obvious” designs for using an iterated hash function as
the basis for a MAC — prepending or appending the key to the message and then hashing — are
not secure. You will prove this by mounting specific attacks that defeat computation resistance.
For simplicity, assume that keys also have length n, the same as the message block length.1
a. (5 marks) Define a message authentication function PHMAC (“Prepend Hash MAC ”) via
PHMACK(M) := ItHash(KkM) = ItHash(KkP1kP2k · · · kPL)
for any message M = P1kP2k · · · kPL.
Suppose an attacker knows a message/PHMAC pair (M1,PHMACK(M1)). Let X be an arbitrary n-bit block and put M2 = M1kX. Show how an attacker can compute PHMACK(M2)
without knowledge of K, thereby defeating computation resistance for PHMAC.
Hint: Let M1 = P1kP2k · · · kPL. Tracing through the ItHash algorithm, compare the results of
the first L + 1 rounds of ItHash(KkM1) and ItHash(KkM2). Then explain how the attacker
can compute PHMACK(M2) from ItHash(KkM2) without knowledge of K.
1The attack in part (a) will in fact work for any key length, but for part (b), this key length restriction is required.
2
b. (6 marks) Define a message authentication function AHMACK (“Append Hash MAC ”) via
AHMACK(M) := ItHash(MkK) = ItHash(P1kP2k · · · kPLkK)
for any message M = P1kP2k · · · kPL.
Assume that ItHash is not weakly collision resistant, and suppose an attacker knows a message/AHMAC pair (M1, AHMACK(M1)). Show how she can find (without knowledge of K) a
second message/AHMAC pair (M2, AHMACK(M2)), thereby defeating computation resistance.
Hint: Note that on input any L-bit message, the first L rounds of the computation of AHMACK(M)
do not depend on K, just on M.
Problem 2 — Fast RSA decryption using Chinese remaindering (7 marks)
In this problem, as usual, a user Alice has an RSA public key (e, n) with corresponding private key
d. Here, n = pq for distinct large primes p and q.
If Alice does not discard p and q after computing n and φ(n), she can employ an alternative
decryption procedure as described below (based on the Chinese Remainder Theorem which some
of you may have seen before). For a given ciphertext C (1 ≤ C ≤ n−1, gcd(C, n) = 1), she proceeds
as follows:
Step 1. Compute
dp ≡ d (mod p − 1), 0 ≤ dp ≤ p − 2 ,
dq ≡ d (mod q − 1), 0 ≤ dq ≤ q − 2 .
Step 2. Compute
Mp ≡ C
dp
(mod p), 1 ≤ Mp ≤ p − 1 ,
Mq ≡ C
dq
(mod q), 1 ≤ Mq ≤ q − 1 .
Step 3. Use the Extended Euclidean Algorithm to find integers x, y such that
px + qy = 1 .
(Such integers exist because gcd(p, q) = 1.)
Step 4. Set M ≡ pxMq + qyMp (mod n), 0 ≤ M ≤ n − 1.
Prove that if the above procedure decrypts correctly. That is, if C ≡ Me
(mod n) is a ciphertext
obtained by encrypting a message M the “normal” RSA way, and M0
is the result of applying the
procedure above to C, prove that M0 = M.
Hint: You may use without proof the fact that a ≡ b (mod n) if and only if a ≡ b (mod p) and
a ≡ b (mod q) for any a, b ∈ Z. The “only if” direction follows directly from the definition of
congruence; the “if” direction follows from the Chinese Remainder theorem.
Remark: This method performs two modular exponentiations for decryption (in step 2), as opposed
to just one modular exponentiation for the ordinary RSA decryption procedure. However, the
moduli p and q have only about half the bit length of n, and the exponents dp and dq generally
have about half the bit length of d (as d / n, dp / p and d1 / q). Since the computational effort of
steps 1, 3 and 4 is negligible compared to step 2, Chinese remainder decryption is generally almost
four times as fast as ordinary RSA decryption. So this is what is generally used in practice.
3
Problem 3 — RSA primes too close together (21 marks)
This problem explores a difference of squares approach to factoring an RSA modulus due to Fermat, which is hence also known as Fermat factorization. Fermat’s idea was to attempt to find a
representation of an integer n as a difference of squares n = x
2 − y
2 where 0 < y < x < n, which
leads to a factorization n = (x + y)(x − y). When n is an RSA modulus whose prime factors are
very close together, the quantity y is very small compared to x and we will see that this gives rise
to a serious factoring attack on RSA.
Let n = pq with odd primes p, q, and assume without loss of generality that p > q. As always, all
square roots are positive, i.e. when we write √
z for some z > 0, we mean the positive square root
of z.
a. (4 marks) Prove that if x, y are integers with x > y > 0 and n = x
2 − y
2
, then
x =
n + 1
2
, y =
n − 1
2
or x =
p + q
2
, y =
p − q
2
. (1)
Note that it is not enough to prove that these given values for x and y satisfy n = x
2 − y
2
. The
point is to prove that there are no integer values for x and y other than the ones given here.
b. (2 marks) Prove that p + q < n + 1.
c. (3 marks) Use the inequality p > q to prove that √
n <
p + q
2
< p.
d. (4 marks) The idea for factoring n is to iterate over integers x that are potential candidates
for (p + q)/2, starting with the smallest integer exceeding √
n (the lower bound from part (c))
and searching in increments of 1 until a value x is found for which y := √
x
2 − n is an integer.
The assertion (which requires proof!) is that x = (p + q)/2 and y = (p − q)/2, in which case
the method returns the factor x − y = q. Here is pseudocode for this factorization algorithm:
Algorithm Fermat Factorization
Input: n = pq with p > q
Output: q
1: Put x = d

n e

n rounded up to the nearest integer
2: y := √
x
2 − n
3: while y is not an integer do
4: x := x + 1; y := √
x
2 − n
5: end while
6: Output x − y.
Use parts (a)-(c) to prove that this algorithm terminates s soon as x = (p + q)/2 and outputs
q. Note that there are three things to show here:
• The “while” condition is satisfied when x = (p + q)/2;
• x = (p + q)/2 is the first value that satisfies the “while” condition;
• The algorithm outputs q.

e. (2 marks) Show that the “while” condition in step 3 is tested x − d√
n e + 1 times, where
x = (p + q)/2.
f. (4 marks) Prove that x − d√
n e <
y
2
2

n
where x = (p + q)/2 and y = (p − q)/2.
Hint: Find a suitable upper bound on (x −

n)(x +

n) and divide by x +

n. Prove that the
result yields a bound on x − d√
n e.
g. (2 marks) Finally, the coup-de-grˆace. Suppose p − q < 2B
√4 n for some integer B that is very
small compared to n; e.g. B could be on the order of a power of log2
(n) or even a constant
number. In other words, p and q are very close together; they agree in nearly half of their most
significant bits. Prove that the above algorithm factors n after at most
B2
2
+ 1
steps, where a step corresponds the an evaluation of the “while” condition in step 3.
Problem 4 — El Gamal is not semantically secure (12 marks)
This problem requires typesetting Legendre symbols. To facilitate this, include at the beginning of your assignment file, right after the line \documentclass{…} the two lines
\usepackage{amsmath}
\providecommand{\Leg}[2]{\genfrac{(}{)}{}{}{#1}{#2}}
Be sure that you copy these verbatim; the easiest is to copy and paste them right from this
PDF file. The assignment template provided on the Piazza Resourcesd page already includes
these lines. The command $\Leg{a}{n}$ will produce the typeset output
a
n

, which is much
easier than producing a fraction with large parentheses around it.
Recall that for the El Gamal public key cryptosystem, a user Alice produces her public and private
keys as follows:
Step 1. Selects a large prime p and a primitive root g of p.
Step 2. Randomly selects x such that 0 < x < p − 1 and computes
y ≡ g
x
(mod p).
Alice’s public key is {p, g, y}
Alice’s private key is {x}
To encrypt a message M ∈ Z

p
intended for Alice, Bob selects a random integer k ∈ Zp−1, computes
C1 ≡ g
k
(mod p) and C2 ≡ Myk
(mod p), and sends C = (C1, C2) to Alice.
Alice decrypts the ciphertext C = (C1, C2) by computing C2C
p−1−x
1 ≡ M (mod p).
In this problem, you will prove that the El Gamal public key cryptosystenm is not polynomially
secure, and hence not semantically secure. This is because an attacker Mallory can distinguish
messages according to whether they are quadratic residues or quadratic nonresidues modulo p.
Mallory mounts her attack with the following procedure:

Step 1. Selects two messages M1 and M2 such that M1 ∈ QRp and M2 ∈ QNp, and obtains
the ciphertext C = (C1, C2) = E(Mi) where i = 1 or 2.
(Mallory’s task is precisely to ascertain whether i = 1 or i = 2.)
Step 2. Computes the Legendre symbols
y
p

,
C1
p

and C2
p

.
Step 3. If
y
p

= 1 and C2
p

= 1, she asserts that C = E(M1).
If
y
p

= 1 and C2
p

= −1, she asserts that C = E(M2).
If
y
p

= −1, C1
p

= 1 and C2
p

= 1, she asserts that C = E(M1).
If
y
p

= −1, C1
p

= 1 and C2
p

= −1, she asserts that = E(M2).
If
y
p

= −1, C1
p

= −1 and C2
p

= 1, she asserts that C = E(M2).
If
y
p

= −1, C1
p

= −1 and C2
p

= −1, she asserts that C = E(M1).
Note that this procedure requires three Legendre symbol computations — which can be done with
modular exponentiation by Euler’s Criterion — and hence always takes polynomial time. Note also
that Mallory states her assertions with certainty, i.e. probability 1.
Prove that Mallory’s assertions are correct, so the El Gamal system is not semantically secure.
Problem 5 — An IND-CPA, but not IND-CCA secure version of RSA (12 marks)
Consider the following semantically secure variant of the RSA public key cryptosystem:
Parameters:
• m — length of plaintext messages to encrypt (in bits)
• (n, e) — Alice’s RSA public key (n has k bits)
• d — Alice’s RSA private key
• H : {0, 1}
k
7→ {0, 1}
m a public random function
Encryption of an m-bit message M ∈ Zn∗:
Step 1. Generate a random k-bit number r < n.
Step 2. Compute C = (sk t) where s ≡ r
e
(mod n) and t = H(r) ⊕ M.
Decryption of a ciphertext C = (skt):
Step 1. Separate C into s and t.
Step 2. Compute M ≡ H(s
d
(mod n)) ⊕ t.
Prove that this cryptosystem is not IND-CCA secure.
Hint: To mount her CCA, Mallory gets to choose two plaintexts and submit a ciphertext C
0
for
decryption. Almost any two plaintexts M1, M2 will do. Let
C = (sk t) = (r
e
(mod n) k H(r) ⊕ Mi)
be the encryption of one them; Mallory needs to ascertain whether i = 1 or i = 2. Mallory chooses
the ciphertext C
0 = (sk t ⊕ M1) and obtains its decryption (make sure that C
0 6= C; that’s a
requirement in the CCA). Now trace through the decryption method applied to C and use the
result to figure out how Mallory can detect whether C is the encryption of M1 or M2. (Of course
Mallory can’t actually decrypt C, but she/you can still apply the decryption method symbolically
to C.)
Written Problems for MATH 318 only
Problem 6 — A primality test of sorts (12 marks)
Let N be an integer withN > 1. In this problem, you will prove the assertion
(N − 1)! ≡ −1 (mod N) if and only if N is prime (2)
and analyze its suitability for primality testing.
For the primes N = 2 and N = 3, it is easy to verify that (2) holds, so assume henceforth that
N ≥ 4. Note that in this case, 1 and −1 are distinct elements in ZN as 1 ≡ −1 (mod N) implies
that N divides 1 − (−1) = 2, forcing N = 2.
a. (3 marks) Suppose N is prime and let g be primitive root of N. Prove that
g
(N−1)/2 ≡ −1 (mod N) .
You may use without proof the fact if a prime divides the product of two non-zero integers,
then it divides one of the factors.
b. (3 marks) Suppose N is prime. Prove that (N − 1)! ≡ −1 (mod N).
Hint: Primitive roots. Remember that for any primitive g of N, the set Z

N consists precisely
of the powers g
k
(mod N) with0 ≤ k ≤ N − 2.
c. (3 marks) Suppose N is composite. Prove that (N − 1)! 6≡ −1 (mod N).
Hint: Consider (N − 1)! modulo some prime divisor of N and draw the necessary conclusion
about (N − 1)! (mod N).
d. (3 marks) By parts (b) and (c), the congruence (N − 1)! ≡ −1 (mod N) can be used as a
primality test, actually even as a primality proof. Do you think this is a good primality test?
How does this compare, for example, to trial division (dividing N successively by 2, 3, 4, . . .)?
Give a yes/no answer and a concise, coherent and convincing explanation of your answer.
Remark: A wrong answer is “No because (N − 1)! is a really big number.” This is true,
but we are doing modular arithmetic here. The intermediate operands in the computation of
(N − 1)! (mod N) can be kept bounded by reducing modulo N in each step, i.e. by computing
FN−1 ≡ (N − 1)! (mod N) via the recurrence F0 = 1 and Fn ≡ nFn−1 (mod N) for 1 ≤ n ≤
N − 1.
7
Problem 7 — An attack on RSA with small decryption exponent (25 marks)
This problem explores an attack on RSA with small private key d.
Preliminaries. Let r be a positive rational number and write r = a/b with a, b ∈ N. Let
q0, q1, . . . qm ∈ N be the sequence of quotients produced by applying the Euclidean Algorithm to
the numerator and denominator of r:
a = q0b + r0 , 0 < r0 < b ,
b = q1r0 + r1 , 0 < r1 < r0 ,
r0 = q2r1 + r2 , 0 < r2 < r1 ,
.
.
.
rm−3 = qm−1rm−2 + rm−1 , rm−1 = gcd(a, b),
rm−2 = qmrm−1 + rm , rm = 0 .
Recall the familiar sequences
A−2 = 0, A−1 = 1, Ai = qiAi−1 + Ai−2 for 0 ≤ i ≤ m,
B−2 = 1, B−1 = 0, Bi = qiBi−1 + Bi−2 for 0 ≤ i ≤ m.
The quotients Ai/Bi (0 ≤ i ≤ m) are called the convergents of r because they oscillate around
and converge toward r as i increases, with Am = a, Bm = b, and hence Am/Bm = r. In fact, the
following theorem (which you may use without proof) asserts that any rational number sufficiently
close to r must occur as one of the convergents:
Theorem. Let r =
a
b
∈ Q with a, b > 0, and let A
B
∈ Q be a fraction in lowest terms such that

r −
A
B

<
1
2B2
. Then A = Ai and B = Bi for some i ∈ {0, 1, . . . , m}.
Now back to RSA. Let n = pq where p, q are odd primes satisfying
q < p < 2q .
These inequalities are reasonable as p and q are usually assumed to have the same bit length. Let
e, d be integers with 1 < e, d < φ(n) and ed ≡ 1 (mod φ(n)). Let k ∈ Z satisfy ed = 1 + kφ(n) and
suppose that d is small compared to n; specifically,
d <
√4 n

6
. (3)
a. (5 marks) Prove that 1 ≤ k < d and gcd(d, k) = 1.
b. (3 marks) Prove that 2 ≤ n − φ(n) < 3

n.
c. (4 marks) Use parts (a) and (b) to prove that 0 < kn − ed < 3d

n.
d. (4 marks) Conclude from part (c) and the upper bound (3) on d that 0 <
k
d

e
n
<
3

n
<
1
2d
2
.
8
e. (3 marks) Let q0, q1, . . . , qm be the quotients obtained when applying the Euclidean algorithm
to e and n, and let Ai
, Bi be the associated sequences as defined above. Use the theorem from
above to prove that k = Ai and d = Bi for some i ∈ {0, 1, . . . , m}.
f. (6 marks) Use part (e) to devise a procedure for finding d efficiently. Explain why your procedure
works and why it is efficient, i.e. argue that the number of computation steps performed by
your procedure is small.
Programming Problem for CPSC 418 only
Problem 8 — Secure file transfer with prior key agreement (37 marks)
Don’t be daunted by the long description of this problem! Most of it is very clear specifications,
including those for the autograder, to make your life easier.
Overview. Transport Layer Security (TLS) is a security protocol which aims to provide end-to-end
communication security over networks. It provides both privacy and data integrity. TLS has many
steps, but our focus will be the cipher suite, a set of algorithms that add cryptographic security to
a network connection. There are four main components to a cipher suite:
• Key Exchange Algorithm
• Authentication Algorithm (signature)
• Bulk Encryption Algorithm (block cipher and mode of operation)
• Message Authentication Algorithm
Cipher suites are specified in shorthand by a string such as
SRP-SHA3-256-RSA-AES-256-CBC-SHA3-256.
• This implies SRP-SHA3-256 as the key exchange algorithm, RSA signatures for authentication, AES-256-CBC for encryption and SHA3-256 as the MAC (used as HMAC).
• SRP is the protocol implemented in Assignment 2. The hash function used in the protocol is
specified as SHA3-256.
• SHA3-256 as the MAC implies the use of HMAC with SHA3-256.
In a TLS handshake, upon connection two parties automatically negotiate TLS settings, including
the cipher suite to use. Through an exchange of messages the two parties verify each other and
establish a session key. In this assignment, the two parties will be a server and client. The server
will authenticate itself to the client by means of a certificate, and if successful, will derive a shared
session key. The two parties are then able to use the session key to exchange encrypted and
authenticated messages.
In reality, a certificate is an electronic document which contains a public key and information about
the owner of the key. The certificate must be signed by a trusted authority to verify the contents
of the certificate. For us, the certificate will merely be a name concatenated with a public key,
concatenated with the signature of a trusted third party (TTP).
9
For this assignment, your task is to implement a toy version of the TLS handshake using the cipher
suite specified above.2 The protocol will have two parties a Client and Server, which rely on a
certificates provided by a TTP.
To execute the protocol, the Client and Server first need to obtain certificates from the TTP. We
perform a simplified version of this process as follows:
The TTP initializes in the following manner:
• Generates two RSA primes p and q of size 512-bits and computes N = pq.
• Generates its own RSA key-pair (N, e) and (N, d).
• Opens a socket connection on a specified IP and port and attempts to receive a single byte,
whose value indicates the TTP response. This byte is limited to the UTF-8 encoding of the
characters ‘s’ indicating the connecting party wishes the TTP to sign their data; ‘k’ indicating
the connecting party wishes to obtain the TTP public key; and ‘q’ indicating the party wishes
the TTP to shut down.3
The TTP performs signing as follows:
a. Receives the following: a single byte encoding of the length of name, a utf-8 encoded name and
an RSA public key (N, e) where N, e are each encoded as byte-arrays of size 128.
b. Hashes the byte array (name||N||e) using SHA3-512 to obtain a 512-bit value t, stores it, and
hashes t once again to produce t
0
.
c. Taking the byte array t||t
0
, interpret this as a big-endian integer and reduce by the TTP’s RSA
modulus to get the value S.
4
d. Compute TTP SIG, the RSA signature on S.
e. Send the values (N0
, TTP SIG) to the connected party, where N0
is the TTP’s RSA modulus.
f. Resume listening on the socket for another command.
The TTP performs key-request as follows:
a. Sends its public key (N, e).
b. Resumes listening on the socket connection to receive another command.
The Server performs signature-request as follows:
a. Connect and send a one-byte character ‘s’ to the TTP.
b. Send a single byte encoding of the length of name, a utf-8 encoded name and an RSA public
key (N, e) where N, e are each encoded as byte-arrays of size 128.
c. Receive and store the TTP’s N and TTP SIG.
2With TLS 1.3, there has been a move towards using authenticated encryption schemes such as AES-GCM. For
pedagogical reasons, we will look at a more classic cipher-suite.
3A true implementation would not have this last feature, but it makes debugging much easier.
4The purpose of this is to make full use of the RSA message space.
10
d. Close the socket connection.
The Client performs key-request as follows:
a. Connect and send a one-byte character ‘k’ to the TTP.
b. Receive and store the TTP’s N and d.
c. Close the socket connection.
The following steps must be added to the Server initialization from the previous assignment before opening a listening socket:
a. Obtain a name, server name.
b. Generates two RSA primes p
0 and q
0 of size 512-bits and computes N0 = p
0
q
0
.
c. Generates an RSA key-pair (N0
, e0
) and (N0
, d0
).
d. Connect and request a signature from the TTP
Once the Client and Server have both obtained the relevant information from the TTP and the
Client has registered with the Server, the TLS handshake proceeds (finally!) as follows:
a. Client sends the length of its username as a single byte, followed by username to the Server.
b. The Server sends a certificate consisting of the server’s name and public key as well as a signature
of these values by the TTP.
• The server sends the length of server name as a single byte, followed by
server name ||N0
||e
0
||TTP SIG
where N0
, e,0 TTP SIG are each encoded as 128-byte arrays.
• Upon receiving the certificate, the Client should verify the signature of the TTP. If verification fails, the client should close the socket.
c. Initiate the protocol from Assignment 2 with the following modifications:
• Instead of the Client sending A as plaintext, they will send Enc(A), the RSA encryption
of A under the Server’s public key.
• Upon receiving Enc(A) the server must decrypt to obtain A.
• In case you did not implement the following check in Assignment 2: the server should
ensure that the value A is not congruent to 0 under the SRP modulus and abort otherwise
(it may be fun to think about why).
• The remainder of the protocol proceeds as n Assignment 2 (derive the shared key and
verify).
d. With the shared key derived, take the first 32-bytes as key for AES-256. Use the next 32 bytes
as the key for HMAC.
e. Pad the file if necessary, then encrypt it using the appropriate key.
11
f. Also note that you will require an IV for AES-CBC. This should be randomly generated and
prepended to the encrypted message (just as in Assignment 1). The IV is considered part of
the ciphertext.
g. Generate a tag of the ciphertext bytes using HMAC-SHA3-256 with the appropriate key, and
append the tag to the encrypted bytes.5
h. Store the length of the ciphertext in a 4-byte array, then send the length followed by the
ciphertext to the Server.
i. Have the server receive the ciphertext and decrypt and verify the tag.
Problem Your task is to complete the template program
basic handshake.py
which performs the above handshake protocol. The program consists of functions corresponding to
establishing a connection and transmitting data through a socket, parameter generation and the
actions performed by the Client, Server and TTP.
It is recommended to echo messages sent over the socket to standard output, as this makes debugging the protocol substantially easier. This is not required, but in prior years students benefited
greatly from being able to watch the network traffic. The varprint(. . . ) function is useful for this.
In adherence to the specified TLS ciphersuite, the hash function H used in the protocol will be
SHA3-256 as implemented in the cryptography library. Remember to change this when
using the protocol from Assignment 2. Additionally, when randomness is required use either
os.urandom or the secrets library.
We will allow the use of the sympy library, as it is useful for handling prime numbers; however,
certain functions will not be permitted including
• sympy.primitive root()
Requirements.
a. You must implement your own version of the RSA related functions. You may not use the built
in functions related to RSA from the cryptography library for this exercise.
b. For efficiency we have chosen very small parameter sizes: The server and TTP choose RSA
primes p and q to be 512 bit safe primes. Thus the modulus N will be 1024 bits (128 bytes).
See the course notes for the specifics of the RSA encryption and signature schemes.
c. All other primitives should make use of the cryptography library.
d. Please be careful to note the changes in functions used in past assignments. In particular the
use of AES-256, SHA3-256, and HMAC in the symmetric protocol.
Specifications. Fill in the empty functions in the template program basic handshake.py. Since
this is built on top of the previous assignment, once complete, you should be able to perform the
full TLS handshake between two parties, one acting as the Client and the other as Server, using
a TTP by running
5There is some debate within the cryptography over whether it is more secure to encrypt the hash or append the
hash to the encrypted bytes. Since we used the former method on Assignment 1, we will use the latter on this one.
12
python3 basic handshake.py –ttp
python3 basic handshake.py –server
python3 basic handshake.py –client
python3 basic handshake.py –quit server –quit ttp
You may also combine these actions with one invocation of basic handshake.py, although this
makes debugging substantially more difficult. There will be additional command-line options to
change the username and password from the defaults, as well as the IP address and ports the TTP
and server listen on.
In detail, the new functions you’ll need to fill in fall into the following categories:
a. Low-level Functions:
(i) RSA key.keypair(. . . ), which generates e and d for the given p and q.
(ii) RSA key.modulus(. . . ), which generates p and q for the specified bit length.
(iii) RSA key.sign(. . . ), which signs x for a given N and d.
(iv) RSA key.encrypt(. . . ), which encrypts x for a given N and e.
(v) RSA key.decrypt(. . . ), which decrypts x for a given N and d.
(vi) pad encrypt then HMAC(. . . ), which handles the named operations given a plaintext and
two keys.
(vii) decrypt and verify(. . . ), which reverses the operations performed by pad encrypt then HMAC(. . . )
while verifying the input could be decrypted.
b. High-level Functions:
(i) ttp prepare(. . . ), which computes the TTPs RSA key pair.
(ii) ttp sign(. . . ), which signs the connecting party’s RSA key.
(iii) ttp sendkey(. . . ), which sends the TTPs public key.
(iv) sign request(. . . ), which requests a signature from the TTP.
(v) key request(. . . ), which requests the TTP’s public key.
(vi) client protocol(. . . ), which performs the Client’s side of the protocol, modified from
Assignment 2.
(vii) server protocol(. . . ), which performs the Server’s side of the protocol, modified from
Assignment 2.
(viii) server prepare(. . . ), which computes the Server’s registration values N, g, k, as well as
the Server’s RSA key pair.
Note that some values may be supplied as an integer or as a bytes object; your functions will need
to translate between them as the context requires. All numbers are converted to bytes via network
byte order, which is big-endian.6 Additional details and documentation of these functions can be
found in the template program found on the Piazza resources page.
Submission: Submit a completed version of the template program with filename
basic handshake.py
6Functions will be supplied with the template to help with conversion.
13
If you’ve spread your code across multiple source files, submit all of them.
Provide a description of your implementation in a separate README file in text format. Do not
include the written portion of the programming problem in the PDF file containing your solutions
to the written problems. Your description should include the following:
a. A list of files submitted that pertain to the problem and a short description of each file.
b. A list of what is implemented in the event that you are submitting a partial solution or a
statement that the problem is fully solved.
c. A list of what is not implemented in the event that you are submitting a partial solution,
d. A list of known bugs or a statement that there are no known bugs.
You may either use your own code or the solution files from Assignments 1 and 2.
14