## Description

Part 1: Voting

1. In the American presidential elections, while the popular vote is used up to the state level,

the electoral college decides the winner at the national level. Assuming there are only two

candidates, is this system strategy-proof? Does it elect a Condorcet winner? Justify your

answers. (Note: You can assume for simplicity that each state gets a single “vote” in a

national election, and the state then runs an election by popular vote with however many

people are in that state to determine which vote to cast at the national level.)

2. Suppose we want to run a popular vote election between two candidates A and B. There

are 1,000,000 eligible voters and suppose 52% of them prefer A to B. Instead of running

a full election where every person casts a vote, we poll m randomly selected people (with

replacement for simplicity) and ask them to report their preferred candidate. The result of

the poll is the majority of the responses received.

(a) Compute a value of m so that the result of the poll is incorrect with probability at most

1%? (Use the Chernoff bound in the book, show your work.)

(b) Let n be the number of people in the population, be defined such that (1/2 + ) · n

prefer A to B, and let δ be the desired accuracy (so the probability the result is incorrect

is at most δ). Write m as a function of n, , and δ.

If the number of people in the population increased by a factor of 10, how would that

affect m? If decrease by a factor of 2, how would that affect m? If we want to increase

our confidence by a factor of 10, how would that change m? If = 1/n (so 1 person

would be the deciding vote), what would this imply about m given your bound from

above?

(c) In practice, what might be wrong with the above assumptions (i.e. why might we not

we use polls to run our elections)?

Part 2: Stable Matchings

3. For the following setting, find (a) the male-optimal and (b) the female-optimal stable matching. For each part, simulate the Gale-Shapley algorithm (i.e. in words, clearly indicate what

happens at each step) to show that it arrives at the matching you find.

Females Preferences Males Preferences

A X > W > Y > Z W D > B > C > A

B X > W > Y > Z X D > B > A > C

C X > W > Z > Y Y C > B > D > A

D Y > W > Z > X Z D > B > C > A

4. Prove that there exists a non-bipartite matching setting (where every individual has preferences over all other individuals) for which no stable matching exists.

Part 3: Beliefs

2

5. There is a test for a certain disease that has a 15% false positive rate and a 25% false negative

rate. (So, if someone has the disease, there is a 75% chance the test will return positive; if

they do not, there is an 85% chance it will return negative.) If 1% of the population has this

disease, what is the probability that someone who tests positive for the disease actually has

it? (Use Bayes’ Rule; show your work.)

6. In the “foolishness of crowds” example from section 16.2 of the notes, let’s assume that people

decide that their own evidence should instead be weighed c times as heavily as others’ prior

guesses, where c is an integer greater than 1. Now what is the probability that a cascade

occurs and everyone is incorrect (in terms of , where the probability that a player receives

correct evidence is 1/2 + )?

7. Recall the “muddy children” example covered in class and in the notes. Section 17.2 of the

notes gives a formal argument that Claim 17.1 (that, if there are m muddy children, they

will answer “yes” on and not before round m) holds for the case where there are two total

children.

(a) Now draw the knowledge network for the case where there are three total children.

(b) Using a similar argument to the case of two children, use your network and the formal

“possible worlds” model of knowledge to formally show that Claim 17.1 holds for the

case when there are three total children (and any non-zero number of muddy children).

Part 4: PageRank and Social Networks

The objective of this question is to use the PageRank algorithm as a way to determine how

“influential” a node is in a social network based on its in-links from influential nodes. For this

question, we provided a template in python called ‘hw4.py’. You must use the template to submit

your code, as we will grade your code in a (partially) automated way. You should submit the hw4.py

file, not a .pynb or other python file.

8. Design an algorithm that runs the iterative -scaled PageRank algorithm for a specified number n of rounds on a given directed graph, with = 1/7. Run it (with n = 10) on the

examples in figures 15.1 (both left and right) and 15.2 (the two disjoint triangle graph), as

well as at least two other simple test cases with at least 10 nodes.

9. Now we’ll run PageRank on the Facebook data.

[http://snap.stanford.edu/data/egonets-Facebook.html].

The file is called “facebook combined.txt.gz”; remember that it has 4,039 nodes.

(a) Once again, remember that this is an undirected graph! Before running your algorithms

from the previous problem, implement a transformation into a directed graph, i.e. each

undirected edge corresponds to two different directed edges.

(b) Now, run the PageRank algorithms from the last problem on this new graph. You

shouldn’t need n to be much higher than 10-20 for the algorithm to converge to a fixed

point.

(c) Where did most of the score tend to end up in your experiments? Look at the nodes

that have the highest or lowest scores; is there a consistent pattern among your trials?

3

(d) Intuitively explain your results in terms of a measure of influence in a social network.

Do you think that this is an accurate measurement? How could we try to improve it

(for instance, by incorporating link strengths or other measures of popularity)?

Part 5: Essay Question

(This problem should be completed individually and not in a group. However, it will be graded

based on completion.)

Write 1/2 to 1 page discussing or analyzing one of the following prompts using any of the

concepts taught in this class:

• Read the following New York Times article adapting a recent work from Nobel Prize winners

Esther Duflo and Abhijit Banerjee:

The article makes the claim that people aren’t driven by financial incentives as much as you

would assume. In particular, they cite a study that claims that people believe that “Everyone else responds to incentives, but I dont.” Why might this undermine certain assumptions

made in this class? How can we model this situation using ideas from this class?

• Watch the following speech from Sacha Baron Cohen:

The speech discusses the relevance of the spread of (mis)information in social network. For

example, he makes the claim that “fake news outperforms real news because lies spread faster

than truth.” How can use tools from this class to explain this phenomena? How could we

use tools from this class to identify the spread of misinformation?

• Pick one or more recent news article (within the last few months) related to networks, markets,

or beliefs to analyze and discuss. (In particular, you may look at one of the above examples

and discuss a different aspect if you wish.)

4