ECE 232E Project 1: Random Graphs and Random Walks
One can use the igraph library¹ to generate different networks and measure
various properties of a given network. The library has R and Python
implementations; you may choose either language. However, for this
project, using R is strongly recommended, as some functions might not be
implemented in the Python version of the package.
Submission: Upload a zip file containing your report and codes to CCLE.
One submission per group is sufficient.
¹ https://igraph.sourceforge.net/
1 Generating Random Networks
1. Create random networks using Erdös-Rényi (ER) model
(a) Create undirected random networks with n = 1000 nodes,
and the probability p for drawing an edge between two arbitrary vertices equal to 0.003, 0.004, 0.01, 0.05, and 0.1. Plot the degree
distributions. What distribution is observed? Explain why.
Also, report the mean and variance of the degree distributions
and compare them to the theoretical values.
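In R, `random.graph.game(1000, p)` generates these networks directly. If you work in Python, the experiment in (a) can also be sketched without any graph library; `erdos_renyi` below is an illustrative helper, not an igraph function:

```python
import random

def erdos_renyi(n, p, rng):
    """Sample an undirected G(n, p) graph as an adjacency list."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:   # each pair gets an edge independently
                adj[i].add(j)
                adj[j].add(i)
    return adj

n, p = 1000, 0.01
adj = erdos_renyi(n, p, random.Random(0))
degrees = [len(nbrs) for nbrs in adj]
mean = sum(degrees) / n
var = sum((d - mean) ** 2 for d in degrees) / n
# Theory: degrees are Binomial(n - 1, p), so mean (n-1)p = 9.99 and
# variance (n-1)p(1-p) = 9.89 for these parameters
print(mean, var)
```

For G(n, p), the degree of a node is Binomial(n − 1, p), which for small p is well approximated by a Poisson distribution with mean (n − 1)p; this is the shape the degree-distribution plots should reveal.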
(b) For each p and n = 1000, answer the following questions:
Are all random realizations of the ER network connected? Numerically estimate the probability that a generated network is
connected. For one instance of a network with that p, find
the giant connected component (GCC) if it is not connected. What
is the diameter of the GCC?
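In R, `is.connected` and `clusters` answer (b) directly, and `diameter` gives the GCC diameter. The numerical estimate can also be sketched dependency-free with BFS; helper names are illustrative and the trial count is kept small for speed:

```python
import random
from collections import deque

def er_adj(n, p, rng):
    """Adjacency list of one G(n, p) sample."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def components(adj):
    """Connected components via BFS."""
    seen, comps = set(), []
    for s in range(len(adj)):
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        seen.add(s)
        while q:
            v = q.popleft()
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    comp.add(u)
                    q.append(u)
        comps.append(comp)
    return comps

rng = random.Random(1)
n, p, trials = 1000, 0.01, 10
conn_frac = sum(len(components(er_adj(n, p, rng))) == 1
                for _ in range(trials)) / trials
gcc = max(components(er_adj(n, p, rng)), key=len)
print(conn_frac, len(gcc))
```

For p well below the connectivity threshold ln(n)/n, `conn_frac` should be near 0; well above it, near 1.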
(c) It turns out that the normalized GCC size (i.e., the size of the
GCC as a fraction of the total network size) is a highly nonlinear function of p, with interesting properties occurring for
values where p = O(ln n / n). For n = 1000, sweep over values
of p in this region and create 100 random networks for each
p. Then scatter plot the normalized GCC sizes vs. p. Empirically estimate the value of p where a giant connected component starts to emerge (define your criterion of “emergence”).
Do the results match the theoretical values mentioned or derived in
lectures?
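A scaled-down sketch of the sweep in (c): ln(1000)/1000 ≈ 0.0069, so p is varied around that scale (use the full 100 networks per p in the report; `gcc_fraction` is an illustrative helper):

```python
import random
from collections import deque

def gcc_fraction(n, p, rng):
    """Largest-component size divided by n for one G(n, p) sample."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    seen, best = [False] * n, 0
    for s in range(n):
        if seen[s]:
            continue
        q, size = deque([s]), 1
        seen[s] = True
        while q:
            v = q.popleft()
            for u in adj[v]:
                if not seen[u]:
                    seen[u] = True
                    size += 1
                    q.append(u)
        best = max(best, size)
    return best / n

n, rng = 1000, random.Random(3)
# a few p around ln(n)/n; the GCC fraction jumps sharply once np crosses 1
fracs = {p: gcc_fraction(n, p, rng) for p in (0.0005, 0.002, 0.008)}
print(fracs)
```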
(d) i. Define the average degree of nodes c = n × p = 0.5. Sweep
over number of nodes, n, ranging from 100 to 10000. Plot
the expected size of the GCC of ER networks with n nodes
and edge-formation probabilities p = c/n, as a function of n.
What trend is observed?
ii. Repeat the same for c = 1.
iii. Repeat the same for values of c = 1.1, 1.2, 1.3, and show the
results for these three values in a single plot.
2. Create networks using the preferential attachment model
(a) Create an undirected network with n = 1000 nodes, with the preferential attachment model, where each new node attaches to
m = 1 old node. Is such a network always connected?
(b) Use fast greedy method to find the community structure. Measure modularity.
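igraph’s `barabasi.game` generates these networks, and `fastgreedy.community` returns the community structure together with its modularity. As a sanity check for (a), here is a hand-rolled m = 1 preferential-attachment sketch (stub-list sampling; all names are illustrative):

```python
import random

def pref_attach(n, m, rng):
    """Growth with degree-proportional attachment via a stub list."""
    edges, stubs = [(0, 1)], [0, 1]
    for v in range(2, n):
        targets = set()
        while len(targets) < min(m, v):
            targets.add(rng.choice(stubs))  # degree-proportional pick
        for t in targets:
            edges.append((v, t))
            stubs += [v, t]
    return edges

edges = pref_attach(1000, 1, random.Random(2))
print(len(edges))  # 999 edges on 1000 nodes
```

With m = 1, every new node attaches to exactly one existing node, so the graph is a tree on n nodes (n − 1 edges) and is therefore always connected.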
(c) Try to generate a larger network with 10000 nodes using the
same model. Compute the modularity. How does it compare to the
smaller network’s modularity?
(d) Plot the degree distribution in a log-log scale for both n =
1000, 10000, then estimate the slope of the plot.
(e) You can randomly pick a node i, and then randomly pick a
neighbor j of that node. Plot the degree distribution of nodes
j that are picked with this process, in the log-log scale. How
does this differ from the node degree distribution?
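Picking a random node and then a random neighbor reaches j with probability proportional to deg(j), so the sampled distribution is degree-biased (the “friendship paradox”). A quick numerical check on a preferential-attachment tree (names illustrative):

```python
import random

rng = random.Random(4)
# grow a preferential-attachment tree (m = 1) via a stub list
stubs, edges = [0, 1], [(0, 1)]
for v in range(2, 1000):
    t = rng.choice(stubs)          # degree-proportional choice
    edges.append((v, t))
    stubs += [v, t]
adj = {v: [] for v in range(1000)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

deg = {v: len(adj[v]) for v in adj}
mean_deg = sum(deg.values()) / len(deg)
# pick a random node, then a random neighbor of it, many times
samples = [deg[rng.choice(adj[rng.randrange(1000)])] for _ in range(5000)]
mean_nbr_deg = sum(samples) / len(samples)
print(mean_deg, mean_nbr_deg)  # neighbor sampling skews to higher degrees
```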
(f) Estimate the expected degree of a node that is added at time
step i for 1 ≤ i ≤ 1000. Show the relationship between the
age of nodes and their expected degree through an appropriate
plot.
(g) Repeat the previous parts for m = 2, and m = 5. Why was
modularity for m = 1 high?
(h) Again, generate a preferential attachment network with n =
1000, m = 1. Take its degree sequence and create a new network with the same degree sequence through the stub-matching
procedure. Plot both networks, mark communities on their
plots, and measure their modularity. Compare the two procedures for creating random power-law networks.
3. Create a modified preferential attachment model that penalizes
the age of a node
(a) Each time a new vertex is added, it creates m links to old vertices and the probability that an old vertex is cited depends on
its degree (preferential attachment) and age. In particular, the
probability that a newly added vertex connects to an old vertex
is proportional to:
P[i] ∼ (c·k_i^α + a)(d·l_i^β + b),
where ki is the degree of vertex i in the current time step, and
li is the age of vertex i. Produce such an undirected network
with 1000 nodes and parameters m = 1, α = 1, β = −1, and
a = c = d = 1, b = 0. Plot the degree distribution. What is the
power law exponent?
(b) Use fast greedy method to find the community structure. What
is the modularity?
2 Random Walk on Networks
1. Random walk on Erdös-Rényi networks
(a) Create an undirected random network with 1000 nodes, and
the probability p for drawing an edge between any pair of nodes
equal to 0.01.
(b) Let a random walker start from a randomly selected node (no
teleportation). We use t to denote the number of steps that the
walker has taken. Measure the average distance (defined as
the shortest-path length) ⟨s(t)⟩ of the walker from its starting
point at step t. Also, measure the variance
σ²(t) = ⟨(s(t) − ⟨s(t)⟩)²⟩ of this distance. Plot ⟨s(t)⟩ vs. t and
σ²(t) vs. t. Here, the average ⟨·⟩ is over random choices of the starting
nodes.
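One way to measure ⟨s(t)⟩ and σ²(t): precompute BFS distances from each start node, then average over many independent walks. A small-scale sketch (n = 200 here for speed; helper names are illustrative):

```python
import random
from collections import deque

def er(n, p, rng):
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def bfs_dist(adj, s):
    """Shortest-path distance from s to every reachable node."""
    d, q = {s: 0}, deque([s])
    while q:
        v = q.popleft()
        for u in adj[v]:
            if u not in d:
                d[u] = d[v] + 1
                q.append(u)
    return d

rng = random.Random(5)
adj = er(200, 0.05, rng)
T, walks = 20, 200
dists = [[] for _ in range(T + 1)]
for _ in range(walks):
    start = rng.randrange(200)
    if not adj[start]:
        continue                   # skip isolated starting nodes
    d = bfs_dist(adj, start)
    v = start
    for t in range(T + 1):
        dists[t].append(d[v])      # distance from start at step t
        v = rng.choice(adj[v])
avg = [sum(ds) / len(ds) for ds in dists]                 # <s(t)>
var = [sum((x - a) ** 2 for x in ds) / len(ds)            # sigma^2(t)
       for ds, a in zip(dists, avg)]
print(avg[0], avg[T])
```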
(c) Measure the degree distribution of the nodes reached at the
end of the random walk. How does it compare to the degree
distribution of the graph?
(d) Repeat (b) for undirected random networks with 100 and 10000
nodes. Compare the results and explain qualitatively. Does the
diameter of the network play a role?
2. Random walk on networks with fat-tailed degree distribution
(a) Generate an undirected preferential attachment network with
1000 nodes, where each new node attaches to m = 1 old nodes.
(b) Let a random walker start from a randomly selected node. Measure and plot ⟨s(t)⟩ vs. t and σ²(t) vs. t.
(c) Measure the degree distribution of the nodes reached at the
end of the random walk on this network. How does it compare
with the degree distribution of the graph?
(d) Repeat (b) for preferential attachment networks with 100 and
10000 nodes, and m = 1. Compare the results and explain
qualitatively. Does the diameter of the network play a role?
3. PageRank
The PageRank algorithm, as used by the Google search engine,
exploits the linkage structure of the web to compute global “importance” scores that can be used to influence the ranking of search
results. Here, we use random walk to simulate PageRank.
(a) Create a directed random network with 1000 nodes, using the
preferential attachment model, where m = 4. Note that in
this directed model, the out-degree of every node is m, while
the in-degrees follow a power law distribution. Measure the
probability that the walker visits each node. Is this probability
related to the degree of the nodes?
(b) In all previous questions, we didn’t have any teleportation.
Now, we use a teleportation probability of α = 0.15. By performing random walks on the network created in 3(a), measure the probability that the walker visits each node. Is this
probability related to the degree of the node?
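The teleporting walk in 3(b) converges to the PageRank distribution. A minimal sketch on a toy directed graph, comparing the walk’s visit frequencies to a direct power-iteration computation (all names illustrative; igraph’s `page_rank` does this for you on the real network):

```python
import random

out = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}   # toy directed graph
n, alpha, steps = 4, 0.15, 200_000
rng = random.Random(6)

# random walk with uniform teleportation probability alpha
visits = [0] * n
v = rng.randrange(n)
for _ in range(steps):
    if rng.random() < alpha or not out[v]:
        v = rng.randrange(n)                   # teleport
    else:
        v = rng.choice(out[v])                 # follow a random out-link
    visits[v] += 1
freq = [c / steps for c in visits]

# power iteration on the same chain for comparison
pr = [1 / n] * n
for _ in range(100):
    new = [alpha / n] * n
    for s in range(n):
        for t in out[s]:
            new[t] += (1 - alpha) * pr[s] / len(out[s])
    pr = new
print([round(f, 3) for f in freq], [round(p, 3) for p in pr])
```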
4. Personalized PageRank
While the use of PageRank has proven very effective, the web’s
rapid growth in size and diversity drives an increasing demand for
greater flexibility in ranking. Ideally, each user should be able to
define their own notion of importance for each individual query.
(a) Suppose you have your own notion of importance. Your interest
in a node is proportional to the node’s PageRank, because you
totally rely upon Google to decide which website to visit (assume that these nodes represent websites). Again, use random
walk on the network generated in part 3(a) to simulate this personalized PageRank. Here the teleportation probability to each
node is proportional to its PageRank (as opposed to regular
PageRank, where at teleportation the chance of visiting each
node is the same and equal to 1/N). Again, let the teleportation probability be equal to α = 0.15. Compare the results with
3(b).
(b) Find two nodes in the network with median PageRanks. Repeat part (a) if teleportations land only on those two nodes
(with probabilities 1/2, 1/2). How are the PageRank values affected?
(c) More or less, part (b) is what happens in the real world, in that a
user browsing the web only teleports to a set of trusted web
pages. However, this goes against the assumption of
regular PageRank, where we assume that people’s interest in
all nodes is the same. Can you take into account the effect of
this self-reinforcement and adjust the PageRank equation?
Final Remarks
The following functions from the igraph library are useful for this project:
• degree, degree.distribution, diameter, vcount, ecount
• random.graph.game, barabasi.game, aging.prefatt.game,
degree.sequence.game
• page_rank
For part 2 of the project, you can start off with the Jupyter notebook
provided to you.
ECE 232E Project 2: Social Network Mining
In this project, we will study the various properties of social networks.
In the first part of the project, we will study an undirected social network (Facebook). In the second part of the project, we will study a
directed social network (Google+).
1 Facebook network
In this project, we will be using the dataset given below:
https://snap.stanford.edu/data/egonets-Facebook.html
The Facebook network can be created from the edgelist file
(facebook_combined.txt).
1.1 Structural properties of the facebook network
Having created the facebook network, we will study some of the structural properties of the network. To be specific, we will study
• Connectivity
• Degree distribution
Question 1: Is the facebook network connected? If not, find the giant
connected component (GCC) of the network and report the size of the
GCC.
Question 2: Find the diameter of the network. If the network is not
connected, then find the diameter of the GCC.
Question 3: Plot the degree distribution of the facebook network and
report the average degree.
Question 4: Plot the degree distribution of question 3 in a log-log scale.
Try to fit a line to the plot and estimate the slope of the line.
1.2 Personalized network
A personalized network of a user vi is defined as the subgraph induced
by vi and its neighbors. In this part, we will study some of the structural properties of the personalized network of the user whose graph
node ID is 1 (node ID in the edgelist is 0). From this point onwards, whenever we refer to a node ID, we mean the graph node ID, which is
1 + the node ID in the edgelist.
Question 5: Create a personalized network of the user whose ID is 1.
How many nodes and edges does this personalized network have?
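Building a personalized network is an induced-subgraph operation (in igraph, take a node’s first-order neighborhood and induce the subgraph on it). A dependency-free sketch on a toy edge list (`ego_network` is an illustrative helper, and the edge list is made up):

```python
def ego_network(edges, center):
    """Induced subgraph on `center` and its neighbors."""
    nbrs = {center}
    for u, v in edges:
        if u == center:
            nbrs.add(v)
        if v == center:
            nbrs.add(u)
    sub = [(u, v) for u, v in edges if u in nbrs and v in nbrs]
    return nbrs, sub

# toy edge list in the spirit of the Facebook file (0-based IDs)
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 4)]
nodes, sub = ego_network(edges, 0)
print(sorted(nodes), sub)
```

Since the center is adjacent to every other node of its personalized network, any two nodes are at distance at most 2 through the center, which is useful for the bounds asked for in question 6.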
Question 6: What is the diameter of the personalized network? Please
state a trivial upper and lower bound for the diameter of the personalized network.
Question 7: In the context of the personalized network, what does it
mean for the diameter of the personalized network to be equal to the
upper bound you derived in question 6? What does it mean for it to be
equal to the lower bound you derived in question 6?
1.3 Core node’s personalized network
A core node is defined as a node that has more than 200 neighbors.
For visualization purpose, we have displayed the personalized network
of a core node below.
An example of a personal network. The core node is shown in black.
In this part, we will study various properties of the personalized network of the core nodes.
Question 8: How many core nodes are there in the Facebook network?
What is the average degree of the core nodes?
1.3.1 Community structure of core node’s personalized network
In this part, we study the community structure of the core node’s personalized network. To be specific, we will study the community structure of the personalized network of the following core nodes:
• Node ID 1
• Node ID 108
• Node ID 349
• Node ID 484
• Node ID 1087
Question 9: For each of the above core node’s personalized network,
find the community structure using Fast-Greedy, Edge-Betweenness,
and Infomap community detection algorithms. Compare the modularity scores of the algorithms. For visualization purpose, display the community structure of the core node’s personalized networks using colors.
Nodes belonging to the same community should have the same color
and nodes belonging to different communities should have different
colors. In this question, you should have 15 plots in total.
1.3.2 Community structure with the core node removed
In this part, we will explore the effect on the community structure of a
core node’s personalized network when the core node itself is removed
from the personalized network.
Question 10: For each of the core node’s personalized networks (use the
same core nodes as question 9), remove the core node from the personalized
network and find the community structure of the modified personalized
network. Use the same community detection algorithms as in question 9.
Compare the modularity score of the community structure of the modified personalized network with the modularity score of the community
structure of the personalized network of question 9. For visualization
purpose, display the community structure of the modified personalized
network using colors. In this question, you should have 15 plots in
total.
1.3.3 Characteristic of nodes in the personalized network
In this part, we will explore characteristics of nodes in the personalized network using two measures. These two measures are stated and
defined below:
• Embeddedness of a node is defined as the number of mutual
friends a node shares with the core node.
• Dispersion of a node is defined as the sum of distances between
every pair of the mutual friends the node shares with the core node.
The distances should be calculated in a modified graph where the
node (whose dispersion is being computed) and the core node are
removed.
For further details on the above characteristics, you can read the paper
below:
https://arxiv.org/abs/1310.6753
Question 11: Write an expression relating the Embeddedness of a
node to its degree.
Question 12: For each of the core node’s personalized network (use the
same core nodes as question 9), plot the distribution of embeddedness
and dispersion. In this question, you will have 10 plots.
Question 13: For each of the core node’s personalized network, plot
the community structure of the personalized network using colors and
highlight the node with maximum dispersion. Also, highlight the edges
incident to this node. To detect the community structure, use the Fast-Greedy algorithm. In this question, you will have 5 plots.
Question 14: Repeat question 13, but now highlight the node with
maximum embeddedness and the node with maximum dispersion/embeddedness
ratio. Also, highlight the edges incident to these nodes.
Question 15: Use the plots from questions 13 and 14 to explain the
characteristics of a node revealed by each of these measures.
1.4 Friend recommendation in personalized networks
In many social networks, it is desirable to predict future links between
pairs of nodes in the network. In the context of this Facebook network
it is equivalent to recommending friends to users. In this part of the
project, we will explore some neighborhood-based measures for friend
recommendation. The network that we will be using for this part is the
personalized network of node with ID 415.
1.4.1 Neighborhood based measure
In this project, we will be exploring three different neighborhood-based
measures. Before we define these measures, let’s introduce some notation:
• Si is the neighbor set of node i in the network
• Sj is the neighbor set of node j in the network
Then, with the above notation we define the three measures below:
• Common neighbor measure between node i and node j is defined
as

CommonNeighbors(i, j) = |Si ∩ Sj|

• Jaccard measure between node i and node j is defined as

Jaccard(i, j) = |Si ∩ Sj| / |Si ∪ Sj|

• Adamic-Adar measure between node i and node j is defined as

AdamicAdar(i, j) = Σ_{k ∈ Si ∩ Sj} 1 / log(|Sk|)
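The three measures translate directly into set operations; a minimal sketch (the toy neighbor sets are illustrative; note that any common neighbor k of two nodes has |S_k| ≥ 2, so log|S_k| > 0):

```python
import math

def common_neighbors(S, i, j):
    return len(S[i] & S[j])

def jaccard(S, i, j):
    union = S[i] | S[j]
    return len(S[i] & S[j]) / len(union) if union else 0.0

def adamic_adar(S, i, j):
    # common neighbors with small degree count more
    return sum(1.0 / math.log(len(S[k])) for k in S[i] & S[j])

# toy neighbor sets
S = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
cn = common_neighbors(S, 2, 4)   # common neighbors {1, 3} -> 2
jc = jaccard(S, 2, 4)            # |{1,3}| / |{1,3}| -> 1.0
aa = adamic_adar(S, 2, 4)        # 1/log(3) + 1/log(3)
print(cn, jc, aa)
```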
1.4.2 Friend recommendation using neighborhood based measures
We can use the neighborhood based measures defined in the previous
section to recommend new friends to users in the network. Suppose we
want to recommend t new friends to some user i in the network using the
Jaccard measure. We follow the steps listed below:
1. For each node j in the network that is not a neighbor of i, compute
the Jaccard measure between node i and node j:

Compute Jaccard(i, j) ∀ j ∈ S_i^C

2. Then pick the t nodes that have the highest Jaccard measure with node
i and recommend these nodes as friends to node i.
1.4.3 Creating the list of users
Having defined the friend recommendation procedure, we can now apply it to the personalized network of node ID 415. Before we apply
the algorithm, we need to create the list of users to whom we want to recommend new friends. We create this list by picking all nodes with
degree 24. We will denote this list as Nr.
Question 16: What is |Nr|?
1.4.4 Average accuracy of friend recommendation algorithm
In this part, we will apply the 3 different types of friend recommendation algorithms to recommend friends to the users in the list Nr. We
will define an average accuracy measure to compare the performances
of the friend recommendation algorithms.
Suppose we want to compute the average accuracy of the friend recommendation algorithm. This task is completed in two steps:
1. Compute the average accuracy for each user in the list Nr
2. Compute the average accuracy of the algorithm by averaging across
the accuracies of the users in the list Nr
Let’s describe the procedure for accomplishing step 1 of the task.
Suppose we want to compute the average accuracy for user i in the list
Nr. We can compute it by iterating over the following steps 10 times
and then taking the average:
1. Remove each edge of node i at random with probability 0.25. In
this context, this is equivalent to deleting some friends of node i. Let’s
denote the list of friends deleted as Ri.

2. Use one of the three neighborhood-based measures to recommend
|Ri| new friends to the user i. Let’s denote the list of friends recommended as Pi.

3. The accuracy for the user i for this iteration is given by |Pi ∩ Ri| / |Ri|.
Iterating over the above steps 10 times and taking the average gives us
the average accuracy of user i. In this manner, we compute
the average accuracy for each user in the list Nr. Once we have computed them, we can take the mean of the average accuracies of the
users in the list Nr. This mean value is the average accuracy of the
friend recommendation algorithm.
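The evaluation loop above can be sketched as follows (Jaccard shown; the toy graph and helper names are illustrative, and iterations where no edge happens to be deleted are skipped to avoid dividing by zero):

```python
import random

adj = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 4},
       3: {0, 1, 4}, 4: {0, 2, 3}}            # toy undirected graph
rng = random.Random(7)

def jaccard(S, i, j):
    union = S[i] | S[j]
    return len(S[i] & S[j]) / len(union) if union else 0.0

def avg_accuracy(adj, i, iters=10, p_del=0.25):
    accs = []
    for _ in range(iters):
        R = {j for j in adj[i] if rng.random() < p_del}  # deleted friends
        if not R:
            continue
        S = {v: set(nb) for v, nb in adj.items()}        # modified graph
        S[i] -= R
        for j in R:
            S[j].discard(i)
        # recommend the |R| non-neighbors with the highest Jaccard score
        cand = [v for v in S if v != i and v not in S[i]]
        cand.sort(key=lambda v: jaccard(S, i, v), reverse=True)
        P = set(cand[:len(R)])
        accs.append(len(P & R) / len(R))
    return sum(accs) / len(accs) if accs else None

acc = avg_accuracy(adj, 0)
print(acc)
```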
Question 17: Compute the average accuracy of the friend recommendation algorithm that uses:
• Common Neighbors measure
• Jaccard measure
• Adamic Adar measure
Based on the average accuracy values, which friend recommendation
algorithm is the best?
2 Google+ network
In this part, we will explore the structure of the Google+ network. The
dataset for creating the network can be found in the link below:
https://snap.stanford.edu/data/egonets-Gplus.html
Create directed personal networks for users who have more than 2 circles. The data required to create such personal networks can be found
in the file named gplus.tar.gz.
Question 18: How many personal networks are there?
Question 19: For the 3 personal networks (node ID given below), plot
the in-degree and out-degree distribution of these personal networks.
Do the personal networks have similar in- and out-degree distributions? In this question, you should have 6 plots.
• 109327480479767108490
• 115625564993990145546
• 101373961279443806744
2.1 Community structure of personal networks
In this part of the project, we will explore the community structure
of the personal networks that we created and explore the connections
between communities and user circles.
Question 20: For the 3 personal networks picked in question 19, extract the community structure of each personal network using Walktrap community detection algorithm. Report the modularity scores and
plot the communities using colors. Are the modularity scores similar?
In this question, you should have 3 plots.
Having found the communities, now we will explore the relationship
between circles and communities. In order to explore the relationship,
we define two measures:
• Homogeneity
• Completeness
Before we state the expressions for homogeneity and completeness, let’s
introduce some notation:
• C is the set of circles, C = {C1, C2, C3, · · · }
• K is the set of communities, K = {K1, K2, K3, · · · }
• ai is the number of people in circle Ci
• bi is the number of people in community Ki with circle information
• N is the total number of people with circle information
• Cji is the number of people belonging to community j and circle i
Then, with the above notation, we have the following expressions for
the entropy

H(C) = − Σ_{i=1}^{|C|} (a_i/N) log(a_i/N)    (1)

H(K) = − Σ_{i=1}^{|K|} (b_i/N) log(b_i/N)    (2)

and the conditional entropy

H(C|K) = − Σ_{j=1}^{|K|} Σ_{i=1}^{|C|} (C_ji/N) log(C_ji/b_j)    (3)

H(K|C) = − Σ_{i=1}^{|C|} Σ_{j=1}^{|K|} (C_ji/N) log(C_ji/a_i)    (4)

Now we can state the expression for homogeneity, h, as

h = 1 − H(C|K)/H(C)    (5)

and the expression for completeness, c, as

c = 1 − H(K|C)/H(K)    (6)
Question 21: Based on the expression for h and c, explain the meaning
of homogeneity and completeness in words.
Question 22: Compute the h and c values for the community structures
of the 3 personal networks (same nodes as question 19). Interpret the
values and provide a detailed explanation.
3 Submission
Please submit a zip file containing your codes and report to
CCLE. The zip file should be named as “Project2_UID1_…_UIDn.zip”
where UIDx are the student ID numbers of the team members. If you
have any questions, you can post them on Piazza.
ECE 232E Project 3: Reinforcement learning and Inverse Reinforcement learning
1 Introduction
Reinforcement Learning (RL) is the task of learning from interaction to achieve
a goal. The learner and the decision maker is called the agent. The thing
it interacts with, comprising everything outside the agent, is called the environment. These interact continually, the agent selecting actions and the
environment responding to those actions by presenting rewards and new states.
In the first part of the project, we will learn the optimal policy of an agent
navigating in a 2-D environment. We will implement the Value iteration algorithm to learn the optimal policy.
Inverse Reinforcement Learning (IRL) is the task of extracting an expert’s reward function by observing the optimal policy of the expert. In the second
part of the project, we will explore the application of IRL in the context of
apprenticeship learning.
2 Reinforcement learning (RL)
The two main objects in Reinforcement learning are:
• Agent
• Environment
In this project, we will learn the optimal policy of a single agent navigating in
a 2-D environment.
2.1 Environment
In this project, we assume that the environment of the agent is modeled by a
Markov Decision Process (MDP). In an MDP, agents occupy a state of the
environment and perform actions to change the state they are in. After taking
an action, they are given some representation of the new state and some reward
value associated with the new state.
Formally, an MDP is a tuple (S, A, P^a_ss′, R^a_ss′, γ) where:

• S is a set of states

• A is a set of actions

• P^a_ss′ is a set of transition probabilities, where P^a_ss′ is the probability of
transitioning from state s ∈ S to state s′ ∈ S after taking action a ∈ A

  – P^a_ss′ = P(s_{t+1} = s′ | s_t = s, a_t = a)

• Given any current state and action, s and a, together with any next state,
s′, the expected value of the next reward is R^a_ss′

  – R^a_ss′ = E(r_{t+1} | s_t = s, a_t = a, s_{t+1} = s′)

• γ ∈ [0, 1) is the discount factor, and it is used to compute the present
value of future reward

  – If γ is close to 1 then the future rewards are discounted less

  – If γ is close to 0 then the future rewards are discounted more
In the next few subsections, we will discuss the parameters that will be used to
generate the environment for the project.
2.1.1 State space
In this project, we consider the state space to be a 2-D square grid with 100
states. The 2-D square grid along with the numbering of the states is shown in
figure 1.
Figure 1: 2-D square grid with state numbering
2.1.2 Action set
In this project, we consider the action set (A) to contain the following 4 actions:
• Move Right
• Move Left
• Move Up
• Move Down
The 4 types of actions are displayed in figure 2.
Figure 2: 4 types of action
From the above figure, we can see that the agent can take 4 actions from
the state marked with a dot.
2.1.3 Transition probabilities
In this project, we define the transition probabilities in the following manner:
1. If states s′ and s are not neighboring states in the 2-D grid, then

P(s_{t+1} = s′ | s_t = s, a_t = a) = 0

s′ and s are neighbors in the 2-D grid if you can move to s′ from s by
taking an action a from the action set A. We will consider a state s to be
a neighbor of itself. For example, from figure 1 we can observe that states
1 and 11 are neighbors (we can transition from 1 to 11 by moving right)
but states 1 and 12 are not neighbors.

2. Each action corresponds to a movement in the intended direction with
probability 1 − w, but has a probability of w of moving in a random
direction instead due to wind. To illustrate this, let’s consider the states
shown in figure 3
Figure 3: Inner grid states (Non-boundary states)
The transition probabilities for the non-boundary states shown in figure
3 are given below:

P(s_{t+1} = 43 | s_t = 44, a_t = ↑) = 1 − w + w/4

P(s_{t+1} = 34 | s_t = 44, a_t = ↑) = w/4

P(s_{t+1} = 54 | s_t = 44, a_t = ↑) = w/4

P(s_{t+1} = 45 | s_t = 44, a_t = ↑) = w/4

From the above calculation it can be observed that if the agent is at a non-boundary state then it has 4 neighbors excluding itself, and the probability
w is uniformly distributed over the 4 neighbors. Also, if the agent is at
a non-boundary state then it transitions to a new state after taking an
action (P(s_{t+1} = 44 | s_t = 44, a_t = ↑) = 0).
3. If the agent is at one of the four corner states (0, 9, 90, 99), the agent stays
at the current state if it takes an action to move off the grid or is blown
off the grid by wind. The actions can be divided into two categories:

• Action to move off the grid

• Action to stay in the grid

To illustrate this, let’s consider the states shown in figure 4
Figure 4: Corner states
The transition probabilities for taking an action to move off the grid are
given below:

P(s_{t+1} = 10 | s_t = 0, a_t = ↑) = w/4

P(s_{t+1} = 1 | s_t = 0, a_t = ↑) = w/4

P(s_{t+1} = 0 | s_t = 0, a_t = ↑) = 1 − w + w/4 + w/4

The transition probabilities for taking an action to stay in the grid are
given below:

P(s_{t+1} = 10 | s_t = 0, a_t = →) = 1 − w + w/4

P(s_{t+1} = 1 | s_t = 0, a_t = →) = w/4

P(s_{t+1} = 0 | s_t = 0, a_t = →) = w/4 + w/4

At a corner state, you can be blown off the grid in two directions. As a
result, we have P(s_{t+1} = 0 | s_t = 0, a_t = →) = w/4 + w/4, since we can be blown
off the grid in two directions and in both cases we stay at the current
state.
4. If the agent is at one of the edge states, the agent stays at the current
state if it takes an action to move off the grid or is blown off the grid by
wind. The actions can be divided into two categories:

• Action to move off the grid

• Action to stay in the grid

To illustrate this, let’s consider the states shown in figure 5
Figure 5: Edge states
The transition probabilities for taking an action to move off the grid are
given below:

P(s_{t+1} = 0 | s_t = 1, a_t = ←) = w/4

P(s_{t+1} = 11 | s_t = 1, a_t = ←) = w/4

P(s_{t+1} = 2 | s_t = 1, a_t = ←) = w/4

P(s_{t+1} = 1 | s_t = 1, a_t = ←) = 1 − w + w/4

The transition probabilities for taking an action to stay in the grid are
given below:

P(s_{t+1} = 0 | s_t = 1, a_t = ↑) = 1 − w + w/4

P(s_{t+1} = 11 | s_t = 1, a_t = ↑) = w/4

P(s_{t+1} = 2 | s_t = 1, a_t = ↑) = w/4

P(s_{t+1} = 1 | s_t = 1, a_t = ↑) = w/4

At an edge state, you can be blown off the grid in one direction. As a
result, we have P(s_{t+1} = 1 | s_t = 1, a_t = ↑) = w/4, since we can be blown off
the grid in one direction and in that case we stay at the current state.

The main difference between a corner state and an edge state is that a corner
state has 2 neighbors and an edge state has 3 neighbors.
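All four cases above can be produced by one rule: compute the deterministic target of each move, treating off-grid moves as “stay put”, then mix the intended direction (probability 1 − w) with a uniformly random direction (probability w). A numpy sketch, with the column-major state numbering implied by figure 1 (moving right adds 10, moving down adds 1); the helper names are illustrative:

```python
import numpy as np

n_side, w = 10, 0.1
S = n_side * n_side
# state s lies at row s % 10, column s // 10
moves = {"up": -1, "down": +1, "left": -10, "right": +10}

def step(s, d):
    """Deterministic target of move d from s; off-grid moves stay put."""
    r, c = s % n_side, s // n_side
    if d == "up" and r == 0: return s
    if d == "down" and r == n_side - 1: return s
    if d == "left" and c == 0: return s
    if d == "right" and c == n_side - 1: return s
    return s + moves[d]

P = np.zeros((4, S, S))  # P[a, s, s']
for a, act in enumerate(moves):
    for s in range(S):
        P[a, s, step(s, act)] += 1 - w      # intended direction
        for d in moves:                     # wind: uniform over 4 directions
            P[a, s, step(s, d)] += w / 4

assert np.allclose(P.sum(axis=2), 1.0)      # each row is a distribution
print(P[0, 44, 43])  # P(43 | 44, up) = 1 - w + w/4 = 0.925
```

The corner and edge cases then fall out automatically, e.g. P(0 | 0, →) = w/4 + w/4 as derived above.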
2.1.4 Reward function
To simplify the project, we will assume that the reward function is independent
of the current state (s) and the action that you take at the current state (a).
To be specific, the reward function only depends on the state that you transition to
(s′). With this simplification, we have

R^a_ss′ = R(s′)
In this project, we will learn the optimal policy of an agent for two different
reward functions:

• Reward function 1

• Reward function 2

The two different reward functions are displayed in figures 6 and 7, respectively.
Figure 6: Reward function 1
Figure 7: Reward function 2
Question 1: (10 points) For visualization purposes, generate heat maps of
Reward function 1 and Reward function 2. For the heat maps, make sure you
display the coloring scale. You will have 2 plots for this question.
For solving question 1, you might find the following function useful:
https://matplotlib.org/api/_as_gen/matplotlib.pyplot.pcolor.html
3 Optimal policy learning using RL algorithms
In this part of the project, we will use a reinforcement learning (RL) algorithm
to find the optimal policy. The main steps of the RL algorithm are:
• Find optimal state-value or action-value
• Use the optimal state-value or action-value to determine the deterministic
optimal policy
There are several RL algorithms, but we will use the Value Iteration algorithm since it was discussed in detail in the lecture. We will skip the derivation
of the algorithm here because it was covered in the lecture (for the derivation
details, please refer to the lecture slides on Reinforcement Learning). We will just
reproduce the algorithm below for ease of implementation:
1: procedure ValueIteration(P^a_ss′, R^a_ss′, S, A, γ):
2:   for all s ∈ S do                ▷ Initialization
3:     V(s) ← 0
4:   end for
5:   Δ ← ∞
6:   while Δ > ε do                  ▷ Estimation
7:     Δ ← 0
8:     for all s ∈ S do
9:       v ← V(s)
10:      V(s) ← max_{a∈A} Σ_{s′∈S} P^a_ss′ [R^a_ss′ + γV(s′)]
11:      Δ ← max(Δ, |v − V(s)|)
12:    end for
13:  end while
14:  for all s ∈ S do                ▷ Computation
15:    π(s) ← arg max_{a∈A} Σ_{s′∈S} P^a_ss′ [R^a_ss′ + γV(s′)]
16:  end for
17: end procedure return π
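The pseudocode maps almost line-for-line onto numpy. The sketch below is vectorized over states; the toy 2-state MDP at the end is purely illustrative, and for the project you would plug in the 100-state grid environment from section 2:

```python
import numpy as np

def value_iteration(P, R, gamma=0.8, eps=0.01):
    """P[a, s, s'] = transition probabilities, R[s'] = reward on arrival."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)                  # Initialization (lines 2-4)
    delta = np.inf
    while delta > eps:                      # Estimation (lines 5-13)
        Q = (P * (R + gamma * V)).sum(axis=2)   # Q[a, s]
        V_new = Q.max(axis=0)
        delta = np.abs(V_new - V).max()
        V = V_new
    # Computation (lines 14-17): greedy policy w.r.t. the converged values
    policy = (P * (R + gamma * V)).sum(axis=2).argmax(axis=0)
    return V, policy

# toy 2-state, 2-action MDP: action 1 always lands in the rewarding state 1
P = np.array([[[1.0, 0.0], [1.0, 0.0]],    # action 0 -> state 0
              [[0.0, 1.0], [0.0, 1.0]]])   # action 1 -> state 1
R = np.array([0.0, 1.0])
V, pi = value_iteration(P, R)
print(V, pi)  # both values near 1/(1 - 0.8) = 5, optimal action 1 everywhere
```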
Question 2: (40 points) Create the environment of the agent using the information provided in section 2. To be specific, create the MDP by setting up
the state-space, action set, transition probabilities, discount factor, and reward
function. For creating the environment, use the following set of parameters:
• Number of states = 100 (state space is a 10 by 10 square grid as displayed
in figure 1)
• Number of actions = 4 (set of possible actions is displayed in figure 2)
• w = 0.1
• Discount factor = 0.8
• Reward function 1
After you have created the environment, write an optimal state-value function that takes as input the environment of the agent and outputs the optimal
value of each state in the grid. For the optimal state-value function, you have
to implement the Initialization (lines 2-4) and Estimation (lines 5-13) steps of
the Value Iteration algorithm. For the estimation step, use ε = 0.01. For visualization purposes, you should generate a figure similar to that of figure 1 but
with the number of each state replaced by the optimal value of that state. In this
question, you should have 1 plot.
Question 3: (5 points) Generate a heat map of the optimal state values across
the 2-D grid. For generating the heat map, you can use the same function provided in the hint earlier (see the hint after question 1).
Question 4: (15 points) Explain the distribution of the optimal state values
across the 2-D grid. (Hint: Use the figure generated in question 3 to explain)
Question 5: (30 points) Implement the computation step of the value iteration algorithm (lines 14-17) to compute the optimal policy of the agent navigating the 2-D state-space. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The optimal actions should be displayed using arrows. Does the optimal policy of the agent match your intuition? Please provide a brief explanation. Is it possible for the agent to compute the optimal action to take at each state by observing the optimal values of its neighboring states? In this question, you should have 1 plot.
Question 6: (10 points) Modify the environment of the agent by replacing Reward function 1 with Reward function 2. Use the optimal state-value function implemented in question 2 to compute the optimal value of each state in the grid. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal value of that state. In this question, you should have 1 plot.
Question 7: (10 points) Generate a heat map of the optimal state values (found
in question 6) across the 2-D grid. For generating the heat map, you can use
the same function provided in the hint earlier.
Question 8: (20 points) Explain the distribution of the optimal state values
across the 2-D grid. (Hint: Use the figure generated in question 7 to explain)
Question 9: (20 points) Implement the computation step of the value iteration algorithm (lines 14-17) to compute the optimal policy of the agent navigating the 2-D state-space. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The optimal actions should be displayed using arrows. Does the optimal policy of the agent match your intuition? Please provide a brief explanation. In this question, you should have 1 plot.
4 Inverse Reinforcement learning (IRL)
Inverse Reinforcement learning (IRL) is the task of learning an expert’s reward
function by observing the optimal behavior of the expert. The motivation for
IRL comes from apprenticeship learning. In apprenticeship learning, the goal of
the agent is to learn a policy by observing the behavior of an expert. This task
can be accomplished in two ways:
1. Learn the policy directly from expert behavior
2. Learn the expert’s reward function and use it to generate the optimal
policy
The second way is preferred because the reward function provides a much more parsimonious description of behavior. The reward function, rather than the policy, is the most succinct, robust, and transferable definition of the task. Therefore, extracting the reward function of an expert would help design more robust agents.
In this part of the project, we will use an IRL algorithm to extract the reward
function. We will use the optimal policy computed in the previous section as
the expert behavior and use the algorithm to extract the reward function of
the expert. Then, we will use the extracted reward function to compute the
optimal policy of the agent. We will compare the optimal policy of the agent
to the optimal policy of the expert and use some similarity metric between the
two to measure the performance of the IRL algorithm.
4.1 IRL algorithm
For finite state spaces, there are a couple of IRL algorithms for extracting the
reward function:
• Linear Programming (LP) formulation
• Maximum Entropy formulation
Since we covered the LP formulation in the lecture and it is the simplest IRL algorithm, we will use the LP formulation in this project. We will skip the derivation of the algorithm here (for details on the derivation, please refer to the lecture slides). The LP formulation of IRL is given by equation 1:
    maximize (over R, ti, ui)   Σ_{i=1}^{|S|} (ti − λui)

    subject to   [(Pa1(i) − Pa(i))(I − γPa1)^{−1} R] ≥ ti,   ∀a ∈ A \ a1, ∀i
                 (Pa1 − Pa)(I − γPa1)^{−1} R ⪰ 0,            ∀a ∈ A \ a1
                 −u ⪯ R ⪯ u
                 |Ri| ≤ Rmax,   i = 1, 2, ..., |S|                        (1)
In the LP given by equation 1, R is the reward vector (R(i) = R(si)), Pa is the transition probability matrix, λ is the adjustable penalty coefficient, and the ti's and ui's are extra optimization variables (please note that u(i) = ui). Use the maximum absolute value of the ground truth reward as Rmax. For ease of implementation, we can recast the LP in equation 1 into an equivalent form given by equation 2 using block matrices.
    maximize (over x)   c^T x
    subject to          Dx ⪯ 0,   ∀a ∈ A \ a1                             (2)
Question 10: (10 points) Express c, x, and D in terms of R, Pa, Pa1, ti, u, and Rmax.
4.2 Performance measure
In this project, we use a very simple measure to evaluate the performance of the
IRL algorithm. Before we state the performance measure, let’s introduce some
notation:
• OA(s): Optimal action of the agent at state s
• OE(s): Optimal action of the expert at state s
• m(s) = 1 if OA(s) = OE(s), and 0 otherwise

Then with the above notation, accuracy is given by equation 3:

    Accuracy = (Σ_{s∈S} m(s)) / |S|                                       (3)
Since we are using the optimal policy found in the previous section as the expert behavior, we will use that policy to fill in the OE(s) values. Please note that these values will be different depending on whether we used Reward Function 1 or Reward Function 2 to create the environment.
To compute OA(s), we will solve the linear program given by equation 2 to extract the reward function of the expert. For solving the linear program you can use the LP solver in Python (from cvxopt import solvers and then use solvers.lp). Then, we will use the extracted reward function to compute the optimal policy of the agent using the value iteration algorithm you implemented in the previous section. The optimal policy of the agent found in this manner will be used to fill in the OA(s) values. Please note that these values will depend on the adjustable penalty coefficient λ. We will tune λ to maximize the accuracy.
Question 11: (30 points) Sweep λ from 0 to 5 to get 500 evenly spaced values of λ. For each value of λ, compute OA(s) by following the process described above. For this problem, use the optimal policy of the agent found in question 5 to fill in the OE(s) values. Then use equation 3 to compute the accuracy of the IRL algorithm for this value of λ. You need to repeat the above process for all 500 values of λ to get 500 data points. Plot λ (x-axis) against Accuracy (y-axis). In this question, you should have 1 plot.
Question 12: (5 points) Use the plot in question 11 to compute the value of λ for which accuracy is maximum. For future reference we will denote this value as λ_max^(1). Please report λ_max^(1).
Question 13: (15 points) For λ = λ_max^(1), generate heat maps of the ground truth reward and the extracted reward. Please note that the ground truth reward is Reward function 1 and the extracted reward is computed by solving the linear program given by equation 2 with the parameter λ set to λ_max^(1). In this question, you should have 2 plots.
Question 14: (10 points) Use the extracted reward function computed in question 13 to compute the optimal values of the states in the 2-D grid. For computing the optimal values, you need to use the optimal state-value function that you wrote in question 2. For visualization purposes, generate a heat map of the optimal state values across the 2-D grid (similar to the figure generated in question 3). In this question, you should have 1 plot.
Question 15: (10 points) Compare the heat maps of Question 3 and Question 14 and provide a brief explanation of their similarities and differences.
Question 16: (10 points) Use the extracted reward function found in question 13 to compute the optimal policy of the agent. For computing the optimal policy of the agent, you need to use the function that you wrote in question 5. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The actions should be displayed using arrows. In this question, you should have 1 plot.
Question 17: (10 points) Compare the figures of Question 5 and Question 16 and provide a brief explanation of their similarities and differences.
Question 18: (30 points) Sweep λ from 0 to 5 to get 500 evenly spaced values of λ. For each value of λ, compute OA(s) by following the process described above. For this problem, use the optimal policy of the agent found in question 9 to fill in the OE(s) values. Then use equation 3 to compute the accuracy of the IRL algorithm for this value of λ. You need to repeat the above process for all 500 values of λ to get 500 data points. Plot λ (x-axis) against Accuracy (y-axis). In this question, you should have 1 plot.
Question 19: (5 points) Use the plot in question 18 to compute the value of λ for which accuracy is maximum. For future reference we will denote this value as λ_max^(2). Please report λ_max^(2).
Question 20: (15 points) For λ = λ_max^(2), generate heat maps of the ground truth reward and the extracted reward. Please note that the ground truth reward is Reward function 2 and the extracted reward is computed by solving the linear program given by equation 2 with the parameter λ set to λ_max^(2). In this question, you should have 2 plots.
Question 21: (10 points) Use the extracted reward function computed in question 20 to compute the optimal values of the states in the 2-D grid. For computing the optimal values, you need to use the optimal state-value function that you wrote in question 2. For visualization purposes, generate a heat map of the optimal state values across the 2-D grid (similar to the figure generated in question 7). In this question, you should have 1 plot.
Question 22: (10 points) Compare the heat maps of Question 7 and Question 21 and provide a brief explanation of their similarities and differences.
Question 23: (10 points) Use the extracted reward function found in question 20 to compute the optimal policy of the agent. For computing the optimal policy of the agent, you need to use the function that you wrote in question 9. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The actions should be displayed using arrows. In this question, you should have 1 plot.
Question 24: (10 points) Compare the figures of Question 9 and Question 23 and provide a brief explanation of their similarities and differences.
Question 25: (50 points) From the figure in question 23, you should observe that the optimal policy of the agent has two major discrepancies. Please identify and provide the causes for these two discrepancies. One of the discrepancies can be fixed easily by a slight modification to the value iteration algorithm. Perform this modification and re-run the modified value iteration algorithm to compute the optimal policy of the agent. Also, recompute the maximum accuracy after this modification. Is there a change in maximum accuracy? The second discrepancy is harder to fix and is a limitation of the simple IRL algorithm. If you can provide a solution to the second discrepancy, we will give you a bonus of 50 points.
5 Submission
Please submit a zip file containing your codes and report to CCLE.
The zip file should be named "Project2 UID1 … UIDn.zip" where
UIDx are student ID numbers of team members.
ECE 232E Project 4 IMDb Mining
In this project, we will study the various properties of Internet Movie
Database (IMDb). In the first part of the project, we will explore the
properties of a directed actor/actress network. In the second part of the
project, we will explore the properties of an undirected movie network.
1 Actor/Actress network
In this part of the project, we will create the network using the data
from the following text files:
• actor_movies.txt
• actress_movies.txt
The text files can be downloaded from the following link: https://
ucla.box.com/s/z45q3g5zrpay8b8gtbql6ojaecb7kj2u
In order to create the network in a consistent manner, you will need to
do some data preprocessing. The preprocessing consists of 2 parts:
1. Merging the two text files into one and then removing the actors/actresses who have acted in fewer than 10 movies
2. Cleaning the merged text file
The cleaning part is necessary to avoid inconsistency in the network creation. If you analyze the merged text file, you will observe that the same movie might be counted multiple times due to the role of the actor/actress in that movie. For example, we might have
• Movie X (voice)
• Movie X (as uncredited)
If you don't clean the merged text file, then Movie X (voice) and Movie X (as uncredited) will be considered different movies. Therefore, you will need to perform some cleaning operations to remove inconsistencies of various types.
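One reasonable cleaning rule is to strip trailing parenthetical role annotations while keeping a 4-digit year, since the year is part of the movie's identity. A sketch (this regex is a starting point, not an exhaustive cleaning rule):

```python
import re

def clean_title(raw):
    """Strip role annotations like "(voice)" or "(as uncredited)" so the
    same movie is counted once.  A year such as "(2015)" belongs to the
    title and is kept: the negative lookahead skips 4-digit parentheticals.
    """
    cleaned = re.sub(r'\s*\((?!\d{4}\))[^)]*\)', '', raw)
    return cleaned.strip()
```

For example, "Movie X (2015)  (voice)" and "Movie X (2015)  (as uncredited)" both normalize to "Movie X (2015)".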
Question 1: Perform the preprocessing on the two text files and report
the total number of actors and actresses and total number of unique
movies that these actors and actresses have acted in.
1.1 Directed actor/actress network creation
We will use the processed text file to create the directed actor/actress network. The nodes of the network are the actors/actresses, and there are weighted edges between the nodes in the network. The weights of the edges are given by equation 1:
    wi→j = |Si ∩ Sj| / |Si|                                               (1)
where Si is the set of movies in which actor/actress vi has acted and Sj is the set of movies in which actor/actress vj has acted.
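Under equation 1, the weight computation reduces to set intersections. A brute-force sketch, assuming a dict mapping each actor/actress to their set of movies (for the full data set you would first build a movie-to-actors index to avoid the quadratic scan):

```python
def build_edges(filmography):
    """filmography: dict mapping actor/actress name -> set of movie titles.
    Returns directed weighted edges (i, j, w_ij) per equation 1:
    w_ij = |S_i & S_j| / |S_i|.  Note the weights are asymmetric.
    """
    edges = []
    for i, S_i in filmography.items():
        for j, S_j in filmography.items():
            if i == j:
                continue
            common = len(S_i & S_j)
            if common > 0:
                edges.append((i, j, common / len(S_i)))
    return edges
```

The asymmetry matters: an actor with few movies is weighted more strongly toward a frequent co-star than the reverse.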
Question 2: Create a weighted directed actor/actress network using the processed text file and equation 1. Plot the in-degree distribution of the actor/actress network. Briefly comment on the in-degree distribution.
1.2 Actor pairings
In this section, we will try to find the pairings between actors. We will
consider the following 10 actors:
• Tom Cruise
• Emma Watson (II)
• George Clooney
• Tom Hanks
• Dwayne Johnson (I)
• Johnny Depp
• Will Smith (I)
• Meryl Streep
• Leonardo DiCaprio
• Brad Pitt
Question 3: Design a simple algorithm to find the actor pairings. To
be specific, your algorithm should take as input one of the actors listed
above and should return the name of the actor with whom the input
actor prefers to work the most. Run your algorithm for the actors listed
above and report the actor names returned by your algorithm. Also for
each pair, report the (input actor, output actor) edge weight. Do all the actor pairings make sense?
1.3 Actor rankings
In this section, we will extract the top 10 actors/actresses from the network.
Question 4: Use Google's PageRank algorithm to find the top 10 actors/actresses in the network. Report the top 10 actors/actresses along with the number of movies and the in-degree of each actor/actress in the top 10 list. Does the top 10 list have any actor/actress listed in the previous section? If it does not have any of the actors/actresses listed in the previous section, please provide an explanation for this phenomenon.
Question 5: Report the PageRank scores of the actors/actresses listed in the previous section. Also report the number of movies each of them has acted in, as well as their in-degree.
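igraph's built-in Graph.pagerank() is the natural tool for these questions. For reference, the underlying computation is a short power iteration; the standalone sketch below works on a dense weighted adjacency matrix (fine for building intuition, not for the full network):

```python
import numpy as np

def pagerank(A, d=0.85, tol=1e-10):
    """Power-iteration PageRank on a weighted adjacency matrix A,
    where A[i, j] is the weight of edge i -> j.  igraph's
    Graph.pagerank() performs the same computation on a graph object.
    """
    n = A.shape[0]
    out = A.sum(axis=1, keepdims=True)
    # row-stochastic transition matrix; dangling nodes jump uniformly
    T = np.where(out > 0, A / np.where(out > 0, out, 1), 1.0 / n)
    r = np.full(n, 1.0 / n)
    while True:
        r_new = (1 - d) / n + d * (r @ T)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
```

Because each row of T sums to 1, the scores stay normalized: the returned vector sums to 1.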
2 Movie network
In this part, we will create an undirected movie network and then explore the various structural properties of the network.
2.1 Undirected movie network creation
We will use the processed text files from the previous section to create
the movie network. The nodes of the network are the movies and there
are weighted edges between the nodes in the network. To reduce the
size of the network, we will only consider movies that have at least 5 actors/actresses in them. The weights of the edges are given by equation 2:
    wi→j = |Ai ∩ Aj| / |Ai ∪ Aj|                                          (2)

where Ai is the set of actors in movie vi and Aj is the set of actors in movie vj. Since wi→j = wj→i, we have an undirected network.
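A small sketch of this construction, assuming a dict mapping each movie to its cast set (a hypothetical structure; a brute-force pair scan is shown for clarity, and equation 2 is simply the Jaccard similarity of the two casts):

```python
def movie_edges(casts, min_cast=5):
    """casts: dict mapping movie title -> set of actor names.
    Keeps movies with at least `min_cast` actors, then emits undirected
    edges (i, j, w_ij) with the Jaccard weight of equation 2.
    """
    kept = {m: a for m, a in casts.items() if len(a) >= min_cast}
    movies = sorted(kept)
    edges = []
    for x in range(len(movies)):
        for y in range(x + 1, len(movies)):
            i, j = movies[x], movies[y]
            inter = len(kept[i] & kept[j])
            if inter:  # only connect movies sharing at least one actor
                edges.append((i, j, inter / len(kept[i] | kept[j])))
    return edges
```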
Question 6: Create a weighted undirected movie network using equation 2. Plot the degree distribution of the movie network. Briefly comment on the degree distribution.
2.2 Communities in the movie network
In this part, we will extract the communities in the movie network and
explore their relationship with the movie genre. For this part you will
need to load the movie_genre.txt file.
Question 7: Use the Fast Greedy community detection algorithm to
find the communities in the movie network. Pick 10 communities and
for each community plot the distribution of the genres of the movies in
the community.
Question 8(a): In each community, determine the most dominant genre based simply on frequency counts. Which genres tend to be the most frequent dominant ones across communities and why?
Question 8(b): In each community, for the i-th genre assign a score of ln(c(i)) × p(i)/q(i), where: c(i) is the number of movies belonging to genre i in the community; p(i) is the fraction of genre i movies in the community; and q(i) is the fraction of genre i movies in the entire data set. Now determine the most dominant genre in each community based on the modified scores. What are your findings and how do they differ from the results in 8(a)?
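The modified score can be computed directly from frequency counts. A sketch, assuming genre labels are available as one list per community and one list for the whole data set (hypothetical structures):

```python
import math
from collections import Counter

def dominant_genre(community_genres, all_genres):
    """Score each genre i in a community as ln(c(i)) * p(i) / q(i)
    and return the highest-scoring genre.  Note ln(c(i)) = 0 when a
    genre appears only once, so singletons never dominate.
    """
    c = Counter(community_genres)   # c(i): count within the community
    q = Counter(all_genres)         # used for q(i): global fraction
    n, N = len(community_genres), len(all_genres)

    def score(g):
        p_g = c[g] / n              # fraction within the community
        q_g = q[g] / N              # fraction in the entire data set
        return math.log(c[g]) * p_g / q_g

    return max(c, key=score)
```

The p(i)/q(i) ratio boosts genres that are over-represented in the community relative to the whole data set, which is exactly what distinguishes 8(b) from the raw counts of 8(a).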
Question 8(c): Find a community of movies that has size between 10 and 20. Determine all the actors who acted in these movies and plot the corresponding bipartite graph (i.e. restricted to these particular movies and actors). Determine the three most important actors and explain how they help form the community. Is there a correlation between these actors and the dominant genres you found for this community in 8(a) and 8(b)?
2.3 Neighborhood analysis of movies
In this part of the project, you will need to load the movie_rating.txt
file and we will explore the neighborhood of the following 3 movies:
• Batman v Superman: Dawn of Justice (2016); Rating: 6.6
• Mission: Impossible – Rogue Nation (2015); Rating: 7.4
• Minions (2015); Rating: 6.4
Question 9: For each of the movies listed above, extract its neighbors and plot the distribution of the available ratings of the movies in the neighborhood. Is the average rating of the movies in the neighborhood similar to the rating of the movie whose neighbors have been extracted? In this question, you should have 3 plots.
Question 10: Repeat question 9, but now restrict the neighborhood to consist of movies from the same community. Is there a better match between the average rating of the movies in the restricted neighborhood and the rating of the movie whose neighbors have been extracted? In this question, you should have 3 plots.
Question 11: For each of the movies listed above, extract its top 5 neighbors and also report the community membership of the top 5 neighbors. In this question, the sorting is done based on the edge weights.
2.4 Predicting ratings of movies
In this part of the project, we will explore various rating prediction
techniques to predict the ratings of the following 3 movies:
• Batman v Superman: Dawn of Justice (2016)
• Mission: Impossible – Rogue Nation (2015)
• Minions (2015)
Question 12: Train a regression model to predict the ratings of movies:
for the training set you can pick any subset of movies with available
ratings as the target variables; you have to specify the exact feature
set that you use to train the regression model and report the root mean
squared error (RMSE). Now use this trained model to predict the ratings of the 3 movies listed above (which obviously should not be included in your training data).
We will now predict the ratings of the movies using a different approach. To be specific, we will use a bipartite graph approach for rating
prediction. In a bipartite graph, G = (V, E), we have a partition of the
vertex set such that
V1 ∪ V2 = V
V1 ∩ V2 = ∅
and
eij = (vi
, vj )
where vi ∈ V1 and vj ∈ V2. In a bipartite graph, vertices belonging to
the same partitioned set are non-adjacent.
In this project, we will create a bipartite graph in the following manner:
• V1 represents the set of actor/actresses
• V2 represents the set of movies
• There is an edge eij between a node in V1 and a node in V2 if actor i has acted in movie j
Question 13: Create a bipartite graph following the procedure described
above. Determine and justify a metric for assigning a weight to each
actor. Then, predict the ratings of the 3 movies using the weights of the
actors in the bipartite graph. Report the RMSE. Is this rating mechanism better than the one in question 12? Justify your answer.
ECE 232E Project 5 Graph Algorithms
Introduction
In this project we will explore graph theory theorems and algorithms by applying them to real data. In the first part of the project, we consider a particular graph which models correlations between stock price time series. In the second part, we analyse traffic data from a dataset provided by Uber.
1 Stock Market
In this part of the project, we study data from the stock market. The data is available on this Dropbox link. The goal of this part is to study correlation structures among fluctuation patterns of stock prices using tools from graph theory. The intuition is that investors will have similar investment strategies for stocks that are affected by the same economic factors. For example, the stocks belonging to the transportation sector may have different absolute prices, but if for example fuel
prices change or are expected to change significantly in the near future,
then you would expect the investors to buy or sell all stocks similarly
and maximize their returns. Towards that goal, we construct different
graphs based on similarities among the time series of returns on different stocks at different time scales (day vs a week). Then, we study properties of such graphs. The data is obtained from Yahoo Finance website
for 3 years. You’re provided with a number of csv tables, each containing several fields: Date, Open, High, Low, Close, Volume, and Adj Close
price. The files are named according to Ticker Symbol of each stock. You
may find the market sector for each company in Name_sector.csv.
1.1 Return correlation
In this part of the project, we will compute the correlation among log-normalized stock-return time series data. Before giving the expression for correlation, we introduce the following notation:
• pi(t) is the closing price of stock i on day t
• qi(t) = (pi(t) − pi(t − 1)) / pi(t − 1) is the return of stock i over the period [t − 1, t]
• ri(t) = log(1 + qi(t)) is the log-normalized return of stock i over the period [t − 1, t]
Then with the above notation, we define the correlation between the log-normalized stock-return time series of stocks i and j as

    ρij = ( ⟨ri(t)rj(t)⟩ − ⟨ri(t)⟩⟨rj(t)⟩ ) / √( (⟨ri(t)²⟩ − ⟨ri(t)⟩²)(⟨rj(t)²⟩ − ⟨rj(t)⟩²) )

where ⟨·⟩ is a temporal average over the investigated time regime (for our data set it is over 3 years).
Question 1: Provide an upper and lower bound on ρij. Also, provide a justification for using the log-normalized return ri(t) instead of the regular return qi(t).
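The whole pipeline from prices to the correlation matrix is a few lines of NumPy. Note that the ⟨·⟩ formula above is exactly the Pearson correlation of the ri(t) series, so np.corrcoef applies directly. A sketch assuming prices are stored as a (stocks × days) array:

```python
import numpy as np

def return_correlation(prices):
    """prices: (num_stocks, num_days) array of closing prices p_i(t).
    Returns the matrix of correlations rho_ij between the
    log-normalized returns r_i(t) = log(1 + q_i(t)).
    """
    q = (prices[:, 1:] - prices[:, :-1]) / prices[:, :-1]  # daily returns
    r = np.log(1.0 + q)                                    # log-normalized
    return np.corrcoef(r)   # Pearson correlation == the <.> formula
```

As a sanity check: if one price series is the reciprocal of another, their log-returns are exact negatives, so ρ = −1.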
1.2 Constructing correlation graphs
In this part, we construct a correlation graph using the correlation coefficients computed in the previous section. The correlation graph has the stocks as its nodes, and the edge weights are given by the following expression:

    wij = √(2(1 − ρij))

Compute the edge weights using the above expression and construct the correlation graph.
Question 2: Plot the degree distribution of the correlation graph and
a histogram showing the un-normalized distribution of edge weights.
1.3 Minimum spanning tree (MST)
In this part of the project, we will extract the MST of the correlation
graph and interpret it.
Question 3: Extract the MST of the correlation graph. Each stock can be categorized into a sector, which can be found in the Name_sector.csv file. Plot the MST and color-code the nodes based on sectors. Do you see any pattern in the MST? The structures that you find in the MST are called vine clusters. Provide a detailed explanation of the pattern you observe.
1.4 Sector clustering in MST’s
In this part, we want to predict the market sector of an unknown stock. We will explore two methods for performing this task. In order to evaluate the performance of the methods, we define the following metric:
    α = (1/|V|) Σ_{vi∈V} P(vi ∈ Si)

where Si is the sector of node i. Define

    P(vi ∈ Si) = |Qi| / |Ni|

where Qi is the set of neighbors of node i that belong to the same sector as node i and Ni is the set of neighbors of node i. Compare α with the case where

    P(vi ∈ Si) = |Si| / |V|
Question 4: Report the value of α for the above two cases and provide
an interpretation for the difference.
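The α metric is straightforward to compute once the MST neighborhoods are known. A sketch, assuming a dict of neighbor sets and a dict of sector labels (hypothetical structures; with igraph you would read both off the MST object):

```python
def sector_alpha(neighbors, sector):
    """neighbors: dict mapping node -> set of its neighbor nodes (in the MST).
    sector:    dict mapping node -> sector label.
    Returns alpha = (1/|V|) * sum over nodes of |Q_i| / |N_i|,
    where Q_i are the neighbors sharing node i's sector.
    """
    total = 0.0
    for v, nbrs in neighbors.items():
        same = sum(1 for u in nbrs if sector[u] == sector[v])
        total += same / len(nbrs)
    return total / len(neighbors)
```

For the second case, replace the per-node term with |Si|/|V|, i.e. the fraction of all nodes that share node i's sector.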
1.5 Correlation graphs for weekly data
In the previous parts, we constructed the correlation graph based on
daily data. In this part of the project, we will construct a correlation
graph based on weekly data. To create the graph, sample the stock data
weekly on Mondays and then calculate ρij using the sampled data. If
there is a holiday on a Monday, we ignore that week. Create the correlation graph based on weekly data.
Question 5: Extract the MST from the correlation graph based on weekly
data. Compare the pattern of this MST with the pattern of the MST
found in question 3.
2 Let’s Help Santa!
Companies like Google and Uber have a vast amount of statistics about
transportation dynamics. Santa has decided to use network theory to
facilitate his gift delivery for next Christmas. When we learned
about his decision, we designed this part of the project to help him. We
will send him your results for this part!
2.1 Download the Data
Go to the “Uber Movement” website and download the data of Monthly Aggregate (all days), 2017 Quarter 4, for the San Francisco area.¹ The dataset contains pairwise traveling time statistics between most pairs of points in the San Francisco area. Points on the map are represented by unique IDs. To understand the correspondence between map IDs and areas, download the Geo Boundaries file from the same website.² This file contains the latitudes and longitudes of the corners of the polygons circumscribing each area. In addition, it contains one street address inside each area, referred to as DISPLAY_NAME. To be specific, if an area is represented by a polygon with 5 corners, then you have a 5×2 matrix of latitudes and longitudes, each row of which represents the latitude and longitude of one corner.
2.2 Build Your Graph
Read the dataset at hand, and build a graph in which nodes correspond
to locations, and undirected weighted edges correspond to the mean
traveling times between each pair of locations (only December). Add
the following attributes to the vertices:
1. Display name: the street address
2. Location: mean of the coordinates of the polygon’s corners (a 2-D
vector)
¹ If you download the dataset correctly, it should be named san_francisco-censustracts-2017-4-All-MonthlyAggregate.csv
² The file should be named SAN_FRANCISCO_CENSUSTRACTS.JSON
The graph will contain some isolated nodes (extra nodes existing in the Geo Boundaries JSON file) and a few small connected components. Remove such nodes and just keep the giant connected component of the graph. In addition, merge duplicate edges by averaging their weights.³ We will refer to this cleaned graph as G afterwards.
Question 6: Report the number of nodes and edges in G.
2.3 Traveling Salesman Problem
Question 7: Build a minimum spanning tree (MST) of graph G. Report
the street addresses of the two endpoints of a few edges. Are the results
intuitive?
Question 8: Determine what percentage of triangles in the graph (sets
of 3 points on the map) satisfy the triangle inequality. You do not need
to inspect all triangles, you can just estimate by random sampling of
1000 triangles.
Now, we want to find an approximate solution to the traveling salesman problem (TSP) on G. Apply the 2-approximation algorithm described in class.⁴ Inspect the sequence of street addresses visited on the map and see if the results are intuitive.
Question 9: Find the empirical performance of the approximate algorithm:

    ρ = (Approximate TSP Cost) / (Optimal TSP Cost)
Question 10: Plot the trajectory that Santa has to travel!
³ Duplicate edges may exist when the dataset provides you with the statistics of a road in both directions. We remove duplicate edges for the sake of simplicity.
⁴ You can find the algorithm in: Papadimitriou and Steiglitz, “Combinatorial Optimization: Algorithms and Complexity”, Chapter 17, page 414.
3 Analysing the Traffic Flow
Next December, there is going to be a sports event between Stanford University and the University of California, Santa Cruz (UCSC). A large number of students are enthusiastic about the event, which is going to be held at UCSC. Stanford fans want to drive from their campus to their rival's. We would like to analyse the maximum traffic that can flow from Stanford to UCSC.
3.1 Estimate the Roads
We want to estimate the map of roads without using actual road datasets. Educate yourself about the Delaunay triangulation algorithm and then apply it to the node coordinates.⁵
Question 11: Plot the road mesh that you obtain and explain the result.
Create a subgraph G∆ induced by the edges produced by triangulation.
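A sketch of extracting the triangulation edges with scipy.spatial.Delaunay: each simplex (triangle) contributes its three sides, and a set removes duplicates shared by adjacent triangles.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangulation_edges(coords):
    """coords: (n, 2) array of node coordinates.
    Returns the set of undirected edges (as sorted index pairs)
    induced by the Delaunay mesh.
    """
    tri = Delaunay(coords)
    edges = set()
    for a, b, c in tri.simplices:      # each simplex is a triangle (i, j, k)
        for u, v in ((a, b), (b, c), (a, c)):
            edges.add((min(u, v), max(u, v)))
    return edges
```

These index pairs can then be mapped back to the nodes of G to build the induced subgraph G∆.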
3.2 Calculate Road Traffic Flows
Question 12: Using simple math, calculate the traffic flow for each road
in terms of cars/hour.
Hint: Consider the following assumptions:
• Each degree of latitude and longitude ≈ 69 miles
• Car length ≈ 5 m = 0.003 mile
• Cars maintain a safety distance of 2 seconds to the next car
• Each road has 2 lanes in each direction
⁵ You can use scipy.spatial.Delaunay in Python
Assuming no traffic jam, consider the calculated traffic flow as the max
capacity of each road.
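One reasonable reading of these assumptions (my interpretation, not the only valid one): the speed on a road follows from its length and mean travel time, each car occupies a 2-second time headway plus the time to cover its own length, and capacity is the headway rate times the number of lanes.

```python
def road_capacity(coord_u, coord_v, travel_time_s, lanes=2):
    """Max flow of one road in cars/hour under the stated assumptions.
    coord_u, coord_v: (lat, lon) endpoints; travel_time_s: mean travel
    time in seconds from the Uber data.
    """
    # straight-line length in miles (69 miles per degree, per the hint)
    miles = 69.0 * ((coord_u[0] - coord_v[0]) ** 2 +
                    (coord_u[1] - coord_v[1]) ** 2) ** 0.5
    speed = miles / (travel_time_s / 3600.0)      # miles/hour
    # per-car time headway: 2 s safety gap + time to cover one car length
    headway_h = 2.0 / 3600.0 + 0.003 / speed      # hours per car
    return lanes * (1.0 / headway_h)              # cars/hour
```

For example, a 1-mile road traversed in 120 s (30 mph) comes out to roughly 3050 cars/hour over 2 lanes.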
3.3 Calculate the Max Flow
Consider the following addresses:
• Source address: 100 Campus Drive, Stanford
• Destination address: 700 Meder Street, Santa Cruz
Question 13: Calculate the maximum number of cars that can commute per hour from Stanford to UCSC. Also calculate the number of edge-disjoint paths between the two spots. Does the number of edge-disjoint paths match what you see on your road map?
3.4 Defoliate Your Graph
In G∆, there are a number of unreal roads that could be removed. For
instance, there are many fake bridges crossing the bay. Apply a threshold on the travel time of the roads in G∆ to remove the fake edges. Trim
the fake edges and call the resulting graph G˜∆.
Question 14: Plot G˜∆ on real map coordinates. Are real bridges preserved?
Hint: You can consider the following coordinates:
• Golden Gate Bridge: [[-122.475, 37.806], [-122.479, 37.83]]
• Richmond–San Rafael Bridge: [[-122.501, 37.956], [-122.387, 37.93]]
• San Mateo Bridge: [[-122.273, 37.563], [-122.122, 37.627]]
• Dumbarton Bridge: [[-122.142, 37.486], [-122.067, 37.54]]
• San Francisco – Oakland Bay Bridge: [[-122.388, 37.788], [-122.302,
37.825]]
Question 15: Now, repeat question 8 for G˜∆ and report the results. Do
you see any significant changes?


