Part 0: Getting started
For this project, we are assigning you to a team. We will let you change these teams in future assignments.
You can find your assigned teammate(s) by logging into IU Github, at http://github.iu.edu/. In the
upper left hand corner of the screen, you should see a pull-down menu. Select cs-b551-sp2021. Then in the
box below, you should see a repository called userid1 -a2, userid1-userid2 -a2, or userid1-userid2-userid3 -a2,
where the other user ID(s) correspond to your teammate(s). Now that you know their userid(s), you can
write them an email at email@example.com.
To get started, clone the github repository:
git clone firstname.lastname@example.org:cs-b551-sp2021/your-repo-name -a2
If that doesn’t work, instead try:
git clone https://github.iu.edu/cs-b551-sp2021/your-repo-name-a2
where your-repo-name is the one you found on the GitHub website above.
Part 1: Pikachu
Pikachu is a popular childhood game in a certain rural midwestern town that requires just a board consisting
of a grid of n × n (with n ≥ 7) squares, 2n white stones, and 2n black stones. Each stone is also called a
Pichu. Initially the board starts empty, except for two rows of white Pichus on the second and third row of
the board, and two rows of black Pichus on rows n − 2 and n − 1. Two players alternate turns, with White
going first. When a Pichu reaches the opposite side of the board (i.e. when a Black Pichu reaches row 1 or
a white Pichu reaches row n), then the Pichu is marked with an X and becomes a Pikachu. In any given
turn, a player can do one of the following:
• Move a single Pichu of his or her color one square forward, left, or right, if that square is empty.
• Move a single Pichu of his or her color to “jump” over a single piece of the opposite color by moving
two squares forward, left, or right, if that square is empty. The jumped piece is removed from the
board as soon as it is jumped.
• Move a single Pikachu of his or her color any number of squares forward, left, right, or backwards, to
an empty square, as long as all squares in between are also empty.
• Move a single Pikachu of his or her color to “jump” over a single piece of the opposite color and
landing any number of squares forward, left, right, or backwards, as long as all of the squares between
the Pikachu’s start position and jumped piece are empty and all the squares between the jumped piece
and the ending position are empty. The jumped piece is removed as soon as it is jumped.
The winner is the player who captures all of the other player’s pieces first.
Your task is to write a Python program that plays Pikachu well. Your program should accept a command
line argument that gives the current state of the board as a string of .’s, w’s, W’s, b’s, and B’s, which indicate
which squares have no piece, a white Pichu, a white Pikachu, a black Pichu, and a black Pikachu, respectively,
in row-major order. For example, if n = 7, then the encoding of the start state of the game would be:
More precisely, your program will be called with four command line parameters: (1) the value of n, (2) the
current player (w or b), (3) the state of the board, encoded as above, and (4) a time limit in seconds. Your
program should then decide a recommended single move for the given player with the given current board
state, and display the new state of the board after making that move. Displaying multiple lines of output
is fine as long as the last line has the recommended board state. The time limit is the amount of time that
your program should expect to have to make its decision; our testing code will kill your program at that
point, and will use whichever was the last move your program recommended. For example, a sample run of
your program might look like:
[djcran@macbook]$ python3 ./pikachu.py 7 w …….wwwwwwwwwwwwwww…….bbbbbbbbbbbbbb……. 10
Hmm, I’d recommend moving the Pichu at row 3 column 3 to row 4 column 3.
The competitions. To make things more interesting, we will provide a way for your program to play against
other groups’ programs. We will release this code at least 1 week before the deadline. While the majority of
your grade will be on correctness, programming style, quality of answers given in comments, etc., a portion
may be based on how well your code performs against others, with particularly well-performing programs
eligible for prizes including extra credit points.
Hint: Since our grading program only looks at the last solution that you output, your program can output
multiple solutions. In other words, you can choose to completely ignore the time parameter passed to your
program and simply output multiple answers until you run out of time and we automatically kill your
Note: Your code must conform with the interface standards mentioned above! The last line of the output
must be the new board in the format given, without any extra characters or empty lines. Also, note that
your program cannot assume that the game will be run in sequence from start to end; given a current board
position on the command line, your code must find a recommended next best move. Your program can write
files to disk to preserve state between runs, but should correctly handle the case when a new board state is
presented to your program that is unrelated to the last state it saw.
Part 2: The Game of Sebastian
is a one-player game of luck and skill. Each turn has four steps:
1. The player rolls five dice.
1Yes, Sebastian and Pichu are David’s pet birds and yes this is yet another bird-themed problem.
2. The player inspects the dice and chooses any subset (including none or all) and rolls them.
3. The player inspects the dice again again choose any subset and rolls them.
4. The player must assign the outcome to exactly one category on their score card, depending on which
five dice are showing after the third roll.
Here are the categories on the score card:
• Primis: The player can add the number of dice that show 1 to his or her score.
• Secundus: The player can count the number of dice that show 2, multiply by 2, and add to the score.
• Tertium: The player can count the number of dice that show 3, multiply by 3, and add to the score.
• Quartus: The player can count the number of dice that show 4, multiply by 4, and add to the score.
• Quintus: The player can count the number of dice that show 5, multiply by 5, and add to the score.
• Sextus: The player can count the number of dice that show 6, multiply by 6, and add to the score.
• Company: If the five dice are either 1, 2, 3, 4, 5 or 2, 3, 4, 5, 6, the player can add 40 points to their
score. (Note that for this and all other categories, the order of the dice is not important.)
• Prattle: If four of the five dice are either 1, 2, 3, 4, or 2, 3, 4, 5, or 3, 4, 5, 6, the player can add 30
points to their score.
• Squadron: If three of the dice show the same number, and the other two dice are also the same, the
player can add 25 points to their score.
• Triplex: If three of the dice are the same, the player can add up the values of all five dice and add this
to their score.
• Quadrupla: If four of the dice are the same, the player can add up the values of all five dice and add
the sum to their score.
• Quintuplicatam: If all five dice are the same, the player can add 50 points to their score.
• Pandemonium: The sum of all five dice, no matter what they are.
Players can choose which category to fill at the end of each turn, but each category may be filled
only once per game. A player can also choose to assign a roll to a category that does not match the
requirements, but then a zero is entered into that category.
For example, here’s what a few moves of the game might look like, with a single player:
• Player rolls dice and gets 1, 2, 3, 4, 5 on the first roll. They decide not to reroll any dice. Player
assigns it to Company and gets 40 points. (They could have decided to assign it instead to Prattle for
30 points, or to one of the first 5 categories to get a score of 1 for Primis, 2 for Secundus, 3 for Tertium,
4 for Quartus, or 5 for Quintus, or 15 for Pandemonium. Or they could have chosen to get a 0 for any
• Player rolls dice and gets 1, 3, 3, 4, 4. They decide to reroll the 1, and they get a 2. They reroll the 2,
and get a 6. So the final dice are 6, 3, 3, 4, 4. They could assign this to Primis for 1 point, Tertium for
6 points, or Quartus for 8 points, or 6 for Sextus, or 20 for Pandemonium, or any other category for 0
points. They choose to assign to Quartus, and now have 48 points (i.e., 40 from the first turn plus 8
from this turn).
• Player rolls dice and gets 1, 2, 2, 3, 3. They reroll the 1 and get 4, and reroll the 4 and get 5. So the
final dice are 5, 2, 2, 3, 3. Their choices now are: Quintus for 5 points, Primis for 1 point, Secundus for
4 points, Tertium for 6 points, or 15 for Pandemonium, or 0 in another category. They choose Tertium
and now have 54 points.
• Player rolls dice and gets 1, 2, 3, 4, 5 on the first roll. Player can’t assign it to Company since they’ve
already used this category already, so they count it as a Prattle instead, and now have 84 points.
• Player rolls dice and gets 1, 2, 3, 4, 5 after their 3 rolls. The player can’t take either Company or Prattle,
or Tertium or Quartus since they have been used, but they could take Primis for 1 point, Secundus for
2 points, or Quintus for 5 points, or 15 for Pandemonium. However, the player decides it’s unlikely
they’ll ever manage to get a Quintuplicatam, whereas it’s much more likely they’ll be able to get a
better score for one of those other categories, so they assign to Quintuplicatam and get a score of 0.
• And so on…
After 13 turns, all categories are full, and the game ends. If the player managed to get a score of at least 63
totaled across the first six categories (Primis through Sextus), they get a bonus of 35 points added to their
Implement a program that plays Sebastian as well as possible, i.e. getting the highest possible score. Your
goal is to achieve as high an average score as possible. As with Assignment 1, a small portion of your grade
will be based on how well your program works with respect to the rest of the class. To get you started, we’ve
implemented some skeleton code that should be in your cloned repository. You can run it like,
The skeleton code implements a very naive automatic player that simply rolls once and assigns the roll to
whichever category is next available. You’ll want to modify the file called SebastianAutoPlayer.py. There
are helper programs called sebastian.py and SebastianState.py to keep track of the scoreboard and to roll
the dice. While you should look at SebastianState.py to understand how these classes work, and you may
want to modify SebastianState.py for debugging purposes, your final submission should run using the
SebastianState.py file that we supplied without any modifications, or else we won’t be able to
grade your submission correctly.
Part 3: Document classification
Many practical problems involve sorting textual objects — documents, emails, sentences, tweets, etc. —
into two specific categories — spam vs nonspam, important vs unimportant, acceptable vs inappropriate,
etc. Naive Bayes classifiers are often used for such problems. They often use a bag-of-words model, which
means that each object is represented as just an unordered “bag” of words, with no information about the
grammatical structure or order of words in the document. Suppose there are classes A and B. For a given
textual object D consisting of words w1, w2, …, wn, a Bayesian classifier evaluates decides that D belongs to
A by computing the “odds” and comparing to a threshold,
P(A|w1, w2, …, wn)
P(B|w1, w2, …, wn)
where P(A|w1, …wn) is the posterior probability that D is in class A. Using the Naive Bayes assumption,
the odds ratio can be factored into P(A), P(B), and terms of the form P(wi
|A) and P(wi
|B). These are
the parameters of the Naive Bayes model.
Your job is to write a program that estimates the Naive Bayes parameters from training data (where the
correct label is given), and then uses these parameters to classify new text objects.
We’ve provided skeleton code and a sample training and testing dataset to get you started. You can run it
python3 ./classify.py tweets.location.train.txt tweets.location.test.txt
Hints: Don’t worry, at least at first, about whether the “words” in your model are actually words. Just
treat every unique space-delimited token you encounter as a “word,” even if it’s misspelled, a number, a
punctuation mark, etc. It may be helpful to ignore tokens that do not occur more than a handful of times,
What to turn in
Turn in the three programs on GitHub (remember to add, commit, push) — we’ll grade whatever version
you’ve put there as of 11:59PM on the due date. To make sure that the latest version of your work has been
accepted by GitHub, you can log into the github.iu.edu website and browse the code online.