## Description

1. (25 points) Write a paragraph describing how you became interested in Computational Linguistics, any projects or specific areas you’re interested in, and/or career goals. How would you characterize your experience in linguistics, math, or computer programming (or other relevant engineering)? Recalling the lecture slides, which of the subfields or subtasks of Computational Linguistics are you particularly interested in?

2. (25 points) Consider the following sentence:

I saw that gas can explode.

a. How many phrase structure trees can you find for this sentence? Do not include pragmatically odd interpretations. Draw each tree and provide a discriminating explanation of the situation modeled by the interpretation.

b. Write the phrase structure trees from the previous question using Penn Treebank notation. That is, write each tree with brackets and parentheses, for example: (S (NP (NNP Kim)) (VP (VBZ sleeps)))
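To check that a hand-written bracketing is well formed, a small reader can help. The following is a minimal sketch, not a full Penn Treebank reader: the function names `parse_ptb` and `leaves` are hypothetical, and it assumes a single tree with balanced parentheses and no traces or special escapes.

```python
import re

def parse_ptb(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    # Tokens are "(", ")", or any run of characters without spaces or parentheses.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]          # category label, e.g. S, NP, VBZ
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # terminal word
                pos += 1
        pos += 1                     # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the end of the tree")
    return tree

def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words

tree = parse_ptb("(S (NP (NNP Kim)) (VP (VBZ sleeps)))")
print(tree[0], leaves(tree))
```

Running the same reader on each bracketing written for part (b) and comparing `leaves(...)` with the original sentence is one quick way to catch a dropped word or an unbalanced parenthesis.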

3. (10 points) How many six-letter “words” can be formed from the alphabet { a – z }? A “word” for this question must have at least one vowel { a e i o u }, and may not contain all vowels. Show your work and explain your answer.
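Counting by complement is one common approach to constraints like these. The sketch below assumes one particular reading of the question, namely that "may not contain all vowels" means the word may not consist entirely of vowels; if the intended reading differs, the subtraction changes. The function names are hypothetical, and the brute-force check runs on a toy alphabet only.

```python
from itertools import product

def count_words(alphabet, vowels, length):
    """Count strings with at least one vowel that are not made up only of vowels.

    Assumes every vowel is in the alphabet and length >= 1.
    """
    total = len(alphabet) ** length                         # all strings
    no_vowels = (len(alphabet) - len(vowels)) ** length     # consonants only
    only_vowels = len(vowels) ** length                     # vowels only
    return total - no_vowels - only_vowels

def count_words_brute_force(alphabet, vowels, length):
    """Enumerate every string and test the two conditions directly."""
    count = 0
    for letters in product(alphabet, repeat=length):
        vowel_hits = sum(1 for ch in letters if ch in vowels)
        if 0 < vowel_hits < length:
            count += 1
    return count

# Toy check: 5-letter alphabet, 2 vowels, words of length 3.
toy_alphabet, toy_vowels = "abcde", "ae"
assert count_words(toy_alphabet, toy_vowels, 3) == count_words_brute_force(toy_alphabet, toy_vowels, 3)
```

Calling `count_words` with the 26-letter alphabet, the five vowels, and length 6 applies the same subtraction at full scale, where enumeration would be slow.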

4. (10 points) How many ways can the characters in the following tuple be arranged?

( 萄 萄 萄 萄 橙 橙 苹 梨 蕉 )
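Arrangements of a tuple with repeated characters are usually counted by dividing the factorial of the length by the factorial of each repeat count. A minimal sketch of that idea follows, with hypothetical function names and a brute-force check on a small toy tuple.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def count_arrangements(items):
    """Distinct orderings of items: n! divided by the factorial of each repeat count."""
    result = factorial(len(items))
    for repeat in Counter(items).values():
        result //= factorial(repeat)
    return result

def count_arrangements_brute_force(items):
    """Generate every ordering and keep only the distinct ones (small inputs only)."""
    return len(set(permutations(items)))

# Toy check: one character appears twice, two appear once.
toy = ("x", "x", "y", "z")
assert count_arrangements(toy) == count_arrangements_brute_force(toy)

fruit = ("萄", "萄", "萄", "萄", "橙", "橙", "苹", "梨", "蕉")
print(count_arrangements(fruit))
```

The brute-force version is only there to build confidence in the formula; it generates every ordering and therefore slows down quickly as the tuple grows.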

5. (30 points) Consider a document processing system which performs pairwise comparisons and a corpus containing 19 documents as follows:

| Topic | Count |
| --- | --- |
| Conference Proceedings | 7 |
| Journal Articles | 9 |
| Workshop Abstracts | 3 |

a. How many pairwise comparisons are possible between documents on the same topic?

b. How many pairwise comparisons are possible between documents on different topics?
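Both parts can be set up with binomial coefficients. The sketch below assumes that a pairwise comparison is an unordered pair of two distinct documents, which the question does not state explicitly; the function names are hypothetical.

```python
from math import comb

def same_topic_pairs(counts):
    """Unordered pairs whose two documents share a topic: sum of C(n, 2) per topic."""
    return sum(comb(n, 2) for n in counts.values())

def different_topic_pairs(counts):
    """All unordered pairs in the corpus minus the same-topic pairs."""
    total_documents = sum(counts.values())
    return comb(total_documents, 2) - same_topic_pairs(counts)

corpus = {
    "Conference Proceedings": 7,
    "Journal Articles": 9,
    "Workshop Abstracts": 3,
}
print(same_topic_pairs(corpus), different_topic_pairs(corpus))
```

An alternative for part (b) is to sum the products of the counts over each pair of topics; the two routes should agree, which makes a useful cross-check.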

** (10 points, extra credit) In the lecture, we showed that you can form $\frac{n!}{(n-k)!\,k!}$ different unordered sets of $k$ distinct items from a set of $n$ distinct items. Write an expression that gives the number of unordered sets of $k$ items that can be formed from a set of $n$ distinct items while allowing repetition in the output set.
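A candidate expression can be tested against brute-force enumeration for small values of n and k. The sketch below first confirms the lecture formula for sets without repetition, then provides a counter for the with-repetition case that a proposed expression can be compared against. The function names are hypothetical, and the sketch deliberately does not state the closed form.

```python
from itertools import combinations, combinations_with_replacement
from math import factorial

def n_choose_k(n, k):
    """The lecture formula: n! / ((n - k)! k!), for 0 <= k <= n."""
    return factorial(n) // (factorial(n - k) * factorial(k))

def count_sets_brute_force(n, k):
    """Enumerate unordered selections of k distinct items from n items."""
    return sum(1 for _ in combinations(range(n), k))

def count_multisets_brute_force(n, k):
    """Enumerate unordered selections of k items from n items, repetition allowed."""
    return sum(1 for _ in combinations_with_replacement(range(n), k))

# The lecture formula agrees with enumeration on small inputs.
for n in range(1, 7):
    for k in range(0, n + 1):
        assert n_choose_k(n, k) == count_sets_brute_force(n, k)
```

Comparing a proposed expression with `count_multisets_brute_force(n, k)` over the same small ranges is a quick sanity check, though it is not a proof.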