COMP 8780 Assignment #3 solution



1. Build a baseline statistical tagger.

(i) [10 points] Use the hash of hashes from Assignment #2 to train a
baseline lexicalized statistical tagger on the entire BROWN corpus.
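One way to read (i) is as a most-frequent-tag tagger: the hash of hashes maps each word to a count per tag, and training picks the highest-count tag for each word. The sketch below assumes the corpus has already been parsed into sentences of (word, tag) pairs. The function name train_baseline and the toy data are illustrative and are not part of the assignment.

```python
from collections import defaultdict

def train_baseline(tagged_sentences):
    """Build a word -> tag -> count table and pick the most frequent tag per word."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    # For each word, keep the tag with the highest count (ties go to the tag seen first).
    best_tag = {word: max(tags, key=tags.get) for word, tags in counts.items()}
    return counts, best_tag

# Toy data standing in for the parsed BROWN corpus (hypothetical).
toy_corpus = [
    [("the", "DT"), ("can", "NN")],
    [("I", "PRP"), ("can", "MD"), ("run", "VB")],
    [("you", "PRP"), ("can", "MD")],
]
counts, best_tag = train_baseline(toy_corpus)
```

In the toy data "can" is seen twice as MD and once as NN, so the baseline assigns it MD everywhere.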

(ii) [20 points] Use the baseline lexicalized statistical tagger to tag
all the words in the SnapshotBROWN.pos.all.txt file. Evaluate and report the
performance of this baseline tagger on the Snapshot file.
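Performance for (ii) can be reported as token-level accuracy: the fraction of words whose predicted tag matches the gold tag. The sketch below assumes the Snapshot file has already been parsed into (word, gold_tag) pairs, since its exact layout is not described here. It also assumes that words missing from the training table fall back to NN, which is a common default but not something the assignment specifies.

```python
def tag_words(words, best_tag, default_tag="NN"):
    """Tag each word with its most frequent training tag, or default_tag if unseen."""
    return [best_tag.get(word, default_tag) for word in words]

def accuracy(gold_pairs, best_tag, default_tag="NN"):
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if not gold_pairs:
        return 0.0
    words = [word for word, _ in gold_pairs]
    predicted = tag_words(words, best_tag, default_tag)
    correct = sum(1 for (_, gold), pred in zip(gold_pairs, predicted) if gold == pred)
    return correct / len(gold_pairs)
```

Reporting accuracy separately for known and unknown words can also be useful, because it shows how much of the error comes from words the tagger never saw in training.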

(iii) [20 points] Add a few rules to handle unknown words for the tagger
in (ii). The rules can be morphological, contextual, or of another
nature. Use 25 new sentences to evaluate this tagger (the tagger from
(ii) plus the unknown-word rules). You can pick the 25 sentences from a
news article on the web and report the performance on those.
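As an illustration of what morphological rules for (iii) might look like, the sketch below guesses a Penn Treebank tag from a word's shape and suffix. The specific rules, their order, and the function name guess_unknown_tag are assumptions chosen for illustration. They will misfire on some words and should be tuned against the 25 evaluation sentences.

```python
import re

def guess_unknown_tag(word, is_sentence_initial=False):
    """Guess a tag for a word that was not seen in training, using simple shape rules."""
    if not word:
        return "NN"
    # Numbers such as 42, 3.5, or 1,250 are tagged as cardinal numbers.
    if re.fullmatch(r"[+-]?\d+([.,]\d+)*", word):
        return "CD"
    # A capitalized word in the middle of a sentence is likely a proper noun.
    if word[0].isupper() and not is_sentence_initial:
        return "NNP"
    lower = word.lower()
    if lower.endswith("ing"):
        return "VBG"
    if lower.endswith("ed"):
        return "VBD"
    if lower.endswith("ly"):
        return "RB"
    if lower.endswith(("ous", "ful", "able", "ible", "ive")):
        return "JJ"
    if lower.endswith("s") and not lower.endswith("ss"):
        return "NNS"
    # Fall back to singular noun, the most common open-class tag.
    return "NN"
```

To combine this with the baseline, the tagger would call the rule function only when a word is absent from the training table, instead of always returning a fixed default tag.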

NOTE: You should only consider the 45 proper tags from the Penn Treebank
tagset (available in the slides). You should disregard tags such as
-NONE-, etc.
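A minimal sketch of the filtering step follows. The tag list is an assumption: it is the commonly cited 45-tag Penn Treebank inventory (36 word-level tags plus 9 punctuation and symbol tags) and should be checked against the list in the slides. The name keep_proper_tags is hypothetical.

```python
# Assumed 45-tag inventory; verify against the course slides before relying on it.
WORD_TAGS = (
    "CC CD DT EX FW IN JJ JJR JJS LS MD NN NNS NNP NNPS PDT POS PRP PRP$ "
    "RB RBR RBS RP SYM TO UH VB VBD VBG VBN VBP VBZ WDT WP WP$ WRB"
).split()
# Some corpus files write the bracket tags as -LRB- and -RRB- instead of ( and ).
PUNCT_TAGS = "$ # `` '' ( ) , . :".split()
PENN_TAGS = frozenset(WORD_TAGS + PUNCT_TAGS)

def keep_proper_tags(tagged_tokens, allowed=PENN_TAGS):
    """Drop tokens whose tag is outside the allowed set, such as -NONE- traces."""
    return [(word, tag) for word, tag in tagged_tokens if tag in allowed]
```

Applying the same filter to both the training data and the evaluation data keeps the reported accuracy from being affected by tokens the tagger was never meant to label.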