## Description

Q1 (6 points): Manually create FSTs for the following regular relations and save the

FSTs in Carmel format as files “fst1”, “fst2”, “fst3” under q1/.

fst1 for {(a2n

, b

n

) | n >=0}

fst2 for {(a

n

, b

2nc) | n >=0}

fst3 for{(an d*, (bc)

n g) | n>=0}

Q2 (14 points): Use Carmel to build a FST acceptor, fst_acceptor.sh.

The format of the command line is: fst_acceptor.sh fst_file input_file >

output_file

fst_file is an FST in the Carmel format (e.g., “examples/fst0”,

“examples/wfst1”, “examples/wfst2”)

Each line in the input file is a string (e.g., “examples/ex”, “examples/ex2”)

Each line in the output_file has the format “x => y prob” (e.g.,

“examples/ex.fst0”), where

o x is the string from the input file.

o y is the output string if x is accepted by the FST, or *none* if x is not

accepted by the FST.

o prob is the probability of the path whose yield is x.

o The probability of a path is the product of the probabilities of the edges in

the path.

o If there are multiple paths for an input string x, y is the output string of

the path with the highest probability (for paths with the same probabilities,

Carmel breaks the tie somehow)

Run your fst_acceptor.sh with the FSTs in Q1 and hw3/examples/ex as input file,

save the output files in ex.fst[1-3], respectively, under q2/.

fst_acceptor.sh q1/fst1 hw3/examples/ex > q2/ex.fst1

fst_acceptor.sh q1/fst2 hw3/examples/ex > q2/ex.fst2

fst_acceptor.sh q1/fst3 hw3/examples/ex > q2/ex.fst3

Run the following commands and save the output files under q2/.

fst_acceptor.sh hw3/examples/wfst1 hw3/examples/ex2 > q2/ex2.wfst1

fst_acceptor.sh hw3/examples/wfst2 hw3/examples/ex2 > q2/ex2.wfst2

Q3 (25 points): Build fst_acceptor2.sh WITHOUT using Carmel, which has the same

command line format and functionality as fst_acceptor.sh except:

fst_acceptor2.sh CANNOT use Carmel

Since we have not covered Viterbi algorithm, fst_acceptor2.sh will handle only

non-ambiguous FST; that is, if you ignore the output symbols on the arcs, the

resulting FSA is a DFA, not an NFA.

If the FST is ambiguous, your code should print to stderr “The input FST is

ambiguous”, and print out nothing to stdout.

a) Run the following commands and save the output files under q3/. Remember to

use fst1 and fst2, not wfst1 and wfst2 for the following:

fst_acceptor2.sh hw3/examples/fst1 hw3/examples/ex2 > q3/ex2.fst1

fst_acceptor2.sh hw3/examples/fst2 hw3/examples/ex2 > q3/ex2.fst2

Q4 (30 points): Build nfa_to_dfa.sh WITHOUT using Carmel, which converts an input

NFA to an equivalent DFA (with the caveat below).

The format of the command line is:

nfa_to_dfa.sh input_nfa_file > output_dfa_file

Both input_nfa_file and output_dfa_file are FSA files in the Carmel format: the

former is an NFA, the latter is a DFA. If the former is a DFA, the script will

simply output the same FSA.

To learn how to convert an NFA to an DFA, please look at the slides attached to

this assignment called NFA-to-DFA-conversion.pdf. You can also find videos on

youtube such as https://www.youtube.com/watch?v=taClnxU-nao

The main idea is that each state in the output DFA will correspond to a set of

states in the input NFA. The algorithm will start from the start state of the NFA

and determine for each input symbol, what states will be reached and make those

states into a single state in the DFA. Then for each new DFA state, determine

what states can be reached for each input symbol. Repeat the process until no new

DFA states will be created. A DFA state is a final state if and only if one of the

corresponding NFA states is a final state in the NFA.

One caveat about the DFA: If the resulting DFA may have more than one final

state, because the Carmel format allows only one final state, you have to create a

new final state (let’s call it FinalState), and add arcs that go from each of these

DFA final states to FinalStatewith with the empty string as the label. So in that

sense, the output-dfa-file is not a real DFA, but this is due to the limitation of the

Carmel format. If the resulting DFA has only one final state, do NOT create a

new final state.

For the resulting “DFA”, let’s use the following convention for naming the states

in the DFA:

o Suppose a DFA state corresponds to the set S, say {q1, q3, q5}, the state

should be called q1-q3-q5 (the states are sorted alphabetically), not q1-q5-

q3, etc.

If S has only one member, say q1, then the DFA state should be

called q1 as well.

o If the DFA has multiple final states, and we have to create a new final

state, let’s call that final state FinalState. You can assume that the input

NFA does not have a state with the same name.

The conversion algorithm in the NFA-to-DFA-conversion.pdf file defines a few

functions:

o MoveNFA(s, a) is the set of states in the input NFA that can be reached

from a state s when the input symbol is a. Here, s is a single state in the

NFA.

o MoveNFA(S, a) is the set of states in the input NFA that can be reached

from any state s in S when the input symbol is a. Here, S is a set of states

in the NFA.

o ԑ-Closure(s) is the set of states in the input NFA that can be reached from

a state s by going over zero or more arcs with empty string as the label.

o ԑ-Closure(S) is the set of states in the input NFA that can be reached from

any state in S by going over zero or more arcs with empty string as the

label.

o MoveDFA(A, a)=B, where B is the set of states in the input NFA that can be

reached from a state A in the DFA when the input symbol is a. Here, A is a

state in the DFA which means that A actually corresponds to a set of states

in the NFA. If B is not already a state in the current DFA, we then make it

as a new state in the DFA.

Run the following commands and save the output files under q4/.

nfa_to_dfa.sh hw3/examples/nfa1 > q4/dfa1

nfa_to_dfa.sh hw3/examples/nfa2 > q4/dfa2

The submission should include:

The readme.(pdf | txt) file as always.

Hw.tar.gz that includes all the files specified in submit-file-list