## Assign4: HMM/Viterbi
In this assignment you will implement the Viterbi algorithm to find the best path, or state sequence, for a given input DNA sequence. The HMM is specified in a file, where \(M\) is the number of symbols (4 for DNA), \(N\) is the number of states, \(A\) is the transition probability matrix between states, and \(E\) gives the symbol emission probabilities for each state. State numbers begin at 0, with 0 being the begin state. There is no explicit end state; you can add one if you want.

The HMM is specified in compact form, where only the non-zero transition and emission probabilities are listed. For example, one of the lines under "A:" is

You may assume that "A:" and "E:" each appear alone on a single line. First read in all the transition probabilities (until you reach "E:"), then read the emission probabilities.

Make sure that you convert all probabilities to logarithms and sum them, instead of multiplying the raw probabilities, when computing the best path through the HMM for a given sequence (in FASTA format). Read sec 3.6 in R12 for the how and why of the log transformation.

## What to submit

Write a program called

Test your code on the HMM file hmm.txt and the input sequence file test.txt. Here is an example output for the following test sequence, using the hmm.txt file:

    >test2
    AGCTCAGTTGCTTATGCGACACCA

The output is:

    Log-prob: -53.8207431931
    State seq: [0, 1, 24, 2, 23, 25, 1, 24, 3, 23, 4, 22, 3, 23, 1, 23, 4, 22, 4, 21, 2, 21, 2, 22, 1]

The log (base 2) of the probability is given. Please note that
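
The log-space recurrence described above can be sketched as follows. This is only an illustrative outline, not the required solution: the function name `viterbi_log2`, the dictionary layout (`trans[(i, j)]` and `emit[(state, symbol)]` holding only non-zero probabilities), and the three-state toy HMM are all assumptions made for this example and are not taken from hmm.txt. Parsing of the HMM file is deliberately left out. The sketch uses base-2 logs and reports the begin state 0 at the start of the path, matching the shape of the example output.

```python
import math

NEG_INF = float("-inf")

def viterbi_log2(seq, trans, emit, n_states):
    """Return (best log2 probability, state path starting at begin state 0).

    trans and emit are hypothetical sparse dicts of non-zero probabilities:
    trans[(i, j)] = P(i -> j), emit[(j, symbol)] = P(symbol | j).
    """
    # Convert the non-zero probabilities to log2 once, so products become sums.
    log_a = {k: math.log2(p) for k, p in trans.items()}
    log_e = {k: math.log2(p) for k, p in emit.items()}

    # score[j] = best log2 probability of any path ending in state j so far.
    score = [NEG_INF] * n_states
    score[0] = 0.0  # we start in the begin state with probability 1
    back = []       # back[t][j] = best predecessor of state j at position t

    for sym in seq:
        new_score = [NEG_INF] * n_states
        ptr = [None] * n_states
        for (i, j), la in log_a.items():
            le = log_e.get((j, sym))
            if le is None or score[i] == NEG_INF:
                continue  # zero-probability emission or unreachable state
            cand = score[i] + la + le
            if cand > new_score[j]:
                new_score[j] = cand
                ptr[j] = i
        score = new_score
        back.append(ptr)

    # No explicit end state: take the best-scoring state at the last position.
    last = max(range(n_states), key=lambda j: score[j])
    best = score[last]
    if best == NEG_INF:
        return NEG_INF, []  # the model cannot generate this sequence

    # Follow the back-pointers from the last position to the begin state.
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return best, path

# Toy HMM (hypothetical): state 1 emits A/C, state 2 emits G/T.
toy_trans = {(0, 1): 0.5, (0, 2): 0.5, (1, 1): 0.5, (1, 2): 0.5,
             (2, 1): 0.5, (2, 2): 0.5}
toy_emit = {(1, "A"): 0.5, (1, "C"): 0.5, (2, "G"): 0.5, (2, "T"): 0.5}

best, path = viterbi_log2("ACGT", toy_trans, toy_emit, 3)
print(best, path)  # -8.0 [0, 1, 1, 2, 2]
```

In the toy model every step contributes a factor of 0.5 × 0.5, so the four-symbol sequence has probability 2^-8 and a log2 score of exactly -8. Iterating over the sparse transition dictionary, rather than a full \(N \times N\) matrix, mirrors the compact file format in which only non-zero entries are given.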

Page last modified on March 12, 2019, at 08:09 PM