Re: Parameter Settings in BaumWelchTraining]: msg#00023 java.bio.general
I agree. If the BaumWelch trainer does cause problems one could always
implement a different version of ModelTrainer.

- Mark

Dan Bolser <dmb@xxxxxxxxxxxxxxxxxx>
03/12/2004 04:47 PM

To: Mark Schreiber/GP/Novartis@PH
cc: sacoca@xxxxxxxxxxxxx, Biojava Mailing List <biojava-l@xxxxxxxxxxx>
Subject: Re: [Biojava-l] Parameter Settings in BaumWelchTraining]

On Fri, 12 Mar 2004 mark.schreiber@xxxxxxxxxxxxxxxxxx wrote:

> When you call the train() method of the BaumWelchTrainer you supply it
> with a SequenceDB. The sequences from this DB are used to optimize the
> weights of the model.
>
> However, I have a bad feeling that when you train your model with the
> BaumWelchTrainer your previously set counts will be ignored and
> overwritten. You could check by looking into AbstractModelTrainer.train()
> (which is what the BaumWelchTrainer extends). You could also run some
> tests to see if using a pre-trained model makes any difference to the
> final outcome. Does anyone more expert than me on the DP package (ie most
> people) know if the counts are overwritten?

The idea sounds good either way, so it would be a shame to have to reject
it on the basis of a technicality :)

Cheers

>
> - Mark
>
>
> sacoca@xxxxxxxxxxxxx
> Sent by: biojava-l-bounces@xxxxxxxxxxxxxxxxxxx
> 03/12/2004 01:30 PM
>
> To: sacoca@xxxxxxxxxxxxx
> cc: Biojava Mailing List <biojava-l@xxxxxxxxxxx>
> Subject: Re: [Biojava-l] Parameter Settings in BaumWelchTraining]
>
> Sorry for the previous error.
> ---------------------------- Original Message ----------------------------
> Subject: Re: [Biojava-l] Parameter Settings in BaumWelchTraining
> From: sacoca@xxxxxxxxxxxxx
> Date: Fri, March 12, 2004 12:27 am
> To: mark.schreiber@xxxxxxxxxxxxxxxxxx
> --------------------------------------------------------------------------
>
> Here is the code I have for the training. Using what you told me below, I
> can retrieve all of the weights that I calculated manually for the hmm
> (distributions for the transitions and distributions for the alphabet of
> each state). What I do not understand is how to use this information and
> the sequences stored in a file to run the BaumWelchAlgorithm and then
> retrieve the optimized values calculated by the algorithm to set them back
> into my HMM.
>
> //Retrieve the alphabet of all states
> FiniteAlphabet SA = hmm.stateAlphabet();
> Iterator i = SA.iterator();
>
> SimpleModelTrainer MT = new SimpleModelTrainer();
> MT.registerModel(hmm);
>
> //go through each state
> while(i.hasNext())
> {Symbol Currentstate = (Symbol)i.next();
>
>  //Retrieve the distribution of all transitions from the current state
>  FiniteAlphabet From = hmm.transitionsFrom((State)Currentstate);
>  Distribution d = hmm.getWeights((State)Currentstate);
>  Iterator i2 = From.iterator();
>
>  //go through it and look at all the weights for each of the transitions
>  while(i2.hasNext())
>  {Symbol s = (Symbol)i2.next();
>   System.out.println("From state "+Currentstate.getName()+
>                      "To State "+s.getName()+
>                      "Weight "+d.getWeight(s));}
>
>  //get the distribution for the alphabet of the current state
>  Distribution d2 =((EmissionState)Currentstate).getDistribution();
>  FiniteAlphabet IN = (FiniteAlphabet)hmm.emissionAlphabet();
>  Iterator i3 = IN.iterator();
>  //you can go through it the same way as above using a while loop
>
>  *****************************************************************
>  This is what I don't understand!!!!
>  *****************************************************************
>  here, we have a set of training sequences stored in a file in fasta format
>  that I'd like to use with the BaumWelch algorithm to optimize the
>  transition distributions mentioned above.
>
>  //This is the file with all the training sequences
>  BufferedInputStream is = new BufferedInputStream(new
>  FileInputStream("z:/Sequences.faa"));
>
>  //Load the file with the SequenceDB class
>  SequenceDB DB = SeqIOTools.readFasta(is, ProtAlphabet);
>
>  //use 100 cycles as the stop criteria
>  StoppingCriteria stopper = new StoppingCriteria()
>  {public boolean isTrainingComplete(TrainingAlgorithm ta)
>   {return (ta.getCycle() > 100);}};
>
>  *****************************************
>  This part is what I am clueless about
>  *****************************************
>  //How do I optimize my hmm with the BaumWelch algorithm and retrieve
>  //the optimized values? How do I train the distribution above with
>  //the Baum-Welch and the sequences that I have?
>  DP dp= DPFactory.DEFAULT.createDP(hmm);
>  BaumWelchTrainer bwt = new BaumWelchTrainer(dp);
> }
>
> PS : I do not know why you are helping all of us here but thank you. It
> makes Biojava a lot easier to deal with.
>
> Steve
>
> > Hi Stephane -
> >
> > Within EmissionState you can set a Distribution that contains emission
> > probabilities for the Symbols states emission alphabet using the
> > setDistribution method. This Distribution will be your predetermined
> > weights.
> >
> > To set the transition probabilities you can use the setWeights(State
> > source, Distribution weights). The source is the state you are
> > transitioning from and the weights is the probability of transitioning
> > to any State that the source connects to. Because States implement
> > Symbol you can put them in a Distribution.
> >
> > To make a Distribution of States that state 'a' could connect to use the
> > following pseudo code:
> >
> > State a;
> > Model m;
> > FiniteAlphabet endPoints;
> >
> > endPoints = m.transitionsFrom(a);
> > Distribution d =
> > DistributionFactory.DEFAULT.createDistribution(endPoints);
> >
> > //You can then train d or set its weights and put it back in the model
> > //with
> >
> > m.setWeights(a, d);
> >
> > Mark Schreiber
> > Principal Scientist (Bioinformatics)
> >
> > Novartis Institute for Tropical Diseases (NITD)
> > 1 Science Park Road
> > #04-14 The Capricorn, Science Park II
> > Singapore 117528
> >
> > phone +65 6722 2973
> > fax +65 6722 2910
> >
> >
> > sacoca@xxxxxxxxxxxxx
> > Sent by: biojava-l-bounces@xxxxxxxxxxxxxxxxxxx
> > 03/12/2004 06:11 AM
> >
> > To: "Biojava Mailing List" <biojava-l@xxxxxxxxxxx>
> > cc:
> > Subject: [Biojava-l] Parameter Settings in BaumWelchTraining
> >
> > Hi all. I'm trying to optimize the transition states probabilities for
> > my HMM. I already have set them to values which I think are pretty good.
> > Since I know the Baum Welch can only help with the scores and optimize
> > them up to a local maximum I thought of using the parameters I calculated
> > as a starting point. The problem is that I don't know how!
> > I followed the example in biojava:
> >
> > ....
> > //train the model to have uniform parameters
> > ModelTrainer mt = new SimpleModelTrainer();
> > //register the model to train
> > mt.registerModel(hmm);
> >
> > I want to use the values already set in my hmm as the starting
> > parameters in the BaumWelch. I don't want to use the uniform
> > distribution as indicated below!
> >
> > //as no other counts are being used the null weight will cause
> > //everything to be uniform
> > mt.setNullModelWeight(1.0);
> > mt.train();
> >
> > I tried adding counts and looking up examples on the net but ended up
> > more confused than I started. How do I use the addCounts to make this
> > work!
> >
> > Stephane Acoca
> > Master's Student
> > McGill Center for Bioinformatics
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l@xxxxxxxxxxx
> > http://biojava.org/mailman/listinfo/biojava-l
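
The idea discussed in this thread, using hand-set transition and emission
values as the starting point for Baum-Welch, can be illustrated without
BioJava. The following is only a minimal sketch in plain Java of the textbook
re-estimation step for a small discrete HMM and a single observation sequence.
It is not BioJava's BaumWelchTrainer and says nothing about whether that class
keeps or overwrites preset counts; the class name BaumWelchSketch, the method
step, and all numbers are hypothetical and chosen for illustration.

```java
import java.util.Arrays;

class BaumWelchSketch {
    // One textbook Baum-Welch (EM) re-estimation step for a discrete HMM.
    // a: transition probabilities [N][N], b: emission probabilities [N][M],
    // pi: fixed start probabilities [N], obs: observed symbol indices.
    // a and b are updated in place. Returns P(obs | model) as computed with
    // the parameters that were in place BEFORE this update.
    // Unscaled forward/backward values are used, so this is only suitable
    // for short sequences (an assumption made to keep the sketch small).
    public static double step(double[][] a, double[][] b, double[] pi, int[] obs) {
        int n = pi.length, m = b[0].length, t = obs.length;
        double[][] fwd = new double[t][n], bwd = new double[t][n];
        for (int i = 0; i < n; i++) {
            fwd[0][i] = pi[i] * b[i][obs[0]];
            bwd[t - 1][i] = 1.0;
        }
        for (int k = 1; k < t; k++)                    // forward pass
            for (int j = 0; j < n; j++) {
                double s = 0;
                for (int i = 0; i < n; i++) s += fwd[k - 1][i] * a[i][j];
                fwd[k][j] = s * b[j][obs[k]];
            }
        for (int k = t - 2; k >= 0; k--)               // backward pass
            for (int i = 0; i < n; i++) {
                double s = 0;
                for (int j = 0; j < n; j++) s += a[i][j] * b[j][obs[k + 1]] * bwd[k + 1][j];
                bwd[k][i] = s;
            }
        double like = 0;
        for (int i = 0; i < n; i++) like += fwd[t - 1][i];

        double[][] newA = new double[n][n], newB = new double[n][m];
        for (int i = 0; i < n; i++) {
            double gammaTrans = 0, gammaAll = 0;       // expected visits to state i
            for (int k = 0; k < t; k++) {
                double g = fwd[k][i] * bwd[k][i] / like;
                gammaAll += g;
                newB[i][obs[k]] += g;                  // expected emission counts
                if (k < t - 1) {
                    gammaTrans += g;
                    for (int j = 0; j < n; j++)        // expected transition counts
                        newA[i][j] += fwd[k][i] * a[i][j] * b[j][obs[k + 1]] * bwd[k + 1][j] / like;
                }
            }
            // Normalise counts; keep the old row if the state was never visited.
            for (int j = 0; j < n; j++) newA[i][j] = gammaTrans > 0 ? newA[i][j] / gammaTrans : a[i][j];
            for (int s = 0; s < m; s++) newB[i][s] = gammaAll > 0 ? newB[i][s] / gammaAll : b[i][s];
        }
        for (int i = 0; i < n; i++) { a[i] = newA[i]; b[i] = newB[i]; }
        return like;
    }

    public static void main(String[] args) {
        // Hand-set starting values (hypothetical), used instead of uniform ones.
        double[][] a = {{0.8, 0.2}, {0.3, 0.7}};
        double[][] b = {{0.9, 0.1}, {0.2, 0.8}};
        double[] pi = {0.5, 0.5};
        int[] obs = {0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1};
        for (int cycle = 0; cycle < 100; cycle++) {    // same stop rule as in the thread
            double like = step(a, b, pi, obs);
            if (cycle % 20 == 0) System.out.println("cycle " + cycle + " likelihood " + like);
        }
        System.out.println("transitions: " + Arrays.deepToString(a));
        System.out.println("emissions:   " + Arrays.deepToString(b));
    }
}
```

Because each step re-estimates from whatever values are currently in a and b,
the hand-set parameters act only as the starting point, and different starting
points can end at different local optima. Whether a particular library trainer
behaves this way is best checked by the kind of test suggested above: train
once from preset values and once from uniform values and compare the results.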