[BioPython] BLAST in a python generator: msg#00034
python.bio.general
Regarding threads and Biopython.

I've been experimenting with keeping BLAST running in a separate thread using a python generator, then calling the .next() method of the generator when I want to run the next query. The query is held in a StringIO buffer that the generator can read. The idea is that the overhead of reading the database needn't be repeated, as that is part of the generator's state.

This is the first time I've written a generator. Unfortunately I don't seem to be able to get _all_ of the output of the BLAST record. Most of the time my select loops return only part of the result. The code below is one of several schemes I've tried.

For those interested in the idea, here is some of the code (no Biopython in this snippet):

    def BLASTpipe(inBuf, blastDB = genomeDB):
        """Generator for a BLAST process.

        inBuf is a StringIO buffer that contains one or more query
        sequences.  .next() processes the query(s) in inBuf.  inBuf is
        consumed and a tuple of the output and error strings is returned.
        """
        # Format DB, if necessary
        if not os.access(blastDB + '.nhr' ,os.R_OK) \
           or not os.access(blastDB + '.nin' ,os.R_OK) \
           or not os.access(blastDB + '.nsq' ,os.R_OK):
            # db is not formatted
            tmpDbFile = NamedTemporaryFile()
            userDbFile = file(blastDB,'r')
            tmpDbFile.write(userDbFile.read())
            userDbFile.close()
            tmpDbFile.flush()
            blastDB = tmpDbFile.name
            # format db
            os.system('%s -pF -l /dev/null -i%s' % (formatdb_exe, blastDB))

        blast_in, blast_out, blast_err = os.popen3(blast_exe + \
            ' -p blastn -d %s ' % (blastDB), 't',1)

        while True:
            outString = ''
            errString = ''
            inBuf.seek(0)
            inQuery = inBuf.read()
            blast_in.write(inQuery)
            inBuf.seek(0)
            inBuf.truncate()

            readyReaders, undef, undef = select([blast_out,blast_err],[],[],0.5)
            while readyReaders != []:
                if blast_out in readyReaders:
                    outString = blast_out.read(1)
                    while blast_out in select([blast_out],[],[],0.5) [0]:
                        outString += blast_out.read(1)
                if blast_err in readyReaders:
                    errString = blast_err.read(1)
                    while blast_err in select([blast_err],[],[],0.5) [0]:
                        errString += blast_err.read(1)
                readyReaders, undef, undef = select([blast_out],[],[],0.5)
            yield outString, errString
    # end

Comments?

Kael

--
Kael Fischer, Ph.D.
DeRisi Lab, University of California San Francisco
Desk: 415-514-4320
kael@xxxxxxxxxxxxxxxxxx

_______________________________________________
BioPython mailing list - BioPython@xxxxxxxxxxxxx
http://biopython.org/mailman/listinfo/biopython
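For readers who want to try the generator-around-a-subprocess idea in current Python, here is a minimal sketch. It is not the code from the post and it does not run BLAST: the child process is a hypothetical stand-in (`CHILD_SOURCE`) that upper-cases each input line, prints an end-of-record marker, and flushes. The names `process_pipe`, `run_queries`, and `END_MARKER` are illustrative. The sketch reads with blocking `readline()` calls until the marker appears instead of guessing completeness from `select()` timeouts, which is one way a reader can end up with only part of a result. Whether a given BLAST executable emits a usable record boundary and flushes after every query is an assumption that would need checking. Note also that in the posted loop `outString = blast_out.read(1)` reassigns rather than appends on each outer pass, which may discard earlier output.

```python
import subprocess
import sys

# Hypothetical stand-in for a long-running tool (NOT BLAST): it echoes each
# input line upper-cased, then writes an end-of-record marker and flushes.
CHILD_SOURCE = r"""
import sys
for line in sys.stdin:
    sys.stdout.write(line.upper())
    sys.stdout.write("//END\n")
    sys.stdout.flush()
"""

END_MARKER = "//END"  # assumed record terminator printed by the child


def process_pipe(argv, end_marker=END_MARKER):
    """Generator that keeps one child process alive between queries.

    Prime it with next(), then send() one query line at a time. Each send()
    returns the complete output the child produced for that query.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered pipes on the parent side
    )
    try:
        result = None
        while True:
            query = yield result
            proc.stdin.write(query.rstrip("\n") + "\n")
            proc.stdin.flush()
            lines = []
            # Blocking reads until the marker line, so output is never
            # cut short by a timeout.
            for line in iter(proc.stdout.readline, ""):
                if line.rstrip("\n") == end_marker:
                    break
                lines.append(line)
            result = "".join(lines)
    finally:
        # Closing stdin gives the child EOF so it can exit cleanly.
        proc.stdin.close()
        proc.stdout.close()
        proc.wait()


def run_queries(queries):
    """Send each query through a single child process; return the outputs."""
    pipe = process_pipe([sys.executable, "-c", CHILD_SOURCE])
    next(pipe)  # advance to the first yield before calling send()
    try:
        return [pipe.send(q) for q in queries]
    finally:
        pipe.close()


if __name__ == "__main__":
    print(run_queries(["acgt", "ttga"]))
```

The generator's state holds the open process, so start-up cost is paid once, which is the point of the original scheme. The marker-based read only works if the child cooperates; a tool that buffers its output until end of input would block here.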