
[BioPython] BLAST in a python generator: msg#00034

python.bio.general

Subject: [BioPython] BLAST in a python generator

Regarding threads and Biopython.

I've been experimenting with keeping BLAST running as a separate process,
wrapped in a Python generator. I call the generator's .next() method
whenever I want to run the next query. The query is held in a StringIO
buffer that the generator reads.
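
To make the buffer handling concrete, here is a minimal sketch of the
"read everything, then empty the buffer" pattern on its own. The helper name
take_all and the example sequences are made up for illustration, and it uses
io.StringIO from current Python 3 rather than the older StringIO module.

```python
import io

def take_all(buf):
    """Return everything currently in buf and leave buf empty."""
    buf.seek(0)        # rewind to the start of the buffer
    data = buf.read()  # read all pending query text
    buf.seek(0)        # rewind again so truncate() cuts at position 0
    buf.truncate()     # drop the consumed text
    return data

buf = io.StringIO()
buf.write(">q1\nACGT\n")
first = take_all(buf)

# The buffer is empty again, so the next write holds only the new query.
buf.write(">q2\nTTGA\n")
second = take_all(buf)
```

The caller only ever writes to the buffer, and the consumer sees each query
exactly once.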

The idea is that the overhead of reading the database need not be repeated
for each query, because the running BLAST process is part of the generator's
state.
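
The general shape of that idea, stripped of BLAST, might look like the sketch
below. It is only an illustration under stated assumptions: the child process
is a stand-in Python one-liner that upper-cases each input line (not BLAST),
the name persistent_worker is hypothetical, and it is written for current
Python 3 with the subprocess module.

```python
import subprocess
import sys

# Stand-in child process (an assumption for illustration, not BLAST):
# it reads one line at a time and echoes it back in upper case.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

def persistent_worker(queries):
    """Start the child once, then yield one reply per query."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    )
    try:
        for query in queries:
            # The child stays alive between queries; only the query changes.
            proc.stdin.write(query + "\n")
            proc.stdin.flush()
            yield proc.stdout.readline().rstrip("\n")
    finally:
        # Closing stdin gives the child EOF so it can exit cleanly.
        proc.stdin.close()
        proc.stdout.close()
        proc.wait()

replies = list(persistent_worker(["acgt", "ttga"]))
```

This works here only because the stand-in child answers each line with
exactly one flushed line, so the generator knows when a reply is complete.
Whether a real BLAST process behaves that way is a separate question.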

This is the first time I've written a generator. Unfortunately, I can't seem
to get _all_ of the output for a BLAST record: most of the time my select
loops return only part of the result. The code below is one of several
schemes I've tried.

For those interested in the idea, here is some of the code:
(no Biopython in this snippet)

import os
from select import select
from tempfile import NamedTemporaryFile

# genomeDB, blast_exe and formatdb_exe are defined elsewhere (not shown).

def BLASTpipe(inBuf, blastDB = genomeDB):
    """Generator for a BLAST process.
    inBuf is a StringIO buffer that contains one or
    more query sequences.
    .next() processes the query(s) in inBuf.  inBuf is consumed
    and a tuple of the output and error strings is returned.
    """

    # Format DB, if necessary
    if not os.access(blastDB + '.nhr', os.R_OK) \
       or not os.access(blastDB + '.nin', os.R_OK) \
       or not os.access(blastDB + '.nsq', os.R_OK):
        # db is not formatted: copy it to a temp file and format that
        tmpDbFile = NamedTemporaryFile()
        userDbFile = file(blastDB, 'r')
        tmpDbFile.write(userDbFile.read())
        userDbFile.close()
        tmpDbFile.flush()
        blastDB = tmpDbFile.name
        # format db
        os.system('%s -pF -l /dev/null -i%s' % (formatdb_exe, blastDB))

    # start BLAST once; the pipes stay open across .next() calls
    blast_in, blast_out, blast_err = os.popen3(blast_exe + \
                                    ' -p blastn -d %s ' % (blastDB),
                                    't', 1)
    while True:
        outString = ''
        errString = ''

        # read the pending query and hand it to BLAST
        inBuf.seek(0)
        inQuery = inBuf.read()

        blast_in.write(inQuery)

        # empty the buffer so the query is not sent twice
        inBuf.seek(0)
        inBuf.truncate()

        # collect output until nothing arrives for 0.5 seconds
        readyReaders, undef, undef = select([blast_out, blast_err], [], [], 0.5)
        while readyReaders != []:
            if blast_out in readyReaders:
                outString = blast_out.read(1)
                while blast_out in select([blast_out], [], [], 0.5)[0]:
                    outString += blast_out.read(1)

            if blast_err in readyReaders:
                errString = blast_err.read(1)
                while blast_err in select([blast_err], [], [], 0.5)[0]:
                    errString += blast_err.read(1)

            readyReaders, undef, undef = select([blast_out], [], [], 0.5)

        yield outString, errString

# end
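
For comparison, here is a sketch of a read loop that does not rely on a
timeout to decide when the output is finished. It is not a fix for the
generator above, only an illustration under stated assumptions: the child is
a stand-in Python one-liner (not BLAST), the helper name drain is
hypothetical, it targets current Python 3 on a Unix-like system where select
works on pipes, and it treats end of file as the only "done" signal. It also
appends every chunk it reads, whereas the snippet above reassigns outString
at the top of each pass through the outer loop.

```python
import os
import select
import subprocess
import sys

def drain(proc):
    """Read proc's stdout and stderr until both report end of file."""
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = {out_fd, err_fd}
    while open_fds:
        # No timeout: block until at least one pipe has data or hits EOF.
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            data = os.read(fd, 4096)
            if data:
                chunks[fd].append(data)  # append, never overwrite
            else:
                open_fds.discard(fd)     # empty read means EOF on this pipe
    return b"".join(chunks[out_fd]), b"".join(chunks[err_fd])

# Stand-in child: writes three lines to stdout and one to stderr, then exits.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write('hit\\n' * 3); sys.stderr.write('warn\\n')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = drain(child)
child.stdout.close()
child.stderr.close()
child.wait()
```

The trade-off is that end of file only arrives when the child closes its
pipes, which normally means it has exited. Keeping one process alive across
queries would need some other reliable end-of-result signal.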

Comments?

Kael

--
Kael Fischer, Ph.D.
DeRisi Lab, University of California San Francisco
Desk: 415-514-4320
kael@xxxxxxxxxxxxxxxxxx

_______________________________________________
BioPython mailing list - BioPython@xxxxxxxxxxxxx
http://biopython.org/mailman/listinfo/biopython


