KEGG Gene Parser: msg#00072 python.bio.devel
From: mike@xxxxxxxxxxx
Subject: KEGG Gene Parser
Date: 17 November 2004 11:23:08 GMT
To: biopython-dev@xxxxxxxxxxxxx

Hi,

I've been working on a KEGG Gene parser (attached) and wondered if it would be of use to the project as a whole. It is based on the existing Bio.KEGG.Compound/Enzyme modules. I'm still in the process of testing the parser against more KEGG files, but I believe it parses all the current KEGG gene files successfully (except c.hominis).

Known bugs and missing features:

- A method for outputting a record as straight text (__str__)
- Properly parsing the CODON_USAGE section
- More refinement of expressions for the POSITION section
- Methods to handle CODON_USAGE and POSITION callbacks
- Parsing c.hominis aa_sequences (I'm not sure this is exactly a bug, see below)

For my current needs I'm not really interested in codon usage, position, or returning a record in string form, so I haven't spent the time handling this stuff. It should be fairly easy to add if someone cares enough. Otherwise I may get around to it one day.

I know c.hominis parsing is broken because the entries have very odd aa_seq sections. I'm trying to figure out whether a broken program is creating the file, whether the entries mean something sensible, or whether it is just another dumb stretching of an inadequate flat-file format designed to test the patience of people writing parsers.

I think the parser could do with some polishing, and the parsing regexes could probably be optimised a fair bit, but it is useful as it is.

The files are available at

<http://www.gene-hacker.net/python/__init__.py>
<http://www.gene-hacker.net/python/gene_format.py>

I'm continuing to work on the module to fix any parsing errors I come across with further testing/usage and will post updates soon.

I'd be grateful for any comments or suggestions (be nice, I've only been using Python a little while ;) )

cheers
Michael

--
Dr Michael Maibaum
Department of Biochemistry and Molecular Biology, UCL
email: maibaum@xxxxxxxxxxxxxxxxxxxxxx