osdir.com



Extracting data from Python dictionary object


Stanley Denman <dallasdisabilityattorney at gmail.com> writes:

> I am new to Python. I am trying to extract text from the bookmarks in a PDF file that would provide the data for a Word template merge. I have gotten down to a string of text pulled out of the list object that I got from using the PyPDF2 module. I am stuck on how to get the data I need out of the string. I am calling it a string, but Python is recognizing it as a dictionary object.
>
> Here is the string: 
>
> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>
> What I want is the following to end up as fields on my Word template merge:
> MedSourceFirstName: "John"
> MedSourceLastName: "Milani"
> MedSourceLastTreatment: "05/28/2014"
>
> If I use keys() on the dictionary I get this:
> ['/Title', '/Page', '/Type']
>
> I was hoping "Src." and "Tmt. Dt." would be treated as keys. It seems like the key/value pairs of a dictionary would translate nicely to field names and field data for a Word document merge. Here is my code so far.

A Python "dict" is a mapping of keys to values. Its "keys" method
returns the keys (as you have used above).
The subscription syntax ("<some_dict>[<some_key>]"; e.g.
"pdf_info['/Title']") returns the value associated with
"<some_key>".
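
As a minimal sketch of the two operations just described, using the
dictionary from your message: the name "pdf_info" is only an illustration,
and the '/Page' value is replaced by a placeholder string here so that the
snippet runs without PyPDF2.

```python
# Stand-in for the bookmark entry shown in the question. In the real data
# the '/Page' value is a PyPDF2 IndirectObject; a placeholder string is
# used here (an assumption) so the example is self-contained.
pdf_info = {
    '/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)',
    '/Page': 'IndirectObject(465, 0)',
    '/Type': '/FitB',
}

# keys() lists only the top-level keys of the mapping.
keys = list(pdf_info.keys())

# Subscription returns the value stored under one key.
title = pdf_info['/Title']

print(keys)
print(title)
```

Note that "Src." and "Tmt. Dt." do not appear among the keys; they are
only text inside the value stored under '/Title'.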

In your case, the relevant information ("Src." and "Tmt. Dt.") is encoded
inside the value stored under '/Title', not as separate keys.
You will need to extract this information yourself. Python's "re" module
(regular expressions) might be of help (see the "library reference" for
details).
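
One possible way to do that with "re" is sketched below. The pattern is
an assumption based on the single title you posted ("Src.:" followed by
"LAST, FIRST", then "Tmt. Dt.:" followed by one date or a date range);
other bookmarks may be laid out differently and need a different pattern.
The names "TITLE_RE" and "parse_title" are made up for this example.

```python
import re

title = '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)'

# Assumed layout: "Src.:  LAST, FIRST ... Tmt. Dt.:  [start - ]end"
TITLE_RE = re.compile(
    r'Src\.:\s*(?P<last>[^,]+),\s*(?P<first>\S+)'  # e.g. "MILANI, JOHN"
    r'.*?Tmt\. Dt\.:\s*'                           # skip to the date label
    r'(?:\d{2}/\d{2}/\d{4}\s*-\s*)?'               # optional start date
    r'(?P<last_treatment>\d{2}/\d{2}/\d{4})'       # last (or only) date
)

def parse_title(title):
    """Return the merge fields as a dict, or None if the title does not match."""
    m = TITLE_RE.search(title)
    if m is None:
        return None
    return {
        'MedSourceFirstName': m.group('first').title(),
        'MedSourceLastName': m.group('last').strip().title(),
        'MedSourceLastTreatment': m.group('last_treatment'),
    }

fields = parse_title(title)
print(fields)
```

The result is itself a dictionary, so its keys and values can serve as
the field names and field data you wanted for the merge. Checking for
None before using the result guards against titles that do not follow
the assumed layout.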