
Extracting data from Python dictionary object


I am new to Python. I am trying to extract text from the bookmarks in a PDF file to provide the data for a Word template merge. Using the PyPDF2 module, I have gotten as far as pulling a single item out of the list of bookmarks. I am stuck on how to get the data I need out of that item. I am calling it a string, but Python recognizes it as a dictionary object.

Here is the string: 

{'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}

What I want is for the following to end up as fields in my Word template merge:
MedSourceFirstName: "John"
MedSourceLastName: "Milani"
MedSourceLastTreatment: "05/28/2014"

If I use keys() on the dictionary I get this:
['/Title', '/Page', '/Type']

I was hoping "Src." and "Tmt. Dt." would be treated as keys, since the key/value pairs of a dictionary would translate nicely to field names and field data for a Word document merge. Here is my code so far:

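[python]import PyPDF2

# Open the PDF in binary mode; the with block closes the file afterwards.
with open('x.pdf', 'rb') as pdf_file:
    pdf_reader = PyPDF2.PdfFileReader(pdf_file)
    # getOutlines() returns the bookmarks as a (possibly nested) list.
    outlines = pdf_reader.getOutlines()
    # Take the first entry of the last nested group of bookmarks.
    bookmark = outlines[-1][0]
    print(isinstance(bookmark, dict))
    print(bookmark)
    print(list(bookmark.keys()))[/python]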

I get this output in Sublime Text:
True
{'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
['/Title', '/Page', '/Type']
[Finished in 0.4s]
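
Since "Src." and "Tmt. Dt." only appear inside the '/Title' value, one possible approach is to parse that string with a regular expression. The following is only a rough sketch: the pattern, the `parse_title` helper, and the assumption that every title follows the layout "Src.: LAST, FIRST ... Tmt. Dt.: start - end" are guesses based on the single bookmark shown above, not something PyPDF2 provides.

```python
import re

# Sample '/Title' value copied from the bookmark shown above.
title = ('1F:  Progress Notes  Src.:  MILANI, JOHN C '
         'Tmt. Dt.:  05/12/2014 - 05/28/2014 (9 pages)')

# Assumed layout: "Src.: LAST, FIRST [MIDDLE] Tmt. Dt.: start - end".
TITLE_PATTERN = re.compile(
    r'Src\.:\s*(?P<last>[^,]+),\s*(?P<first>\S+)'  # LAST, FIRST
    r'.*?Tmt\. Dt\.:\s*'                           # skip any middle initial
    r'(?P<start>\d{2}/\d{2}/\d{4})\s*-\s*(?P<end>\d{2}/\d{2}/\d{4})'
)


def parse_title(title):
    """Return merge fields parsed from a bookmark title, or None if no match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    return {
        'MedSourceFirstName': match.group('first').title(),
        'MedSourceLastName': match.group('last').strip().title(),
        'MedSourceLastTreatment': match.group('end'),
    }


fields = parse_title(title)
print(fields)
```

With the real bookmark, the same helper could presumably be called on the '/Title' entry of the dictionary. Titles with a single treatment date, or names that `str.title()` would capitalize incorrectly, would need extra handling.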

Thank you in advance for any suggestions.