pdf2xml.py
# Load a PDF and dump its layout tree to XML, so you can inspect the document and decide what to extract.
from pdfquery import PDFQuery

# Load the PDF (the relative path means p1.pdf must be in the working directory).
pdf = PDFQuery('p1.pdf')
pdf.load()

# Write the parsed tree to XML.
pdf.tree.write('p1.xml', pretty_print=True)
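# Once p1.xml exists, a label search is one way to find where a field sits in the tree.
# The sketch below is a minimal, standard-library-only illustration, not part of the
# original script. find_text and SAMPLE_XML are hypothetical names, and the sample XML
# is a hand-written stand-in that assumes text ends up in elements such as
# LTTextLineHorizontal; check the real p1.xml for the tags it actually contains.

```python
import xml.etree.ElementTree as ET


def find_text(root, needle):
    """Return (tag, text) pairs for every element whose own text contains needle."""
    matches = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if needle in text:
            matches.append((elem.tag, text))
    return matches


# Hand-written stand-in for p1.xml (assumed structure, made-up values).
SAMPLE_XML = """<pdfxml>
  <LTPage page_index="0">
    <LTTextLineHorizontal x0="50.0" y0="700.0">First Name: Jane </LTTextLineHorizontal>
    <LTTextLineHorizontal x0="50.0" y0="680.0">Last Name: Doe </LTTextLineHorizontal>
  </LTPage>
</pdfxml>"""

sample_root = ET.fromstring(SAMPLE_XML)
first_name_hits = find_text(sample_root, "First Name")
print(first_name_hits)
```

# To search the real dump instead, the root could be obtained with
# ET.parse('p1.xml').getroot() and passed to find_text in the same way.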
# Variables to get
"""
Headers for sheet
Gender
First Name
Middle Name
Last Name
Initials
Mother's maiden name
Birthday
Birthplace
Zodiacal Sign
Username
Password
Password Hash (MD5)
Password Hash (SHA1)
E-mail
Phone
Address
SSN (2141123212) - issued in Maryland
Passport
Passport #
Passport issued date
Passport exp date
Passport code
P<USAANDREWS<<WAYNE<VICTOR<<<<<<<<<<<<<<<<<<
5730248430USA9605280M3012073<<<<<<<<<<<<<<08
Driver's License
Number
State issued
issued date
exp date
Car
Car License Plate
Number
state issued
date issued
Hair Color
Eye Color
Height
Weight
Shoe Size
Blood Type
Unique ID Numbers
GUID
UniqID
Western Union MTCN
MoneyGram MTCN
FICO Credit Score
Experian
Equifax
Vantage Score
FICO NextGen Risk Score
FICO Small Business Scoring Service(SBSS)
Religion
Political side
Favorite Color
Favorite comfort food
Favorite Cereal
Favorite Season
Favorite Animal
Lucky number
Preparer Tax Identification Number (PTIN)
Interim PTIN (temporary PTIN)
Employer Identification Number (EIN)
Individual Taxpayer Identification Number (ITIN)
Adoption Taxpayer Identification Number (ATIN)
Alignment
Abilities
Charisma
Constitution
Dexterity
Intelligence
Strength
Wisdom
"""
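# The header list above is meant for a sheet. As a rough sketch of how it might be used,
# the block below writes a header row plus one record to CSV text with the standard
# library. HEADERS holds only the first few names from the list, rows_to_csv is a
# hypothetical helper, and the record values are made up for illustration.

```python
import csv
import io

# A few column names copied from the list above; extend with the rest as needed.
HEADERS = ["Gender", "First Name", "Middle Name", "Last Name", "Birthday"]


def rows_to_csv(rows, headers=HEADERS):
    """Serialize dict rows to CSV text; fields missing from a row are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=headers, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv([{"First Name": "Jane", "Last Name": "Doe"}])
print(csv_text)
```

# Writing to a file instead of a string would only need open('sheet.csv', 'w', newline='')
# in place of the StringIO buffer; the file name here is an assumption.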