pdftoxml
====
version 1.0.0
July 2007
The Xpdf software and documentation are
copyright 1996-2007 Glyph & Cog, LLC.
Email: derekn@foolabs.com
WWW: http://www.foolabs.com/xpdf/
The PDF data structures, operators, and specification are
copyright 1985-2006 Adobe Systems Inc.
The libxml2 software and documentation are released under the MIT License.
See the Copyright file in the distribution for the precise wording.
What is pdftoxml?
-------------
pdftoxml is an open-source PDF to XML converter.
pdftoxml runs under Linux and on Win32 systems.
pdftoxml is based on xpdf and is essentially a (large) modification
of pdftotext in order to generate XML instead of plain text.
The XML generation uses the libxml2 library.
Distribution
------------
pdftoxml is licensed under the GNU General Public License (GPL), version 2.
Compatibility
-------------
pdftoxml is developed and tested on a Linux 2.4 x86 system.
In addition, it has been compiled on a Win32 system.
Getting pdftoxml
------------
The latest version is available from: https://sourceforge.net/projects/pdf2xml
Source code is available from: http://pdf2xml.cvs.sourceforge.net/pdf2xml/
Running pdftoxml
------------
To run pdftoxml, simply type:
    pdftoxml.exe file.pdf
Command-line options and many other details are (or should be) described
on the SourceForge project page.
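The invocation above can also be scripted when several files need converting.
The following is only a minimal sketch, not part of the pdftoxml distribution.
It assumes that the executable is on the PATH under the name "pdftoxml" (pass
"pdftoxml.exe" on Win32), that it accepts the PDF path as its only argument as
shown above, and that its exit code indicates success or failure. The helper
names build_command and convert_all are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, executable="pdftoxml"):
    """Return the argument list for converting one PDF, with no extra options."""
    return [executable, str(Path(pdf_path))]


def convert_all(directory, executable="pdftoxml"):
    """Run the converter on every *.pdf in `directory`.

    Returns a dict mapping each PDF file name to the process exit code.
    Treating the exit code as a success indicator is an assumption.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found on PATH")
    results = {}
    for pdf in sorted(Path(directory).glob("*.pdf")):
        completed = subprocess.run(build_command(pdf, executable))
        results[pdf.name] = completed.returncode
    return results


if __name__ == "__main__":
    # Convert every PDF in the current directory and report the exit codes.
    for name, code in convert_all(".").items():
        print(name, code)
```

Keeping the command construction in its own function makes it easy to add
options later, once they have been checked against the project documentation.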
Compiling pdftoxml
--------------
See the separate file, INSTALL.
Contributors
----
Hervé Déjean (src)
Sophie Andrieu (src)
Jean-Yves Vion-Dury (schemas)