pdfcompare
==========
Compares the text of two PDF files and writes a result PDF with the changes highlighted.
Text portions that may have been moved around are recognized and checked
for similarity with a second-level diff.
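
The second-level diff idea can be sketched in a few lines of Python. This is only an illustration built on the standard difflib module, not pdfcompare's actual code; the function name find_moved_blocks and the min_ratio threshold are assumptions made for this example.

```python
import difflib


def find_moved_blocks(old_words, new_words, min_ratio=0.8):
    """Pair deleted and inserted word runs that look like moved text.

    First level: a word diff yields the deleted and inserted runs.
    Second level: each deleted run is diffed against each inserted run;
    pairs whose similarity ratio reaches min_ratio are reported as moves.
    (Hypothetical helper, not part of pdfcompare.)
    """
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    deleted, inserted = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append(old_words[i1:i2])
        if tag in ("insert", "replace"):
            inserted.append(new_words[j1:j2])

    moves = []
    for d in deleted:
        for ins in inserted:
            # Second-level diff: compare one deleted run with one inserted run.
            ratio = difflib.SequenceMatcher(None, d, ins, autojunk=False).ratio()
            if ratio >= min_ratio:
                moves.append((" ".join(d), " ".join(ins), round(ratio, 2)))
    return moves


if __name__ == "__main__":
    old = "alpha beta gamma delta one two three four".split()
    new = "one two three four alpha beta gamma delta".split()
    for old_text, new_text, ratio in find_moved_blocks(old, new):
        print("moved:", old_text, "->", new_text, "ratio", ratio)
```

A threshold below 1.0 lets a block count as moved even when a few words changed along the way.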
Dependencies:
* pyPdf
* reportlab.pdfgen
* reportlab.lib.colors
* pygame.font
* poppler-utils for pdftohtml
Packages
========
DEB and RPM packages are built at
https://build.opensuse.org/package/show/Documentation:Tools/pdfcompare
Direct downloads from the openSUSE Build Service are available at
http://software.opensuse.org/download.html?project=Documentation%3ATools&package=pdfcompare
Stable releases are published on GitHub:
https://github.com/jnweiger/pdfcompare/releases
Example Usage and tips
======================
Starting with two very differently formatted PDF files, we want to see the textual difference between the
GPL-3.0 and AGPL-3.0 licenses. Most of the text is identical, except for the preamble, Paragraph 13 and the footer.
We first generate an unusually formatted (not obviously LaTeX output) version of the GPL-3.0 text:
  wget http://www.tp-link.de/resources/document/GPL%20License%20Terms.pdf -O all-gpl-tplink.pdf
  pdftk all-gpl-tplink.pdf cat 18-26 output gpl-3.0-tplink.pdf
Then we download a PDF version of the GNU Affero General Public License:
  wget http://trac.frantovo.cz/sql-vyuka/export/29%3A4b6ab4ba1a95/licence/agpl-3.0.pdf
Now we produce an output that reflects the AGPL contents and layout, with color highlights added.
Green marks text that is in AGPL-3.0 but was not in GPL-3.0.
Red marks the gaps where text was removed in AGPL-3.0. Most PDF viewers show an annotation popup when the mouse hovers over
a colored mark. For red marks, the annotation popup contains the word 'del: ' and the deleted text.
Yellow marks changed text. The annotation popup contains the word 'chg:' and the original text.
  pdfcompare gpl-3.0-tplink.pdf agpl-3.0.pdf -o gpl-agpl-diff.pdf
  xdg-open gpl-agpl-diff.pdf
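
To make the color scheme concrete, here is a minimal Python sketch that maps the opcodes of a word-level diff to the three mark types described above. It is not taken from pdfcompare: the function name classify_changes, the tuple layout, and the exact popup strings are assumptions for illustration.

```python
import difflib


def classify_changes(old_words, new_words):
    """Return (color, new_text, popup) tuples for a word-level diff.

    green  = text added in the new document, no popup
    red    = gap where text was removed, popup holds 'del: ' + deleted text
    yellow = changed text, popup holds 'chg: ' + original text
    (Hypothetical helper; the popup format is an assumption.)
    """
    marks = []
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_text = " ".join(old_words[i1:i2])
        new_text = " ".join(new_words[j1:j2])
        if tag == "insert":
            marks.append(("green", new_text, None))
        elif tag == "delete":
            marks.append(("red", "", "del: " + old_text))
        elif tag == "replace":
            marks.append(("yellow", new_text, "chg: " + old_text))
        # 'equal' runs are unchanged and get no mark.
    return marks


if __name__ == "__main__":
    old = "the quick brown fox".split()
    new = "the slow brown fox jumps".split()
    for color, text, popup in classify_changes(old, new):
        print(color, repr(text), popup)
```

Deleted text has no place in the new layout, so the red entry carries an empty new_text and keeps the removed words only in the popup.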
Pdfcompare also features text search and spellchecking (via hunspell). Search hits are marked in pink; spelling errors are underlined in pink. If you get excessive spelling errors, try switching the language with env DICTIONARY=de_DE, or consult the hunspell documentation.
The option --margin 0,0,0,240 can be used with the two documents from this example to ignore the page numbers introduced in agpl-3.0.pdf. With this option, a gray bar covers the page number, and it is not marked as a change.