name |
pdf2xml |
short description |
Converts PDF files to XML. The script relies heavily on Apache Tika and pdftotext for text extraction and conversion to XML, and tries to combine information from both tools and from different conversion modes. |
software category |
scraping documents |
developer |
Jörg Tiedemann |
maintainer |
Jörg Tiedemann |
current version |
None |
last changed |
None |
programming language(s) |
Perl, Java |
operating system(s) |
|
license |
GPL-3.0 |
costs |
0 |
language |
|
architecture |
library |
web-links |
supported methods |
|
additional features |