.general data
|===
| name | pdf2xml
| short description | Converts PDF files to XML. The script relies heavily on Apache Tika and pdftotext for text extraction and conversion to XML, and it tries to combine information from both tools and from different conversion modes.
| software category | scraping documents
| developer | Jörg Tiedemann
| maintainer | Jörg Tiedemann
| current version | None
| last changed | None
| programming language(s) | Perl, Java
| operating system(s) |
| license | GPL-3.0
| costs | 0
| language |
| architecture | library
| web-links | link:https://bitbucket.org/tiedemann/pdf2xml/src/master/[repository]
|===
.features
|===
| supported methods |
| additional features |
|===