splitted tools authored by Gallenkamp, Fabian
@@ -2,7 +2,7 @@
|===
| name | boilerpipeR
| short description | Generic extraction of the main text content from HTML files; removes ads, sidebars, and headers using the boilerpipe Java library (http://code.google.com/p/boilerpipe/). The boilerpipe extraction heuristics perform robustly across a wide range of website templates.
| software category | data wrangling
| software category | scraping websites
| developer | C. Kohlschütter, P. Fankhauser, W. Nejdl
| maintainer | Mario Annau
| current version | None