|
|
:sectnums:
|
|
|
:sectnumlevels: 8
|
|
|
|
|
|
== stand-alone applications
|
|
|
|
|
|
== user-consented tracking
|
|
|
_Collection of sensor data on (mobile) devices in accordance with data protection laws._
|
|
|
=== scraping
|
|
|
_Tools in the area of web-scraping_
|
|
|
|
|
|
link:Tool_AWARE[AWARE] (link:http://www.awareframework.com/[website] link:https://github.com/denzilferreira/aware-client[repository-android] link:https://github.com/tetujin/aware-client-ios[repository-iOS] link:https://github.com/tetujin/aware-client-osx[repository-OSX] link:https://github.com/denzilferreira/aware-server[repository-server] ):: 'AWARE is an Android framework dedicated to instrument, infer, log and share mobile context information, for application developers, researchers and smartphone users. AWARE captures hardware-, software-, and human-based data. The data is then analyzed using AWARE plugins. They transform data into information you can understand.' link:http://www.awareframework.com/what-is-aware/[Source, visited: 27.02.2019] < | Apache-2.0 | framework | Java | >
|
|
|
link:Tool_facepager[facepager] (link:https://github.com/strohne/Facepager[wiki] link:https://github.com/strohne/Facepager[repository] ):: < | MIT | stand-alone application | Python | >
|
|
|
|
|
|
link:Tool_MEILI[MEILI] (link:http://adrianprelipcean.github.io/[website-dev] link:https://github.com/Badger-MEILI[repository-group] ):: < | GPL-3.0 | framework | Java | >
|
|
|
=== tools for corpus linguistics/text mining/(semi-)automated text analysis
|
|
|
_Integrated platforms for corpus analysis and processing._
|
|
|
|
|
|
link:Tool_PassiveDataKit[Passive Data Kit] (link:https://passivedatakit.org/[website] link:https://github.com/audaciouscode/PassiveDataKit-Django[repository-djangoserver] link:https://github.com/audaciouscode/PassiveDataKit-Android[repository-android] link:https://github.com/audaciouscode/PassiveDataKit-iOS[repository-iOS] ):: < | Apache-2.0 | framework | Python, Java | english>
|
|
|
link:Tool_CorpusExplorer[CorpusExplorer] (link:https://notes.jan-oliver-ruediger.de/software/corpusexplorer-overview/[website] link:https://github.com/notesjor/corpusexplorer2.0[repository] ):: 'Open-source software for corpus linguists and anyone interested in text and data mining. The CorpusExplorer combines more than 50 interactive analysis options with simple operation. Routine tasks such as text acquisition, tagging, or the graphical preparation of results are fully automated. The straightforward handling facilitates use in university teaching and leads to fast and substantive results. The CorpusExplorer is open to many standards (XML, CSV, JSON, R, and many more) and also offers its own software development kit (SDK), which makes it possible to integrate all of its functions into your own programs.' (translated from the German original) link:https://notes.jan-oliver-ruediger.de/software/corpusexplorer-overview/[source, retrieved 22.03.2019] < Q12019 | AGPL-3.0 | stand-alone application | C# | german>
|
|
|
|
|
|
link:Tool_WebHistorian(CE)[Web Historian(CE)] (link:https://doi.org/10.5281/zenodo.1322782[website-doi] link:http://www.webhistorian.org[website] link:https://github.com/WebHistorian/community[repository] ):: Chrome browser extension designed to integrate web browsing history data collection into research projects collecting other types of data from participants (e.g. surveys, in-depth interviews, experiments). It uses client-side D3 visualizations to inform participants about the data being collected during the informed consent process. It allows participants to delete specific browsing data or opt-out of browsing data collection. It directs participants to an online survey once they have reviewed their data and made a choice of whether to participate. It has been used with Qualtrics surveys, but any survey that accepts data from a URL will work. It works with the open source Passive Data Kit (PDK) as the backend for data collection. To successfully upload, you need to fill in the address of your PDK server in the js/app/config.js file. < e06b3e174f9668f5c62f30a9bedde223023e0bca | GPL-3.0 | plugin | Javascript | english>
|
|
|
link:Tool_COSMOS[COSMOS] (link:http://socialdatalab.net/COSMOS[website] ):: COSMOS Open Data Analytics software < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
=== computer assisted/aided qualitative data analysis software (CAQDAS)
|
|
|
_Software that assists with qualitative research tasks such as transcription analysis, coding and text interpretation, recursive abstraction, content analysis, discourse analysis, and grounded theory methodology._
|
|
|
|
|
|
== scraping
|
|
|
_Tools in the area of web-scraping_
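At their core, the scraping tools listed here all automate one step: pulling structured data out of raw HTML. A minimal, purely illustrative sketch using only the Python standard library (the `LinkExtractor` class and the sample markup are invented for this example; real tools such as Scrapy add crawling, throttling, and export pipelines on top):

[source,python]
----
# Illustrative sketch: extract link targets from raw HTML with the stdlib.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href attribute of every <a> tag it encounters."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

def extract_links(html_text):
    parser = LinkExtractor()
    parser.feed(html_text)
    return parser.links

page = '<p><a href="https://example.org/a">A</a> <a href="/b">B</a></p>'
print(extract_links(page))  # ['https://example.org/a', '/b']
----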
|
|
|
link:Tool_ATLAS.ti[ATLAS.ti] (link:https://atlasti.com/de/produkt/what-is-atlas-ti/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_TWINT[TWINT] (link:https://twint.io/[website] link:https://github.com/twintproject/twint[repository] ):: TWINT (Twitter Intelligence Tool) 'Formerly known as Tweep, Twint is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API.' link:https://github.com/twintproject/twint[Retrieved 07.03.2019] < | MIT | package | Python | >
|
|
|
link:Tool_Leximancer[Leximancer] (link:https://info.leximancer.com/[website] ):: 'Leximancer automatically analyses your text documents to identify the high level concepts in your text documents, delivering the key ideas and actionable insights you need with powerful interactive visualisations and data exports.' < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_YouTubeComments[YouTubeComments] (link:https://osf.io/hqsxe/[website] link:https://github.com/JuKo007/YouTubeComments[repository] ):: 'This repository contains an R script as well as an interactive Jupyter notebook to demonstrate how to automatically collect, format, and explore YouTube comments, including the emojis they contain. The script and notebook showcase the following steps: getting access to the YouTube API; extracting comments for a video; formatting the comments & extracting emojis; basic sentiment analysis for text & emojis' link:https://github.com/JuKo007/YouTubeComments[Retrieved 07.03.2019] < | Unknown | package | R | >
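The emoji-extraction step mentioned above boils down to filtering characters by Unicode block. A simplified sketch (not the repository's actual code; the ranges below are a rough subset of the emoji blocks):

[source,python]
----
# Illustrative sketch: separate emoji code points from comment text.
def extract_emojis(text):
    """Return characters from a few main Unicode emoji blocks (simplified)."""
    ranges = [(0x1F300, 0x1F5FF), (0x1F600, 0x1F64F),
              (0x1F680, 0x1F6FF), (0x2600, 0x27BF)]
    return [ch for ch in text
            if any(lo <= ord(ch) <= hi for lo, hi in ranges)]

comment = "Great video 😀👍 thanks!"
print(extract_emojis(comment))  # ['😀', '👍']
----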
|
|
|
link:Tool_MAXQDA[MAXQDA] (link:https://www.rrz.uni-hamburg.de/services/software/alphabetisch/maxqda.html[website-uhh] ):: 'MAXQDA is one of the world's leading and most comprehensive QDA software programs for qualitative and mixed-methods research. The software helps you capture, organize, analyze, visualize, and publish your data. Whether grounded theory, literature review, exploratory market research, interviews, website analysis, or surveys: analyze what you want, how you want. MAXQDA Analytics Pro is the extended version of MAXQDA and, in addition to all functions for qualitative and mixed-methods research, includes a module for quantitative text analysis (MAXDictio) and a module for statistical data analysis (MAXQDA Stats).' (translated from the German original) link:https://www.rrz.uni-hamburg.de/services/software/alphabetisch/maxqda.html[Source, visited: 27.02.2019] < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
|
|
|
link:Tool_NVivo[NVivo] (link:https://www.qsrinternational.com/nvivo/who-uses-nvivo/academics[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_Scrapy[Scrapy] (link:https://scrapy.org/[website] link:https://github.com/scrapy/scrapy[repository] ):: < | BSD | package | Python | >
|
|
|
link:Tool_QDAMiner[QDAMiner] (link:https://provalisresearch.com/products/qualitative-data-analysis-software/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_RSelenium[RSelenium] (link:https://github.com/ropensci/RSelenium[repository] ):: < | AGPL-3.0 | package | R | >
|
|
|
link:Tool_ORAPro[ORA Pro] (link:http://netanomics.com/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_Quirkos[Quirkos] (link:https://www.quirkos.com/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
== tools for corpus linguistics/text mining/(semi-)automated text analysis
|
|
|
_Integrated platforms for corpus analysis and processing._
|
|
|
link:Tool_TAMS[TAMS] (link:https://sourceforge.net/projects/tamsys[website] ):: 'Text Analysis Markup System (TAMS) is both a system of marking documents for qualitative analysis and a series of tools for mining information based on that syntax.' < | GPL-2.0 | stand-alone application | | >
|
|
|
|
|
|
link:Tool_AmCAT[AmCAT] (link:http://vanatteveldt.com/amcat/[website-developer] link:https://github.com/amcat/amcat[repository] link:http://wiki.amcat.nl/3.4:AmCAT_Navigator_3[wiki] ):: 'The Amsterdam Content Analysis Toolkit (AmCAT) is an open source infrastructure that makes it easy to do large-scale automatic and manual content analysis (text analysis) for the social sciences and humanities.' < | AGPL-3.0 | SaaS | Python | >
|
|
|
=== natural language processing (NLP)
|
|
|
__
|
|
|
|
|
|
|
|
|
link:Tool_RapidMiner[RapidMiner] (link:https://rapidminer.com/[website] link:https://github.com/rapidminer/rapidminer-studio[repository] ):: < | AGPL-3.0 | stand-alone application | Java | >
|
|
|
|
|
|
link:Tool_CWB[CWB] (link:http://cwb.sourceforge.net/index.php[website] link:http://svn.code.sf.net/p/cwb/code/cwb/trunk[repository-cwb] link:http://svn.code.sf.net/p/cwb/code/gui/cqpweb/trunk[repository-cqpweb] ):: CWB, the IMS (Institut für Maschinelle Sprachverarbeitung, Stuttgart) Open Corpus Workbench, is 'a fast, powerful and extremely flexible corpus querying system.' < 3.4.15 | GPL-3.0 | framework | C, Perl | english>
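The core operation of a corpus query system such as CWB is the keyword-in-context (KWIC) concordance. A toy Python sketch of the idea, without CWB's indexing or query language (function name and sample corpus are invented):

[source,python]
----
# Illustrative sketch: a keyword-in-context (KWIC) concordance lookup.
def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each match."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, token, right))
    return hits

corpus = "the quick brown fox jumps over the lazy dog".split()
print(kwic(corpus, "the"))
----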
|
|
|
=== topic-models
|
|
|
__
|
|
|
|
|
|
link:Tool_LCM[LCM] (link:http://lcm.informatik.uni-leipzig.de/generic.html[website] ):: Leipzig Corpus Miner, a decentralized SaaS application for the analysis of very large amounts of news texts < | LGPL | framework | Java, R | >
|
|
|
link:Tool_TOME[TOME] (link:https://dhlab.lmc.gatech.edu/tome/[website] link:https://github.com/GeorgiaTechDHLab/TOME/[repository] ):: 'TOME is a tool to support the interactive exploration and visualization of text-based archives, supported by a Digital Humanities Startup Grant from the National Endowment for the Humanities (Lauren Klein and Jacob Eisenstein, co-PIs). Drawing upon the technique of topic modeling—a machine learning method for identifying the set of topics, or themes, in a document set—our tool allows humanities scholars to trace the evolution and circulation of these themes across networks and over time.' < | Unknown | stand-alone application | Python, Jupyter Notebook | >
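Topic-modelling tools of this kind start from a document-term matrix. A minimal illustration in Python of how such a matrix is built before any inference runs (the toy documents are invented):

[source,python]
----
# Illustrative sketch: build a document-term matrix, the standard input
# that topic-modelling tools construct before fitting a model.
def document_term_matrix(docs):
    vocab = sorted({w for doc in docs for w in doc.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for w in doc.lower().split():
            matrix[i][index[w]] += 1
    return vocab, matrix

docs = ["tax policy reform", "climate policy", "tax reform tax"]
vocab, dtm = document_term_matrix(docs)
print(vocab)  # ['climate', 'policy', 'reform', 'tax']
print(dtm)    # [[0, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 2]]
----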
|
|
|
|
|
|
link:Tool_iLCM[iLCM] (link:https://ilcm.informatik.uni-leipzig.de/[website] link:https://hub.docker.com/r/ckahmann/ilcm_r/tags[repository-docker] ):: 'The iLCM (LCM = Leipzig Corpus Miner) project pursues the development of an integrated research environment for the analysis of structured and unstructured data in a ‘Software as a Service’ architecture (SaaS). The research environment addresses requirements for the quantitative evaluation of large amounts of qualitative data using text mining methods and requirements for the reproducibility of data-driven research designs in the social sciences.' link:http://ilcm.informatik.uni-leipzig.de/ilcm/ilcm/[source, retrieved 08.03.2019] < 0.96 | LGPL | SaaS | Java, Python, R | german>
|
|
|
=== sentiment analysis
|
|
|
__
|
|
|
|
|
|
link:Tool_OpinionFinder[OpinionFinder] (link:https://mpqa.cs.pitt.edu/opinionfinder/[website] ):: 'OpinionFinder is a system that processes documents and automatically identifies subjective sentences as well as various aspects of subjectivity within sentences, including agents who are sources of opinion, direct subjective expressions and speech events, and sentiment expressions. OpinionFinder was developed by researchers at the University of Pittsburgh, Cornell University, and the University of Utah. In addition to OpinionFinder, we are also releasing the automatic annotations produced by running OpinionFinder on a subset of the Penn Treebank.' < | Unknown | stand-alone application | Java | >
|
|
|
|
|
|
== computer assisted/aided qualitative data analysis software (CAQDAS)
|
|
|
_Software that assists with qualitative research tasks such as transcription analysis, coding and text interpretation, recursive abstraction, content analysis, discourse analysis, and grounded theory methodology._
|
|
|
=== statistical software
|
|
|
_Software that helps with calculations based on specific statistical models._
|
|
|
|
|
|
|
|
|
link:Tool_gretl[gretl] (link:http://gretl.sourceforge.net/[website] link:https://sourceforge.net/p/gretl/git/ci/master/tree/[repository] ):: Is a cross-platform software package for econometric analysis < | GPL-3.0 | stand-alone application | C | >
|
|
|
|
|
|
|
|
|
link:Tool_SPSS[SPSS] (link:https://www.rrz.uni-hamburg.de/services/software/software-thematisch/statistik/spss-netzlizenz.html[website-uhh] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
|
|
|
link:Tool_STATA[STATA] (link:https://www.rrz.uni-hamburg.de/services/software/software-thematisch/statistik/stata.html[website-uhh] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
|
|
|
=== network analysis
|
|
|
_social network analysis_
|
|
|
|
|
|
|
|
|
link:Tool_AutoMap[AutoMap] (link:http://www.casos.cs.cmu.edu/projects/automap/software.php[website] ):: 'AutoMap enables the extraction of information from texts using Network Text Analysis methods. AutoMap supports the extraction of several types of data from unstructured documents. The type of information that can be extracted includes: content analytic data (words and frequencies), semantic network data (the network of concepts), meta-network data (the cross classification of concepts into their ontological category such as people, places and things and the connections among these classified concepts), and sentiment data (attitudes, beliefs). Extraction of each type of data assumes the previously listed type of data has been extracted.' < | Proprietary | stand-alone application | Java | >
|
|
|
|
|
|
|
|
|
link:Tool_NodeXL[NodeXL] (link:https://www.smrfoundation.org/nodexl/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
link:Tool_RQDA[RQDA] (link:http://rqda.r-forge.r-project.org/[website] link:https://github.com/Ronggui/RQDA[repository] ):: 'It includes a number of standard Computer-Aided Qualitative Data Analysis features. In addition it seamlessly integrates with R, which means that a) statistical analysis on the coding is possible, and b) functions for data manipulation and analysis can be easily extended by writing R functions. To some extent, RQDA and R make an integrated platform for both quantitative and qualitative data analysis.' < | BSD | package | R | >
|
|
|
link:Tool_Pajek[Pajek] (link:http://mrvar.fdv.uni-lj.si/pajek/[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
|
|
|
link:Tool_NetworkX[NetworkX] :: 'Data structures for graphs, digraphs, and multigraphs; many standard graph algorithms; network structure and analysis measures; generators for classic graphs, random graphs, and synthetic networks; nodes can be "anything" (e.g., text, images, XML records); edges can hold arbitrary data (e.g., weights, time-series); open source 3-clause BSD license; well tested with over 90% code coverage; additional benefits from Python include fast prototyping, ease of teaching, and multi-platform support.' < | BSD | package | Python | >
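As an illustration of the network measures such packages provide, here is a pure-Python sketch of degree centrality (NetworkX, UCINET, and Pajek implement this and many more measures; the edge list is invented):

[source,python]
----
# Illustrative sketch: degree centrality on an undirected edge list.
def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to."""
    nodes = {n for e in edges for n in e}
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    return {node: d / (n - 1) for node, d in degree.items()}

edges = [("ann", "bob"), ("ann", "cee"), ("bob", "cee"), ("ann", "dan")]
centrality = degree_centrality(edges)
print(centrality["ann"])  # 1.0 -- ann is connected to all three others
----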
|
|
|
|
|
|
link:Tool_UCINET[UCINET] (link:https://sites.google.com/site/ucinetsoftware/home[website] ):: 'UCINET 6 for Windows is a software package for the analysis of social network data. It was developed by Lin Freeman, Martin Everett and Steve Borgatti. It comes with the NetDraw network visualization tool.' < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
== natural language processing (NLP)
|
|
|
__
|
|
|
=== audio-transcriptions
|
|
|
_Software that converts speech into an electronic text document._
|
|
|
|
|
|
link:Tool_ApacheOpenNLP[Apache OpenNLP] (link:https://opennlp.apache.org/[website] ):: 'OpenNLP supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, language detection and coreference resolution.' < | Apache-2.0 | package | Java | >
|
|
|
link:Tool_f4analyse[f4analyse] (link:https://www.audiotranskription.de/f4-analyse[website] ):: < | Proprietary | stand-alone application | | >
|
|
|
|
|
|
link:Tool_GATE[GATE] (link:https://gate.ac.uk/overview.html[website] link:https://github.com/GateNLP/gate-core[repository] ):: GATE - General Architecture for Text Engineering < | LGPL | package | Java | >
|
|
|
link:Tool_EXMARaLDA[EXMARaLDA] (link:https://github.com/EXMARaLDA/exmaralda[repository] ):: 'EXMARaLDA is a system for computer-assisted work with (primarily) spoken-language corpora. It consists of a transcription and annotation editor (Partitur-Editor), a tool for managing corpora (Corpus-Manager), and a search and analysis tool (EXAKT).' (translated from the German original) < | Unknown | stand-alone application | Java | >
|
|
|
|
|
|
link:Tool_Gensim[Gensim] (link:https://pypi.org/project/gensim/[website] ):: 'Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.' < | LGPL | package | Python | >
|
|
|
=== agent-based modeling
|
|
|
__
|
|
|
|
|
|
link:Tool_NLTK[NLTK] (link:http://www.nltk.org/index.html[website] link:https://github.com/nltk/nltk[repository] ):: 'NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.' < | Apache-2.0 | package | Python | >
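A toy sketch of the first two steps NLTK supports, tokenization and frequency counting, using only the standard library (NLTK's own tokenizers and corpora are considerably more sophisticated; the sample sentence is invented):

[source,python]
----
# Illustrative sketch: naive tokenization and word-frequency counting.
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on runs of letters (a deliberately naive rule)."""
    return re.findall(r"[a-z]+", text.lower())

freq = Counter(tokenize("The cat sat on the mat. The mat was flat."))
print(freq.most_common(2))  # [('the', 3), ('mat', 2)]
----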
|
|
|
link:Tool_NetLogo[NetLogo] (link:http://ccl.northwestern.edu/netlogo/[website] link:https://github.com/NetLogo/NetLogo[repository] ):: 'NetLogo is a multi-agent programmable modeling environment. It is used by many tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations.' < | Unknown | stand-alone application | Java, Scala | >
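The agent-based modelling loop that NetLogo provides graphically can be sketched in a few lines: agents carry local state that is updated step by step. A hypothetical one-dimensional random walk (all names and parameters invented for the example):

[source,python]
----
# Illustrative sketch of an agent-based simulation loop:
# each agent updates its local state (position) once per time step.
import random

def random_walk(steps, n_agents=3, seed=42):
    rng = random.Random(seed)          # seeded for reproducibility
    positions = [0] * n_agents
    for _ in range(steps):
        positions = [p + rng.choice((-1, 1)) for p in positions]
    return positions

final = random_walk(100)
print(final)  # three even integers (100 steps of +-1 preserve parity)
----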
|
|
|
|
|
|
link:Tool_Pandas[Pandas] (link:http://pandas.pydata.org/[website] link:https://github.com/pandas-dev/pandas[repository] ):: < | BSD | package | Python | >
|
|
|
|
|
|
link:Tool_polmineR[polmineR] (link:https://cran.r-project.org/package=polmineR[website-cran] link:https://github.com/PolMine/polmineR[repository] ):: < | GPL-3.0 | package | R | >
|
|
|
== server applications
|
|
|
|
|
|
link:Tool_quanteda[quanteda] (link:https://quanteda.io/[website] link:https://github.com/quanteda/quanteda[repository] ):: 'The package is designed for R users needing to apply natural language processing to texts, from documents to final analysis. Its capabilities match or exceed those provided in many end-user software applications, many of which are expensive and not open source. The package is therefore of great benefit to researchers, students, and other analysts with fewer financial resources. While using quanteda requires R programming knowledge, its API is designed to enable powerful, efficient analysis with a minimum of steps. By emphasizing consistent design, furthermore, quanteda lowers the barriers to learning and using NLP and quantitative text analysis even for proficient R programmers.' < | GPL-3.0 | package | R | >
|
|
|
=== tools for corpus linguistics/text mining/(semi-)automated text analysis
|
|
|
_Integrated platforms for corpus analysis and processing._
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
link:Tool_spaCy[spaCy] (link:https://spacy.io/[website] link:https://github.com/explosion/spaCy[repository] ):: spaCy 'excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. Independent research has confirmed that spaCy is the fastest in the world. If your application needs to process entire web dumps, spaCy is the library you want to be using.' < | MIT | package | Cython | >
|
|
|
|
|
|
|
|
|
link:Tool_StanfordCoreNLP[Stanford CoreNLP] (link:https://stanfordnlp.github.io/CoreNLP/[website] link:https://github.com/stanfordnlp/CoreNLP[repository] ):: < | GPL-3.0 | framework | Java | >
|
|
|
|
|
|
|
|
|
link:Tool_tm[tm] (link:http://tm.r-forge.r-project.org/[website] link:https://cran.r-project.org/package=tm[website-cran] ):: < | GPL-3.0 | package | R | >
|
|
|
=== collaborative annotation
|
|
|
__
|
|
|
|
|
|
link:Tool_xtas[xtas] (link:http://nlesc.github.io/xtas/[website] link:https://github.com/NLeSC/xtas[repository] ):: the eXtensible Text Analysis Suite(xtas) 'is a collection of natural language processing and text mining tools, brought together in a single software package with built-in distributed computing and support for the Elasticsearch document store.' < | Apache-2.0 | framework | Python | >
|
|
|
link:Tool_CATMA[CATMA] (link:http://catma.de/[website] link:https://www.slm.uni-hamburg.de/germanistik/forschung/forschungsprojekte/catma.html[website-uhh] link:https://github.com/mpetris/catma[repository] ):: 'CATMA (Computer Assisted Text Markup and Analysis) is a practical and intuitive tool for text researchers. In CATMA users can combine the hermeneutic, ‘undogmatic’ and the digital, taxonomy based approach to text and corpora—as a single researcher, or in real-time collaboration with other team members.' < | Apache-2.0 | server application | Python | >
|
|
|
|
|
|
link:Tool_WebAnno[WebAnno] (link:https://webanno.github.io/webanno/[website] link:https://github.com/webanno/webanno[repository] ):: WebAnno is a multi-user tool supporting different roles such as annotator, curator, and project manager. The progress and quality of annotation projects can be monitored and measured in terms of inter-annotator agreement. Multiple annotation projects can be conducted in parallel. < | Apache-2.0 | server application | Python | >
|
|
|
|
|
|
== topic-models
|
|
|
=== collaborative writing
|
|
|
__
|
|
|
|
|
|
link:Tool_MALLET[MALLET] (link:http://mallet.cs.umass.edu/[website] link:https://github.com/mimno/Mallet[repository] ):: < | Apache-2.0 | package | Java | >
|
|
|
|
|
|
|
|
|
link:Tool_FidusWriter[FidusWriter] (link:https://fiduswriter.org[website] link:https://github.com/fiduswriter/fiduswriter[repository] ):: < | AGPL-3.0 | server application | Python, Javascript | >
|
|
|
|
|
|
link:Tool_Stm[Stm] (link:http://structuraltopicmodel.com[website] link:https://github.com/bstewart/stm[repository] ):: 'The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al (2014) <doi:10.1111/ajps.12103> and Roberts et al (2016) <doi:10.1080/01621459.2016.1141684>.' < | MIT | package | R | >
|
|
|
=== research data archiving
|
|
|
__
|
|
|
|
|
|
link:Tool_dataverse[dataverse] (link:http://dataverse.org/[website] link:https://github.com/IQSS/dataverse[repository] ):: < | Apache-2.0 | server application | Java | >
|
|
|
|
|
|
== sentiment analysis
|
|
|
=== online experiments
|
|
|
__
|
|
|
|
|
|
link:Tool_lexicoder[lexicoder] (link:http://www.lexicoder.com/index.html[website] ):: 'Lexicoder performs simple deductive content analyses of any kind of text, in almost any language. All that is required is the text itself, and a dictionary. Our own work initially focused on the analysis of newspaper stories during election campaigns, and both television and newspaper stories about public policy issues. The software can deal with almost any text, however, and lots of it. Our own databases typically include up to 100,000 news stories. Lexicoder processes these data, even with a relatively complicated coding dictionary, in about fifteen minutes. The software has, we hope, a wide range of applications in the social sciences. It is not the only software that conducts content analysis, of course - there are many packages out there, some of which are much more sophisticated than this one. The advantage to Lexicoder, however, is that it can run on any computer with a recent version of Java (PC or Mac), it is very simple to use, it can deal with huge bodies of data, it can be called from R as well as from the Command Line, and it's free.' < | Proprietary | package | Java | >
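The dictionary-based, deductive content analysis described above can be sketched in a few lines of Python (the categories and word lists are invented for the example; Lexicoder itself ships curated dictionaries and handles far larger corpora):

[source,python]
----
# Illustrative sketch: count how often each dictionary category's terms
# occur in a text -- the core of deductive dictionary content analysis.
import re
from collections import Counter

def dictionary_counts(text, dictionary):
    """Count occurrences of each category's terms in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for category, terms in dictionary.items():
        term_set = set(terms)
        counts[category] = sum(1 for t in tokens if t in term_set)
    return dict(counts)

toy_dictionary = {
    "positive": ["good", "excellent", "success"],
    "negative": ["bad", "failure", "crisis"],
}
text = "A good campaign despite the crisis; an excellent result, not a failure."
print(dictionary_counts(text, toy_dictionary))  # {'positive': 2, 'negative': 2}
----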
|
|
|
link:Tool_LIONESS[LIONESS] (link:https://lioness-lab.org/[website] ):: 'LIONESS Lab is a free web-based platform for online interactive experiments. It allows you to develop, test and conduct decision-making experiments with live feedback between participants. LIONESS experiments include a standardized set of methods to deal with the set of challenges arising when conducting interactive experiments online. These methods reflect current ‘best practices’ for, e.g., preventing participants from entering a session more than once, facilitating on-the-fly formation of interaction groups, reducing waiting times for participants, driving down attrition by retaining the attention of online participants and, importantly, adequate handling of cases in which participants drop out. With LIONESS Lab you can readily develop and test your experiments online in a user-friendly environment. You can develop experiments from scratch in a point-and-click fashion or start from an existent design from our growing repository and adjust it according to your own requirements.' link:https://lioness-lab.org/faq/[Retrieved 07.03.2019] < | Proprietary | server application | Javascript | >
|
|
|
|
|
|
|
|
|
link:Tool_nodeGame[nodeGame] (link:https://nodegame.org/[website] link:https://github.com/nodeGame[repository] ):: 'NodeGame is a free, open source JavaScript/HTML5 framework for conducting synchronous experiments online and in the lab directly in the browser window. It is specifically designed to support behavioral research along three dimensions: larger group sizes, real-time (but also discrete time) experiments, batches of simultaneous experiments.' < 4.2.1 | MIT | server application | Javascript | english>
|
|
|
|
|
|
link:Tool_Readme[Readme] (link:https://gking.harvard.edu/readme[website] ):: 'The ReadMe software package for R takes as input a set of text documents (such as speeches, blog posts, newspaper articles, judicial opinions, movie reviews, etc.), a categorization scheme chosen by the user (e.g., ordered positive to negative sentiment ratings, unordered policy topics, or any other mutually exclusive and exhaustive set of categories), and a small subset of text documents hand classified into the given categories.' < | CC BY-NC-ND-3.0 | package | R | >
|
|
|
link:Tool_Breadboard[Breadboard] (link:http://breadboard.yale.edu/[website] link:https://github.com/human-nature-lab/breadboard[repository] ):: 'Breadboard is a software platform for developing and conducting human interaction experiments on networks. It allows researchers to rapidly design experiments using a flexible domain-specific language and provides researchers with immediate access to a diverse pool of online participants.' link:http://breadboard.yale.edu/[Retrieved: 07.03.2019] < | Unknown | server application | Javascript | >
|
|
|
|
|
|
link:Tool_Empirica(beta)[Empirica(beta)] (link:https://empirica.ly/[website] link:https://github.com/empiricaly/meteor-empirica-core[repository] ):: 'Open source project to tackle the problem of long development cycles required to produce software to conduct multi-participant and real-time human experiments online.' link:https://github.com/empiricaly/meteor-empirica-core[Retrieved: 07.03.2019] < | MIT | server application | Javascript | >
|
|
|
|
|
|
== visualization
|
|
|
=== investigative journalism
|
|
|
__
|
|
|
|
|
|
link:Tool_Gephi[Gephi] (link:https://gephi.org/[website] link:https://github.com/gephi/gephi/[repository] ):: 'Gephi is an award-winning open-source platform for visualizing and manipulating large graphs.' < | GPL-3.0 | package | Java | >
|
|
|
link:Tool_DocumentCloud[DocumentCloud] (link:https://www.documentcloud.org/[website] link:https://github.com/documentcloud/documentcloud[repository] ):: 'DocumentCloud is a platform founded on the belief that if journalists were more open about their sourcing, the public would be more inclined to trust their reporting. The platform is a tool to help journalists share, analyze, annotate and, ultimately, publish source documents to the open web.' link:https://www.documentcloud.org/about[Source, visited: 04.03.2019] < | MIT | server application | Ruby | >
|
|
|
|
|
|
link:Tool_sigma.js[sigma.js] (link:http://sigmajs.org/[website] link:https://github.com/jacomyal/sigma.js[repository] ):: 'Sigma is a JavaScript library dedicated to graph drawing. It makes easy to publish networks on Web pages, and allows developers to integrate network exploration in rich Web applications.' < | MIT | package | Javascript | >
|
|
|
link:Tool_NEWSLEAK[NEW/S/LEAK] (link:http://newsleak.io/[website] link:https://github.com/uhh-lt/newsleak[repository] ):: new/s/leak ('Network of Searchable Leaks') supports investigative journalists in exploring large, unstructured document collections such as leaks: it combines natural language processing (e.g. named-entity and keyword extraction) with interactive visualization of entity networks. < | AGPL-3.0 | server application | Scala, Java | >
|
|
|
|
|
|
link:Tool_scikit-image[scikit-image] (link:https://scikit-image.org/[website] link:https://github.com/scikit-image/scikit-image[repository] ):: 'scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers.' < | BSD | package | Python | >
|
|
|
|
|
|
== programming-frameworks/libraries etc.
|
|
|
=== R
|
|
|
|
|
|
== collaborative annotation
|
|
|
__
|
|
|
==== scraping
|
|
|
_Tools in the area of web-scraping_
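
The extraction step these tools automate can be sketched with the Python standard library alone (a toy example on an inline HTML string; real scrapers add HTTP fetching, politeness delays, and error handling):

```python
# Toy sketch of the parsing/extraction step of web scraping, using only
# the standard library: collect all link targets from an HTML document.
# A real scraper would first fetch the page over HTTP.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

parser = LinkCollector()
parser.feed('<p><a href="/news">news</a> and <a href="/about">about</a></p>')
print(parser.links)  # ['/news', '/about']
```

The libraries listed below replace this hand-rolled parser with robust HTML handling, CSS/XPath selectors, and crawling infrastructure.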
|
|
|
|
|
|
link:Tool_CATMA[CATMA] (link:http://catma.de/[website] link:https://www.slm.uni-hamburg.de/germanistik/forschung/forschungsprojekte/catma.html[website-uhh] link:https://github.com/mpetris/catma[repository] ):: 'CATMA (Computer Assisted Text Markup and Analysis) is a practical and intuitive tool for text researchers. In CATMA users can combine the hermeneutic, ‘undogmatic’ and the digital, taxonomy based approach to text and corpora—as a single researcher, or in real-time collaboration with other team members.' < | Apache-2.0 | package | Python | >
|
|
|
link:Tool_YouTubeComments[YouTubeComments] (link:https://osf.io/hqsxe/[website] link:https://github.com/JuKo007/YouTubeComments[repository] ):: 'This repository contains an R script as well as an interactive Jupyter notebook to demonstrate how to automatically collect, format, and explore YouTube comments, including the emojis they contain. The script and notebook showcase the following steps: Getting access to the YouTube API Extracting comments for a video Formatting the comments & extracting emojis Basic sentiment analysis for text & emojis' link:https://github.com/JuKo007/YouTubeComments[Retrieved 07.03.2019] < | Unknown | library | R | >
|
|
|
|
|
|
link:Tool_WebAnno[WebAnno] (link:https://webanno.github.io/webanno/[website] link:https://github.com/webanno/webanno[repository] ):: WebAnno is a multi-user tool supporting different roles such as annotator, curator, and project manager. The progress and quality of annotation projects can be monitored and measured in terms of inter-annotator agreement. Multiple annotation projects can be conducted in parallel. < | Apache-2.0 | package | Python | >
|
|
|
link:Tool_RSelenium[RSelenium] (link:https://github.com/ropensci/RSelenium[repository] ):: < | AGPL-3.0 | library | R | >
|
|
|
|
|
|
link:Tool_rvest[rvest] (link:https://cran.r-project.org/web/packages/rvest/index.html[website-cran] link:https://github.com/tidyverse/rvest[repository] ):: < | GPL-3.0 | library | R | >
|
|
|
|
|
|
== collaborative writing
|
|
|
__
|
|
|
==== tools for corpus linguistics/text mining/(semi-)automated text analysis
|
|
|
_Integrated platforms for corpus analysis and processing._
|
|
|
|
|
|
link:Tool_LCM[LCM] (link:http://lcm.informatik.uni-leipzig.de/generic.html[website] ):: Leipzig Corpus Miner, a decentralized SaaS application for the analysis of very large amounts of news texts < | LGPL | framework | Java, R | >
|
|
|
|
|
|
link:Tool_FidusWriter[FidusWriter] (link:https://fiduswriter.org[website] link:https://github.com/fiduswriter/fiduswriter[repository] ):: < | AGPL-3.0 | package | Python, Javascript | >
|
|
|
==== computer assisted/aided qualitative data analysis software (CAQDAS)
|
|
|
_assist with qualitative research such as transcription analysis, coding and text interpretation, recursive abstraction, content analysis, discourse analysis, grounded theory methodology, etc._
|
|
|
|
|
|
link:Tool_RQDA[RQDA] (link:http://rqda.r-forge.r-project.org/[website] link:https://github.com/Ronggui/RQDA[repository] ):: 'It includes a number of standard Computer-Aided Qualitative Data Analysis features. In addition it seamlessly integrates with R, which means that a) statistical analysis on the coding is possible, and b) functions for data manipulation and analysis can be easily extended by writing R functions. To some extent, RQDA and R make an integrated platform for both quantitative and qualitative data analysis.' < | BSD | library | R | >
|
|
|
|
|
|
== research data archiving
|
|
|
==== natural language processing (NLP)
|
|
|
__
|
|
|
|
|
|
link:Tool_dataverse[dataverse] (link:http://dataverse.org/[website] link:https://github.com/IQSS/dataverse[repository] ):: < | Apache-2.0 | framework | Java | >
|
|
|
link:Tool_polmineR[polmineR] (link:https://cran.r-project.org/package=polmineR[website-cran] link:https://github.com/PolMine/polmineR[repository] ):: < | GPL-3.0 | library | R | >
|
|
|
|
|
|
link:Tool_quanteda[quanteda] (link:https://quanteda.io/[website] link:https://github.com/quanteda/quanteda[repository] ):: 'The package is designed for R users needing to apply natural language processing to texts, from documents to final analysis. Its capabilities match or exceed those provided in many end-user software applications, many of which are expensive and not open source. The package is therefore of great benefit to researchers, students, and other analysts with fewer financial resources. While using quanteda requires R programming knowledge, its API is designed to enable powerful, efficient analysis with a minimum of steps. By emphasizing consistent design, furthermore, quanteda lowers the barriers to learning and using NLP and quantitative text analysis even for proficient R programmers.' < | GPL-3.0 | library | R | >
|
|
|
|
|
|
== statistical software
|
|
|
_software that supports estimation of specific statistical models_
|
|
|
link:Tool_tm[tm] (link:http://tm.r-forge.r-project.org/[website] link:https://cran.r-project.org/package=tm[website-cran] link:[repository] ):: < | GPL-3.0 | library | R | >
|
|
|
|
|
|
link:Tool_gretl[gretl] (link:http://gretl.sourceforge.net/[website] link:https://sourceforge.net/p/gretl/git/ci/master/tree/[repository] ):: A cross-platform software package for econometric analysis < | GPL-3.0 | package | C | >
|
|
|
==== topic-models
|
|
|
__
|
|
|
|
|
|
link:Tool_MLwiN[MLwiN] (link:http://www.bristol.ac.uk/cmm/software/mlwin/[website] link:[repository] ):: < | Proprietary | package | | >
|
|
|
link:Tool_Stm[Stm] (link:http://structuraltopicmodel.com[website] link:https://github.com/bstewart/stm[repository] ):: 'The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al (2014) <doi:10.1111/ajps.12103> and Roberts et al (2016) <doi:10.1080/01621459.2016.1141684>.' < | MIT | library | R | >
|
|
|
|
|
|
link:Tool_SPSS[SPSS] (link:https://www.rrz.uni-hamburg.de/services/software/software-thematisch/statistik/spss-netzlizenz.html[website-uhh] ):: < | Proprietary | package | | >
|
|
|
==== sentiment analysis
|
|
|
__
|
|
|
|
|
|
link:Tool_Readme[Readme] (link:https://gking.harvard.edu/readme[website] ):: 'The ReadMe software package for R takes as input a set of text documents (such as speeches, blog posts, newspaper articles, judicial opinions, movie reviews, etc.), a categorization scheme chosen by the user (e.g., ordered positive to negative sentiment ratings, unordered policy topics, or any other mutually exclusive and exhaustive set of categories), and a small subset of text documents hand classified into the given categories.' < | CC BY-NC-ND-3.0 | library | R | >
|
|
|
|
|
|
link:Tool_STATA[STATA] (link:https://www.rrz.uni-hamburg.de/services/software/software-thematisch/statistik/stata.html[website-uhh] ):: < | Proprietary | package | | >
|
|
|
==== nowcasting
|
|
|
__
|
|
|
|
|
|
link:Tool_Nowcasting[Nowcasting] (link:https://cran.r-project.org/package=nowcasting[website-cran] link:https://github.com/nmecsys/nowcasting[repository] ):: < | GPL-3.0 | library | R | >
|
|
|
|
|
|
== nowcasting
|
|
|
==== miscellaneous
|
|
|
__
|
|
|
|
|
|
link:Tool_Nowcasting[Nowcasting] (link:https://cran.r-project.org/package=nowcasting[website-cran] link:https://github.com/nmecsys/nowcasting[repository] ):: < | GPL-3.0 | package | R | >
|
|
|
link:Tool_spades[spades] (link:http://spades.predictiveecology.org/[website] link:[repository] ):: < | None | library | R | >
|
|
|
|
|
|
=== Python
|
|
|
|
|
|
== network analysis
|
|
|
_social network analysis_
|
|
|
==== user-consented tracking
|
|
|
_Collection of sensor data on (mobile) devices in accordance with data protection laws._
|
|
|
|
|
|
link:Tool_AutoMap[AutoMap] (link:http://www.casos.cs.cmu.edu/projects/automap/software.php[website] ):: 'AutoMap enables the extraction of information from texts using Network Text Analysis methods. AutoMap supports the extraction of several types of data from unstructured documents. The type of information that can be extracted includes: content analytic data (words and frequencies), semantic network data (the network of concepts), meta-network data (the cross classification of concepts into their ontological category such as people, places and things and the connections among these classified concepts), and sentiment data (attitudes, beliefs). Extraction of each type of data assumes the previously listed type of data has been extracted.' < | Proprietary | package | Java | >
|
|
|
link:Tool_PassiveDataKit[Passive Data Kit] (link:https://passivedatakit.org/[website] link:https://github.com/audaciouscode/PassiveDataKit-Django[repository-djangoserver] link:https://github.com/audaciouscode/PassiveDataKit-Android[repository-android] link:https://github.com/audaciouscode/PassiveDataKit-iOS[repository-iOS] ):: < | Apache-2.0 | framework | Python, Java | english>
|
|
|
|
|
|
link:Tool_NodeXL[NodeXL] (link:https://www.smrfoundation.org/nodexl/[website] ):: < | Proprietary | package | | >
|
|
|
==== scraping
|
|
|
_Tools in the area of web-scraping_
|
|
|
|
|
|
link:Tool_ORAPro[ORA Pro] (link:[website] link:[repository] ):: < | Proprietary | package | | >
|
|
|
link:Tool_TWINT[TWINT] (link:https://twint.io/[website] link:https://github.com/twintproject/twint[repository] ):: TWINT (Twitter Intelligence Tool) 'Formerly known as Tweep, Twint is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API.' link:https://github.com/twintproject/twint[Retrieved 07.03.2019] < | MIT | library | Python | >
|
|
|
|
|
|
link:Tool_Pajek[Pajek] (link:http://mrvar.fdv.uni-lj.si/pajek/[website] ):: < | Proprietary | package | | >
|
|
|
link:Tool_Scrapy[Scrapy] (link:https://scrapy.org/[website] link:https://github.com/scrapy/scrapy[repository] ):: < | BSD | framework | Python | >
|
|
|
|
|
|
link:Tool_NetworkX[NetworkX] (link:https://networkx.org/[website] link:https://github.com/networkx/networkx[repository] ):: 'Data structures for graphs, digraphs, and multigraphs; many standard graph algorithms; network structure and analysis measures; generators for classic graphs, random graphs, and synthetic networks; nodes can be "anything" (e.g., text, images, XML records); edges can hold arbitrary data (e.g., weights, time-series); open source 3-clause BSD license; well tested with over 90% code coverage; additional benefits from Python include fast prototyping, easy to teach, and multi-platform.' < | BSD | package | Python | >
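
A minimal sketch of the API (assuming the networkx package is installed; the graph is invented):

```python
# Minimal sketch: build a small undirected graph and compute
# standard network measures with networkx.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])

print(G.number_of_nodes())            # 4
print(nx.degree_centrality(G)["C"])   # 1.0 -- C is tied to every other node
print(nx.shortest_path(G, "A", "D"))  # ['A', 'C', 'D']
```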
|
|
|
link:Tool_BeautifulSoup[Beautiful Soup] (link:https://www.crummy.com/software/BeautifulSoup/[website] link:https://code.launchpad.net/beautifulsoup[repository] ):: 'Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping. Three features make it powerful: Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn't take much code to write an application; Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don't have to think about encodings, unless the document doesn't specify an encoding and Beautiful Soup can't detect one. Then you just have to specify the original encoding; Beautiful Soup sits on top of popular Python parsers like lxml and html5lib, allowing you to try out different parsing strategies or trade speed for flexibility.' link:https://www.crummy.com/software/BeautifulSoup/[Retrieved 22.03.2019] < | MIT | library | Python | >
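
A minimal sketch of typical usage (assuming the beautifulsoup4 package is installed; the HTML snippet is invented):

```python
# Minimal sketch: parse an HTML snippet with Beautiful Soup and
# extract the heading text and all link targets.
from bs4 import BeautifulSoup

html = '<html><body><h1>Example</h1><a href="/a">first</a> <a href="/b">second</a></body></html>'

soup = BeautifulSoup(html, "html.parser")
print(soup.h1.get_text())                       # Example
print([a["href"] for a in soup.find_all("a")])  # ['/a', '/b']
```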
|
|
|
|
|
|
link:Tool_UCINET[UCINET] (link:https://sites.google.com/site/ucinetsoftware/home[website] ):: 'UCINET 6 for Windows is a software package for the analysis of social network data. It was developed by Lin Freeman, Martin Everett and Steve Borgatti. It comes with the NetDraw network visualization tool.' < | Proprietary | package | | >
|
|
|
link:Tool_Robobrowser[Robobrowser] (link:https://robobrowser.readthedocs.io/en/latest/readme.html[website] link:https://github.com/jmcarp/robobrowser[repository] ):: < | MIT | library | Python | >
|
|
|
|
|
|
==== natural language processing (NLP)
|
|
|
__
|
|
|
|
|
|
== search
|
|
|
_information retrieval in large datasets._
|
|
|
link:Tool_Gensim[Gensim] (link:https://pypi.org/project/gensim/[website] link:[repository] ):: 'Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.' < | LGPL | library | Python | >
|
|
|
|
|
|
link:Tool_LuceneSolr[LuceneSolr] (link:http://lucene.apache.org/solr/[website] link:[repository] ):: < | Apache-2.0 | package | | >
|
|
|
link:Tool_NLTK[NLTK] (link:http://www.nltk.org/index.html[website] link:https://github.com/nltk/nltk[repository] ):: 'NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.' < | Apache-2.0 | framework | Python | >
|
|
|
|
|
|
link:Tool_Pandas[Pandas] (link:http://pandas.pydata.org/[website] link:https://github.com/pandas-dev/pandas[repository] ):: < | BSD | library | Python | >
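
A minimal sketch of what the library provides (assuming pandas is installed; the data are invented):

```python
# Minimal sketch: build a small table with pandas, aggregate by group,
# and compute a summary statistic.
import pandas as pd

df = pd.DataFrame({
    "party": ["A", "A", "B"],
    "votes": [10, 30, 20],
})

totals = df.groupby("party")["votes"].sum()
print(int(totals["A"]))    # 40
print(df["votes"].mean())  # 20.0
```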
|
|
|
|
|
|
== ESM/EMA surveys
|
|
|
_Data collection in a 'natural' environment._
|
|
|
link:Tool_xtas[xtas] (link:http://nlesc.github.io/xtas/[website] link:https://github.com/NLeSC/xtas[repository] ):: The eXtensible Text Analysis Suite (xtas) 'is a collection of natural language processing and text mining tools, brought together in a single software package with built-in distributed computing and support for the Elasticsearch document store.' < | Apache-2.0 | framework | Python | >
|
|
|
|
|
|
link:Tool_paco[paco] (link:https://www.pacoapp.com/[website] link:https://github.com/google/paco[repository] ):: < | Apache-2.0 | framework | Objective-C, Java | >
|
|
|
==== visualization
|
|
|
__
|
|
|
|
|
|
link:Tool_scikit-image[scikit-image] (link:https://scikit-image.org/[website] link:https://github.com/scikit-image/scikit-image[repository] ):: 'scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers.' < | BSD | library | Python | >
|
|
|
|
|
|
== audio-transcriptions
|
|
|
_software that converts speech into electronic text document._
|
|
|
==== optical character recognition (OCR)
|
|
|
_OCR is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text._
|
|
|
|
|
|
link:Tool_f4analyse[f4analyse] (link:https://www.audiotranskription.de/f4-analyse[website] ):: < | Proprietary | standalone | | >
|
|
|
link:Tool_tesseract[tesseract] (link:https://github.com/tesseract-ocr/tesseract[repository] ):: 'Tesseract is an open source text recognizer (OCR) Engine, available under the Apache 2.0 license. It can be used directly, or (for programmers) using an API to extract printed text from images. It supports a wide variety of languages.' < | Apache-2.0 | library | Python | >
|
|
|
|
|
|
link:Tool_EXMARaLDA[EXMARaLDA] (link:https://exmaralda.org/[website] link:https://github.com/EXMARaLDA/exmaralda[repository] ):: 'EXMARaLDA is a system for computer-assisted work with (primarily) spoken corpora. It consists of a transcription and annotation editor (Partitur-Editor), a tool for managing corpora (Corpus-Manager), and a query and analysis tool (EXAKT).' < | Unknown | framework | Java | >
|
|
|
==== miscellaneous
|
|
|
__
|
|
|
|
|
|
link:Tool_scikit-learn[scikit-learn] (link:https://scikit-learn.org/stable/index.html[website] link:https://github.com/scikit-learn/scikit-learn[repository] ):: 'Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.' < | BSD | library | Python | >
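
A minimal sketch of the fit/predict workflow (assuming scikit-learn is installed; the toy data are invented):

```python
# Minimal sketch: fit a scikit-learn classifier on tiny invented data
# and predict labels for new points.
from sklearn.linear_model import LogisticRegression

X = [[0.0], [0.2], [0.8], [1.0]]  # one numeric feature per sample
y = [0, 0, 1, 1]                  # two linearly separable classes

clf = LogisticRegression().fit(X, y)
predictions = clf.predict([[0.1], [0.9]]).tolist()
print(predictions)  # [0, 1]
```

All estimators in the library share this fit/predict (or fit/transform) interface, which is why pipelines of preprocessing and modeling steps compose so easily.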
|
|
|
|
|
|
== optical character recognition (OCR)
|
|
|
_OCR is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text._
|
|
|
=== Others
|
|
|
|
|
|
link:Tool_tesseract[tesseract] (link:https://github.com/tesseract-ocr/tesseract[repository] ):: 'Tesseract is an open source text recognizer (OCR) Engine, available under the Apache 2.0 license. It can be used directly, or (for programmers) using an API to extract printed text from images. It supports a wide variety of languages.' < | Apache-2.0 | package | Python | >
|
|
|
==== user-consented tracking
|
|
|
_Collection of sensor data on (mobile) devices in accordance with data protection laws._
|
|
|
|
|
|
link:Tool_AWARE[AWARE] (link:http://www.awareframework.com/[website] link:https://github.com/denzilferreira/aware-client[repository-android] link:https://github.com/tetujin/aware-client-ios[repository-iOS] link:https://github.com/tetujin/aware-client-osx[repository-OSX] link:https://github.com/denzilferreira/aware-server[repository-server] ):: 'AWARE is an Android framework dedicated to instrument, infer, log and share mobile context information, for application developers, researchers and smartphone users. AWARE captures hardware-, software-, and human-based data. The data is then analyzed using AWARE plugins. They transform data into information you can understand.' link:http://www.awareframework.com/what-is-aware/[Source, visited: 27.02.2019] < | Apache-2.0 | framework | Java | >
|
|
|
|
|
|
link:Tool_MEILI[MEILI] (link:http://adrianprelipcean.github.io/[website-dev] link:https://github.com/Badger-MEILI[repository-group] ):: < | GPL-3.0 | framework | Java | >
|
|
|
|
|
|
link:Tool_WebHistorian(CE)[Web Historian(CE)] (link:https://doi.org/10.5281/zenodo.1322782[website-doi] link:http://www.webhistorian.org[website] link:https://github.com/WebHistorian/community[repository] ):: Chrome browser extension designed to integrate web browsing history data collection into research projects collecting other types of data from participants (e.g. surveys, in-depth interviews, experiments). It uses client-side D3 visualizations to inform participants about the data being collected during the informed consent process. It allows participants to delete specific browsing data or opt-out of browsing data collection. It directs participants to an online survey once they have reviewed their data and made a choice of whether to participate. It has been used with Qualtrics surveys, but any survey that accepts data from a URL will work. It works with the open source Passive Data Kit (PDK) as the backend for data collection. To successfully upload, you need to fill in the address of your PDK server in the js/app/config.js file. < e06b3e174f9668f5c62f30a9bedde223023e0bca | GPL-3.0 | plugin | Javascript | english>
|
|
|
|
|
|
== online experiments
|
|
|
==== natural language processing (NLP)
|
|
|
__
|
|
|
|
|
|
link:Tool_LIONESS[LIONESS] (link:https://lioness-lab.org/[website] ):: 'LIONESS Lab is a free web-based platform for online interactive experiments. It allows you to develop, test and conduct decision-making experiments with live feedback between participants. LIONESS experiments include a standardized set of methods to deal with the set of challenges arising when conducting interactive experiments online. These methods reflect current ‘best practices’ for, e.g., preventing participants to enter a session more than once, facilitating on-the-fly formation of interaction groups, reducing waiting times for participants, driving down attrition by retaining attention of online participants and, importantly, adequate handling of cases in which participants drop out. With LIONESS Lab you can readily develop and test your experiments online in a user-friendly environment. You can develop experiments from scratch in a point-and-click fashion or start from an existent design from our growing repository and adjust it according your own requirements.' link:https://lioness-lab.org/faq/[Retrieved 07.03.2019] < | Proprietary | package | Javascript | >
|
|
|
link:Tool_ApacheOpenNLP[Apache OpenNLP] (link:https://opennlp.apache.org/[website] link:[repository] ):: 'OpenNLP supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, language detection and coreference resolution.' < | Apache-2.0 | library | Java | >
|
|
|
|
|
|
link:Tool_nodeGame[nodeGame] (link:https://nodegame.org/[website] link:https://github.com/nodeGame[repository] ):: 'NodeGame is a free, open source JavaScript/HTML5 framework for conducting synchronous experiments online and in the lab directly in the browser window. It is specifically designed to support behavioral research along three dimensions: larger group sizes, real-time (but also discrete time) experiments, batches of simultaneous experiments.' < 4.2.1 | MIT | package | Javascript | english>
|
|
|
link:Tool_GATE[GATE] (link:https://gate.ac.uk/overview.html[website] link:https://github.com/GateNLP/gate-core[repository] ):: GATE - General Architecture for Text Engineering < | LGPL | framework | Java | >
|
|
|
|
|
|
link:Tool_Breadboard[Breadboard] (link:http://breadboard.yale.edu/[website] link:https://github.com/human-nature-lab/breadboard[repository] ):: 'Breadboard is a software platform for developing and conducting human interaction experiments on networks. It allows researchers to rapidly design experiments using a flexible domain-specific language and provides researchers with immediate access to a diverse pool of online participants.' link:http://breadboard.yale.edu/[Retrieved: 07.03.2019] < | Unknown | package | Javascript | >
|
|
|
link:Tool_spaCy[spaCy] (link:https://spacy.io/[website] link:https://github.com/explosion/spaCy[repository] ):: spaCy 'excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. Independent research has confirmed that spaCy is the fastest in the world. If your application needs to process entire web dumps, spaCy is the library you want to be using.' < | MIT | library | Cython | >
|
|
|
|
|
|
link:Tool_Empirica(beta)[Empirica(beta)] (link:https://empirica.ly/[website] link:https://github.com/empiricaly/meteor-empirica-core[repository] ):: 'Open source project to tackle the problem of long development cycles required to produce software to conduct multi-participant and real-time human experiments online.' link:https://github.com/empiricaly/meteor-empirica-core[Retrieved: 07.03.2019] < | MIT | package | Javascript | >
|
|
|
link:Tool_StanfordCoreNLP[Stanford CoreNLP] (link:https://stanfordnlp.github.io/CoreNLP/[website] link:https://github.com/stanfordnlp/CoreNLP[repository] ):: < | GPL-3.0 | library | Java | >
|
|
|
|
|
|
==== topic-models
|
|
|
__
|
|
|
|
|
|
== (remote) eye tracking
|
|
|
_None_
|
|
|
link:Tool_MALLET[MALLET] (link:http://mallet.cs.umass.edu/[website] link:https://github.com/mimno/Mallet[repository] ):: < | Apache-2.0 | library | Java | >
|
|
|
|
|
|
link:Tool_SearchGazer[SearchGazer] (link:https://webgazer.cs.brown.edu/search/[website] link:[repository] ):: SearchGazer: Webcam Eye Tracking for Remote Studies of Web Search < | MIT | package | Javascript | >
|
|
|
==== sentiment analysis
|
|
|
__
|
|
|
|
|
|
link:Tool_lexicoder[lexicoder] (link:http://www.lexicoder.com/index.html[website] ):: 'Lexicoder performs simple deductive content analyses of any kind of text, in almost any language. All that is required is the text itself, and a dictionary. Our own work initially focused on the analysis of newspaper stories during election campaigns, and both television and newspaper stories about public policy issues. The software can deal with almost any text, however, and lots of it. Our own databases typically include up to 100,000 news stories. Lexicoder processes these data, even with a relatively complicated coding dictionary, in about fifteen minutes. The software has, we hope, a wide range of applications in the social sciences. It is not the only software that conducts content analysis, of course - there are many packages out there, some of which are much more sophisticated than this one. The advantage to Lexicoder, however, is that it can run on any computer with a recent version of Java (PC or Mac), it is very simple to use, it can deal with huge bodies of data, it can be called from R as well as from the Command Line, and it's free.' < | Proprietary | library | Java | >
|
|
|
|
|
|
== agent-based modeling
|
|
|
==== visualization
|
|
|
__
|
|
|
|
|
|
link:Tool_NetLogo[NetLogo] (link:http://ccl.northwestern.edu/netlogo/[website] link:https://github.com/NetLogo/NetLogo[repository] ):: 'NetLogo is a multi-agent programmable modeling environment. It is used by many tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations.' < | Unknown | package | Java, Scala | >
|
|
|
link:Tool_Gephi[Gephi] (link:https://gephi.org/[website] link:https://github.com/gephi/gephi/[repository] ):: 'Gephi is an award-winning open-source platform for visualizing and manipulating large graphs.' < | GPL-3.0 | library | Java | >
|
|
|
|
|
|
link:Tool_sigma.js[sigma.js] (link:http://sigmajs.org/[website] link:https://github.com/jacomyal/sigma.js[repository] ):: 'Sigma is a JavaScript library dedicated to graph drawing. It makes easy to publish networks on Web pages, and allows developers to integrate network exploration in rich Web applications.' < | MIT | library | Javascript | >
|
|
|
|
|
|
== investigative journalism
|
|
|
__
|
|
|
==== statistical software
|
|
|
_software that supports estimation of specific statistical models_
|
|
|
|
|
|
link:Tool_DocumentCloud[DocumentCloud] (link:https://www.documentcloud.org/[website] link:https://github.com/documentcloud/documentcloud[repository] ):: 'DocumentCloud is a platform founded on the belief that if journalists were more open about their sourcing, the public would be more inclined to trust their reporting. The platform is a tool to help journalists share, analyze, annotate and, ultimately, publish source documents to the open web.' link:https://www.documentcloud.org/about[Source, visited: 04.03.2019] < | MIT | standalone | Ruby | >
|
|
|
link:Tool_MLwiN[MLwiN] (link:http://www.bristol.ac.uk/cmm/software/mlwin/[website] link:[repository] ):: < | Proprietary | library | | >
|
|
|
|
|
|
link:Tool_NEWSLEAK[NEW/S/LEAK] (link:http://newsleak.io/[website] link:https://github.com/uhh-lt/newsleak[repository] ):: new/s/leak ('Network of Searchable Leaks') supports investigative journalists in exploring large, unstructured document collections such as leaks: it combines natural language processing (e.g. named-entity and keyword extraction) with interactive visualization of entity networks. < | AGPL-3.0 | standalone | Scala, Java | >
|
|
|
==== search
|
|
|
_information retrieval in large datasets._
|
|
|
|
|
|
link:Tool_LuceneSolr[LuceneSolr] (link:http://lucene.apache.org/solr/[website] link:[repository] ):: < | Apache-2.0 | library | | >
|
|
|
|
|
|
== miscellaneous
|
|
|
__
|
|
|
==== ESM/EMA surveys
|
|
|
_Data collection in a 'natural' environment._
|
|
|
|
|
|
link:Tool_spades[spades] (link:http://spades.predictiveecology.org/[website] link:[repository] ):: < | None | package | R | >
|
|
|
link:Tool_paco[paco] (link:https://www.pacoapp.com/[website] link:https://github.com/google/paco[repository] ):: < | Apache-2.0 | framework | Objective-C, Java | >
|
|
|
|
|
|
link:Tool_scikit-learn[scikit-learn] (link:https://scikit-learn.org/stable/index.html[website] link:https://github.com/scikit-learn/scikit-learn[repository] ):: 'Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.' < | BSD | package | Python | >
|
|
|
==== (remote) eye tracking
|
|
|
_None_
|
|
|
|
|
|
link:Tool_SearchGazer[SearchGazer] (link:https://webgazer.cs.brown.edu/search/[website] link:[repository] ):: SearchGazer: Webcam Eye Tracking for Remote Studies of Web Search < | MIT | framework | Javascript | >