corpus-services

    Corpus Services

    The Corpus Services project bundles functionality used for maintenance, curation, conversion, and visualization of corpus data in various projects.
    Explore the docs »

    Report Bug · Request Feature

    About The Project

    The (HZSK) Corpus Services were initially developed at the Hamburg Centre for Language Corpora (HZSK) as a quality control and publication framework for EXMARaLDA corpora. Since then, most development work has been done within the INEL project, with a focus on making the code adaptable to other use cases and data types. The Corpus Services project now bundles functionality used for maintenance, curation, conversion, and visualization of corpus data in various projects.

    Getting Started

    Additional documentation on the Corpus Services can be found in the doc folder.

    You can also find some sample scripts (batch and shell) to use for calls to the corpus services jar and some further utilities here.

    Information and scripts useful for automating the use of corpus-services are also available here.

    Prerequisites

    Java needs to be installed.

    GitLab artifacts

    The latest compiled .jar file can be found here.

    Building

    To use the services for corpora, compile the project using mvn clean compile assembly:single (see Build with Maven), or use a pregenerated artifact from GitLab that you can download here.
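The build step above can be wrapped in a small guard, sketched below. The function name build_corpus_services is hypothetical and not part of the project. The sketch assumes only that Maven is on the PATH and that it is run from a directory containing the project's pom.xml.

```shell
# Hypothetical helper (not part of corpus-services): runs the documented
# Maven command after checking that the basic preconditions hold.
build_corpus_services() {
    # Maven must be installed and reachable.
    if ! command -v mvn >/dev/null 2>&1; then
        echo "mvn not found on PATH" >&2
        return 1
    fi
    # Assumption: the current directory is the repository root.
    if [ ! -f pom.xml ]; then
        echo "no pom.xml found in $(pwd)" >&2
        return 1
    fi
    # The build command given in this README.
    mvn clean compile assembly:single
}
```

Called from the repository root, the function runs the Maven command from this section. With Maven's default layout the assembled jar would be expected under target/, though the exact file name depends on the project configuration.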

    Usage

    The usable functions can be found in the help output:

    java -jar corpus-services-1.0.jar -h

    See How to use for the usage of the corpus services.
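As a minimal sketch of the call above, the following wrapper checks the Java prerequisite and the jar location before printing the help output. The function names and the CORPUS_SERVICES_JAR variable are hypothetical; only the java -jar ... -h call is taken from this README.

```shell
# Hypothetical default: the jar sits in the current directory unless
# CORPUS_SERVICES_JAR (an assumed variable name) points elsewhere.
JAR="${CORPUS_SERVICES_JAR:-corpus-services-1.0.jar}"

# Return 0 if java is on the PATH and the given jar file exists, 1 otherwise.
check_prerequisites() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH" >&2
        return 1
    fi
    if [ ! -f "$1" ]; then
        echo "jar not found: $1" >&2
        return 1
    fi
    return 0
}

# Print the help output listing the usable functions.
corpus_services_help() {
    check_prerequisites "$JAR" || return 1
    java -jar "$JAR" -h
}
```

Running corpus_services_help then either prints the help text or reports which prerequisite is missing.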

    Libraries

    Roadmap

    See the open issues for a list of proposed features (and known issues).

    Contributing

    Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions are greatly appreciated.

    1. Fork the Project
    2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
    3. Commit your Changes (git commit -m 'Add some AmazingFeature')
    4. Push to the Branch (git push origin feature/AmazingFeature)
    5. Open a Pull Request

    License

    Distributed under the MIT License. See LICENSE for more information.

    Authors

    Anne Ferger

    Hanna Hedeland

    Daniel Jettka

    Tommi Pirinen

    Contact

    Anne Ferger - @anneferger1 - anne.ferger@uni-hamburg.de

    Project Link: Corpus Services

    Metadata

    PID: http://hdl.handle.net/11022/0000-0007-D8A6-A

    Citation

    For an introduction to the system, please cite:

    Hedeland, H. & Ferger, A. (2020). Towards Continuous Quality Control for Spoken Language Corpora. International Journal of Digital Curation, 15(1). https://doi.org/10.2218/ijdc.v15i1.601

    Acknowledgements

    Contributions to the project have been made by staff from the HZSK and several research projects at the University of Hamburg: INEL, the BMBF-funded CLARIN-D project (01UG1620G), the project WO 1886/1-2 within the DFG LIS program, and the BMBF-funded project QUEST.

    Parts of this project have been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies’ Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies’ Programme is coordinated by the Union of the German Academies of Sciences and Humanities.

    Thank you to all funders, supporters and contributors!

    Logo created at LogoMakr.com. This README file was created on the basis of the Best-README-Template.