fcs-clarin-endpoint-hamburg

    Timofey Arkhangelskiy authored

    FCS Clarin endpoint

    Overview

    This is an endpoint for Federated Content Search (FCS).

    Many linguistic corpora are available online. They are hosted on different platforms and use a variety of query languages. FCS is a mechanism that lets you search multiple corpora at once, using simple text queries or a CQL-like language. This way, you can discover or compare corpora that may be useful for your research and then continue working with them directly. Searching is done through the Aggregator.

    An endpoint is a piece of software that serves as an intermediary between FCS and individual corpora. It translates the FCS requests into corpus-specific query languages, waits for the results, and then renders them in an XML format required by the FCS.
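
    The translate, search, render cycle described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy in-memory corpus, the `[word="..."]` target syntax, and the simplified `<results>` XML are hypothetical and are not this endpoint's actual code or the XML format required by the FCS specification.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Toy stand-in for a remote corpus (hypothetical data).
TOY_CORPUS = [
    "The cat sat on the mat",
    "A dog barked",
    "The cat and the dog slept",
]


def translate_query(fcs_query: str) -> str:
    """Turn a simple-text query into a made-up corpus-specific syntax.

    Words containing double quotes are not escaped in this sketch.
    """
    return " ".join('[word="{}"]'.format(w) for w in fcs_query.split())


def search_corpus(corpus_query: str) -> List[str]:
    """Pretend corpus backend: return sentences containing every queried word."""
    words = [w.lower() for w in re.findall(r'\[word="([^"]*)"\]', corpus_query)]
    hits = []
    for sentence in TOY_CORPUS:
        tokens = sentence.lower().split()
        if all(w in tokens for w in words):
            hits.append(sentence)
    return hits


def render_results(hits: List[str]) -> str:
    """Serialize hits as simplified XML (not the real FCS response schema)."""
    root = ET.Element("results", {"count": str(len(hits))})
    for hit in hits:
        ET.SubElement(root, "hit").text = hit
    return ET.tostring(root, encoding="unicode")


def handle_request(fcs_query: str) -> str:
    """Full cycle: translate the query, run it, render the response."""
    return render_results(search_corpus(translate_query(fcs_query)))


if __name__ == "__main__":
    print(handle_request("cat dog"))
```

    A real endpoint would replace `search_corpus` with a request to the corpus platform and `render_results` with the response format defined by the FCS specification.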

    Different corpus platforms or online databases require different endpoints. This endpoint works with the following platforms or resources:

    Documentation

    All documentation is available here.

    The CLARIN FCS specifications that this endpoint implements are available here.

    Requirements

    This software was tested on Ubuntu and Windows. Its dependencies are the following:

    • Python >= 3.8
    • Python modules: fastapi, uvicorn, lxml, Jinja2 (you can install them from requirements.txt)
    • It is recommended to deploy the endpoint through apache2 with wsgi, or through nginx
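
    Before deploying, it can be convenient to check that the requirements listed above are met. The following is a minimal sketch using only the standard library; the helper names `missing_modules` and `python_is_supported` are hypothetical and are not part of this repository. It assumes the import names `fastapi`, `uvicorn`, `lxml` and `jinja2` (the Jinja2 package is imported as `jinja2`).

```python
import importlib.util
import sys
from typing import List, Sequence

# Import names of the modules listed in the requirements above.
REQUIRED_MODULES = ["fastapi", "uvicorn", "lxml", "jinja2"]


def missing_modules(names: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level modules from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def python_is_supported(version_info=sys.version_info) -> bool:
    """Check the interpreter version against the documented minimum (3.8)."""
    return tuple(version_info[:2]) >= (3, 8)


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.8 or newer is required.")
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

    The check only confirms that the modules can be located; it does not verify their versions.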

    License

    The software is distributed under the CC BY license (see LICENSE).

    Funding

    The development of this software was funded by the Akademie der Wissenschaften in Hamburg.