Project description

The CLARIAH-DE team at the kick-off meeting in Mannheim on April 8, 2019 (Image: Dr. A. Trabold for the IDS Mannheim; License CC BY-SA 4.0)

 

Governance and Project Implementation in Work Packages

The implementation and development of CLARIAH-DE will be ensured through the interaction of strategic, operational and participatory bodies. Together, they strike a balance between the technical requirements of the infrastructure and the research-driven needs of scholarship. Overall coordination lies jointly with the Eberhard Karls University of Tübingen and the SUB Göttingen. The CLARIAH-DE network is advised by an Advisory Board.
Its members are:

The substantive work within CLARIAH-DE is divided into six work packages (German: Arbeitspaket, abbreviated to AP), which are carried out by teams of researchers from both partners.

Work Package 1 – Research Data, Standards and Procedures

Data handling is a shared concern of the two infrastructure projects CLARIN-D and DARIAH-DE, so it inevitably involves overlapping and complementary developments and thus offers potential for synergies. Therefore, since 2019, both infrastructures and their processes and standards have been consolidated within CLARIAH-DE. Building on earlier collaborations and coordinated activities, developments that had hitherto proceeded in parallel will become interoperable.

This will be made possible, among other things, by the harmonisation of standards and procedures for the creation, processing and archiving of data and tools within Work Package (WP) 1. The texts of the Digital Library in the TextGrid Repository (TGR) will be transformed into the basic format of the German Text Archive (Basisformat des Deutschen Textarchivs or DTABf). Thus, the annotations of both large text collections will be standardised in terms of format, which means that the collections of the TGR can also be explored using the tools developed within the DTA or by CLARIN-D.

While the DTABf is already established as a pivot format for text collections, especially for full texts of historical printed works, newspapers and manuscripts with a simple structure, CLARIAH-DE is also evaluating it as a standard for editions. We are examining how to connect collections with sufficient heterogeneity to the DARIAH-DE Data Federation Architecture via the Data Modeling Environment.

Task- and Co-Taskleaders

Work Package 2 – Tools and Virtual Research Environments

Within the CLARIN-D and DARIAH-DE projects, specialised tools and research environments for the organisation and processing of text- and language-based resources and research data were designed, developed and made available. These will now be merged into a sustainable, unified service for the humanities. The Language Resource Switchboard (LRS) will be used as a web-based bridge to overcome the associated technical and organisational challenges and to achieve interoperability. To this end, the LRS and the existing description format must be extended to classify the tools, e.g. by using TaDiRAH. Furthermore, in order to support the processing of specific language and text resources and collections in the LRS, it is necessary to define a cross-project pivot format (the DTA basic format) on which bidirectional conversion tools can build.

The results are documented in blog articles and step-by-step instructions and made available to interested members of the public.

Task- and Co-Taskleaders
  • Andrea Rapp (linglit, Technical University Darmstadt)
  • Erhard Hinrichs (Department of General and Computational Linguistics, University of Tübingen)

Work Package 4 – Technical Integration and Coordination of Technical Developments

The technical infrastructure is the basis of a user-friendly research infrastructure with a large number of services for scientific disciplines. It provides storage, basic functions and specific tools that enable uninterrupted, reliable use without delays. In order to accomplish this across all CLARIN-D and DARIAH-DE services in the future, Work Package 4 deals with the technical integration and coordination of the two research infrastructures.

CLARIN-D and DARIAH-DE have very different disciplinary traditions and have thus developed different technologies, tools, services and processes that can now complement each other. Consolidation is not always straightforward, however. This is particularly true of merging the three search and retrieval tools: Generic Search, Federated Content Search and Virtual Language Observatory. Work Package 4 will do basic conceptual work here. Consolidation often also requires the harmonisation of standards and interfaces (resource metadata, interchange formats). In other areas, such as the Authentication and Authorization Infrastructure (AAI), WP4 has already found a solution based on the DARIAH-DE AAI. The work package is complemented by a technology watch that looks beyond CLARIN and DARIAH and tries to incorporate the most important developments in the field into its own plans wherever possible.

The work is based on previous extensive cooperation, for instance in the joint Technical Advisory Board, and preliminary integrative endeavours in recent years.

Task- and Co-Taskleaders
  • Gerhard Heyer (Institute of Computer Science, Leipzig University)
  • Philipp Wieder (The Göttingen Society for Scientific Data Processing)