Department of Corpus and Computational Linguistics, English Philology

The Department of Corpus and Computational Linguistics, English Philology, is concerned in research and teaching with the modeling of linguistic phenomena at all levels of linguistic organization, from lexis and grammar to semantics and discourse. In addition to knowledge about language, the focus lies on building and annotating electronic text corpora, and on the development and implementation of computer-based methods and workflows for linguistic research using electronic corpora.

About the department

Head of Department: Dr. Sabine Bartsch

The linguistic research of the Department of Corpus and Computational Linguistics at Technische Universität Darmstadt focuses on the corpus-based investigation of functional variation (registers) of language in different contexts (e.g. the language of science or political speeches). One focus here is on profiling the interacting features of registers and on the historical changes these registers have undergone over the course of modern science, especially since the late 17th century. A related focus in register research is the analysis of scientific texts as multimodal artefacts, which, in addition to natural language text, include images, tables, and formulas as well as other non-linguistic modalities.

Another central research topic is lexical co-occurrence, in particular so-called collocations, which can be observed in all areas of language and are of particular relevance because they play an important role in differentiating meaning (“You shall know a word by the company it keeps”, J.R. Firth). Collocations are also significant for the language production and reception of people learning a foreign language, and for the formation of terminology, since the co-occurrence of lexical expressions often indicates a subject-specific differentiation of meaning.
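
As a rough illustration of how collocation candidates might be ranked, the following Python sketch scores adjacent word pairs by pointwise mutual information (PMI), one common association measure. The function name `pmi_scores`, the `min_count` threshold, and the toy sentence are assumptions made for this example; they do not describe the department's own tools or data.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only. Pairs seen fewer than
    `min_count` times are skipped, because PMI is unreliable for rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams      # observed probability of the pair
        p_w1 = unigrams[w1] / n_tokens  # probability of the first word
        p_w2 = unigrams[w2] / n_tokens  # probability of the second word
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently of each other.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy input (assumed); a real study would use a tokenized corpus.
tokens = (
    "strong tea and strong coffee but powerful computers and strong tea"
).split()
scores = pmi_scores(tokens)

# Print candidate pairs from highest to lowest PMI.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A positive score means the pair occurs more often than chance would predict. In practice, corpus studies often combine several association measures and much larger samples before treating a pair as a collocation.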

Another area of interest focuses on grammatical constructions, including methods of studying rare phenomena in corpora. Examples are inventories of non-canonical constructions, such as cleft sentences, NP fronting, etc., or instantiations of modality and their distribution and function in the context of other constructions within the construction inventory of individual languages.

The methodological focus lies on the creation and annotation of electronic corpora. These corpora are annotated using manual and computational methods, searched for linguistically relevant patterns, and then characterized, analyzed, and statistically evaluated with regard to the quantitative distribution of linguistic phenomena. The analysis rests on comparing these distributions. Another methodological interest lies in developing workflows to implement and test different research-specific methods and procedures.

At the Institute of Linguistics and Literary Studies, the Department of Corpus and Computational Linguistics is responsible for methods training in the field of digital philology (Digital Humanities / Digital Philology). Within academic training, the main areas of specialization are:

  • corpus-linguistic and computational-linguistic data and methods
  • corpus construction and processing
  • annotation and development of annotation procedures and workflows
  • quantitative analysis of digital corpora
  • methodological synergies between corpus and computational linguistics and digital literary studies, as well as other text-based disciplines, e.g. history

The department contributes its distinctive subject areas to the academic training in the following courses of study:

  • Master of Arts Linguistic and Literary Computing
  • (Joint) Bachelor of Arts Digital Philology
  • as well as to the application subjects in the BSc and MSc programs

In addition, the department organizes workshops and tutorials on specific research questions.

Beyond methodological training, the department is also concerned with the digital research infrastructure in the digital philologies and digital humanities. This includes expert advice on infrastructure issues, providing students with software for processing linguistic data and corpora, and giving researchers guidance on the methodological and technical aspects of suitable digital work environments. This is also one of the subject areas of the project