LOEWE Research Focus Digital Humanities (LOEWE-Schwerpunkt Digital Humanities)

The LOEWE-Schwerpunkt Digital Humanities is a collaboration between the Universität Frankfurt, the Technische Universität Darmstadt and the Freies Deutsches Hochstift / Frankfurter Goethe Museum. Its aim is to interconnect the basic research carried out in the participating humanities disciplines with regard to information-technology methods.

LOEWE-Schwerpunkt Digital Humanities: Integrated Processing and Analysis of Text-Based Corpora, project area "Contemporary Language Corpora", since January 2011

Funded by the State of Hessen as part of the LOEWE initiative of excellence.

Subproject 2.2 "Text as instance of the language system" | "Text als Instanz des Sprachsystems"

Contrastive investigation of non-canonical grammatical constructions between English and German

Descriptions of natural language grammars tend to focus on the canonical constructions of a language, yet actual usage also displays constructions that are marked in different ways and thus deviate from the canonical form. The inventory of all permissible constructions provides insight into the way in which the possibilities offered by the language system are exploited in actual language use. Non-canonical constructions are understudied, partly because they are less frequent than canonical constructions. In addition, their usage is determined by the context, which further restricts the amount of evidence available.
Non-canonical constructions are language specific and thus determined by the range of possibilities offered by the language system. Typological features of languages, such as freedom of word order, must therefore be assumed to have an impact on the range of constructions that are marked and deviate from the canonical form.

Corpora provide the necessary empirical foundation for exploring the properties of text instances and classes of text that realize a spectrum of varieties, such as functional, regional and social varieties. Corpora are equally indispensable for investigating the language system, especially for describing and modelling grammatical and lexical phenomena and the relations between them, as well as the relations between usage and system.

The project aims to validate the hypothesis that natural language grammars constitute systems of construction that are centered on a set of canonical constructions of a particular language, which are complemented by a set of peripheral non-canonical constructions. The initial hypotheses are:

  • each sentence structure is associated with a distinctive domain of linguistic function,
  • canonical sentence structures are associated with more versatile domains of function than non-canonical constructions,
  • canonical and non-canonical constructions together form different systems of construction for different languages


  • Collect a broad spectrum of large and diversified corpora in order to find a sufficient number of instances of these constructions
  • Merge corpora into a uniform resource
  • Describe structures like inversion, extraposition and cleft sentences in English and their equivalents in German using patterns over automatically identifiable features like lexical items, parts-of-speech and parses
  • Extract constructions from the corpora
  • Identify contexts of occurrence
  • Compare non-canonical constructions vs. canonical constructions within a language
  • Compare non-canonical constructions between English and German 
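
The pattern-based description and extraction steps listed above can be illustrated with a minimal sketch. It is not the project's actual tooling, which this page does not describe. It assumes that tokens have already been tagged with Penn Treebank part-of-speech tags and that no token contains a slash or whitespace. The pattern `IT_CLEFT` and the helpers `encode` and `find_constructions` are hypothetical names introduced only for this example, and the pattern covers just a narrow subset of English it-clefts.

```python
import re

# Hypothetical pattern over "word/TAG " sequences (Penn Treebank tags assumed).
# It approximates a simple English it-cleft: "it" + is/was + short NP + that/who.
IT_CLEFT = re.compile(
    r"\bit/PRP (?:is|was)/VB[DZ] "            # 'it' followed by a finite form of 'be'
    r"(?:\S+/(?:DT|JJ|NN\S*|PRP) ){1,4}"      # focused constituent: a short noun phrase
    r"(?:that|who)/\S+ ",                     # element introducing the cleft clause
    re.IGNORECASE,
)


def encode(tagged):
    """Serialize (word, tag) pairs as 'word/TAG ' so a regex can span tokens."""
    return "".join(f"{word}/{tag} " for word, tag in tagged)


def find_constructions(tagged, pattern):
    """Return every substring of the encoded sentence that matches the pattern."""
    return [m.group(0).strip() for m in pattern.finditer(encode(tagged))]


# Hand-tagged example sentences (illustrative data, not taken from a corpus).
cleft = [("It", "PRP"), ("was", "VBD"), ("the", "DT"), ("butler", "NN"),
         ("who", "WP"), ("opened", "VBD"), ("the", "DT"), ("door", "NN"),
         (".", ".")]
canonical = [("The", "DT"), ("butler", "NN"), ("opened", "VBD"),
             ("the", "DT"), ("door", "NN"), (".", ".")]

print(find_constructions(cleft, IT_CLEFT))      # one match: the cleft opening
print(find_constructions(canonical, IT_CLEFT))  # no match for the canonical order
```

A surface pattern of this kind both over- and under-generates, for example on "it was the case that ...". This is one reason why parses are listed alongside lexical items and parts-of-speech as features for such patterns.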

Project start: January 2011

Project partners

 • Technische Universität Darmstadt
 • Goethe Universität Frankfurt am Main


Principal investigators

  • Prof. Dr. Gert Webelhuth, IEAS, Goethe Universität Frankfurt am Main
  • Prof. Dr. Iryna Gurevych, UKP Lab, Technische Universität Darmstadt
  • Dr. Sabine Bartsch, Department of Linguistic and Literary Studies, Technische Universität Darmstadt

Project staff (Darmstadt)

  • Richard Eckart de Castilho, Technical Lead
  • Erik-Lân Do Dinh, Student assistant


Institut für Sprach- und Literaturwissenschaft


Landwehrstraße 50A
Building S4|23
64293 Darmstadt


Dolivostraße 15 
64293 Darmstadt
