Utterances in natural language do not occur in isolation, but rather are part of larger texts or conversations. A central notion of such larger discourses is coherence, the relatedness of utterances to each other. In this course, we will introduce models of discourse coherence that are based on discourse relations such as Cause or Temporal-Precedence. These discourse relations are higher-order predicates that take abstract entities such as events, facts, or propositions (expressed by sentences) as their arguments. In order to fully interpret a given discourse, it is necessary to retrieve these discourse relations both when they are explicit, i.e. marked by discourse connectives such as 'because' or 'before', and when they are left implicit. The task of automatically identifying discourse relations in text is called Shallow Discourse Parsing. It has received increased attention in recent years but remains a very difficult, unsolved task in NLP.
In this block seminar, we will first study existing systems and approaches to discourse parsing (as published in the CoNLL Shared Tasks and elsewhere). We participated in the most recent Shared Task with a shallow discourse parser for English (Oepen et al., 2016). The second part of the class will be spent on group projects aimed at improving individual modules or the general architecture of this parser, depending on the preferences of the course participants.