Exploration of large Text Collections using Topic Models and Expert Knowledge

Affiliation: TU Darmstadt

Associated since: December 2016

CEDIFOR Project Partners

Short Description

The aim of this project is to answer historical questions by combining hermeneutic methods with complex statistical methods (topic models). Fully automatic methods often provide unsatisfactory answers to profound research questions in the humanities. Therefore, within this work package, historical hypotheses will be validated through empirically grounded observations. Topic models represent thematic clusters that have been extracted automatically and can be interpreted by experts. Through interaction with the automatically recommended topics, experts can improve, deepen, and direct their analysis step by step. So far, concepts such as ‘corruption’ and ‘bioethics’ have been investigated in a corpus that spans several decades.

Project Results

The source code of the Topic Explorer system is available on GitHub.