Exploration of Large Text Collections Using Topic Models and Expert Knowledge
Affiliation: TU Darmstadt
Associated since: December 2016
CEDIFOR Project Partners
- Carsten Schnober, UKP Lab, TU Darmstadt
- Prof. Dr. Iryna Gurevych, UKP Lab, TU Darmstadt
The aim of this project is to answer historical questions by combining hermeneutic methods with complex statistical methods (Topic Models). Fully automatic methods often provide unsatisfactory answers to profound research questions in the Humanities. Therefore, within this work package, historical hypotheses will be validated through empirically grounded observations. Topic Models represent thematic clusters that are extracted automatically and can be interpreted by experts. By interacting with the automatically recommended topics, experts can improve, deepen, and direct their analysis step by step. Thus far, concepts such as ‘corruption’ and ‘bioethics’ have been investigated in a corpus that spans several decades.
The source code of the Topic Explorer system is available on GitHub.