Argumentative Microtext Corpus

The argumentative microtext corpus consists of short texts that respond to a trigger question such as "Should everybody be obliged to pay fees for public radio/TV?" All texts have been annotated with a tree representation of the underlying argumentation. The data is divided into two parts.

For illustration, here is an example from part 1 of the corpus (micro_b003):

Should health insurance cover alternative medical treatments?

EN: Health insurance companies should not cover treatment in complementary medicine unless the promised effect and its medical benefit have been concretely proven. Yet this very proof is lacking in most cases. Patients do often report relief of their complaints after such treatments. But as long as it is unclear as to how this works, the funds should rather be spent on therapies where one knows with certainty.

DE: Die Krankenkassen sollten Behandlungen beim Natur- oder Heilpraktiker nicht zahlen, es sei denn der versprochene Effekt und dessen medizinischer Nutzen sind handfest nachgewiesen. Genau dieser Nachweis fehlt jedoch in den meisten Fällen. Zwar verweisen die Patienten oft auf eine Linderung ihrer Beschwerden nach derartigen Behandlungen. Solange aber nicht klar ist, wieso es dazu kommt, sollte das Geld besser für Behandlungen ausgegeben werden, bei denen man es mit Sicherheit weiss.

A sample of our argumentation structure analysis (for a different text) is shown on this page.
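
To make the idea of a tree representation more concrete, the following is a minimal Python sketch of how an argumentation tree over the segments of a microtext could be held in memory. It is not the corpus's file format or annotation scheme: the class `Segment`, its fields, the helper `render`, and the relation labels "support" and "attack" are assumptions made for illustration. The relations assigned to the four sentences of micro_b003 are one plausible reading, not the gold annotation, and the segmentation into whole sentences is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One argumentative unit of a microtext (hypothetical structure)."""
    sid: int
    text: str
    role: str                       # assumed labels: "proponent" or "opponent"
    relation: Optional[str] = None  # assumed labels: "support" or "attack"; None for the root
    children: List["Segment"] = field(default_factory=list)

    def add(self, child: "Segment") -> "Segment":
        """Attach a segment that supports or attacks this one."""
        self.children.append(child)
        return child


def render(node: Segment, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one per segment."""
    label = node.relation or "claim"
    lines = ["  " * depth + f"[{node.sid}] {label} ({node.role})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Illustrative reading of micro_b003 (English version), NOT the gold annotation.
root = Segment(
    1,
    "Health insurance companies should not cover treatment in complementary "
    "medicine unless the promised effect and its medical benefit have been "
    "concretely proven.",
    role="proponent",
)
root.add(Segment(
    2, "Yet this very proof is lacking in most cases.",
    role="proponent", relation="support",
))
objection = root.add(Segment(
    3, "Patients do often report relief of their complaints after such treatments.",
    role="opponent", relation="attack",
))
objection.add(Segment(
    4,
    "But as long as it is unclear as to how this works, the funds should "
    "rather be spent on therapies where one knows with certainty.",
    role="proponent", relation="attack",
))

print("\n".join(render(root)))
```

In this sketch the central claim is the root, and every other segment hangs below the segment it supports or attacks, so that an objection and its rebuttal form a chain. The actual annotation should be taken from the corpus files.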

Download links

Additional annotations and data

For Part 1 of the English corpus, various other annotation layers have become available.

Additional languages

Using the corpus: Examples

Besides our own work (see [10], [11], and this page), the corpus has been used by several researchers for argumentation mining. For instance, Stab and Gurevych [12] used it to measure the performance of their argumentation structure analysis module. Potash et al. [13] ran experiments on it with a neural architecture, and Wachsmuth et al. [14] with tree kernels.