Corpus of Social Science Texts

This specialized corpus of texts from the humanities, named hum-1.0-public, was released in October 2016 and contains 38 616 514 tokens. It includes texts from the fields of history, philosophy, sociology, linguistics, psychology, pedagogy, social work, ethnology, political science, cultural science and library science.

The corpus was derived from the SNK for comparative analyses of domain-specific texts.

The corpus is lemmatized and morphologically annotated. It includes full bibliographical and style-genre annotation.