Corpus of Religious Texts

The specialized text corpus blf-2.0, covering the domain of belief and the supernatural, was released in December 2014. The corpus contains 66 million tokens.

The corpus was developed at SNK for research on religious terminology.

The texts have been lemmatized and morphologically annotated. The corpus includes full bibliographical and style-genre annotation.

The first version, which contained 14.5 million tokens, was developed in 2008 for internal use only.