Version 7 of the Web corpus released

The current version of the corpus, web-7.0, contains 5 300 485 736 tokens. Compared to the previous version, the corpus has grown by a billion tokens.

Morphological annotation and lemmatization are now more accurate and compatible with the main corpus, prim-10.0. Each text includes its source URL and retrieval time.

For more information, please visit the website.

Sign up for free access here.