The Slovak Named Entity Corpus snec-1.0 contains 468 715 tokens in 201 texts from Wikipedia, The Free Encyclopedia. It comprises 27 000 sentences with more than 67 000 entities. The morphologically annotated texts have undergone a semi-automated, supervised check.
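The corpus contains data from the project "Koncepcia a realizácia sémantickej anotácie korpusu (identifikácia viacslovných pomenovaní, ručná anotácia pomenovaných jednotiek, budovanie ontológií)", in English: "Design and Implementation of Semantic Annotation of the Corpus (identification of multi-word expressions, manual annotation of named entities, building of ontologies)".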
More information can be found here.