Slovak-Spanish Parallel Corpus

The second version, par-skes-2.0, was released on 24 February 2022 and contains almost 35.6 million tokens (18.9 million tokens in the Spanish half and 16.7 million tokens in the Slovak half). The first version, par-skes-1.0, was released in July 2019 and contains about 11.5 million tokens (5 455 067 tokens in the Slovak half and 6 044 520 tokens in the Spanish half).

To access the whole corpus, use the NoSketch Engine web interface to query the Spanish half or the Slovak half. Knowledge of NoSketch Engine and CQL (Corpus Query Language) is required.
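
For readers new to CQL, the following is a minimal sketch of how a simple query string can be assembled before pasting it into the NoSketch Engine query box. The helper names cql_token and cql_sequence are hypothetical, and the attribute names word and lemma are an assumption based on common NoSketch Engine corpus setups; check the attribute list shown in the interface for each half of this corpus.

```python
def cql_token(**attrs):
    """Build one CQL token pattern, e.g. [lemma="casa"].

    Attribute names (word, lemma, ...) are assumptions; the corpus
    itself defines which attributes are actually available.
    """
    if not attrs:
        return "[]"  # an empty pattern matches any single token
    parts = []
    for name, value in attrs.items():
        if '"' in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError("double quotes are not supported in this sketch")
        parts.append(f'{name}="{value}"')
    return "[" + " & ".join(parts) + "]"


def cql_sequence(*tokens):
    """Join token patterns into a query for adjacent tokens."""
    return " ".join(tokens)


# Hypothetical query on the Spanish half: the lemma "casa",
# then any one token, then the word form "de".
query = cql_sequence(cql_token(lemma="casa"), cql_token(), cql_token(word="de"))
print(query)
```

The printed string is an ordinary CQL expression. Attribute values are treated as regular expressions by the query engine, so a value such as "cas.*" would match every form beginning with "cas".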

The Slovak-Spanish Parallel Corpus contains translations of 77 texts: translations from Spanish (59), a translation from Slovak (1), and translations from other languages into Slovak and Spanish (17). The texts are automatically sentence-aligned. The Slovak texts are automatically morphologically annotated by the MorphoDiTa tagger, which was trained and tuned on the tagset developed by the SNK. The Spanish texts were tagged with TreeTagger.