In version hist-6.0 the following structure tags are used:
Structure tags | Value | Explanation | Example | Note |
---|---|---|---|---|
<noise> | Ruptured text | Information on hard-to-read reconstructed text. | <noise> ar … + Adonay </noise> | |
… | Omitted text | Text omitted for various reasons, e.g. a long hard-to-read passage or a section written in a foreign language. | <de> … </de> | Omitted text in German. The “…” tag is a single tag (not three separate dots). |
/ | Word split | Where two or more words are written together, a slash divides them into tokens in accordance with current orthography. | s/nim, z/nassich, na/hlawu | |
‿ | Word joining | Where a word that should be written together is split within a line (not at the end of a line), the joining mark shows where the word was split. Such a word is searched as a single token. | ssetko oči‿sti; Žensku nemoc Pry‿ly‿ssneho toku zastawuge | Such split words can be searched in two ways: Pry‿ly‿ssneho and Prylyssneho |
<miss> | Text preserved in transcription only | Some pages survive only as transcriptions held at the Department of the Slovak Language History, Onomastics and Etymology of the LSIL SAS; the original text (or its photocopy) is missing. Such a page is marked as “missing”. | <miss>… Koho hlawa boli War mrtwu žihlawu w/wine/aneb w wode a/uwaž sy na hlawu Proti čerweneg nemocy . Hales vtluč na/prach a pi w palenem . </miss> | |
[] | Expanded abbreviation | Some abbreviations are expanded using square brackets; the brackets enclose the letters supplied in the expansion. | u Ge[ho] M[i]l[o]stj | The original abbreviated text was: u Ge Mlstj |
\| | Word with two possible readings | Where a word allows two possible readings, a vertical bar separates the two variants. | tzytedlnost\|czytedlnost | |
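
The token-level marks in the table can be handled with plain string operations. The following is a minimal sketch, not the corpus's own tooling; the function names are hypothetical, and it assumes the marks appear in the raw text exactly as in the examples above.

```python
import re

JOIN = "‿"  # joining mark, copied from the table above


def search_forms(token):
    """Return both forms under which a joined token can be searched."""
    return token, token.replace(JOIN, "")


def split_tokens(text):
    """Split words written together at each slash."""
    return text.split("/")


def abbreviation_forms(text):
    """Return (original abbreviated text, expanded text) for bracket notation."""
    original = re.sub(r"\[[^\]]*\]", "", text)          # drop the supplied letters
    expanded = text.replace("[", "").replace("]", "")   # keep them, drop brackets
    return original, expanded


def reading_variants(token):
    """Split a token with two possible readings at the vertical bar."""
    return token.split("|")


print(search_forms("Pry‿ly‿ssneho"))               # ('Pry‿ly‿ssneho', 'Prylyssneho')
print(split_tokens("na/hlawu"))                    # ['na', 'hlawu']
print(abbreviation_forms("u Ge[ho] M[i]l[o]stj"))  # ('u Ge Mlstj', 'u Geho Milostj')
print(reading_variants("tzytedlnost|czytedlnost")) # ['tzytedlnost', 'czytedlnost']
```
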
The following tags are used for foreign languages:
Structure tag | Explanation | Example |
---|---|---|
<la> | Latin | <la> Anno 1660 in mense o[cto]bri[s] . </la> |
<de> | German | <de> Wildfeuer </de> to gest djwy ohen |
<hu> | Hungarian | <hu> SZLOVÁK HERBÁRIUM </hu> |
<he> | Hebrew | <he> Adonay </he> |
<el> | Greek | <el> Tetragramaton </el> |
<frg> | Unknown language or overlapping languages | <frg> Jchmachataton </frg> |
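
Foreign-language passages marked this way can be pulled out of a text with a regular expression. The sketch below is illustrative only: the function name and the `LANGUAGES` mapping are hypothetical, and it assumes that language tags are not nested inside one another.

```python
import re

# Tag names and languages as listed in the table above.
LANGUAGES = {
    "la": "Latin",
    "de": "German",
    "hu": "Hungarian",
    "he": "Hebrew",
    "el": "Greek",
    "frg": "unknown or overlapping languages",
}

# Match an opening language tag, its content, and the matching closing tag.
SPAN = re.compile(r"<(la|de|hu|he|el|frg)>(.*?)</\1>", re.DOTALL)


def foreign_spans(text):
    """Return (tag, content) pairs for every foreign-language span in text."""
    return [(m.group(1), m.group(2).strip()) for m in SPAN.finditer(text)]


print(foreign_spans("<de> Wildfeuer </de> to gest djwy ohen"))
# [('de', 'Wildfeuer')]
```
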
In hist-5.0, several other tags were used in addition to those listed above:
Structure tag | Abbreviation | Value | Explanation | Example | Note |
---|---|---|---|---|---|
remark | <rem> | lang="la/de/hu/he/el/frg" | Language of the token or tokens annotated by <rem> | <rem lang="la">Item</rem> | Text in Latin |
remark | <rem> | abb="…" | Expanded form of an abbreviation | <rem abb="et cetera">etc.</rem> | |
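
For texts still annotated in the hist-5.0 style, the `<rem>` elements can be read with a small parser. This is a rough sketch with hypothetical function names. It accepts both straight and typographic quotation marks around attribute values, and the rewrite of `<rem lang="xx">` into a language tag rests on the assumption that the two notations are equivalent, which the description above does not state.

```python
import re

# One <rem> element with a single lang or abb attribute; straight or curly quotes.
REM = re.compile(r'<rem\s+(lang|abb)=["“”]([^"“”]*)["“”]>(.*?)</rem>', re.DOTALL)


def parse_rem(text):
    """Return attribute name, attribute value and content of each <rem> element."""
    return [
        {"attribute": m.group(1), "value": m.group(2), "text": m.group(3)}
        for m in REM.finditer(text)
    ]


def rem_lang_to_tag(text):
    """Rewrite <rem lang="xx">...</rem> as <xx> ... </xx> (assumed equivalence)."""
    def replace(match):
        if match.group(1) != "lang":
            return match.group(0)  # leave abb remarks untouched
        tag = match.group(2)
        return f"<{tag}> {match.group(3)} </{tag}>"
    return REM.sub(replace, text)


print(parse_rem('<rem abb="et cetera">etc.</rem>'))
# [{'attribute': 'abb', 'value': 'et cetera', 'text': 'etc.'}]
print(rem_lang_to_tag('<rem lang="la">Item</rem>'))
# <la> Item </la>
```
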