Bibliographical, Style and Genre Annotation

Bibliographical, style and genre annotation is an essential part of the primary processing of corpus texts. Information about the identity and basic structure of a text is useful for archiving and citing it, for the statistical evaluation of its parameters, and for investigating the distribution of language units and language phenomena in particular texts. The annotation is displayed at the bottom of the Bonito client window when you click the desired line in a concordance list with the right mouse button. It consists of keys and their values; a value is either free (e.g. author’s name) or chosen from a fixed set (e.g. genre). Some keys describe the style and genre characteristics of the text. The main categories are type of text (literary, journalistic, professional, live communication), genre (poem, novel, short story, article, etc.) and domain (subject area, e.g. science, law, politics, economy). These categories can be further subdivided. Other keys provide the bibliographic details of the source and information about the author and the text. The keys under which this information is recorded are listed below.

External annotation

External annotation uses a key-value structure. A value is a string of characters that ends at the end of the line, so multi-line values are not possible. Values may be either free (e.g. name of author) or chosen from a specified set (e.g. genre). Optional flags consist of a set of flags separated by commas; each flag establishes a particular characteristic of a value. The following values have a special meaning (they are not necessarily meaningful for all the keys):
(an empty string or whitespace)
the same as „…“. Default value in the automatic annotation; we nevertheless suppose it will appear.
missing key
has the same value as the undefined key („…“ or empty)
XXX
unknown value. It cannot be determined, e.g. the name of the author of an article.
YYY
undefinable value. It cannot be defined or has no meaning, e.g. gender of author (in collaborative work), gender of translator (if not a translation).
MIX
mixture. Mixed values, e.g. author is a hermaphrodite.
MSC
other. If the value is not defined in the set of values, e.g. author is a eunuch.
TTT
unknown value which needs to be defined. The annotation must be completed, the value added.
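
The handling of these special values can be sketched in Python. This is only an illustration under stated assumptions: the `Key: value` line syntax with a colon separator is hypothetical (the actual file syntax is not specified here), the function names `parse_annotation` and `classify` are invented for the example, and optional flags are not handled.

```python
# Special values and their meanings, as listed above.
SPECIAL_VALUES = {
    "XXX": "unknown value",
    "YYY": "undefinable value",
    "MIX": "mixture",
    "MSC": "other",
    "TTT": "unknown value which needs to be defined",
}


def parse_annotation(text):
    """Read one key-value pair per line.

    The ':' separator is an assumption made for this sketch.
    """
    pairs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(":")
        pairs[key.strip()] = value.strip()
    return pairs


def classify(pairs, key):
    """Return 'undefined', the meaning of a special value, or 'regular'."""
    # A missing key behaves like an undefined (empty or '…') value.
    value = pairs.get(key, "")
    if value in ("", "…"):
        return "undefined"
    if value in SPECIAL_VALUES:
        return SPECIAL_VALUES[value]
    return "regular"
```

For example, after `pairs = parse_annotation("Auth: XXX\nName: \n")`, the call `classify(pairs, "Auth")` reports an unknown value, while both `classify(pairs, "Name")` and `classify(pairs, "Bibl")` report an undefined one.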

Annotation of the bank

Keys are given in the form Title (Abbreviation). The meaning of each key is described under the key, and its possible values are listed unless they are free.

1. Basic keys with free values:

Author (Auth)

  • Author’s name. As listed in resources under the standards for bibliographic records.

Origauthor (OrgA)

  • Original author’s name.

Translator (Trnr)

  • Name of translator. YYY, if not a translation.

Bibliography (Bibl)

  • Bibliography.

BOGOCONG (BOGO)

  • Multi-letter record of a conglomerate.

Name (Name)

  • Name of text.

Origname (OrgN)

  • Original name of text (in translation).

Conglomerate (Cong)

  • Identification of conglomerate which the text is a part of.

Comment (Comn)

  • Comment. It is used to specify or provide more information about the text.

Date (Date)

  • Issue date.

Dateorig (OrgD)

  • Original issue date (first issue, it might be identical with “Date”), original issue date of translations.

ISBN (ISBN)

  • ISBN number.

ISSN (ISSN)

  • ISSN number.

SourceId (ScId)

  • ID of the document in the archive (remains the same in the bank).

Id (Id)

  • Identification code of the document.
2. Keys with specific values:

Translation (Trnn)

  • Determines whether the text has been translated.
Values:
trn
translation
org
original text
ftr
loosely translated, retold text
YYY
combination of a translated and original text (e.g. a collection of short stories)

Rhyme (Rhym)

  • Indicates whether the text is rhymed (in the sense of rhythmic binding) or unrhymed.
Values:
nrh
unrhymed
rhy
rhymed
MIX
partially rhymed

Type (Type)

  • Text type. This key is important for classifying texts into more homogeneous groups; it divides texts into individual styles.
Values:
img
literary (imaginative) text
inf
journalistic (informative) text
prf
professional text
liv
live communication

Subtype (SubT)

  • Subtype of the text. Its values specify the style (Type) of the text more precisely.
Type and specification according to Subtype

for Type = img (literary (imaginative) text):
poe poetry
pro prose
dra drama

for Type = inf (journalistic (informative) text):
pub public press
adv advertising materials, advertising
adm administrative texts

for Type = prf (professional text):
sci scientific literature, articles, journals, university textbooks
pop popular science, special interest magazines
txb primary and high school textbooks
enc encyclopedia and similar alphabetically arranged works
man manuals, operating instructions, recipes,…

for Type = liv (live communication):
spk spoken
wri written (Internet, telex if used interactively, communication of speech-impaired people)
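
The dependency of Subtype on Type can be checked mechanically. The Python sketch below is illustrative only; the names `SUBTYPES_BY_TYPE` and `subtype_matches_type` are invented for the example, and it assumes that `enc` and `man` belong to Type = prf.

```python
# Subtype codes allowed for each Type code.
# Assumption: 'enc' and 'man' are subtypes of professional text (prf).
SUBTYPES_BY_TYPE = {
    "img": {"poe", "pro", "dra"},
    "inf": {"pub", "adv", "adm"},
    "prf": {"sci", "pop", "txb", "enc", "man"},
    "liv": {"spk", "wri"},
}


def subtype_matches_type(text_type, subtype):
    """Return True if the Subtype code is listed for the given Type code."""
    return subtype in SUBTYPES_BY_TYPE.get(text_type, set())
```

With this mapping, `subtype_matches_type("img", "poe")` is true, while `subtype_matches_type("inf", "poe")` is false. Special values such as XXX are deliberately not handled here.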

Genre (Genr)

  • Genre determines other properties of texts; a large group of fixed values is established. The properties of artistic texts are also determined by Subgenre. There is a close relationship between the Type and Genre keys, which is illustrated in the table:
Genre in individual styles

for Type = img (literary (imaginative) text):
ver verse
son song, libretto
scd drama script, drama play
scf film script, film subtitles
scr radio script
nov novel
col short story, collection of short stories
ess essay
dia dialogues
mem memoirs, biographies, autobiographies
let letters
chr chronicle
sen short epic genres (sayings, quotes, aphorisms, jokes, etc.)
fac non-fiction

for Type = inf (journalistic (informative) text):
doc (documentary) minute, protocol, resolution, contract, annual report
ann (announce) directive, decree, questionnaire, commercials, announcements, offers
lst (itemized) lists, programmes, rules, statutes, contents, masthead
rpt (report) report, interview, announcement, communique
anl (analytic) editorial, comment, gloss, review, criticism, discussion, polemic, debate, caricature
pbb (belles-lettres) feuilleton, report, feature, column
spc speeches (political, occasional)
dsc discussion/polemic/debate paper

for Type = prf (professional text):
mon monograph
hnd handbook
dis dissertation, rigorous theses
ins instruction
dpl diploma, bachelor and final works
std study
abs abstract
tcl article
rfl reflection
ref paper, term paper
lct lecture
crs characteristics
crt short epic genres (quotes, aphorisms etc.)
opn opinion

Subgenre (SubG)

  • Applies to Genre nov, col, ver and fac (for Genre ver and fac, Subgenre is optional).
Values:
crm
crime, detective
scf
sci-fi, fantasy, mystery
adn
adventurous, westerns
rms
romance novels
bel
belles lettres
jun
junior literature
trv
travel literature
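
The rule that ties Subgenre to Genre can be written as a small check. The following Python sketch is an interpretation, not a specification: it assumes that Subgenre is expected for `nov` and `col` because only `ver` and `fac` are stated to be optional, and the function names are invented for the example.

```python
# Genres for which Subgenre applies, according to the description above.
SUBGENRE_GENRES = {"nov", "col", "ver", "fac"}
# Genres for which Subgenre is explicitly optional.
SUBGENRE_OPTIONAL = {"ver", "fac"}
# Subgenre codes listed above.
SUBGENRES = {"crm", "scf", "adn", "rms", "bel", "jun", "trv"}


def subgenre_status(genre):
    """Classify a Genre code as 'optional', 'expected' or 'not applicable'."""
    if genre in SUBGENRE_OPTIONAL:
        return "optional"
    if genre in SUBGENRE_GENRES:
        # Assumption: for nov and col a Subgenre is expected.
        return "expected"
    return "not applicable"


def check_subgenre(genre, subgenre=None):
    """Return True if the Genre/Subgenre combination is consistent."""
    status = subgenre_status(genre)
    if subgenre is None:
        return status != "expected"
    return status != "not applicable" and subgenre in SUBGENRES
```

Under these assumptions `check_subgenre("nov", "crm")` and `check_subgenre("ver")` pass, while `check_subgenre("nov")` and `check_subgenre("ess", "crm")` do not.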

Domain (Domn)

  • Domain, thematic area (activities or knowledge).
Values:
ars
artistic science
hum
human science
law
law
nat
natural science
tec
technology
ecn
economy, management
blf
belief, supernatural
lif
life style
ins
interdisciplinary science
plt
politics
gov
state and public administration

Subdomain (SubD)

  • Subdomain — a more detailed definition of a thematic, professional area.
    For Domain “ins” there is no Subdomain.
The scope of relationships between Domain and Subdomain

for Domain = ars:
mus music, opera, operetta, ballet
cin cinema, film
arc architecture
art art, photos, sculpture
the theatre, theatre studies and critics
lit literature, literature science and critics

for Domain = hum:
his history, archeology
psy psychology
edu education
soc sociology, communication
phi philosophy, aesthetics
inf library science and information sources
pol political science
lin linguistics
eth ethnology, ethnography, anthropology
cul cultural science
swo social work
mec mass media and marketing communication, media, advertising

for Domain = law:
bil bills, statutes, regulations
jud judicatures
jur jurisdiction (other legal texts)

for Domain = nat:
agr agriculture
med medicine
pha pharmacy
zoo zoology
bot botany
bio biology
che chemistry
mat mathematics
ggr geography
phy physics (including astronomy)
met meteorology
geo geology
env environmental studies, ecology

for Domain = tec:
tra transport, lines, telecommunication
ene energetics
ind industry
com computer science
bui building industry
sta normalization, standardization

for Domain = ecn:
eco economy, banking, business
mng management, control
mer merchandising, consumer area

for Domain = blf:
rel religion, belief, sects
teo theology
exc the supernatural, occult, magic, astrology

for Domain = lif:
hou household (flat, garden, handicraft, kitchen, breeding)
fsh clothing, fashion
spo sport
sct social life
amu amusement, games, hobbies, free time, travelling
min ethnic minorities
reg region
cnl counselling
clt culture

for Domain = plt:
reg (optional) region

for Domain = gov:
uso central authorities; institutions, centers and businesses with nationwide scope
sam local government and self-government bodies
tvs professional texts on public administration and self-government
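
The Domain and Subdomain relationship can be expressed as a lookup table. The Python sketch below is illustrative: the assignment of each Subdomain code to a Domain is inferred from the table above and should be treated as an assumption, and the names `SUBDOMAINS` and `subdomain_allowed` are invented for the example.

```python
# Subdomain codes per Domain code (assignment inferred from the table above).
SUBDOMAINS = {
    "ars": {"mus", "cin", "arc", "art", "the", "lit"},
    "hum": {"his", "psy", "edu", "soc", "phi", "inf", "pol", "lin",
            "eth", "cul", "swo", "mec"},
    "law": {"bil", "jud", "jur"},
    "nat": {"agr", "med", "pha", "zoo", "bot", "bio", "che", "mat",
            "ggr", "phy", "met", "geo", "env"},
    "tec": {"tra", "ene", "ind", "com", "bui", "sta"},
    "ecn": {"eco", "mng", "mer"},
    "blf": {"rel", "teo", "exc"},
    "lif": {"hou", "fsh", "spo", "sct", "amu", "min", "reg", "cnl", "clt"},
    "plt": {"reg"},      # optional for this Domain
    "gov": {"uso", "sam", "tvs"},
    "ins": set(),        # Domain "ins" has no Subdomain
}


def subdomain_allowed(domain, subdomain):
    """Return True if the Subdomain code is listed for the Domain code."""
    return subdomain in SUBDOMAINS.get(domain, set())
```

For instance, `subdomain_allowed("nat", "che")` holds, whereas `subdomain_allowed("ins", "che")` does not, because Domain “ins” has no Subdomain. The code `reg` is accepted for both `lif` and `plt`.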

Medium (Medi)

  • Medium, refers to the data carrier or text source.
Values:
lib
book
ebk
e-book
nws
newspaper
jou
journal
ste
studying materials
net
the Internet and other (pre-internet) networks. These include specific Internet newspapers, websites, e-mail, usenet contributions, contributions to fora, and live communication. Note that print newspapers downloaded from the Internet are „nws“, electronic books intended primarily for publishing are „lib“, but the e-books primarily intended for on-screen viewing are „net“.
for
form
occ
occasional (miscellanies)
npu
non-published texts, handwritings
tvf
television, cinema
rad
radio

Authsex (AutS)

  • Sex of author.
Values:
msc
masculine
fem
feminine

Transsex (TrnS)

  • Sex of translator, see Authsex.

Varieta (Vari)

  • Language variety of the document. It is mostly Slovak.
Values:
std
standard Slovak
nst
non-standard Slovak
ost
old standard / before the orthography reform in 1953

Paragraphs (Para)

  • Determines the text division.
Values:
tru
true; text divided into paragraphs
fls
false; information on text division lost

Emphasis (Emph)

  • Indicates whether the highlighting (emphasis) of the original text is preserved.
Values:
tru
true
fls
false

Diacritics (Dcrt)

  • Text with correct or incorrect diacritics.
Values:
tru
true; correct diacritic marks
fls
false; incorrect or missing diacritic marks

Corrected (Corr)

  • Document corrected or not.
Values:
tru
yes
fls
no
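
The keys Paragraphs, Emphasis, Diacritics and Corrected all share the values `tru` and `fls`. A reader of the annotation might convert them to booleans as in the Python sketch below; the function name `flag_to_bool` is hypothetical, and returning `None` for any other value (for example a special value such as XXX) is a choice made only for this illustration.

```python
# Abbreviations of the keys that take the values 'tru' and 'fls'.
BOOLEAN_KEYS = {"Para", "Emph", "Dcrt", "Corr"}


def flag_to_bool(key, value):
    """Convert 'tru'/'fls' to True/False for the boolean keys.

    Returns None for any other value; raises KeyError for other keys.
    """
    if key not in BOOLEAN_KEYS:
        raise KeyError(f"{key} is not a tru/fls key")
    if value == "tru":
        return True
    if value == "fls":
        return False
    return None
```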

License (Lice)

  • Type of licence.

Lang (Lang)

  • Language of the work, given as a three-letter code according to ISO 639-2. It is always Slovak.

Origlang (OrgL)

  • Original language of the work according to ISO 639-3. A translation made from an already translated text is marked with „>“, for example: eng>ger.
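
A value of Origlang such as eng>ger can be split into its language codes. The Python sketch below is a minimal illustration: the function name `parse_origlang` is invented, the codes are only checked for their three-letter shape (not against the ISO code list), and reading the first code as the original language and the following ones as intermediate translations is an assumption.

```python
def parse_origlang(value):
    """Split an Origlang value such as 'eng>ger' into a list of codes.

    Assumption: the first code is the original language and any further
    codes are the languages of intermediate translations.
    """
    codes = [code.strip() for code in value.split(">")]
    for code in codes:
        # Shape check only: three lowercase ASCII letters.
        if not (len(code) == 3 and code.isascii()
                and code.isalpha() and code.islower()):
            raise ValueError(f"not a three-letter language code: {code!r}")
    return codes
```

A plain value such as `eng` yields a one-element list, and `eng>ger` yields the two codes in order.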