Bibliographical, Style and Genre Annotation

Bibliographical, style and genre annotation is an essential part of the primary processing of corpus texts. Information about a text’s identity and basic structure is useful for archiving and citation, for statistical evaluation of parameters, and for investigating the distribution of language units and phenomena in particular texts. The annotation is displayed at the bottom of the Bonito client window when you right-click the desired line in a concordance list. It consists of keys and their values, which are either free (e.g. author’s name) or chosen from a fixed set (e.g. genre). Some keys describe the style and genre characteristics of a text. The main categories are type of text (literary, journalistic, professional, live communication), genre (poem, novel, short story, article, etc.) and domain (subject area, e.g. science, law, politics, economy). These categories can be subdivided further. Other keys provide the bibliographic details of the source and information about the author and the text. The keys under which this information is stored are listed below.

External annotation

External annotation uses a key-value structure. A value is a string of
characters that ends at the end of the line, so multi-line values are
not possible. Values may be either free (e.g. name of author) or chosen
from a specified set (e.g. genre). Optional flags are given as a
comma-separated list; each flag establishes a particular characteristic
of a value. The following values have a special meaning (they are not
necessarily meaningful for every key):

(an empty space or a whitespace)
the same as „…“. Default value in the automatic annotation. But we suppose it will appear.
missing key
has the same meaning as an undefined key („…“ or empty)
XXX
unknown value. It cannot be determined, e.g. the name of the author of an article.
YYY
undefinable value. It cannot be defined or has no meaning, e.g. gender of author (for a collaborative work), gender of translator (if the text is not a translation).
MIX
mixture. Mixed values, e.g. author is a hermaphrodite.
MSC
other. If the value is not defined in the set of values, e.g. author is a eunuch.
TTT
unknown value which needs to be defined. The annotation must be completed and the value added.
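
The special values above can be handled uniformly when annotation records are processed. The following is a minimal sketch, not the corpus tooling itself: it assumes each annotation line has the form `Key: value` (the separator is an assumption; the text above only states that a value ends at the end of its line), and the helper names `parse_annotation` and `value_status` are hypothetical.

```python
from typing import Dict

# Special values and their meanings, taken from the list above.
SPECIAL_VALUES = {
    "XXX": "unknown",
    "YYY": "undefinable",
    "MIX": "mixture",
    "MSC": "other",
    "TTT": "to be completed",
}


def parse_annotation(text: str, separator: str = ":") -> Dict[str, str]:
    """Parse one key-value pair per line (the separator is an assumption)."""
    record: Dict[str, str] = {}
    for line in text.splitlines():
        if separator not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(separator, 1)
        record[key.strip()] = value.strip()
    return record


def value_status(record: Dict[str, str], key: str) -> str:
    """Classify a key as 'undefined', one of the special statuses, or 'regular'."""
    value = record.get(key, "")  # a missing key behaves like an empty value
    if value in ("", "…"):
        return "undefined"
    return SPECIAL_VALUES.get(value, "regular")
```

For instance, `value_status(parse_annotation("Trnr: YYY"), "Trnr")` classifies the translator as undefinable, while a key that is absent from the record is reported as undefined, matching the "missing key" rule.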

Annotation of the bank

Each key is given in the form title (abbreviation). Its meaning is
described under the key, and its possible values are listed unless the
value is free.

1. Basic keys with free values:

Author (Auth)

  • Author’s name. As listed in resources under the standards for bibliographic records.

Origauthor (OrgA)

  • Original author’s name.

Translator (Trnr)

  • Name of translator. YYY, if not a translation.

Bibliography (Bibl)

  • Bibliography.

BOGOCONG (BOGO)

  • Multi-letter record of a conglomerate.

Name (Name)

  • Name of text.

Origname (OrgN)

  • Original name of text (in translation).

Conglomerate (Cong)

  • Identification of conglomerate which the text is a part of.

Comment (Comn)

  • Comment. It is used to specify or provide more information about the text.

Date (Date)

  • Issue date.

Dateorig (OrgD)

  • Original issue date (date of the first issue, which may be identical with “Date”); for translations, the issue date of the original.

ISBN (ISBN)

  • ISBN number.

ISSN (ISSN)

  • ISSN number.

SourceId (ScId)

  • ID of the document in the archive (remains the same in the bank).

Id (Id)

  • Identification code of the document.

2. Keys with specific values:

Translation (Trnn)

  • Determines whether the text has been translated.
Values:
trn
translation
org
original text
ftr
loosely translated, retold text
YYY
combination of a translated and original text (e.g. a collection of short stories)

Rhyme (Rhym)

  • Indicates whether the text is rhymed, in the sense of rhythmically bound speech, or unrhymed.
Values:
nrh
unrhymed
rhy
rhymed
MIX
partially rhymed

Type (Type)

  • Text type. The key is important for classifying texts into more homogeneous groups; it divides texts into individual styles.
Values:
img
literary (imaginative) text
inf
journalistic (informative) text
prf
professional text
liv
live communication

Subtype (SubT)

  • Subtype of the text. Extended values specify the style (Type) of the text more precisely.
Type and specification according to Subtype
for Type = img (literary (imaginative) text):
poe
poetry
pro
prose
dra
drama

for Type = inf (journalistic (informative) text):
pub
public press
adv
advertising materials, advertising
adm
administrative texts

for Type = prf (professional text):
sci
scientific literature, articles, journals, university textbooks
pop
popular science, special interest magazines
txb
primary and high school textbooks
enc
encyclopedia and similar alphabetically arranged works
man
manuals, operating instructions, recipes,…

for Type = liv (live communication):
spk
spoken
wri
written (Internet, telex if used interactively, communication of speech-impaired people)
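
The dependency between Type and Subtype can be checked mechanically. The sketch below transcribes the table above into a lookup; the name `subtype_is_valid` is hypothetical and the code is only an illustration of the rule, not part of the corpus software.

```python
# Subtype codes allowed for each Type, as read from the table above.
SUBTYPES_BY_TYPE = {
    "img": {"poe", "pro", "dra"},
    "inf": {"pub", "adv", "adm"},
    "prf": {"sci", "pop", "txb", "enc", "man"},
    "liv": {"spk", "wri"},
}


def subtype_is_valid(text_type: str, subtype: str) -> bool:
    """Return True if the Subtype code is listed under the given Type."""
    return subtype in SUBTYPES_BY_TYPE.get(text_type, set())
```

With this lookup, `subtype_is_valid("prf", "enc")` holds, whereas `subtype_is_valid("img", "sci")` does not, because `sci` belongs to professional texts.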
 

Genre (Genr)

  • Genre determines further properties of texts; a large set of fixed values is defined. The properties of literary texts are specified further by Subgenre. The Type and Genre keys are closely related, as the table shows:
Genre in individual styles
for Type = img (literary (imaginative) text):
ver
verse
son
song, libretto
scd
drama script, drama play
scf
film script, film subtitles
scr
radio script
nov
novel
col
short story, collection of short stories
ess
essay
dia
dialogues
mem
memoirs, biographies, autobiographies
let
letters
chr
chronicle
sen
short epic genres (sayings, quotes, aphorisms, jokes, etc.)
fac
non-fiction

for Type = inf (journalistic (informative) text):
doc (documentary)
minutes, protocol, resolution, contract, annual report
ann (announce)
directive, decree, questionnaire, commercials, announcements, offers
lst (keyword-style)
lists, programmes, rules, statutes, content, masthead
rpt (report)
report, interview, announcement, communique
anl (analytic)
editorial, comment, gloss, review, criticism, discussion, polemic, debate, caricature
pbb (belles-lettres)
feuilleton, report, feature, column
spc
speeches (political, occasional)
dsc
discussion/polemic/debate paper

for Type = prf (professional text):
mon
monograph
hnd
handbook
dis
dissertation, rigorous theses
ins
instruction
dpl
diploma, bachelor's and final theses
std
study
abs
abstract
tcl
article
rfl
reflection
ref
paper, term paper
lct
lecture
crs
characteristics
crt
short epic genres (quotes, aphorisms etc.)
opn
opinion

Subgenre (SubG)

  • Applies to Genre nov, col, ver and fac (for Genre ver and fac, Subgenre is optional).
Values:
crm
crime, detective
scf
sci-fi, fantasy, mystery
adn
adventure, westerns
rms
romance novels
bel
belles lettres
jun
junior literature
trv
travel literature
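
The applicability rule for Subgenre can be sketched as a small check. This reads the rule above as implying that Subgenre is expected for Genre nov and col and may be left out for ver and fac; that reading, the use of `None` for "no Subgenre given", and the name `check_subgenre` are all assumptions made for illustration.

```python
from typing import Optional

SUBGENRE_EXPECTED = {"nov", "col"}  # Subgenre should be present (assumed reading)
SUBGENRE_OPTIONAL = {"ver", "fac"}  # Subgenre may be omitted
SUBGENRES = {"crm", "scf", "adn", "rms", "bel", "jun", "trv"}


def check_subgenre(genre: str, subgenre: Optional[str]) -> bool:
    """Return True if the Genre/Subgenre combination follows the rule above."""
    if genre in SUBGENRE_EXPECTED:
        return subgenre in SUBGENRES
    if genre in SUBGENRE_OPTIONAL:
        return subgenre is None or subgenre in SUBGENRES
    # Any other Genre takes no Subgenre at all.
    return subgenre is None
```

Note that the code `scf` means "film script" as a Genre but "sci-fi, fantasy, mystery" as a Subgenre, so a code is only meaningful together with its key.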

Domain (Domn)

  • Domain, thematic area (activities or knowledge).
Values:
ars
artistic science
hum
human science
law
law
nat
natural science
tec
technology
ecn
economy, management
blf
belief, supernatural
lif
life style
ins
interdisciplinary science
plt
politics
gov
state and public administration

Subdomain (SubD)

  • Subdomain — a more detailed definition of a thematic, professional area.
    For Domain “ins” there is no Subdomain.
The scope of relationships between Domain and Subdomain
for Domain = ars:
mus
music, opera, operetta, ballet
cin
cinema, film
arc
architecture
art
art, photos, sculpture
the
theatre, theatre studies and criticism
lit
literature, literary studies and criticism

for Domain = hum:
his
history, archeology
psy
psychology
edu
education
soc
sociology, communication
phi
philosophy, aesthetics
inf
library science and information sources
pol
political science
lin
linguistics
eth
ethnology, ethnography, anthropology
cul
cultural science
swo
social work
mec
mass media and marketing communication, media, advertising

for Domain = law:
bil
bills, statutes, regulations
jud
judicatures
jur
jurisdiction (other legal texts)

for Domain = nat:
agr
agriculture
med
medicine
pha
pharmacy
zoo
zoology
bot
botany
bio
biology
che
chemistry
mat
mathematics
ggr
geography
phy
physics (including astronomy)
met
meteorology
geo
geology
env
environmental studies, ecology

for Domain = tec:
tra
transport, lines, telecommunication
ene
energetics
ind
industry
com
computer science
bui
building industry
sta
normalization, standardization

for Domain = ecn:
eco
economy, banking, business
mng
management, control
mer
merchandising, consumer area

for Domain = blf:
rel
religion, belief, sects
teo
theology
exc
the supernatural, occult, magic, astrology

for Domain = lif:
hou
household (flat, garden, handicraft, kitchen, breeding)
fsh
clothing, fashion
spo
sport
sct
social life
amu
amusement, games, hobbies, free time, travelling
min
ethnic minorities
reg
region
cnl
counselling
clt
culture

for Domain = plt:
reg (optional)
region

for Domain = gov:
uso
central authorities; institutions, centers and businesses with nationwide scope
sam
local government and self-government bodies
tvs
professional texts on public administration and self-government
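
Because a Subdomain is only meaningful together with its Domain, it can help to build a reverse index and see which Subdomain codes occur under more than one Domain. The sketch below transcribes the Domain and Subdomain relationships listed above; the names `SUBDOMAINS_BY_DOMAIN` and `ambiguous_subdomains` are hypothetical.

```python
from collections import defaultdict

# Subdomain codes per Domain, as read from the tables above.
SUBDOMAINS_BY_DOMAIN = {
    "ars": ["mus", "cin", "arc", "art", "the", "lit"],
    "hum": ["his", "psy", "edu", "soc", "phi", "inf",
            "pol", "lin", "eth", "cul", "swo", "mec"],
    "law": ["bil", "jud", "jur"],
    "nat": ["agr", "med", "pha", "zoo", "bot", "bio", "che",
            "mat", "ggr", "phy", "met", "geo", "env"],
    "tec": ["tra", "ene", "ind", "com", "bui", "sta"],
    "ecn": ["eco", "mng", "mer"],
    "blf": ["rel", "teo", "exc"],
    "lif": ["hou", "fsh", "spo", "sct", "amu", "min", "reg", "cnl", "clt"],
    "plt": ["reg"],
    "gov": ["uso", "sam", "tvs"],
    "ins": [],  # Domain "ins" has no Subdomain
}

# Reverse index: Subdomain code -> list of Domains it is listed under.
DOMAINS_BY_SUBDOMAIN = defaultdict(list)
for domain, subdomains in SUBDOMAINS_BY_DOMAIN.items():
    for code in subdomains:
        DOMAINS_BY_SUBDOMAIN[code].append(domain)


def ambiguous_subdomains() -> dict:
    """Return the Subdomain codes that appear under more than one Domain."""
    return {code: doms for code, doms in DOMAINS_BY_SUBDOMAIN.items()
            if len(doms) > 1}
```

In this transcription only `reg` (region) is shared, between `lif` and `plt`, so a query on Subdomain alone cannot distinguish the two without also checking Domain.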
  

Medium (Medi)

  • Medium. Refers to the data carrier or the source of the text.
Values:
lib
book
ebk
e-book
nws
newspaper
jou
journal
ste
studying materials
net
the Internet and other (pre-internet) networks. These include specific
Internet newspapers, websites, e-mail, usenet contributions,
contributions to fora, and live communication. Note that print
newspapers downloaded from the Internet are „nws“, electronic books
intended primarily for publishing are „lib“, but the e-books primarily
intended for on-screen viewing are „net“.
for
form
occ
occasional (miscellanies)
npu
unpublished texts, manuscripts
tvf
television, cinema
rad
radio

Authsex (AutS)

  • Sex of author.
Values:
msc
masculine
fem
feminine

Transsex (TrnS)

  • Sex of translator, see Authsex.

Varieta (Vari)

  • Language variant of the document, mostly Slovak.
Values:
std
standard Slovak
nst
non-standard Slovak
ost
old standard / before the orthography reform in 1953

Paragraphs (Para)

  • Determines the text division.
Values:
tru
true; text divided into paragraphs
fls
false; information on text division lost

Emphasis (Emph)

  • Indicates whether highlighting (emphasis) from the original text is present.
Values:
tru
true
fls
false

Diacritics (Dcrt)

  • Text with correct or incorrect diacritics.
Values:
tru
true; correct diacritic marks
fls
false; incorrect or missing diacritic marks

Corrected (Corr)

  • Document corrected or not.
Values:
tru
yes
fls
no

License (Lice)

  • Type of licence.

Lang (Lang)

  • Language of the work, as a three-letter ISO 639-2 code. It is always Slovak.

Origlang (OrgL)

  • Original language of the work according to ISO 639-3. Translations of already translated texts are marked with „>“, for example: eng>ger.
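
An Origlang value that contains „>“ can be split into its individual language codes. This is a minimal sketch with a hypothetical helper name, `parse_origlang`; it keeps the codes in the order written and makes no assumption about which end of the chain is the first original, since the example above does not state it.

```python
from typing import List


def parse_origlang(value: str) -> List[str]:
    """Split an Origlang value such as 'eng>ger' into its language codes."""
    return [code.strip() for code in value.split(">") if code.strip()]
```

A plain value such as `eng` yields a single-element list, so callers can treat direct and chained translations the same way.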