INTRODUCTION TO EAGLES TAGS FOR RUSSIAN



The morphological analyzer for Russian uses a set of tags to represent the morphological information of words. This tagset is based on the tags proposed by the EAGLES group (http://www.ilc.cnr.it/EAGLES96/annotate/node9.html) for all European languages.

Because the EAGLES proposal is meant to cover the grammatical features found across European languages, some attributes may be irrelevant or unused in a given language. An attribute that is not specified either encodes information that is absent for the word or is not relevant in that language. Underspecified attributes take 0 (zero) as their value.

The tagset for Russian is presented below.

For each part-of-speech category, a table lists the attributes, their values, and the codes used for them:



Pos  Attribute                    Value                         Code
...  ...                          ...                           ...



Column 1 (Pos) gives the position of the attribute in the tag. Column 2 (Attribute) describes the meaning of the attribute in that position. Column 3 (Value) lists the possible values for that attribute, and Column 4 (Code) contains the character that encodes each of these values.
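The positional scheme can be illustrated with a minimal Python sketch. The helper name `split_tag` is hypothetical and not part of the analyzer; the sketch assumes only that a tag is a string with one character per position and that 0 marks an underspecified attribute. Note that the obscenity attribute in the tables below also uses 0 for its non-obscene value, which this sketch does not distinguish.

```python
def split_tag(tag):
    """Return (position, code) pairs for a tag, numbering positions from 1.

    A code of "0" (underspecified attribute) is reported as None.
    """
    return [(pos, None if code == "0" else code)
            for pos, code in enumerate(tag, start=1)]


if __name__ == "__main__":
    for pos, code in split_tag("AGP00F000"):
        print(pos, code if code is not None else "(unspecified)")
```

Each position can then be looked up in the table of the category named by position 1.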



Parts of speech:



A: Adjective

D: Adverb

P: Pronominal adverb

Y: Ordinal numeral

R: Pronominal adjective

M: Part of a compound

C: Conjunction

J: Interjection

Z: Numeral

T: Particle

B: Preposition

N: Noun

E: Pronoun

V: Verb

Q: Participle
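Since position 1 of every tag in the tables that follow holds one of these letters, the category of a tag can be read from its first character. The following is a small sketch in Python; the names `CATEGORIES` and `category_of` are illustrative assumptions, not identifiers from the analyzer.

```python
# Category letters exactly as listed above.
CATEGORIES = {
    "A": "Adjective",
    "D": "Adverb",
    "P": "Pronominal adverb",
    "Y": "Ordinal numeral",
    "R": "Pronominal adjective",
    "M": "Part of a compound",
    "C": "Conjunction",
    "J": "Interjection",
    "Z": "Numeral",
    "T": "Particle",
    "B": "Preposition",
    "N": "Noun",
    "E": "Pronoun",
    "V": "Verb",
    "Q": "Participle",
}


def category_of(tag):
    """Return the part-of-speech name encoded in the first character of a tag."""
    if not tag:
        raise ValueError("empty tag")
    if tag[0] not in CATEGORIES:
        raise ValueError(f"unknown category code {tag[0]!r}")
    return CATEGORIES[tag[0]]
```

The same letter may mean something else in a later position (P, for example, also encodes Plural), so only the first character should be looked up this way.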



Adjective

Pos  Attribute                    Value                         Code
1    Category                     Adjective                     A
2    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Animacy                      Animate                       A
                                  Inanimate                     I
6    Form of the adjective        Short                         S
                                  Full                          F
7    Degree of comparison         Superlative                   E
                                  Comparative                   C
                                  Positive                      P
8    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
9    Sign of obscene language     Obscene                       H
                                  Non-obscene                   0



Examples:



Word       Lemma      Tag
звонких    звонкий    AGP00F000



звонких звонкий AGP00F000: A - adjective, G - genitive, P - plural, F - full form of the adjective.
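The reading of the example tag can be reproduced mechanically from the adjective table. The sketch below is written in Python under the assumption that codes pair with values in the order the table lists them; the names `ADJECTIVE` and `decode_adjective` are hypothetical, and the attribute labels are shortened for readability.

```python
# One (attribute, {code: value}) entry per tag position, following the adjective table.
ADJECTIVE = [
    ("Category", {"A": "Adjective"}),
    ("Case", {"N": "Nominative", "G": "Genitive", "D": "Dative",
              "F": "Accusative", "C": "Instrumental", "O": "Prepositional",
              "P": "Partitive", "L": "Locative", "V": "Vocative"}),
    ("Number", {"S": "Singular", "P": "Plural"}),
    ("Gender", {"M": "Masculine", "F": "Feminine", "N": "Neuter", "C": "Common"}),
    ("Animacy", {"A": "Animate", "I": "Inanimate"}),
    ("Form", {"S": "Short", "F": "Full"}),
    ("Degree", {"E": "Superlative", "C": "Comparative", "P": "Positive"}),
    ("Additional information", {"P": "Transitions", "D": "Difficult formation of form",
                                "V": "Corrupted form", "R": "Predicative",
                                "I": "Spoken form", "A": "Uncommon word",
                                "B": "Abbreviation", "E": "Outdated form"}),
    ("Obscene", {"H": "Obscene"}),
]


def decode_adjective(tag):
    """Map an adjective tag to {attribute: value}, skipping positions coded 0."""
    if len(tag) != len(ADJECTIVE):
        raise ValueError(f"adjective tags have {len(ADJECTIVE)} positions: {tag!r}")
    decoded = {}
    for code, (attribute, values) in zip(tag, ADJECTIVE):
        if code == "0":
            continue  # underspecified (or non-obscene in the last position)
        if code not in values:
            raise ValueError(f"unexpected code {code!r} for {attribute}")
        decoded[attribute] = values[code]
    return decoded


if __name__ == "__main__":
    print(decode_adjective("AGP00F000"))
```

The same table-driven approach would apply to the other categories by swapping in their position lists.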



Adverb

Pos  Attribute                    Value                         Code
1    Category                     Adverb                        D
2    Degree of comparison         Superlative                   E
                                  Comparative                   C
                                  Positive                      P
3    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
4    Sign of obscene language     Obscene                       H
                                  Non-obscene                   0



Pronominal Adverb

Pos  Attribute                    Value                         Code
1    Category                     Pronominal adverb             P
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Ordinal

In Russian, ordinal numerals have all the grammatical features of relative adjectives. The parts of compound ordinal numerals (from 21st onward) are written separately: двадцать первый. (Wikipedia)

Pos  Attribute                    Value                         Code
1    Category                     Ordinal                       Y
2    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Animacy                      Animate                       A
                                  Inanimate                     I



Examples:



Form     Lemma    Tag
сотый    сотый    YNSM0
сотый    сотый    YFSMI
сотою    сотый    YCSF0
один     один     YNSM0
один     один     YFSMI
одну     один     YFSF0

сотый YNSM0: Y - ordinal, N - nominative, S - singular, M - masculine

сотый YFSMI: Y - ordinal, F - accusative, S - singular, M - masculine, I - inanimate

сотою YCSF0: Y - ordinal, C - instrumental, S - singular, F - feminine
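Going in the other direction, a tag can be assembled from a set of feature values. The following Python sketch builds ordinal tags from the ordinal table; `encode_ordinal` and the lowercase feature names are assumptions made for illustration, not an interface of the analyzer.

```python
# Value-to-code tables for the four attributes that follow the category letter Y.
CASE = {"nominative": "N", "genitive": "G", "dative": "D", "accusative": "F",
        "instrumental": "C", "prepositional": "O", "partitive": "P",
        "locative": "L", "vocative": "V"}
NUMBER = {"singular": "S", "plural": "P"}
GENDER = {"masculine": "M", "feminine": "F", "neuter": "N", "common": "C"}
ANIMACY = {"animate": "A", "inanimate": "I"}


def encode_ordinal(case=None, number=None, gender=None, animacy=None):
    """Build a 5-character ordinal tag; attributes left as None are written as 0."""
    fields = [(CASE, case), (NUMBER, number), (GENDER, gender), (ANIMACY, animacy)]
    codes = []
    for table, value in fields:
        if value is None:
            codes.append("0")
        elif value in table:
            codes.append(table[value])
        else:
            raise ValueError(f"unknown value {value!r}")
    return "Y" + "".join(codes)


if __name__ == "__main__":
    print(encode_ordinal("instrumental", "singular", "feminine"))
```

Leaving animacy unset yields the trailing 0 seen in YNSM0 and YCSF0, while the accusative masculine form carries it explicitly, as in YFSMI.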



Pronominal Adjective

Pos  Attribute                    Value                         Code
1    Category                     Pronominal adjective          R
2    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Animacy                      Animate                       A
                                  Inanimate                     I
6    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Part of a compound

Pos  Attribute                    Value                         Code
1    Category                     Compound part                 M
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Conjunction

Pos  Attribute                    Value                         Code
1    Category                     Conjunction                   C
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Interjection

Pos  Attribute                    Value                         Code
1    Category                     Interjection                  J
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
3    Sign of obscene language     Obscene                       H
                                  Non-obscene                   0



Numeral

Pos  Attribute                    Value                         Code
1    Category                     Numeral                       Z
2    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Animacy                      Animate                       A
                                  Inanimate                     I
6    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Particle

Pos  Attribute                    Value                         Code
1    Category                     Particle                      T
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Preposition

Pos  Attribute                    Value                         Code
1    Category                     Preposition                   B
2    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Noun

Pos  Attribute                    Value                         Code
1    Category                     Noun                          N
2    Subcategory                  Proper                        P
                                  Common                        C
3    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
4    Number                       Singular                      S
                                  Plural                        P
5    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
6    Animacy                      Animate                       A
                                  Inanimate                     I
7    Additional noun information  Geographical                  G
                                  Proper name                   N
                                  Patronymic                    S
                                  Surname                       F
8    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
9    Sign of obscene language     Obscene                       H
                                  Non-obscene                   0
10   Named entity class           Person                        P
                                  Place                         G
                                  Organization                  O
                                  Others                        V



Examples:



Word            Lemma           Code
скалолазанье    скалолазание    NCNSAI0000
скалолазании    скалолазание    NCOSAI0000
растении        растение        NCOSAI0000



скалолазание скалолазание NCNSAI0000 скалолазание NCFSAI0000

NCNSAI0000: N - noun, C - common, N - nominative, S - singular, A - neuter, I - inanimate;

NCFSAI0000: N - noun, C - common, F - accusative, S - singular, A - neuter, I - inanimate;


ретроградстве ретроградство NCOSAI0000

NCOSAI0000: N - noun, C - common, O - prepositional, S - singular, A - neuter, I - inanimate;


Иванову Иванов NP0000000P

NP0000000P: N - noun, P - proper, P - person
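The example lines above give a word form followed by one or more lemma and tag pairs, separated by spaces. A possible way to read such lines is sketched below in Python. The function names `parse_entry` and `named_entity_class` are hypothetical, and the line format is inferred from the examples rather than from a specification.

```python
# Named entity codes from position 10 of the noun table.
NE_CLASS = {"P": "Person", "G": "Place", "O": "Organization", "V": "Others"}


def parse_entry(line):
    """Split 'form lemma tag [lemma tag ...]' into (form, [(lemma, tag), ...])."""
    fields = line.split()
    if len(fields) < 3 or len(fields) % 2 == 0:
        raise ValueError(f"expected a form followed by lemma/tag pairs: {line!r}")
    form, rest = fields[0], fields[1:]
    return form, list(zip(rest[0::2], rest[1::2]))


def named_entity_class(tag):
    """Return the named entity class of a 10-position noun tag, or None if unset."""
    if len(tag) != 10 or tag[0] != "N":
        raise ValueError(f"not a noun tag: {tag!r}")
    return NE_CLASS.get(tag[9])


if __name__ == "__main__":
    form, analyses = parse_entry("Иванову Иванов NP0000000P")
    for lemma, tag in analyses:
        print(form, lemma, tag, named_entity_class(tag))
```

A form with two analyses, such as the nominative and accusative readings of скалолазание, comes back as a list of two pairs.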




Pronoun

Pos  Attribute                    Value                         Code
1    Category                     Pronoun                       E
2    Case                         Nominative                    N
                                  Genitive                      G
                                  Dative                        D
                                  Accusative                    F
                                  Instrumental                  C
                                  Prepositional                 O
                                  Partitive (2nd genitive)      P
                                  Locative (2nd prepositional)  L
                                  Vocative                      V
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Animacy                      Animate                       A
                                  Inanimate                     I
6    Person                       First                         1
                                  Second                        2
                                  Third                         3
7    Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E



Examples:

Form    Lemma    Tag
они     они      ENP0000
ними    они      ECP0000
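Because tags are positional, two forms of the same lemma can be compared position by position to see which attribute distinguishes them. The Python sketch below does this for pronoun tags; `PRONOUN_ATTRIBUTES` and `differing_attributes` are illustrative names, not part of the analyzer.

```python
# Attribute names for the seven positions of a pronoun tag, in table order.
PRONOUN_ATTRIBUTES = ["Category", "Case", "Number", "Gender",
                      "Animacy", "Person", "Additional Information"]


def differing_attributes(tag_a, tag_b):
    """Return (attribute, code_a, code_b) for every position where two pronoun tags differ."""
    expected = len(PRONOUN_ATTRIBUTES)
    if len(tag_a) != expected or len(tag_b) != expected:
        raise ValueError(f"pronoun tags have {expected} positions")
    return [(name, a, b)
            for name, a, b in zip(PRONOUN_ATTRIBUTES, tag_a, tag_b)
            if a != b]


if __name__ == "__main__":
    print(differing_attributes("ENP0000", "ECP0000"))
```

For the two example forms, only the case position differs: N (nominative) for они and C (instrumental) for ними.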



Verb

Pos  Attribute                    Value                         Code
1    Category                     Verb                          V
2    Mood                         Gerund                        G
                                  Infinitive                    I
                                  Indicative                    D
                                  Imperative                    M
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Tense                        Present                       P
                                  Future                        F
                                  Past                          S
6    Person                       First                         1
                                  Second                        2
                                  Third                         3
7    Aspect                       Perfective                    F
                                  Imperfective                  N
8    Voice                        Active                        A
                                  Passive                       S
9    Transitivity                 Transitive verb               M
                                  Intransitive verb             A
10   Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
11   Sign of obscene language     Obscene                       H
                                  Non-obscene                   0







Examples:

Word     Lemma       Code
сей      сеять       VMS000N0000
сею      сеять       VDS0F0N0000
нашей    нашивать    VMS00000000



сей сеять VMS000N0000: V - verb, M - imperative, S - singular, N - imperfective

сею сеять VDS0F0N0000: V - verb, D - indicative, S - singular, F - future, N - imperfective

нашей нашивать VMS00000000: V - verb, M - imperative, S - singular
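The verb table also determines which characters may appear at each position, which allows a simple well-formedness check. The sketch below, in Python, builds a regular expression from the table; the names `VERB_CODES`, `build_pattern` and `is_valid_verb_tag` are assumptions for illustration, and the check says nothing about whether a combination of values is linguistically possible.

```python
import re

# Allowed codes per position, taken from the verb table; 0 is added for every
# attribute position to cover underspecified values.
VERB_CODES = ["V", "GIDM", "SP", "MFNC", "PFS", "123", "FN", "AS", "MA",
              "PDVRIABE", "H"]


def build_pattern(codes):
    """Compile a regex with one character class per tag position."""
    parts = [re.escape(codes[0])]            # position 1: the category letter
    for allowed in codes[1:]:
        parts.append("[" + allowed + "0]")   # listed codes or 0
    return re.compile("".join(parts))


VERB_RE = build_pattern(VERB_CODES)


def is_valid_verb_tag(tag):
    """Return True if every position of the tag holds a code listed for it."""
    return VERB_RE.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("VMS000N0000", "VDS0F0N0000", "VMS00000000", "VXS000N0000"):
        print(tag, is_valid_verb_tag(tag))
```

Using `fullmatch` rather than `match` also rejects tags that are too long, since the pattern must cover the whole string.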



Participle

Pos  Attribute                    Value                         Code
1    Category                     Participle                    Q
2    Mood                         Gerund                        G
                                  Infinitive                    I
                                  Indicative                    D
                                  Imperative                    M
3    Number                       Singular                      S
                                  Plural                        P
4    Gender                       Masculine                     M
                                  Feminine                      F
                                  Neuter                        N
                                  Common                        C
5    Tense                        Present                       P
                                  Future                        F
                                  Past                          S
6    Person                       First                         1
                                  Second                        2
                                  Third                         3
7    Aspect                       Perfective                    F
                                  Imperfective                  N
8    Voice                        Active                        A
                                  Passive                       S
9    Transitivity                 Transitive verb               M
                                  Intransitive verb             A
10   Additional Information       Transitions                   P
                                  Difficult formation of form   D
                                  Corrupted form                V
                                  Predicative                   R
                                  Spoken form                   I
                                  Uncommon word                 A
                                  Abbreviation                  B
                                  Outdated form                 E
11   Sign of obscene language     Obscene                       H
                                  Non-obscene                   0



Examples:



Word               Lemma           Code
мобилизованному    мобилизовать    QDSMSF00000
мобилизованных     мобилизовать    QGP0SFF0000
мобилизованный     мобилизовать    QMSMSF00000



мобилизованному мобилизовать QDSMS0F0000: Q - participle, D - indicative, S - singular, M - masculine, S - past, F - perfective

мобилизованных мобилизовать QGP0S0F0000: Q - participle, G - gerund, P - plural, S - past, F - perfective

мобилизованный мобилизовать QMSMS0F0000: Q - participle, M - imperative, S - singular, M - masculine, S - past, F - perfective
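Taken together, the tables in this document fix the number of positions for each category, so tag length is a quick sanity check. The Python sketch below records those lengths, counted from the Pos column of each table; `TAG_LENGTH` and `has_expected_length` are hypothetical names.

```python
# Number of positions per category letter, counted from the tables in this document.
TAG_LENGTH = {
    "A": 9,   # Adjective
    "D": 4,   # Adverb
    "P": 2,   # Pronominal adverb
    "Y": 5,   # Ordinal numeral
    "R": 6,   # Pronominal adjective
    "M": 2,   # Part of a compound
    "C": 2,   # Conjunction
    "J": 3,   # Interjection
    "Z": 6,   # Numeral
    "T": 2,   # Particle
    "B": 2,   # Preposition
    "N": 10,  # Noun
    "E": 7,   # Pronoun
    "V": 11,  # Verb
    "Q": 11,  # Participle
}


def has_expected_length(tag):
    """Return True if the tag's length matches the table for its category letter."""
    return bool(tag) and TAG_LENGTH.get(tag[0]) == len(tag)


if __name__ == "__main__":
    for tag in ("AGP00F000", "YNSM0", "NP0000000P", "ENP0000", "QDSMS0F0000"):
        print(tag, has_expected_length(tag))
```

A check like this catches truncated or padded tags but does not validate the individual codes.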