Workpackage 5: Neo-Latin Morphological
Analyzer
Year 2 Progress Reports
June 2003 - May 2004
Executive Summary
Accomplishments
During the second year of work, WP5 continued developing a new version of the Latin morphological analyser LEMLAT, which adds new information about the input wordforms. A demo of this new version (named CHLT LEMLAT) and the source code of the program are available on the CHLT website (http://www.chlt.org).
In particular, the following results have been achieved in the development of CHLT LEMLAT:
addition of gender codes to the LES belonging to ambiguous morphological categories;
coding of SF;
modifications to the LES archive, caused by problems in adding gender codes and coding SF;
design and implementation of a MySQL database for the management of the LES archive;
implementation of the algorithm for the complete morphological analysis of wordforms with structure LES + SF;
implementation of some procedures for the use of the database by LEMLAT modules;
implementation of some applications for specific handling of the information contained in the LES archive;
building a client version of LEMLAT;
building and testing LEMLAT for the LINUX platform;
reorganising the LEMLAT source code;
coding of the SM and management of wordforms with structure LES+SM+SF;
coding of FE;
coding as FE of the adjective wordforms ending in -um, -i, -o, -a that are used as adverbs;
coding of the N, V, PR LES;
coding of the P1-P9 and P18 LES;
management of I LES;
coding of the Type of the non-f Type adjectives;
FE management;
design of a general rule for the management of LE; in particular, some special rules have been designed to resolve the conflict between SF coding and the LE rule and to handle some exceptions;
tables for initial graphical variations and post-final segments;
identification of the morphological values to be attributed to the LE of COD LES N6*, N7* and Pluralia Tantum;
management of N, V, PR, P1-P9 and P18 LES;
management of irregular Type;
management of present participles, past participles, future participles and irregular gerundives;
output reorganisation and XML format;
writing of four articles;
implementation of a library in standard C that can be linked into several applications;
implementation of an interactive console application;
implementation of a CGI application currently running on the CHLT website at the WP5 page (http://www.chlt.org/~cnr/);
implementation of an application for the morphological analysis of a text, producing output in a predefined format suitable both for display and for further processing;
testing of CHLT LEMLAT lemmatization results: a number of Latin texts (and/or single wordforms) will be submitted to CHLT LEMLAT. The results will be checked analytically in order to find possible mistakes;
source code testing and validation;
source code documentation, in order to help developers build specific applications;
definition and documentation of CHLT LEMLAT results: production of a TEI-compliant DTD specifically designed with morphological elements and attributes;
implementation of a web-based application for the morphological analysis of texts or text fragments.
Insights for the future (CHLT
2?)
The aim should be the development of a multi-module tool (that is, a tool combining several distinct modules) that allows the user to query a corpus of Latin texts.
We focused on Latin, but the same could be done for Old Norse and other languages.
The kinds of query we would like to be able to answer are, at least, the following:
on a purely morphological level: for instance, the user can retrieve all the first-declension nouns inflected as singular genitives in -ai that occur in the texts of Cicero. Homographs are not disambiguated;
on a morpho-syntactic level: homographs are disambiguated. The user can find out whether and where a particular kind of syntactic structure occurs in the texts of Cicero;
on a semantic level: the user searches for the word love (in English!) and obtains as an answer all the lemmas whose semantic definition contains love as first, second, third,... meaning, metaphorical use, technical use...;
statistics: each lemma is accompanied by its frequency of use in the corpus (structured per author, age, book, style of the book,...). Each wordform is linked to its morphological frequency (no disambiguation of homographs) and its morpho-syntactic frequency in the corpus. Each lemma is part of a "semantic family" (SF) and a "morphological family" (MF): an SF contains all the lemmas having a common meaning in the definition; an MF contains all the lemmas having a common stem in the stemming procedure;
Greek-Latin relationship through English: each Latin lemma is linked to the corresponding Greek lemma, selected through the common meaning in the dictionary.
The general structure of the analysis of a text is the following (the example concerns Latin, but applies to other languages as well):
1. Input Latin text (from the CHLT corpus),
2. Morphological analysis (CHLT LEMLAT),
3. Morpho-syntactic analysis (Stemming and Syntactic Parser),
4. Dictionary entry (lemma) with (a) statistical information, (b) structured semantic description (SF and MF) and (c) link to Greek dictionary.
Here is a possible list of the Workpackages, in no hierarchical order:
WP1: development of the current CHLT corpus of Latin texts (we need even more texts);
WP2: development of CHLT LEMLAT. We need:
o a wider lexical basis, in order to cover at least the medieval lexical extension and the proper names (Onomasticon),
o for the stemming, to reduce the number of LES by adding lists of affixes and, thus, rules of morphological derivation (for instance, a set of rules such as the one that creates adjectives in -bilis from verbs: amabilis);
WP3: a syntactic parser (to disambiguate the homographs);
WP4: extraction of statistical information from the CHLT corpus;
WP5: structuring of the semantic description of the lemmas in the dictionary and Greek-Latin linking.
The results of such a multi-module tool can be applied in a more general framework. In particular, the following fields seem to be the most suitable ones:
Education: e-learning,
Digital libraries: information retrieval from Latin texts in digital format,
Research: linguistics, lexicography, grammatical theories.
Year 1 Executive Summary
In the context of the CHLT project, the task of Workpackage 5 is to create a Neo-Latin Morphological Analyser. The people involved are Andrea Bozzi (senior researcher), Giuseppe Cappelli (senior technician), Marco Passarotti (young researcher) and Paolo Ruffolo (young researcher). From 1 September 2002 to 31 August 2003, the work done by Marco Passarotti was funded not by CHLT but by a CNR grant.
This document summarizes what has been described at length in the two Progress Reports produced by Workpackage 5 since the beginning of the project [1]: all the technical terms used here are explained in those two reports and are therefore not defined again in these pages.
In the first part of
this document, the main achievements of the project are summarized up to May
2003.
In the second part,
the next steps, to be achieved by the end of the second year of the project,
are described.
In the third part, the
dissemination of the project by WP5 (through articles, lessons, papers and
posters at congresses) is summarized.
The conclusion discusses the need for LEMLAT in the IST context and its prospects, in view of a further development of its analysis of Latin.
1. Main achievements of the project (up to May 2003)
From the beginning of the project to May 2003, the main achievements of Workpackage 5 are the following:
after an evaluation period, LEMLAT has been chosen as the automatic lemmatization tool to be developed for CHLT requirements;
the analysis of the input wordforms performed by LEMLAT has been studied, in order to find a way to add to the output the new morphological information required in the CHLT context; an algorithm has been written for the analysis of the wordforms with structure LES + SF;
the CHLT LEMLAT analysis required a number of pieces of information to be coded on the SF elements. The codes to be added to each SF have been decided according to the recommendations developed in the context of the international morpho-syntactic coding standard EAGLES. The list of the codes used for LEMLAT and the problems of using them in a purely morphological context can be read in the first Periodic Progress Report of Workpackage 5 (December 2002).
Choosing EAGLES as the coding standard for CHLT LEMLAT makes the resulting morphological analyser widely applicable: in the context of IST, LEMLAT can be a very useful tool for browsing and searching large Latin corpora, either on the web or in stand-alone tools.
the SF (endings) related to the nominal, adjectival, participial and verbal inflexions have been coded. This coding has been tested on the LEMLAT results. The total number of inserted codes is 27,144;
we started adding the gender codes to the LES belonging to ambiguous morphological categories. 10,812 codes have been inserted up to now (out of a total of 20,984);
the coding of the FE (exceptional forms) has been finished. 46,740 codes have been inserted (10 codes for each FE). The FE needed to be coded one by one, because the previous version of LEMLAT did not segment them and, as a consequence, the algorithm designed for the wordforms segmented as LES + SF could not be used;
the coding of the LES with COD LES N (indeclinable nouns), I (invariable lemmas), V (verbs of an unspecified conjugation), PR, P1, P2, P3, P4, P5, P6, P7, P8, P9, P18 (different kinds of pronouns) has been finished. 15,840 codes have been inserted;
the software handling of all the data related to the unsegmented wordforms (FE, N, I, V, PR, P1-P9, P18) is under construction;
a basic software system has been implemented in C and the results, produced on a specific benchmark, have been checked. Moreover, a MySQL database for the management of the LES archive has been designed and implemented;
we made some modifications to the LES archive. During the coding work we found that some previous coding decisions were unsuitable for the new functions (and formalism of analysis) of CHLT LEMLAT. Thus, we created new COD LES and designed new, morphologically homogeneous groups of LES. The most important and difficult modification concerns verbal conjugation: see details in the second Periodic Progress Report (March 2003);
we implemented a client version of LEMLAT;
we compiled and tested LEMLAT for
LINUX platform;
we reorganised the LEMLAT code: a C static library plus I/O structures and functions;
we added the SF table to the LEMLAT database and developed specific C functions to manage this table;
we modified SF information management in the LEMLAT code and implemented the algorithm for the complete morphological analysis of the wordforms with structure LES + SF;
we evaluated and used some tools for user-friendly management of the LES archive and of the LEMLAT database in general;
we built a CGI version of LEMLAT and tested it on both a LINUX server and a WINDOWS server. The URL where the current results of LEMLAT can be seen is: http://webilc.ilc.cnr.it/~ruffolo
The website is still very simple: on the home page, the user can type a Latin wordform to be analysed. Then, with a click on the Lemmatize button, the analysis starts and returns the results. The results of the lemmatization show the lemma(s) followed by the sequence of EAGLES codes giving the morphological value(s) of the analysed wordform. Since users may not know the meaning of these codes, each one of them is explicitly explained in a number of boxes. For each code the corresponding attribute and value are given.
In the near future, we will put on the home page:
since the current online version of CHLT LEMLAT is in Italian, a link to the English version;
a link to all the official documents (Reports, Deliverables) produced by Workpackage 5;
a link to the list of the EAGLES codes used in CHLT LEMLAT, each one explained;
since LEMLAT is an open-source tool, a link to all the documents related to the LEMLAT dictionary and to its source code. While the work on LEMLAT is still going on, access to these data will be limited to users having a password.
The current online version of CHLT LEMLAT covers the analysis of all the wordforms with the structure LES + SF.
CHLT LEMLAT does not yet cover the analysis of the following items:
Reason for the lack of coverage: the unsegmented wordforms have to be analysed with an algorithm different from the one used for the wordforms with structure LES + SF. The linguistic information on each unsegmented wordform has already been coded in some tables: what is still missing (and is now under development) is the software handling of all this information. The new analysis algorithm has been designed and will be tested in the coming months. The main steps of this algorithm are the following:
a. Receive as input a wordform: abaddier
b. LEMLAT analyses it (that is to say, lemmatizes it) with no segmentation: abaddier-
c. Search for this wordform in the table of the LES whose COD LES is one of the following: FE, V, N, I, P1-P9, P18
d. If no item is found, stop the analysis. If an item is found (abaddier is found in the table of the FE), attach to the output the codes related to that item: NcCnms-- (third declension noun, masculine, singular, nominative)
e. Keep reading the table where the item has been found and check whether there are other rows bearing the same item. If so, attach to the output the codes related to all the items found. This must be done because a wordform can be analysed in more than one way: in the tables where the morphological values of the unsegmented wordforms are recorded, there is one row for each value, with the coding of that value. For instance, the FE abaddier is recorded on two rows, because it has two different morphological values (NcCnms--: third declension noun, masculine, singular, nominative and NcCvms--: third declension noun, masculine, singular, vocative)
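The lookup in steps a to e can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the in-memory list, the row layout and the function name `analyse_unsegmented` are hypothetical, whereas CHLT LEMLAT itself is written in C and reads its tables from a MySQL database. The two rows for abaddier are the ones quoted in the text.

```python
# Hypothetical in-memory stand-in for the tables of unsegmented wordforms.
# Each row is (wordform, COD LES, code string); there is one row per value.
UNSEGMENTED_ROWS = [
    ("abaddier", "FE", "NcCnms--"),  # nominative reading
    ("abaddier", "FE", "NcCvms--"),  # vocative reading
]


def analyse_unsegmented(wordform, rows=UNSEGMENTED_ROWS):
    """Return the codes of every row recorded for the wordform.

    An empty list means that no item was found and the analysis stops.
    """
    return [codes for form, _cod_les, codes in rows if form == wordform]


print(analyse_unsegmented("abaddier"))  # ['NcCnms--', 'NcCvms--']
print(analyse_unsegmented("puellam"))   # []
```

Scanning all the rows, rather than stopping at the first match, is what lets a wordform with several morphological values receive all of them.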
Reason for the lack of coverage: the second-position code in the EAGLES standard concerns the Type of the PoS. For the Adjectives, this information must be added manually, in the dictionary, to each LES with PoS A (Adjective). The list of the Type values can be found in the first WP5 Periodic Progress Report (December 2002).
Reason for the lack of coverage: in the previous LEMLAT version, gerunds, gerundives and participles were all coded in the same category as the adjectives (N6 and N7). We coded the SF related to this category with PoS A (Adjective), Type to be defined, no Flexive Category, no Mood, no Tense, Case, Gender, Number.
However, gerunds, gerundives and participles need to have PoS V (verb), Type m (Main), and, in addition to Case, Gender, Number, also Flexive Category (depending on that of the lemma), Mood and Tense. In order to give this information on the output, we are writing an algorithm, whose fundamental steps are the following:
a. Receive as input a wordform: amatorum
b. LEMLAT recognises in it a LES, an SM (segmento mediano) and an SF: am-at-orum
c. LEMLAT creates two lemmas:
   a. a lemma N6 or N7: amatus N6
   b. a lemma V: amo v1
d. On the output:
   a. Paste the codes of Case, Gender, Number from those of the SF. SF orum N6:
      i. Genitive, plural, masculine
      ii. Genitive, plural, neuter
   b. Write the codes Vm (Verb, Main) in the first two positions
   c. Write the code of Flexive Category (third position), according to that of the lemma V: v1 means F (verb of the first conjugation)
   d. Write the codes of mood and tense, according to the SM appearing in the middle of the segmented wordform: at means k (passive participle), 4 (perfect)
Thus, the wordform
amatorum is analyzed as: Verb, main, I conjugation, passive participle,
perfect, genitive, masculine and neuter, plural
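The way the analysis of amatorum is assembled from the SF, the verbal lemma and the SM can be sketched in Python as follows. This is a minimal illustration, not the C implementation: the dictionaries, the attribute names and the function `analyse_participle` are assumptions, and only the values quoted in the text (orum N6, v1 as F, at as k and 4) are filled in.

```python
# Hypothetical lookup tables holding only the values quoted in the text.
SF_VALUES = {
    ("orum", "N6"): [
        {"case": "genitive", "number": "plural", "gender": "masculine"},
        {"case": "genitive", "number": "plural", "gender": "neuter"},
    ],
}
FLEXIVE_CATEGORY = {"v1": "F"}      # v1: verb of the first conjugation
SM_MOOD_TENSE = {"at": ("k", "4")}  # k: passive participle, 4: perfect


def analyse_participle(sm, sf, sf_code, verb_cod_les):
    """Build one analysis per Case/Gender/Number reading of the SF."""
    mood, tense = SM_MOOD_TENSE[sm]
    base = {
        "pos": "V",   # Verb
        "type": "m",  # Main
        "flexive_category": FLEXIVE_CATEGORY[verb_cod_les],
        "mood": mood,
        "tense": tense,
    }
    # Case, Gender and Number are pasted from the coding of the SF.
    return [{**base, **values} for values in SF_VALUES[(sf, sf_code)]]


for analysis in analyse_participle("at", "orum", "N6", "v1"):
    print(analysis)
```

The call returns two analyses, one masculine and one neuter, both genitive plural, which matches the result described for amatorum.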
Reason for the lack of coverage: we are testing algorithms such as the one described above for gerunds, gerundives and participles.
CHLT LEMLAT has been presented on the following occasions:
Exploratory workshop on Computer texts: documentation, linguistic analysis and interpretation, organized by the Standing Committee for the Humanities of the European Science Foundation (A. Bozzi and A. Raggioli, Strasbourg, 14-15/6/02)
XIV Round Table on Computer-aided Egyptology (A. Bozzi, Pisa, 8-10/7/02)
Seminars at the Classical Studies Dpt., Faculty of Letters, Lisbon University, on e-philology (A. Bozzi, Lisbon, 29-30/7/02)
Seminar on Progettare il digitale. Tecnologie per i beni librari: conservazione e fruizione in una biblioteca digitale (A. Bozzi, Firenze, 1/10/02)
International congress on Francesco Maurolico e le matematiche del Rinascimento: l'edizione critica dei testi scientifici e la sfida delle nuove tecnologie (A. Bozzi, Messina, 16-19/10/02)
Seminar on Gestione e fruizione di immagini digitali per le biblioteche e gli archivi, organized by Centro di Ateneo per le Biblioteche dell'Università degli Studi di Padova (A. Bozzi and A. Raggioli, Padova, 28/10/02)
Seminar at Istituto Nazionale di Studi sul Rinascimento (A. Bozzi and A. Raggioli, Firenze, 4/11/02)
Lesson at the Romanisches Seminar, Berlin Freie Universitaet (A. Bozzi and M.S. Corradini, Berlin, 27/11/02)
Lesson at the Computational Linguistics course, Milano, Università Cattolica del Sacro Cuore (M. Passarotti, 4/3/03)
International Colloquium on Antiguidade Clássica: Que fazer com este Património?, Universidade de Lisboa (G. Cappelli, M. Passarotti, 8/5/03)
Cappelli Giuseppe, Passarotti Marco, LemLat: uno strumento computazionale per l'analisi linguistica del latino. Sviluppo e prospettive, in Euphrosyne, Vol. XXXI, 2003
Poster at the XII International Colloquium on Latin Linguistics, Università di Bologna (M. Passarotti, June 2003)
Article in the proceedings of the International Colloquium on Antiguidade Clássica: Que fazer com este Património? (G. Cappelli, M. Passarotti, to be published in 2003)
At the end of its development, CHLT LEMLAT will be a very useful tool for analysing and filtering large Latin corpora covering a wide span of the history of the language.
There is, in fact, an urgent need to manage large corpora, in view of a new information society where users can access many documents online and, thus, need to filter their linguistic contents, first of all by lemmatizing them.
At the moment, no Latin lemmatizer can manage as many lemmas as CHLT LEMLAT could.
The most important point is that a powerful lemmatizer provides a powerful basis for a good syntactic disambiguator: this tool, receiving a text as input, reads the wordforms in their syntactic context and chooses the correct analysis of each wordform among those given by the lemmatizer. For instance, the wordform puella is analysed by the lemmatizer in three possible ways (noun, common, first declension, singular, feminine, nominative, vocative and ablative), but in a syntactic context only one of these values is correct: the task of a syntactic disambiguator will be to choose the correct one.
Other prospects for CHLT LEMLAT are:
A Latin lexical database, where statistical information, images and sounds (where possible), translation, etymology and syllable length are added to the lemmas;
Building homogeneous groups of lemmas, according to morphological relatedness (morphological families: tema) and semantic affinity (semantic families: semantema);
Adding an onomasticon and new lemmas to the dictionary.
Workpackage Progress Reports for Year 2
CHLT Project
IST-2001-32745
1 June - 31 August 2003
Workpackage 5: Neo-Latin Morphological Analyser
Istituto di Linguistica Computazionale C.N.R.
Andrea Bozzi
Giuseppe Cappelli
Marco Passarotti
Paolo Ruffolo
1. Summary of key indicators of project progress
1.1 Overall assessment of the main milestones achieved
1.1.1 Gender coding
1.1.2 SM coding
1.1.3 Adverbial use as FE
1.1.4 N, V, PR LES coding
1.1.5 P1-P9 and P18 LES coding
1.1.6 Management of I LES
1.1.7 Coding of Type
1.1.8 FE management
1.2 Problems encountered and decisions taken
1.2.1 Gender coding
1.2.2 SM coding
1.2.3 Adverbial use as FE
1.3 Correspondence between planned project progress and actual accomplishments
2. Work progress overview
2.1 Specific objectives for the reporting period
2.2 Achievements
2.2.1 List of Deliverables
2.2.2 Progress by Workpackage/task
2.3 Work planned for the next reporting period
3. Project Management
3.1 Contractual Issues
3.2 Co-operation within the consortium
3.3 Participation in workshops and/or conferences, publications
4. Technical annexes
4.1 Coding of N, V, PR LES
4.2 Coding of P1-P9, P18 LES
4.3 Coding of the non-f Type adjectives
4.4 SM coding
4.5 Adverbial use coding
1. Summary of key indicators of project progress
This report concerns the activities carried out by Workpackage 5 in the period from 1 June 2003 to 31 August 2003.
1.1 Overall assessment of the main milestones achieved
1.1.1 Gender coding
In the LES archive of the previous version of LEMLAT, lemmas of different genders could be coded as belonging to the same paradigmatic category, because gender was not part of the morphological information in the output: for example, both masculine nouns (nauta, pirata) and feminine nouns (rosa, absentia) belong to the category n1 (first declension nouns, masculine and feminine gender).
For the CHLT LEMLAT requirements, the gender information is needed in the output.
In the period covered by this report, we finished the gender coding: 19,600 gender codes have been applied.
1.1.2 SM coding
An SM (segmento mediano, "medial segment") is an element occurring in the middle part of a wordform[2]. Examples of SM are -and- in am-and-us, -ant- in am-ant-em, -ior- in pulchr-ior-em.
In the previous version of LEMLAT the SM archive was as follows:
SM     Left COD LES    Right ending code
ant    v1              n7
ans    v1              blk
and    v1              n6 n21 n2n
ent    v2 v3 v6        n7
ens    v2 v3 v6        blk
end    v2 v3 v6        n6 n21 n2n
und    v3 v6           n6 n21 n2n
ient   v4 v5           n7
iens   v4 v5           blk
iend   v4 v5           n6 n21 n2n
iund   v4 v5           n6 n21 n2n
ior    n6 n7 n7c       blk
The structure of the SM archive was the following:
a. SM;
b. COD LES compatible on the left with the SM. For instance, the SM ant can attach, on the left side, to LES with COD LES v1 (first conjugation verbs): an example is the wordform am-ant-em, where am is a LES with COD LES v1;
c. ending inflexion code(s) compatible on the right with the SM. For instance, the SM ant can attach, on the right side, to endings with code n7 (endings of the second class adjectives): in the wordform am-ant-em, em is an ending with code n7.
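The three-field structure of the SM archive lends itself to a simple compatibility check, sketched here in Python. The dictionary layout and the function name `sm_is_compatible` are assumptions made for illustration, and only three rows of the archive are reproduced.

```python
# Hypothetical excerpt of the SM archive:
# SM -> (COD LES allowed on the left, ending codes allowed on the right)
SM_ARCHIVE = {
    "ant": ({"v1"}, {"n7"}),
    "and": ({"v1"}, {"n6", "n21", "n2n"}),
    "ior": ({"n6", "n7", "n7c"}, {"blk"}),
}


def sm_is_compatible(les_cod, sm, ending_code, archive=SM_ARCHIVE):
    """Check that the SM can attach to the LES on its left and to the
    ending on its right."""
    if sm not in archive:
        return False
    left_codes, right_codes = archive[sm]
    return les_cod in left_codes and ending_code in right_codes


# am-ant-em: am is a LES with COD LES v1, em is an ending with code n7.
print(sm_is_compatible("v1", "ant", "n7"))  # True
print(sm_is_compatible("v2", "ant", "n7"))  # False
```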
In the period covered by this report, the SM elements have been coded, in order to analyse the input wordforms in which an SM is involved.
1.1.3 Adverbial use as FE
In addition to the regular values,
some adjective wordforms ending in -um, -i, -o, -a are also used as adverbs.
For instance, the wordform multo (lemma: multus) is a singular, masculine and
neuter dative and ablative, but is also used as an adverb.
In the period covered by this
report, we found a way to analyse wordforms such as multo also as adverbs.
1.1.4 N, V, PR LES coding
The LES with COD LES N, V and PR are wordforms analysed without segmentation. This implies that they cannot be analysed using the coding of the SF and/or SM as the source of the information needed in the output.
As for all the wordforms analysed without segmentation (for instance, the FE), the morphological values have been applied to each wordform individually.
Annex 1 reports the coding of the N, V, PR LES.
1.1.5 P1-P9 and P18 LES coding
The wordforms formed with a LES with COD LES P1, P2, P3, P4, P5, P6, P7, P8, P9, P18 are analysed with segmentation, but the information needed in the output cannot be supplied by the SF. In these cases the SF carries none of the morphological information needed: instead, that information is carried by the LES itself.
Here we are dealing with SF like libet (aliqui-libet), piam (quis-piam), cumque (qui-cumque), dam (qui-dam). Thus, in a wordform like quem(LES)-cumque(SF), the morphological values (masculine, singular, accusative) are supplied not by the SF cumque, but by the LES quem. This implies that these wordforms must be analysed differently from the other wordforms segmented LES+SF (puell-am), where the morphological values are supplied by the SF (am: feminine, singular, accusative).
In order to obtain the analysis of these wordforms, we coded the morphological values on each LES with COD LES P1, P2, P3, P4, P5, P6, P7, P8, P9, P18: in this way, the analysis of quemcumque results from the following steps:
Annex 2 reports the coding of the P1-P9 and P18 LES.
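One possible way to picture this LES-driven analysis is the Python sketch below. It is an assumption-laden illustration, not the procedure used in CHLT LEMLAT: the table, the attribute names and the function `analyse_pronoun` are hypothetical, and the only values filled in are those given in the text for quem.

```python
# Hypothetical table: the morphological values are coded on the LES itself.
PRONOUN_LES_VALUES = {
    "quem": {"gender": "masculine", "number": "singular", "case": "accusative"},
}
# SF of this kind carry no morphological information of their own.
INVARIABLE_SF = ("libet", "piam", "cumque", "dam")


def analyse_pronoun(wordform):
    """Split the wordform into LES + SF and read the values from the LES."""
    for sf in INVARIABLE_SF:
        if wordform.endswith(sf):
            les = wordform[: -len(sf)]
            if les in PRONOUN_LES_VALUES:
                return {"les": les, "sf": sf, **PRONOUN_LES_VALUES[les]}
    return None  # not a wordform of this kind, or LES not in the table


print(analyse_pronoun("quemcumque"))
```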
1.1.6 Management of I LES
The wordforms of LES with COD LES I (invariable; 1804 LES) are analysed without segmentation and automatically receive the codes X--------- (invariable).
Example:
Input wordform: abs
abs- is analysed with no segmentation
abs is found in the LES archive having the COD LES I
abs is lemmatized with lemma a
abs receives the morphological codes X---------
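The example above can be restated as a small Python sketch. The dictionary standing in for the LES archive and the function name `analyse_invariable` are assumptions; the lemma a and the code string are taken from the example.

```python
INVARIABLE_CODES = "X---------"

# Hypothetical excerpt of the LES archive: LES -> (COD LES, lemma)
LES_ARCHIVE = {
    "abs": ("I", "a"),
}


def analyse_invariable(wordform, archive=LES_ARCHIVE):
    """Analyse a wordform without segmentation as an invariable (I) LES."""
    entry = archive.get(wordform)
    if entry is None or entry[0] != "I":
        return None
    cod_les, lemma = entry
    return {"lemma": lemma, "codes": INVARIABLE_CODES}


print(analyse_invariable("abs"))  # {'lemma': 'a', 'codes': 'X---------'}
```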
1.1.7 Coding of Type
The Type of the adjectives whose Type is different from f (qualitative) has been added manually. Most adjectives are qualitative: this allowed us to apply this value automatically to all the adjectives. Then, in the list of adjectives, we manually coded those of non-f Type: they include possessive, numeral, personal and indefinite ones.
Annex 3 reports the coding of the non-f Type adjectives.
1.1.8 FE management
A specific table has been designed and implemented to store the information needed to lemmatize the FE. Each FE LES (i.e. each LES containing the value FE in the CODLES field) has been linked with one or more entries of the FE table.
We implemented a set of functions that retrieve the necessary information from the FE table and put it in the output data set. We took particular care to avoid outputting redundant information: some wordforms must get information both from the normal analysis and from the FE analysis.
Here is a list of the C
functions:
The style is similar to that of the other functions (and data structures) used to interact with the database, so that the code is easy to understand and modify.
We also verified that the management used for the FE wordforms can also be used for other LES (e.g. N, V, PR LES).
1.2 Problems encountered and decisions taken
1.2.1 Gender coding
In order to code the gender information, we marked all the nominal LES with gender codes. We did it manually for the nominal LES belonging to ambiguous categories (for example, n1), that is to say the ones including LES of more than one gender; conversely, we could assign the gender codes automatically to the nominal LES belonging to unambiguous categories, that is to say the ones including LES of one gender only (for example, n2n: second declension nouns, neuter gender).
In the period covered by this report, the gender coding has been finished. We applied 19,600 gender codes, belonging to the following ambiguous categories[3]:
n1 (first declension nouns, masculine and feminine gender)
5324 LES:
o 38: masculine and feminine
o 4982: feminine
o 304: masculine
n1e (first declension exceptional nouns, masculine and feminine gender)
829 LES:
o 7: masculine and feminine
o 463: feminine
o 359: masculine
n2 (second declension nouns, masculine, neuter and feminine gender)
2481 LES:
o 5: masculine and feminine
o 102: feminine
o 2365: masculine
o 9: neuter
n2e (second declension exceptional nouns, masculine, feminine and neuter gender)
1358 LES:
o 3: masculine, feminine and neuter
o 109: masculine and neuter
o 11: masculine and feminine
o 30: neuter and feminine
o 145: feminine
o 403: masculine
o 657: neuter
n3 (third declension nouns with plural genitive -um/-ium, masculine, feminine and neuter gender)
187 LES:
o 1: masculine, feminine and neuter
o 5: masculine and neuter
o 3: masculine and feminine
o 98: feminine
o 78: masculine
o 2: neuter
n31 (third declension nouns with plural genitive -um, masculine, feminine and neuter gender)
7289 LES:
o 1: masculine and neuter
o 62: masculine and feminine
o 4713: feminine
o 2511: masculine
o 2: neuter
n32 (third declension nouns with plural genitive -ium, masculine, feminine and neuter gender)
458 LES:
o 29: masculine and feminine
o 289: feminine
o 138: masculine
o 2: neuter
n3e (third declension exceptional nouns, masculine, feminine and neuter gender)
505 LES:
o 4: masculine and neuter
o 19: masculine and feminine
o 3: neuter and feminine
o 272: feminine
o 158: masculine
o 49: neuter
n4 (fourth declension nouns, masculine, feminine and neuter gender)
1056 LES:
o 1: masculine, feminine and neuter
o 2: masculine and neuter
o 7: masculine and feminine
o 32: feminine
o 1008: masculine
o 6: neuter
n5 (fifth declension nouns, masculine and feminine gender)
113 LES:
o 4: masculine and feminine
o 109: feminine
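As a consistency check, the per-gender figures listed above can be summed per category and overall. The short Python sketch below does so; the variable and function names are chosen here for illustration, and the numbers are copied from the list.

```python
# Per-gender counts for each ambiguous category, in the order listed above.
COUNTS = {
    "n1":  [38, 4982, 304],
    "n1e": [7, 463, 359],
    "n2":  [5, 102, 2365, 9],
    "n2e": [3, 109, 11, 30, 145, 403, 657],
    "n3":  [1, 5, 3, 98, 78, 2],
    "n31": [1, 62, 4713, 2511, 2],
    "n32": [29, 289, 138, 2],
    "n3e": [4, 19, 3, 272, 158, 49],
    "n4":  [1, 2, 7, 32, 1008, 6],
    "n5":  [4, 109],
}
# Number of LES stated for each category.
DECLARED = {
    "n1": 5324, "n1e": 829, "n2": 2481, "n2e": 1358, "n3": 187,
    "n31": 7289, "n32": 458, "n3e": 505, "n4": 1056, "n5": 113,
}


def check_totals(counts=COUNTS, declared=DECLARED):
    """Return the categories whose counts do not add up, and the grand total."""
    mismatches = [cat for cat in counts if sum(counts[cat]) != declared[cat]]
    return mismatches, sum(declared.values())


mismatches, total = check_totals()
print(mismatches, total)  # [] 19600
```

Every category adds up to its stated number of LES, and the ten categories together give the 19,600 gender codes reported.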
The unambiguous categories to which the gender code has been assigned automatically are the following:
n2i: second declension nouns, masculine gender,
n2n: second declension nouns, neuter gender,
n2ni: second declension nouns ending in -ium, neuter gender,
n3n: third declension nouns with plural genitive in -um/-ium, neuter gender,
n3n1: third declension nouns with plural genitive in -um, neuter gender,
n3n2: third declension nouns with plural genitive in -ium, neuter gender.
The gender codes we established so far are the following:
m: masculine
f: feminine
n: neuter
1: masculine and neuter
2: masculine and feminine
3: neuter and feminine
*: none[4]
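The split between automatic and manual gender assignment can be sketched as follows in Python. The mapping reproduces the unambiguous categories and gender codes listed above; the function `assign_gender` and its `manual_code` parameter are hypothetical names used only for this illustration.

```python
# Unambiguous categories and the single gender code each one implies.
UNAMBIGUOUS_GENDER = {
    "n2i": "m",
    "n2n": "n",
    "n2ni": "n",
    "n3n": "n",
    "n3n1": "n",
    "n3n2": "n",
}


def assign_gender(cod_les, manual_code=None):
    """Return the gender code for a nominal LES.

    Unambiguous categories get their code automatically; for ambiguous
    ones (e.g. n1) the manually assigned code is passed through.
    """
    if cod_les in UNAMBIGUOUS_GENDER:
        return UNAMBIGUOUS_GENDER[cod_les]
    return manual_code


print(assign_gender("n2n"))                   # n
print(assign_gender("n1", manual_code="m"))   # m (e.g. nauta)
```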
1.2.2 SM coding
The input wordforms segmented by LEMLAT with the structure LES+SM+SF (am-and-us) are analysed by CHLT LEMLAT by combining the information carried by the SM with that carried by the SF occurring in the input: this means that, in the output, some information comes from the coding of the SM and some from the coding of the SF.
For instance, in the wordform amandus, segmented am(LES)-and(SM)-us(SF), the SM and carries the information about the PoS (verb), Type (main), Flexive Category (first) and Mood (gerundive), while the SF us carries the information about Case (nominative), Gender (masculine) and Number (singular). The sum of this information is the resulting analysis of amandus.
In order to code on each SM which positions have to be filled with the information coming from the SF, the code = has been used. The code = means: in the final analysis of the input wordform, the code that must appear in this position comes from the coding of the SF occurring in that wordform.
For instance, the steps performed by CHLT LEMLAT for the analysis of the wordform amandus are the following:
-Input: amandus
-Lemma: amo
-Segmentation: am(LES)-and(SM)-us(SF)
-SF us n6 codes: Af---nms-1
-SM and v1/n6[5] codes: Vmfr-===--
-Resulting codes: Vmfr-nms--
-Codes conversion: Verb, Main, I Conjug., Gerundive, Nomin., Masc., Sing.
The SM coding file is reported in technical annex 4.
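The role of the = code can be illustrated with a few lines of Python. This is a sketch of the merging rule as described in the text, not the C code of CHLT LEMLAT; the function name `merge_codes` is an assumption, while the two code strings are the ones given for amandus.

```python
def merge_codes(sm_codes, sf_codes):
    """Merge position by position: where the SM coding has '=', take the
    code found at the same position in the SF coding."""
    if len(sm_codes) != len(sf_codes):
        raise ValueError("SM and SF code strings must have the same length")
    return "".join(
        sf if sm == "=" else sm
        for sm, sf in zip(sm_codes, sf_codes)
    )


# amandus: SM 'and' coded Vmfr-===--, SF 'us' (n6) coded Af---nms-1
print(merge_codes("Vmfr-===--", "Af---nms-1"))  # Vmfr-nms--
```

Positions of the SF coding that are not marked with = on the SM, such as the leading A and the final 1, are simply ignored, which is why the result is a verbal analysis.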
1.2.3 Adverbial use as FE
In order to analyse wordforms such
as multo also as
adverbs, we could not code this value on the SF: in the case of multo, in fact, if the value adverb had
been coded on the SF o, we would have had this value on the output analysis of all the input
first class adjectives ending in o. For instance, the adjective pulchro would have been analysed also as an
adverb.
Thus, following the dictionaries, we coded as FE (exceptional wordforms) the
adjective wordforms ending in -um, -i, -o, -a that are used as adverbs.
In this way, when it receives in input a wordform such as multo, LEMLAT
outputs both the regular values coming from the SF coding and the adverb
value coming from the coding of multo as FE.
The wordforms involved are 166 in total: they are reported in the technical
annexe 5.
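The effect of coding these wordforms as FE can be illustrated with a small
Python sketch. It is not the LEMLAT implementation: the FE_TABLE dictionary,
the analyse function and all the codes used below are placeholders chosen for
illustration (in particular, the adverb code Rg-------- and the adjective
codes are assumptions, not LEMLAT output).

```python
# Hypothetical FE table: each exceptional wordform is listed individually,
# together with the extra codes that apply to that wordform only.
FE_TABLE = {
    "multo": ["Rg--------"],  # adverb value; the code is an assumption
}


def analyse(wordform, regular_codes):
    """Return the regular SF-based codes plus any FE codes of this wordform."""
    return list(regular_codes) + FE_TABLE.get(wordform, [])


# The regular codes are illustrative placeholders for the analyses that the
# SF -o would yield for a first class adjective.
regular = ["Af---dms-1", "Af---bms-1"]

print(analyse("multo", regular))    # regular codes plus the adverb code
print(analyse("pulchro", regular))  # regular codes only
```

Because the adverb value is attached to the wordform and not to the SF,
pulchro receives only its regular analyses.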
1.3 Correspondence between planned project progress and actual accomplishments
The progress made in Workpackage 5 in the period from 1st June 2003 to 31st
August 2003 is in line with what was planned in the Project Program.
In particular, it consists of the following:
adding of the gender codes to the LES belonging to ambiguous morphological categories (finished);
coding of the SM and management of the wordforms with structure LES+SM+SF (finished);
coding as FE of the adjective wordforms ending in -um, -i, -o, -a that are used as adverbs (finished);
coding of the N, V, PR LES (finished);
coding of the P1-P9 and P18 LES (finished);
management of I LES;
coding of the Type of the no f Type adjectives (finished);
FE management.
2. Work progress overview
2.1 Specific objectives for the reporting period
During the period covered by this report, we continued the development of
LEMLAT into CHLT LEMLAT, following four paths:
i. FE management
ii. N, V, PR LES coding
iii. P1-P9 and P18 LES coding
iv. I LES management
In addition, we decided to handle the analysis of the adjective wordforms
ending in -um, -i, -o, -a that are used as adverbs by coding them as FE.
2.2 Achievements
2.2.1 List of Deliverables
December, 2002: Periodic Progress Report
March, 2003: Periodic Progress Report
June, 2003: D 5.1
2.2.2 Progress by Workpackage/task
In line with the targets set, the work carried out in Workpackage 5 during
the period covered by this report has produced the following results:
adding of the gender codes to the LES belonging to ambiguous morphological categories (finished);
coding of the SM and management of the wordforms with structure LES+SM+SF (finished);
coding as FE of the adjective wordforms ending in -um, -i, -o, -a that are used as adverbs (finished);
coding of the N, V, PR LES (finished);
coding of the P1-P9 and P18 LES (finished);
management of I LES;
coding of the Type of the no f Type adjectives (finished);
FE management;
preliminary study for the management of N, V, PR, P1-P9 and P18 LES.
2.3 Work planned for the next reporting period
The work planned for the next reporting period is the following:
implementation of new LE rules;
testing the lemmatization results for the wordforms with structure LES + SF;
testing the lemmatization results for the FE;
management of N, V, PR, P1-P9 and P18 LES;
management of the lemmatization of the wordforms with structure LES + SM + SF.
Poster at the XII International Colloquium on Latin Linguistics, Università
di Bologna (M. Passarotti, June 2003). The publication of an article in the
Conference Proceedings is forthcoming.
Technical annexes
4.1 Coding of N, V, PR LES
Structure:
aduosem
NcP---m---
n
aerizon
NcP---n---
n
agma
NcP---n--- n
aigleucos
NcP---n---
n
alpum
AfP-------
n
ambrices
NcP-------
n
ami
NcP---n---
n
ammi
NcP---n---
n
amosio
NcP-------
n
anacampseroten
NcP---f---
n
apemphaenonta
NcP---n---
n
appatula
NcP---n---
n
aser
NcP-------
n
assyr
NcP-------
n
assir
NcP-------
n
astu
NcP---n---
n
asty
NcP---n---
n
alipes
AfP-------
n
aedoeon
NcP---n---
n
baripe
NcP-------
n
barripe
NcP-------
n
basileus
NcP---m---
n
billis
NcP------- n
blepharon
NcP---n---
n
boloe
NcP-------
n
borith
NcP---n---
n
bostar
NcP---n---
n
bradys
AfP-------
n
buceras
NcP---n---
n
bugenes
AfP-------
n
beta
NcP-------
n
ballo
NcP-------
n
caccitus
NcP-------
n
calasis
NcP-------
n
calathoides
NcP-------
n
canaster AfP------- n
canentas
NcP-------
n
cappa
NcP-------
n
cappari
NcP-------
n
carensis
NcP-------
n
carnicis
NcP-------
n
carphos
NcP---n--- n
casamo
NcP-------
n
catampo
NcP-------
n
catharmoe
NcP---m---
n
cauillibus
NcP---1---
n
causodes
AfP-------
n
cedo
VmNe1--s2-
n
cette
VmNe1--p2- n
cemos
NcP---f---
n
centum
AnP-------
n
ceratoides
AfP-------
n
cerceris
NcP-------
n
cercolopis
NcP---m---
n
chalcan
NcP---f---
n
chariton
NcP-------
n
chi
NcP-------
n
chiliophyllon
NcP---n---
n
chus
NcP---m---
n
cici
NcP---n---
n
ciprus
AfP-------
n
circumpediles
NcP-------
n
cirritudo
NcP-------
n
cnasonas
NcP-------
n
conflages
NcP-------
n
confluges
NcP-------
n
confrages
NcP-------
n
coniuglae
NcP-------
n
consubigo
NcP-------
n
contrarete
NcP---m---
n
conuiuiones
NcP-------
n
coppa
NcP-------
n
koppa
NcP-------
n
cortizones
NcP---m---
n
cuccuru
I---------
n
cucurru
I---------
n
cuci
NcP---n---
n
cumma
NcP---n---
n
gumma
NcP---n---
n
cummi
NcP---n---
n
commi
NcP---n---
n
gummi NcP---n--- n
cumulter
AfP-------
n
cumalter
AfP-------
n
cunnuliggeter
NcP---m---
n
curmi
NcP---n---
n
cusanies
--P-------
n
cusuc
NcP------- n
cylindroides
AfP-------
n
consentia
AfP-------
n
coecas
NcP---n---
n
dagnades
NcP-------
n
daliuus
AfP-------
n
damnas
AfP-------
n
dasios NcP------- n
decem
AnP-------
n
dece
AnP-------
n
decim
AnP-------
n
dec
Y---------
n
dekem
Y---------
n
dicimbr
Y---------
n
decotes
NcP-------
n
decussissexis
NcP---m---
n
def
Y---------
n
delta
NcP-------
n
dendrites
NcP---m---
n
diaartymaton
NcP---n---
n
diabotanon NcP---n--- n
diacochlecon
NcP---n---
n
diacodion
NcP---n---
n
diacopraegias
NcP-------
n
diaeteon
NcP---n---
n
dialectrum
NcP---n---
n
dialepidos
NcP-------
n
dialibanon
NcP---n---
n
diamannae
NcP-------
n
diameliton
NcP---n---
n
diamelitoton
NcP---n---
n
diamirton
NcP---n---
n
diamisyos
NcP-------
n
diamoron
NcP---n---
n
diaoriganon
NcP---n---
n
diapanton
NcP---n---
n
diapason
NcP-------
n
diapeganon
NcP---n---
n
diapente
NcP-------
n
diarhodon
NcP---n---
n
diasampsuchum NcP---n--- n
diascammonias NcP---n--- n
diasmyrnes
NcP---n---
n
diasmyrnon
NcP---n---
n
diaspermaton
NcP---n---
n
diasteaton
NcP---n---
n
diatessaron
NcP---n---
n
diatheon
NcP---n---
n
dicis
--P-------
n
dieseptumei
L---------
n
dienoni
L---------
n
dienone
L---------
n
diequarti
L---------
n
diequarte L--------- n
diequinti
L---------
n
diequinte
L---------
n
dieseptumi
L---------
n
dieseptume
L---------
n
digamma
NcP-------
n
diox
NcP-------
n
disdiapason
NcP---n---
n
disdiapente
NcP-------
n
disdiatessaron
NcP---n---
n
disice
NcP-------
n
dodran
NcP---m---
n
drabe
NcP-------
n
duapondo NcP------- n
ducentum
AnP-------
n
duodeciaere
NcP-------
n
duodecim
AnP-------
n
duodenonaginta
AnP-------
n
duodeoctoginta
AnP-------
n
duodequadraginta AnP------- n
duodequinquaginta AnP------- n
duodesexaginta
AnP-------
n
duodetriginta
AnP-------
n
duodeuiginti
AnP-------
n
duouiginti
AnP-------
n
echeon
NcP---n---
n
echo
NcP---f---
n
echon
NcP---f---
n
echus
NcP---f---
n
elacatena
NcP-------
n
embryo
NcP-------
n
exfir
NcP-------
n
exspes
AfP-------
n
falado NcP------- n
fas
NcP---n---
n
frit
NcP---n---
n
galeopsis
NcP---f---
n
gamma
NcP-------
n
gau
NcP-------
n
gau
Y---------
n
genesalia
NcP---n---
n
git
NcP---n---
n
gitti
NcP---n---
n
gomor
NcP---n---
n
gugga
NcP-------
n
alieus
NcP---m---
n
harma NcP---n--- n
heluacea
NcP-------
n
hepatizon
NcP---n---
n
hidros
NcP-------
n
hin
NcP---n---
n
hippeus
NcP---m---
n
hosticapas
NcP------- n
hyphen
NcP---n---
n
heroion
NcP---n---
n
idos
NcP---n---
n
ignia
NcP---n---
n
impetix
NcP-------
n
impuges
AfP-------
n
iniussu NcP---m--- n
innox
AfP-------
n
instar
NcP---n---
n
iota
NcP-------
n
ipsilles
NcP-------
n
ipsullices
NcP-------
n
ipsulices
NcP-------
n
isoetes
NcP---n---
n
iuges
NcP---m---
n
infas
NcP---n---
n
inuocatu
NcP---m---
n
labda
NcP-------
n
lace
NcP-------
n
lathyr
NcP---f---
n
lax
NcP-------
n
lecticalis
NcP-------
n
lepton
AfP---n---
n
lessus
NcP-------
n
ligisticum
NcP-------
n
lix
NcP-------
n
longodes
NcP---m---
n
lycophon
NcP---n---
n
macir
NcP---n---
n
maenomenon
AfP---n---
n
mane
NcP---n---
n
mani
NcP---n---
n
marspedis
NcP-------
n
maspedis
NcP-------
n
melander
NcP---m---
n
melichrus
AfP-------
n
menceps
AfP-------
n
menoides
NcP-------
n
menui
NcP-------
n
mictyris
NcP---f--- n
mictiris
NcP---f---
n
milingior
AfP-------
n
mille
AnP-------
n
meile
AnP-------
n
milli
AnP-------
n
min
Pq---a-s3- n
minyanthes
NcP---n---
n
mixcix
AfP-------
n
momar
NcP---n---
n
monoides
AfP-------
n
mucul
NcP-------
n
muger
NcP---m---
n
mulc NcP------- n
multaciam
NcP-------
n
my
NcP-------
n
mydriasis
NcP---f---
n
myrmecias
NcP---m---
n
myrmecitis
NcP---f---
n
myrobreches
AfP-------
n
manna
NcP---n---
n
nabus
NcP---m---
n
necesse
Rg--------
n
necessus
Rg--------
n
necessis
Rg--------
n
necessum
Rg--------
n
necesus
Rg-------- n
nefas
NcP---n---
n
negritu
NcP-------
n
nonaginta
AnP-------
n
nonuncium
NcP---n---
n
nouemdecim
AnP-------
n
nugas
NcP-------
n
nus
NcP------- n
ny
NcP-------
n
nequam
AfP-------
n
nouem
AnP-------
n
ocimoides
NcP---n---
n
octo
AnP-------
n
octodecim
AnP-------
n
octoginta
AnP-------
n
octaginta
AnP-------
n
octuaginta
AnP-------
n
octuplex
AfP-------
n
octus
NcP---m---
n
oenobreches
NcP-------
n
ophis NcP---f--- n
opunculo
NcP-------
n
oscinum
AfP-------
n
osiritis
NcP---f---
n
osyris
NcP---f---
n
oliorum
Rg--------
n
pa
Y---------
n
paedeutes
NcP---m---
n
pandemus
NcP-------
n
pandex
NcP-------
n
passales
NcP-------
n
pathos
NcP---n---
n
pax
I---------
n
pectuscum NcP------- n
pedulla
NcP-------
n
pentameris
AfP-------
n
pernecesse
AfP---n---
n
persillum
NcP-------
n
pertermine
NcP---n---
n
pescia
NcP---n--- n
phascolia
NcP---n---
n
phascola
NcP---n---
n
phaunos
NcP---n---
n
phu
NcP---n---
n
pilates
NcP-------
n
pitpit
--P-------
n
platys
AfP------- n
po
Y---------
n
polteo
AfP-------
n
poricino
NcP-------
n
potami
NcP-------
n
primigenes
NcP-------
n
priuiclioes
AfP-------
n
procos
Y---------
n
prodius
Rg-------2
n
prologumene
AfP-------
n
promeriom
NcP---n---
n
propin
NcP---n---
n
propromptu
NcP---m---
n
propteruia
NcP-------
n
proqu
Y---------
n
prugnum
AfP-------
n
psagdas
NcP---m---
n
psi
NcP-------
n
psin
NcP-------
n
psolocopumai
VmP-------
n
pullinurux
NcP-------
n
purimenstrio
NcP-------
n
pus
X---------
n
peramus
NcP---f---
n
perramus
NcP---f---
n
pagos
NcP-------
n
quadraginta
AnP-------
n
quarranta
AnP-------
n
quatrussis
NcP---m---
n
quattuor
AnP-------
n
quator
AnP-------
n
quatuor
AnP-------
n
quattuordeciaere NcP---n--- n
quattuordecim
AnP-------
n
quattus
NcP---m---
n
quadtus
NcP---m---
n
quindeciaere
NcP---n---
n
quinquaginta
AnP-------
n
quinque
AnP-------
n
quinques
NcP---m--- n
quindecim
AnP-------
n
retricibus
NcP-------
n
rho
NcP-------
n
rhodomeli
NcP---n---
n
rhox
NcP-------
n
rupitias
NcP-------
n
raca
NcP------- n
sacal
NcP---n---
n
sacrima
NcP-------
n
salsipotis
AfP-------
n
sarrapis
NcP-------
n
satan
NcP---m---
n
scordasti
NcP---f---
n
scortes
NcP-------
n
scultimidoni
NcP---m---
n
sedecim
AnP-------
n
sexdecim
AnP-------
n
seimitum
AfP-------
n
selas
NcP---n---
n
semialpha
NcP-------
n
semigomor
NcP-------
n
septem
AnP-------
n
septemdecim
AnP-------
n
septendecim
AnP-------
n
septuaginta
AnP-------
n
septaginta
AnP-------
n
septus
NcP-------
n
serps
NcP---2---
n
setim
NcP-------
n
sex
AfP-------
n
sexs
AnP-------
n
six
AnP-------
n
sexaginta
AnP-------
n
sexis
NcP-------
n
sibi
NcP-------
n
sibones
NcP-------
n
silatum
NcP-------
n
simpludiarea
AfP-------
n
sirpe
NcP---n---
n
socon NcP------- n
sopia
NcP---n---
n
spagas
NcP-------
n
steroma
NcP-------
n
stic
Py--------
n
stomun
NcP---n---
n
subis
NcP---f---
n
subnimium
NcP-------
n
subsilles
NcP---f---
n
subtel
NcP---n---
n
secus
NcP---n---
n
tau
NcP-------
n
taenpoton
NcP-------
n
talam NcP------- n
tama
NcP-------
n
tangomenas
NcP-------
n
tengomenas
NcP-------
n
taos
NcP---m---
n
taratantara
I---------
n
tat
I---------
n
tatae
I---------
n
tau
NcP-------
n
tergenus
NcP---n---
n
thalassomeli
NcP---n---
n
thalassomel
NcP---n---
n
theta
NcP-------
n
tetates
NcP---m--- n
thoti
NcP-------
n
tomis
NcP---m---
n
topanta
NcP---n---
n
tapanta
NcP---n---
n
trachala
NcP---m---
n
trachy
AfP---n---
n
traulizi
VmP-------
n
trecentum
AnP-------
n
tredeciaere
NcP---n---
n
trepondo
NcP-------
n
trigenes
AfP-------
n
trigenus
AfP-------
n
triginta
AnP-------
n
trit
I---------
n
trygodes
NcP---n---
n
uernisera
NcP---n---
n
uicessis
NcP---m---
n
uicesis
NcP---m---
n
uigesis
NcP---m---
n
uigessis
NcP---m---
n
uiginti
AnP-------
n
biginti
AnP-------
n
ueiginti
AnP-------
n
uigenti
AnP-------
n
uiginta
AnP-------
n
uinti
AnP-------
n
uinciam
NcP-------
n
mane
NcP---n---
n
mani
NcP---n---
n
mane
Rg--------
n
mani
Rg--------
n
mille
NnP-------
n
milli NnP------- n
meile
NnP-------
n
stic
Rr--------
n
hypobrachys
NcC--nms--
n
hypobrachyn
NcC--ams--
n
uirops
NcP---f---
n
ullaber
NcP------- n
undecentum
AnP-------
n
undeciaere
NcP---n---
n
undecim
AnP-------
n
undenonaginta AnP------- n
undeoctoginta
AnP-------
n
undequadraginta
AnP-------
n
undequinquaginta AnP------- n
undesexaginta
AnP-------
n
undetriginta
AnP-------
n
undeuiginti
AnP-------
n
ungulatros
NcP-------
n
uoisgram
NcP-------
n
uolligo
NcP-------
n
urru
NcP---n---
n
uau
NcP-------
n
ubertumbis
NcP-------
n
zeta
NcP-------
n
zopyris
NcP-------
n
aliquot
P-P-------
pr
emem
P---------
pr
nescioquis
P4---nms--
pr
nescioqui
P2---dns--
pr
nescioquid
P4---nns--
pr
nescioquae
P2---nnp--
pr
nescioquod
P2---nns--
pr
nescioquoia
P4---nfs--
pr
nescioquot
PbP-------
pr
nescioquo
P2---bms--
pr
nescioqua
P2---bfs--
pr
quotquot
P2P-------
pr
quodquod
P2P-------
pr
quot
PbP-------
pr
tot PuP------- pr
unumquicquid
Pu---nns--
pr
unumquicquid
Pu---ans--
pr
nescioquis
P4---nfs--
pr
nescioquis
P4---bmp--
pr
nescioquis
P4---bnp--
pr
nescioquis
P4---bfp--
pr
nescioquis
P4---dmp--
pr
nescioquis
P4---dnp--
pr
nescioquis
P4---dfp--
pr
nescioquis
P2---bmp--
pr
nescioquis
P2---bnp--
pr
nescioquis
P2---bfp--
pr
nescioquis
P2---dmp--
pr
nescioquis
P2---dnp--
pr
nescioquis
P2---dfp--
pr
nescioqui
P2---dfs--
pr
nescioqui
P2---dms--
pr
nescioqui
P2---bns--
pr
nescioqui
P2---bms--
pr
nescioqui
P2---bfs--
pr
nescioqui
P2---nms--
pr
nescioqui
P2---nmp--
pr
nescioqui
P4---nfp--
pr
nescioqui
P4---nmp--
pr
nescioquid
P4---ans--
pr
nescioquae
P2---anp--
pr
nescioquae
P4---nnp--
pr
nescioquae
P4---anp--
pr
nescioquae
P2---nfp--
pr
nescioquae
P2---nfs--
pr
nescioquod
P2---ans--
pr
nescioquo P2---bns-- pr
nescioquo
P4---bms--
pr
nescioquo
P4---bfs--
pr
nescioquo
P4---bns--
pr
fendo
VmHa1--s1-
v
inipite
I---------
v
impite
I---------
v
oblucuuiasse
VmFg4-----
v
parret
VmGa1--s3-
v
perfines
VmHc1--s2-
v
praedotiont
Vm-----p3-
v
promellere
Vm-g1-----
v
prosagit
I---------
v
pubor Vm-b1--s1- v
pupior
Vm-b1--s1-
v
renancitur
Vm-b1--s3-
v
scilicet
I---------
v
sodes
I---------
v
uagurrit
VmHa1--s3-
v
ualitant
Vm-a1--p3-
v
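Each record of the listing above consists of three whitespace-separated
fields (wordform, morphological codes, LES category) that may be spread over
one or more lines. A minimal Python sketch for reading such a listing
follows; the function name parse_records is an assumption made here, and the
sketch relies on no field containing spaces.

```python
def parse_records(text, fields_per_record=3):
    """Group whitespace-separated tokens into fixed-size records."""
    tokens = text.split()  # splits on spaces and line breaks alike
    if len(tokens) % fields_per_record != 0:
        raise ValueError("the listing does not contain complete records")
    return [
        tuple(tokens[i:i + fields_per_record])
        for i in range(0, len(tokens), fields_per_record)
    ]


# Three records taken from the listing, in the layout found there.
sample = """aduosem
NcP---m---
n
agma
NcP---n--- n
fendo
VmHa1--s1-
v"""

for wordform, codes, category in parse_records(sample):
    print(wordform, codes, category)
```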
4.2 Coding of P1-P9, P18 LES
Structure:
iis
P3---dfp--
p1
idem
iis
P3---dnp--
p1
idem
iis
P3---bmp--
p1
idem
iis
P3---bfp--
p1
idem
iis
P3---bnp--
p1
idem
tantun
Pu---nns--
p1
tantusdem
tantum
Pu---ans--
p1
tantusdem
tantum
Pu---nns--
p1
tantusdem
tantae
Pu---dfs--
p1
tantusdem
tantae
Pu---nfp--
p1
tantusdem
tanta
Pu---bfs--
p1
tantusdem
tanta
Pu---nnp--
p1
tantusdem
tanta
Pu---anp--
p1
tantusdem
tanto
Pu---dms--
p1
tantusdem
tanto
Pu---bns--
p1
tantusdem
tanto
Pu---dns--
p1
tantusdem
tanti
Pu---gns-- p1
tantusdem
tanti
Pu---nmp--
p1
tantusdem
tantis
Pu---bmp--
p1
tantusdem
tantis
Pu---dfp--
p1
tantusdem
tantis
Pu---bfp--
p1
tantusdem
tantis
Pu---dnp--
p1
tantusdem
tantis
Pu---bnp--
p1
tantusdem
tantorun Pu---gnp-- p1
tantusdem
tantorum
Pu---gnp--
p1
tantusdem
ecqua
Pt---bfs--
p18
ecquinam
ecquis
Pt---dfp--
p18
ecquinam
ecquis
Pt---dnp--
p18
ecquinam
ecquis
Pt---bmp--
p18
ecquinam
ecquis
Pt---bfp--
p18
ecquinam
ecquis Pt---bnp-- p18
ecquinam
eccui
Pt---dfs--
p18
ecquinam
eccui
Pt---dns--
p18
ecquinam
ecquae
Pt---anp--
p18 ecquinam
ecquae
Pt---nfp--
p18
ecquinam
ecquae
Pt---nfs--
p18
ecquinam
eccuius
Pt---gfs--
p18
ecquinam
eccuius
Pt---gns--
p18
ecquinam
ecquibus
Pt---dfp--
p18
ecquinam
ecquibus
Pt---dnp--
p18
ecquinam
ecquibus
Pt---bmp-- p18
ecquinam
ecquibus
Pt---bfp--
p18
ecquinam
quoiius
Pt---gfs--
p18
quinam
ecquibus
Pt---bnp--
p18
ecquinam
ecquod
Pt---ans--
p18
ecquinam
ecquorum
Pt---gnp--
p18
ecquinam
ecqui
Pt---nms--
p18
ecquinam
ecquo
Pt---bns--
p18
ecquinam
cuiius
Pt---gfs--
p18
quinam
cuiius
Pt---gns--
p18
quinam
queis
Pt---dfp--
p18
quinam
queis
Pt---dnp--
p18
quinam
queis
Pt---bmp--
p18
quinam
queis
Pt---bfp--
p18
quinam
queis
Pt---bnp--
p18
quinam
quei
Pt---nms--
p18
quinam
cuii
Pt---dfs--
p18
quinam
cuii
Pt---dns--
p18
quinam
quoiius
Pt---gns--
p18
quinam
quius
Pt---gfs--
p18
quinam
quius
Pt---gns--
p18
quinam
quo
Pt---bns-- p18
quinam
quo
Rr--------
p18
quinam
quoii
Pt---dns--
p18
quinam
quoii
Pt---dfs--
p18 quinam
quod
Pt---ans--
p18
quinam
quoiei
Pt---dfs--
p18
quinam
quoiei
Pt---dns--
p18
quinam
quoi Pt---dfs-- p18
quinam
quoi
Pt---dns--
p18
quinam
quoi
Pt---nms--
p18
quinam
quoius
Pt---gns--
p18
quinam
quoius
Pt---gfs--
p18
quinam
quorum
Pt---gnp--
p18
quinam
quae
Pt---anp--
p18
quinam
quae Pt---nfp-- p18
quinam
quae
Pt---nfs--
p18
quinam
qua
Pt---bfs--
p18
quinam
qua
Pt---nfs--
p18
quinam
qui
Pt---nms--
p18
quinam
qui
Pt---nmp--
p18
quinam
qui
Pt---dms--
p18
quinam
qui
Pt---dns--
p18
quinam
qui
Pt---dfs--
p18
quinam
qui
Pt---bfs--
p18
quinam
qui
Pt---bms-- p18
quinam
qui
Pt---bns--
p18
quinam
quibus
Pt---dfp--
p18
quinam
quibus
Pt---dnp--
p18
quinam
quibus
Pt---bmp--
p18
quinam
quibus
Pt---bfp--
p18
quinam
quibus
Pt---bnp--
p18
quinam
quis
Pt---dfp--
p18
quinam
quis
Pt---dnp--
p18
quinam
quis
Pt---bmp--
p18
quinam
quis
Pt---bfp--
p18
quinam
quis
Pt---bnp--
p18
quinam
queius
Pt---gns--
p18
quinam
queius
Pt---gfs--
p18
quinam
cuius
Pt---gfs--
p18
quinam
cuius
Pt---gns--
p18
quinam
cui
Pt---dfs--
p18
quinam
cui
Pt---dns--
p18
quinam
coi
Pt---dfs--
p18
quinam
coi
Pt---dns--
p18
quinam
quibus
Pu---dfp--
p2
quidam
quibus Pu---dnp-- p2
quidam
quibus
Pu---bmp--
p2
quidam
quibus
Pu---bnp--
p2
quidam
quibus
Pu---bfp--
p2
quidam
qui
Pu---nmp--
p2
quidam
quod
Pu---ans--
p2
quidam
queius
Pu---gfs--
p2
quidam
queius
Pu---gns-- p2
quidam
queis
Pu---dfp--
p2
quidam
queis
Pu---dnp--
p2
quidam
queis
Pu---bmp--
p2 quidam
queis
Pu---bfp--
p2
quidam
queis
Pu---bnp--
p2
quidam
quorum
Pu---gnp--
p2
quidam
quoius
Pu---gfs--
p2
quidam
quoiius
Pu---gfs--
p2
quidam
coi
Pu---dfs--
p2
quidam
quoius
Pu---gns--
p2
quidam
quoiius
Pu---gns--
p2
quidam
coi
Pu---dns--
p2
quidam
cuius
Pu---gfs--
p2
quidam
cuius
Pu---gns--
p2
quidam
quae
Pu---nfp--
p2
quidam
quae
Pu---nnp--
p2
quidam
quae
Pu---anp--
p2
quidam
cuii
Pu---dfs--
p2
quidam
cuiius
Pu---gfs--
p2
quidam
cuii
Pu---dns--
p2
quidam
cuiius
Pu---gns--
p2
quidam
cui
Pu---dfs--
p2
quidam
cui
Pu---dns--
p2
quidam
quid
Pu---ans--
p2
quidam
quoii
Pu---dns--
p2
quidam
quoii
Pu---dfs--
p2
quidam
quo
Pu---bns--
p2
quidam
quoiei
Pu---dfs--
p2
quidam
quoi
Pu---dfs--
p2
quidam
quoiei
Pu---dns--
p2
quidam
quoi
Pu---dns-- p2
quidam
quoi
Pu---nms--
p2
quidam
quius
Pu---gfs--
p2
quidam
quius
Pu---gns--
p2
quidam
quis
Pu---dfp--
p2
quidam
quis
Pu---dnp--
p2
quidam
quis
Pu---bmp--
p2
quidam
quis Pu---bfp-- p2
quidam
quis
Pu---bnp--
p2
quidam
quius
Pu---gns--
p3
quispiam
queius
Pu---gns--
p3
quispiam
quius
Pu---gfs--
p3
quispiam
queius
Pu---gfs--
p3
quispiam
cuiius
Pu---gfs--
p3
quispiam
cuiius Pu---gns-- p3
quispiam
coi
Pu---dfs--
p3
quispiam
cui
Pu---dfs--
p3
quispiam
cuii
Pu---dfs--
p3 quispiam
coi
Pu---dns--
p3
quispiam
cui
Pu---dns--
p3
quispiam
cuii
Pu---dns--
p3
quispiam
quae
Pu---nfp--
p3
quispiam
quae
Pu---nnp--
p3
quispiam
quae
Pu---anp--
p3
quispiam
quoi
Pu---dfs-- p3
quispiam
quoii
Pu---dfs--
p3
quispiam
quoius
Pu---gfs--
p3
quispiam
quoi
Pu---dns--
p3
quispiam
quoii
Pu---dns--
p3
quispiam
quoius
Pu---gns--
p3
quispiam
quoi
Pu---nms--
p3
quispiam
quoiei
Pu---dfs--
p3
quispiam
quoiei
Pu---dns--
p3
quispiam
quorum
Pu---gfp--
p3
quispiam
quos
Pu---afp--
p3
quispiam
quorum
Pu---gnp--
p3
quispiam
quei
Pu---dns--
p3
quispiam
quei
Pu---dfs--
p3
quispiam
quoiius
Pu---gfs--
p3
quispiam
quoiius
Pu---gns--
p3
quispiam
quem
Pu---afs--
p3
quispiam
quibus
Pu---dfp--
p3
quispiam
quibus
Pu---dnp--
p3
quispiam
quibus
Pu---bmp--
p3
quispiam
quibus
Pu---bfp--
p3
quispiam
quibus Pu---bnp-- p3
quispiam
quid
Pu---ans--
p3
quispiam
quip
Pu---ans--
p3
quispiam
quod
Pu---ans--
p3
quispiam
queis
Pu---dfp--
p3
quispiam
queis
Pu---dnp--
p3
quispiam
queis
Pu---bmp--
p3
quispiam
queis Pu---bnp-- p3
quispiam
queis
Pu---bfp--
p3
quispiam
cuius
Pu---gns--
p3
quispiam
cuius
Pu---gfs--
p3
quispiam
quo
Pu---bfs--
p3
quispiam
quo
Pu---bns--
p3
quispiam
quo
Rr--------
p3
quispiam
quis Pu---nfs-- p3
quispiam
quis
Pu---dmp--
p3
quispiam
quis
Pu---dfp--
p3
quispiam
quis
Pu---dnp--
p3 quispiam
quis
Pu---bmp--
p3
quispiam
quis
Pu---bfp--
p3
quispiam
quis
Pu---bnp--
p3
quispiam
alicuius
Pu---gfs--
p3
aliquispiam
alicuius
Pu---gns--
p3
aliquispiam
alicui
Pu---dns--
p3
aliquispiam
alicui
Pu---dfs-- p3
aliquispiam
aliquos
Pu---afp--
p3
aliquispiam
aliquod
Pu---ans--
p3
aliquispiam
aliquis
Pu---nfs--
p3
aliquispiam
aliquis
Pu---dmp--
p3
aliquispiam
aliquis
Pu---dnp--
p3
aliquispiam
aliquis
Pu---dfp--
p3
aliquispiam
aliquis
Pu---bmp--
p3
aliquispiam
aliquis
Pu---bfp--
p3
aliquispiam
aliquis
Pu---bnp--
p3
aliquispiam
aliquo
Pu---bns--
p3
aliquispiam
aliquo
Pu---bfs--
p3
aliquispiam
aliquip
Pu---ans--
p3
aliquispiam
aliquem
Pu---afs--
p3
aliquispiam
aliquid
Pu---ans--
p3
aliquispiam
aliquae
Pu---anp--
p3
aliquispiam
aliquae
Pu---nfp--
p3
aliquispiam
aliquae
Pu---nfs--
p3
aliquispiam
quibus
Pu---dfp--
p4
quisquam
quibus
Pu---dnp--
p4
quisquam
quibus
Pu---bmp--
p4
quisquam
quibus Pu---bfp-- p4
quisquam
quibus
Pu---bnp--
p4
quisquam
quorum
Pu---gfp--
p4
quisquam
quorum
Pu---gnp--
p4
quisquam
quius
Pu---gfs--
p4
quisquam
quius
Pu---gns--
p4
quisquam
quoius
Pu---gfs--
p4
quisquam
quoius
Pu---gns-- p4
quisquam
quoiius
Pu---gfs--
p4
quisquam
quoiius
Pu---gns--
p4
quisquam
quoii
Pu---dfs--
p4 quisquam
quoii
Pu---dns--
p4
quisquam
quic
Pu---ans--
p4
quisquam
quos
Pu---afp--
p4
quisquam
quoiei
Pu---dfs--
p4
quisquam
alicui
Pu---dms--
p6
aliquilibet
alicuius
Pu---gms--
p6
aliquilibet
aliqua
Pu---nfs--
p6
aliquilibet
aliquae
Pu---nfs--
p6
aliquilibet
aliquam
Pu---afs--
p6
aliquilibet
aliquas
Pu---afp--
p6
aliquilibet
aliquem
Pu---ams--
p6
aliquilibet
aliqui
Pu---nms--
p6
aliquilibet
aliquid
Pu---nns--
p6
aliquilibet
aliquis
Pu---dmp--
p6
aliquilibet
aliquo
Pu---bms--
p6
aliquilibet
aliquod
Pu---nns--
p6
aliquilibet
aliquos
Pu---amp--
p6
aliquilibet
alicui
Pu---dms--
p3
aliquispiam
alicuius
Pu---gms--
p3
aliquispiam
aliqua
Pu---bfs--
p3
aliquispiam
aliquae
Pu---nnp--
p3
aliquispiam
aliquam
Pu---afs--
p3
aliquispiam
aliquas
Pu---afp--
p3
aliquispiam
aliquem
Pu---ams--
p3
aliquispiam
aliquid
Pu---nns--
p3
aliquispiam
aliquip
Pu---nns--
p3
aliquispiam
aliquis
Pu---nms--
p3
aliquispiam
aliquo
Pu---bms-- p3
aliquispiam
aliquod
Pu---nns--
p3
aliquispiam
aliquos
Pu---amp--
p3
aliquispiam
alicui
Pu---dms--
p5
aliquisuis
alicuius
Pu---gms--
p5
aliquisuis
aliqua
Pu---bfs--
p5
aliquisuis
aliquae
Pu---nfp--
p5
aliquisuis
aliquam Pu---afs-- p5
aliquisuis
aliquas
Pu---afp--
p5
aliquisuis
aliquem
Pu---ams--
p5
aliquisuis
aliqui
Pu---bms--
p5
aliquisuis
aliquid
Pu---nns--
p5
aliquisuis
aliquis
Pu---nms--
p5
aliquisuis
aliquo
Pu---bms--
p5
aliquisuis
aliquod Pu---nns-- p5
aliquisuis
aliquos
Pu---amp--
p5
aliquisuis
alicui
P2---dms--
p7
aliquicumque
alicuius
P2---gms--
p7 aliquicumque
aliqua
P2---bfs--
p7
aliquicumque
aliquae
P2---nfs--
p7
aliquicumque
aliquam
P2---afs--
p7
aliquicumque
aliquas
P2---afp--
p7
aliquicumque
aliquem
P2---ams--
p7
aliquicumque
aliqui
P2---nms--
p7
aliquicumque
aliquis
P2---dmp-- p7
aliquicumque
aliquo
P2---bms--
p7
aliquicumque
aliquod
P2---nns--
p7
aliquicumque
aliquos
P2---amp--
p7
aliquicumque
cuiusmodi
Rr--------
p7
cuiusmodicumque
eccui
Pt---dms--
p18
ecquinam
eccuius
Pt---gms--
p18
ecquinam
ecqua
Pt---nfs--
p18
ecquinam
ecquae
Pt---nnp--
p18
ecquinam
ecquam
Pt---afs--
p18
ecquinam
ecquarum
Pt---gfp--
p18
ecquinam
ecquas
Pt---afp--
p18
ecquinam
ecquem
Pt---amp--
p18
ecquinam
ecqui
Pt---nmp--
p18
ecquinam
ecquibus
Pt---dmp--
p18
ecquinam
ecquis
Pt---dmp--
p18
ecquinam
ecquo
Pt---bms--
p18
ecquinam
ecquod
Pt---nns--
p18
ecquinam
ecquorum
Pt---gmp--
p18
ecquinam
ecquos
Pt---amp--
p18
ecquinam
ea
P3---nfs--
p1
idem
eae
P3---nfp-- p1
idem
eam
P3---afs--
p1
idem
ean
P3---afs--
p1
idem
earum
P3---gfp--
p1
idem
earun
P3---gfp--
p1
idem
eas
P3---afp--
p1
idem
ei
P3---dms--
p1
idem
eis P3---dmp-- p1
idem
eius
P3---gms--
p1
idem
eo
P3---bms--
p1
idem
eorum
P3---gmp--
p1
idem
eorun
P3---gmp--
p1
idem
eos
P3---amp--
p1
idem
eum
P3---ams--
p1
idem
eun P3---ams-- p1
idem
i
P3---nms--
p1
idem
ii
P3---nmp--
p1
idem
iis
P3---dmp--
p1 idem
is
P3---dmp--
p1
idem
quandiu
Rr--------
p7
quandiucumque
cuando
Rr--------
p6
quandolibet
quando
Rr--------
p6
quandolibet
cuando
Rr--------
p7
quandocumque
quando
Rr--------
p7
quandocumque
cuicui
P2---dms-- p6
quisquislibet
quaequae
P2---nfp--
p6
quisquislibet
quaqua
P2---bfs--
p6
quisquislibet
quemquem
P2---ams--
p6
quisquislibet
quibusquibus
P2---dmp--
p6
quisquislibet
quicquid
P2---nns--
p6
quisquislibet
quidquid
P2---nns--
p6
quisquislibet
quiqui
P2---nms--
p6
quisquislibet
quisquis
P2---nms--
p6
quisquislibet
quodquod
P2---nns--
p6
quisquislibet
quoiquoi
P2---dms--
p6
quisquislibet
quoquo
P2---bms--
p6
quisquislibet
quosquos
P2---amp--
p6
quisquislibet
quomodo
Rr--------
p18
quomodonam
quomodo
Rr--------
p6
quomodolibet
quomodo
Rr--------
p7
quomodocumque
quot
Rr--------
p5
quotuis
quot
A-P-------
p6
quotlibet
quot
A-P-------
p7
quotcumque
quotiens
Rr--------
p6
quotienslibet
quoties
Rr--------
p6
quotienslibet
quotiens Rr-------- p7
quotienscumque
quoties
Rr--------
p7
quotienscumque
coi
Pu---dms--
p2
quidam
cui
Pu---dms--
p2 quidam
cuii
Pu---dms--
p2
quidam
cuiius
Pu---gms--
p2
quidam
cuius
Pu---gms--
p2
quidam
qua Pu---bfs-- p2
quidam
quae
Pu---nfs--
p2
quidam
quam
Pu---afs--
p2
quidam
quarum
Pu---gfp--
p2 quidam
quas
Pu---afp--
p2
quidam
quei
Pu---nms--
p2
quidam
queis
Pu---dmp--
p2
quidam
queius
Pu---gms--
p2
quidam
quem
Pu---ams--
p2
quidam
qui
Pu---nms--
p2
quidam
quibus
Pu---dmp--
p2
quidam
quid
Pu---nns--
p2
quidam
quis
Pu---dmp--
p2
quidam
quius
Pu---gms--
p2
quidam
quo
Pu---bms--
p2
quidam
quod
Pu---nns--
p2
quidam
quoi
Pu---dms--
p2
quidam
quoiei
Pu---dms--
p2
quidam
quoii
Pu---dms--
p2
quidam
quoiius
Pu---gms--
p2
quidam
quoius
Pu---gms--
p2
quidam
quorum
Pu---gmp--
p2
quidam
quos
Pu---amp--
p2
quidam
coi
Pu---dms--
p3
quispiam
cui
Pu---dms--
p3
quispiam
cuii
Pu---dms--
p3
quispiam
cuiius
Pu---gms--
p3
quispiam
cuius
Pu---gms--
p3
quispiam
qua
Pu---bfs--
p3
quispiam
quae
Pu---nfs--
p3
quispiam
quam
Pu---afs--
p3
quispiam
quarum
Pu---gfp-- p3
quispiam
quas
Pu---afp--
p3
quispiam
quei
Pu---dms--
p3
quispiam
queis
Pu---dmp--
p3
quispiam
queius
Pu---gms--
p3
quispiam
quem
Pu---ams--
p3
quispiam
quibus
Pu---dmp--
p3
quispiam
quid Pu---nns-- p3
quispiam
quip
Pu---nns--
p3
quispiam
quis
Pu---nms--
p3
quispiam
quius
Pu---gms--
p3 quispiam
quo
Pu---bms--
p3
quispiam
quod
Pu---nns--
p3
quispiam
quoi
Pu---dms--
p3
quispiam
quoiei Pu---dms-- p3
quispiam
quoii
Pu---dms--
p3
quispiam
quoiius
Pu---gms--
p3
quispiam
quoius
Pu---gms--
p3 quispiam
quorum
Pu---gmp--
p3
quispiam
quos
Pu---amp--
p3
quispiam
coi
Pu---dms--
p4
quisquam
cui
Pu---dms--
p4
quisquam
cuii
Pu---dms--
p4
quisquam
cuiius
Pu---gms--
p4
quisquam
cuius
Pu---gms-- p4
quisquam
quei
Pu---dms--
p4
quisquam
queis
Pu---dfp--
p4
quisquam
queius
Pu---gms--
p4
quisquam
quem
Pu---ams--
p4
quisquam
qui
Pu---bms--
p4
quisquam
quibus
Pu---dmp--
p4
quisquam
quic
Pu---nns--
p4
quisquam
quid
Pu---nns--
p4
quisquam
quis
Pu---nms--
p4
quisquam
quius
Pu---gms--
p4
quisquam
quo
Pu---bms--
p4
quisquam
quoi
Pu---dms--
p4
quisquam
quoiei
Pu---dms--
p4
quisquam
quoii
Pu---dms--
p4
quisquam
quoiius
Pu---gms--
p4
quisquam
quoius
Pu---gms--
p4
quisquam
quorum
Pu---gmp--
p4
quisquam
quos
Pu---amp--
p4
quisquam
coi
Pu---dms--
p5
quiuis
cui
Pu---dms--
p5
quiuis
cuii
Pu---dms-- p5
quiuis
cuiius
Pu---gms--
p5
quiuis
cuius
Pu---gms--
p5
quiuis
qua
Pu---bfs--
p5
quiuis
quae
Pu---nfs--
p5
quiuis
quam
Pu---afs--
p5
quiuis
quarum
Pu---gfp--
p5
quiuis
quas Pu---afp-- p5
quiuis
quei
Pu---dms--
p5
quiuis
queis
Pu---dmp--
p5
quiuis
queius
Pu---gms--
p5
quiuis
quem
Pu---ams--
p5
quiuis
qui
Pu---nms--
p5
quiuis
quibus
Pu---dmp--
p5
quiuis
quid Pu---nns-- p5
quiuis
quis
Pu---dmp--
p5
quiuis
quius
Pu---gms--
p5
quiuis
quo
Pu---bms--
p5 quiuis
quod
Pu---nns--
p5
quiuis
quoi
Pu---dms--
p5
quiuis
quoiei
Pu---dms--
p5
quiuis
quoii
Pu---dms--
p5
quiuis
quoiius
Pu---gms--
p5
quiuis
quoius
Pu---gms--
p5
quiuis
quorum
Pu---gmp-- p5
quiuis
quos
Pu---amp--
p5
quiuis
coi
Pu---dms--
p6
quilibet
cui
Pu---dms--
p6
quilibet
cuii
Pu---dms--
p6
quilibet
cuiius
Pu---gms--
p6
quilibet
cuius
Pu---gms--
p6
quilibet
qua
Pu---bfs--
p6
quilibet
quae
Pu---nfs--
p6
quilibet
quam
Pu---afs--
p6
quilibet
quarum
Pu---gfp--
p6
quilibet
quas
Pu---afp--
p6
quilibet
quei
Pu---dms--
p6
quilibet
queis
Pu---dmp--
p6
quilibet
queius
Pu---gms--
p6
quilibet
quem
Pu---ams--
p6
quilibet
ques
Pu---nmp--
p6
quilibet
qui
Pu---nms--
p6
quilibet
quibus
Pu---dmp--
p6
quilibet
quid
Pu---nns--
p6
quilibet
quis
Pu---dmp--
p6
quilibet
quius Pu---gms-- p6
quilibet
quo
Pu---bms--
p6
quilibet
quod
Pu---nns--
p6
quilibet
quoi
Pu---dms--
p6 quilibet
quoiei
Pu---dms--
p6
quilibet
quoii
Pu---dms--
p6
quilibet
quoiius
Pu---gms--
p6
quilibet
quoius
Pu---gms-- p6
quilibet
quorum
Pu---gmp--
p6
quilibet
quos
Pu---amp--
p6
quilibet
coi
P2---dms--
p7 quicumque
cui
P2---dms--
p7
quicumque
cuii
P2---dms--
p7
quicumque
cuiius
P2---gms--
p7
quicumque
cuius
P2---gms--
p7
quicumque
qua
P2---bfs--
p7
quicumque
quae
P2---nfs--
p7
quicumque
quam
P2---afs--
p7
quicumque
quarum
P2---gfp--
p7
quicumque
quas
P2---afp--
p7
quicumque
quei
P2---nms--
p7
quicumque
queis
P2---dmp--
p7
quicumque
queius
P2---gms--
p7
quicumque
quem
P2---ams--
p7
quicumque
ques
P2---nmp--
p7
quicumque
qui
P2---nms--
p7
quicumque
quibus
P2---dmp--
p7
quicumque
quis
P2---dmp--
p7
quicumque
quius
P2---gms--
p7
quicumque
quo
P2---bms--
p7
quicumque
quod
P2---nns--
p7
quicumque
quoi
P2---dms--
p7
quicumque
quoiei
P2---dms--
p7
quicumque
quoii
P2---dms--
p7
quicumque
quoiius
P2---gms--
p7
quicumque
quoius
P2---gms--
p7
quicumque
quorum
P2---gmp--
p7
quicumque
quos
P2---amp--
p7
quicumque
coi
Pu---dms-- p8
quiuiscumque
cui
Pu---dms--
p8
quiuiscumque
cuii
Pu---dms--
p8
quiuiscumque
cuiius
Pu---gms--
p8
quiuiscumque
cuius
Pu---gms--
p8
quiuiscumque
qua
Pu---bfs--
p8
quiuiscumque
quae
Pu---nfs--
p8
quiuiscumque
quam Pu---afs-- p8
quiuiscumque
quarum
Pu---gfp--
p8
quiuiscumque
quas
Pu---afp--
p8
quiuiscumque
quei
Pu---dms--
p8 quiuiscumque
queis
Pu---dmp--
p8
quiuiscumque
queius
Pu---gms--
p8
quiuiscumque
quem
Pu---ams--
p8
quiuiscumque
qui Pu---nms-- p8
quiuiscumque
quibus
Pu---dmp--
p8
quiuiscumque
quis
Pu---dmp--
p8
quiuiscumque
quius
Pu---gms--
p8 quiuiscumque
quo
Pu---bms--
p8
quiuiscumque
quod
Pu---nns--
p8
quiuiscumque
quoi
Pu---dms--
p8
quiuiscumque
quoiei
Pu---dms--
p8
quiuiscumque
quoii
Pu---dms--
p8
quiuiscumque
quoiius
Pu---gms--
p8
quiuiscumque
quoius
Pu---gms-- p8
quiuiscumque
quorum
Pu---gmp--
p8
quiuiscumque
quos
Pu---amp--
p8
quiuiscumque
coi
P2---dms--
p9
quisque
cui
P2---dms--
p9
quisque
cuii
P2---dms--
p9
quisque
cuiius
P2---gms--
p9
quisque
cuius
P2---gms--
p9
quisque
quei
P2---dms--
p9
quisque
queis
P2---dmp--
p9
quisque
queius
P2---gms--
p9
quisque
quem
P2---ams--
p9
quisque
qui
P2---bms--
p9
quisque
quibus
P2---dmp--
p9
quisque
quic
P2---nns--
p9
quisque
quid
P2---nns--
p9
quisque
quis
P2---nms--
p9
quisque
quius
P2---gms--
p9
quisque
quo
P2---bms--
p9
quisque
quod
P2---nns--
p9
quisque
quoi
P2---dms--
p9
quisque
quoiei P2---dms-- p9
quisque
quoii
P2---dms--
p9
quisque
quoiius
P2---gms--
p9
quisque
quoius
P2---gms--
p9
quisque
quorum
P2---gmp--
p9
quisque
quos
P2---amp--
p9
quisque
coi
Pt---dms--
p18
quinam
cui Pt---dms-- p18
quinam
cuii
Pt---dms--
p18
quinam
cuiius
Pt---gms--
p18
quinam
cuius
Pt---gms--
p18
quinam
qua
Rr--------
p18
quinam
quae
Pt---nnp--
p18
quinam
quam
Pt---afs--
p18
quinam
quarum
Pt---gfp--
p18
quinam
quas
Pt---afp--
p18
quinam
queis
Pt---dmp--
p18
quinam
queius
Pt---gms--
p18
quinam
quem
Pt---ams--
p18
quinam
qui
Rr--------
p18
quinam
quibus
Pt---dmp--
p18
quinam
quis
Pt---dmp--
p18
quinam
quius
Pt---gms--
p18
quinam
quo
Pt---bms--
p18
quinam
quod
Pt---nns-- p18
quinam
quoi
Pt---dms--
p18
quinam
quoiei
Pt---dms--
p18
quinam
quoii
Pt---dms--
p18
quinam
quoiius
Pt---gms--
p18
quinam
quoius
Pt---gms--
p18
quinam
quorum
Pt---gmp--
p18
quinam
quos
Pt---amp--
p18
quinam
qualiter
Rr--------
p7
qualitercumque
quemadmodum
Rr--------
p7
quem-ad-modum-cumque
tanta
Pu---nfs--
p1
tantusdem
tantae
Pu---gfs--
p1
tantusdem
tantam
Pu---afs--
p1
tantusdem
tantan
Pu---afs--
p1
tantusdem
tantarum
Pu---gfp--
p1
tantusdem
tantarun
Pu---gfp--
p1
tantusdem
tantas
Pu---afp--
p1
tantusdem
tanti
Pu---gms--
p1
tantusdem
tantis
Pu---dmp--
p1
tantusdem
tanto
Pu---bms--
p1
tantusdem
tantorum
Pu---gmp--
p1
tantusdem
tantos Pu---amp-- p1
tantusdem
tantum
Pu---ams--
p1
tantusdem
tantun
Pu---ans--
p1
tantusdem
tantus
Pu---nms--
p1
tantusdem
tantorun
Pu---gmp--
p1
tantusdem
ubi
Rr--------
p6
ubilibet
ubi
Rr--------
p7
ubicumque
unde Rr-------- p6
undelibet
unde
Rr--------
p7
undecumque
ut
Rr--------
p7
utcumque
ea
Rr--------
p1 idem
ei
P3---dfs--
p1
idem
ei
P3---dns--
p1
idem
ei
P3---nmp--
p1
idem
ei P3---nms-- p1
idem
eius
P3---gfs--
p1
idem
eius
P3---gns--
p1
idem
eis
P3---dfp--
p1
idem
eis
P3---dnp--
p1
idem
eis
P3---bmp--
p1
idem
eis
P3---bfp--
p1
idem
eis
P3---bnp--
p1
idem
eorum
P3---gnp--
p1
idem
i
P3---nns--
p1
idem
i
P3---ans--
p1
idem
i
P3---nmp--
p1
idem
is
P3---dfp--
p1
idem
is
P3---dnp--
p1
idem
is
P3---bmp--
p1
idem
is
P3---bfp--
p1
idem
is
P3---bnp--
p1
idem
quoi
Pu---dfs--
p4
quisquam
quoiei
Pu---dns--
p4
quisquam
quoi
Pu---dns--
p4
quisquam
quoi
Pu---nms--
p4
quisquam
coi
Pu---dfs--
p4
quisquam
coi
Pu---dns--
p4
quisquam
quis
Pu---nfs--
p4
quisquam
quo
Pu---bfs-- p4
quisquam
quo
Pu---bns--
p4
quisquam
quo
Rr--------
p4
quisquam
quid
Pu---ans--
p4
quisquam
cuius
Pu---gfs--
p4
quisquam
cuius
Pu---gns--
p4
quisquam
quem
Pu---afs--
p4
quisquam
cuiius Pu---gfs-- p4
quisquam
cuiius
Pu---gns--
p4
quisquam
cui
Pu---dfs--
p4
quisquam
cuii
Pu---dfs--
p4 quisquam
cui
Pu---dns--
p4
quisquam
cuii
Pu---dns--
p4
quisquam
queius
Pu---gfs--
p4
quisquam
queius Pu---gns-- p4
quisquam
queis
Pu---dmp--
p4
quisquam
queis
Pu---dnp--
p4
quisquam
queis
Pu---bfp--
p4 quisquam
queis
Pu---bmp--
p4
quisquam
queis
Pu---bnp--
p4
quisquam
qui
Pu---bfs--
p4
quisquam
qui
Pu---bns--
p4
quisquam
qui
Pu---nmp--
p4
quisquam
qui
Pu---nfp--
p4
quisquam
quei
Pu---dfs-- p4
quisquam
quei
Pu---dns--
p4
quisquam
quius
Pu---gns--
p5
quiuis
queius
Pu---gns--
p5
quiuis
cuiius
Pu---gns--
p5
quiuis
cuius
Pu---gns--
p5
quiuis
quius
Pu---gfs--
p5
quiuis
queius
Pu---gfs--
p5
quiuis
cuiius
Pu---gfs--
p5
quiuis
cuius
Pu---gfs--
p5
quiuis
quoii
Pu---dfs--
p5
quiuis
quoii
Pu---dns--
p5
quiuis
quoiius
Pu---gfs--
p5
quiuis
quoiius
Pu---gns--
p5
quiuis
quoius
Pu---gfs--
p5
quiuis
quoius
Pu---gns--
p5
quiuis
queis
Pu---dfp--
p5
quiuis
queis
Pu---dnp--
p5
quiuis
queis
Pu---bmp--
p5
quiuis
queis
Pu---bnp--
p5
quiuis
queis
Pu---bfp--
p5
quiuis
qui
Pu---bms-- p5
quiuis
qui
Pu---nmp--
p5
quiuis
quam
Rr--------
p5
quiuis
quae
Pu---nfp--
p5
quiuis
quae
Pu---nnp--
p5
quiuis
quae
Pu---anp--
p5
quiuis
quod
Pu---ans--
p5
quiuis
qua Rr-------- p5
quiuis
quo
Pu---bns--
p5
quiuis
quo
Rr--------
p5
quiuis
cui
Pu---dfs--
p5
quiuis
cuii
Pu---dfs--
p5
quiuis
cui
Pu---dns--
p5
quiuis
cuii
Pu---dns--
p5
quiuis
coi Pu---dfs-- p5
quiuis
coi
Pu---dns--
p5
quiuis
quoi
Pu---dfs--
p5
quiuis
quoiei
Pu---dfs--
p5 quiuis
quoi
Pu---dns--
p5
quiuis
quoiei
Pu---dns--
p5
quiuis
quoi
Pu---nms--
p5
quiuis
quorum
Pu---gnp--
p5
quiuis
quibus
Pu---dfp--
p5
quiuis
quibus
Pu---dnp--
p5
quiuis
quibus
Pu---bmp-- p5
quiuis
quibus
Pu---bfp--
p5
quiuis
quibus
Pu---bnp--
p5
quiuis
quei
Pu---dfs--
p5
quiuis
quei
Pu---dns--
p5
quiuis
quei
Pu---nms--
p5
quiuis
quid
Pu---ans--
p5
quiuis
quis
Pu---dfp--
p5
quiuis
quis
Pu---dnp--
p5
quiuis
quis
Pu---bmp--
p5
quiuis
quis
Pu---bnp--
p5
quiuis
quis
Pu---bfp--
p5
quiuis
alicuius
Pu---gfs--
p5
aliquisuis
alicuius
Pu---gns--
p5
aliquisuis
alicui
Pu---dns--
p5
aliquisuis
alicui
Pu---dfs--
p5
aliquisuis
aliquos
Pu---afp--
p5
aliquisuis
aliquae
Pu---nnp--
p5
aliquisuis
aliquae
Pu---anp--
p5
aliquisuis
aliquis
Pu---nfs--
p5
aliquisuis
aliquis
Pu---dmp--
p5
aliquisuis
aliquis Pu---dfp-- p5
aliquisuis
aliquis
Pu---dnp--
p5
aliquisuis
aliquis
Pu---bmp--
p5
aliquisuis
aliquis
Pu---bfp--
p5
aliquisuis
aliquis
Pu---bnp--
p5
aliquisuis
aliquod
Pu---ans--
p5
aliquisuis
aliquo
Pu---bfs--
p5
aliquisuis
aliquo
Pu---bns-- p5
aliquisuis
aliqua
Pu---nnp--
p5
aliquisuis
aliqua
Pu---anp--
p5
aliquisuis
aliquid
Pu---ans--
p5 aliquisuis
aliqui
Pu---bns--
p5
aliquisuis
aliqui
Pu---nmp--
p5
aliquisuis
aliqui
Pu---nfp--
p5
aliquisuis
aliquem
Pu---afs--
p5
aliquisuis
quoquo
P2---bfs--
p6
quisquislibet
quoquo
P2---bns--
p6
quisquislibet
quemquem
P2---afs--
p6
quisquislibet
quoiquoi
P2---dfs--
p6
quisquislibet
quoiquoi
P2---dns--
p6
quisquislibet
quidquid
P2---ans--
p6
quisquislibet
quodquod
P2---ans--
p6
quisquislibet
quicquid
P2---ans--
p6
quisquislibet
quisquis
P2---nfs--
p6
quisquislibet
quibusquibus P2---dfp-- p6
quisquislibet
quibusquibus
P2---dnp--
p6
quisquislibet
quibusquibus
P2---bmp--
p6
quisquislibet
quibusquibus
P2---bfp--
p6
quisquislibet
quibusquibus
P2---bnp--
p6
quisquislibet
quiqui
P2---bms--
p6
quisquislibet
quiqui
P2---bns--
p6
quisquislibet
quiqui
P2---nmp--
p6
quisquislibet
cuicui
P2---dfs--
p6
quisquislibet
cuicui
P2---dns--
p6
quisquislibet
quaequae
P2---nnp--
p6
quisquislibet
quaequae
P2---anp--
p6
quisquislibet
ut
X---------
p7
utcumque
quo
Pu---bns--
p6
quilibet
quo
Rr-------- p6
quilibet
quei
Pu---nms--
p6
quilibet
quei
Pu---dfs--
p6
quilibet
quei
Pu---dns--
p6
quilibet
quod
Pu---ans--
p6
quilibet
quorum
Pu---gnp--
p6
quilibet
quoiius
Pu---gfs--
p6
quilibet
quoii Pu---dfs-- p6
quilibet
quoiei
Pu---dfs--
p6
quilibet
quoiius
Pu---gns--
p6
quilibet
quoii
Pu---dns--
p6 quilibet
quoiei
Pu---dns--
p6
quilibet
qui
Pu---nmp--
p6
quilibet
quibus
Pu---dfp--
p6
quilibet
quibus Pu---dnp-- p6
quilibet
quibus
Pu---bmp--
p6
quilibet
quibus
Pu---bfp--
p6
quilibet
quibus
Pu---bnp--
p6 quilibet
coi
Pu---dfs--
p6
quilibet
coi
Pu---dns--
p6
quilibet
ques
Pu---nfp--
p6
quilibet
quid
Pu---ans--
p6
quilibet
queius
Pu---gfs--
p6
quilibet
queius
Pu---gns--
p6
quilibet
cui
Pu---dfs-- p6
quilibet
cui
Pu---dns--
p6
quilibet
queis
Pu---dfp--
p6
quilibet
queis
Pu---dnp--
p6
quilibet
queis
Pu---bmp--
p6
quilibet
queis
Pu---bfp--
p6
quilibet
queis
Pu---bnp--
p6
quilibet
quoius
Pu---gfs--
p6
quilibet
quoius
Pu---gns--
p6
quilibet
quoi
Pu---dns--
p6
quilibet
quoi
Pu---dfs--
p6
quilibet
quoi
Pu---nms--
p6
quilibet
quius
Pu---gfs--
p6
quilibet
quius
Pu---gns--
p6
quilibet
cuiius
Pu---gfs--
p6
quilibet
cuius
Pu---gfs--
p6
quilibet
cuiius
Pu---gns--
p6
quilibet
cuius
Pu---gns--
p6
quilibet
cuii
Pu---dfs--
p6
quilibet
cuii
Pu---dns--
p6
quilibet
qua
Rr--------
p6
quilibet
quam
Rr-------- p6
quilibet
quae
Pu---nfp--
p6
quilibet
quae
Pu---nnp--
p6
quilibet
eo
P3---bns--
p1
idem
eo
Rr--------
p1
idem
eorun
P3---gnp--
p1
idem
ea
P3---bfs--
p1
idem
ea P3---nnp-- p1
idem
ea
P3---anp--
p1
idem
quae
Pu---anp--
p6
quilibet
quis
Pu---dfp--
p6
quilibet
quis
Pu---dnp--
p6
quilibet
quis
Pu---bmp--
p6
quilibet
quis
Pu---bnp--
p6
quilibet
quis Pu---bfp-- p6
quilibet
aliquae
Pu---nfp--
p6
aliquilibet
aliqua
Pu---bfs--
p6
aliquilibet
aliqua
Pu---nnp--
p6 aliquilibet
aliqua
Pu---anp--
p6
aliquilibet
alicui
Pu---dfs--
p6
aliquilibet
alicui
Pu---dns--
p6
aliquilibet
alicuius
Pu---gns--
p6
aliquilibet
alicuius
Pu---gfs--
p6
aliquilibet
aliquod
Pu---ans--
p6
aliquilibet
aliquo
Pu---bns-- p6
aliquilibet
aliquis
Pu---dfp--
p6
aliquilibet
aliquis
Pu---dnp--
p6
aliquilibet
aliquis
Pu---bmp--
p6
aliquilibet
aliquis
Pu---bfp--
p6
aliquilibet
aliquis
Pu---bnp--
p6
aliquilibet
aliquid
Pu---ans--
p6
aliquilibet
aliqui
Pu---nmp--
p6
aliquilibet
quei
P2---dms--
p7
quicumque
quei
P2---dfs--
p7
quicumque
quei
P2---dns--
p7
quicumque
quis
P2---dfp--
p7
quicumque
quibus
P2---dfp--
p7
quicumque
quis
P2---dnp--
p7
quicumque
quibus
P2---dnp--
p7
quicumque
quis
P2---bmp--
p7
quicumque
quibus
P2---bmp--
p7
quicumque
quis
P2---bfp--
p7
quicumque
quibus
P2---bfp--
p7
quicumque
quis
P2---bnp--
p7
quicumque
quibus
P2---bnp--
p7
quicumque
quae P2---nfp-- p7
quicumque
quae
P2---nnp--
p7
quicumque
quae
P2---anp--
p7
quicumque
cuii
P2---dfs--
p7 quicumque
cuii
P2---dns--
p7
quicumque
queius
P2---gfs--
p7
quicumque
queius
P2---gns--
p7
quicumque
qua Rr-------- p7
quicumque
cuius
P2---gfs--
p7
quicumque
cuiius
P2---gfs--
p7
quicumque
cuius
P2---gns--
p7 quicumque
cuiius
P2---gns--
p7
quicumque
cui
P2---dfs--
p7
quicumque
cui
P2---dns--
p7
quicumque
quo
P2---bns-- p7
quicumque
quo
Rr--------
p7
quicumque
quius
P2---gfs--
p7
quicumque
quius
P2---gns--
p7
quicumque
queis
P2---dfp--
p7
quicumque
queis
P2---dnp--
p7
quicumque
queis
P2---bmp--
p7
quicumque
queis
P2---bnp--
p7
quicumque
queis
P2---bfp--
p7
quicumque
qui
P2---nmp--
p7
quicumque
quorum
P2---gnp--
p7
quicumque
quoius
P2---gfs--
p7
quicumque
quoiius
P2---gfs--
p7
quicumque
quoii
P2---dfs--
p7
quicumque
quoiei
P2---dfs--
p7
quicumque
quoi
P2---dfs--
p7
quicumque
quoius
P2---gns--
p7
quicumque
quoiius
P2---gns--
p7
quicumque
quoii
P2---dns--
p7
quicumque
quoiei
P2---dns--
p7
quicumque
quoi
P2---dns--
p7
quicumque
quoi
P2---nms--
p7
quicumque
coi
P2---dfs--
p7
quicumque
coi
P2---dns--
p7
quicumque
quod
P2---ans-- p7
quicumque
aliquo
P2---bns--
p7
aliquicumque
aliquae
P2---nfp--
p7
aliquicumque
aliquae
P2---nnp--
p7
aliquicumque
aliquae
P2---anp--
p7
aliquicumque
aliqui
P2---nmp--
p7
aliquicumque
alicuius
P2---gfs--
p7
aliquicumque
alicuius P2---gns-- p7
aliquicumque
alicui
P2---dfs--
p7
aliquicumque
alicui
P2---dns--
p7
aliquicumque
aliquod
P2---ans--
p7
aliquicumque
aliquis
P2---dnp--
p7
aliquicumque
aliquis
P2---dfp--
p7
aliquicumque
aliquis
P2---bmp--
p7
aliquicumque
aliquis P2---bfp-- p7
aliquicumque
aliquis
P2---bnp--
p7
aliquicumque
quius
Pu---gfs--
p8
quiuiscumque
quius
Pu---gns--
p8 quiuiscumque
quorum
Pu---gnp--
p8
quiuiscumque
quae
Pu---nfp--
p8
quiuiscumque
quae
Pu---nnp--
p8
quiuiscumque
quae
Pu---anp--
p8
quiuiscumque
cuius
Pu---gfs--
p8
quiuiscumque
cuiius
Pu---gfs--
p8
quiuiscumque
cuii
Pu---dfs-- p8
quiuiscumque
cuius
Pu---gns--
p8
quiuiscumque
cuiius
Pu---gns--
p8
quiuiscumque
cuii
Pu---dns--
p8
quiuiscumque
cui
Pu---dfs--
p8
quiuiscumque
coi
Pu---dfs--
p8
quiuiscumque
cui
Pu---dns--
p8
quiuiscumque
coi
Pu---dns--
p8
quiuiscumque
quoi
Pu---nms--
p8
quiuiscumque
quoi
Pu---dfs--
p8
quiuiscumque
quoi
Pu---dns--
p8
quiuiscumque
quei
Pu---nms--
p8
quiuiscumque
quei
Pu---dfs--
p8
quiuiscumque
quei
Pu---dns--
p8
quiuiscumque
quis
Pu---dfp--
p8
quiuiscumque
quis
Pu---dnp--
p8
quiuiscumque
quis
Pu---bmp--
p8
quiuiscumque
quis
Pu---bnp--
p8
quiuiscumque
quis
Pu---bfp--
p8
quiuiscumque
queis
Pu---dfp--
p8
quiuiscumque
queis
Pu---dnp--
p8
quiuiscumque
queis
Pu---bmp-- p8
quiuiscumque
queis
Pu---bfp--
p8
quiuiscumque
queis
Pu---bnp--
p8
quiuiscumque
queius
Pu---gns--
p8
quiuiscumque
queius
Pu---gfs--
p8
quiuiscumque
quod
Pu---ans--
p8
quiuiscumque
quoius
Pu---gfs--
p8
quiuiscumque
quoiius Pu---gfs-- p8
quiuiscumque
quoii
Pu---dfs--
p8
quiuiscumque
quoius
Pu---gns--
p8
quiuiscumque
quoiius
Pu---gns--
p8
quiuiscumque
quoii
Pu---dns--
p8
quiuiscumque
quo
Pu---bns--
p8
quiuiscumque
quibus
Pu---dfp--
p8
quiuiscumque
quibus
Pu---dnp--
p8
quiuiscumque
quibus
Pu---bmp--
p8
quiuiscumque
quibus
Pu---bfp--
p8
quiuiscumque
quibus
Pu---bnp--
p8 quiuiscumque
quoiei
Pu---dns--
p8
quiuiscumque
quoiei
Pu---dfs--
p8
quiuiscumque
qui
Pu---nmp--
p8
quiuiscumque
quo
P2---bns--
p9
quisque
quo
P2---bfs--
p9
quisque
quius
P2---gfs--
p9
quisque
quius
P2---gns-- p9
quisque
quis
P2---nfs--
p9
quisque
quis
P2---bmp--
p9
quisque
quis
P2---bfp--
p9
quisque
quis
P2---bnp--
p9
quisque
quoiius
P2---gfs--
p9
quisque
quoiius
P2---gns--
p9
quisque
quid
P2---ans--
p9
quisque
quic
P2---ans--
p9
quisque
quem
P2---afs--
p9
quisque
qui
P2---bfs--
p9
quisque
qui
P2---bns--
p9
quisque
qui
P2---nmp--
p9
quisque
qui
P2---nfp--
p9
quisque
quod
P2---ans--
p9
quisque
quibus
P2---dnp--
p9
quisque
quibus
P2---dfp--
p9
quisque
quibus
P2---bmp--
p9
quisque
quibus
P2---bnp--
p9
quisque
quibus
P2---bfp--
p9
quisque
queius
P2---gfs--
p9
quisque
quoi P2---dfs-- p9
quisque
quoiei
P2---dfs--
p9
quisque
quoii
P2---dfs--
p9
quisque
queius
P2---gns--
p9
quisque
quoi
P2---dns--
p9
quisque
quoiei
P2---dns--
p9
quisque
quoii
P2---dns--
p9
quisque
queis P2---dfp-- p9
quisque
queis
P2---dnp--
p9
quisque
quorum
P2---gfp--
p9
quisque
coi
P2---dfs--
p9 quisque
quoius
P2---gfs--
p9
quisque
cui
P2---dfs--
p9
quisque
cuii
P2---dfs--
p9
quisque
cuiius
P2---gfs--
p9
quisque
cuius
P2---gfs--
p9
quisque
coi
P2---dns--
p9
quisque
quoius
P2---gns--
p9
quisque
cui
P2---dns--
p9
quisque
cuii
P2---dns--
p9
quisque
cuiius
P2---gns--
p9
quisque
cuius
P2---gns--
p9
quisque
quei
P2---dfs--
p9
quisque
quei
P2---dns--
p9
quisque
quae
Pt---anp--
p18
quisnam
quoiei
Pt---dns--
p18
quisnam
quoi
Pt---dfs--
p18
quisnam
quoi
Pt---dns--
p18
quisnam
quoius
Pt---gns--
p18
quisnam
quoius
Pt---gfs--
p18
quisnam
quorum
Pt---gnp--
p18
quisnam
quos
Pt---afp--
p18
quisnam
quoii
Pt---dns--
p18
quisnam
qua
Pt---nnp--
p18
quisnam
qua
Pt---anp--
p18
quisnam
qui
Pt---nmp--
p18
quisnam
qui
Pt---nfp--
p18
quisnam
quorum
Pt---gfp--
p18
quisnam
cui
Pt---dfs-- p18
quisnam
cuiius
Pt---gns--
p18
quisnam
queis
Pt---dfp--
p18
quisnam
queis
Pt---dnp--
p18
quisnam
queis
Pt---bmp--
p18
quisnam
queis
Pt---bfp--
p18
quisnam
queis
Pt---bnp--
p18
quisnam
quei Pt---dfs-- p18
quisnam
quei
Pt---dns--
p18
quisnam
cuii
Pt---dfs--
p18
quisnam
quoiei
Pt---dfs--
p18
quisnam
cuii
Pt---dns--
p18
quisnam
quoiius
Pt---gns--
p18
quisnam
quius
Pt---gfs--
p18
quisnam
quius Pt---gns-- p18
quisnam
quo
Pt---bns--
p18
quisnam
quo
Pt---bfs--
p18
quisnam
quo
Rr--------
p18 quisnam
quoiius
Pt---gfs--
p18
quisnam
cuiius
Pt---gfs--
p18
quisnam
quoii
Pt---dfs--
p18
quisnam
quid
Pt---nns--
p18
quisnam
cuius
Pt---gms--
p18
quisnam
qua
Rr--------
p18
quisnam
quae
Pt---nnp-- p18
quisnam
quei
Pt---dms--
p18
quisnam
queis
Pt---dmp--
p18
quisnam
queius
Pt---gms--
p18
quisnam
quem
Pt---ams--
p18
quisnam
ques
Pt---nmp--
p18
quisnam
cuiius
Pt---gms--
p18
quisnam
quibus
Pt---dmp--
p18
quisnam
quoi
Pt---dms--
p18
quisnam
quis
Pt---dmp--
p18
quisnam
quit
Pt---nns--
p18
quisnam
quius
Pt---gms--
p18
quisnam
quo
Pt---bms--
p18
quisnam
quoiei
Pt---dms--
p18
quisnam
cuius
Pt---gfs--
p18
quisnam
quoiius
Pt---gms--
p18
quisnam
quoius
Pt---gms--
p18
quisnam
quorum
Pt---gmp--
p18
quisnam
quos
Pt---amp--
p18
quisnam
qui
Rr--------
p18
quisnam
quibus
Pt---dnp--
p18
quisnam
quoii
Pt---dms-- p18
quisnam
cuii
Pt---dms--
p18
quisnam
quibus
Pt---dfp--
p18
quisnam
quibus
Pt---bmp--
p18
quisnam
quibus
Pt---bfp--
p18
quisnam
quibus
Pt---bnp--
p18
quisnam
ques
Pt---nfp--
p18
quisnam
quis Pt---dfp-- p18
quisnam
quis
Pt---dnp--
p18
quisnam
quis
Pt---bmp--
p18
quisnam
quis
Pt---bfp--
p18
quisnam
coi
Pt---dfs--
p18
quisnam
cui
Pt---dms--
p18
quisnam
coi
Pt---dms--
p18
quisnam
quid Pt---ans-- p18
quisnam
quis
Pt---bnp--
p18
quisnam
coi
Pt---dns--
p18
quisnam
cui
Pt---dns--
p18
quisnam
quem
Pt---afs--
p18
quisnam
quis
Pt---nms--
p18
quisnam
quis
Pt---nfs--
p18
quisnam
queius
Pt---gfs--
p18
quisnam
quit
Pt---ans--
p18
quisnam
queius
Pt---gns--
p18
quisnam
cuius
Pt---gns-- p18
quisnam
eccui
Pt---dfs--
p18
ecquisnam
eccuius
Pt---gns--
p18
ecquisnam
eccuius
Pt---gfs--
p18
ecquisnam
ecquid
Pt---ans--
p18
ecquisnam
eccui
Pt---dns--
p18
ecquisnam
ecquos
Pt---afp--
p18
ecquisnam
ecquis
Pt---nfs--
p18
ecquisnam
ecquae
Pt---anp--
p18
ecquisnam
ecquo
Pt---bfs--
p18
ecquisnam
ecquo
Pt---bns--
p18
ecquisnam
ecqui
Pt---nfp--
p18
ecquisnam
ecquis
Pt---nms--
p18
ecquisnam
ecquorum
Pt---gnp--
p18
ecquisnam
ecquos
Pt---amp--
p18
ecquisnam
ecquibus
Pt---bnp--
p18
ecquisnam
ecquibus
Pt---bfp--
p18
ecquisnam
ecquibus
Pt---bmp--
p18
ecquisnam
ecquem
Pt---afp--
p18
ecquisnam
ecquibus
Pt---dfp--
p18
ecquisnam
ecquorum
Pt---gfp--
p18
ecquisnam
ecquis Pt---dmp-- p18
ecquisnam
eccui
Pt---dms--
p18
ecquisnam
eccuius
Pt---gms--
p18
ecquisnam
ecquae
Pt---nnp--
p18
ecquisnam
ecquem
Pt---amp--
p18
ecquisnam
ecqui
Pt---nmp--
p18
ecquisnam
ecquis
Pt---dfp--
p18
ecquisnam
ecquid
Pt---nns-- p18
ecquisnam
ecquis
Pt---bnp--
p18
ecquisnam
ecquo
Pt---bms--
p18
ecquisnam
ecquibus
Pt---dmp--
p18 ecquisnam
ecquorum
Pt---gmp--
p18
ecquisnam
ecquibus
Pt---dnp--
p18
ecquisnam
ecquis
Pt---dnp--
p18
ecquisnam
ecquis
Pt---bmp--
p18
ecquisnam
ecquis
Pt---bfp--
p18
ecquisnam
ecqua
Pt---nnp--
p18
ecquisnam
ecqua
Pt---anp--
p18
ecquisnam
4.3 Coding of the no f Type adjectives
Structure:
ali
u
aliquammult u
aliquamplur u
aliquant
u
aliquantul
u
alter
6
alterutr
u
altr
6
bestr
s
bin
d
caeter
u
centen
d
centensim
o
centensum
o
centesim
o
ceter
u
complur
u
complurim
u
conct
u
cot
5
cui
5
cuiat
5
cunct
u
decim
o
decum
o
den
d
dodecim
o
du
n
ducen
d
ducent
n
ducenten
d
ducentesim
o
duocent
n
duocenten
d
duodec
n
duodecim
o
duodecum
o
duoden
d
duodequadragen d
duodequadragesim o
duodequinquagen d
duodequinquagesim o
duodesexagesim o
duodetricesim
o
duodeuicen
d
duodeuicensim
o
duodeuicesim o
duoetuicensim
o
duoetuicesim o
duouicesim
o
eccill
y
ecqual
t
ill
y
insecund
o
ips
3
iss
3
issul
3
ist
y
me
s
mi
s
milensim
o
milesim
o
millen
d
millensim
o
millesim
o
necull
u
neutr
u
non
o
nonagen
d
nonagensim
o
nonagesim o
nongen
d
nongent
n
nongenten
d
nongentensim o
nongentesim o
nongesim
o
noningent
n
noningentesim
o
nonnull
u
nostr
s
nouen
d
noun
o
null
u
nungentesim o
octagensim
o
octagesim
o
octau
o
octigesim
o
octingen
d
octingent
n
octingenten d
octingentesim
o
octogen
d
octogensim
o
octogesim
o
octon
d
oen
n
oin
n
oll
y
omn
u
perplurim
u
pleor u
pler
u
ploer
u
plur
u
plurim
u
plurum
u
preim
o
prim
o
prior
o
quadragen
d
quadragensim o
quadragensum o
quadragesim o
quadrigent
n
quadrin
d
quadringen
d
quadringent n
quadringenten
d
quadringentensim o
quadringentesim o
qual
5
quamplurim u
quamt
5
quant
5
quantul
5
quart
o
quatern
d
quin
d
quincent
n
quinct
o
quindecim
o
quinden
d
quindenar d
quingen
d
quingent
n
quingenten
d
quingentesim o
quinquagen
d
quinquagensim
o
quinquagesim o
quinquegen
d
quint
o
quoi
5
quoiat
5
quomplur
u
quot
5
quoten
4
relic
u
relicu
u
reliq
u
reliqu
u
secund
o
seden
d
sen
d
septen
d
septim
o
septingen
d
septingent
n
septingenten d
septingentesim o
septuagen
d
septuagensim o
septuagesim o
septum
o
sescen d
sescent
n
sescentesim o
sesquidecim o
setim
o
sexagen
d
sexagensim
o
sexagensum
o
sexagesim
o
sexagesum
o
sexcent
n
sexcentesim o
sext
o
singol
d
singul
d
sou
s
su
s
subneutr
u
tal
u
tamt
u
tant
u
tantul
u
terdecim
o
terdecum
o
terden
d
tern
d
tert
o
tot
u
tot
u
tr
n
trecen
d
trecent
n
trecenten d
trecentesim o
tredecim
o
tricen
d
tricensim
o
tricent
n
tricesim
o
trigen
d
trigensim
o
trigesim
o
tu
s
uestr
s
uicen
d
uicensim
o
uicensum
o
uicesim
o
uicesm
o
uicesum
o
uicin
d
uigen
d
uigesim
o
uigesum o
ull
u
un
n
unadeuicensim
o
unaetuicensim
o
unaetuicesim o
undecentensim
o
undecentesim o
undecim
o
undecum
o
unden
d
undenonagesim o
undequadragesim o
undequinquagesim o
undesexagesim o
undetricen
d
undetricesim o
undeuicen
d
undeuicensim o
undeuicesim o
undeuigesim o
unetuicensim o
unetuicesim o
unietuicensim
o
unietuicesim o
uniuers
u
uniuors
u
uostr
s
utr
4
4.4 SM coding
Structure:
au +v1 v7 ==F=======
am-au-i
a +v1 v8 ==F=======
am-a-sti
at +v1 n6 VmFk4===--
am-at-us
at +v1 n41 ==F=======
am-at-u
at +v1 n2n VmFk4===-- emigr-at-um
atur
+v1
n6
VmFj3===--
am-atur-us
eu +v2 v7 ==G=======
abol-eu-i
e +v2 v8 ==G=======
abol-e-sti
et +v2 n6 VmGk4===--
fl-et-us
et +v2 n41 ==G=======
fl-et-u
et +v2 n2n VmGk4===--
etur +v2 n6 VmGj3===--
fl-etur-us
iu +v4 v7 ==L=======
aud-iu-i
i +v4 v7 ==L=======
aud-i-i
it +v4 n6 VmLk4===--
aud-it-us
it +v4 n41 ==L=======
aud-it-um
it +v4 n2n VmLk4===--
barr-it-um
itur +v4 n6 VmLj3===--
aud-itur-us
ant
-v1
n7
VmFj1===--
am-ant-em
ans
-v1
blk
VmFj1nms--
am-ans
ans
-v1
blk
VmFj1vms--
am-ans
ans
-v1
blk
VmFj1nfs--
am-ans
ans
-v1
blk
VmFj1vfs--
am-ans
ans
-v1 blk VmFj1nns--
am-ans
ans
-v1
blk
VmFj1ans--
am-ans
ans -v1 blk VmFj1vns--
am-ans
and -v1 n6 VmFr-===--
am-and-us
and -v1 n21 ==F=======
am-and-i
and -v1 n2n VmFr-===--
emigr-and-um
ent
-v2
n7
VmGj1===--
abol-ent-em
ent
-v3
n7
VmHj1===--
trah-ent-em
ent
-v6
n7
VmNj1===--
adi-ent-em
ens
-v2
blk VmGj1nms--
vid-ens
ens
-v2
blk
VmGj1vms--
vid-ens
ens
-v2
blk
VmGj1nfs--
vid-ens
ens
-v2
blk
VmGj1vfs--
vid-ens
ens
-v2
blk
VmGj1nns--
vid-ens
ens
-v2 blk VmGj1ans--
vid-ens
ens
-v2
blk
VmGj1vns--
vid-ens
ens
-v3
blk
VmHj1nms--
trah-ens
ens
-v3
blk
VmHj1vms--
trah-ens
ens
-v3
blk
VmHj1nfs--
trah-ens
ens
-v3
blk
VmHj1vfs--
trah-ens
ens
-v3
blk
VmHj1nns--
trah-ens
ens
-v3
blk
VmHj1ans--
trah-ens
ens
-v3
blk
VmHj1vns--
trah-ens
ens
-v6
blk
VmNj1nms--
adi-ens
ens
-v6
blk
VmNj1vms--
adi-ens
ens
-v6
blk
VmNj1nfs--
adi-ens
ens
-v6
blk
VmNj1vfs--
adi-ens
ens
-v6
blk
VmNj1nns--
adi-ens
ens
-v6 blk VmNj1ans--
adi-ens
ens -v6 blk VmNj1vns--
adi-ens
end -v2 n6 VmGr-===--
vid-end-us
end -v3 n6 VmHr-===--
trah-end-us
end -v6 n6 VmNr-===--
fer-end-us
end -v2 n21 ==G=======
vid-end-i
end -v3 n21 ==H=======
trah-end-i
end -v6 n21 ==N=======
fer-end-i
end -v2 n2n VmGr-===--
sol-end-um
end -v3 n2n VmHr-===--
occurr-end-um
end
-v6
n2n
VmNr-===--
adi-end-um
und
-v3
n6
VmHr-===--
leg-und-us
und
-v6
n6
VmNr-===--
fer-und-us
und
-v3
n21
==H=======
leg-und-i
und
-v6
n21
==N=======
fer-und-i
und
-v3
n2n
VmHr-===--
occurr-und-um
und
-v6
n2n
VmNr-===--
adi-und-um
ient
-v4
n7
VmLj1===--
aud-ient-em
ient
-v5 n7 VmMj1===--
fac-ient-em
iens
-v4
blk
VmLj1nms--
aud-iens
iens
-v4
blk
VmLj1vms--
aud-iens
iens
-v4
blk
VmLj1nfs--
aud-iens
iens
-v4
blk
VmLj1vfs--
aud-iens
iens
-v4
blk
VmLj1nns--
aud-iens
iens
-v4
blk
VmLj1ans--
aud-iens
iens
-v4
blk
VmLj1vns--
aud-iens
iens
-v5
blk
VmMj1nms--
fac-iens
iens
-v5
blk
VmMj1vms--
fac-iens
iens
-v5
blk
VmMj1nfs--
fac-iens
iens
-v5
blk
VmMj1vfs--
fac-iens
iens
-v5
blk
VmMj1nns--
fac-iens
iens
-v5
blk
VmMj1ans--
fac-iens
iens
-v5 blk VmMj1vns--
fac-iens
iend -v4 n6 VmLr-===--
aud-iend-us
iend -v5 n6 VmMr-===--
fac-iend-us
iend -v4 n21 ==L=======
aud-iend-i
iend -v5 n21 ==M======= fac-iend-i
iend -v4 n2n VmLr-===--
barr-iend-um
iend
-v5
n2n
VmMr-===--
confug-iend-um
iund -v4 n6 VmLr-===--
aud-iund-us
iund -v5 n6 VmMr-===--
fac-iund-us
iund -v4 n21 ==L=======
aud-iund-i
iund -v5 n21 ==M=======
fac-iund-i
iund
-v4
n2n
VmLr-===--
barr-iund-um
iund
-v5
n2n
VmMr-===--
confug-iund-um
ior -n6 n7c ==========
saev-ior-em
ior -n7 n7c ==========
prudent-ior-em
ior -n6 blk A*---nms-2
saev-ior
ior -n6 blk A*---vms-2
saev-ior
ior -n6 blk A*---nfs-2
saev-ior
ior -n6 blk A*---vfs-2
saev-ior
ior -n7 blk A*---nms-2
prudent-ior
ior -n7 blk A*---vms-2
prudent-ior
ior -n7 blk A*---nfs-2
prudent-ior
ior -n7 blk A*---vfs-2
prudent-ior
ius -n6 blk A*---nns-2
saev-ius
ius
-n6
blk
A*---ans-2
saev-ius
ius -n6 blk A*---vns-2
saev-ius
ius -n6 blk A*---r---2
saev-ius
ius -n7 blk A*---nns-2
prudent-ius
ius -n7 blk A*---ans-2
prudent-ius
ius -n7 blk A*---vns-2
prudent-ius
ius -n7 blk A*---r---2
prudent-ius
issim -n6
n6
=========3
saev-issim-us
issim -n7
n6
=========3
prudent-issim-us
issum -n6
n6
=========3
saev-issum-us
issum -n7
n6
=========3
prudent-issum-us
rim -n6 n6 =========3 acer-rim-us
rim -n7 n6 =========3
pauper-rim-us
rum -n6 n6 =========3
acer-rum-us
rum
-n7
n6
=========3
pauper-rum-us
lim -n7 n6 =========3
facil-lim-us
lum -n7 n6 =========3
facil-lum-us
iior -n6i
n7c
==========
aedilit-iior-em
iior -n6i
blk
A*---nms-2
aedilit-iior
iior -n6i
blk
A*---vms-2
aedilit-iior
iior -n6i
blk
A*---nfs-2
aedilit-iior
iior -n6i
blk
A*---vfs-2
aedilit-iior
iius -n6i
blk
A*---nns-2
aedilit-iius
iius -n6i
blk
A*---ans-2
aedilit-iius
iius -n6i blk A*---vns-2 aedilit-iius
iius -n6i blk A*---r---2
aedilit-iius
iissim -n6i n6 =========3
aedilit-iissim-us
iissum -n6i n6 =========3
aedilit-iissum-us
4.5 Adverbial use coding
Structure:
o Adverb as FE
o EAGLES codes
aestiuo
Rr--------
aeterno
Rr--------
affectato
Rr--------
aliquammultum
Rr--------
aliquanto
Rr--------
aliquantulum Rr--------
aliquantum
Rr--------
alternis
Rr--------
ambiguo
Rr--------
arbitrario
Rr--------
arcano
Rr--------
arkano
Rr--------
assiduo
Rr--------
augurato
Rr--------
auspicato
Rr--------
bipartito
Rr--------
breui
Rr--------
caro
Rr--------
centuplicato Rr--------
certo
Rr--------
cetera
Rr--------
cetero
Rr--------
ceterum
Rr--------
cito
Rr--------
clandestino
Rr--------
commodo
Rr--------
complurimum
Rr--------
consulto
Rr--------
continuo
Rr--------
cotidiano
Rr--------
cottidiano Rr--------
crastino
Rr--------
crebra
Rr--------
crebro
Rr--------
derecto
Rr--------
destinato
Rr--------
directo
Rr--------
duplicato
Rr--------
exiguo
Rr--------
faenerato
Rr--------
falso
Rr--------
fenerato
Rr--------
festinato
Rr--------
figurato
Rr--------
fortuito
Rr--------
frequentato
Rr--------
gratiis
Rr--------
gratis
Rr--------
gratuito
Rr--------
hesterno
Rr--------
horno
Rr--------
immerito
Rr--------
inaugurato
Rr--------
inauspicato Rr--------
incerto
Rr--------
inconsulto
Rr--------
infinito
Rr--------
ingratiis
Rr--------
ingratis
Rr--------
insperato
Rr--------
inueito
Rr--------
inuersum Rr--------
inuito
Rr--------
isto
Rr--------
iterato
Rr--------
karo
Rr--------
limo
Rr--------
liquido
Rr--------
longincuo
Rr--------
longinquo
Rr--------
magni
Rr--------
magnum
Rr--------
manifesto
Rr--------
manufesto
Rr--------
matutino
Rr--------
maximi
Rr--------
merito
Rr--------
minimi
Rr--------
minoris
Rr--------
modico
Rr--------
multimodis
Rr--------
multo
Rr--------
multum
Rr--------
mutuo
Rr--------
necessario
Rr--------
nihili
Rr--------
nimio
Rr--------
nimium
Rr--------
omnimodis
Rr--------
parui
Rr--------
paucum
Rr--------
paullo
Rr--------
paullulo Rr--------
paullum
Rr--------
paulo
Rr--------
paululo
Rr--------
paulum
Rr--------
pauxillo
Rr--------
permagni
Rr--------
permulto
Rr--------
permultum
Rr--------
pernimium
Rr--------
perpetuo
Rr--------
perplurimum
Rr--------
perraro
Rr--------
plerumque
Rr--------
plurimi
Rr--------
plurimum
Rr--------
primo
Rr--------
primum
Rr--------
profecto
Rr--------
properato
Rr--------
proximo
Rr--------
proxsimo
Rr--------
proxumo
Rr--------
quadripartito
Rr--------
quadripertito
Rr--------
quadrupertito
Rr--------
quadruplicato
Rr--------
quanti
Rr--------
quanto
Rr--------
quantulum
Rr--------
quantum
Rr--------
quotidiano Rr--------
raro
Rr--------
rato
Rr--------
recta
Rr--------
recto
Rr--------
repentino
Rr--------
satisdato
Rr--------
secreto
Rr--------
secundo Rr--------
secundum
Rr--------
sedulo
Rr--------
sempiterno
Rr--------
septimo
Rr--------
septumo
Rr--------
sero
Rr--------
setimo
Rr--------
simulato
Rr--------
solito
Rr--------
solum
Rr--------
sortito
Rr--------
subito
Rr--------
superfluo
Rr--------
superuacaneo Rr--------
suppremo
Rr--------
supremo
Rr--------
suspecto
Rr--------
tacito
Rr--------
tanti
Rr--------
tantulum
Rr--------
tantum
Rr--------
tempestiuo
Rr--------
tempori Rr--------
tertiato
Rr--------
tertio
Rr--------
testato
Rr--------
tranquillo
Rr--------
tumultuario
Rr--------
tuto
Rr--------
uero
Rr--------
uerum Rr--------
ultimo
Rr--------
ultumo
Rr--------
una
Rr--------
CHLT Project
IST-2001-32745
1 September - 30 November 2003
Workpackage 5: Neo-Latin Morphological Analyser
C.N.R.
Istituto di Linguistica Computazionale
Andrea Bozzi
Giuseppe Cappelli
Marco Passarotti
Paolo Ruffolo
1. Summary of key indicators of project progress
1.4 Overall assessment of the main milestones achieved
1.4.1 The LE rule
1.4.2 Management of the lemmatization of the wordforms with structure LES + SM + SF
1.4.3 Testing the lemmatization results about FE and wordforms with structure LES + SF and LES + SM + SF
1.4.4 Tables for Initial graphical variations and post-final segments
1.5 Problems encountered and decisions taken
The LE rule
1.5.1 Contrast between SF coding and LE rule
1.5.2 Special LE rules
1.6 Correspondence between planned project progress and actual accomplishments
2. Work progress overview
2.1 Specific objectives for the reporting period
2.2 Achievements
2.2.1 List of Deliverables
2.2.2 Progress by Workpackage/task
2.3 Work planned for the next reporting period
3. Project Management
3.1 Contractual Issues
3.2 Co-operation within the consortium
3.3 Participation in workshops and/or conference, publications
1. Summary of key indicators of project progress
This report covers the activities carried out by Workpackage 5 in the period from 1 September 2003 to 30 November 2003.
1.1 Overall assessment of the main milestones achieved
1.1.1 The LE rule
When a lemma is not formed in a regular way (exceptional lemma: LE), it is registered in the LEM field of the LES archive. For instance, the n1e LES (first declension exceptional nouns) aconti has the lemma aconti-as instead of the regular aconti-a. By default, in fact, an n1e LES produces its own lemma automatically, by adding an a after the last character of the LES.
In the LEM field there are three possible types of information:
1. an SF (for instance, -e for the lemma mare)
2. the lemma (for instance, stipamen)
3. an = (for instance, for the LES auster)
The lemmas are themselves wordforms. When an LE is given as input to LEMLAT, it is not segmented, since all the information needed for lemmatization is recorded in the LES archive. LEMLAT therefore lemmatizes it using only the LES archive and does not need to segment the input wordform.
This implies that no LES, no SM and no SF are recognized in these input wordforms.
Since the morphological values needed in the output come from the coding of the wordform elements (LES, SM, SF), if these elements are not recognized, no morphological values can be produced in the output.
To solve this problem, we designed a rule to manage the analysis of the LE. This is the LE rule.
The rule concerns only the LES belonging to the nominal inflexion (those coded with the codes from n1 to n7), since the LES belonging to the verbal inflexion are already correctly managed by LEMLAT.
The rule says that when a wordform to be analysed is not segmented by LEMLAT because it is an LE, and its LES has a code n1*[7], n2*, n3*, n4*, n5*, n6*, or n7*, the morphological values to be applied in the output are the following:
o nominative and vocative singular, if the LES is masculine or feminine;
o nominative, vocative and accusative singular, if the LES is neuter[8].
These values are referred to below as automatic values.
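The LE rule stated above can be sketched in a few lines of Python. This is only an illustration, not the LEMLAT implementation (which the report describes as C code working on the LES archive): the function name, the plain-text value labels and the gender letters m/f/n are assumptions made for the example.

```python
import re

# Assumption: a nominal COD LES is any code starting with n1 ... n7 (n1*, ..., n7*).
NOMINAL_COD_LES = re.compile(r"^n[1-7]")


def automatic_values(cod_les, gender):
    """Return the automatic values for an unsegmented LE.

    cod_les: paradigm code of the LES, e.g. "n1e" (hypothetical string form).
    gender:  "m", "f" or "n" (hypothetical encoding of the gender code).
    Returns an empty list when the rule does not apply (non-nominal LES).
    """
    if not NOMINAL_COD_LES.match(cod_les):
        return []  # verbal LES are already handled by LEMLAT
    values = ["nominative singular", "vocative singular"]
    if gender == "n":
        # Neuter LES also receive the accusative singular.
        values.append("accusative singular")
    return values
```

For a masculine n1e LES the sketch returns the nominative and vocative singular; for a neuter LES it adds the accusative singular.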
1.1.2 Management of the lemmatization of the wordforms with structure LES + SM + SF
A table containing the information about the SM segments has been added to the database. Each record specifies the segment value, its compatibility with other segments (one code for the segment on its left and one for the segment on its right) and its morphological values.
A set of functions has been designed and implemented to retrieve the correct segment for the input wordform according to the compatibility codes. Existing functions have been modified to add the morphological information carried by the SM segments to that carried by the LES and SF segments.
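The lookup by compatibility codes might look roughly like the following sketch. The rows are copied from the SM coding table (section 4.4); reading the second and third columns of that table as the left and right compatibility codes is an assumption, and the record type and function name are hypothetical.

```python
from collections import namedtuple

# Hypothetical record layout: segment value, left code, right code, morphological values.
SMRecord = namedtuple("SMRecord", "segment left_code right_code morph")

# A few rows taken from the SM coding table; column meaning is assumed.
SM_TABLE = [
    SMRecord("au", "+v1", "v7", "==F======="),
    SMRecord("at", "+v1", "n6", "VmFk4===--"),
    SMRecord("at", "+v1", "n41", "==F======="),
    SMRecord("ant", "-v1", "n7", "VmFj1===--"),
]


def find_sm(segment, left_code, right_code):
    """Return the SM records whose value and both compatibility codes match."""
    return [
        record
        for record in SM_TABLE
        if record.segment == segment
        and record.left_code == left_code
        and record.right_code == right_code
    ]
```

The same segment value (here at) can occur in several records, so both codes are needed to select the right morphological values.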
1.1.3 Testing the lemmatization results about FE and wordforms with structure LES + SF and LES + SM + SF
The new version of LEMLAT has been tested on the list of Latin wordforms registered in P. Tombeur, Thesaurus Formarum Totius Latinitatis: a Plauto usque ad saeculum XXum, Turnhout, 1998.
Out of a total of 554,004 different wordforms, 359,964 (i.e. 64.97%) have been recognized and lemmatized by LEMLAT[9].
The unrecognized wordforms are either proper names (the LEMLAT dictionary lacks an onomasticon) or forms belonging to lemmas not registered in the dictionaries used as the lexical basis for LEMLAT[10].
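The recognition rate quoted above can be checked with a short calculation. The two counts come from the text; the variable and function names are only illustrative.

```python
# Counts reported for the test on the Thesaurus Formarum wordform list.
total_wordforms = 554004
recognized = 359964


def recognition_rate(recognized_count, total_count):
    """Percentage of wordforms that were recognized and lemmatized."""
    return 100.0 * recognized_count / total_count


rate = recognition_rate(recognized, total_wordforms)
unrecognized = total_wordforms - recognized
print(f"{rate:.2f}% recognized, {unrecognized} wordforms not recognized")
```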
1.1.4 Tables for Initial
graphical variations and post-final segments
The information used by LEMLAT to treat initial graphical variations has been stored in a table of the database
(tabsai): variations of the initial part of a wordform are grouped by means of
a code, e.g. (abf, b01), (af, b01), (auf, b01). The set of codes is the same as that used in the field a_gra of the table lessario
(the LES archive): e.g. the LES aufer, stored in the LES archive with the value b01 for a_gra,
also allows the variants abfer and affer to be treated.
The information about post-final segments (e.g. que) has likewise been put in a table of the database
(tabspf): a code specifies the compatibility between each segment and a LES.
For both tables, C functions have been implemented to retrieve the information needed for wordform analysis.
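The grouping of initial variants by code can be illustrated with a minimal C sketch. It is not the LEMLAT implementation: the in-memory array stands in for the database table tabsai, and the names InitialVariant, initial_variant_code and les_accepts_initial are hypothetical. Only the three rows quoted above are taken from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a tabsai-like table: an initial variant and its group code. */
typedef struct {
    const char *variant;  /* initial part of the wordform, e.g. "abf" */
    const char *code;     /* group code, to be matched against a_gra  */
} InitialVariant;

/* Illustrative rows only: the three pairs quoted in the report. */
static const InitialVariant tabsai_rows[] = {
    { "abf", "b01" },
    { "af",  "b01" },
    { "auf", "b01" },
};
static const size_t tabsai_count = sizeof tabsai_rows / sizeof tabsai_rows[0];

/* Return the code of the longest variant that prefixes `wordform`,
   or NULL if no variant matches. The matched length is stored in
   *prefix_len when prefix_len is not NULL. */
const char *initial_variant_code(const char *wordform, size_t *prefix_len)
{
    const char *best = NULL;
    size_t best_len = 0;
    assert(wordform != NULL);
    for (size_t i = 0; i < tabsai_count; i++) {
        size_t n = strlen(tabsai_rows[i].variant);
        if (n > best_len && strncmp(wordform, tabsai_rows[i].variant, n) == 0) {
            best = tabsai_rows[i].code;
            best_len = n;
        }
    }
    if (prefix_len != NULL)
        *prefix_len = best_len;
    return best;
}

/* A LES whose a_gra field holds `a_gra` accepts a wordform whose
   initial variant belongs to the same group. */
int les_accepts_initial(const char *a_gra, const char *wordform)
{
    const char *code = initial_variant_code(wordform, NULL);
    return code != NULL && strcmp(code, a_gra) == 0;
}
```

With these rows, a LES stored with a_gra = b01 (such as aufer) accepts wordforms beginning with abf, af or auf, and rejects wordforms whose beginning is not listed under b01.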
1.2 Problems encountered and decisions taken
The LE rule
The following two sections describe rules to be applied to LES whose COD LES
lies between n1* and n5*. Rules for the LES with COD LES n6* and n7*
are still under construction.
1.2.1 Contrast between SF coding and LE rule
The LE rule concerns the irregular lemmas, that is, the irregular
singular nominative wordforms (the pluralia tantum are not affected by this
problem).
The regular ending of the nominative is an SF (for instance, the SF a
is the regular ending of the first declension nouns: puell-a): like all SF,
the SF of the nominative ending is coded with its own morphological
values[11].
This implies that, when an LES has a lemma ending in an irregular way
(that is, not with the expected SF), a wordform formed with that LES and
ending with an SF whose codes include those of the singular nominative would
wrongly receive that value. The resulting analysis therefore has to be
limited by applying a rule that cuts the following values:
o Singular nominative,
o Singular vocative (if present: for instance, it is not present in the
coding of the singular nominative ending of the second declension nouns us),
o Singular accusative (if the LES is neuter)
For instance, the n1e LES sybot has LE sybotes (created through the
information es registered in the LEM field of the LES archive). If the
wordform sybota is given in input, it is segmented as sybot-a and, according to the
coding of the SF a for n1e, receives in output the regular values singular nominative,
vocative and ablative. The above-mentioned rule cuts the singular nominative
and vocative values, while the correct singular ablative value remains
in the output.
Below, all the particular applications of this rule are reported. Each
contains three pieces of information:
1. COD LEM: the morphological paradigm code of the lemma (for instance, n1
for the first declension noun lemmas)
2. Singular nominative regular SF (for instance, -a for the n1 lemmas)
3. Which regular values have to be cut from the output analysis
o COD LEM: n1
o SF: a
o Cut: singular nominative and vocative
o COD LEM: n2
A)
o SF: us, ius
o Cut: singular nominative
B)
o SF: um, ium
o Cut: singular nominative, vocative and accusative
o COD LEM: n3
A)
o SF: is
o Cut: singular nominative and vocative
B)
o SF: e
o Cut: singular nominative, vocative and accusative
o COD LEM: n4
A)
o SF: us
o Cut: singular nominative and vocative
B)
o SF: u
o Cut: singular nominative, vocative and accusative
o COD LEM: n5
o SF: es
o Cut: singular nominative and vocative
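The cut rule listed above can be sketched in C as a small look-up table. This is only an illustration under stated assumptions, not the LEMLAT code: the bit-flag encoding of the morphological values, the names CutRule, cut_rules and apply_le_cut, and the matching of the COD LEM on its first two characters (so that n1e falls under n1) are all hypothetical choices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags for the singular values involved in the rule. */
enum {
    NOM_SG = 1u << 0,
    VOC_SG = 1u << 1,
    ACC_SG = 1u << 2,
    ABL_SG = 1u << 3
};

typedef struct {
    const char *cod_lem;  /* paradigm code of the lemma, e.g. "n1" */
    const char *sf;       /* regular nominative SF                 */
    unsigned    cut;      /* values to remove from the analysis    */
} CutRule;

/* The applications of the LE rule listed in the report. */
static const CutRule cut_rules[] = {
    { "n1", "a",   NOM_SG | VOC_SG },
    { "n2", "us",  NOM_SG },
    { "n2", "ius", NOM_SG },
    { "n2", "um",  NOM_SG | VOC_SG | ACC_SG },
    { "n2", "ium", NOM_SG | VOC_SG | ACC_SG },
    { "n3", "is",  NOM_SG | VOC_SG },
    { "n3", "e",   NOM_SG | VOC_SG | ACC_SG },
    { "n4", "us",  NOM_SG | VOC_SG },
    { "n4", "u",   NOM_SG | VOC_SG | ACC_SG },
    { "n5", "es",  NOM_SG | VOC_SG },
};

/* Remove from `values` the values cut by the LE rule. The rule applies
   only when the LES has an irregular lemma (has_irregular_le != 0). */
unsigned apply_le_cut(const char *cod_lem, const char *sf,
                      int has_irregular_le, unsigned values)
{
    assert(cod_lem != NULL && sf != NULL);
    if (!has_irregular_le)
        return values;
    for (size_t i = 0; i < sizeof cut_rules / sizeof cut_rules[0]; i++) {
        /* Assumption: sub-codes such as n1e share the rule of n1. */
        if (strncmp(cod_lem, cut_rules[i].cod_lem, 2) == 0 &&
            strcmp(sf, cut_rules[i].sf) == 0)
            return values & ~cut_rules[i].cut;
    }
    return values;
}
```

For the sybota example, calling apply_le_cut("n1e", "a", 1, NOM_SG | VOC_SG | ABL_SG) leaves only ABL_SG, which corresponds to the singular ablative that remains in the output.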
1.2.2 Special LE rules
In addition to the LE rule, the following exceptions have been found and
solved with special LE rules. These rules are written according to the COD LEM.
COD LEM n1
an LE n1e ending in -es (or formed with the SF -es), masculine or
feminine, receives the following value instead of the automatic values:
o singular nominative
Examples:
poet-es (SF -es)
orchites (LE)
an LE n1 or n1e ending in -as (or formed with the SF -as), masculine
or feminine, receives the following values instead of the automatic values:
o singular nominative
o plural accusative
Examples:
aconti-as (SF -as)
ophtalmias (LE)
an LE n1e ending in -e, but not -ae (or formed with the SF -e, but not
-ae), feminine, receives the following value in addition to the automatic values:
o singular ablative
Examples:
enallag-e (SF -e)
spathomele (LE)
an LE n1 or n1e ending in -a (or formed with the SF -a), masculine or
feminine, receives the following value in addition to the automatic values:
o singular ablative
Examples:
pyct-a (SF -a)
sphaera (LE)
COD LEM n2
an LE n2 or n2e ending in -os (or formed with the SF -os), masculine
or feminine, receives the following values instead of the automatic values:
o singular nominative
o plural accusative
Examples:
capn-os (SF -os)
rhythmos (LE)
an LE n2 or n2e ending in -us (or formed with the SF -us), masculine
or feminine, receives the following value instead of the automatic values:
o singular nominative
Examples:
papyr-us (SF -us)
sponsus (LE)
an LE n2i ending in -ius (or formed with the SF -ius), masculine, receives
the following value instead of the automatic values:
o singular nominative
Examples:
lacunar-ius (SF -ius)
flammarius (LE)
an LE n2e, n2n or n2ni ending in -um (or formed with the SF -um, -ium),
neuter, receives the following value in addition to the automatic values:
o plural genitive
Examples:
florifert-um (SF -um)
thoracium (LE)
COD LEM n3
an LE n3, n31, n32, n3e or n3p formed with the SF -is receives the
following values in addition to the automatic values:
o singular genitive
o plural nominative, accusative and vocative (only if the LES is
masculine or feminine)
Example: fust-is
an LE n3, n31, n32, n3e or n3p formed with the SF -es receives the
following values in addition to the automatic values[12]:
o plural nominative, accusative and vocative (only if the LES is
masculine or feminine)
Example: compag-es
an LE n3n, n3n2, n3e or n3p formed with the SF -i receives the
following value in addition to the automatic values:
o singular dative
Example: stib-i
o For the LES sinap and senap (S1727), the values to be added to the
automatic ones are the following:
Singular dative and ablative
the following LE formed with the SF -e receive the following value in
addition to the automatic values:
o singular ablative
mare (M0378); rete (R0574); pane (P0232); proconsule (LES: proconsul P3669)
the following LE formed with a lemma in the LEM field receive, in
addition to the automatic values, the following ones:
o singular genitive
o plural nominative, accusative, vocative
suis (S3834); trabis (T0903); stirpis (S2515); dapis
(D0052); phthiriasis (P1974); orchitis (O0961); myrrhis (M9993); lentis
(L0528); iris (I2795); ibis (I0052); hypozeuxis (H0816); hiemis (H0366); hepteris
(H0267); gruis (G0632); gliris (G0379); nubis (N0596); epenthesis (E0683);
echis (E0051); cotis (C4181); celthis (C1086); carnis (C0681); calcis (C0291);
assis (A3009); lienis (L0732); cidaris (C1602); vomeris (u1205); incudis
(i1171); dapis (d0052); sortis (s2029); mesis (m0857); astytis (a3268);
amystis (a1815); eranthemis (e0848); phthois (p1981); utris (u1322)
the following LE formed with a lemma in the LEM field receive, in
addition to the automatic values, the following ones:
o plural nominative, accusative, vocative
trabes (T0903); stirpes (S2515); phthisis (P1978);
municipes (M1726); monades (M1279); diesis/dihesis (D1447); antipathes (A2236)
the following LE formed with a lemma in the LEM field receive, in
addition to the automatic values, the following one:
o singular ablative
lacte (L0040); sale (S0136)
the following LE formed with a lemma in the LEM field receive, in
addition to the automatic values, the following one:
o singular genitive
cucumis (C4511)
the following LE formed with a lemma in the LEM field receives, in
addition to the automatic values, the following one:
o singular dative
gingiberi/zingiberi (Z0025)
COD LEM n4
an LE n4 ending in -us (or formed with the SF -us), masculine or
feminine, receives the following values in addition to the automatic values:
o singular genitive
o plural nominative, accusative, vocative
Examples:
corn-us (SF -us)
magistratus (LE)
an LE n4 ending in -u (or formed with the SF -u), neuter, receives the
following values in addition to the automatic values:
o singular dative
o singular ablative
Example:
pec-u (SF -u)
no examples of LE n4 ending in -u
COD LEM n5
No LE and no SF in the LEM field.
1.3 Correspondence between planned project progress and actual
accomplishments
The progress made in Workpackage 5 in the period
from 1st September 2003 to 30th November 2003 is in line with
what was planned in the Project Program.
In particular, the results are the following:
Design of a general rule for the
management of LE; in particular, some special rules have been designed in order
to solve the problem of the contrast between SF coding and LE rule and to solve
some exceptions (see 1.1.1 and 1.2.1);
Management of the lemmatization of
the wordforms with structure LES + SM + SF (see 1.1.2);
Test of the lemmatization results
for FE and wordforms with structure LES + SF and LES + SM + SF. Application
of the results to research on Latin homography: writing of an article
on the subject (see 1.1.3 and 3.3);
Tables for Initial graphical
variations and post-final segments (see 1.1.4).
Management of N, V, PR, P1-P9 and P18
LES, scheduled for this period, has not yet been done, but the data structures and
functions have already been designed and will be implemented in the next
period.
In our previous Periodic Progress Report (September, 2003), the implementation of new LE rules had been planned for the period covered by this report. This implementation has not been done for the following reasons:
before implementing single LE rules,
we decided to write all the LE rules, in order to describe the problem to be
solved in the clearest way. The writing of these rules is still going on for
the n6 and n7 LES (see 1.2.1.2),
the testing on Tombeur's list of
wordforms required much time for the processing and validation of the resulting data
(see 1.1.3),
much time has been devoted to
writing articles and presenting papers at conferences (see 3.3)
2. Work progress overview
2.1 Specific objectives for the reporting period
During the period covered by this report, we continued the development
of LEMLAT into CHLT LEMLAT, obtaining in particular the following results:
Design of a general rule for the
management of LE; in particular, some special rules have been designed in order
to solve the problem of the contrast between SF coding and LE rule and to solve
some exceptions (see 1.1.1 and 1.2.1);
Management of the lemmatization of
the wordforms with structure LES + SM + SF (see 1.1.2);
Test of the lemmatization results
for FE and wordforms with structure LES + SF and LES + SM + SF. Application
of the results to research on Latin homography: writing of an article
on the subject (see 1.1.3 and 3.3);
Tables for Initial graphical
variations and post-final segments (see 1.1.4).
2.2 Achievements
2.2.1 List of Deliverables
December, 2002: Periodic Progress Report
March, 2003: Periodic Progress Report
June, 2003: D 5.1
September, 2003: Periodic Progress Report
2.2.2 Progress by Workpackage/task
According to the specific targets set, the phase of the work in Workpackage 5
covered by this report has produced the following results:
Design of a general rule for the
management of LE; in particular, some special rules have been designed in order
to solve the problem of the contrast between SF coding and LE rule and to solve
some exceptions (see 1.1.1 and 1.2.1);
Management of the lemmatization of
the wordforms with structure LES + SM + SF (see 1.1.2);
Test of the lemmatization results
for FE and wordforms with structure LES + SF and LES + SM + SF. Application
of the results to research on Latin homography: writing of an article
on the subject (see 1.1.3 and 3.3);
Tables for Initial graphical
variations and post-final segments (see 1.1.4).
2.3 Work planned for the next
reporting period
The work planned for the next
reporting period is the following:
writing LE rules for the management
of the n6 and n7 LE (first and second class adjectives) and Pluralia Tantum LE;
implementation of some LE rules;
testing lemmatization results on
some particular literary texts. In this phase, the texts will be chosen from among
those belonging to classical Latin (from the first century BC to the first
century AD). The texts will be in prose and in verse and will belong to
different genres (epic, philosophical, elegiac, historical, etc.);
management of N, V, PR, P1-P9 and P18
LES.
Project meeting:
November 19th-20th,
2002: Kansas City, UMKC
Marco Passarotti, Paolo Ruffolo
In the context of this meeting,
cooperation with WP4 has been established, in order to optimize the
standardization of the morphological coding for Latin and Old Norse. The EAGLES
codes used for the new version of LEMLAT have been shared, so that they can be
applied, with the necessary modifications, to the Old Norse wordforms as well.
Passarotti Marco, LEMLAT. A computational tool for latin lemmatization, communication at the Euroconference Philological Disciplines and Digital Technology. Computational Philology: Tradition versus Innovation, 6th-11th September 2003, Il Ciocco, Castelvecchio Pascoli (Lu), Italy.
Passarotti Marco, La lemmatizzazione. Cos'è, perché si deve fare, come io credo convenga farla, published in the on-line review Griselda (www.griseldaonline.it).
Alberto Paulo F., Cappelli Giuseppe, Passarotti Marco, Pena Abel N., Instrumentos informáticos para análise de texto, in Proceedings of the Conference Antiguidade Clássica: Que fazer com este Património?, Lisbon.
Passarotti Marco, LEMLAT: a computational tool for latin lemmatization. Development and perspectives, in Proceedings of the Round Table From manuscript to digital text. Problems of interpretation and markup, in the context of the Conference Latling: 12th International Colloquium on Latin Linguistics, Bologna.
CHLT Project
IST-2001-32745
1 December 2003 - 28
February 2004
Workpackage 5
Neo-Latin Morphological Analyser
C.N.R.
Andrea Bozzi
Giuseppe Cappelli
Marco Passarotti
Paolo Ruffolo
1. Summary of key indicators of project progress
1.1 Overall assessment of the main milestones achieved
1.1.1 n6, n7 and Pluralia Tantum LE values
1.1.2 Management of N, V, PR, P1-P9 and P18 LES
1.1.3 Management of the irregular Type
1.1.4 Management of the present participles, past participles, future participles and irregular gerundives
1.1.5 Output re-organization and XML format
1.2 Problems encountered and decisions taken
1.2.1 n6, n7 and Pluralia Tantum LE
1.2.2 Management of P1-P9, P18 LES
1.3 Correspondence between planned project progress and actual accomplishments
2. Work progress overview
2.1 Specific objectives for the reporting period
2.2 Achievements
2.2.1 List of Deliverables
2.2.2 Progress by Workpackage/task
2.3 Work planned for the next reporting period
3. Project Management
3.1 Contractual Issues
3.2 Co-operation within the consortium
3.3 Participation in workshops and/or conferences, publications
1. Summary of key indicators of
project progress
This report concerns the activities carried out by Workpackage 5 in the period from 1st
December 2003 to 29th February 2004[13].
1.1 Overall assessment of the
main milestones achieved
1.1.1 n6, n7 and Pluralia
Tantum LE values
The next sections report how the morphological values are attributed
to the LE; for how some of these attributions have been implemented, see 1.1.3.
1.1.1.1 N6, N6r, N6i
When a LE is registered with COD LES N6, N6r, or N6i[14],
the values to be automatically attributed to this form are:
nominative singular masculine
if the form ends in -(i)us generated with:
LE ending in -(i)us
or with
SF containing -US: e.g. -US, -US/-OS, -OS/-US, -IUS, -IUS/-IOS, =/-US,
-US/LE
Examples:
scholasticus (LE)
paral-ius (SF -IUS/-IOS)
miser-us (SF =/-US)
nomin/vocat singular masculine
if the LE ends in -er
Example: subniger
nomin/vocat singular masculine
if, in the field Lemma, there is an =
Example: flammiger (LES), Lemma field =
nominative singular masculine
accusative plural masculine
if the form ends in -os generated with:
LE ending in -*OS
or with
SF containing -OS: e.g. -OS, -US/-OS, -OS/-US
Examples:
tetrachordos (LE)
acosm-os (SF -OS)
nominative singular masculine
accusative plural masculine
if the form ends in -ios generated with SF containing -IOS (e.g.
-IUS/-IOS, -IOS)
Example: paral-ios
The pronoun forms (COD LES N6p) registered as LE (or with =) in the
field Lemma (only 14 LES) receive the following values:
PoS: P (pronominal)
Type: comes from the table Type coding[15]
Case: n (nominative)
Gender: m (masculine)
Number: s (singular)
1.1.1.2 N7, N71, N72 and N7c
When a LE is registered with
COD LES N7, N71, N72, or N7c, the values to be automatically attributed
to this form are:
Nominative/vocative singular masculine, feminine, neuter
and
Accusative singular neuter
If the LE is automatically created
by one of the following rules:
o LES ending in c, lemma ending in x: audax, audac-is
o LES ending in g, lemma
ending in x: exlex, exleg-is
o LES ending in d, lemma ending in s: aeripes, aeriped-is
o LES ending in t, lemma ending in s: cordalens, cordalent-is
If the LE in the field Lemma ends
in:
o -l: autumnal
o consonant + -s: viripotens
o -x: triplex
o -es: superstes
o -us: vetus
o -ur: incicur
o -on: chrysizon
o -as: cuias
o -os: exos
If the SF in the field Lemma is:
-ex: senex
-s: inops
If in the field Lemma there is an
= and the LES does not end in -ior: dispar
Nominative/vocative singular masculine, feminine, neuter
and
Accusative singular neuter
and
Nominative/Vocative/Accusative plural masculine and feminine
If the SF in the field Lemma is:
o -es: aphrod-es
However, if the LE with these characteristics is a Pluralia Tantum, this rule
does not apply: see 1.2.1.3.
Nominative/vocative singular masculine, feminine
and
Genitive singular masculine, feminine, neuter
and
Nomin/voc/acc plural masculine, feminine
If the LE in the field Lemma ends in -is: verisimilis
If the SF in the field Lemma is -is: grandis
Nominative/vocative singular masculine, feminine, comparative degree:
If in the field Lemma there is an = and the LES is registered with COD
LES N7c: peior
If in the field Lemma there is an = and the LES ends in -ior:
retrosior
Nominative/vocative singular masculine
If the LE in the field
Lemma ends in -er: acer
But not the LE veter, that behaves like vetus (see
above)
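The four rules given in this section for the automatic creation of an N7 LE from its LES (stem ending in c or g, lemma ending in x; stem ending in d or t, lemma ending in s) can be sketched in C as follows. This is a minimal illustration, not the LEMLAT code; the function name n7_automatic_le and its interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build the automatic LE of an N7 LES by replacing its final consonant:
   c, g -> x (audac -> audax, exleg -> exlex)
   d, t -> s (aeriped -> aeripes, cordalent -> cordalens)
   Returns 1 and writes the LE to `out` if one of the four rules applies,
   0 otherwise (including when `out` is too small). */
int n7_automatic_le(const char *les, char *out, size_t out_size)
{
    size_t n;
    char repl;
    assert(les != NULL && out != NULL);
    n = strlen(les);
    if (n == 0 || n + 1 > out_size)
        return 0;
    switch (les[n - 1]) {
    case 'c':
    case 'g':
        repl = 'x';
        break;
    case 'd':
    case 't':
        repl = 's';
        break;
    default:
        return 0;  /* no automatic rule for this ending */
    }
    memcpy(out, les, n + 1);  /* copy the stem with its terminator */
    out[n - 1] = repl;        /* replace the final consonant */
    return 1;
}
```

A stem such as dispar, which ends in none of the four consonants, is left to the other conditions of the section (for example the = in the field Lemma).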
1.1.1.3 Pluralia Tantum
The Pluralia Tantum are LE, since they have either the SF of the nominative plural or the whole plural LE registered in the field Lemma.
For instance, the form tusillae is
registered with the SF -AE in the field Lemma.
The solution is the same as that
adopted for the nouns (from N1 to
N5) and for the adjectives (N6 and N7), according to the following rules (the
gender is attributed according to the gender of the LES: for instance, the LES
tusill is N1 feminine and therefore receives Nomin/Voc Plural feminine):
The LE in the field Lemma formed
in the following ways receives the following values:
Nominative/Vocative plural
masculine/feminine
LE formed with a SF -AE with LES N1, N1E
Examples:
tusill-ae (N1)
nug-ae (N1E)
LE formed with a SF -I with LES N2, N2E
Examples:
syric-i (N2)
adelf-i (N2E)
LE formed with a SF -OE with LES N2E
Example:
fescemn-oe (N2E)
The LE in the field Lemma formed in the following ways receives the
following values:
Nominative/Vocative plural masculine
LE formed with a SF -II with LES
N2I
Example
cupidinar-ii (N2I)
LE formed with a SF -I with LES N6
Example:
octogeni (N6)
The LE in the field Lemma formed in the following ways receives the
following values:
Nominative/Accusative/Vocative
plural neuter
LE formed with a SF -IA with LES N2NI,
N3N2, N3E
Examples:
acetar-ia (N2NI)
accubital-ia (N3N2)
LE formed with a SF -UA with LES N4
Example
fulgitr-ua (N4)
LE formed with a SF -A with LES
N2N, N2E, N3N, N3N1, N3E
Examples:
aerum-a (N2N)
anthologic-a (N2E)
tragemat-a (N3N)
flemin-a (N3N1)
1.1.2 Management of N, V, PR,
P1-P9 and P18 LES
1.1.2.1 Management of N, V, PR
LES
For the treatment of the LES coded as N, V and PR, the same mechanisms applied to the exceptional forms have been used: references to the LES involved have been put in the table forme_ecc, together with the respective patterns of morphological codes to be produced in output.
The following table reports the total number of treated LES, grouped by codles:
codles | Number of LES
N | 467
PR | 65
V | 16
1.1.2.3 Management of P1-P9, P18
LES
The LES classified as P1-P9 and P18
have been treated using the mechanism of the post-final segments, suitably
modified (see section 1.2.2). In the LES table (lessario), a code identifying
the post-final segments admitted for each LES involved has been added:
for simplicity, the field spf has been assigned the same value as the codles.
The new segments have been added to the post-final segments table (tabspf) with
their respective compatibility codes:
segment | comp_cod
dem | p1
nam | p18
dam | p2
piam | p3
quam | p4
uis | p5
libeat | p6
libet | p6
lubeat | p6
lubet | p6
cumque | p7
cunque | p7
qunque | p7
quomque | p7
uiscumque | p8
uiscunque | p8
uisqunque | p8
uisquomque | p8
que | p9
The post-final segment reported in
the field segment is compatible
with the LES indicated in the field comp_cod.
Each LES of codles P1-P9, P18 is recognized if and only if it is linked, in input,
to a post-final segment that is compatible with it. The patterns of
morphological codes are obtained in the same way as for the exceptional forms: some
records relating to the LES P1-P9, P18 have been inserted in the table forme_ecc.
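The compatibility check between a LES and a post-final segment can be sketched in C as follows. This is only an illustration under stated assumptions: the in-memory array stands in for the database table tabspf (its rows are those listed above), and the names PostFinalSegment and spf_is_compatible are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a tabspf-like table. */
typedef struct {
    const char *segment;   /* post-final segment         */
    const char *comp_cod;  /* compatibility code (codles) */
} PostFinalSegment;

/* Rows reported for the P1-P9 and P18 LES. */
static const PostFinalSegment tabspf_rows[] = {
    { "dem", "p1" },        { "nam", "p18" },       { "dam", "p2" },
    { "piam", "p3" },       { "quam", "p4" },       { "uis", "p5" },
    { "libeat", "p6" },     { "libet", "p6" },      { "lubeat", "p6" },
    { "lubet", "p6" },      { "cumque", "p7" },     { "cunque", "p7" },
    { "qunque", "p7" },     { "quomque", "p7" },    { "uiscumque", "p8" },
    { "uiscunque", "p8" },  { "uisqunque", "p8" },  { "uisquomque", "p8" },
    { "que", "p9" },
};

/* Return 1 if a LES whose spf field holds `les_spf` may be followed by
   the post-final segment `segment`, 0 otherwise. A P1-P9 or P18 LES
   is recognized only when this check succeeds. */
int spf_is_compatible(const char *les_spf, const char *segment)
{
    assert(les_spf != NULL && segment != NULL);
    for (size_t i = 0; i < sizeof tabspf_rows / sizeof tabspf_rows[0]; i++) {
        if (strcmp(tabspf_rows[i].segment, segment) == 0 &&
            strcmp(tabspf_rows[i].comp_cod, les_spf) == 0)
            return 1;
    }
    return 0;
}
```

Because an empty segment never appears in the table, a LES of these classes that occurs without any post-final segment is rejected, which mirrors the "if and only if" condition stated above.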
The following table shows the number
of LES involved, grouped by codles:
codles | num_les
p1 | 88
p18 | 242
p2 | 76
p3 | 119
p4 | 75
p5 | 121
p6 | 156
p7 | 120
p8 | 77
p9 | 72
1.1.3 Management of the
irregular Type
The software and the database have been modified to solve the problem of the incorrect attribution of the Type for the LES discussed in the PPR of September 2003 (Annex 4.1) and in Deliverable D5.2.
The field Type has been added to the table of the LES. It allows an automatic
Type to be stored for all those LES that would not receive a correct Type from
the normal wordform processing. The system simply detects whether a default
value is present (non-empty Type field) and, if so, uses it to overwrite the
calculated value.
The numbers of LES modified are
reported in the following table, grouped by codles:
codles | num_les
fe | 4
n2n | 4
n31 | 2
n6 | 211
n6i | 2
n6p | 18
n6p2 | 1
n6s | 4
n7 | 5
n71 | 8
n7c | 1
n7p3 | 1
v3s | 1
v3sa | 1
v62a | 2
v64a | 3
v65a | 1
v6sf | 1
v7s | 1
1.1.4 Management of the
present participles, past participles, future participles and irregular gerundives.
Some LES that were stored in the lessario
with codles N6*, N7* and N2N are in fact roots of present participles, past participles, future participles or
gerundives. Because of their coding, the patterns of morphological
codes produced in output by the automatic lemmatization system were those
of first and second class adjectives or of neuter nouns of the second
declension. In normal operation, the morphological values relating
to participles or gerundives are given by the middle segments, e.g.
am-ant-em, am-at-us, am-atur-us, am-and-us. For this class of LES, the
absence of middle segments makes a correct classification impossible: e.g.
lect-us was analyzed as a first class adjective.
The problem has been solved by identifying the LES concerned and re-classifying
them with suitable codles. Moreover, some records have been added to the table tabsf;
they make the ending segments compatible with the new codles.
Lastly, the middle segments table (tabsm) has been updated in order to guarantee
the recognition of the superlatives and
comparatives of the LES involved.
1.1.4.1 LES Identification
The LES related to the irregular
forms of the participles and of the gerundives have been identified by
executing SQL queries on the table lessario, using in particular the ending of the les field and the information
contained in the fields codles and clem.
1.1.4.1.1 Identification of Present Participles
The LES that refer to irregular
Present Participles have been identified by queries that express the
following conditions:
Terminal part of the field LES: nt
1.1.4.1.2 Identification of Past Participles
The LES that refer to irregular Past
Participles have been identified by queries that express the following
conditions:
1.1.4.1.3 Identification of Future Participles
The LES that refer to irregular Future Participles have been identified by queries that express the following conditions:
Terminal part of the field LES: ur
1.1.4.1.4 Identification of Gerundives
The LES that refer to irregular Gerundives
have been identified by queries that express the following
conditions:
1.1.4.2 Modifications made to
the database
For each category of LES, the codles
has been replaced according to the following table:
Category | Old codles | New codles
Present Participle | N7* | N7P3
Past Participle | N6* | N6P1
Past Participle | N2N | N2NP
Future Participle | N6* | N6P2
Gerundive | N6* | N6G
In the table tabsf, the endings compatible with the old codles have been duplicated, the compatibility codes have been replaced with the new codes, and the patterns of morphological codes of participles and gerundives have been stored. For instance, the record:
segment | comp_cod | c01 | c02 | c03 | c04 | c05 | c06 | c07 | c08 | c09 | c10
us | n6p1 | v | m | * | k | 4 | n | m | s | - | -
This record allows the recognition of the
nominative masculine singular of the LES relating to irregular past
participles coded with codles N6P1, and gives in output the correct pattern
of morphological codes.
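As an illustration of how such a record could be looked up and its ten codes handed over to the analysis, here is a minimal C sketch. It is not the LEMLAT code: the in-memory array stands in for the table tabsf and contains only the record quoted above, the names SfRecord, find_sf and split_codes are hypothetical, and the split into c01-c03 and c04-c10 is an assumption suggested by the field names cod_morf_1_3 and cod_morf_4_10 used in the output structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a tabsf-like table: ending, compatibility code, c01..c10. */
typedef struct {
    const char *segment;
    const char *comp_cod;
    char codes[11];  /* ten morphological codes plus terminator */
} SfRecord;

/* Only the record quoted in the report. */
static const SfRecord tabsf_rows[] = {
    { "us", "n6p1", "vm*k4nms--" },
};

/* Find the record for an ending and a codles, or NULL if absent. */
const SfRecord *find_sf(const char *segment, const char *comp_cod)
{
    assert(segment != NULL && comp_cod != NULL);
    for (size_t i = 0; i < sizeof tabsf_rows / sizeof tabsf_rows[0]; i++) {
        if (strcmp(tabsf_rows[i].segment, segment) == 0 &&
            strcmp(tabsf_rows[i].comp_cod, comp_cod) == 0)
            return &tabsf_rows[i];
    }
    return NULL;
}

/* Assumed split: c01-c03 describe the lemma, c04-c10 the wordform. */
void split_codes(const SfRecord *r, char first3[4], char last7[8])
{
    assert(r != NULL);
    memcpy(first3, r->codes, 3);
    first3[3] = '\0';
    memcpy(last7, r->codes + 3, 7);
    last7[7] = '\0';
}
```

With the record above, the ending us is found only for codles n6p1, so an LES still coded as N6 would not receive the participle pattern from this record.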
Lastly, the records necessary for the formation of the superlatives and
comparatives of irregular participles and gerundives have been
added to the table tabsm. For instance,
in order to permit the formation of the comparative of the future participle,
the following record has been inserted:
segment | pm | comp_cod_prec | comp_cod_succ | c01 | c02 | c03 | c04 | c05 | c06 | c07 | c08 | c09 | c10
ior | - | n6p2 | n7c | v | m | * | j | 3 | = | = | = | = | =
1.1.5 Output
re-organization and XML format
A new data structure has been
designed and implemented in order to record the results of the lemmatization so
as to satisfy the following properties:
The following data structure is
used for recording lemmas:
typedef struct Lemma
{
    char out_lemma[MAX_FORM_LENGTH];
    char cod_lemma[CODLEM_LENGTH];
    char cod_morf_1_3[3];
    LEM_TYPE type;
    char n_id[N_ID_LENGTH];
    char gen;
    char codles[CODLES_LENGTH];
} Lemma;
The field cod_morf_1_3 records
the morphological codes relating to the lemma. The field type
records the kind of lemma (IPERLEMMA or IPOLEMMA).
The following structure is used to
record the sequence of lemmas
contained in each analysis:
typedef struct Lemmas
{
    short numL;
    short ind;
    Lemma lems[MAX_N_LEMS];
} Lemmas;
The analysis of a form is recorded in the data
structure Analysis, which contains the fields relating to the
sequence of morphological patterns and to the sequence of lemmas.
typedef struct Analysis
{
    char segments[NUM_SEGMENTS][MAX_SEG_LENGTH];
    short part;
    short n_cod_morf;
    char cod_morf_4_10[MAX_N_COD_MORF][7];
    Lemmas lemmas;
} Analysis;
The result of the lemmatization of a form is a sequence of analyses, recorded by the following:
typedef struct Analyses
{
    char in_form[MAX_FORM_LENGTH];
    char alt_in_form[MAX_FORM_LENGTH];
    short numAnalysis;
    Analysis analysis[MAX_N_ANALYSIS];
} Analyses;
Some functions have been designed and implemented that automatically
convert the CODLEM into the first three morphological codes by a
simple look-up in a database table.
The following function has been
designed to output the lemmatization data in XML format:
void XMLOut()
{
    int i, l, j, a;
    Lemma *curLemma;
    Analysis *curAnalysis;

    // xml header:
    fprintf(po, "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n");
    // Analyses:
    fprintf(po, "<Analyses>\n");
    fprintf(po, "<form>%s</form>\n<alt_form>%s</alt_form>\n",
            analyses.in_form, analyses.alt_in_form);
    for (a = 0; a < analyses.numAnalysis; a++)
    {
        // Analysis:
        curAnalysis = (analyses.analysis + a);
        fprintf(po, "<Analysis>\n");
        // enclitic:
        if (*(curAnalysis->segments[6]))
            fprintf(po, "<enc>%s</enc>\n", curAnalysis->segments[6]);
        // particle:
        if (curAnalysis->part)
            fprintf(po, "<part>%s</part>\n", curAnalysis->segments[5]);
        // segmentation:
        fprintf(po, "<segmentation>\n");
        if (*(curAnalysis->segments[0]))
            fprintf(po, "<alt>%s</alt>\n", (curAnalysis->segments)[0]);
        fprintf(po, "<les>%s</les>\n", (curAnalysis->segments)[1]);
        if (*(curAnalysis->segments[2]))
            fprintf(po, "<sm1>%s</sm1>\n", (curAnalysis->segments)[2]);
        if (*(curAnalysis->segments[3]))
            fprintf(po, "<sm2>%s</sm2>\n", (curAnalysis->segments)[3]);
        if (*(curAnalysis->segments[4]))
            fprintf(po, "<sf>%s</sf>\n", (curAnalysis->segments)[4]);
        if (!curAnalysis->part && *(curAnalysis->segments[5]))
            fprintf(po, "<spf>%s</spf>\n", (curAnalysis->segments)[5]);
        fprintf(po, "</segmentation>\n");
        // sequence of morphological codes:
        fprintf(po, "<morphological_analyses>\n");
        for (i = 0; i < curAnalysis->n_cod_morf; i++)
        {
            fprintf(po, "<morphological_codes>\n");
            for (j = 0; j < 7; j++)
                if (curAnalysis->cod_morf_4_10[i][j] != '-')
                    fprintf(po, "<%s>%c</%s>\n", codes[j + 3],
                            curAnalysis->cod_morf_4_10[i][j], codes[j + 3]);
            fprintf(po, "</morphological_codes>\n");
        }
        fprintf(po, "</morphological_analyses>\n");
        // lemmas:
        fprintf(po, "<Lemmas>\n");
        for (l = 0; l < curAnalysis->lemmas.numL; l++)
        {
            curLemma = (curAnalysis->lemmas.lems + l);
            if (curAnalysis->lemmas.numL > 1)
            {
                if (curLemma->type == IPERLEMMA)
                    fprintf(po, "<Lemma type=\"iper\">\n");
                else
                    fprintf(po, "<Lemma type=\"ipo\">\n");
            }
            else
                fprintf(po, "<Lemma>\n");
            fprintf(po, "<lemma>%s</lemma>\n", curLemma->out_lemma);
            // morphological code of the lemma:
            if ((curLemma->type == IPERLEMMA) || (curAnalysis->lemmas.numL == 1))
            {
                fprintf(po, "<lemma_morphological_codes>\n");
                for (j = 0; j < 3; j++)
                    if (curLemma->cod_morf_1_3[j] != '-')
                        fprintf(po, "<%s>%c</%s>\n", codes[j],
                                curLemma->cod_morf_1_3[j], codes[j]);
                fprintf(po, "</lemma_morphological_codes>\n");
            }
            fprintf(po, "</Lemma>\n");
        }
        fprintf(po, "</Lemmas>\n");
        fprintf(po, "</Analysis>\n");
    }
    fprintf(po, "</Analyses>\n");
}
The demo version currently running on the project Web site converts the XML output produced by this function into HTML using the open source library libxslt ( http://xmlsoft.org/XSLT ) and a suitable style sheet (.xsl file).
A common format for the XML output, and a corresponding DOM, will soon be
agreed with the other partners of the project.
1.2 Problems encountered and decisions taken
1.2.1 n6, n7 and Pluralia Tantum LE
During the writing of the indications about the attribution of the
morphological values to the LE,
some specific cases that are beyond a possible generalization have been
recognised. They are reported in the next sections.
1.2.1.1 N6, N6r, N6i
To the LE duo and ambo (formed with the addition of the SF O to the
LES, as reported in the field Lemma) it is necessary to attribute the
following values: nomin/acc/voc plural masculine and neuter.
1.2.1.2 N7, N71, N72 and N7c
The LE heptameres receives the following values:
Nomin/voc sing masc/fem/neuter
Acc sing neuter
Nomin/acc/voc plural masc/fem
The value to be attributed is nominative singular masculine, if the LES
with = in the field Lemma is one of the following: auster (a3599), celer
(c1051); debil (d0103); perceler (p0940); praeceler (p2818)
If the LE is dis, the values are the following:
Nomin/voc sing masc/fem/neuter
Acc sing neuter
If the LES is tr, the values are the following:
Nomin/acc/voc plural
1.2.1.3 Pluralia Tantum
The following LE formed with a whole
LE in the field Lemma receive the values nominative and vocative plural
masculine (excluded, therefore, from all the other generalizations about the LE
registered with COD LES included between N1 and N7): transrhenani; sinistri;
ischiaci; octoni-deni; inferi
Equally, the following LE receive the values nominative, vocative,
accusative plural neuter: scholastica; rhagadia; pulchralia; palustria;
hepatia; sponsalia; parentalia; phalera; medica (the one registered with COD
LES N2N: in fact, another LE medica exists, but registered with COD LES N1 and
receiving the values nominative, vocative, ablative singular, according to what
has been stated above).
The values nominative, accusative, vocative plural masculine/feminine
have been attributed to:
The LE formed with a SF -ES of LES registered with COD LES N3, N31, N32, N3E (the other LE with these features that are not listed here receive the automatic values, as described, for instance, for the LE compag-es in our previous Periodic Progress Report, section 1.2.1):
o N3
archigeront-es; armit-es; axit-es; bancal-es; burgon-es; decur-es; detud-es; felicon-es; filicon-es; fifeltar-es; gaesat-es; penat-es
o N31
can-es; ophiogen-es; myrtit-es
o N3E
argyraspid-es; argyroaspid-es; cardac-es; chrysoaspid-es; epod-es; hyad-es; hyperballont-es; megistan-es; antipod-es
o N32
ant-es; confanens-es; decatrens-es; decatress-es; decumat-es; foederal-es; grat-es; lact-es; maan-es; man-es; perplur-es; pseudocomitatens-es; quamplur-es; transfluminal-es; magnat-es; auxiliar-es; flexunt-es; quinquatr-es
The following LE formed with a SF -ES of LES registered with COD LES N71 (the other LE with these features that are not listed here receive the values according to what is reported in 1.1.1.2):
o N71
alcionid-es; amsedent-es; chamaerep-es; complur-es; corregional-es; illic-es; quomplur-es; tr-es (the only LES coded N7 rather than N71)
The following LE formed with a whole LE in the field Lemma (and therefore excluded from all the other indications about the LE registered with a COD LES between N1 and N7):
o plures; aliquam-plures
The following LE formed with a whole LE in the field Lemma receive the values nominative, accusative, vocative plural masculine: scholares; antichthones; thetates.
The following LE formed with a whole LE in the field Lemma receive the values nominative, vocative plural feminine: phalerae; inferiae.
The LE rhagades, formed with a whole LE in the field Lemma, receives the values nominative, accusative, vocative plural feminine.
1.2.2 Management of P1-P9, P18 LES
While implementing the recognition of the P1-P9 and P18 LES, two problems were identified. The first concerns post-final segments for which the ordinary compatibility rules were not suited. For the post-final segments that are compatible with the LES in question, the system has to verify their presence in the form and check their compatibility, and only then can it recognize the wordform. The second problem concerns the maximum number of post-final segments admitted for each form: the old version of LEMLAT admitted at most one post-final segment. This prevented the recognition of forms such as quod-libet-que, which contains both the post-final segment libet and the enclitic que (also classified as a post-final segment).
Both problems have been solved by modifying the code that recognizes the post-final segments and by doubling the data structures that store the segments found in the form.
As a result, the system can now recognize more forms (those with two post-final segments) and classify the post-final segments precisely:
Segments that must be part of a word form (e.g. the segment libet)
Segments that can be present in some forms (e.g. the particle met)
Segments that can be part of any form (e.g. the enclitic que)
Lastly, the output of the post-final segments has been revised in order to give some information about the nature of the segments: for those of the first class no indication is given in the output (because they are indeclinable parts of a form), for those of the second class the indication particle is given, and for those of the third class the indication enclitic.
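The handling of up to two post-final segments can be illustrated with a minimal Python sketch. It is not the LEMLAT code: the segment inventory contains only the three examples named above, the names POST_FINAL_SEGMENTS, OUTPUT_LABEL and strip_post_final are hypothetical, and the compatibility check between segment and LES is left out.

```python
# Hypothetical inventory: only the three segments named in the report.
# Class 1: obligatory part of a form; class 2: particle; class 3: enclitic.
POST_FINAL_SEGMENTS = {"libet": 1, "met": 2, "que": 3}
OUTPUT_LABEL = {1: None, 2: "particle", 3: "enclitic"}
MAX_SEGMENTS = 2  # the old LEMLAT admitted one segment, the new version two


def strip_post_final(wordform):
    """Peel up to MAX_SEGMENTS post-final segments off the end of a wordform.

    Returns (remaining_form, segments), where segments is a left-to-right
    list of (segment, output_label) pairs. Compatibility checks between
    the segment and the LES are deliberately omitted from this sketch.
    """
    segments = []
    rest = wordform
    while len(segments) < MAX_SEGMENTS:
        match = next(
            (s for s in POST_FINAL_SEGMENTS
             if rest.endswith(s) and len(rest) > len(s)),
            None,
        )
        if match is None:
            break
        segments.insert(0, (match, OUTPUT_LABEL[POST_FINAL_SEGMENTS[match]]))
        rest = rest[: -len(match)]
    return rest, segments


print(strip_post_final("quodlibetque"))
```

For quodlibetque the sketch first removes que and then libet, leaving quod for the ordinary analysis. A real implementation would still have to validate the remaining form, since a purely string-based check would also split words that merely end in one of these letter sequences.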
1.3 Correspondence between planned project progress and actual
accomplishments
The progress made in Workpackage 5 in the period from 1st December 2003 to 28th February 2004 is in line with what was planned in the Project Program.
In particular, the results are the following:
Identification of the morphological
values to be attributed to the LE of COD LES N6*, N7* and Pluralia Tantum;
Management of N, V, PR, P1-P9 and
P18 LES
Management of irregular Type
Management of the present
participles, past participles, future participles and irregular gerundives
Output re-organization and XML format
The test on literary texts, planned for this period, has not been carried out: we preferred to devote the time to the attribution of the morphological values to the LE, so as to have a version of LEMLAT on which testing is more meaningful.
2. Work progress overview
2.1 Specific objectives for the reporting period
During the period covered by this report, we continued developing LEMLAT into CHLT LEMLAT, obtaining in particular the following results:
Identification of the morphological
values to be attributed to the LE of COD LES N6*, N7* and Pluralia Tantum;
Management of N, V, PR, P1-P9 and
P18 LES
Management of irregular Type
Management of the present
participles, past participles, future participles and irregular gerundives
Output re-organization and XML format
2.2 Achievements
2.2.1 List of Deliverables
December, 2002: Periodic Progress Report
March, 2003: Periodic Progress Report
June, 2003: D 5.1
September, 2003: Periodic Progress Report
December, 2003: Periodic Progress Report
February, 2004: D 5.2
2.2.2 Progress by Workpackage/task
In line with the specific targets set, the phase of the work in Workpackage 5 covered by this report has produced the following results:
Identification of the morphological
values to be attributed to the LE of COD LES N6*, N7* and Pluralia Tantum;
Management of N, V, PR, P1-P9 and
P18 LES
Management of irregular Type
Management of the present
participles, past participles, future participles and irregular gerundives
Output re-organization and XML format
2.3 Work planned for the next
reporting period
The work planned for the next
reporting period is the following:
Implementation of LE management
algorithms
Software testing and validation
Starting software documentation
3. Project Management
3.1 Contractual Issues
No contractual issues in the period
covered by this report.
A discussion about the XML format with the CHLT partners has just started. At the end of this stage, a common and suitable XML DOM will be implemented and used for the LEMLAT output.
Bozzi Andrea, Informatica per la linguistica storica: sperimentazione su testi greci e latini, Convegno di studi in memoria di Tristano Bolelli, Scuola Normale Superiore, Pisa.
Alberto Paulo F., Cappelli Giuseppe, Passarotti Marco, Pena Abel N., Instrumentos informáticos para análise de texto, in Proceedings of the Colloquium Antiguidade Clássica: Que fazer com este Património?, Lisbon.
Passarotti Marco, LEMLAT: a computational tool for latin lemmatization. Development and perspectives, in Philological Disciplines and Digital Technologies, Linguistica Computazionale, 22-22.
Bozzi Andrea, Informatica per la linguistica storica: sperimentazione su testi greci e latini, in Studi e saggi linguistici.
CHLT Project
IST-2001-32745
1 March - 31 May 2004
Workpackage 5: Neo-Latin Morphological Analyser
C.N.R.
Andrea Bozzi
Giuseppe Cappelli
Marco Passarotti
Paolo Ruffolo
1. Summary of key indicators of project progress
1.1 Overall assessment of the main milestones achieved
1.1.1 Wordforms index, lemmas index and rationarium
1.1.2 Implementation of LE management algorithms
1.1.2.1 Database renewing
1.1.2.2 Algorithm for LE recognition and analysis
1.1.3 Implementation of text morphological analysis utilities
1.2 Problems encountered and decisions taken
1.3 Correspondence between planned project progress and actual accomplishments
2. Work progress overview
2.1 Specific objectives for the reporting period
2.2 Achievements
2.2.1 List of Deliverables
2.2.2 Progress by Workpackage/task
2.3 Work planned for the next reporting period
3. Project Management
3.1 Co-operation within the consortium
1. Summary of key indicators of project progress
This report concerns the activities carried out by Workpackage 5 in the period from 1st March 2004 to 31st May 2004[16].
1.1 Overall assessment of the main milestones achieved
The last three months have been devoted to:
1. the management of LE analysis,
2. the organisation of the data coming from the morphological analysis performed by CHLT LEMLAT on an input text.
1.1.1 Wordforms index, lemmas
index and rationarium
The coding of the wordform segments (LES, SF, SM) and the management of the different segmentations of input wordforms (LES+SF; LES+SM+SF; LES+SM+SM+SF; LE; FE) have been completed.
Thus, at present, all the wordforms that can be analysed by LEMLAT are also analysed by CHLT LEMLAT, with the addition of the requested information.
Since a morphological analyser is helpful at the level of the analysis of a whole text, and not only of a particular wordform, we are deciding how to organize the results of the CHLT LEMLAT analysis when it is applied to a text.
We plan to organize the results of the CHLT LEMLAT analysis in three documents:
1. Wordforms index: the alphabetical list of all the wordforms occurring in the input text. Each wordform is followed by its number of occurrences and by the number of different lemmas to which it is reduced;
2. Lemmas index: the alphabetical list of all the lemmas resulting from the analysis of the text wordforms. Each lemma is followed by the total number of occurrences of its wordforms and by the number of its different wordforms occurring in the text;
3. Rationarium: the alphabetical list of the lemmas, where each lemma is followed by the complete morphological analysis of each of its wordforms.
Each of these documents should also be available as an index retrogradus, that is to say, ordered by reading the lexical entry not from left to right, but from right to left.
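The three planned documents and the index retrogradus can be sketched in Python as follows. This is only an illustration under stated assumptions: the analyser output is mocked by the hypothetical dictionary TOY_ANALYSES, the analysis labels for arcium are descriptive placeholders rather than real CHLT LEMLAT codes, and the function names are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical analyser output: wordform -> list of (lemma, analysis).
# Only "NcB--nns--" is a code quoted in the report; the other labels
# are descriptive placeholders.
TOY_ANALYSES = {
    "arcium": [("arcion", "genitive plural"),
               ("arx", "genitive plural feminine")],
    "arcion": [("arcion", "NcB--nns--")],
}


def build_indexes(tokens, analyses):
    """Return (wordforms_index, lemmas_index, rationarium) for a token list."""
    occurrences = Counter(tokens)
    wordforms_index = {}
    lemma_forms = defaultdict(set)
    rationarium = defaultdict(dict)
    for form in sorted(occurrences):
        results = analyses.get(form, [])
        lemmas = {lemma for lemma, _ in results}
        # occurrences of the wordform, number of different lemmas
        wordforms_index[form] = (occurrences[form], len(lemmas))
        for lemma, analysis in results:
            lemma_forms[lemma].add(form)
            rationarium[lemma].setdefault(form, []).append(analysis)
    # total occurrences of the lemma's wordforms, number of different wordforms
    lemmas_index = {
        lemma: (sum(occurrences[f] for f in forms), len(forms))
        for lemma, forms in sorted(lemma_forms.items())
    }
    return wordforms_index, lemmas_index, dict(sorted(rationarium.items()))


def retrogradus(entries):
    """Sort entries by reading each one from right to left."""
    return sorted(entries, key=lambda entry: entry[::-1])


wf_index, lm_index, rat = build_indexes(["arcium", "arcion", "arcium"], TOY_ANALYSES)
print(wf_index)
print(lm_index)
print(retrogradus(list(lm_index) + ["arcium"]))
```

In this sketch an ambiguous wordform contributes its occurrences to every lemma it can be reduced to; whether the real lemmas index counts ambiguous forms in this way is not stated in the report.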
1.1.2 Implementation of LE
management algorithms
In order to implement the rules described in PPR 12-2003 (1.1.1; 1.2) and PPR 03-2004 (1.1.1; 1.2.1) concerning the LE analysis, we enriched the database with new tables and built new algorithms into the CHLT LEMLAT source code.
1.1.2.1 Database renewing
We implemented a table containing the different typologies of morphological code patterns to be automatically applied to LE.
This table, named cod_le, contains two kinds of fields:
1. the alphanumeric code of the LE typology (field cod_le),
2. the pattern of codes (fields from c04 to c10): the codes of the first three positions (c01: PoS; c02: Type; c03: Flexive Category) are not involved here, because they belong to the lemma and not to the wordform (the LE, like all lemmas, is also a wordform).
For instance, the code 5 in the field cod_le concerns the LE that must receive the neuter nominative, vocative and accusative singular; it corresponds to three patterns, with the following values in the fields c04-c10:
       pattern 1                pattern 2                pattern 3
c04    -                        -                        -
c05    -                        -                        -
c06    n (case: nominative)     v (case: vocative)       a (case: accusative)
c07    n (gender: neuter)       n (gender: neuter)       n (gender: neuter)
c08    s (number: singular)     s (number: singular)     s (number: singular)
c09    -                        -                        -
c10    -                        -                        -
The LE are stored in the table:
tab_le (lemma, CodLE, LES_ID)
This table contains three fields:
1. lemma: contains the LE itself,
2. CodLE: contains the code of the LE typology (according to the codes stored in the cod_le field of the cod_le table),
3. LES_ID: links to the corresponding LES in the lessario table.
For instance, for the LE arcion we stored the following row in tab_le:
tab_le (arcion, 5, 4553)
(in lessario the LE arcion has the value 4553 in the key field)
Furthermore, a new field (CodLE) has been added to the lessario table, in order to filter out conflicting patterns (see 1.1.2.2).
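The two tables can be pictured with a small Python sketch that uses an in-memory SQLite database as a stand-in for the project's MySQL database. The column types, the choice of storing one cod_le row per pattern, and the helper patterns_for are assumptions made for the illustration; only the table names, the field names and the arcion row come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the project's MySQL database
conn.executescript("""
CREATE TABLE cod_le (
    cod_le TEXT NOT NULL,            -- code of the LE typology
    c04 TEXT, c05 TEXT, c06 TEXT,    -- assumed layout: one row per pattern
    c07 TEXT, c08 TEXT, c09 TEXT, c10 TEXT
);
CREATE TABLE tab_le (
    lemma  TEXT NOT NULL,            -- the LE itself
    CodLE  TEXT NOT NULL,            -- refers to cod_le.cod_le
    LES_ID INTEGER NOT NULL          -- key of the LES in the lessario table
);
""")
# Typology 5: neuter nominative, vocative and accusative singular.
conn.executemany(
    "INSERT INTO cod_le VALUES ('5', '-', '-', ?, 'n', 's', '-', '-')",
    [("n",), ("v",), ("a",)],
)
conn.execute("INSERT INTO tab_le VALUES ('arcion', '5', 4553)")


def patterns_for(lemma):
    """Return the c04-c10 patterns attached to an LE, as 7-character strings."""
    rows = conn.execute(
        """SELECT c04 || c05 || c06 || c07 || c08 || c09 || c10
           FROM tab_le JOIN cod_le ON tab_le.CodLE = cod_le.cod_le
           WHERE tab_le.lemma = ?
           ORDER BY cod_le.rowid""",
        (lemma,),
    ).fetchall()
    return [row[0] for row in rows]


print(patterns_for("arcion"))
```

Keeping the patterns in cod_le and only a typology code in tab_le means that a change to a typology is made once and applies to every LE that shares it.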
1.1.2.2 Algorithm for LE recognition and analysis
A specific function for LE management has been implemented:
int lemmi_ecc (const char *str)
This function, called before the segmentation of an input wordform is performed, uses the information stored in the tab_le and cod_le tables in order to handle possible LE.
For each input wordform, a query is performed on the data stored in the tab_le table: if the input wordform is contained in the field lemma of this table, the morphological pattern(s) of codes (coming from the cod_le table) corresponding to the alphanumeric code stored in the CodLE field are included in the morphological analysis (by default). The lemma is produced according to the LES row in the lessario table.
In the case of the input wordform arcion (LE), the following analysis is performed:
- Lemma: arcion
- Morphological pattern of codes: NcB--nns--; NcB--vns--; NcB--ans--
The first three codes (NcB) are produced through a conversion from the LEMLAT COD LEM (n2 in the case of arcion: second declension nouns): n2 corresponds to NcB.
The correspondence between the LEMLAT COD LEM and the first three EAGLES codes is stored in the EAGLES table.
The remaining seven-code patterns (--nns--; --vns--; --ans--) are produced from the typology of LE defined by the value of the CodLE field in the tab_le table: in the case of arcion, this value is 5. Since in the cod_le table the cod_le 5 corresponds to the patterns --nns--, --vns-- and --ans--, these are included in the morphological analysis of arcion.
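The way the ten-position codes are assembled for an LE can be sketched in Python. This is not the C function lemmi_ecc: the dictionaries below are hypothetical in-memory copies of the EAGLES, cod_le, tab_le and lessario tables, filled only with the arcion example, and the name analyse_le is invented for the illustration.

```python
# Hypothetical in-memory copies of the tables; only the arcion example
# from the report is filled in.
EAGLES = {"n2": "NcB"}                       # COD LEM -> first three codes
COD_LE = {"5": ["--nns--", "--vns--", "--ans--"]}
TAB_LE = {"arcion": ("5", 4553)}             # lemma -> (CodLE, LES_ID)
LESSARIO = {4553: {"lemma": "arcion", "cod_lem": "n2"}}


def analyse_le(wordform):
    """Return (lemma, list of 10-character codes) for a stored LE, else None."""
    entry = TAB_LE.get(wordform)
    if entry is None:
        return None  # not an LE: ordinary segmentation would take over
    cod_le, les_id = entry
    les = LESSARIO[les_id]
    prefix = EAGLES[les["cod_lem"]]          # codes c01-c03, from the lemma
    return les["lemma"], [prefix + pattern for pattern in COD_LE[cod_le]]


print(analyse_le("arcion"))
```

The prefix depends only on the lemma and the pattern only on the LE typology, so the two parts can be looked up independently and simply concatenated.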
The algorithms for segmentation and recognition of the SF have been modified in order to avoid possible conflicts with the analysis coming from the new LE algorithm.
For instance, since arcion is a second declension noun, the wordform arcium would be segmented as arci-um and analysed as a singular neuter nominative, vocative and accusative, and as a reduced form of the plural genitive arciorum (in fact, the SF um, compatible with a LES coded with COD LES n2e, as the LES arci is, carries this information). However, the actual singular nominative, vocative and accusative form of arcion is arcion itself, and not the regular arcium (arcion is in fact a LE): thus, the analysis of the wordform arcium as a singular neuter nominative, vocative and accusative of the lemma arcion must be avoided, and only the analysis as a plural genitive is performed. Furthermore, the wordform arcium is also analysed as a feminine genitive plural of the third declension nominal lemma arx.
If the field CodLE of the lessario row selected for the candidate analysis is not empty, a query is performed in order to exclude the rows of the SF table (tabsf) that carry the same morphological pattern(s) recorded in the cod_le table for the value of the CodLE field in the lessario table.
In the case of arcion, the value 5 is recorded in the CodLE field in the lessario table: this implies that the rows of the SFs compatible with the LES arci coded with COD LES n2e (second declension nouns with irregular inflexion) that carry the same patterns of codes as the code 5 are excluded from the analysis.
In short, the codes recorded in the CodLE fields of the lessario table and of the tab_le table perform, respectively, the exclusion and the inclusion of the corresponding patterns of codes.
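The exclusion step can be illustrated with a short Python sketch based on the arcium example. The contents of TABSF are hypothetical, the genitive plural pattern "--gnp--" is an assumed encoding that does not appear in this report, and the function name sf_patterns is invented for the example.

```python
# Hypothetical SF rows for the ending "um" on an n2e LES. The genitive
# plural pattern "--gnp--" is an assumed encoding, not taken from the report.
TABSF = [
    {"sf": "um", "cod_les": "n2e", "pattern": "--nns--"},
    {"sf": "um", "cod_les": "n2e", "pattern": "--vns--"},
    {"sf": "um", "cod_les": "n2e", "pattern": "--ans--"},
    {"sf": "um", "cod_les": "n2e", "pattern": "--gnp--"},
]
COD_LE = {"5": {"--nns--", "--vns--", "--ans--"}}


def sf_patterns(sf, cod_les, les_cod_le=None):
    """Patterns an SF contributes to a LES, minus those reserved for its LE.

    les_cod_le is the value of the CodLE field of the LES row in lessario;
    None or an empty string means that no exclusion applies.
    """
    excluded = COD_LE.get(les_cod_le, set()) if les_cod_le else set()
    return [
        row["pattern"]
        for row in TABSF
        if row["sf"] == sf
        and row["cod_les"] == cod_les
        and row["pattern"] not in excluded
    ]


print(sf_patterns("um", "n2e", les_cod_le="5"))
```

With CodLE set to 5 for the LES arci, only the genitive plural reading of arci-um survives, while a LES with an empty CodLE keeps all four readings.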
All the LE have been extracted from the lessario table and stored in the tab_le table according to the rules given in PPR 12-2003 (1.1.1; 1.2) and PPR 03-2004 (1.1.1; 1.2.1).
1.1.3 Implementation of text
morphological analysis utilities
Using the CHLT LEMLAT library, a text analysis application has been implemented, producing an output in XML format.
The results of the analysis are organized in the following documents:
- Wordforms index
- Lemmas index
- Rationarium
- Wordforms analysis
The format of these documents is still being defined: we plan to define it according to TEI standards.
1.2 Problems encountered and decisions taken
No problems have been encountered in the work carried out during the period covered by this report.
1.3 Correspondence between planned project progress and actual
accomplishments
The progress made in Workpackage 5 in the period from 1st March 2004 to 31st May 2004 is in line with what was planned in the Project Program.
In particular, the results are the following:
Planning about how to organise the
CHLT LEMLAT results;
Implementation of LE management
algorithms
Implementation of text morphological
analysis utilities
In our previous PPR (March, 2004) we planned to test and validate the software, and to start writing some documentation about it. Since the work on the text analysis and on the algorithms for the LE management required more time than planned, we will do that when the software is at an almost definitive stage.
2. Work progress overview
2.1 Specific objectives for the reporting period
During the period covered by this report, we continued developing LEMLAT into CHLT LEMLAT, obtaining in particular the following results:
Planning about how to organise the
CHLT LEMLAT results;
Implementation of LE management algorithms
Implementation of text morphological
analysis utilities
2.2 Achievements
2.2.1 List of Deliverables
December, 2002: Periodic Progress Report
March, 2003: Periodic Progress Report
June, 2003: D 5.1
September, 2003: Periodic Progress Report
December, 2003: Periodic Progress Report
February, 2004: D 5.2
March, 2004: Periodic Progress Report
2.2.2 Progress by Workpackage/task
In line with the specific targets set, the phase of the work in Workpackage 5 covered by this report has produced the following results:
Planning about how to organise the
CHLT LEMLAT results;
Implementation of LE management algorithms
Implementation of text morphological
analysis utilities
2.3 Work planned for the next
reporting period
The work planned for the next
reporting period is the following:
testing CHLT LEMLAT lemmatization results: a number of Latin texts (and/or single wordforms) will be submitted to CHLT LEMLAT. The results will be analytically checked in order to find possible mistakes;
source code testing and validation;
source code documentation, in order to help developers build specific applications;
definition and documentation of CHLT LEMLAT results: producing a TEI-compliant DTD specifically designed with morphological elements and attributes.
[1] December 2002 and March 2003.
[2] Some SM can sometimes also occur in the wordform final position: for instance, the SM ior in the wordform pulchr-ior. Nevertheless, they still remain SM: in fact, wordforms like pulchrior are segmented by LEMLAT by recognizing a LES (pulchr), an SM (ior) and an empty ending (blk: blank ending).
[3] The number of applied gender codes (19,600) is different from the one reported in D 5.1 (20,984), because of some modifications to the LES archive.
[4] Assigned, for instance, to verbal LES.
[5] v1/n6 means that the SM and is compatible, on the left, with a v1 LES and, on the right, with an n6 SF.
[6] For more details about the Type
codes, see our previous Reports and Deliverable. In any case, the semantics of
the Type code here used is the following:
o 3: determinative
o 4: Indefinitive/Interrogative
o 5: Interrogative/Relative
o 6: Indefinitive/Ordinal
o d: Distributive
o n: Cardinal
o o: Ordinal
o s: Possessive
o t: Interrogative
o u: Indefinitive
o y: Demonstrative
[7] * stands for any character. This is done in order to cover codes such as n1e as well.
[8] For the n6* and n7* LES (first and second class adjectives), the gender is obviously absent. The morphological values to be applied are not filtered through this information. Thus, they are all applied to the output analysis. For instance, the LE audax receives the automatic values singular nominative, vocative (masculine, feminine and neuter) and accusative (only neuter).
[10] See our first Periodic Progress
Report for the details.
[11] See our previous Periodic Progress Reports for the coding details.
[12] Special rules will be written about the Pluralia Tantum.
[13] To help in understanding the technical terms and acronyms used in LEMLAT (and, thus, in this deliverable), we give below a short table, where each term is briefly explained. For more details, see our previous reports and deliverables:
o LES: the invariable part of the inflected forms (for instance, am in am-at-us);
o SM: the middle part of the inflected forms (for instance, at in am-at-us);
o SF: the final part of the inflected forms (for instance, us in am-at-us);
o COD LES: the code assigned to each LES; each COD LES refers to a particular type of inflexion (for instance, n1e means first declension nouns with exceptional inflexion);
o COD LEM: the code assigned to each output lemma; each COD LEM refers to a general type of inflexion (for instance, n1 refers to all the first declension nouns, including the ones coded with COD LES n1e);
o FE: exceptional wordform;
o LE: exceptional lemma.
[14] A form is recognized without segmentation when:
1. In the field Lemma there is a whole LE,
2. In the field Lemma there is a SF (-as),
3. In the field Lemma there is a =.
[15] For the detailed lists of the Type coding, see our Periodic Progress Report of September 2003 (section 1.1.7).
[16] To help in understanding the technical terms and acronyms used in LEMLAT (and, thus, in this deliverable), we give below a short table, where each term is briefly explained. For more details, see our previous reports and deliverables:
o LES: the invariable part of the inflected forms (for instance, am in am-at-us);
o SM: the middle part of the inflected forms (for instance, at in am-at-us);
o SF: the final part of the inflected forms (for instance, us in am-at-us);
o COD LES: the code assigned to each LES; each COD LES refers to a particular type of inflexion (for instance, n1e means first declension nouns with exceptional inflexion);
o COD LEM: the code assigned to each output lemma; each COD LEM refers to a general type of inflexion (for instance, n1 refers to all the first declension nouns, including the ones coded with COD LES n1e);
o FE: exceptional wordform;
o LE: exceptional lemma.