ITRI-03-02

Roger Evans, Carole Tiberius, Dunstan Brown, Greville Corbett

A large-scale inheritance-based morphological lexicon for Russian

Also published in Proceedings of the EACL'03 Workshop on Morphological Processing of Slavic Languages, pp. 9-16

In this paper we describe the mapping of Zaliznjak's (1977) morphological classes into the lexical representation language DATR (Evans and Gazdar 1996). From the resulting DATR theory and the electronic version of Zaliznjak's dictionary (Ilola and Mustajoki 1989), a set of fully inflected forms, together with their associated morphosyntax, can be generated automatically. From these data we plan to develop a wide-coverage morphosyntactic lemmatizer and tagger for Russian.
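
To illustrate the general idea of an inheritance-based lexicon, the following is a minimal Python sketch of default inheritance over noun endings. It is not the DATR theory described in this paper and does not reproduce Zaliznjak's class codes: the node names (NOUN, CLASS_I, CLASS_II), the feature labels, and the restriction to ten paradigm cells are assumptions made for the example. Forms are given in scholarly transliteration, and stress and stem alternations are ignored.

```python
from typing import Dict, Optional

# Hypothetical feature labels; a real lexicon would cover the full paradigm.
FEATURES = ["nom sg", "gen sg", "dat sg", "ins sg", "loc sg",
            "nom pl", "gen pl", "dat pl", "ins pl", "loc pl"]


class Node:
    """A lexical node that inherits endings from its parent by default."""

    def __init__(self, name: str, parent: Optional["Node"] = None,
                 endings: Optional[Dict[str, str]] = None) -> None:
        self.name = name
        self.parent = parent
        self.endings = endings or {}

    def ending(self, feature: str) -> str:
        # Walk up the hierarchy until some node states the ending.
        node = self
        while node is not None:
            if feature in node.endings:
                return node.endings[feature]
            node = node.parent
        raise KeyError(f"{self.name}: no ending for {feature!r}")


# Endings shared by both illustrative classes are stated once, at the root.
NOUN = Node("NOUN", endings={"loc sg": "e", "nom pl": "y",
                             "dat pl": "am", "ins pl": "ami", "loc pl": "ax"})

# Illustrative class of the 'zakon' (law) type; only its own endings are listed.
CLASS_I = Node("CLASS_I", NOUN, {"nom sg": "", "gen sg": "a", "dat sg": "u",
                                 "ins sg": "om", "gen pl": "ov"})

# Illustrative class of the 'komnata' (room) type.
CLASS_II = Node("CLASS_II", NOUN, {"nom sg": "a", "gen sg": "y", "dat sg": "e",
                                   "ins sg": "oj", "gen pl": ""})


def inflect(stem: str, cls: Node) -> Dict[str, str]:
    """Generate every form in FEATURES for a stem assigned to a class node."""
    return {feature: stem + cls.ending(feature) for feature in FEATURES}


if __name__ == "__main__":
    for stem, cls in [("zakon", CLASS_I), ("komnat", CLASS_II)]:
        for feature, form in inflect(stem, cls).items():
            print(f"{stem:8} {feature:7} {form}")
```

In this sketch each class node lists only the endings that are specific to it, and the plural oblique endings are stated once at the root. Stating shared information once and overriding it only where necessary is the kind of economy that a default-inheritance formalism such as DATR is designed to express.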