Syntactical Analysis of Georgian Texts

Jemali Antidze, Nana Gulua, David Mishelashvili

I. Vekua Scientific Institute of Applied Mathematics, Tbilisi State University

2 University, Tbilisi, 380043, Georgia

Phone: 99532 305079

E-mail: antidze@viam.hepi.edu.ge

Abstract

The article describes an algorithm for syntactical analysis of Georgian texts and its implementation.

The first version was published in [1], and a detailed description was presented in the dissertation

of N. Gulua [2]. The new version is based on the PC-PATR formalism [3] and is implemented

on the Linux operating system, version 6.2. The linguistic approach builds on specific features of the

Georgian language, in particular the role of the verb form in the formation of a Georgian sentence.

Keywords: syntactical analysis, formal grammar, parse tree, feature structure, morphological

analysis.

The article describes an algorithm for syntactical analysis of Georgian texts and its

implementation. The first version was published in [1], and a detailed description was presented in the

dissertation of N. Gulua [2]. The new version is based on the PC-PATR formalism [3] and is

implemented on the Linux operating system, version 6.2. The linguistic approach builds on specific

features of the Georgian language, in particular the role of the verb form in the formation of a

Georgian sentence. Our approach facilitates recognition of the relations between the subject, objects,

and predicate of a sentence irrespective of their order. In addition, a Georgian verb has many forms.

This requires identifying the form from its root and affixes before syntactical analysis.

Therefore, in order to use the PC-PATR formalism, the feature structure of each word in a

sentence must be established in advance. This is not feasible without morphological analysis; otherwise,

we would need to include every word form in the dictionary, which increases the size of the

dictionary. We have therefore carried out the morphological analysis with a special approach, and the

result is represented in a form acceptable to PC-PATR. We established the feature structure for

each lexical item; these structures are used widely in composing constraints on the rules. The

constraints also ensure the semantic compatibility of the words of a sentence. The result of the

syntactical analysis is the parse tree of a sentence. At present, our morphological analysis does not

provide the composition of a word from its root and affixes, but in the future we plan to use

PC-KIMMO for this purpose. The program has been tested on scientific texts, and the experiments

continue with a view to further improvement.
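
The role of the verb form and of feature structures described above can be illustrated with a minimal Python sketch. It is not the PC-PATR grammar used in this work: the toy lexicon, the simplified transliterations, the feature names (cat, case, animate, subj, obj), and the functions unify and assign_roles are all assumptions made for illustration. Each verb form carries the case it expects of its subject and object, and roles are assigned by unification, so the result does not depend on word order.

```python
def unify(a, b):
    """Unify two (possibly nested) feature structures given as dicts.

    Returns the merged structure, or None if any feature values clash.
    """
    result = dict(a)
    for key, value in b.items():
        if key not in result:
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            sub = unify(result[key], value)
            if sub is None:
                return None
            result[key] = sub
        elif result[key] != value:
            return None
    return result


# Hypothetical toy lexicon. In a real system these feature structures
# would be produced by morphological analysis, not listed by hand.
LEXICON = {
    "kaci":   {"cat": "N", "case": "nom", "lex": "man", "animate": True},
    "kacma":  {"cat": "N", "case": "erg", "lex": "man", "animate": True},
    "cerili": {"cat": "N", "case": "nom", "lex": "letter", "animate": False},
    "cerils": {"cat": "N", "case": "dat", "lex": "letter", "animate": False},
    # Simplified: the present form expects a nominative subject and a dative object.
    "cers":   {"cat": "V", "lex": "write",
               "subj": {"case": "nom", "animate": True},
               "obj": {"case": "dat"}},
    # Simplified: the aorist form expects an ergative subject and a nominative object.
    "dacera": {"cat": "V", "lex": "write",
               "subj": {"case": "erg", "animate": True},
               "obj": {"case": "nom"}},
}


def assign_roles(words):
    """Return predicate, subject and object of a simple sentence, or None.

    The verb form's constraints are unified with every noun; a role is
    assigned only if exactly one noun satisfies the constraint.
    """
    entries = [LEXICON[w] for w in words]
    verbs = [e for e in entries if e["cat"] == "V"]
    nouns = [e for e in entries if e["cat"] == "N"]
    if len(verbs) != 1:
        return None
    verb = verbs[0]
    roles = {"pred": verb["lex"]}
    for role in ("subj", "obj"):
        matches = [n for n in nouns if unify(verb[role], n) is not None]
        if len(matches) != 1:
            return None
        roles[role] = matches[0]["lex"]
    return roles
```

Under these assumptions, assign_roles(["kaci", "cers", "cerils"]) and assign_roles(["cerils", "kaci", "cers"]) yield the same roles, while a noun in the wrong case for the given verb form makes the analysis fail. The animate feature stands in for the kind of semantic compatibility constraint mentioned above.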

References

[1] J. Antidze, N. Gulua. On Selection of Georgian Text Analysis Formalism, Bulletin of the

Georgian Academy of Sciences, 162, 2, 2000.

[2] N. Gulua. Formalized Description of Georgian Texts, its Software and its Application to the

Construction of a Teaching System, PhD Dissertation, Tbilisi, 1999.

[3] S. McConnel. PC-PATR Reference Manual, version 1.2.2, 2000.