Dependency Grammar Annotator (DGA) is a tool for the syntactic annotation of corpus texts within the formal framework of Dependency Grammars. It is designed to minimize the human effort required to build a corpus.
DGA is built around a graphical representation of the dependency relations, and the user acts directly on this representation throughout the annotation process. Besides being convenient, this improves annotation accuracy, since the user receives immediate graphical feedback on every change made to the syntactic structure. Operating on the structure is simple and intuitive: creating a dependency relation takes only two mouse clicks, while labeling a word with its part of speech or setting the type of a dependency relation takes one mouse click followed by the selection of a label from the appropriate list.
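The structure being edited can be pictured with a small Java sketch. It is an illustration only, not DGA's actual code: the class `DependencySketch`, its method names, and the example labels are all hypothetical, and the check that rejects cycles is an assumption based on the usual definition of a dependency tree rather than on documented DGA behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of one annotated sentence; these are not DGA's actual classes. */
final class DependencySketch {
    private final List<String> words;
    private final String[] pos;      // part-of-speech label per word, null until set
    private final int[] head;        // index of each word's head, -1 while unattached
    private final String[] relation; // type of the relation to the head, null until set

    DependencySketch(List<String> words) {
        this.words = new ArrayList<>(words);
        int n = this.words.size();
        pos = new String[n];
        head = new int[n];
        relation = new String[n];
        Arrays.fill(head, -1);
    }

    /** Attaches dependent to headIdx; returns false if that would create a cycle. */
    boolean link(int headIdx, int dependent) {
        // Assumed constraint: the dependent must not already be an ancestor of the head.
        for (int i = headIdx; i != -1; i = head[i]) {
            if (i == dependent) {
                return false;
            }
        }
        head[dependent] = headIdx;
        return true;
    }

    void setPos(int word, String label) { pos[word] = label; }

    void setRelation(int dependent, String label) { relation[dependent] = label; }

    int headOf(int word) { return head[word]; }

    /** Renders one word with its labels and head for a quick textual check. */
    String describe(int word) {
        String h = head[word] == -1 ? "(no head)" : words.get(head[word]);
        return words.get(word) + "/" + pos[word] + " <-" + relation[word] + "- " + h;
    }

    public static void main(String[] args) {
        DependencySketch s = new DependencySketch(Arrays.asList("the", "dog", "barks"));
        s.setPos(0, "DET");
        s.setPos(1, "N");
        s.setPos(2, "V");
        s.link(2, 1);
        s.setRelation(1, "subj");
        s.link(1, 0);
        s.setRelation(0, "det");
        for (int i = 0; i < 3; i++) {
            System.out.println(s.describe(i));
        }
    }
}
```

Each call to `link` corresponds to one relation the annotator would create with two clicks, and `setPos` and `setRelation` correspond to picking a label from a list. The source does not say which of the two words is clicked first, so the argument order here is arbitrary.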
Ease of use: because the user operates directly on the graphical representation, DGA is easy to use and annotation is considerably faster.
Portability: DGA is written in Java 2. As a pure Java application, it can run on practically any platform or operating system for which a Java 2 runtime environment (JRE) exists. Because it uses the pluggable look and feel technology, its interface behaves like that of a native application on the platform where it runs, so the user is already familiar with basic interface items such as menus, buttons and standard dialog boxes.
Conformity with up-to-date standards: DGA is designed according to the EAGLES recommendations concerning syntactic annotation. Annotated texts are saved in XML, the data description format adopted by the linguistic community as the standard way of representing corpora. A standard set of XML tags for syntactic annotation does not exist yet, unlike for morpho-syntactic annotation (XCES), so DGA uses a minimal set of tags inspired by XCES. The XML files produced by DGA can thus be easily transformed, by means of XSLT, into XML files based on a different vocabulary (tag set) that meets the user's requirements or conforms to a future standard.
Flexibility: apart from requiring that syntactic analyses take the form of dependency relations, DGA imposes no restrictions on the user, who can easily define, and modify at any time, the sets of parts of speech and dependency relations used in annotation.
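The XSLT conversion mentioned under conformity with standards could be run from Java roughly as sketched below, using the standard `javax.xml.transform` API. The input vocabulary (`s`, `w`, `pos`, `head`, `rel`) and the target vocabulary (`sentence`, `token`, `governor`, `function`) are invented for this example; DGA's real tag set is described in its documentation and is not reproduced here. The class name `XsltConversionSketch` is likewise hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch of an XSLT-based vocabulary conversion; all tag names are assumptions. */
final class XsltConversionSketch {
    // Invented input: each word carries its POS and, when attached, its head and relation.
    static final String INPUT =
        "<s><w id=\"1\" pos=\"N\" head=\"2\" rel=\"subj\">dog</w>"
      + "<w id=\"2\" pos=\"V\">barks</w></s>";

    // Stylesheet that renames elements and attributes into a different (also invented) tag set.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"s\">"
      + "<sentence><xsl:apply-templates select=\"w\"/></sentence>"
      + "</xsl:template>"
      + "<xsl:template match=\"w\">"
      + "<token n=\"{@id}\" tag=\"{@pos}\">"
      + "<xsl:if test=\"@head\">"
      + "<xsl:attribute name=\"governor\"><xsl:value-of select=\"@head\"/></xsl:attribute>"
      + "<xsl:attribute name=\"function\"><xsl:value-of select=\"@rel\"/></xsl:attribute>"
      + "</xsl:if>"
      + "<xsl:value-of select=\".\"/>"
      + "</token>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML text and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers need no checked-exception handling in this sketch.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(INPUT, STYLESHEET));
    }
}
```

Because the mapping lives entirely in the stylesheet, targeting another tag set means writing another stylesheet; the Java side stays the same.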
For more details, see the online documentation.
You can ask any question concerning DGA here.
If you have a Java 2 enabled browser, try the online demo.
You can also download the full application.
Here are some Romanian texts annotated with this tool.