
3. TYPES OF ELECTRONIC DICTIONARIES


  • 3.1. Dictionary of simple words
  • 3.2. Phonological dictionaries
  • 3.3. Dictionary of compound terms
  • 3.4. Meaning
  • 3.5. Semantic markers

    Dictionaries and grammars are recognized as crucial components of most Natural Language Processing (NLP) applications. Numerous prototypes of language analyzers and generators have been built, but practically none of them incorporates a full-scale dictionary or grammar. This situation has been dubbed processing with "toy dictionaries" and "toy grammars".

    Defining a full-scale dictionary is a problem in itself. The question must be addressed in several steps, and it constitutes the core of the project.

    3.1. Dictionary of simple words

     

    The first step is the level of graphically simple words, namely words as they appear as entries in commercial dictionaries. To match a dictionary of canonical entries with words as they are found in texts, the entries must be inflected. The general inflection scheme consists of appending an inflection code to each canonical entry so that all of its inflected forms can be generated. This approach seems straightforward, and existing material such as conjugation dictionaries built for pedagogical purposes appears to prepare the ground for it. However, few such dictionaries exist today, either in academic or in industrial environments. Various questions, both practical and theoretical, must be solved before a dictionary reaches an operational level of coverage. The members of the RELEX group have each built such a dictionary (DELA) for their own language (cf. annex 1). These dictionaries are still to be completed with many derivatives and technical words. This is the subject of task T1.
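
    The inflection scheme described above can be illustrated with a minimal sketch. It assumes a simple rule format (strip some trailing characters, then append a suffix). The code names (N1, N2, V1), the feature tags, and the function names are hypothetical and invented for this example; they are not the actual DELA codes or format.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    strip: int     # number of trailing characters removed from the canonical entry
    suffix: str    # characters appended after stripping
    features: str  # grammatical tag attached to the generated form


# Hypothetical inflection codes (assumption: not the real DELA inventory).
INFLECTION_CODES = {
    "N1": [Rule(0, "", "s"), Rule(0, "s", "p")],    # cat -> cat, cats
    "N2": [Rule(0, "", "s"), Rule(1, "ies", "p")],  # city -> city, cities
    "V1": [
        Rule(0, "", "inf"),
        Rule(0, "s", "3s"),
        Rule(0, "ed", "past"),
        Rule(0, "ing", "ger"),
    ],                                              # walk -> walk, walks, walked, walking
}


def inflect(lemma, code):
    """Generate every (form, features) pair for a canonical entry and its code."""
    forms = []
    for rule in INFLECTION_CODES[code]:
        stem = lemma[: len(lemma) - rule.strip]
        forms.append((stem + rule.suffix, rule.features))
    return forms


def build_form_index(entries):
    """Map each inflected form back to the canonical entries that produce it."""
    index = {}
    for lemma, code in entries:
        for form, features in inflect(lemma, code):
            index.setdefault(form, []).append((lemma, code, features))
    return index


if __name__ == "__main__":
    index = build_form_index([("cat", "N1"), ("city", "N2"), ("walk", "V1")])
    for form in ("cats", "cities", "walked"):
        print(form, index[form])
```

    The index built at the end is what makes matching against text possible: a word found in a text is looked up as an inflected form and resolved to its canonical entry. A form can map to several entries, which is why each value is a list.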