Leveraging the consumption of linguistic open data for educational purposes / Antonio Jesus Roa Valverde
Author: Roa-Valverde, Antonio Jesus
Thesis advisor: Fensel, Dieter ; Sicilia, Miguel-Angel
Description: XIII, 165 pp. : ill., diagrams
Institutional Note: Innsbruck, Univ., Diss., 2015
Date of Submission: January 2015
Document type: Dissertation (PhD)
Keywords (DE): linguistic data / ontologies and vocabularies / data ranking / foreign second language learning
Keywords (GND): Sprachdaten / Ranking
The document is only available locally.
The Web is turning into a platform where data is a first-class citizen. Collaborative initiatives like Linked Open Data (LOD) rely on accepted standards and provide a set of best practices to promote the sharing and consumption of data at large scale. A growing portion of this data is linguistic information, which describes lexicons and their usage. An important resource in this field is Wiktionary, which can be seen as the leading source of lexical data today. Wiktionary is an online collaborative project, based on the principle of the "Wisdom of the Crowd", that aims to build an open multilingual dictionary available to everybody. Since its inception in 2002, Wiktionary has grown considerably and has therefore caught the attention of many researchers. Several studies have compared the usability of Wiktionary with that of traditional expert-edited lexicographical resources. Others have reused the data provided by Wiktionary for specific information retrieval and natural language processing tasks. Further approaches have focused on converting Wiktionary to make it compatible with the LOD principles and then aligning its content with other available resources.

In this thesis we exploit the practical dimension of Wiktionary: we focus on the educational context and use Wiktionary as a resource to help users learn a foreign second language (FSL). To this end we target the multilingual dimension of Wiktionary. We perform a quantitative analysis of the existing translations in order to measure their reliability and guarantee a minimum level of quality when they are used. We base our analysis on ranking approaches, which have been shown to provide successful results in information retrieval scenarios.

Part of our research focuses on the study of existing formats and mechanisms for exchanging linguistic data. We use the expertise gained to design an ontological model that addresses the interoperability issues associated with our scenario. We use this model to share the generated data as part of the public open data cloud.

Continuing with our educational use case, we show how our research can be put into practice in a mobile application designed specifically to support users in acquiring new FSL vocabulary.

