Ontology-based Information-Filtering and Retrieval

Dr. Dominik Kuropka
Hasso-Plattner-Inst. für Softwaresystemtechnik
Prof.-Dr.-Helmertstr. 2-3, 14482 Potsdam
Phone: (0331) 55 09-193
Fax: (0331) 55 09-189
http://bpt.hpi.uni-potsdam.de

 

Motivation

Cheap mass storage and the increasing interconnectivity of computers have led to a rapid increase in the number of available documents. This has raised the flow of information in business, science and administration to a point where it exceeds human processing capacity. To cope with this problem, automated systems for Information Filtering (IF) and Information Retrieval (IR) are needed. This tutorial gives a short overview of linguistics, followed by a classification and theoretical evaluation of popular IF&IR models, which motivates the need for ontology-based IF&IR models. The main part deals with two ontology-based IF&IR models: the Topic-based Vector Space Model (TVSM) and the Enhanced TVSM (eTVSM). Finally, some implementation aspects and practical issues of these models, as well as quantitative evaluation methods for IF&IR systems in general, will be addressed.

Outline

1. Introduction to the issue of Information Filtering (IF) and Information Retrieval (IR)
2. Basic definitions
2.1 Architecture of IF&IR systems
2.2 Basics of computational linguistics
2.3 Ontologies
3. Classification of popular IF&IR models and theory-based evaluation
3.1 Models without term interdependencies
3.2 Models with immanent term interdependencies
3.3 Models with transcendent term interdependencies
4. Topic-based Vector Space Model (TVSM)
4.1 Concept
4.2 Stopword, stemming and synonym lemmas
4.3 Comparison with other models and critique
5. Enhanced TVSM (eTVSM)
5.1 Concept
5.2 Connection to Ontologies
5.3 Implementation using relational databases
5.4 Comparison with other models and critique
6. Practical usage of the eTVSM
6.1 Ontology creation and reuse of available ontologies
6.2 Application to IF and IR
6.3 Quantitative evaluation of IF and IR systems


 

German Informatics Society   Naukowe Towarzystwo Informatyki Ekonomicznej
Gazeta IT