Information Retrieval Methods for
XML Documents
Norbert Fuhr
Abstract
Information Retrieval Methods for XML Documents: Introduction
XML is becoming established as a standard document format,
especially for Web-based applications. The major purpose of XML markup is
the explicit representation of the logical structure of a document. Given
this markup, different kinds of operations referring to the logical
structure can be performed on XML documents: multiple views on a document
can be generated, specific elements of an XML document can be extracted,
or documents fulfilling specific structural conditions can be retrieved
from a document base. Overall, if information is represented in XML
format, exchange of this information between different software systems
(especially on the Web) is simplified, thus supporting interoperability.
Looking at the broad variety of XML applications and systems that are
currently under development, one can see that there are in fact two
different views on XML: The document-centric view focuses on structured
documents in the traditional sense (based on concepts from electronic
publishing, especially SGML). The data-centric view uses XML for
exchanging formatted data (e.g. spreadsheets, database records) in a
generic, serialized form between different applications.
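
The three kinds of structure-based operations named in the abstract (generating a view, extracting specific elements, retrieving documents that fulfil a structural condition) can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree module; the two sample documents, the element names (article, title, section, heading, p) and the function names are assumptions made up for this illustration and do not come from the tutorial itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-document collection; element names are illustrative only.
DOCS = {
    "d1": "<article><title>XML retrieval</title>"
          "<section><heading>Intro</heading><p>Structure matters.</p></section>"
          "</article>",
    "d2": "<article><title>Plain notes</title><p>No sections here.</p></article>",
}


def extract_elements(xml_text, path):
    """Return the text of every element matching an ElementTree path."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(path)]


def docs_with_structure(docs, path):
    """Return the ids of documents that contain an element matching path."""
    return [doc_id for doc_id, text in docs.items()
            if ET.fromstring(text).find(path) is not None]


def outline_view(xml_text):
    """Build an alternative view of a document: title plus section headings."""
    root = ET.fromstring(xml_text)
    return [root.findtext("title")] + [h.text for h in root.iter("heading")]


if __name__ == "__main__":
    print(extract_elements(DOCS["d1"], "./section/heading"))
    print(docs_with_structure(DOCS, "./section"))
    print(outline_view(DOCS["d1"]))
```

This sketch only tests structural conditions as exact path matches. It does not rank results or weight terms, which is where information retrieval methods for XML go beyond plain path queries.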
Bio
Norbert Fuhr has been a professor in the CS department of the University
of Dortmund (Germany) since 1991. He received a master's degree (Diploma)
in technical computer science in 1980 and a PhD (Dr.) in 1986 from the
Technical University of Darmstadt (Germany), where he also worked in the
CS department from 1980 to 1991.
His major research areas are information retrieval models and methods,
especially for the integration of information retrieval and database
systems, semistructured data, and multimedia. He is currently involved in
a number of national and international research projects that apply
these concepts to digital libraries and XML documents.
Norbert Fuhr
Informatik VI
University of Dortmund, 44221 Dortmund, Germany
Email: fuhr@cs.uni-dortmund.de
WWW: http://ls6-www.cs.uni-dortmund.de/ir/
Postal: August-Schmidt-Str. 12, 44227 Dortmund, Germany