
A query model based on normalized log-likelihood

A query is usually a brief, sometimes imprecise expression of an underlying information need. Examining how queries can be transformed into equivalent, potentially better queries is a recurring theme in the information retrieval community. Such transformations include expanding short queries into longer ones, paraphrasing queries using an alternative vocabulary, mapping unstructured queries to structured ones, and identifying key concepts in verbose queries.

To inform the transformation process, multiple types of information sources have been considered. A recent one is search engine logs, used for query substitutions. Another recent example is letting users complement their traditional keyword query with additional information, such as example documents, tags, images, categories, or their search history. The ultimate source of information for transforming a query, however, is the user, through relevance feedback: given a query and a set of judged documents for that query, how does a system take advantage of the judgments to transform the original query and retrieve more documents that will be useful to the user? As demonstrated by the recent launch of a dedicated relevance feedback track at TREC, we still lack a definitive answer to this question.

Let’s consider an example to see what aspects play a role in transforming a query based on judgments for a set of initially retrieved documents. Suppose we have a set of documents that are judged to be relevant to a query. These documents may vary in length and need not be completely on topic, since they may discuss more topics than the ones relevant to the query. While the users’ judgments are at the document level, not all sections of the documents can be assumed to be equally relevant. Most currently available relevance feedback models do not capture this phenomenon; instead, they attempt to transform the original query based on the full content of the documents. Clearly this is not ideal, and we would like to account for the possibly multi-faceted character of documents. We hypothesize that a relevance feedback model that captures the topical structure of individual judged documents (“For each judged document, what is important about it?”) as well as of the set of all judged documents (“Which topics are shared by the entire set of judged documents?”) will outperform relevance feedback models that capture only one of these types of information.

We work in a language modeling (LM) setting, and our aim in this paper is to present an LM-based relevance feedback model that uses both types of information—about the topical relevance of a document and about the general topic of the set of relevant documents—to transform the original query. The proposed model uses the whole set of relevance assessments to determine how much each document that has been judged relevant should contribute to the query transformation. We use the TREC relevance feedback track test collection to evaluate our model and compare it to other, established relevance feedback methods. We answer the following two research questions in this paper. (i) Can we develop a relevance feedback model that uses evidence from both the individual relevant documents and the set of relevant documents as a whole? (ii) Can our new model achieve state-of-the-art results and how do these results compare to related models? In our evaluation, the proposed model significantly improves over state-of-the-art feedback methods and achieves superior performance over all evaluated models.
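
To make the idea concrete, here is a minimal sketch in Python of one way such a feedback query model could be estimated: each judged-relevant document contributes its smoothed term distribution, weighted by how well that document's language model agrees with a language model built from the entire set of relevant documents (a normalized log-likelihood style score). The smoothing method, the exponential weighting, and all function names are illustrative assumptions, not the paper's exact formulation.

    import math
    from collections import Counter

    def smoothed_lm(tokens, background, lam=0.5):
        # Jelinek-Mercer smoothed unigram language model over a token list.
        tf = Counter(tokens)
        n = sum(tf.values()) or 1
        vocab = set(tf) | set(background)
        return {t: lam * tf[t] / n + (1 - lam) * background.get(t, 1e-9)
                for t in vocab}

    def avg_log_likelihood(reference, doc_model):
        # Log-likelihood of the reference (set) model's terms under the
        # document model, used here as a document weight.
        return sum(p * math.log(doc_model.get(t, 1e-9))
                   for t, p in reference.items())

    def feedback_query_model(relevant_docs, background, top_k=20):
        # relevant_docs: token lists of the judged-relevant documents.
        # Documents whose language model sits close to the model of the
        # whole relevant set get a larger say in the expanded query model.
        set_model = smoothed_lm([t for d in relevant_docs for t in d], background)
        doc_models = [smoothed_lm(d, background) for d in relevant_docs]
        scores = [avg_log_likelihood(set_model, dm) for dm in doc_models]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]  # stabilized soft weighting
        z = sum(weights)
        expanded = Counter()
        for w, dm in zip(weights, doc_models):
            for term, p in dm.items():
                expanded[term] += (w / z) * p
        return dict(expanded.most_common(top_k))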

  • [PDF] E. Meij, W. Weerkamp, and M. de Rijke, “A query model based on normalized log-likelihood,” in Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009.
    [Bibtex]
    @inproceedings{CIKM:2009:Meij,
    Author = {Meij, Edgar and Weerkamp, Wouter and de Rijke, Maarten},
    Booktitle = {Proceedings of the 18th ACM conference on Information and knowledge management},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:42:51 +0000},
    Series = {CIKM 2009},
    Title = {A query model based on normalized log-likelihood},
    Year = {2009},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1645953.1646261}}
Histogram indicating the number of documents vs the number of keyphrases

A Comparative Study of Features for Keyphrase Extraction

Keyphrases are short phrases that reflect the main topic of a document. Because manually annotating documents with keyphrases is a time-consuming process, several automatic approaches have been developed. Typically, candidate phrases are extracted using features such as position or frequency in the document text. Many different features have been suggested, and have been used individually or in combination. However, it is not clear which of these features are most informative for this task.

We address this issue in the context of keyphrase extraction from scientific literature. We introduce a new corpus that consists of full-text journal articles and is substantially larger than the data sets used in previous work. In addition, the rich collection and document structure available at the publishing stage is explicitly annotated. We suggest new features based on this structure and compare them to existing features, analyzing how the different features capture different aspects of the keyphrase extraction task.
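
For readers who want a feel for what these features look like in practice, the sketch below (in Python, with assumed section names and an assumed feature set, not the paper's actual list) computes a few classic term-based features next to simple structure-based ones.

    def candidate_features(doc_text, sections, candidates):
        # Toy feature extractor for keyphrase candidates. `sections` maps a
        # structural element (e.g. "title", "abstract", "headings") to its
        # content; the feature set below is illustrative only.
        text = doc_text.lower()
        length = max(len(text), 1)
        features = {}
        for cand in candidates:
            c = cand.lower()
            first = text.find(c)
            features[cand] = {
                "tf": text.count(c),                                 # frequency in body text
                "first_pos": first / length if first >= 0 else 1.0,  # relative first occurrence
                "in_title": c in sections.get("title", "").lower(),
                "in_abstract": c in sections.get("abstract", "").lower(),
                "in_headings": any(c in h.lower()
                                   for h in sections.get("headings", [])),
                "num_tokens": len(c.split()),                        # phrase length
            }
        return features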

  • [PDF] K. Hofmann, M. Tsagkias, E. Meij, and M. de Rijke, “The impact of document structure on keyphrase extraction,” in Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009.
    [Bibtex]
    @inproceedings{CIKM:2009:hofmann,
    Author = {Hofmann, Katja and Tsagkias, Manos and Meij, Edgar and de Rijke, Maarten},
    Booktitle = {Proceedings of the 18th ACM conference on Information and knowledge management},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:42:45 +0000},
    Series = {CIKM 2009},
    Title = {The impact of document structure on keyphrase extraction},
    Year = {2009},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1645953.1646215}}

Learning Semantic Query Suggestions

An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.
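
As a rough illustration of the feature-based, supervised setup: the sketch below scores (query, candidate concept) pairs and trains a classifier on judged examples. The concept fields, the feature names, and the use of scikit-learn's RandomForestClassifier are assumptions made for this sketch, not the paper's actual feature set or learner.

    from sklearn.ensemble import RandomForestClassifier

    def concept_features(query, concept):
        # Illustrative features for a (query, candidate concept) pair;
        # `concept` is assumed to be a dict with a "label" and an optional
        # "inlinks" popularity count.
        q_terms = set(query.lower().split())
        c_terms = set(concept["label"].lower().split())
        overlap = len(q_terms & c_terms)
        return [
            overlap,                                         # shared terms
            overlap / max(len(c_terms), 1),                  # fraction of label matched
            int(concept["label"].lower() == query.lower()),  # exact label match
            len(c_terms),                                    # label length
            concept.get("inlinks", 0),                       # popularity proxy
        ]

    def train_concept_linker(training_pairs):
        # training_pairs: (query, concept_dict, is_relevant) triples with
        # manually judged labels; the choice of learner is arbitrary here.
        X = [concept_features(q, c) for q, c, _ in training_pairs]
        y = [label for _, _, label in training_pairs]
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X, y)
        return clf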

  • [PDF] E. Meij, M. Bron, B. Huurnink, L. Hollink, and M. de Rijke, “Learning semantic query suggestions,” in Proceedings of the 8th International Conference on the Semantic Web, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:Meij,
    Author = {Meij, Edgar and Bron, Marc and Huurnink, Bouke and Hollink, Laura and de Rijke, Maarten},
    Booktitle = {Proceedings of the 8th International Conference on The Semantic Web},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:45:04 +0000},
    Series = {ISWC 2009},
    Title = {Learning Semantic Query Suggestions},
    Year = {2009}}
Distribution of structured data embedded in XHTML

Investigating the Semantic Gap through Query Log Analysis

Significant efforts have focused in the past years on bringing large amounts of metadata online, and the success of these efforts can be seen in the impressive number of web sites exposing data in RDFa or RDF/XML. However, little is known about the extent to which this data fits the needs of ordinary web users with everyday information needs. In this paper we study what we perceive as the semantic gap between the supply of data on the Semantic Web and the needs of web users as expressed in the queries submitted to a major Web search engine. We perform our analysis at both the instance level and the ontology level. First, we look at how much data is actually relevant to Web queries and what kind of data it is. Second, we provide a generic method to extract the attributes that Web users are searching for regarding particular classes of entities. This method allows us to contrast class definitions found in Semantic Web vocabularies with the attributes of objects that users are interested in. Our findings are crucial to measuring the potential of semantic search, but also speak to the state of the Semantic Web in general.
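
To illustrate the second step, here is a small Python sketch that collects the words users type around known entities and aggregates them per class; these aggregated context words act as the attributes that can then be contrasted with class definitions in Semantic Web vocabularies. Entity detection is naive substring matching and the data structures are assumptions, so this is a sketch of the idea rather than the method from the paper.

    from collections import Counter, defaultdict

    def class_attributes(queries, entity_to_class, top_k=10):
        # queries:         iterable of query strings from a log
        # entity_to_class: maps a known entity string to its class,
        #                  e.g. {"canon eos 450d": "digital camera"}
        per_class = defaultdict(Counter)
        for q in queries:
            q_low = q.lower()
            for entity, cls in entity_to_class.items():
                if entity in q_low:
                    context = q_low.replace(entity, " ").split()
                    per_class[cls].update(context)
        return {cls: [w for w, _ in counts.most_common(top_k)]
                for cls, counts in per_class.items()}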

  • [PDF] P. Mika, E. Meij, and H. Zaragoza, “Investigating the semantic gap through query log analysis,” in Proceedings of the 8th International Semantic Web Conference, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:mika,
    Author = {Peter Mika and Edgar Meij and Hugo Zaragoza},
    Booktitle = {Proceedings of the 8th International Semantic Web Conference},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:45:11 +0000},
    Series = {ISWC 2009},
    Title = {Investigating the Semantic Gap through Query Log Analysis},
    Year = {2009},
    Bdsk-Url-1 = {http://dblp.uni-trier.de/db/conf/semweb/iswc2009.html#MikaMZ09}}

A Semantic Perspective on Query Log Analysis

We present our views on the CLEF log file analysis task. We argue for a task definition that focuses on the semantic enrichment of query logs. In addition, we discuss how additional information about the context in which queries are being made could further our understanding of users’ information seeking and how to better facilitate this process.

  • [PDF] K. Hofmann, M. de Rijke, B. Huurnink, and E. Meij, “A semantic perspective on query log analysis,” in Working Notes for the CLEF 2009 Workshop, 2009.
    [Bibtex]
    @inproceedings{CLEF:2009:hofmann,
    Author = {Hofmann, Katja and de Rijke, Maarten and Huurnink, Bouke and Meij, Edgar},
    Booktitle = {Working Notes for the CLEF 2009 Workshop},
    Date-Added = {2011-10-17 09:46:16 +0200},
    Date-Modified = {2011-10-17 09:46:16 +0200},
    Title = {A Semantic Perspective on Query Log Analysis},
    Year = {2009}}
Type completion

An evaluation of entity and frequency based query completion methods

Since the days of Boolean search on library catalogues, users have reformulated their queries after inspecting initial search results. Traditional information retrieval studies this in frameworks such as query expansion, relevance feedback, and interactive retrieval. These methods mostly exploit document contents, because that is typically all the information available. The situation is very different for web search engines because of the large numbers of users whose queries are collected in query logs. Query logs reflect how large numbers of users express their queries and can be a rich source of information when optimizing search results or determining query suggestions.

In this paper we study a special case of query suggestion: query completion, which aims to help users complete the query they are typing. In particular, we are interested in comparing a commonly adopted frequency-based approach with methods that exploit an understanding of the type of entities in queries. Our intuition is that completion for rare queries can be improved by understanding the type of entity being sought. For example, if we know that “LX354” is a kind of digital camera, we can generate sensible completions by choosing them from the set of completions used with other digital cameras. Besides suggesting queries, the obtained completions can also function as facets for faceted browsing or as input for ontology engineering, since they represent query refinements common to a class of entities. In this paper, we address the following questions: (i) How can we recognize entities and their types in queries? (ii) How can we rank possible completions given an entity type? (iii) How can our methods be evaluated and how do they perform? To address (iii), we propose a novel evaluation method based on predicting real web queries. We show that a purely frequency-based approach without any entity type information works quite well for more frequent queries, but is surpassed by type-based methods for rare queries.
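
The contrast between the two families of methods can be made concrete with a small Python sketch: a frequency-based completer simply ranks observed queries that share the typed prefix, while a type-based completer pools the suffixes seen after any entity of the same type. Names, signatures, and the plain counting used here are illustrative assumptions, not the ranking models evaluated in the paper.

    from collections import Counter

    def frequency_completions(prefix, query_log, k=5):
        # Baseline: rank full queries extending `prefix` by log frequency.
        counts = Counter(q for q in query_log if q.startswith(prefix))
        return [q for q, _ in counts.most_common(k)]

    def type_based_completions(entity, entity_type, query_log, entity_to_type, k=5):
        # Pool the suffixes that follow *any* entity of the same type, so a
        # rare entity (say, a new camera model) inherits completions
        # observed for other entities of its class.
        suffix_counts = Counter()
        for q in query_log:
            for other, t in entity_to_type.items():
                if t == entity_type and q.startswith(other + " "):
                    suffix_counts[q[len(other) + 1:]] += 1
        return [entity + " " + s for s, _ in suffix_counts.most_common(k)]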

  • [PDF] E. Meij, P. Mika, and H. Zaragoza, “An evaluation of entity and frequency based query completion methods,” in Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009.
    [Bibtex]
    @inproceedings{SIGIR:2009:meij,
    Author = {Meij, Edgar and Mika, Peter and Zaragoza, Hugo},
    Booktitle = {Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:43:25 +0000},
    Series = {SIGIR 2009},
    Title = {An evaluation of entity and frequency based query completion methods},
    Year = {2009},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1571941.1572074}}

Investigating the Demand Side of Semantic Search through Query Log Analysis

Semantic search is, by its broadest definition, a collection of approaches that aim at matching the Web’s content with the information need of Web users at a semantic level. Most of the work in this area has focused on the supply side of semantic search, in particular elevating Web content to the semantic level by relying on methods of information extraction or working with explicit metadata embedded inside or linked to Web resources. With respect to explicit metadata, several studies have been done on the adoption of Semantic Web formats in the wild, mostly based on statistics from the crawls of Semantic Web search engines. Much less effort has focused on the demand side of semantic search, i.e., interpreting queries at the semantic level and studying information needs at this level. Consequently, little is known as to how much the supply of metadata actually matches the demand for information on the Web.

In this paper, we address the problem of studying the information need of Web searchers at an ontological level, i.e., in terms of the particular attributes of objects they are interested in. We describe a set of methods for extracting the context words associated with certain classes of objects from a Web search query log. We do so based on the idea that common context words reflect aspects of the objects users are interested in. We implement these methods in an interactive tool called the Semantic Search Assist. The original purpose of this tool was to generate type-based query suggestions when there is not enough statistical evidence for entity-based query suggestions. However, from an ontology engineering perspective, this tool answers the question of what attributes a class of objects would have if its ontology were engineered purely based on the information needs of end users. As such, it allows us to reflect on the gap between the properties defined in Semantic Web ontologies and the attributes of objects that people are searching for on the Web. We evaluate our tool by measuring its predictive power on the query log itself. We leave the study of the gap between particular information needs and Semantic Web data for future work.
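
The evaluation mentioned above, measuring the tool's predictive power on the query log itself, could look roughly like the held-out recall computation below. The metric, the signature of the suggester, and the data layout are all assumptions made for illustration, not the protocol used in the paper.

    def predictive_power(entity_refinements, suggest, k=5):
        # entity_refinements: maps an entity to the set of refinement strings
        #                     observed for it in the log
        # suggest:            callable taking an entity and returning suggested
        #                     refinements built *without* that entity's own queries
        recalls = []
        for entity, observed in entity_refinements.items():
            if not observed:
                continue
            predicted = set(suggest(entity)[:k])
            recalls.append(len(predicted & observed) / len(observed))
        return sum(recalls) / len(recalls) if recalls else 0.0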

  • [PDF] E. Meij, P. Mika, and H. Zaragoza, “Investigating the demand side of semantic search through query log analysis,” in Proceedings of the Workshop on Semantic Search (SemSearch 2009) at the 18th International World Wide Web Conference (WWW 2009), 2009.
    [Bibtex]
    @inproceedings{semsearch:2009:meij,
    Author = {Meij, Edgar and Mika, Peter and Zaragoza, Hugo},
    Booktitle = {Proceedings of the Workshop on Semantic Search (SemSearch 2009) at the 18th International World Wide Web Conference (WWW 2009)},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:43:47 +0000},
    Title = {Investigating the Demand Side of Semantic Search through Query Log Analysis},
    Year = {2009}}
Annals of Information Systems

Semantic disclosure in an e-Science environment

The Virtual Laboratory for e-Science (VL-e) project serves as a backdrop for the ideas described in this chapter. VL-e is a project with academic and industrial partners in which e-science has been applied to several domains of scientific research. Adaptive Information Disclosure (AID), a subprogram within VL-e, is a multi-disciplinary group that concentrates expertise in information extraction, machine learning, and the Semantic Web – a powerful combination of technologies that can be used to extract and store knowledge in a Semantic Web framework. In this chapter, the authors explain what “semantic disclosure” means and how it is essential to knowledge sharing in e-Science. The authors describe several Semantic Web applications and how they were built using components of the AIDA Toolkit (AID Application Toolkit). The lessons learned and the future of e-Science are also discussed.

  • [PDF] M. S. Marshall, M. Roos, E. Meij, S. Katrenko, W. R. van Hage, and P. W. Adriaans, “Semantic disclosure in an e-Science environment,” in Semantic e-Science (Springer Annals of Information Systems, AoIS), 2009.
    [Bibtex]
    @inproceedings{AIS:2009:marshall,
    Author = {Marshall, M.S. and Roos, M. and Meij, E. and Katrenko, S. and van Hage, W.R. and Adriaans, P.W.},
    Booktitle = {Semantic e-Science (Springer Annals of Information Systems AoIS)},
    Date-Added = {2011-10-16 15:03:17 +0200},
    Date-Modified = {2012-10-28 17:21:26 +0000},
    Publisher = {Springer},
    Series = {Annals of Information Systems},
    Title = {Semantic disclosure in an e-Science environment},
    Volume = {11},
    Year = {2009}}
INEX

A Generative Language Modeling Approach for Ranking Entities

We describe our participation in the INEX 2008 Entity Ranking track. We develop a generative language modeling approach for the entity ranking and list completion tasks. Our framework comprises the following components: (i) entity and (ii) query language models, (iii) an entity prior, (iv) the probability of an entity for a given category, and (v) the probability of an entity given another entity. We explore various ways of estimating these components and report on our results. We find that improving the estimation of these components has very positive effects on performance, yet there is room for further improvement.
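
A rough sketch of how the five components could be combined into a single ranking score is given below (in Python, in log space, using a plain product of the components). How each component is estimated and combined here, and the field names on the entity records, are assumptions for illustration rather than the exact model from the paper.

    import math

    def rank_entities(query_terms, entities, target_categories=None, example_entities=None):
        # Each entity record is assumed to provide:
        #   "lm"       : dict term -> P(term | entity language model)
        #   "prior"    : P(entity)
        #   "p_cat"    : dict category -> P(entity | category)
        #   "p_entity" : dict other entity -> P(entity | other entity)
        eps = 1e-12
        scores = {}
        for name, e in entities.items():
            score = math.log(e.get("prior", eps) + eps)            # (iii) entity prior
            for t in query_terms:                                  # (i)+(ii) query likelihood
                score += math.log(e["lm"].get(t, eps))
            for c in (target_categories or []):                    # (iv) category evidence
                score += math.log(e["p_cat"].get(c, eps))
            for ex in (example_entities or []):                    # (v) list-completion evidence
                score += math.log(e["p_entity"].get(ex, eps))
            scores[name] = score
        return sorted(scores, key=scores.get, reverse=True)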

  • [PDF] W. Weerkamp, K. Balog, and E. Meij, “A generative language modeling approach for ranking entities,” in Advances in Focused Retrieval, 2009.
    [Bibtex]
    @inproceedings{INEX:2008:weerkamp,
    Author = {Weerkamp, W. and Balog, K. and Meij, E.},
    Booktitle = {Advances in Focused Retrieval},
    Date-Added = {2011-10-16 12:29:08 +0200},
    Date-Modified = {2011-10-16 12:29:08 +0200},
    Organization = {Springer},
    Publisher = {Springer},
    Title = {A Generative Language Modeling Approach for Ranking Entities},
    Year = {2009}}