
Learning Semantic Query Suggestions

An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed, and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.

  • [PDF] E. Meij, M. Bron, B. Huurnink, L. Hollink, and M. de Rijke, “Learning semantic query suggestions,” in Proceedings of the 8th international conference on the semantic web, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:Meij,
Author = {Meij, Edgar and Bron, Marc and Huurnink, Bouke and Hollink, Laura and de Rijke, Maarten},
    Booktitle = {Proceedings of the 8th International Conference on The Semantic Web},
    Series = {ISWC 2009},
    Title = {Learning Semantic Query Suggestions},
    Year = {2009}}

Investigating the Semantic Gap through Query Log Analysis

In the past years, significant efforts have focused on bringing large amounts of metadata online, and the success of these efforts can be seen in the impressive number of web sites exposing data in RDFa or RDF/XML. However, little is known about the extent to which this data fits the needs of ordinary web users with everyday information needs. In this paper we study what we perceive as the semantic gap between the supply of data on the Semantic Web and the needs of web users as expressed in the queries submitted to a major Web search engine. We perform our analysis on both the level of instances and the level of ontologies. First, we look at how much data is actually relevant to Web queries and what kind of data it is. Second, we provide a generic method to extract the attributes that Web users are searching for regarding particular classes of entities. This method allows us to contrast class definitions found in Semantic Web vocabularies with the attributes of objects that users are interested in. Our findings are crucial to measuring the potential of semantic search, but also speak to the state of the Semantic Web in general.

  • [PDF] P. Mika, E. Meij, and H. Zaragoza, “Investigating the semantic gap through query log analysis,” in Proceedings of the 8th international semantic web conference, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:mika,
    Author = {Peter Mika and Edgar Meij and Hugo Zaragoza},
    Booktitle = {Proceedings of the 8th International Semantic Web Conference},
    Series = {ISWC 2009},
Title = {Investigating the Semantic Gap through Query Log Analysis},
    Year = {2009},
    Bdsk-Url-1 = {http://dblp.uni-trier.de/db/conf/semweb/iswc2009.html#MikaMZ09}}

An evaluation of entity and frequency based query completion methods

From the days of Boolean search on library catalogues, users have reformulated their queries after an inspection of initial search results. Traditional information retrieval studies this in frameworks such as query expansion, relevance feedback, interactive retrieval, etc. These methods mostly exploit document contents because that is typically all the information that is available. The situation is very different in web search engines because of the large numbers of users whose queries are collected in query logs. Query logs reflect how large numbers of users express their queries and can be a rich source of information when optimizing search results or determining query suggestions.

In this paper we study a special case of query suggestion: query completion, which aims to help users complete their queries. In particular, we are interested in comparing a commonly adopted frequency-based approach with methods that exploit an understanding of the type of entities in queries. Our intuition is that completion for rare queries can be improved by understanding the type of entity being sought. For example, if we know that “LX354” is a kind of digital camera, we can generate sensible completions by choosing them from the set of completions used with other digital cameras. Besides suggesting queries, the obtained completions can also function as facets for faceted browsing or as input for ontology engineering since they represent query refinements common to a class of entities. In this paper, we address the following questions: (i) How can we recognize entities and their types in queries? (ii) How can we rank possible completions given an entity type? (iii) How can our methods be evaluated and how do they perform? To address (iii), we propose a novel method which evaluates the prediction of real web queries. We show that a purely frequency-based approach without any entity type information works quite well for more frequent queries, but is surpassed by type-based methods for rare queries.
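
As a rough illustration of the contrast studied here, the frequency-based and type-based completion strategies can be sketched as follows. The query log, the entities, and the type lexicon below are invented, and the simple back-off rule is a simplification of the methods compared in the paper, not the paper's actual algorithm.

```python
from collections import Counter, defaultdict

# Toy query log of (entity, completion) pairs; invented for illustration.
log = [
    ("lx354", "review"), ("lx354", "price"),
    ("powershot a40", "review"), ("powershot a40", "manual"),
    ("ixus 60", "review"),
    ("madonna", "tickets"), ("madonna", "lyrics"),
]

# Hypothetical entity-type lexicon mapping each entity to its type.
types = {"lx354": "camera", "powershot a40": "camera",
         "ixus 60": "camera", "madonna": "artist"}

# Frequency-based model: completions observed with the exact entity.
per_entity = defaultdict(Counter)
# Type-based model: completions pooled over all entities of the same type.
per_type = defaultdict(Counter)
for entity, completion in log:
    per_entity[entity][completion] += 1
    per_type[types[entity]][completion] += 1

def suggest(entity, k=3):
    model = per_entity[entity]
    if sum(model.values()) < 2:  # rare entity: back off to its type
        model = per_type[types[entity]]
    return [c for c, _ in model.most_common(k)]
```

For the frequent entity "lx354" the frequency-based model suffices, while the rare "ixus 60" borrows completions ("price", "manual") observed with other cameras.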

  • [PDF] E. Meij, P. Mika, and H. Zaragoza, “An evaluation of entity and frequency based query completion methods,” in Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval, 2009.
    [Bibtex]
    @inproceedings{SIGIR:2009:meij,
    Author = {Meij, Edgar and Mika, Peter and Zaragoza, Hugo},
    Booktitle = {Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval},
    Series = {SIGIR 2009},
    Title = {An evaluation of entity and frequency based query completion methods},
    Year = {2009},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1571941.1572074}}

A Generative Language Modeling Approach for Ranking Entities

We describe our participation in the INEX 2008 Entity Ranking track. We develop a generative language modeling approach for the entity ranking and list completion tasks. Our framework comprises the following components: (i) entity and (ii) query language models, (iii) entity prior, (iv) the probability of an entity for a given category, and (v) the probability of an entity given another entity. We explore various ways of estimating these components, and report on our results. We find that improving the estimation of these components has very positive effects on performance, yet, there is room for further improvements.
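
A minimal sketch of the generative idea, keeping only components (i)-(iii): an entity is scored by its prior times the smoothed likelihood of the query under its language model. The entity models, prior values, and smoothing weight below are invented, and the category and entity-entity components of the full framework are omitted.

```python
import math

def score(query_terms, entity_lm, prior, bg_lm, mu=0.5):
    # log p(e) + sum_t log p(t | theta_e), with linear smoothing against
    # a background model so unseen query terms do not zero out the score.
    s = math.log(prior)
    for t in query_terms:
        p = (1 - mu) * entity_lm.get(t, 0.0) + mu * bg_lm.get(t, 1e-9)
        s += math.log(p)
    return s

# Two toy entity language models and a background model.
miles = {"jazz": 0.5, "trumpet": 0.5}
pele = {"football": 0.6, "brazil": 0.4}
bg = {"jazz": 0.01, "trumpet": 0.01, "football": 0.02, "brazil": 0.01}
```

Under this sketch, an entity whose language model actually generates the query terms outranks one that relies on the background model alone.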

  • [PDF] W. Weerkamp, K. Balog, and E. Meij, “A generative language modeling approach for ranking entities,” in Advances in focused retrieval, 2009.
    [Bibtex]
    @inproceedings{INEX:2008:weerkamp,
    Author = {Weerkamp, W. and Balog, K. and Meij, E.},
    Booktitle = {Advances in Focused Retrieval},
    Publisher = {Springer},
    Title = {A Generative Language Modeling Approach for Ranking Entities},
    Year = {2009}}

Concept models for domain-specific search

We describe our participation in the 2008 CLEF Domain-specific track. We evaluate blind relevance feedback models and concept models on the CLEF domain-specific test collection. Applying relevance modeling techniques is found to have a positive effect on the 2008 topic set, in terms of mean average precision and precision@10. Applying concept models for blind relevance feedback results in even bigger improvements over a query-likelihood baseline, in terms of mean average precision and early precision.

  • [PDF] E. Meij and M. de Rijke, “Concept models for domain-specific search,” in Evaluating systems for multilingual and multimodal information access, 9th workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, revised selected papers, 2009.
    [Bibtex]
    @inproceedings{CLEF:2008:meij,
    Author = {Meij, Edgar and de Rijke, Maarten},
    Booktitle = {Evaluating Systems for Multilingual and Multimodal Information Access, 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008, Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers},
    Title = {Concept models for domain-specific search},
    Year = {2009}}

Towards a combined model for search and navigation of annotated documents

Documents whose textual content is complemented with annotations of one kind or another are ubiquitous. Examples include biomedical documents (annotated with MeSH terms) and news articles (annotated with IPTC terms). Such annotations—or concepts—have typically been used for query expansion, to suggest alternative or related query formulations, and to facilitate browsing of the document collection. In recent years, we have seen two important developments in this area: (i) a renewed interest in the knowledge sources underlying the annotations, mainly inspired by semantic web initiatives and (ii) the creation of social annotations, as part of web 2.0 developments. These developments motivate a renewed interest in models and methods for accessing annotated documents.

The theme of my proposed research is to capture two aspects in a single, unified model: retrieval and navigation. Given a query, this entails using both term-based and concept-based evidence to locate relevant information (retrieval) and suggesting useful browsing suggestions (navigation). I imagine this to be a “two-way” process, i.e., the user can browse the document collection using concepts and the relations between concepts, but she can also navigate the knowledge structure using the (vocabulary) terms from the documents. Such information seeking behavior is witnessed in an increasing number of applications and domains (e.g., suggesting related tags in Bibsonomy or Flickr), providing a solid motivation for my research agenda. In order to accomplish this unification, I will first need to address three separate, but intertwined issues. First, a way of “bridging the gap” between concepts and (vocabulary) terms is needed, since concepts are not directly observable. Second, relations between concepts need to be modeled in some way. Finally, the concepts and relations thus modeled should be integrated in the information seeking process, thereby improving both retrieval and navigation.

So far, I have formulated concept modeling as a form of text classification, by representing concepts as distributions over vocabulary terms. In the context of a digital library setting, I have shown that integrating conceptual knowledge in this way benefits both retrieval performance and navigation. More recently, I have taken these experiments a step further by creating parsimonious concept models. In these experiments, integrating concepts in the query model estimation delivers significantly better results, compared both to a query-likelihood run and to a run based on relevance models.

To determine the strength of relations between concepts, I have looked at the divergence between concept models. The estimates are based on differences in language use, as measured by the cross-entropy reduction between concept models. Experimental results show that this approach outperforms both path-based and information content-based methods on two separate test sets. While this approach measures the similarity between concepts, it does not explicitly take the relation type into consideration. Thus, any explicit link structure present in the underlying knowledge structure disappears. Whether this is a reasonable assumption for my work is still unclear and something I intend to find an answer to.

In future work, I would also like to address the question of how the retrieval-oriented models I have introduced so far may be used to further aid navigation. To some extent, I have already used the TREC Genomics test collections for the evaluation of navigational effectiveness, but future work—possibly observing users directly in a user study or indirectly through log analysis—should indicate what the model’s impact, if any, is on navigational effectiveness.

  • [PDF] E. Meij, “Towards a combined model for search and navigation of annotated documents,” in Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, 2008.
    [Bibtex]
    @inproceedings{SIGIR:2008:meij-doctcons,
    Author = {Meij, Edgar},
    Booktitle = {Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval},
    Series = {SIGIR 2008},
    Title = {Towards a combined model for search and navigation of annotated documents},
    Year = {2008},
    Bdsk-Url-1 = {http://dx.doi.org/10.1145/1390334.1390573}}

Measuring Concept Relatedness Using Language Models

Over the years, the notion of concept relatedness has attracted considerable attention. A variety of approaches, based on ontology structure, information content, association, or context, have been proposed to indicate the relatedness of abstract ideas. In this paper we present a novel context-based measure of concept relatedness: the cross-entropy reduction between language models of concepts, which are estimated from document-concept assignments. After introducing our method, we compare it to earlier methods by evaluating the results against relatedness judgments provided by human assessors. The approach shows improved or competitive results compared to state-of-the-art methods on two test sets in the biomedical domain.

  • [PDF] D. Trieschnigg, E. Meij, M. de Rijke, and W. Kraaij, “Measuring concept relatedness using language models,” in Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, 2008.
    [Bibtex]
    @inproceedings{SIGIR:2008:trieschnigg,
    Author = {Trieschnigg, Dolf and Meij, Edgar and de Rijke, Maarten and Kraaij, Wessel},
    Booktitle = {Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval},
    Series = {SIGIR 2008},
    Title = {Measuring concept relatedness using language models},
    Year = {2008},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1390334.1390523}}

Parsimonious Relevance Models

Relevance feedback is often applied to better capture a user’s information need. Automatically reformulating queries (or blind relevance feedback) entails looking at the terms in some set of (pseudo-)relevant documents and selecting the most informative ones with respect to the set or the collection. These terms may then be reweighted based on information pertinent to the query or the documents and—in a language modeling setting—be used to estimate a query model, P(t|θQ), i.e., a distribution over terms t for a given query Q.

Not all of the terms obtained using blind relevance feedback are equally informative given the query, even after reweighting. Some may be common terms, whilst others may describe the general domain of interest. We hypothesize that refining the results of blind relevance feedback, using a technique called parsimonious language modeling, will improve retrieval effectiveness. Hiemstra et al. already provide a mechanism for incorporating (parsimonious) blind relevance feedback, by viewing it as a three-component mixture model of document, set of feedback documents, and collection. Our approach is more straightforward, since it considers each feedback document separately and, hence, does not require the additional mixture model parameter. To create parsimonious language models we use an EM algorithm to update the maximum-likelihood (ML) estimates. Zhai and Lafferty already proposed an approach which uses a similar EM algorithm; it differs, however, in the way the set of feedback documents is handled. Whereas we parsimonize each individual document, they apply their EM algorithm to the entire set of feedback documents.
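
The EM re-estimation described here can be sketched for a single feedback document as follows, in the spirit of the parsimonious language models of Hiemstra et al. The mixture weight, pruning threshold, and toy counts are illustrative choices, not the paper's settings.

```python
def parsimonize(tf, p_coll, lam=0.1, iters=20, threshold=1e-4):
    # Iteratively re-estimate p(t|D) so that probability mass moves away
    # from terms the collection model p(t|C) already explains well.
    total = sum(tf.values())
    p_doc = {t: n / total for t, n in tf.items()}  # maximum-likelihood start
    for _ in range(iters):
        # E-step: expected term counts attributed to the document model.
        e = {t: tf[t] * lam * p_doc[t] / (lam * p_doc[t] + (1 - lam) * p_coll[t])
             for t in p_doc}
        norm = sum(e.values())
        # M-step: renormalize and prune terms with negligible mass.
        p_doc = {t: v / norm for t, v in e.items() if v / norm > threshold}
    z = sum(p_doc.values())
    return {t: v / z for t, v in p_doc.items()}
```

On a toy document where "the" is frequent but common in the collection and "dna" is rarer but distinctive, parsimonizing shifts probability mass from "the" to "dna" relative to the maximum-likelihood estimate.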

To verify our hypothesis, we use a specific instance of blind relevance feedback, namely relevance modeling (RM). We choose this particular method because it has been shown to achieve state-of-the-art retrieval performance. Relevance modeling assumes that the query and the set of documents are samples from an underlying term distribution—the relevance model. Lavrenko and Croft formulate two ways of approaching the estimation of the parameters of this model. We build upon their work and compare the results of our proposed parsimonious relevance models with RMs as well as with a query-likelihood baseline. To measure the effects in different contexts, we employ five test collections taken from the TREC-7, TREC Robust, Genomics, Blog, and Enterprise tracks and show that our proposed model improves performance in terms of mean average precision on all the topic sets over both a query-likelihood baseline as well as a run based on relevance models. Moreover, although blind relevance feedback is mainly a recall enhancing technique, we observe that parsimonious relevance models (unlike their non-parsimonized counterparts) can also improve early precision and reciprocal rank of the first relevant result. Thus, our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most measures, and (iii) have a precision enhancing effect, unlike other blind relevance feedback methods.

  • [PDF] E. Meij, W. Weerkamp, K. Balog, and M. de Rijke, “Parsimonious relevance models,” in Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, 2008.
    [Bibtex]
    @inproceedings{SIGIR:2008:Meij-prm,
    Author = {Meij, Edgar and Weerkamp, Wouter and Balog, Krisztian and de Rijke, Maarten},
    Booktitle = {Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval},
    Series = {SIGIR 2008},
    Title = {Parsimonious relevance models},
    Year = {2008},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1390334.1390520}}

Parsimonious concept modeling

In many collections, documents are annotated using concepts from a structured knowledge source such as an ontology or thesaurus. Examples include the news domain, where each news item is categorized according to the nature of the event that took place, and Wikipedia, with its per-article categories. These categorizing systems originally stem from the cataloging systems used in libraries and conceptual search is commonly used in digital library environments at the front-end to support search and navigation. In this paper we want to employ the explicit knowledge used for annotation at the back-end, not just to improve retrieval performance, but also to generate high-quality term and concept suggestions. To do so, we use the dual document representation—concepts and terms—to create a generative language model for each concept, which bridges the gap between vocabulary terms and concepts. Related work has also used textual representations to represent concepts, however, there are two important differences. First, we use statistical language modeling techniques to parametrize the concept models, by leveraging the dual representation of the documents. Second, we found that simple maximum likelihood estimation assigns too much probability mass to terms and concepts which may not be relevant to each document. Thus we apply an EM algorithm to “parsimonize” the document models.
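
The maximum-likelihood version of such a concept model, before the parsimonization step, can be sketched as follows: each concept's language model pools the terms of all documents annotated with that concept. The documents and concept labels below are invented examples.

```python
from collections import Counter, defaultdict

# Toy annotated collection: text plus a set of assigned concepts.
docs = {
    "d1": ("the heart pumps blood", {"Heart"}),
    "d2": ("cardiac muscle tissue of the heart", {"Heart", "Muscle"}),
    "d3": ("skeletal muscle fibers", {"Muscle"}),
}

# Pool term counts over the documents assigned to each concept.
concept_tf = defaultdict(Counter)
for text, concepts in docs.values():
    for c in concepts:
        concept_tf[c].update(text.split())

def concept_model(c):
    # p(t | c): maximum-likelihood estimate over the pooled term counts.
    tf = concept_tf[c]
    total = sum(tf.values())
    return {t: n / total for t, n in tf.items()}
```

The same parsimonization idea described above can then be applied to each document model before pooling, so that generic terms like "the" contribute less to the concept models.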

The research questions we address are twofold: (i) what are the results of applying our model as compared to a query-likelihood baseline as well as compared to a run based on relevance models and (ii) what is the influence of parsimonizing? To answer these questions, we use the TREC Genomics track test collections in conjunction with MEDLINE. MEDLINE contains over 16 million bibliographic records of publications from the life sciences domain and each abstract therein has been manually indexed by trained curators, who use concepts from the MeSH (Medical Subject Headings) thesaurus. We show that our approach is able to achieve similar or better performance than relevance models, whilst at the same time providing high-quality concepts to facilitate navigation. Examples show that our parsimonious concept models generate terms that are more specific than those acquired through maximum likelihood estimates.

  • [PDF] E. Meij, D. Trieschnigg, M. de Rijke, and W. Kraaij, “Parsimonious concept modeling,” in Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval, 2008.
    [Bibtex]
    @inproceedings{SIGIR:2008:Meij-cm,
    Author = {Meij, Edgar and Trieschnigg, Dolf and de Rijke, Maarten and Kraaij, Wessel},
    Booktitle = {Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval},
    Series = {SIGIR 2008},
    Title = {Parsimonious concept modeling},
    Year = {2008},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/1390334.1390519}}