Learning Semantic Query Suggestions

An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.
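
To make the feature-based idea above concrete, here is a minimal sketch of how term-based features for query-to-concept candidates could be computed. It is not the paper's implementation: the function names, the three features, and the tiny concept dictionary are all assumptions chosen for illustration, and the search history-based and concept-specific features mentioned in the abstract are left out.

```python
from typing import Dict, List, Tuple


def ngrams(query: str) -> List[str]:
    """Return every contiguous word n-gram of the query, longest first."""
    words = query.lower().split()
    grams = []
    for n in range(len(words), 0, -1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


def candidate_features(
    query: str, labels: Dict[str, str]
) -> List[Tuple[str, str, Dict[str, float]]]:
    """Pair query n-grams with concepts whose label contains them.

    `labels` maps a concept identifier to its human-readable label
    (a hypothetical stand-in for a knowledge-base lookup).
    """
    n_query_words = len(query.split())
    rows = []
    for gram in ngrams(query):
        for concept, label in labels.items():
            label_lc = label.lower()
            # Whole-word containment: pad both sides with spaces.
            if f" {gram} " not in f" {label_lc} ":
                continue
            gram_len = len(gram.split())
            features = {
                "exact_match": 1.0 if gram == label_lc else 0.0,
                "gram_len": float(gram_len),
                "query_coverage": gram_len / n_query_words,
            }
            rows.append((gram, concept, features))
    return rows


# Hypothetical concept labels, for illustration only.
example_labels = {
    "Beatrix_of_the_Netherlands": "Beatrix of the Netherlands",
    "Queen_(band)": "Queen",
    "Speech": "Speech",
}
example_rows = candidate_features("queen beatrix speech", example_labels)
```

Each row in `example_rows` is one (n-gram, concept, feature dictionary) candidate. In a supervised setting, rows like these would be labeled by human assessors and passed to a classifier that decides which concepts to suggest for the query.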

  • [PDF] E. Meij, M. Bron, B. Huurnink, L. Hollink, and M. de Rijke, “Learning semantic query suggestions,” in Proceedings of the 8th international conference on the semantic web, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:Meij,
    Abstract = {Learning Semantic Query Suggestions by Edgar Meij, Marc Bron, Laura Hollink, Bouke Huurnink and Maarten de Rijke is available online now. An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that return more useful search results than the original query, and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.},
    Author = {Meij, E. and Bron, M. and Huurnink, B. and Hollink, L. and de Rijke, M.},
    Booktitle = {Proceedings of the 8th International Conference on The Semantic Web},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:45:04 +0000},
    Series = {ISWC 2009},
    Title = {Learning Semantic Query Suggestions},
    Year = {2009}}

[Figure: Distribution of structured data embedded in XHTML]

Investigating the Semantic Gap through Query Log Analysis

Significant efforts in the past years have focused on bringing large amounts of metadata online, and the success of these efforts can be seen in the impressive number of web sites exposing data in RDFa or RDF/XML. However, little is known about the extent to which this data fits the needs of ordinary web users with everyday information needs. In this paper we study what we perceive as the semantic gap between the supply of data on the Semantic Web and the needs of web users as expressed in the queries submitted to a major Web search engine. We perform our analysis at the level of both instances and ontologies. First, we look at how much data is actually relevant to Web queries and what kind of data it is. Second, we provide a generic method to extract the attributes that Web users are searching for regarding particular classes of entities. This method allows us to contrast class definitions found in Semantic Web vocabularies with the attributes of objects that users are interested in. Our findings are crucial to measuring the potential of semantic search, but also speak to the state of the Semantic Web in general.
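
As a rough illustration of the attribute extraction idea described above, the sketch below counts the words that accompany a known entity name in a query. It is an assumed simplification, not the method from the paper: the function name, the matching rule, and the sample queries and entity names are all hypothetical.

```python
from collections import Counter
from typing import Iterable


def attribute_counts(queries: Iterable[str], entities: Iterable[str]) -> Counter:
    """Count the words that appear alongside a known entity name in a query.

    `entities` is a list of names for instances of one class, for
    example products. Words left over after removing the entity name
    are treated as candidate attributes of that class.
    """
    counts: Counter = Counter()
    # Try longer names first so the most specific entity wins.
    names = sorted((e.lower() for e in entities), key=len, reverse=True)
    for query in queries:
        # Normalize whitespace and pad so only whole words match.
        padded = f" {' '.join(query.lower().split())} "
        for name in names:
            needle = f" {name} "
            if needle in padded:
                remainder = padded.replace(needle, " ", 1).split()
                counts.update(remainder)
                break  # attribute each query to at most one entity
    return counts


# Hypothetical query log and entity list, for illustration only.
example_counts = attribute_counts(
    ["nokia n95 price", "canon eos 450d review", "nokia n95 review", "weather amsterdam"],
    ["Nokia N95", "Canon EOS 450D"],
)
```

The most frequent words in `example_counts` can then be compared with the properties a Semantic Web vocabulary defines for the same class, which is the kind of contrast the abstract describes.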

  • [PDF] P. Mika, E. Meij, and H. Zaragoza, “Investigating the semantic gap through query log analysis,” in Proceedings of the 8th international semantic web conference, 2009.
    [Bibtex]
    @inproceedings{ISWC:2009:mika,
    Author = {Peter Mika and Edgar Meij and Hugo Zaragoza},
    Booktitle = {Proceedings of the 8th International Semantic Web Conference},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-30 08:45:11 +0000},
    Series = {ISWC 2009},
    Title = {Investigating the Semantic Gap through Query Log Analysis},
    Year = {2009},
    Bdsk-Url-1 = {http://dblp.uni-trier.de/db/conf/semweb/iswc2009.html#MikaMZ09}}