Thesaurus-Based Feedback to Support Mixed Search and Browsing Environments

We propose and evaluate a query expansion mechanism that supports searching and browsing in collections of annotated documents. Based on generative language models, our feedback mechanism uses document-level annotations to bias the generation of expansion terms and to generate browsing suggestions in the form of concepts selected from a controlled vocabulary (as typically used in digital library settings). We provide a detailed formalization of our feedback mechanism and evaluate its effectiveness using the TREC 2006 Genomics track test set. In terms of retrieval effectiveness, we find a 20% improvement in mean average precision over a query-likelihood baseline, while also increasing precision at 10. When we base the parameter estimation and feedback generation of our algorithm on a large corpus, we also find an improvement over state-of-the-art relevance models. The browsing suggestions are assessed along two dimensions: relevancy and specificity. We present an account of per-topic results, which helps to understand for which types of queries our feedback mechanism is particularly helpful.
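The two retrieval measures named in the abstract, mean average precision and precision at 10, can be illustrated with a minimal sketch. It assumes the standard document-level definitions and binary relevance judgments; the function names and the toy document identifiers are hypothetical and are not taken from the paper or from the TREC evaluation tooling.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranking[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Sum of the precision at each rank holding a relevant document,
    divided by the total number of relevant documents for the topic."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of the per-topic average precision values.

    `runs` holds one (ranking, relevant set) pair per topic.
    """
    scores = [average_precision(ranking, relevant) for ranking, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical): two topics with hand-made rankings and judgments.
topic_a = (["d1", "d2", "d3", "d4"], {"d1", "d3"})
topic_b = (["d5", "d6"], {"d6"})
map_score = mean_average_precision([topic_a, topic_b])
p_at_2 = precision_at_k(topic_a[0], topic_a[1], k=2)
```

A relative improvement such as the one reported above would then be computed as the difference between the two systems' mean average precision divided by the baseline's value.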

  • E. Meij and M. de Rijke, “Thesaurus-based feedback to support mixed search and browsing environments,” in Research and Advanced Technology for Digital Libraries, 11th European Conference, ECDL 2007, 2007.
    @inproceedings{ECDL:2007:meij,
    Author = {Edgar Meij and Maarten de Rijke},
    Booktitle = {Research and Advanced Technology for Digital Libraries, 11th European Conference, ECDL 2007},
    Date-Added = {2011-10-12 18:31:55 +0200},
    Date-Modified = {2012-10-28 23:04:22 +0000},
    Title = {Thesaurus-Based Feedback to Support Mixed Search and Browsing Environments},
    Year = {2007}}