Thesaurus-Based Feedback to Support Mixed Search and Browsing Environments

We propose and evaluate a query expansion mechanism that supports searching and browsing in collections of annotated documents. Based on generative language models, our feedback mechanism uses document-level annotations to bias the generation of expansion terms and to generate browsing suggestions in the form of concepts selected from a controlled vocabulary (as typically used in digital library settings). We provide a detailed formalization of our feedback mechanism and evaluate its effectiveness using the TREC 2006 Genomics track test set. As to the retrieval effectiveness, we find a 20% improvement in mean average precision over a query-likelihood baseline, whilst increasing precision at 10. When we base the parameter estimation and feedback generation of our algorithm on a large corpus, we also find an improvement over state-of-the-art relevance models. The browsing suggestions are assessed along two dimensions: relevancy and specificity. We present an account of per-topic results, which helps understand for what type of queries our feedback mechanism is particularly helpful.

[bibtex key=ECDL:2007:meij]

[bibtex key=ECDL:2007:meij:talk]