TREC

The University of Amsterdam at the TREC 2011 Session Track

We describe the participation of the University of Amsterdam’s ILPS group in the Session track at TREC 2011.

The stream of interactions created by a user engaging with a search system contains a wealth of information. For retrieval purposes, previous interactions can help inform us about a user’s current information need. Building on this intuition, our contribution to this year’s TREC Session track focuses on session modeling and learning to rank using session information. In this paper, we present and compare three complementary strategies that we designed for improving retrieval for a current query using previous queries and clicked results: probabilistic session modeling, semantic query modeling, and implicit feedback.

In our experiments we examined three complementary strategies for improving retrieval for a current query. Our first strategy, based on probabilistic session modeling, was the best performing strategy.

Our second strategy, based on semantic query modeling, did less well than we expected, likely due to topic drift from excessively aggressive query expansion. We expect that performance of this strategy would improve by limiting the number of terms and/or improving the probability estimates.

With respect to our third strategy, based on learning from feedback, we found that learning weights for linear weighted combinations of features from an external collection can be beneficial if the characteristics of that collection are similar to the current data. Feedback available in the form of user clicks appeared to be less beneficial. Our run that learned from implicit feedback performed substantially worse than a run where weights were learned from an external collection with explicit feedback, using the same learning algorithm and set of features.

  • [PDF] B. Huurnink, R. Berendsen, K. Hofmann, E. Meij, and M. de Rijke, “The University of Amsterdam at the TREC 2011 session track,” in The twentieth text retrieval conference, 2012.
    [Bibtex]
    @inproceedings{TREC:2011:huurnink,
    Author = {Huurnink, Bouke and Berendsen, Richard and Hofmann, Katja and Meij, Edgar and de Rijke, Maarten},
    Booktitle = {The Twentieth Text REtrieval Conference},
    Date-Added = {2011-10-22 12:22:18 +0200},
    Date-Modified = {2013-05-22 11:44:53 +0000},
    Month = {January},
    Series = {TREC 2011},
    Title = {The {University of Amsterdam} at the {TREC} 2011 Session Track},
    Year = {2012}}
[Figure: P30 difference plot]

Team COMMIT at TREC 2011

We describe the participation of Team COMMIT in this year’s Microblog and Entity tracks.

In our participation in the Microblog track, we used a feature-based approach. Specifically, we pursued a precision-oriented, recency-aware retrieval approach for tweets. Among other things, we used various types of external data. In particular, we examined the potential of link retrieval on a corpus of crawled content pages and we used semantic query expansion based on Wikipedia. We also deployed pre-filtering based on query-dependent and query-independent features. For the Microblog track we found that a simple cut-off based on the z-score is not sufficient: for differently distributed scores, it can decrease recall. A well-set cut-off parameter can, however, significantly increase precision, especially if there are few highly relevant tweets. Filtering based on query-independent features does not help when the result list is already small. With a high occurrence of links in relevant tweets, we found that link retrieval helps improve precision and recall for both highly relevant and relevant tweets. Future work should focus on a selection criterion that depends on the score distribution.
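The observation about the z-score cut-off can be illustrated with a small sketch. This is not the filter used in the run: the function name `zscore_cutoff`, the threshold of 1.0, and the toy scores are assumptions chosen only to show how a fixed z-score threshold behaves on a peaked versus a nearly flat score distribution.

```python
from statistics import mean, pstdev


def zscore_cutoff(scored, threshold=1.0):
    """Keep (tweet_id, score) pairs whose z-score is at least `threshold`.

    Hypothetical helper for illustration; the threshold is an assumed parameter.
    """
    scores = [score for _, score in scored]
    if len(scores) < 2:
        return list(scored)
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores are identical, so there is nothing to separate.
        return list(scored)
    return [(tid, s) for tid, s in scored if (s - mu) / sigma >= threshold]


# Peaked distribution: one tweet clearly stands out from the rest.
peaked = [("t1", 9.0), ("t2", 1.0), ("t3", 1.0), ("t4", 1.0), ("t5", 1.0)]

# Nearly flat distribution: all tweets score about the same.
flat = [("t1", 5.0), ("t2", 4.9), ("t3", 4.8), ("t4", 4.7), ("t5", 4.6)]

print(zscore_cutoff(peaked))
print(zscore_cutoff(flat))
```

In both toy lists only the top tweet survives. For the peaked list that is the intended precision gain; for the flat list the same threshold discards tweets that score almost as well as the one kept, which is the kind of recall loss a criterion that depends on the score distribution would aim to avoid.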

In this year’s Entity track participation we focused on the Entity List Completion (ELC) task. We experimented with a text-based and a link-based approach to retrieve entities in Linked Data (LD). Additionally, we experimented with selecting candidate entities from a web corpus. Our intuition is that entities occurring on pages that contain many of the example entities are more likely to be good candidates than entities that do not. For the Entity track there are no analyses or conclusions to report yet; at the time of writing, no evaluation results are available.
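The co-occurrence intuition for candidate selection can be sketched as follows. The scoring rule here (summing, over pages, the number of example entities that appear alongside a candidate) is an assumption made for illustration and is not taken from the paper; `rank_candidates` and the toy page data are hypothetical.

```python
from collections import Counter


def rank_candidates(pages, examples):
    """Rank non-example entities by co-occurrence with example entities.

    `pages` is an iterable of entity collections, one per web page.
    The additive overlap score is an assumed, illustrative choice.
    """
    examples = set(examples)
    scores = Counter()
    for page_entities in pages:
        page_entities = set(page_entities)
        overlap = len(page_entities & examples)
        if overlap == 0:
            continue  # Pages without any example entity give no evidence.
        for entity in page_entities - examples:
            scores[entity] += overlap
    # Sort by descending score, then by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


pages = [
    {"A", "B", "X"},  # X appears with both example entities
    {"A", "Y"},       # Y appears with one example entity
    {"Z"},            # Z never appears with an example entity
]
print(rank_candidates(pages, examples=["A", "B"]))
```

Under this assumed scoring, `X` ranks above `Y`, and `Z` is not proposed at all because it never shares a page with an example entity.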

  • [PDF] M. Bron, E. Meij, M. Peetz, M. Tsagkias, and M. de Rijke, “Team COMMIT at TREC 2011,” in The twentieth text retrieval conference, 2012.
    [Bibtex]
    @inproceedings{TREC:2011:commit,
    Author = {Bron, Marc and Meij, Edgar and Peetz, Maria-Hendrike and Tsagkias, Manos and de Rijke, Maarten},
    Booktitle = {The Twentieth Text REtrieval Conference},
    Date-Added = {2011-10-22 12:22:19 +0200},
    Date-Modified = {2012-10-30 09:26:12 +0000},
    Series = {TREC 2011},
    Title = {Team {COMMIT} at {TREC 2011}},
    Year = {2012}}