Overview of RepLab 2012: Evaluating Online Reputation Management Systems

This paper summarizes the goals, organization, and results of the first RepLab competitive evaluation campaign for Online Reputation Management Systems (RepLab 2012). RepLab focused on the reputation of companies, and asked participant systems to annotate different types of information on tweets containing the names of several companies. Two tasks were proposed: a profiling task, where tweets had to be annotated for relevance and for reputation polarity, and a monitoring task, where tweets had to be clustered thematically and the clusters ordered by priority (for reputation management purposes). The gold standard consists of annotations made by reputation management experts, a feature which turns the RepLab 2012 test collection into a useful source not only to evaluate systems, but also to reach a better understanding of the notions of polarity and priority in the context of reputation management.
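
To make the two task definitions concrete, the following minimal Python sketch shows one way the resulting annotations could be represented; the class and field names are our own illustration, not the official RepLab 2012 annotation schema or data format.

    from dataclasses import dataclass, field
    from enum import Enum
    from typing import List, Optional

    # Illustrative only: not the official RepLab 2012 schema.

    class Polarity(Enum):
        POSITIVE = "positive"
        NEUTRAL = "neutral"
        NEGATIVE = "negative"

    @dataclass
    class ProfilingAnnotation:
        """Profiling task: is the tweet relevant to the company, and what
        does it imply for the company's reputation?"""
        tweet_id: str
        company: str
        relevant: bool                       # does the tweet refer to the company?
        polarity: Optional[Polarity] = None  # reputation polarity, if relevant

    @dataclass
    class MonitoringCluster:
        """Monitoring task: tweets grouped thematically, with a priority
        rank assigned for reputation management purposes."""
        company: str
        tweet_ids: List[str] = field(default_factory=list)
        priority_rank: int = 0               # 1 = highest-priority cluster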

  • [PDF] E. Amigó, A. Corujo, J. Gonzalo, E. Meij, and M. de Rijke, “Overview of RepLab 2012: evaluating online reputation management systems,” in CLEF (Online Working Notes/Labs/Workshop), 2012.
    [Bibtex]
    @inproceedings{CLEF:2012:replab,
    Author = {Enrique Amig{\'o} and Adolfo Corujo and Julio Gonzalo and Edgar Meij and Maarten de Rijke},
    Booktitle = {CLEF (Online Working Notes/Labs/Workshop)},
    Date-Added = {2012-09-20 12:48:33 +0000},
    Date-Modified = {2012-10-30 09:30:49 +0000},
    Title = {Overview of {RepLab} 2012: Evaluating Online Reputation Management Systems},
    Year = {2012}}

Generating Pseudo Test Collections for Learning to Rank Scientific Articles

Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse but comes with rich annotations. Our intuition is that documents are annotated to make them better findable for certain information needs. We use these annotations and the associated documents as a source for pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and show how our pseudo test collection ranks systems compared to editorial topics with editorial judgments. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments. In some cases, performance is on par with learning on manually obtained ground truth.
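
A minimal sketch of this idea, assuming annotations are keywords attached to documents by the library's curators: each annotation is treated as a surrogate query, and the documents carrying it become its pseudo-relevant results. Function and variable names are hypothetical, not the paper's actual implementation, and the sampling step is only a stand-in for the sampling strategies the paper compares.

    import random
    from collections import defaultdict
    from typing import Dict, List, Tuple

    def build_pseudo_test_collection(
        doc_annotations: Dict[str, List[str]],
        max_queries: int = 1000,
        seed: int = 42,
    ) -> List[Tuple[str, str]]:
        """Return (pseudo_query, relevant_doc_id) pairs.

        doc_annotations maps a document id to the annotations (keywords)
        attached to that document in the digital library.
        """
        rng = random.Random(seed)

        # Invert the mapping: each annotation becomes a candidate query,
        # and every document carrying it becomes a pseudo-relevant document.
        query_to_docs: Dict[str, List[str]] = defaultdict(list)
        for doc_id, annotations in doc_annotations.items():
            for annotation in annotations:
                query_to_docs[annotation].append(doc_id)

        # Sample annotations to use as pseudo queries; choosing a sampling
        # strategy here is where different variants of the method differ.
        sampled = rng.sample(
            sorted(query_to_docs), min(max_queries, len(query_to_docs))
        )
        return [(q, doc_id) for q in sampled for doc_id in query_to_docs[q]]

The resulting pairs can then serve as training judgments for a learning to rank method in place of manually obtained ground truth.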

  • [PDF] R. Berendsen, M. Tsagkias, M. de Rijke, and E. Meij, “Generating pseudo test collections for learning to rank scientific articles,” in Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics – Third International Conference of the CLEF Initiative, CLEF 2012, 2012.
    [Bibtex]
    @inproceedings{CLEF:2012:berendsen,
    Author = {Berendsen, Richard and Tsagkias, Manos and de Rijke, Maarten and Meij, Edgar},
    Booktitle = {Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics - Third International Conference of the CLEF Initiative, CLEF 2012},
    Date-Added = {2012-07-03 13:44:06 +0200},
    Date-Modified = {2012-10-30 08:37:52 +0000},
    Title = {Generating Pseudo Test Collections for Learning to Rank Scientific Articles},
    Year = {2012}}