TREC

TREC 2012 summary

In the 21st Text REtrieval Conference (TREC 2012), seven tracks ran: KBA, Contextual suggestion, Session, Web, Medical, Crowdsourcing, and Microblog. Of these, Microblog attracted the largest number of participating groups (40) closely followed by Medical (24).

Overview of RepLab 2012: Evaluating Online Reputation Management Systems

This paper summarizes the goals, organization and results of the first RepLab competitive evaluation campaign for Online Reputation Management Systems (RepLab 2012). RepLab focused on the reputation of companies, and asked participant systems to annotate different types of information on tweets containing the names of several companies. Two tasks were proposed: a proling task, where…
TREC KBA logo

Hadoop code for TREC KBA

I’ve decided to put some of the Hadoop code I developed for the TREC KBA task online. It’s available on Github: https://github.com/ejmeij/trec-kba. In particular, it provides classes to read/write topic files, read/write run files, and expose the documents in the Thrift files as Hadoop-readable objects (‘ThriftFileInputFormat’) to be used as input to mappers. I obviously also…
linking open data datasets

Zoekmachines van de toekomst

Er bestaat enige discussie over wat de logische opvolger zal zijn van web 2.0, waarin user-generated content, het delen van informatie en interoperabiliteit centraal stonden. Hoewel meer ideeën de ronde doen, is er veel steun voor het idee web 3.0 gelijk te stellen aan het semantische web. Het sturende idee…