Workshop Papers Archives

Recent papers

Please see Google Scholar for an up-to-date list of my most recent papers, and be sure to check out https://techatbloomberg.com/ai as well!

Uncertainty over Uncertainty: Investigating the Assumptions, Annotations, and Text Measurements of Economic Policy Uncertainty

18/11/2020 Blog Publications Workshop Papers

Methods and applications are inextricably linked in science, and in particular in the domain of text-as-data. In this paper, we examine one such text-as-data application, an established economic index that measures economic policy uncertainty from keyword occurrences in news. This index, which is shown to correlate with firm investment, employment,…

Time-Aware Chi-squared for Document Filtering over Time

22/07/2013 Publications Workshop Papers

To appear at TAIA2013 (a SIGIR 2013 workshop). Document filtering over time is widely applied in various tasks such as tracking topics in online news or social media. We consider it a classification task, where topics of interest correspond to classes, and the feature space consists of the words associated…

Multilingual Semantic Linking for Video Streams: Making “Ideas Worth Sharing” More Accessible

15/05/2013 Blog Publications Workshop Papers

This paper describes our (winning!) submission to the Developers Challenge at WoLE2013, “Doing Good by Linking Entities.” We present a fully automatic system – called “Semantic TED” – which provides intelligent suggestions in the form of links to Wikipedia articles for video streams in multiple languages, based on the subtitles…

OpenGeist: Insight in the Stream of Page Views on Wikipedia

03/07/2012 Publications Workshop Papers

We present a RESTful interface that captures insights into the zeitgeist of Wikipedia users. In recent years many so-called zeitgeist applications have been launched. Such applications are used to gain insights into the current gist of society and actual affairs. Several news sources run zeitgeist applications for popular and trending news.…

A Corpus for Entity Profiling in Microblog Posts

29/03/2012 Blog Publications Workshop Papers

Microblogs have become an invaluable source of information for the purpose of online reputation management. An emerging problem in the field of online reputation management consists of identifying the key aspects of an entity commented on in microblog posts. Streams of microblogs are of great value because of their direct and…

Entity Search: Building Bridges between Two Worlds

20/04/2010 Publications Workshop Papers

We have come to depend on technological resources to create order and find meaning in the ever-growing amount of online data. One frequently recurring type of query in web search is the query containing named entities (persons, organizations, locations, etc.): we organize our environments around entities that are meaningful to us.…

Investigating the Demand Side of Semantic Search through Query Log Analysis

17/04/2009 Publications Workshop Papers

Semantic search is by its broadest definition a collection of approaches that aim at matching the Web’s content with the information need of Web users at a semantic level. Most of the work in this area has focused on the supply-side of semantic search, in particular elevating Web content to…

Deploying Lucene on the Grid

01/07/2006 Publications Workshop Papers

We investigate if and how open source retrieval engines can be deployed in a grid environment. When comparing grids to conventional distributed IR, the lack of a-priori knowledge about available nodes is one of the most significant differences. On top of that, it is also unknown when a particular node…

Edgar Meij
