Special issue on knowledge graphs and semantics in text analysis and retrieval

Knowledge graphs are an effective way to store semantics in a structured format that is easily used by computer systems. In the past few decades, work across different research communities has led to scalable knowledge acquisition techniques for building large-scale knowledge graphs. The result is the emergence of large publicly available knowledge graphs (KGs) such as Wikidata, DBpedia, Freebase, and others. While knowledge graphs are designed to support a wide range of applications, this special issue focuses on the use case of text retrieval and analysis.

Utilizing knowledge graphs for text analysis requires effective alignment techniques that associate segments of unstructured text with entries in the knowledge graph, for example using entity extraction and linking algorithms. A wide range of approaches that combine query-document representations and machine learning have repeatedly demonstrated significant improvements for such tasks across diverse domains. The goal of this special issue is to summarize recent progress in research and practice in constructing, grounding, and utilizing knowledge graphs and similar semantic resources for text retrieval and analysis applications. The scope includes acquisition, alignment, and utilization of knowledge graphs and other semantic resources for the purpose of optimizing end-to-end performance of information retrieval systems.

For this special issue we selected six articles out of 23 submissions. Each article was reviewed by at least three reviewers and underwent at least one revision. More literature on how to effectively use knowledge graphs in information retrieval can be found in the proceedings of the KG4IR Workshop series.

  • [PDF] [DOI] L. Dietz, C. Xiong, J. Dalton, and E. Meij, “Special issue on knowledge graphs and semantics in text analysis and retrieval,” Information Retrieval Journal, 2019.
    [Bibtex]
    @article{IRJ:2019:Dietz,
    Author = {Dietz, Laura and Xiong, Chenyan and Dalton, Jeff and Meij, Edgar},
    Date-Added = {2019-03-12 20:19:31 +0000},
    Date-Modified = {2019-03-12 20:19:39 +0000},
    Day = {04},
    Doi = {10.1007/s10791-019-09354-z},
    Issn = {1573-7659},
    Journal = {Information Retrieval Journal},
    Month = {Mar},
    Title = {Special issue on knowledge graphs and semantics in text analysis and retrieval},
    Url = {https://doi.org/10.1007/s10791-019-09354-z},
    Year = {2019},
    Bdsk-Url-1 = {https://doi.org/10.1007/s10791-019-09354-z}}

The Second Workshop on Knowledge Graphs and Semantics for Text Retrieval, Analysis, and Understanding (KG4IR)

Semantic technologies such as controlled vocabularies, thesauri, and knowledge graphs have been used throughout the history of information retrieval for a variety of tasks. Recent advances in knowledge acquisition, alignment, and utilization have given rise to a body of new approaches for utilizing knowledge graphs in text retrieval tasks. It is therefore time to consolidate the community efforts and study how such technologies can be employed most effectively in information retrieval systems. It is also time to start and deepen the dialogue between researchers and practitioners in order to ensure that breakthroughs, technologies, and algorithms in this space are widely disseminated. The goal of this workshop, co-located with SIGIR 2018, is to bring together and grow a community of researchers and practitioners who are interested in using, aligning, and constructing knowledge graphs and similar semantic resources for information retrieval applications. See https://kg4ir.github.io/ for more info.

  • [PDF] [DOI] L. Dietz, C. Xiong, J. Dalton, and E. Meij, “The second workshop on knowledge graphs and semantics for text retrieval, analysis, and understanding (KG4IR),” in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, New York, NY, USA, 2018, p. 1423–1426.
    [Bibtex]
    @inproceedings{SIGIR:2018:Dietz-WS,
    Acmid = {3210196},
    Address = {New York, NY, USA},
    Author = {Dietz, Laura and Xiong, Chenyan and Dalton, Jeff and Meij, Edgar},
    Booktitle = {The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
    Date-Added = {2018-07-26 18:25:34 +0000},
    Date-Modified = {2018-07-26 18:31:50 +0000},
    Doi = {10.1145/3209978.3210196},
    Isbn = {978-1-4503-5657-2},
    Keywords = {entity linking, entity retrieval, entity-oriented search, information retrieval, knowledge graphs},
    Location = {Ann Arbor, MI, USA},
    Numpages = {4},
    Pages = {1423--1426},
    Publisher = {ACM},
    Series = {SIGIR '18},
    Title = {The Second Workshop on Knowledge Graphs and Semantics for Text Retrieval, Analysis, and Understanding (KG4IR)},
    Url = {http://doi.acm.org/10.1145/3209978.3210196},
    Year = {2018},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/3209978.3210196},
    Bdsk-Url-2 = {https://doi.org/10.1145/3209978.3210196}}

Utilizing Knowledge Graphs for Text-Centric Information Retrieval

The past decade has witnessed the emergence of several publicly available and proprietary knowledge graphs (KGs). The depth and breadth of content in these KGs made them not only rich sources of structured knowledge by themselves, but also valuable resources for search systems. A surge of recent developments in entity linking and entity retrieval methods gave rise to a new line of research that aims at utilizing KGs for text-centric retrieval applications. This tutorial is the first to summarize and disseminate the progress in this emerging area to industry practitioners and researchers.

  • [PDF] [DOI] L. Dietz, A. Kotov, and E. Meij, “Utilizing knowledge graphs for text-centric information retrieval,” in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, New York, NY, USA, 2018, p. 1387–1390.
    [Bibtex]
    @inproceedings{SIGIR:2018:Dietz-Tut,
    Acmid = {3210187},
    Address = {New York, NY, USA},
    Author = {Dietz, Laura and Kotov, Alexander and Meij, Edgar},
    Booktitle = {The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
    Date-Added = {2018-07-26 18:24:31 +0000},
    Date-Modified = {2018-07-26 18:31:50 +0000},
    Doi = {10.1145/3209978.3210187},
    Isbn = {978-1-4503-5657-2},
    Keywords = {entity linking, entity retrieval, information retrieval, knowledge graphs},
    Location = {Ann Arbor, MI, USA},
    Numpages = {4},
    Pages = {1387--1390},
    Publisher = {ACM},
    Series = {SIGIR '18},
    Title = {Utilizing Knowledge Graphs for Text-Centric Information Retrieval},
    Url = {http://doi.acm.org/10.1145/3209978.3210187},
    Year = {2018},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/3209978.3210187},
    Bdsk-Url-2 = {https://doi.org/10.1145/3209978.3210187}}

Overview of The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR)

Knowledge graphs have been used throughout the history of information retrieval for a variety of tasks. Advances in knowledge acquisition and alignment technology in the last few years have given rise to a body of new approaches for utilizing knowledge graphs in text retrieval tasks. This report presents the motivation, output, and outlook of the first workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis, which was co-located with SIGIR 2017 in Tokyo, Japan. We aim to assess where we stand today, what the future directions are, and which preconditions could lead to further performance increases.

  • [PDF] [DOI] L. Dietz, C. Xiong, and E. Meij, “Overview of the first workshop on knowledge graphs and semantics for text retrieval and analysis (KG4IR),” SIGIR Forum, vol. 51, iss. 3, p. 139–144, 2018.
    [Bibtex]
    @article{Forum:2018:Dietz,
    Acmid = {3190601},
    Address = {New York, NY, USA},
    Author = {Dietz, Laura and Xiong, Chenyan and Meij, Edgar},
    Date-Added = {2018-07-26 18:22:37 +0000},
    Date-Modified = {2018-07-26 18:22:48 +0000},
    Doi = {10.1145/3190580.3190601},
    Issn = {0163-5840},
    Issue_Date = {December 2017},
    Journal = {SIGIR Forum},
    Month = 2,
    Number = {3},
    Numpages = {6},
    Pages = {139--144},
    Publisher = {ACM},
    Title = {Overview of The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR)},
    Url = {http://doi.acm.org/10.1145/3190580.3190601},
    Volume = {51},
    Year = {2018},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/3190580.3190601},
    Bdsk-Url-2 = {https://doi.org/10.1145/3190580.3190601}}

The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR)

Knowledge graphs have been used throughout the history of information retrieval for a variety of tasks. Advances in knowledge acquisition and alignment technology in the last few years have given rise to a body of new approaches for utilizing knowledge graphs in text retrieval tasks. This report presents the motivation, output, and outlook of the first workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis, which was co-located with SIGIR 2017 in Tokyo, Japan. We aim to assess where we stand today, what the future directions are, and which preconditions could lead to further performance increases. See https://kg4ir.github.io/ for more info.

  • [PDF] [DOI] L. Dietz, C. Xiong, and E. Meij, “The first workshop on knowledge graphs and semantics for text retrieval and analysis (KG4IR),” in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, USA, 2017, p. 1427–1428.
    [Bibtex]
    @inproceedings{SIGIR:2017:Dietz,
    Acmid = {3084371},
    Address = {New York, NY, USA},
    Author = {Dietz, Laura and Xiong, Chenyan and Meij, Edgar},
    Booktitle = {Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval},
    Date-Added = {2018-07-26 18:17:39 +0000},
    Date-Modified = {2018-07-26 18:17:51 +0000},
    Doi = {10.1145/3077136.3084371},
    Isbn = {978-1-4503-5022-8},
    Keywords = {entities, information retrieval, knowledge graphs},
    Location = {Shinjuku, Tokyo, Japan},
    Numpages = {2},
    Pages = {1427--1428},
    Publisher = {ACM},
    Series = {SIGIR '17},
    Title = {The First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR)},
    Url = {http://doi.acm.org/10.1145/3077136.3084371},
    Year = {2017},
    Bdsk-Url-1 = {http://doi.acm.org/10.1145/3077136.3084371},
    Bdsk-Url-2 = {https://doi.org/10.1145/3077136.3084371}}

Generating descriptions of entity relationships

Large-scale knowledge graphs (KGs) store relationships between entities that are increasingly being used to improve the user experience in search applications. The structured data in KGs is typically not suitable to show to an end user, so applications that utilize KGs benefit from human-readable textual descriptions of KG relationships. We present a method that automatically generates textual descriptions of entity relationships by combining textual and KG information. Our method creates sentence templates for a particular relationship and then generates a textual description of a relationship instance by selecting the best template and filling it with appropriate entities. Experimental results show that a supervised variation of our method outperforms other variations because it best captures the semantic similarity between a relationship instance and a template while providing more contextual information.
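
As a rough illustration of the "select the best template, then fill it" idea described above, here is a minimal Python sketch. It is not the method from the paper: a plain word-overlap (Jaccard) score stands in for the supervised ranking mentioned in the abstract, and the `Template` class, the relationship labels, and the example sentences are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Template:
    relationship: str  # hypothetical KG relationship label
    text: str          # sentence with {subject} and {object} slots


def tokens(text: str) -> set[str]:
    """Lowercased words, ignoring slot placeholders and trailing punctuation."""
    return {t.strip(".,").lower() for t in text.split() if not t.startswith("{")}


def score(template: Template, context: str) -> float:
    """Jaccard overlap between template words and context words (stand-in scorer)."""
    a, b = tokens(template.text), tokens(context)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def describe(subject: str, relationship: str, obj: str,
             context: str, templates: list[Template]) -> str | None:
    """Pick the best-scoring template for the relationship and fill its slots."""
    candidates = [t for t in templates if t.relationship == relationship]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: score(t, context))
    return best.text.format(subject=subject, object=obj)


# Hypothetical templates, for illustration only.
TEMPLATES = [
    Template("spouse", "{subject} is married to {object}."),
    Template("spouse", "{subject} and {object} wed in a private ceremony."),
    Template("starred_in", "{subject} appeared in the film {object}."),
]

if __name__ == "__main__":
    print(describe("A", "spouse", "B",
                   "the couple wed in a private ceremony", TEMPLATES))
```

Here `context` plays the role of whatever textual evidence is available for the relationship instance; a learned ranker would replace `score` without changing the surrounding flow.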

  • [PDF] N. Voskarides, E. Meij, and M. de Rijke, “Generating descriptions of entity relationships,” in ECIR 2017: 39th European Conference on Information Retrieval, 2017.
    [Bibtex]
    @inproceedings{ECIR:2017:voskarides,
    Author = {Voskarides, Nikos and Meij, Edgar and de Rijke, Maarten},
    Booktitle = {ECIR 2017: 39th European Conference on Information Retrieval},
    Date-Added = {2017-01-10 21:27:37 +0000},
    Date-Modified = {2017-01-10 21:27:58 +0000},
    Month = {April},
    Publisher = {Springer},
    Series = {LNCS},
    Title = {Generating descriptions of entity relationships},
    Year = {2017}}

Utilizing Knowledge Bases in Text-centric Information Retrieval (WSDM 2017)

The past decade has witnessed the emergence of several publicly available and proprietary knowledge graphs (KGs). The increasing depth and breadth of content in KGs make them not only rich sources of structured knowledge by themselves but also valuable resources for search systems. A surge of recent developments in entity linking and retrieval methods gave rise to a new line of research that aims at utilizing KGs for text-centric retrieval applications. This makes it an ideal time to pause and report current findings to the community, summarize successful approaches, and solicit new ideas. This tutorial is the first to disseminate the progress in this emerging field to researchers and practitioners.

Utilizing Knowledge Bases in Text-centric Information Retrieval (ICTIR 2016)

General-purpose knowledge bases are growing in terms of depth (content) and width (coverage). Moreover, algorithms for entity linking and entity retrieval have improved tremendously in recent years. These developments give rise to a new line of research that exploits and combines them for text-centric information retrieval applications. This tutorial focuses on (a) how to retrieve a set of entities for an ad-hoc query or, more broadly, how to assess the relevance of KB elements for the information need, (b) how to annotate text with such elements, and (c) how to use this information to assess the relevance of text. We discuss different kinds of information available in a knowledge graph and how to leverage each most effectively.
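
To make the annotation and relevance steps above concrete, the following Python sketch annotates text with entities using a greedy longest-match dictionary lookup and then scores a document by the share of query entities it mentions. This is only one simple baseline and not necessarily an approach covered in the tutorial; the alias table `SURFACE_FORMS`, the entity identifiers, and the overlap score are assumptions made for illustration.

```python
# Hypothetical alias table: surface form -> entity identifier.
SURFACE_FORMS = {
    "new york": "E:New_York",
    "new york times": "E:New_York_Times",
    "obama": "E:Obama",
}
MAX_SPAN = 3  # longest alias length in words


def annotate(text):
    """Greedy longest-match entity annotation over whitespace tokens."""
    words = text.lower().split()
    entities, i = [], 0
    while i < len(words):
        # Try the longest span first so "new york times" beats "new york".
        for n in range(min(MAX_SPAN, len(words) - i), 0, -1):
            span = " ".join(words[i:i + n])
            if span in SURFACE_FORMS:
                entities.append(SURFACE_FORMS[span])
                i += n
                break
        else:
            i += 1  # no alias starts at this word
    return entities


def entity_overlap(query, document):
    """Fraction of query entities that also occur in the document."""
    q, d = set(annotate(query)), set(annotate(document))
    return len(q & d) / len(q) if q else 0.0


if __name__ == "__main__":
    doc = "Obama visited the New York Times"
    print(annotate(doc))
    print(entity_overlap("obama new york", doc))
```

In practice such an entity-based score would typically be combined with a term-based retrieval score rather than used on its own.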

Dynamic Collective Entity Representations for Entity Ranking

Entity ranking, i.e., successfully positioning a relevant entity at the top of the ranking for a given query, is inherently difficult due to the potential mismatch between the entity’s description in a knowledge base and the way people refer to the entity when searching for it. To counter this issue, we propose a method for constructing dynamic collective entity representations. We collect entity descriptions from a variety of sources and combine them into a single entity representation by learning to weight the content from the different sources associated with an entity for optimal retrieval effectiveness. Our method is able to add new descriptions in real time and learn the best representation as time evolves, so as to capture the dynamics of how people search for entities. Incorporating dynamic description sources into dynamic collective entity representations improves retrieval effectiveness by 7% over a state-of-the-art learning-to-rank baseline. Periodic retraining of the ranker enables higher ranking effectiveness for dynamic collective entity representations.
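
The idea of combining descriptions from several sources with per-source weights can be sketched roughly as follows. This is a simplified illustration rather than the paper's model: the weights here are fixed by hand, whereas the abstract describes learning them, and the class name, source names, and term-frequency scoring are hypothetical choices.

```python
from collections import Counter, defaultdict


class EntityRepresentation:
    """One entity, described by term counts kept separately per source."""

    def __init__(self):
        self.fields = defaultdict(Counter)  # source name -> term counts

    def add_description(self, source, text):
        """Fold a new description into the field for its source."""
        self.fields[source].update(text.lower().split())

    def score(self, query, weights):
        """Weighted sum of length-normalized query term frequencies per source."""
        terms = query.lower().split()
        total = 0.0
        for source, counts in self.fields.items():
            length = sum(counts.values())
            if length == 0:
                continue
            tf = sum(counts[t] for t in terms) / length
            total += weights.get(source, 0.0) * tf
        return total


if __name__ == "__main__":
    # Hand-set weights; a learned ranker would supply these instead.
    weights = {"kb": 1.0, "tweets": 0.5}
    entity = EntityRepresentation()
    entity.add_description("kb", "international space station orbital laboratory")
    print(entity.score("iss", weights))   # abbreviation absent from the KB text
    entity.add_description("tweets", "iss flyover tonight")
    print(entity.score("iss", weights))   # new source now matches the query
```

The example shows the vocabulary-mismatch point from the abstract: the query term only matches once a description from a second source has been added.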

  • [PDF] D. Graus, M. Tsagkias, W. Weerkamp, E. Meij, and M. de Rijke, “Dynamic collective entity representations for entity ranking,” in Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016.
    [Bibtex]
    @inproceedings{WSDM:2016:Graus,
    Author = {Graus, David and Tsagkias, Manos and Weerkamp, Wouter and Meij, Edgar and de Rijke, Maarten},
    Booktitle = {Proceedings of the ninth ACM international conference on Web search and data mining},
    Date-Added = {2016-01-07 17:24:16 +0000},
    Date-Modified = {2016-01-07 17:25:55 +0000},
    Series = {WSDM 2016},
    Title = {Dynamic Collective Entity Representations for Entity Ranking},
    Year = {2016},
    Bdsk-Url-1 = {http://aclweb.org/anthology/P15-1055}}

Mining, ranking and recommending entity aspects

Entity queries constitute a large fraction of web search queries and most of these queries are in the form of an entity mention plus some context terms that represent an intent in the context of that entity. We refer to these entity-oriented search intents as entity aspects. Recognizing entity aspects in a query can improve various search applications such as providing direct answers, diversifying search results, and recommending queries. In this paper we focus on the tasks of identifying, ranking, and recommending entity aspects, and propose an approach that mines, clusters, and ranks such aspects from query logs.