    Edgar Meij

    DutchHatTrick: Semantic query modeling, ConText, section detection, and match score maximization.

    This report discusses the collaborative work of Erasmus MC, the University of Twente, and the University of Amsterdam on the TREC 2011 Medical Records track. The task is to retrieve patient visits from the University of Pittsburgh NLP Repository for 35 topics. The repository consists of 101,711 patient reports, and each patient visit is recorded in one or more reports.

    Because the training set provided by the track organization was small and was not made available until quite late in the competition, we decided to create a small training set ourselves. This not only allowed us to test several ideas before submitting runs to TREC, but also led to several insights into the data. One finding was that synonyms are widely used. Query expansion was therefore deemed essential to achieve reasonable performance. Query expansion has been used before in Information Retrieval (IR), and is often divided into statistical and knowledge-based query expansion. Statistical query expansion uses data derived from the corpus itself; a well-known example is pseudo-relevance feedback. In contrast, we investigated knowledge-based query expansion, which uses a knowledge base such as an ontology or a dictionary to find related terms. This type of query expansion has not always proven successful. For instance, Hersh et al. found a decrease in overall search performance when using the Unified Medical Language System (UMLS) to find related terms. Liu et al. found slight improvements with scenario-specific expansion strategies using UMLS. In a previous TREC track, we also found reduced performance when using concept-based query expansion, but slightly improved results when using an approach that combines concepts with a statistical model of related words. Similarly, Zhou found promising results when using a combination of the original words in the text and the synonyms found for concepts in the text.
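The idea of knowledge-based query expansion that keeps the original words alongside synonyms can be sketched as follows. This is a minimal illustration, not the method used in the report: the `SYNONYMS` dictionary, its entries, and the `expand_query` function are hypothetical, and matching is done by simple substring lookup.

```python
from typing import Dict, List

# Hypothetical toy knowledge base mapping a phrase to related terms.
# The entries are illustrative only; they are not taken from UMLS or Wikipedia.
SYNONYMS: Dict[str, List[str]] = {
    "heart attack": ["myocardial infarction", "mi"],
    "upper endoscopy": ["upper gi endoscopy", "egd"],
}


def expand_query(query: str, kb: Dict[str, List[str]]) -> List[str]:
    """Return the lowercased original query followed by the related terms
    of every knowledge-base phrase that occurs in it (no duplicates)."""
    normalized = query.lower()
    expanded = [normalized]  # keep the original words
    for phrase, related in kb.items():
        if phrase in normalized:  # naive substring match
            for term in related:
                if term not in expanded:
                    expanded.append(term)
    return expanded


if __name__ == "__main__":
    print(expand_query("patients with heart attack", SYNONYMS))
```

A real system would match concepts more carefully than by substring and would typically weight the original terms higher than the added ones.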

    An often-used resource for knowledge-based query expansion in the biomedical domain is the UMLS. However, initial explorations indicated that there is only limited overlap between the terms used in topics and medical records and the terms found in the UMLS. The main reason appears to be that the UMLS is mainly constructed from vocabularies used for classifying clinical data and is not intended for text mining. Terms in the UMLS tend to be more specific than what a physician would use in free-text reporting. For instance, a physician might use the term ‘upper endoscopy’, but this term is not found in the UMLS. Instead, the term ‘upper GI endoscopy’ is found. We have therefore explored a different source of synonyms: Wikipedia. We expected Wikipedia to have better coverage of the terms encountered in medical records.
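The kind of overlap check described above can be sketched as a simple coverage measure: the fraction of terms seen in records that also appear in a vocabulary. This is only an assumed illustration; the `coverage` function and the two toy vocabularies are hypothetical stand-ins and do not reflect the actual contents of the UMLS or Wikipedia beyond the 'upper endoscopy' example given in the text.

```python
from typing import Iterable, Set


def coverage(terms: Iterable[str], vocabulary: Set[str]) -> float:
    """Fraction of distinct terms (case-insensitive) found in the vocabulary."""
    distinct = {t.lower().strip() for t in terms}
    if not distinct:
        return 0.0
    vocab = {v.lower().strip() for v in vocabulary}
    return len(distinct & vocab) / len(distinct)


# Hypothetical terms as they might appear in free-text records.
record_terms = ["upper endoscopy", "colonoscopy"]

# Toy stand-ins for two knowledge sources (assumed, not real extracts).
toy_umls = {"upper GI endoscopy", "colonoscopy"}
toy_wikipedia = {"upper endoscopy", "upper GI endoscopy", "colonoscopy"}

if __name__ == "__main__":
    print(coverage(record_terms, toy_umls))
    print(coverage(record_terms, toy_wikipedia))
```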

    • M. Schuemie, D. Trieschnigg, and E. Meij, “DutchHatTrick: semantic query modeling, ConText, section detection, and match score maximization,” in The Twentieth Text REtrieval Conference (TREC 2011), 2012.
      [Bibtex]
      @inproceedings{TREC:2011:schuemie,
      Author = {Schuemie, M. and Trieschnigg, Dolf and Meij, Edgar},
      Booktitle = {The Twentieth Text REtrieval Conference},
      Date-Added = {2011-10-22 12:14:30 +0200},
      Date-Modified = {2013-05-22 11:44:30 +0000},
      Month = {January},
      Series = {TREC 2011},
      Title = {{DutchHatTrick:} Semantic query modeling, {ConText}, section detection, and match score maximization},
      Year = {2012}}
    Posted by Edgar Meij / January 25, 2012 / Posted in Publications, Unrefereed
