Document (#40958)

Author
Soldaini, L.
Yates, A.
Goharian, N.
Title
Learning to reformulate long queries for clinical decision support
Source
Journal of the Association for Information Science and Technology. 68(2017) no.11, pp. 2602-2619
Year
2017
Abstract
The large volume of biomedical literature poses a serious problem for medical professionals, who often struggle to keep current with it. At the same time, many health providers consider knowledge of the latest literature in their field a key component of successful clinical practice. In this work, we introduce two systems designed to help retrieve medical literature. Both receive a long, discursive clinical note as an input query and return highly relevant literature that could be used in support of clinical practice. The first system is an improved version of a method previously proposed by the authors; it combines pseudo relevance feedback and a domain-specific term filter to reformulate the query. The second is an approach that uses a deep neural network to reformulate a clinical note. Both approaches were evaluated on the 2014 and 2015 TREC CDS datasets; in our tests, they outperform the previously proposed method by up to 28% in inferred NDCG; furthermore, they are competitive with the state of the art, achieving up to 8% improvement in inferred NDCG.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23924/full.
Footnote
Contribution to a special issue on biomedical information retrieval.
Field
Medicine

Similar documents (author)

  1. Cohan, A.; Young, S.; Yates, A.; Goharian, N.: Triaging content severity in online mental health forums (2017) 4.06
    4.0613146 = sum of:
      4.0613146 = sum of:
        1.745625 = weight(author_txt:yates in 4930) [ClassicSimilarity], result of:
          1.745625 = score(doc=4930,freq=1.0), product of:
            0.63788766 = queryWeight, product of:
              8.757029 = idf(docFreq=18, maxDocs=44421)
              0.07284293 = queryNorm
            2.7365713 = fieldWeight in 4930, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.757029 = idf(docFreq=18, maxDocs=44421)
              0.3125 = fieldNorm(doc=4930)
        2.3156893 = weight(author_txt:goharian in 4930) [ClassicSimilarity], result of:
          2.3156893 = score(doc=4930,freq=1.0), product of:
            0.7701295 = queryWeight, product of:
              1.0987775 = boost
              9.622026 = idf(docFreq=7, maxDocs=44421)
              0.07284293 = queryNorm
            3.0068831 = fieldWeight in 4930, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.622026 = idf(docFreq=7, maxDocs=44421)
              0.3125 = fieldNorm(doc=4930)
    
  2. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 1.85
    1.8525516 = sum of:
      1.8525516 = product of:
        3.7051032 = sum of:
          3.7051032 = weight(author_txt:goharian in 3765) [ClassicSimilarity], result of:
            3.7051032 = score(doc=3765,freq=1.0), product of:
              0.7701295 = queryWeight, product of:
                1.0987775 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07284293 = queryNorm
              4.811013 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.5 = fieldNorm(doc=3765)
        0.5 = coord(1/2)
    
  3. Mengle, S.S.R.; Goharian, N.: Ambiguity measure feature-selection algorithm (2009) 1.85
    1.8525516 = sum of:
      1.8525516 = product of:
        3.7051032 = sum of:
          3.7051032 = weight(author_txt:goharian in 3804) [ClassicSimilarity], result of:
            3.7051032 = score(doc=3804,freq=1.0), product of:
              0.7701295 = queryWeight, product of:
                1.0987775 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07284293 = queryNorm
              4.811013 = fieldWeight in 3804, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.5 = fieldNorm(doc=3804)
        0.5 = coord(1/2)
    
  4. Mengle, S.S.R.; Goharian, N.: Detecting relationships among categories using text classification (2010) 1.85
    1.8525516 = sum of:
      1.8525516 = product of:
        3.7051032 = sum of:
          3.7051032 = weight(author_txt:goharian in 449) [ClassicSimilarity], result of:
            3.7051032 = score(doc=449,freq=1.0), product of:
              0.7701295 = queryWeight, product of:
                1.0987775 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07284293 = queryNorm
              4.811013 = fieldWeight in 449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.5 = fieldNorm(doc=449)
        0.5 = coord(1/2)
    
  5. Baeza-Yates, R.A.: Introduction to data structures and algorithms related to information retrieval (1992) 1.40
    1.3965001 = sum of:
      1.3965001 = product of:
        2.7930002 = sum of:
          2.7930002 = weight(author_txt:yates in 4082) [ClassicSimilarity], result of:
            2.7930002 = score(doc=4082,freq=1.0), product of:
              0.63788766 = queryWeight, product of:
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.07284293 = queryNorm
              4.3785143 = fieldWeight in 4082, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.5 = fieldNorm(doc=4082)
        0.5 = coord(1/2)
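
The author-match scores above can be recomputed by hand. The following is a minimal sketch, assuming the standard formulas of Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), score = queryWeight * fieldWeight); these appear to reproduce the numbers in the listing. The function names are illustrative only and are not part of Lucene's API. Small differences in the last digits are expected because Lucene works in 32-bit floats.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm
    idf_value = idf(doc_freq, max_docs)
    query_weight = boost * idf_value * query_norm
    field_weight = tf(freq) * idf_value * field_norm
    return query_weight * field_weight

MAX_DOCS = 44421
QUERY_NORM = 0.07284293

# Entry 1 (doc 4930): both author terms match, so the two weights are summed.
yates = term_score(1.0, 18, MAX_DOCS, QUERY_NORM, 0.3125)          # listing: 1.745625
goharian = term_score(1.0, 7, MAX_DOCS, QUERY_NORM, 0.3125,
                      boost=1.0987775)                             # listing: 2.3156893
entry1 = yates + goharian                                          # listing: 4.0613146

# Entry 2 (doc 3765): only one of the two author terms matches,
# so the term weight is scaled by coord(1/2).
goharian_3765 = term_score(1.0, 7, MAX_DOCS, QUERY_NORM, 0.5,
                           boost=1.0987775)                        # listing: 3.7051032
entry2 = goharian_3765 * (1 / 2)                                   # listing: 1.8525516
```

Entry 1 ranks first because it matches both author terms and therefore incurs no coord penalty, even though its fieldNorm (0.3125) is lower than that of the single-term matches (0.5).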
    

Similar documents (content)

  1. Pluye, P.; Grad, R.; Repchinsky, C.; Jovaisas, B.; Johnson-Lafleur, J.; Carrier, M.-E.; Granikov, V.; Farrell, B.; Rodriguez, C.; Bartlett, G.; Loiselle, C.; Légaré, F.: Four levels of outcomes of information-seeking : a mixed methods study in primary health care (2013) 0.13
    0.13087037 = sum of:
      0.13087037 = product of:
        0.65435183 = sum of:
          0.018465912 = weight(abstract_txt:they in 1534) [ClassicSimilarity], result of:
            0.018465912 = score(doc=1534,freq=2.0), product of:
              0.055744193 = queryWeight, product of:
                1.0517541 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.014141949 = queryNorm
              0.33126163 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.022584958 = weight(abstract_txt:method in 1534) [ClassicSimilarity], result of:
            0.022584958 = score(doc=1534,freq=1.0), product of:
              0.080323376 = queryWeight, product of:
                1.2625116 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.014141949 = queryNorm
              0.2811754 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.03432398 = weight(abstract_txt:proposed in 1534) [ClassicSimilarity], result of:
            0.03432398 = score(doc=1534,freq=2.0), product of:
              0.08427217 = queryWeight, product of:
                1.2931726 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014141949 = queryNorm
              0.4072991 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.048642535 = weight(abstract_txt:medical in 1534) [ClassicSimilarity], result of:
            0.048642535 = score(doc=1534,freq=1.0), product of:
              0.13395941 = queryWeight, product of:
                1.6304257 = boost
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.014141949 = queryNorm
              0.36311397 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.5303344 = weight(abstract_txt:clinical in 1534) [ClassicSimilarity], result of:
            0.5303344 = score(doc=1534,freq=5.0), product of:
              0.52278 = queryWeight, product of:
                5.092651 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.014141949 = queryNorm
              1.0144504 = fieldWeight in 1534, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
        0.2 = coord(5/25)
    
  2. Grad, R.; Pluye, P.; Granikov, V.; Johnson-Lafleur, J.; Shulha, M.; Sridhar, S.B.; Moscovici, J.L.; Bartlett, G.; Vandal, A.C.; Marlow, B.; Kloda, L.: Physicians' assessment of the value of clinical information : Operationalization of a theoretical model (2011) 0.12
    0.122339636 = sum of:
      0.122339636 = product of:
        0.76462275 = sum of:
          0.020743154 = weight(abstract_txt:support in 763) [ClassicSimilarity], result of:
            0.020743154 = score(doc=763,freq=1.0), product of:
              0.07589485 = queryWeight, product of:
                1.2272147 = boost
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.014141949 = queryNorm
              0.2733144 = fieldWeight in 763, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.37303 = idf(docFreq=1522, maxDocs=44421)
                0.0625 = fieldNorm(doc=763)
          0.045169916 = weight(abstract_txt:method in 763) [ClassicSimilarity], result of:
            0.045169916 = score(doc=763,freq=4.0), product of:
              0.080323376 = queryWeight, product of:
                1.2625116 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.014141949 = queryNorm
              0.5623508 = fieldWeight in 763, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=763)
          0.027883856 = weight(abstract_txt:practice in 763) [ClassicSimilarity], result of:
            0.027883856 = score(doc=763,freq=1.0), product of:
              0.09244093 = queryWeight, product of:
                1.3543988 = boost
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.014141949 = queryNorm
              0.3016397 = fieldWeight in 763, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.0625 = fieldNorm(doc=763)
          0.67082584 = weight(abstract_txt:clinical in 763) [ClassicSimilarity], result of:
            0.67082584 = score(doc=763,freq=8.0), product of:
              0.52278 = queryWeight, product of:
                5.092651 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.014141949 = queryNorm
              1.2831895 = fieldWeight in 763, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=763)
        0.16 = coord(4/25)
    
  3. Cruz Díaz, N.P.; Maña López, M.J.; Mata Vázquez, J.; Pachón Álvarez, V.: A machine-learning approach to negation and speculation detection in clinical texts (2012) 0.12
    0.11718626 = sum of:
      0.11718626 = product of:
        0.5859313 = sum of:
          0.044892434 = weight(abstract_txt:biomedical in 1283) [ClassicSimilarity], result of:
            0.044892434 = score(doc=1283,freq=1.0), product of:
              0.10078623 = queryWeight, product of:
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.014141949 = queryNorm
              0.4454223 = fieldWeight in 1283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1267567 = idf(docFreq=96, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.024270717 = weight(abstract_txt:proposed in 1283) [ClassicSimilarity], result of:
            0.024270717 = score(doc=1283,freq=1.0), product of:
              0.08427217 = queryWeight, product of:
                1.2931726 = boost
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.014141949 = queryNorm
              0.28800395 = fieldWeight in 1283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.608063 = idf(docFreq=1203, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.048642535 = weight(abstract_txt:medical in 1283) [ClassicSimilarity], result of:
            0.048642535 = score(doc=1283,freq=1.0), product of:
              0.13395941 = queryWeight, product of:
                1.6304257 = boost
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.014141949 = queryNorm
              0.36311397 = fieldWeight in 1283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.057330336 = weight(abstract_txt:previously in 1283) [ClassicSimilarity], result of:
            0.057330336 = score(doc=1283,freq=1.0), product of:
              0.14946933 = queryWeight, product of:
                1.7222272 = boost
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.014141949 = queryNorm
              0.3835592 = fieldWeight in 1283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.41079524 = weight(abstract_txt:clinical in 1283) [ClassicSimilarity], result of:
            0.41079524 = score(doc=1283,freq=3.0), product of:
              0.52278 = queryWeight, product of:
                5.092651 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.014141949 = queryNorm
              0.7857899 = fieldWeight in 1283, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
        0.2 = coord(5/25)
    
  4. Thelwall, M.; Maflahi, N.: Guideline references and academic citations as evidence of the clinical value of health research (2016) 0.11
    0.11257464 = sum of:
      0.11257464 = product of:
        0.5628732 = sum of:
          0.0130573725 = weight(abstract_txt:they in 3856) [ClassicSimilarity], result of:
            0.0130573725 = score(doc=3856,freq=1.0), product of:
              0.055744193 = queryWeight, product of:
                1.0517541 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.014141949 = queryNorm
              0.23423736 = fieldWeight in 3856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0625 = fieldNorm(doc=3856)
          0.027883856 = weight(abstract_txt:practice in 3856) [ClassicSimilarity], result of:
            0.027883856 = score(doc=3856,freq=1.0), product of:
              0.09244093 = queryWeight, product of:
                1.3543988 = boost
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.014141949 = queryNorm
              0.3016397 = fieldWeight in 3856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.0625 = fieldNorm(doc=3856)
          0.068790935 = weight(abstract_txt:medical in 3856) [ClassicSimilarity], result of:
            0.068790935 = score(doc=3856,freq=2.0), product of:
              0.13395941 = queryWeight, product of:
                1.6304257 = boost
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.014141949 = queryNorm
              0.5135207 = fieldWeight in 3856, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8098235 = idf(docFreq=361, maxDocs=44421)
                0.0625 = fieldNorm(doc=3856)
          0.042345762 = weight(abstract_txt:literature in 3856) [ClassicSimilarity], result of:
            0.042345762 = score(doc=3856,freq=1.0), product of:
              0.1538789 = queryWeight, product of:
                2.471263 = boost
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.014141949 = queryNorm
              0.2751889 = fieldWeight in 3856, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.0625 = fieldNorm(doc=3856)
          0.41079524 = weight(abstract_txt:clinical in 3856) [ClassicSimilarity], result of:
            0.41079524 = score(doc=3856,freq=3.0), product of:
              0.52278 = queryWeight, product of:
                5.092651 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.014141949 = queryNorm
              0.7857899 = fieldWeight in 3856, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=3856)
        0.2 = coord(5/25)
    
  5. Aloteibi, S.; Sanderson, M.: Analyzing geographic query reformulation : an exploratory study (2014) 0.11
    0.10721326 = sum of:
      0.10721326 = product of:
        0.5360663 = sum of:
          0.06741123 = weight(abstract_txt:filter in 2177) [ClassicSimilarity], result of:
            0.06741123 = score(doc=2177,freq=2.0), product of:
              0.10489721 = queryWeight, product of:
                1.0201907 = boost
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.014141949 = queryNorm
              0.6426408 = fieldWeight in 2177, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.0625 = fieldNorm(doc=2177)
          0.0130573725 = weight(abstract_txt:they in 2177) [ClassicSimilarity], result of:
            0.0130573725 = score(doc=2177,freq=1.0), product of:
              0.055744193 = queryWeight, product of:
                1.0517541 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.014141949 = queryNorm
              0.23423736 = fieldWeight in 2177, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0625 = fieldNorm(doc=2177)
          0.053317487 = weight(abstract_txt:query in 2177) [ClassicSimilarity], result of:
            0.053317487 = score(doc=2177,freq=4.0), product of:
              0.08971304 = queryWeight, product of:
                1.3342652 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.014141949 = queryNorm
              0.5943115 = fieldWeight in 2177, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=2177)
          0.042345762 = weight(abstract_txt:literature in 2177) [ClassicSimilarity], result of:
            0.042345762 = score(doc=2177,freq=1.0), product of:
              0.1538789 = queryWeight, product of:
                2.471263 = boost
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.014141949 = queryNorm
              0.2751889 = fieldWeight in 2177, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4030223 = idf(docFreq=1477, maxDocs=44421)
                0.0625 = fieldNorm(doc=2177)
          0.35993445 = weight(abstract_txt:reformulate in 2177) [ClassicSimilarity], result of:
            0.35993445 = score(doc=2177,freq=2.0), product of:
              0.4621665 = queryWeight, product of:
                3.7090209 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.014141949 = queryNorm
              0.7787982 = fieldWeight in 2177, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=2177)
        0.2 = coord(5/25)
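
The effect of the coord factor on the content-based ranking can be checked with a few lines of arithmetic. This is a minimal sketch that takes the per-term scores for entries 1 and 2 directly from the listing above and assumes, as the coord(n/25) lines suggest, that the query has 25 clauses; the variable and function names are illustrative.

```python
QUERY_CLAUSES = 25  # denominator shown in coord(5/25) and coord(4/25)

# Per-term scores copied from the explanation of entry 1 (doc 1534).
entry1_terms = {
    "they": 0.018465912,
    "method": 0.022584958,
    "proposed": 0.03432398,
    "medical": 0.048642535,
    "clinical": 0.5303344,
}

# Per-term scores copied from the explanation of entry 2 (doc 763).
entry2_terms = {
    "support": 0.020743154,
    "method": 0.045169916,
    "practice": 0.027883856,
    "clinical": 0.67082584,
}

def document_score(term_scores, query_clauses):
    # Sum the matching term scores, then scale by the fraction
    # of query clauses that matched (the coord factor).
    raw_sum = sum(term_scores.values())
    coord = len(term_scores) / query_clauses
    return raw_sum * coord

score1 = document_score(entry1_terms, QUERY_CLAUSES)  # listing: 0.13087037
score2 = document_score(entry2_terms, QUERY_CLAUSES)  # listing: 0.122339636
```

Entry 2 has the larger raw sum (0.76462275 versus 0.65435183), but it matches only four query terms instead of five, so its coord factor of 0.16 places it below entry 1.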