Document (#40930)

Author
Yim, W.-w.
Kwan, S.W.
Yetisgen, M.
Title
Classifying tumor event attributes in radiology reports
Source
Journal of the Association for Information Science and Technology. 68(2017) no.11, p.2662-2674
Year
2017
Abstract
Radiology reports contain vital diagnostic information that characterizes patient disease progression. However, information from reports is represented in free text, which is difficult to query for secondary use. Automatic extraction of important information, such as tumor events, using natural language processing offers possibilities for improved clinical decision support, cohort identification, and retrospective evidence-based research for cancer patients. The goal of this work was to classify tumor event attributes: negation, temporality, and malignancy, using biomedical ontology and linguistically enriched features. We report our results on an annotated corpus of 101 hepatocellular carcinoma patient radiology reports, and show that the improved classification improves overall template structuring. Classification performances for negation identification, past temporality classification, and malignancy classification were 0.94, 0.62, and 0.77 F1, respectively. Incorporating the attributes into full templates raised performance for tumor-related events to 0.72 F1 from a baseline of 0.65 F1. Improvement of negation, malignancy, and temporality classifications led to significant improvements in template extraction for the majority of categories. We present our machine-learning approach to identifying these tumor event attributes from radiology reports, and highlight challenges and areas for improvement.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23937/full.
Footnote
Contribution to a special issue on biomedical information retrieval.
Field
Medicine

Similar documents (content)

  1. Pluye, P.; Grad, R.; Repchinsky, C.; Jovaisas, B.; Johnson-Lafleur, J.; Carrier, M.-E.; Granikov, V.; Farrell, B.; Rodriguez, C.; Bartlett, G.; Loiselle, C.; Légaré, F.: Four levels of outcomes of information-seeking : a mixed methods study in primary health care (2013) 0.13
    0.13197038 = sum of:
      0.13197038 = product of:
        0.5498766 = sum of:
          0.09671318 = weight(abstract_txt:clinical in 1534) [ClassicSimilarity], result of:
            0.09671318 = score(doc=1534,freq=5.0), product of:
              0.095335536 = queryWeight, product of:
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.013133758 = queryNorm
              1.0144504 = fieldWeight in 1534, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.05187833 = weight(abstract_txt:disease in 1534) [ClassicSimilarity], result of:
            0.05187833 = score(doc=1534,freq=1.0), product of:
              0.10762464 = queryWeight, product of:
                1.0624988 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.013133758 = queryNorm
              0.4820302 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.05344692 = weight(abstract_txt:patients in 1534) [ClassicSimilarity], result of:
            0.05344692 = score(doc=1534,freq=1.0), product of:
              0.10978327 = queryWeight, product of:
                1.0731012 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.013133758 = queryNorm
              0.48684028 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.21436432 = weight(abstract_txt:patient in 1534) [ClassicSimilarity], result of:
            0.21436432 = score(doc=1534,freq=5.0), product of:
              0.20419388 = queryWeight, product of:
                2.0697074 = boost
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.013133758 = queryNorm
              1.0498078 = fieldWeight in 1534, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.078855805 = weight(abstract_txt:improvement in 1534) [ClassicSimilarity], result of:
            0.078855805 = score(doc=1534,freq=1.0), product of:
              0.2052031 = queryWeight, product of:
                2.54112 = boost
                6.148508 = idf(docFreq=257, maxDocs=44421)
                0.013133758 = queryNorm
              0.38428175 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.148508 = idf(docFreq=257, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
          0.05461801 = weight(abstract_txt:reports in 1534) [ClassicSimilarity], result of:
            0.05461801 = score(doc=1534,freq=1.0), product of:
              0.19045906 = queryWeight, product of:
                3.1605191 = boost
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.013133758 = queryNorm
              0.28677034 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.0625 = fieldNorm(doc=1534)
        0.24 = coord(6/25)
    
  2. Lomax, E.C.; Lowe, H.J.; Logan, T.F.; Detlefsen, E.G.: An investigation of the information seeking behavior of medical oncologists in Metropolitan Pittsburgh using a multi-method approach (1999) 0.13
    0.12626529 = sum of:
      0.12626529 = product of:
        0.5261054 = sum of:
          0.061166782 = weight(abstract_txt:clinical in 1289) [ClassicSimilarity], result of:
            0.061166782 = score(doc=1289,freq=2.0), product of:
              0.095335536 = queryWeight, product of:
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.013133758 = queryNorm
              0.64159477 = fieldWeight in 1289, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
          0.05187833 = weight(abstract_txt:disease in 1289) [ClassicSimilarity], result of:
            0.05187833 = score(doc=1289,freq=1.0), product of:
              0.10762464 = queryWeight, product of:
                1.0624988 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.013133758 = queryNorm
              0.4820302 = fieldWeight in 1289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
          0.05344692 = weight(abstract_txt:patients in 1289) [ClassicSimilarity], result of:
            0.05344692 = score(doc=1289,freq=1.0), product of:
              0.10978327 = queryWeight, product of:
                1.0731012 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.013133758 = queryNorm
              0.48684028 = fieldWeight in 1289, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
          0.09145236 = weight(abstract_txt:diagnostic in 1289) [ClassicSimilarity], result of:
            0.09145236 = score(doc=1289,freq=2.0), product of:
              0.12465441 = queryWeight, product of:
                1.1434743 = boost
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.013133758 = queryNorm
              0.73364717 = fieldWeight in 1289, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
          0.13258512 = weight(abstract_txt:cancer in 1289) [ClassicSimilarity], result of:
            0.13258512 = score(doc=1289,freq=4.0), product of:
              0.1267353 = queryWeight, product of:
                1.152979 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.013133758 = queryNorm
              1.0461578 = fieldWeight in 1289, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
          0.13557589 = weight(abstract_txt:patient in 1289) [ClassicSimilarity], result of:
            0.13557589 = score(doc=1289,freq=2.0), product of:
              0.20419388 = queryWeight, product of:
                2.0697074 = boost
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.013133758 = queryNorm
              0.6639567 = fieldWeight in 1289, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.0625 = fieldNorm(doc=1289)
        0.24 = coord(6/25)
    
  3. Bean, C.A.: Representation of medical knowledge for automated semantic interpretation of clinical reports (2004) 0.09
    0.09448501 = sum of:
      0.09448501 = product of:
        0.5905313 = sum of:
          0.05406431 = weight(abstract_txt:clinical in 3660) [ClassicSimilarity], result of:
            0.05406431 = score(doc=3660,freq=1.0), product of:
              0.095335536 = queryWeight, product of:
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.013133758 = queryNorm
              0.56709504 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.078125 = fieldNorm(doc=3660)
          0.18341759 = weight(abstract_txt:disease in 3660) [ClassicSimilarity], result of:
            0.18341759 = score(doc=3660,freq=8.0), product of:
              0.10762464 = queryWeight, product of:
                1.0624988 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.013133758 = queryNorm
              1.7042341 = fieldWeight in 3660, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.078125 = fieldNorm(doc=3660)
          0.06827251 = weight(abstract_txt:reports in 3660) [ClassicSimilarity], result of:
            0.06827251 = score(doc=3660,freq=1.0), product of:
              0.19045906 = queryWeight, product of:
                3.1605191 = boost
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.013133758 = queryNorm
              0.35846293 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.078125 = fieldNorm(doc=3660)
          0.2847769 = weight(abstract_txt:negation in 3660) [ClassicSimilarity], result of:
            0.2847769 = score(doc=3660,freq=1.0), product of:
              0.41625357 = queryWeight, product of:
                3.619197 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.013133758 = queryNorm
              0.6841428 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.078125 = fieldNorm(doc=3660)
        0.16 = coord(4/25)
    
  4. Roberts, A.: The Standard Generalized Markup Language for electronic patient records (1998) 0.09
    0.09329193 = sum of:
      0.09329193 = product of:
        0.46645963 = sum of:
          0.05406431 = weight(abstract_txt:clinical in 4625) [ClassicSimilarity], result of:
            0.05406431 = score(doc=4625,freq=1.0), product of:
              0.095335536 = queryWeight, product of:
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.013133758 = queryNorm
              0.56709504 = fieldWeight in 4625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.078125 = fieldNorm(doc=4625)
          0.06680865 = weight(abstract_txt:patients in 4625) [ClassicSimilarity], result of:
            0.06680865 = score(doc=4625,freq=1.0), product of:
              0.10978327 = queryWeight, product of:
                1.0731012 = boost
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.013133758 = queryNorm
              0.60855037 = fieldWeight in 4625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7894444 = idf(docFreq=49, maxDocs=44421)
                0.078125 = fieldNorm(doc=4625)
          0.06975682 = weight(abstract_txt:events in 4625) [ClassicSimilarity], result of:
            0.06975682 = score(doc=4625,freq=1.0), product of:
              0.1423581 = queryWeight, product of:
                1.728139 = boost
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.013133758 = queryNorm
              0.49000952 = fieldWeight in 4625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.272122 = idf(docFreq=227, maxDocs=44421)
                0.078125 = fieldNorm(doc=4625)
          0.20755735 = weight(abstract_txt:patient in 4625) [ClassicSimilarity], result of:
            0.20755735 = score(doc=4625,freq=3.0), product of:
              0.20419388 = queryWeight, product of:
                2.0697074 = boost
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.013133758 = queryNorm
              1.016472 = fieldWeight in 4625, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5118127 = idf(docFreq=65, maxDocs=44421)
                0.078125 = fieldNorm(doc=4625)
          0.06827251 = weight(abstract_txt:reports in 4625) [ClassicSimilarity], result of:
            0.06827251 = score(doc=4625,freq=1.0), product of:
              0.19045906 = queryWeight, product of:
                3.1605191 = boost
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.013133758 = queryNorm
              0.35846293 = fieldWeight in 4625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.078125 = fieldNorm(doc=4625)
        0.2 = coord(5/25)
    
  5. Cruz Díaz, N.P.; Maña López, M.J.; Mata Vázquez, J.; Pachón Álvarez, V.: A machine-learning approach to negation and speculation detection in clinical texts (2012) 0.08
    0.08250938 = sum of:
      0.08250938 = product of:
        0.6875782 = sum of:
          0.0749137 = weight(abstract_txt:clinical in 1283) [ClassicSimilarity], result of:
            0.0749137 = score(doc=1283,freq=3.0), product of:
              0.095335536 = queryWeight, product of:
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.013133758 = queryNorm
              0.7857899 = fieldWeight in 1283, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.05461801 = weight(abstract_txt:reports in 1283) [ClassicSimilarity], result of:
            0.05461801 = score(doc=1283,freq=1.0), product of:
              0.19045906 = queryWeight, product of:
                3.1605191 = boost
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.013133758 = queryNorm
              0.28677034 = fieldWeight in 1283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5883255 = idf(docFreq=1227, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
          0.5580465 = weight(abstract_txt:negation in 1283) [ClassicSimilarity], result of:
            0.5580465 = score(doc=1283,freq=6.0), product of:
              0.41625357 = queryWeight, product of:
                3.619197 = boost
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.013133758 = queryNorm
              1.3406408 = fieldWeight in 1283, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.757029 = idf(docFreq=18, maxDocs=44421)
                0.0625 = fieldNorm(doc=1283)
        0.12 = coord(3/25)
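
Note on the score breakdowns above: each similarity value can be recomputed from the listed factors. The sketch below is illustrative only and assumes the classic Lucene TF-IDF definitions, tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures shown. The function names (idf, term_score, document_score) are hypothetical and are not part of the catalog software. The input values are copied from the breakdown for entry 1 (doc 1534).

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency, assuming the classic Lucene formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm               # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm     # fieldWeight
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    """Sum of the term scores, scaled by coord(matched / total)."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


# Values copied from the breakdown for entry 1 (doc 1534).
MAX_DOCS = 44421
QUERY_NORM = 0.013133758
FIELD_NORM = 0.0625

# (termFreq, docFreq, boost) per matching term; boost is 1.0 where none is listed.
TERMS_DOC_1534 = [
    (5.0, 84, 1.0),          # clinical
    (1.0, 53, 1.0624988),    # disease
    (1.0, 49, 1.0731012),    # patients
    (5.0, 65, 2.0697074),    # patient
    (1.0, 257, 2.54112),     # improvement
    (1.0, 1227, 3.1605191),  # reports
]

scores_1534 = [
    term_score(freq, doc_freq, MAX_DOCS, FIELD_NORM, QUERY_NORM, boost)
    for freq, doc_freq, boost in TERMS_DOC_1534
]

# 6 of the 25 query terms matched, hence coord(6/25) in the listing.
total_1534 = document_score(scores_1534, 25)
```

Under these assumptions, total_1534 should agree with the 0.13197038 listed for entry 1 up to floating-point rounding. The other entries can be checked the same way by substituting their termFreq, docFreq, boost, fieldNorm, and coord values.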