Document (#28238)

Author
Wu, T.
Pottenger, W.M.
Title
A semi-supervised active learning algorithm for information extraction from textual data
Source
Journal of the American Society for Information Science and Technology. 56(2005) no.3, pp. 258-271
Year
2005
Abstract
In this article we present a semi-supervised active learning algorithm for pattern discovery in information extraction from textual data. The patterns are reduced regular expressions composed of various characteristics of features useful in information extraction. Our major contribution is a semi-supervised learning algorithm that extracts information from a set of examples labeled as relevant or irrelevant to a given attribute. The approach is semi-supervised because it does not require precise labeling of the exact location of features in the training data. This significantly reduces the effort needed to develop a training set. An active learning algorithm is used to assist the semi-supervised learning algorithm in further reducing the training set development effort. The active learning algorithm is seeded with a single positive example of a given attribute. The context of the seed is used to automatically identify candidates for additional positive examples of the given attribute. Candidate examples are manually pruned during the active learning phase, and our semi-supervised learning algorithm automatically discovers reduced regular expressions for each attribute. We have successfully applied this learning technique in the extraction of textual features from police incident reports, university crime reports, and patents. The performance of our algorithm compares favorably with competitive extraction systems being used in criminal justice information systems.
Footnote
Contribution to a special issue on 'Intelligence and security informatics'
Theme
Data Mining

Similar documents (content)

  1. Levin, M.; Krawczyk, S.; Bethard, S.; Jurafsky, D.: Citation-based bootstrapping for large-scale author disambiguation (2012) 0.35
    0.34978777 = sum of:
      0.34978777 = product of:
        0.97163266 = sum of:
          0.014226662 = weight(abstract_txt:used in 246) [ClassicSimilarity], result of:
            0.014226662 = score(doc=246,freq=2.0), product of:
              0.04791366 = queryWeight, product of:
                1.0989175 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.012979129 = queryNorm
              0.2969229 = fieldWeight in 246, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.012938981 = weight(abstract_txt:from in 246) [ClassicSimilarity], result of:
            0.012938981 = score(doc=246,freq=3.0), product of:
              0.04324539 = queryWeight, product of:
                1.2055207 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.012979129 = queryNorm
              0.29919907 = fieldWeight in 246, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.036518328 = weight(abstract_txt:positive in 246) [ClassicSimilarity], result of:
            0.036518328 = score(doc=246,freq=1.0), product of:
              0.09886535 = queryWeight, product of:
                1.2888781 = boost
                5.90999 = idf(docFreq=325, maxDocs=44218)
                0.012979129 = queryNorm
              0.36937436 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.90999 = idf(docFreq=325, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.055494636 = weight(abstract_txt:features in 246) [ClassicSimilarity], result of:
            0.055494636 = score(doc=246,freq=5.0), product of:
              0.08748051 = queryWeight, product of:
                1.4848789 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.012979129 = queryNorm
              0.63436574 = fieldWeight in 246, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.032923974 = weight(abstract_txt:examples in 246) [ClassicSimilarity], result of:
            0.032923974 = score(doc=246,freq=1.0), product of:
              0.10561902 = queryWeight, product of:
                1.6315728 = boost
                4.9875827 = idf(docFreq=819, maxDocs=44218)
                0.012979129 = queryNorm
              0.31172392 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9875827 = idf(docFreq=819, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.035451848 = weight(abstract_txt:training in 246) [ClassicSimilarity], result of:
            0.035451848 = score(doc=246,freq=1.0), product of:
              0.11095832 = queryWeight, product of:
                1.6723045 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.012979129 = queryNorm
              0.319506 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.104976 = weight(abstract_txt:extraction in 246) [ClassicSimilarity], result of:
            0.104976 = score(doc=246,freq=1.0), product of:
              0.27127528 = queryWeight, product of:
                3.375708 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012979129 = queryNorm
              0.38697222 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.49324086 = weight(abstract_txt:supervised in 246) [ClassicSimilarity], result of:
            0.49324086 = score(doc=246,freq=5.0), product of:
              0.47292614 = queryWeight, product of:
                4.882554 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.012979129 = queryNorm
              1.0429554 = fieldWeight in 246, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.18586141 = weight(abstract_txt:algorithm in 246) [ClassicSimilarity], result of:
            0.18586141 = score(doc=246,freq=2.0), product of:
              0.36855844 = queryWeight, product of:
                4.977061 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.012979129 = queryNorm
              0.5042929 = fieldWeight in 246, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
        0.36 = coord(9/25)
    
  2. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.33
    0.32537806 = sum of:
      0.32537806 = product of:
        1.0168065 = sum of:
          0.021122223 = weight(abstract_txt:data in 4095) [ClassicSimilarity], result of:
            0.021122223 = score(doc=4095,freq=6.0), product of:
              0.047261182 = queryWeight, product of:
                1.0914094 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012979129 = queryNorm
              0.4469254 = fieldWeight in 4095, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.008802298 = weight(abstract_txt:used in 4095) [ClassicSimilarity], result of:
            0.008802298 = score(doc=4095,freq=1.0), product of:
              0.04791366 = queryWeight, product of:
                1.0989175 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.012979129 = queryNorm
              0.18371168 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.0065365336 = weight(abstract_txt:from in 4095) [ClassicSimilarity], result of:
            0.0065365336 = score(doc=4095,freq=1.0), product of:
              0.04324539 = queryWeight, product of:
                1.2055207 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.012979129 = queryNorm
              0.15114984 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.03102037 = weight(abstract_txt:training in 4095) [ClassicSimilarity], result of:
            0.03102037 = score(doc=4095,freq=1.0), product of:
              0.11095832 = queryWeight, product of:
                1.6723045 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.012979129 = queryNorm
              0.27956775 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.091854 = weight(abstract_txt:extraction in 4095) [ClassicSimilarity], result of:
            0.091854 = score(doc=4095,freq=1.0), product of:
              0.27127528 = queryWeight, product of:
                3.375708 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012979129 = queryNorm
              0.3386007 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.25886223 = weight(abstract_txt:semi in 4095) [ClassicSimilarity], result of:
            0.25886223 = score(doc=4095,freq=4.0), product of:
              0.3623245 = queryWeight, product of:
                4.273653 = boost
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.012979129 = queryNorm
              0.7144486 = fieldWeight in 4095, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.532101 = idf(docFreq=174, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.1670231 = weight(abstract_txt:learning in 4095) [ClassicSimilarity], result of:
            0.1670231 = score(doc=4095,freq=5.0), product of:
              0.28749484 = queryWeight, product of:
                4.6624165 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.012979129 = queryNorm
              0.5809604 = fieldWeight in 4095, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.43158576 = weight(abstract_txt:supervised in 4095) [ClassicSimilarity], result of:
            0.43158576 = score(doc=4095,freq=5.0), product of:
              0.47292614 = queryWeight, product of:
                4.882554 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.012979129 = queryNorm
              0.912586 = fieldWeight in 4095, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
        0.32 = coord(8/25)
    
  3. Kholghi, M.; Vine, L.D.; Sitbon, L.; Zuccon, G.; Nguyen, A.: Clinical information extraction using small data : an active learning approach based on sequence representations and word embeddings (2017) 0.32
    0.32361394 = sum of:
      0.32361394 = product of:
        0.89892757 = sum of:
          0.0074703237 = weight(abstract_txt:from in 3920) [ClassicSimilarity], result of:
            0.0074703237 = score(doc=3920,freq=1.0), product of:
              0.04324539 = queryWeight, product of:
                1.2055207 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.012979129 = queryNorm
              0.17274266 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.033161916 = weight(abstract_txt:effort in 3920) [ClassicSimilarity], result of:
            0.033161916 = score(doc=3920,freq=1.0), product of:
              0.09271072 = queryWeight, product of:
                1.2481154 = boost
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.012979129 = queryNorm
              0.35769236 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.008874854 = weight(abstract_txt:information in 3920) [ClassicSimilarity], result of:
            0.008874854 = score(doc=3920,freq=2.0), product of:
              0.041474488 = queryWeight, product of:
                1.3199282 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012979129 = queryNorm
              0.21398345 = fieldWeight in 3920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.059663307 = weight(abstract_txt:reduced in 3920) [ClassicSimilarity], result of:
            0.059663307 = score(doc=3920,freq=1.0), product of:
              0.13714346 = queryWeight, product of:
                1.5180194 = boost
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.012979129 = queryNorm
              0.43504304 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.035451848 = weight(abstract_txt:training in 3920) [ClassicSimilarity], result of:
            0.035451848 = score(doc=3920,freq=1.0), product of:
              0.11095832 = queryWeight, product of:
                1.6723045 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.012979129 = queryNorm
              0.319506 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.14845848 = weight(abstract_txt:extraction in 3920) [ClassicSimilarity], result of:
            0.14845848 = score(doc=3920,freq=2.0), product of:
              0.27127528 = queryWeight, product of:
                3.375708 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012979129 = queryNorm
              0.54726136 = fieldWeight in 3920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.1943793 = weight(abstract_txt:active in 3920) [ClassicSimilarity], result of:
            0.1943793 = score(doc=3920,freq=3.0), product of:
              0.2836241 = queryWeight, product of:
                3.4516866 = boost
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.012979129 = queryNorm
              0.68534124 = fieldWeight in 3920, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.19088356 = weight(abstract_txt:learning in 3920) [ClassicSimilarity], result of:
            0.19088356 = score(doc=3920,freq=5.0), product of:
              0.28749484 = queryWeight, product of:
                4.6624165 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.012979129 = queryNorm
              0.66395473 = fieldWeight in 3920, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
          0.220584 = weight(abstract_txt:supervised in 3920) [ClassicSimilarity], result of:
            0.220584 = score(doc=3920,freq=1.0), product of:
              0.47292614 = queryWeight, product of:
                4.882554 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.012979129 = queryNorm
              0.4664238 = fieldWeight in 3920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=3920)
        0.36 = coord(9/25)
    
  4. Suakkaphong, N.; Zhang, Z.; Chen, H.: Disease named entity recognition using semisupervised learning and conditional random fields (2011) 0.26
    0.2617343 = sum of:
      0.2617343 = product of:
        0.7270397 = sum of:
          0.019709967 = weight(abstract_txt:data in 4367) [ClassicSimilarity], result of:
            0.019709967 = score(doc=4367,freq=4.0), product of:
              0.047261182 = queryWeight, product of:
                1.0914094 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012979129 = queryNorm
              0.41704348 = fieldWeight in 4367, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.010564634 = weight(abstract_txt:from in 4367) [ClassicSimilarity], result of:
            0.010564634 = score(doc=4367,freq=2.0), product of:
              0.04324539 = queryWeight, product of:
                1.2055207 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.012979129 = queryNorm
              0.24429502 = fieldWeight in 4367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.012550939 = weight(abstract_txt:information in 4367) [ClassicSimilarity], result of:
            0.012550939 = score(doc=4367,freq=4.0), product of:
              0.041474488 = queryWeight, product of:
                1.3199282 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012979129 = queryNorm
              0.3026183 = fieldWeight in 4367, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.027570598 = weight(abstract_txt:given in 4367) [ClassicSimilarity], result of:
            0.027570598 = score(doc=4367,freq=1.0), product of:
              0.093834974 = queryWeight, product of:
                1.5378635 = boost
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.012979129 = queryNorm
              0.29382005 = fieldWeight in 4367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.035451848 = weight(abstract_txt:training in 4367) [ClassicSimilarity], result of:
            0.035451848 = score(doc=4367,freq=1.0), product of:
              0.11095832 = queryWeight, product of:
                1.6723045 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.012979129 = queryNorm
              0.319506 = fieldWeight in 4367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.14845848 = weight(abstract_txt:extraction in 4367) [ClassicSimilarity], result of:
            0.14845848 = score(doc=4367,freq=2.0), product of:
              0.27127528 = queryWeight, product of:
                3.375708 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012979129 = queryNorm
              0.54726136 = fieldWeight in 4367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.120725356 = weight(abstract_txt:learning in 4367) [ClassicSimilarity], result of:
            0.120725356 = score(doc=4367,freq=2.0), product of:
              0.28749484 = queryWeight, product of:
                4.6624165 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.012979129 = queryNorm
              0.41992182 = fieldWeight in 4367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.220584 = weight(abstract_txt:supervised in 4367) [ClassicSimilarity], result of:
            0.220584 = score(doc=4367,freq=1.0), product of:
              0.47292614 = queryWeight, product of:
                4.882554 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.012979129 = queryNorm
              0.4664238 = fieldWeight in 4367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
          0.13142386 = weight(abstract_txt:algorithm in 4367) [ClassicSimilarity], result of:
            0.13142386 = score(doc=4367,freq=1.0), product of:
              0.36855844 = queryWeight, product of:
                4.977061 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.012979129 = queryNorm
              0.35658893 = fieldWeight in 4367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=4367)
        0.36 = coord(9/25)
    
  5. Thelwall, M.; Buckley, K.; Paltoglou, G.: Sentiment strength detection for the social web (2012) 0.22
    0.2239485 = sum of:
      0.2239485 = product of:
        0.62207913 = sum of:
          0.013937052 = weight(abstract_txt:data in 4972) [ClassicSimilarity], result of:
            0.013937052 = score(doc=4972,freq=2.0), product of:
              0.047261182 = queryWeight, product of:
                1.0914094 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012979129 = queryNorm
              0.29489428 = fieldWeight in 4972, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.01005977 = weight(abstract_txt:used in 4972) [ClassicSimilarity], result of:
            0.01005977 = score(doc=4972,freq=1.0), product of:
              0.04791366 = queryWeight, product of:
                1.0989175 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.012979129 = queryNorm
              0.2099562 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.012938981 = weight(abstract_txt:from in 4972) [ClassicSimilarity], result of:
            0.012938981 = score(doc=4972,freq=3.0), product of:
              0.04324539 = queryWeight, product of:
                1.2055207 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.012979129 = queryNorm
              0.29919907 = fieldWeight in 4972, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.036518328 = weight(abstract_txt:positive in 4972) [ClassicSimilarity], result of:
            0.036518328 = score(doc=4972,freq=1.0), product of:
              0.09886535 = queryWeight, product of:
                1.2888781 = boost
                5.90999 = idf(docFreq=325, maxDocs=44218)
                0.012979129 = queryNorm
              0.36937436 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.90999 = idf(docFreq=325, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.0062754694 = weight(abstract_txt:information in 4972) [ClassicSimilarity], result of:
            0.0062754694 = score(doc=4972,freq=1.0), product of:
              0.041474488 = queryWeight, product of:
                1.3199282 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.012979129 = queryNorm
              0.15130915 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.104976 = weight(abstract_txt:extraction in 4972) [ClassicSimilarity], result of:
            0.104976 = score(doc=4972,freq=1.0), product of:
              0.27127528 = queryWeight, product of:
                3.375708 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012979129 = queryNorm
              0.38697222 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.08536572 = weight(abstract_txt:learning in 4972) [ClassicSimilarity], result of:
            0.08536572 = score(doc=4972,freq=1.0), product of:
              0.28749484 = queryWeight, product of:
                4.6624165 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.012979129 = queryNorm
              0.29692957 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.220584 = weight(abstract_txt:supervised in 4972) [ClassicSimilarity], result of:
            0.220584 = score(doc=4972,freq=1.0), product of:
              0.47292614 = queryWeight, product of:
                4.882554 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.012979129 = queryNorm
              0.4664238 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
          0.13142386 = weight(abstract_txt:algorithm in 4972) [ClassicSimilarity], result of:
            0.13142386 = score(doc=4972,freq=1.0), product of:
              0.36855844 = queryWeight, product of:
                4.977061 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.012979129 = queryNorm
              0.35658893 = fieldWeight in 4972, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=4972)
        0.36 = coord(9/25)
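
The score breakdowns above can be reproduced by hand. The following is a minimal sketch, assuming the arithmetic shown in the explain output: each term contributes queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), with tf equal to the square root of the term frequency, and the summed contributions are scaled by the coord factor. The idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption that agrees with the reported values. The function and variable names are hypothetical and not part of any library. The inputs are copied from entry 1 (document 246).

```python
import math

# Values copied from the explain output for document 246 (entry 1 above).
QUERY_NORM = 0.012979129
FIELD_NORM = 0.0625
COORD = 9 / 25  # 9 of the 25 query terms matched this document

# (term, boost, idf, term frequency in abstract_txt), as reported above.
TERMS_DOC_246 = [
    ("used", 1.0989175, 3.3592992, 2.0),
    ("from", 1.2055207, 2.7638826, 3.0),
    ("positive", 1.2888781, 5.90999, 1.0),
    ("features", 1.4848789, 4.5391517, 5.0),
    ("examples", 1.6315728, 4.9875827, 1.0),
    ("training", 1.6723045, 5.112096, 1.0),
    ("extraction", 3.375708, 6.1915555, 1.0),
    ("supervised", 4.882554, 7.462781, 5.0),
    ("algorithm", 4.977061, 5.705423, 2.0),
]


def classic_idf(doc_freq, max_docs):
    """Assumed idf formula; it matches the idf values listed above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, idf, freq, query_norm=QUERY_NORM, field_norm=FIELD_NORM):
    """Contribution of one term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, coord):
    """Sum the per-term contributions and scale by the coord factor."""
    return coord * sum(term_score(b, i, f) for _, b, i, f in terms)


doc_246_score = document_score(TERMS_DOC_246, COORD)
print(f"{doc_246_score:.4f}")
```

Because the listing reports single-precision values, a double-precision recomputation agrees with the listed 0.34978777 only up to small rounding differences. The same functions apply to the other entries once their fieldNorm, coord, and term lists are substituted.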