Document (#35111)

Author
Gelernter, J.
Title
Image indexing in article component databases
Source
Journal of the American Society for Information Science and Technology. 60(2009) no.10, pp.1965-1976
Year
2009
Abstract
It is often necessary to compare data-rich charts, tables, diagrams, or drawings rather than the articles that contextualize that data. The objective of this research has been to create a database of non-textual components (here, maps) that are searchable independently of the articles from which they are taken, with the option to view the source articles. The method mines words from the articles that are near or associated with each component map, and these mined words become the basis of region, time, and subject indexing. The evaluation showed that automatic indexing of the component maps by these three facets works well, and indicates that a large-scale component database following this model is viable.
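
The indexing idea in the abstract (mine the words near each component map, then sort them into region, time, and subject facets) can be illustrated with a minimal sketch. This is not the article's implementation: the function name `index_component`, the toy vocabularies `REGIONS` and `SUBJECTS`, and the year pattern are all hypothetical stand-ins for whatever gazetteer, date recognizer, and subject vocabulary a real system would use.

```python
import re

# Hypothetical toy vocabularies (assumptions for illustration only);
# a real system would rely on a gazetteer and a subject thesaurus.
REGIONS = {"france", "italy", "nile", "europe"}
SUBJECTS = {"trade", "population", "geology", "rainfall"}

# Assumed time pattern: four-digit years from 1000 to 2099.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def index_component(caption, nearby_text):
    """Assign region, time, and subject facets from words near a map."""
    text = f"{caption} {nearby_text}".lower()
    words = set(re.findall(r"[a-z]+", text))
    return {
        "region": sorted(words & REGIONS),
        "time": sorted(set(YEAR.findall(text))),
        "subject": sorted(words & SUBJECTS),
    }


entry = index_component(
    "Fig. 2. Trade routes in France, 1850",
    "The map shows population centres along the routes.",
)
print(entry)
```

Each map then carries its own facet record and can be searched without the source article, while a stored article identifier (not shown here) would give the option to view the article.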

Similar documents (content)

  1. Catarci, T.; Spaccapietra, S.: Visual information querying (2002) 0.19
    0.18532337 = sum of:
      0.18532337 = product of:
        0.4211895 = sum of:
          0.0377169 = weight(abstract_txt:textual in 5268) [ClassicSimilarity], result of:
            0.0377169 = score(doc=5268,freq=3.0), product of:
              0.11682143 = queryWeight, product of:
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.019584825 = queryNorm
              0.3228594 = fieldWeight in 5268, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.024700833 = weight(abstract_txt:facets in 5268) [ClassicSimilarity], result of:
            0.024700833 = score(doc=5268,freq=1.0), product of:
              0.12706132 = queryWeight, product of:
                1.0429066 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.019584825 = queryNorm
              0.19440089 = fieldWeight in 5268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.0093577355 = weight(abstract_txt:these in 5268) [ClassicSimilarity], result of:
            0.0093577355 = score(doc=5268,freq=2.0), product of:
              0.06652519 = queryWeight, product of:
                1.0672024 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.019584825 = queryNorm
              0.14066455 = fieldWeight in 5268, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.02005242 = weight(abstract_txt:data in 5268) [ClassicSimilarity], result of:
            0.02005242 = score(doc=5268,freq=7.0), product of:
              0.07282729 = queryWeight, product of:
                1.1166081 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019584825 = queryNorm
              0.27534214 = fieldWeight in 5268, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.034672134 = weight(abstract_txt:near in 5268) [ClassicSimilarity], result of:
            0.034672134 = score(doc=5268,freq=1.0), product of:
              0.15929152 = queryWeight, product of:
                1.1677102 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.019584825 = queryNorm
              0.21766466 = fieldWeight in 5268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.09057841 = weight(abstract_txt:charts in 5268) [ClassicSimilarity], result of:
            0.09057841 = score(doc=5268,freq=3.0), product of:
              0.20949885 = queryWeight, product of:
                1.339151 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.019584825 = queryNorm
              0.43235752 = fieldWeight in 5268, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.0635342 = weight(abstract_txt:drawings in 5268) [ClassicSimilarity], result of:
            0.0635342 = score(doc=5268,freq=1.0), product of:
              0.23853055 = queryWeight, product of:
                1.4289293 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.019584825 = queryNorm
              0.26635668 = fieldWeight in 5268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.0322115 = weight(abstract_txt:database in 5268) [ClassicSimilarity], result of:
            0.0322115 = score(doc=5268,freq=4.0), product of:
              0.120374985 = queryWeight, product of:
                1.4355617 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.019584825 = queryNorm
              0.26759297 = fieldWeight in 5268, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.042002793 = weight(abstract_txt:maps in 5268) [ClassicSimilarity], result of:
            0.042002793 = score(doc=5268,freq=1.0), product of:
              0.22806977 = queryWeight, product of:
                1.9760029 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.019584825 = queryNorm
              0.18416642 = fieldWeight in 5268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.021543927 = weight(abstract_txt:that in 5268) [ClassicSimilarity], result of:
            0.021543927 = score(doc=5268,freq=7.0), product of:
              0.110180974 = queryWeight, product of:
                2.3788533 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.019584825 = queryNorm
              0.19553219 = fieldWeight in 5268, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
          0.04481864 = weight(abstract_txt:articles in 5268) [ClassicSimilarity], result of:
            0.04481864 = score(doc=5268,freq=1.0), product of:
              0.30005306 = queryWeight, product of:
                3.2052932 = boost
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.019584825 = queryNorm
              0.14936905 = fieldWeight in 5268, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.03125 = fieldNorm(doc=5268)
        0.44 = coord(11/25)
    
  2. Kostoff, R.N.; Rio, J.A. del; Humenik, J.A.; Garcia, E.O.; Ramirez, A.M.: Citation mining : integrating text mining and bibliometrics for research user profiling (2001) 0.11
    0.113923505 = sum of:
      0.113923505 = product of:
        0.4746813 = sum of:
          0.014036604 = weight(abstract_txt:these in 850) [ClassicSimilarity], result of:
            0.014036604 = score(doc=850,freq=2.0), product of:
              0.06652519 = queryWeight, product of:
                1.0672024 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.019584825 = queryNorm
              0.21099682 = fieldWeight in 850, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
          0.062301125 = weight(abstract_txt:independently in 850) [ClassicSimilarity], result of:
            0.062301125 = score(doc=850,freq=1.0), product of:
              0.17966992 = queryWeight, product of:
                1.2401563 = boost
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.019584825 = queryNorm
              0.34675324 = fieldWeight in 850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
          0.024158625 = weight(abstract_txt:database in 850) [ClassicSimilarity], result of:
            0.024158625 = score(doc=850,freq=1.0), product of:
              0.120374985 = queryWeight, product of:
                1.4355617 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.019584825 = queryNorm
              0.20069472 = fieldWeight in 850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
          0.017273571 = weight(abstract_txt:that in 850) [ClassicSimilarity], result of:
            0.017273571 = score(doc=850,freq=2.0), product of:
              0.110180974 = queryWeight, product of:
                2.3788533 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.019584825 = queryNorm
              0.15677454 = fieldWeight in 850, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
          0.2229699 = weight(abstract_txt:articles in 850) [ClassicSimilarity], result of:
            0.2229699 = score(doc=850,freq=11.0), product of:
              0.30005306 = queryWeight, product of:
                3.2052932 = boost
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.019584825 = queryNorm
              0.7431016 = fieldWeight in 850, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
          0.13394144 = weight(abstract_txt:component in 850) [ClassicSimilarity], result of:
            0.13394144 = score(doc=850,freq=1.0), product of:
              0.47508875 = queryWeight, product of:
                4.033259 = boost
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.019584825 = queryNorm
              0.2819293 = fieldWeight in 850, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.046875 = fieldNorm(doc=850)
        0.24 = coord(6/25)
    
  3. Tudhope, D.; Binding, C.; Blocks, D.; Cunliffe, D.: FACET: thesaurus retrieval with semantic term expansion (2002) 0.11
    0.10860818 = sum of:
      0.10860818 = product of:
        0.38788635 = sum of:
          0.043226458 = weight(abstract_txt:facets in 1175) [ClassicSimilarity], result of:
            0.043226458 = score(doc=1175,freq=1.0), product of:
              0.12706132 = queryWeight, product of:
                1.0429066 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.019584825 = queryNorm
              0.34020156 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.0132634295 = weight(abstract_txt:data in 1175) [ClassicSimilarity], result of:
            0.0132634295 = score(doc=1175,freq=1.0), product of:
              0.07282729 = queryWeight, product of:
                1.1166081 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019584825 = queryNorm
              0.18212171 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.053464264 = weight(abstract_txt:tables in 1175) [ClassicSimilarity], result of:
            0.053464264 = score(doc=1175,freq=1.0), product of:
              0.14640501 = queryWeight, product of:
                1.119481 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.019584825 = queryNorm
              0.36518055 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.048817962 = weight(abstract_txt:database in 1175) [ClassicSimilarity], result of:
            0.048817962 = score(doc=1175,freq=3.0), product of:
              0.120374985 = queryWeight, product of:
                1.4355617 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.019584825 = queryNorm
              0.40554905 = fieldWeight in 1175, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.044349283 = weight(abstract_txt:indexing in 1175) [ClassicSimilarity], result of:
            0.044349283 = score(doc=1175,freq=1.0), product of:
              0.18641394 = queryWeight, product of:
                2.1879559 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.019584825 = queryNorm
              0.23790754 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.028499939 = weight(abstract_txt:that in 1175) [ClassicSimilarity], result of:
            0.028499939 = score(doc=1175,freq=4.0), product of:
              0.110180974 = queryWeight, product of:
                2.3788533 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.019584825 = queryNorm
              0.2586648 = fieldWeight in 1175, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
          0.15626502 = weight(abstract_txt:component in 1175) [ClassicSimilarity], result of:
            0.15626502 = score(doc=1175,freq=1.0), product of:
              0.47508875 = queryWeight, product of:
                4.033259 = boost
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.019584825 = queryNorm
              0.32891753 = fieldWeight in 1175, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1175)
        0.28 = coord(7/25)
    
  4. Keyser, P. de: Indexing : from thesauri to the Semantic Web (2012) 0.11
    0.105258755 = sum of:
      0.105258755 = product of:
        0.52629375 = sum of:
          0.016542297 = weight(abstract_txt:these in 4197) [ClassicSimilarity], result of:
            0.016542297 = score(doc=4197,freq=1.0), product of:
              0.06652519 = queryWeight, product of:
                1.0672024 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.019584825 = queryNorm
              0.24866214 = fieldWeight in 4197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.078125 = fieldNorm(doc=4197)
          0.1038352 = weight(abstract_txt:independently in 4197) [ClassicSimilarity], result of:
            0.1038352 = score(doc=4197,freq=1.0), product of:
              0.17966992 = queryWeight, product of:
                1.2401563 = boost
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.019584825 = queryNorm
              0.57792205 = fieldWeight in 4197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.078125 = fieldNorm(doc=4197)
          0.14850229 = weight(abstract_txt:maps in 4197) [ClassicSimilarity], result of:
            0.14850229 = score(doc=4197,freq=2.0), product of:
              0.22806977 = queryWeight, product of:
                1.9760029 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.019584825 = queryNorm
              0.65112656 = fieldWeight in 4197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.078125 = fieldNorm(doc=4197)
          0.23705691 = weight(abstract_txt:indexing in 4197) [ClassicSimilarity], result of:
            0.23705691 = score(doc=4197,freq=14.0), product of:
              0.18641394 = queryWeight, product of:
                2.1879559 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.019584825 = queryNorm
              1.2716694 = fieldWeight in 4197, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=4197)
          0.020357098 = weight(abstract_txt:that in 4197) [ClassicSimilarity], result of:
            0.020357098 = score(doc=4197,freq=1.0), product of:
              0.110180974 = queryWeight, product of:
                2.3788533 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.019584825 = queryNorm
              0.18476056 = fieldWeight in 4197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=4197)
        0.2 = coord(5/25)
    
  5. Jacsó, P.: CD-ROM databases with full-page images (1998) 0.10
    0.10403762 = sum of:
      0.10403762 = product of:
        0.5201881 = sum of:
          0.07637752 = weight(abstract_txt:tables in 2890) [ClassicSimilarity], result of:
            0.07637752 = score(doc=2890,freq=1.0), product of:
              0.14640501 = queryWeight, product of:
                1.119481 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.019584825 = queryNorm
              0.5216865 = fieldWeight in 2890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.078125 = fieldNorm(doc=2890)
          0.134257 = weight(abstract_txt:searchable in 2890) [ClassicSimilarity], result of:
            0.134257 = score(doc=2890,freq=2.0), product of:
              0.16924931 = queryWeight, product of:
                1.2036555 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.019584825 = queryNorm
              0.7932499 = fieldWeight in 2890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.078125 = fieldNorm(doc=2890)
          0.13073866 = weight(abstract_txt:charts in 2890) [ClassicSimilarity], result of:
            0.13073866 = score(doc=2890,freq=1.0), product of:
              0.20949885 = queryWeight, product of:
                1.339151 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.019584825 = queryNorm
              0.6240543 = fieldWeight in 2890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.078125 = fieldNorm(doc=2890)
          0.020357098 = weight(abstract_txt:that in 2890) [ClassicSimilarity], result of:
            0.020357098 = score(doc=2890,freq=1.0), product of:
              0.110180974 = queryWeight, product of:
                2.3788533 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.019584825 = queryNorm
              0.18476056 = fieldWeight in 2890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=2890)
          0.15845782 = weight(abstract_txt:articles in 2890) [ClassicSimilarity], result of:
            0.15845782 = score(doc=2890,freq=2.0), product of:
              0.30005306 = queryWeight, product of:
                3.2052932 = boost
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.019584825 = queryNorm
              0.5280993 = fieldWeight in 2890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7798095 = idf(docFreq=1013, maxDocs=44421)
                0.078125 = fieldNorm(doc=2890)
        0.2 = coord(5/25)
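
The score breakdowns above follow Lucene's ClassicSimilarity (TF-IDF) explain format. As a rough sketch of how the listed numbers combine, the Python below recomputes one term weight and one document score from the values shown. The function names are illustrative rather than Lucene API calls, and `query_norm` and `field_norm` are simply copied from the listing instead of being derived.

```python
import math


def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as in each "weight(...)" entry above
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    # Sum of the matching term scores, scaled by coord(matched/total)
    return sum(term_scores) * (matched / total)


# Term "textual" in doc 5268 (result 1); no boost line is listed for this term.
textual = term_score(freq=3.0, doc_freq=309, max_docs=44421,
                     field_norm=0.03125, query_norm=0.019584825)

# Result 4 (doc 4197): the five term scores as listed, with coord(5/25).
doc_4197 = document_score(
    [0.016542297, 0.1038352, 0.14850229, 0.23705691, 0.020357098],
    matched=5, total=25,
)
print(textual, doc_4197)
```

Up to floating-point rounding, `textual` agrees with the 0.0377169 shown for that term and `doc_4197` with the 0.105258755 shown for result 4. The coord factor explains why a document matching only 5 of the 25 query terms ranks below one matching 11, even when its raw sum is larger.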