Document (#17531)

Author
Bordoni, L.
Pazienza, M.T.
Title
Documents automatic indexing in an environmental domain
Source
International forum on information and documentation. 22(1997) no.1, pp.17-28
Year
1997
Abstract
Describes an application of natural language processing (NLP) techniques to the problem of document indexing in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), a system that uses NLP to determine the subject of a document's text and to associate the document with relevant semantic indexes. Briefly describes the overall system, its implementation on a corpus of scientific abstracts on environmental topics, and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision.
Theme
Automatic indexing

Similar documents (content)

  1. Polity, Y.: Vers une ergonomie linguistique (1994) 0.25
    0.2520724 = sum of:
      0.2520724 = product of:
        0.7877262 = sum of:
          0.033806115 = weight(abstract_txt:system in 35) [ClassicSimilarity], result of:
            0.033806115 = score(doc=35,freq=1.0), product of:
              0.09164101 = queryWeight, product of:
                1.1761417 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.02310164 = queryNorm
              0.36889726 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.037022 = weight(abstract_txt:retrieval in 35) [ClassicSimilarity], result of:
            0.037022 = score(doc=35,freq=1.0), product of:
              0.097364254 = queryWeight, product of:
                1.2123122 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.02310164 = queryNorm
              0.3802422 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.049420215 = weight(abstract_txt:describes in 35) [ClassicSimilarity], result of:
            0.049420215 = score(doc=35,freq=1.0), product of:
              0.118040055 = queryWeight, product of:
                1.3348407 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.02310164 = queryNorm
              0.41867328 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.06393665 = weight(abstract_txt:language in 35) [ClassicSimilarity], result of:
            0.06393665 = score(doc=35,freq=1.0), product of:
              0.14014994 = queryWeight, product of:
                1.4544914 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02310164 = queryNorm
              0.45620176 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.07254268 = weight(abstract_txt:indexing in 35) [ClassicSimilarity], result of:
            0.07254268 = score(doc=35,freq=1.0), product of:
              0.15245982 = queryWeight, product of:
                1.5170238 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.02310164 = queryNorm
              0.4758151 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.10525469 = weight(abstract_txt:processing in 35) [ClassicSimilarity], result of:
            0.10525469 = score(doc=35,freq=1.0), product of:
              0.19539823 = queryWeight, product of:
                1.7174141 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.02310164 = queryNorm
              0.53866756 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.11479413 = weight(abstract_txt:natural in 35) [ClassicSimilarity], result of:
            0.11479413 = score(doc=35,freq=1.0), product of:
              0.20703293 = queryWeight, product of:
                1.7678053 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.02310164 = queryNorm
              0.5544728 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
          0.31094977 = weight(abstract_txt:system's in 35) [ClassicSimilarity], result of:
            0.31094977 = score(doc=35,freq=1.0), product of:
              0.4023029 = queryWeight, product of:
                2.464287 = boost
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.02310164 = queryNorm
              0.77292454 = fieldWeight in 35, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.109375 = fieldNorm(doc=35)
        0.32 = coord(8/25)
    
  2. Oard, D.W.; He, D.; Wang, J.: User-assisted query translation for interactive cross-language information retrieval (2008) 0.21
    0.2099573 = sum of:
      0.2099573 = product of:
        0.6561166 = sum of:
          0.04829446 = weight(abstract_txt:system in 3030) [ClassicSimilarity], result of:
            0.04829446 = score(doc=3030,freq=4.0), product of:
              0.09164101 = queryWeight, product of:
                1.1761417 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.02310164 = queryNorm
              0.52699614 = fieldWeight in 3030, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.037397865 = weight(abstract_txt:retrieval in 3030) [ClassicSimilarity], result of:
            0.037397865 = score(doc=3030,freq=2.0), product of:
              0.097364254 = queryWeight, product of:
                1.2123122 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.02310164 = queryNorm
              0.3841026 = fieldWeight in 3030, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.11293134 = weight(abstract_txt:incorporates in 3030) [ClassicSimilarity], result of:
            0.11293134 = score(doc=3030,freq=1.0), product of:
              0.20341267 = queryWeight, product of:
                1.2390497 = boost
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.02310164 = queryNorm
              0.5551834 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1063476 = idf(docFreq=98, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.035300154 = weight(abstract_txt:describes in 3030) [ClassicSimilarity], result of:
            0.035300154 = score(doc=3030,freq=1.0), product of:
              0.118040055 = queryWeight, product of:
                1.3348407 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.02310164 = queryNorm
              0.29905233 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.06239674 = weight(abstract_txt:documents in 3030) [ClassicSimilarity], result of:
            0.06239674 = score(doc=3030,freq=2.0), product of:
              0.136965 = queryWeight, product of:
                1.4378697 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02310164 = queryNorm
              0.455567 = fieldWeight in 3030, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.079101086 = weight(abstract_txt:language in 3030) [ClassicSimilarity], result of:
            0.079101086 = score(doc=3030,freq=3.0), product of:
              0.14014994 = queryWeight, product of:
                1.4544914 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02310164 = queryNorm
              0.5644033 = fieldWeight in 3030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.058587983 = weight(abstract_txt:techniques in 3030) [ClassicSimilarity], result of:
            0.058587983 = score(doc=3030,freq=1.0), product of:
              0.16546927 = queryWeight, product of:
                1.5804232 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.02310164 = queryNorm
              0.35407168 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
          0.222107 = weight(abstract_txt:system's in 3030) [ClassicSimilarity], result of:
            0.222107 = score(doc=3030,freq=1.0), product of:
              0.4023029 = queryWeight, product of:
                2.464287 = boost
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.02310164 = queryNorm
              0.552089 = fieldWeight in 3030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.078125 = fieldNorm(doc=3030)
        0.32 = coord(8/25)
    
  3. Vledutz-Stokolov, N.: Concept recognition in an automatic text-processing system for the life sciences (1987) 0.20
    0.19855373 = sum of:
      0.19855373 = product of:
        0.6204804 = sum of:
          0.038635563 = weight(abstract_txt:system in 2848) [ClassicSimilarity], result of:
            0.038635563 = score(doc=2848,freq=4.0), product of:
              0.09164101 = queryWeight, product of:
                1.1761417 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.02310164 = queryNorm
              0.42159688 = fieldWeight in 2848, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.028240124 = weight(abstract_txt:describes in 2848) [ClassicSimilarity], result of:
            0.028240124 = score(doc=2848,freq=1.0), product of:
              0.118040055 = queryWeight, product of:
                1.3348407 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.02310164 = queryNorm
              0.23924187 = fieldWeight in 2848, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.096663125 = weight(abstract_txt:language in 2848) [ClassicSimilarity], result of:
            0.096663125 = score(doc=2848,freq=7.0), product of:
              0.14014994 = queryWeight, product of:
                1.4544914 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02310164 = queryNorm
              0.6897122 = fieldWeight in 2848, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.05862334 = weight(abstract_txt:indexing in 2848) [ClassicSimilarity], result of:
            0.05862334 = score(doc=2848,freq=2.0), product of:
              0.15245982 = queryWeight, product of:
                1.5170238 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.02310164 = queryNorm
              0.38451666 = fieldWeight in 2848, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.046870384 = weight(abstract_txt:techniques in 2848) [ClassicSimilarity], result of:
            0.046870384 = score(doc=2848,freq=1.0), product of:
              0.16546927 = queryWeight, product of:
                1.5804232 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.02310164 = queryNorm
              0.28325734 = fieldWeight in 2848, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.060145535 = weight(abstract_txt:processing in 2848) [ClassicSimilarity], result of:
            0.060145535 = score(doc=2848,freq=1.0), product of:
              0.19539823 = queryWeight, product of:
                1.7174141 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.02310164 = queryNorm
              0.30781004 = fieldWeight in 2848, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.11361672 = weight(abstract_txt:natural in 2848) [ClassicSimilarity], result of:
            0.11361672 = score(doc=2848,freq=3.0), product of:
              0.20703293 = queryWeight, product of:
                1.7678053 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.02310164 = queryNorm
              0.54878575 = fieldWeight in 2848, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
          0.17768559 = weight(abstract_txt:system's in 2848) [ClassicSimilarity], result of:
            0.17768559 = score(doc=2848,freq=1.0), product of:
              0.4023029 = queryWeight, product of:
                2.464287 = boost
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.02310164 = queryNorm
              0.44167116 = fieldWeight in 2848, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.0625 = fieldNorm(doc=2848)
        0.32 = coord(8/25)
    
  4. Burke, R.D.: Question answering from frequently asked question files : experiences with the FAQ Finder System (1997) 0.19
    0.19374658 = sum of:
      0.19374658 = product of:
        0.69195205 = sum of:
          0.033806115 = weight(abstract_txt:system in 2191) [ClassicSimilarity], result of:
            0.033806115 = score(doc=2191,freq=1.0), product of:
              0.09164101 = queryWeight, product of:
                1.1761417 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.02310164 = queryNorm
              0.36889726 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.037022 = weight(abstract_txt:retrieval in 2191) [ClassicSimilarity], result of:
            0.037022 = score(doc=2191,freq=1.0), product of:
              0.097364254 = queryWeight, product of:
                1.2123122 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.02310164 = queryNorm
              0.3802422 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.049420215 = weight(abstract_txt:describes in 2191) [ClassicSimilarity], result of:
            0.049420215 = score(doc=2191,freq=1.0), product of:
              0.118040055 = queryWeight, product of:
                1.3348407 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.02310164 = queryNorm
              0.41867328 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.06393665 = weight(abstract_txt:language in 2191) [ClassicSimilarity], result of:
            0.06393665 = score(doc=2191,freq=1.0), product of:
              0.14014994 = queryWeight, product of:
                1.4544914 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02310164 = queryNorm
              0.45620176 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.08202317 = weight(abstract_txt:techniques in 2191) [ClassicSimilarity], result of:
            0.08202317 = score(doc=2191,freq=1.0), product of:
              0.16546927 = queryWeight, product of:
                1.5804232 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.02310164 = queryNorm
              0.49570033 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.11479413 = weight(abstract_txt:natural in 2191) [ClassicSimilarity], result of:
            0.11479413 = score(doc=2191,freq=1.0), product of:
              0.20703293 = queryWeight, product of:
                1.7678053 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.02310164 = queryNorm
              0.5544728 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
          0.31094977 = weight(abstract_txt:system's in 2191) [ClassicSimilarity], result of:
            0.31094977 = score(doc=2191,freq=1.0), product of:
              0.4023029 = queryWeight, product of:
                2.464287 = boost
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.02310164 = queryNorm
              0.77292454 = fieldWeight in 2191, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0667386 = idf(docFreq=102, maxDocs=44421)
                0.109375 = fieldNorm(doc=2191)
        0.28 = coord(7/25)
    
  5. Evans, D.A.; Lefferts, R.G.: CLARIT-TREC experiments (1995) 0.19
    0.1923036 = sum of:
      0.1923036 = product of:
        0.6867986 = sum of:
          0.0957603 = weight(abstract_txt:recall in 1980) [ClassicSimilarity], result of:
            0.0957603 = score(doc=1980,freq=1.0), product of:
              0.13321261 = queryWeight, product of:
                1.0027032 = boost
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.02310164 = queryNorm
              0.7188531 = fieldWeight in 1980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.750825 = idf(docFreq=383, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.077271126 = weight(abstract_txt:system in 1980) [ClassicSimilarity], result of:
            0.077271126 = score(doc=1980,freq=4.0), product of:
              0.09164101 = queryWeight, product of:
                1.1761417 = boost
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.02310164 = queryNorm
              0.84319377 = fieldWeight in 1980, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.372775 = idf(docFreq=4140, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.056480248 = weight(abstract_txt:describes in 1980) [ClassicSimilarity], result of:
            0.056480248 = score(doc=1980,freq=1.0), product of:
              0.118040055 = queryWeight, product of:
                1.3348407 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.02310164 = queryNorm
              0.47848374 = fieldWeight in 1980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.07307046 = weight(abstract_txt:language in 1980) [ClassicSimilarity], result of:
            0.07307046 = score(doc=1980,freq=1.0), product of:
              0.14014994 = queryWeight, product of:
                1.4544914 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.02310164 = queryNorm
              0.52137345 = fieldWeight in 1980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.08290592 = weight(abstract_txt:indexing in 1980) [ClassicSimilarity], result of:
            0.08290592 = score(doc=1980,freq=1.0), product of:
              0.15245982 = queryWeight, product of:
                1.5170238 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.02310164 = queryNorm
              0.5437887 = fieldWeight in 1980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.17011726 = weight(abstract_txt:processing in 1980) [ClassicSimilarity], result of:
            0.17011726 = score(doc=1980,freq=2.0), product of:
              0.19539823 = queryWeight, product of:
                1.7174141 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.02310164 = queryNorm
              0.8706182 = fieldWeight in 1980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
          0.1311933 = weight(abstract_txt:natural in 1980) [ClassicSimilarity], result of:
            0.1311933 = score(doc=1980,freq=1.0), product of:
              0.20703293 = queryWeight, product of:
                1.7678053 = boost
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.02310164 = queryNorm
              0.6336832 = fieldWeight in 1980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0694656 = idf(docFreq=758, maxDocs=44421)
                0.125 = fieldNorm(doc=1980)
        0.28 = coord(7/25)
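
The score trees above follow the Lucene ClassicSimilarity (TF-IDF) breakdown: each term's score is queryWeight times fieldWeight, the term scores are summed, and the sum is multiplied by the coord factor. The sketch below recomputes the score of the first similar document (doc=35) from the values printed in its tree. The function names and the TERMS_DOC_35 table are illustrative and not part of the catalog software; the idf and tf formulas are those of ClassicSimilarity, and small differences from the printed figures are expected because Lucene computes in single precision.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44421
QUERY_NORM = 0.02310164


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight, as in the tree."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) for doc=35, read off the first tree.
TERMS_DOC_35 = [
    ("system", 1.0, 4140, 1.1761417),
    ("retrieval", 1.0, 3732, 1.2123122),
    ("describes", 1.0, 2626, 1.3348407),
    ("language", 1.0, 1863, 1.4544914),
    ("indexing", 1.0, 1557, 1.5170238),
    ("processing", 1.0, 876, 1.7174141),
    ("natural", 1.0, 758, 1.7678053),
    ("system's", 1.0, 102, 2.464287),
]
FIELD_NORM_DOC_35 = 0.109375
QUERY_TERM_COUNT = 25  # denominator of coord(8/25)


def document_score(terms, field_norm, query_term_count=QUERY_TERM_COUNT):
    """Sum the per-term scores, then scale by coord = matched terms / query terms."""
    total = sum(term_score(f, df, b, field_norm) for _, f, df, b in terms)
    coord = len(terms) / query_term_count
    return total * coord


if __name__ == "__main__":
    score = document_score(TERMS_DOC_35, FIELD_NORM_DOC_35)
    # Should land close to the 0.2520724 reported for document 1.
    print(f"doc 35 score: {score:.7f}")
```

The coord factor explains why documents 4 and 5 rank below documents 1 to 3 even though their summed term scores are comparable: they match 7 of the 25 query terms rather than 8.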