Document (#21505)

Author
Srinivasan, P.
Title
Thesaurus construction
Source
Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
Imprint
Englewood Cliffs, NJ : Prentice Hall
Year
1992
Pages
pp. 161-218
Abstract
Thesauri are valuable structures for information retrieval systems. A thesaurus provides a precise and controlled vocabulary which serves to coordinate document indexing and document retrieval. In both indexing and retrieval, a thesaurus may be used to select the most appropriate terms. Additionally, the thesaurus can assist the searcher in reformulating search strategies if required. Examines the important features of thesauri, which should allow the reader to differentiate between thesauri. Next, a brief overview of the manual thesaurus construction process is given. Two major approaches to automatic thesaurus construction have been selected for detailed examination: the first is thesaurus construction from collections of documents, and the second is thesaurus construction by merging existing thesauri. These two methods were selected because they rely on statistical techniques alone and differ significantly from each other. Programs written in C accompany the discussion of these approaches.
Theme
Conception and application of the thesaurus principle

Similar documents (author)

  1. Srinivasan, P.: Expert interface to Library of Congress Subject Headings (1990/91) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 2208) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 2208, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=2208)
    
  2. Srinivasan, P.: Query expansion and MEDLINE (1996) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 67) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 67, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=67)
    
  3. Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 2594) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 2594, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=2594)
    
  4. Srinivasan, P.: On generalizing the Two-Poisson Model (1990) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 2948) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 2948, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=2948)
    
  5. Srinivasan, P.: Optimal document-indexing vocabulary for MEDLINE (1996) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:srinivasan in 6702) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 6702, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=6702)
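
The five author matches above all carry the same score because each explanation multiplies the same three factors: tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, not the catalog's actual code. It assumes the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the variant that reproduces the idf value printed in the listing. The function names are hypothetical.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf variant: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Assumed tf: square root of the raw term frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the author_txt:srinivasan explanations above.
idf_value = classic_idf(doc_freq=20, max_docs=44421)
score = field_weight(freq=1.0, doc_freq=20, max_docs=44421, field_norm=0.625)
print(idf_value, score)
```

Under these assumptions the two printed values should land close to the 8.656945 and 5.4105906 reported in the listing. Small differences in the last digits are expected, since the listing shows single-precision floats.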
    

Similar documents (content)

  1. Nielsen, M.L.: Thesaurus construction : key issues and selected readings (2004) 0.27
    0.2654172 = sum of:
      0.2654172 = product of:
        1.327086 = sum of:
          0.053508874 = weight(abstract_txt:manual in 6) [ClassicSimilarity], result of:
            0.053508874 = score(doc=6,freq=1.0), product of:
              0.095841475 = queryWeight, product of:
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.01609357 = queryNorm
              0.55830604 = fieldWeight in 6, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9552646 = idf(docFreq=312, maxDocs=44421)
                0.09375 = fieldNorm(doc=6)
          0.04926072 = weight(abstract_txt:approaches in 6) [ClassicSimilarity], result of:
            0.04926072 = score(doc=6,freq=1.0), product of:
              0.11427383 = queryWeight, product of:
                1.5442288 = boost
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.01609357 = queryNorm
              0.43107614 = fieldWeight in 6, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.09375 = fieldNorm(doc=6)
          0.072578825 = weight(abstract_txt:selected in 6) [ClassicSimilarity], result of:
            0.072578825 = score(doc=6,freq=1.0), product of:
              0.14796293 = queryWeight, product of:
                1.7571738 = boost
                5.2322173 = idf(docFreq=644, maxDocs=44421)
                0.01609357 = queryNorm
              0.49052036 = fieldWeight in 6, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2322173 = idf(docFreq=644, maxDocs=44421)
                0.09375 = fieldNorm(doc=6)
          0.41045815 = weight(abstract_txt:construction in 6) [ClassicSimilarity], result of:
            0.41045815 = score(doc=6,freq=4.0), product of:
              0.40156162 = queryWeight, product of:
                4.5770364 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01609357 = queryNorm
              1.0221548 = fieldWeight in 6, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.09375 = fieldNorm(doc=6)
          0.7412794 = weight(abstract_txt:thesaurus in 6) [ClassicSimilarity], result of:
            0.7412794 = score(doc=6,freq=7.0), product of:
              0.57799166 = queryWeight, product of:
                6.945908 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01609357 = queryNorm
              1.2825089 = fieldWeight in 6, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.09375 = fieldNorm(doc=6)
        0.2 = coord(5/25)
    
  2. Spiteri, L.F.: The use of facet analysis in information retrieval thesauri : an examination of selected guidelines for thesaurus construction (1997) 0.26
    0.25914428 = sum of:
      0.25914428 = product of:
        1.2957214 = sum of:
          0.059000343 = weight(abstract_txt:examination in 1372) [ClassicSimilarity], result of:
            0.059000343 = score(doc=1372,freq=1.0), product of:
              0.10229144 = queryWeight, product of:
                1.0331013 = boost
                6.1523914 = idf(docFreq=256, maxDocs=44421)
                0.01609357 = queryNorm
              0.5767867 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1523914 = idf(docFreq=256, maxDocs=44421)
                0.09375 = fieldNorm(doc=1372)
          0.031935275 = weight(abstract_txt:retrieval in 1372) [ClassicSimilarity], result of:
            0.031935275 = score(doc=1372,freq=1.0), product of:
              0.09798444 = queryWeight, product of:
                1.7513077 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01609357 = queryNorm
              0.3259219 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.09375 = fieldNorm(doc=1372)
          0.36403725 = weight(abstract_txt:thesauri in 1372) [ClassicSimilarity], result of:
            0.36403725 = score(doc=1372,freq=5.0), product of:
              0.31944552 = queryWeight, product of:
                3.6513348 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01609357 = queryNorm
              1.139591 = fieldWeight in 1372, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.09375 = fieldNorm(doc=1372)
          0.3554672 = weight(abstract_txt:construction in 1372) [ClassicSimilarity], result of:
            0.3554672 = score(doc=1372,freq=3.0), product of:
              0.40156162 = queryWeight, product of:
                4.5770364 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01609357 = queryNorm
              0.88521206 = fieldWeight in 1372, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.09375 = fieldNorm(doc=1372)
          0.48528135 = weight(abstract_txt:thesaurus in 1372) [ClassicSimilarity], result of:
            0.48528135 = score(doc=1372,freq=3.0), product of:
              0.57799166 = queryWeight, product of:
                6.945908 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01609357 = queryNorm
              0.8395992 = fieldWeight in 1372, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.09375 = fieldNorm(doc=1372)
        0.2 = coord(5/25)
    
  3. Nielsen, M.L.: Future thesauri : what kind of conceptual knowledge do searchers need? (1998) 0.24
    0.24073714 = sum of:
      0.24073714 = product of:
        1.0030714 = sum of:
          0.06627327 = weight(abstract_txt:searcher in 1145) [ClassicSimilarity], result of:
            0.06627327 = score(doc=1145,freq=1.0), product of:
              0.12481957 = queryWeight, product of:
                1.1412075 = boost
                6.7961926 = idf(docFreq=134, maxDocs=44421)
                0.01609357 = queryNorm
              0.5309526 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7961926 = idf(docFreq=134, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
          0.034764167 = weight(abstract_txt:indexing in 1145) [ClassicSimilarity], result of:
            0.034764167 = score(doc=1145,freq=1.0), product of:
              0.102287285 = queryWeight, product of:
                1.4609962 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.01609357 = queryNorm
              0.33986792 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
          0.026612727 = weight(abstract_txt:retrieval in 1145) [ClassicSimilarity], result of:
            0.026612727 = score(doc=1145,freq=1.0), product of:
              0.09798444 = queryWeight, product of:
                1.7513077 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01609357 = queryNorm
              0.27160156 = fieldWeight in 1145, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
          0.3033644 = weight(abstract_txt:thesauri in 1145) [ClassicSimilarity], result of:
            0.3033644 = score(doc=1145,freq=5.0), product of:
              0.31944552 = queryWeight, product of:
                3.6513348 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01609357 = queryNorm
              0.9496592 = fieldWeight in 1145, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
          0.24186477 = weight(abstract_txt:construction in 1145) [ClassicSimilarity], result of:
            0.24186477 = score(doc=1145,freq=2.0), product of:
              0.40156162 = queryWeight, product of:
                4.5770364 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01609357 = queryNorm
              0.6023105 = fieldWeight in 1145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
          0.33019212 = weight(abstract_txt:thesaurus in 1145) [ClassicSimilarity], result of:
            0.33019212 = score(doc=1145,freq=2.0), product of:
              0.57799166 = queryWeight, product of:
                6.945908 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01609357 = queryNorm
              0.5712749 = fieldWeight in 1145, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.078125 = fieldNorm(doc=1145)
        0.24 = coord(6/25)
    
  4. McCulloch, E.: Thesauri: practical guidance for construction (2005) 0.24
    0.23964989 = sum of:
      0.23964989 = product of:
        0.99854124 = sum of:
          0.038965154 = weight(abstract_txt:assist in 5724) [ClassicSimilarity], result of:
            0.038965154 = score(doc=5724,freq=1.0), product of:
              0.10165171 = queryWeight, product of:
                1.0298657 = boost
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.01609357 = queryNorm
              0.38332018 = fieldWeight in 5724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.133123 = idf(docFreq=261, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
          0.021290181 = weight(abstract_txt:retrieval in 5724) [ClassicSimilarity], result of:
            0.021290181 = score(doc=5724,freq=1.0), product of:
              0.09798444 = queryWeight, product of:
                1.7513077 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01609357 = queryNorm
              0.21728125 = fieldWeight in 5724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
          0.04838589 = weight(abstract_txt:selected in 5724) [ClassicSimilarity], result of:
            0.04838589 = score(doc=5724,freq=1.0), product of:
              0.14796293 = queryWeight, product of:
                1.7571738 = boost
                5.2322173 = idf(docFreq=644, maxDocs=44421)
                0.01609357 = queryNorm
              0.32701358 = fieldWeight in 5724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2322173 = idf(docFreq=644, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
          0.2426915 = weight(abstract_txt:thesauri in 5724) [ClassicSimilarity], result of:
            0.2426915 = score(doc=5724,freq=5.0), product of:
              0.31944552 = queryWeight, product of:
                3.6513348 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01609357 = queryNorm
              0.75972736 = fieldWeight in 5724, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
          0.27363876 = weight(abstract_txt:construction in 5724) [ClassicSimilarity], result of:
            0.27363876 = score(doc=5724,freq=4.0), product of:
              0.40156162 = queryWeight, product of:
                4.5770364 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01609357 = queryNorm
              0.68143654 = fieldWeight in 5724, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
          0.37356973 = weight(abstract_txt:thesaurus in 5724) [ClassicSimilarity], result of:
            0.37356973 = score(doc=5724,freq=4.0), product of:
              0.57799166 = queryWeight, product of:
                6.945908 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01609357 = queryNorm
              0.64632374 = fieldWeight in 5724, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.0625 = fieldNorm(doc=5724)
        0.24 = coord(6/25)
    
  5. Hou, H.; Chen, S.: The integration of Chinese classification and thesaurus : its progress and technical features (1996) 0.22
    0.21953566 = sum of:
      0.21953566 = product of:
        1.372098 = sum of:
          0.09832791 = weight(abstract_txt:indexing in 2318) [ClassicSimilarity], result of:
            0.09832791 = score(doc=2318,freq=2.0), product of:
              0.102287285 = queryWeight, product of:
                1.4609962 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.01609357 = queryNorm
              0.9612917 = fieldWeight in 2318, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.15625 = fieldNorm(doc=2318)
          0.27133733 = weight(abstract_txt:thesauri in 2318) [ClassicSimilarity], result of:
            0.27133733 = score(doc=2318,freq=1.0), product of:
              0.31944552 = queryWeight, product of:
                3.6513348 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.01609357 = queryNorm
              0.849401 = fieldWeight in 2318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.15625 = fieldNorm(doc=2318)
          0.34204844 = weight(abstract_txt:construction in 2318) [ClassicSimilarity], result of:
            0.34204844 = score(doc=2318,freq=1.0), product of:
              0.40156162 = queryWeight, product of:
                4.5770364 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01609357 = queryNorm
              0.8517957 = fieldWeight in 2318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.15625 = fieldNorm(doc=2318)
          0.66038424 = weight(abstract_txt:thesaurus in 2318) [ClassicSimilarity], result of:
            0.66038424 = score(doc=2318,freq=2.0), product of:
              0.57799166 = queryWeight, product of:
                6.945908 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01609357 = queryNorm
              1.1425498 = fieldWeight in 2318, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.15625 = fieldNorm(doc=2318)
        0.16 = coord(4/25)
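
The content-based scores above combine per-term weights with a coordination factor. As a hedged illustration, the sketch below recomputes the score of entry 1 (doc=6) from the numbers printed in its explanation. It assumes queryWeight = boost * idf * queryNorm, fieldWeight = sqrt(freq) * idf * fieldNorm, and idf = 1 + ln(maxDocs / (docFreq + 1)). The term "manual" shows no boost line, so a boost of 1.0 is assumed for it. All function and variable names are hypothetical.

```python
import math

MAX_DOCS = 44421          # maxDocs from the listing
QUERY_NORM = 0.01609357   # queryNorm from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf variant: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (freq, docFreq, boost) for the five matching terms of entry 1 (doc=6).
matches = [
    (1.0, 312, 1.0),         # manual (no boost shown; 1.0 is an assumption)
    (1.0, 1215, 1.5442288),  # approaches
    (1.0, 644, 1.7571738),   # selected
    (4.0, 517, 4.5770364),   # construction
    (7.0, 685, 6.945908),    # thesaurus
]
field_norm = 0.09375
coord = 5 / 25  # 5 of the 25 query terms matched

raw_sum = sum(term_score(f, df, b, field_norm) for f, df, b in matches)
total = coord * raw_sum
print(raw_sum, total)
```

Under these assumptions the two printed values should be close to the 1.327086 and 0.2654172 shown for entry 1. The same function applies to the other entries once their freq, docFreq, boost, fieldNorm and coord values are substituted.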