Document (#40088)

Author
Losee, R.
Title
Thesaurus structure, descriptive parameters, and scale
Source
Journal of the Association for Information Science and Technology. 67(2016) no.9, pp. 2156-2165
Year
2016
Abstract
A thesaurus contains a set of terms or features that may be used to represent recorded information, including prose documents or scientific data sets. The focus of this work is on the basic structural nature of a thesaurus itself, not on how people develop a thesaurus or how a thesaurus affects retrieval performance. Thesauri in this research are automatically developed in a simulation from sets of randomly or exhaustively generated documents. Each thesaurus is generated by the Thesaurus Generator software from a set of several hundred documents, and thousands of different document sets are used as input to the Thesaurus Generator, producing thousands of thesauri. Thus, thousands of thesauri are generated for each data point in accompanying graphs. The characteristics of this large number of thesauri are studied so that the relationships between thesaurus parameters can be determined. Some rules governing these relationships are suggested, addressing factors such as tree height and width, number of tree roots in thesauri, and number of terms available for the vocabulary. How these parameters scale as vocabularies grow is addressed. These results apply to various information systems that contain features with hierarchical relationships, including many thesauri and ontologies.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23544/full.
Theme
Conception and application of the thesaurus principle

Similar documents (author)

  1. Losee, R.M.: A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 2334) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 2334, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=2334)
    
  2. Losee, R.M.: The relative shelf location of circulated books : a study of classification, users, and browsing (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4484) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4484)
    
  3. Losee, R.M.: Seven fundamental questions for the science of library classification (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4507) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4507, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4507)
    
  4. Losee, R.M.: Term dependence : truncating the Bahadur Lazarsfeld expansion (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7389) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7389)
    
  5. Losee, R.M.: Upper bounds for retrieval performance and their use measuring performance and generating optimal queries : can it get any better than this? (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7417) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7417, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7417)
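
The author-match scores above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the printed figures. The function names are illustrative and not part of any library, and fieldNorm is copied from the listing rather than recomputed.

```python
import math

def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw count (assumed form)."""
    return math.sqrt(freq)

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as laid out in the explain tree."""
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Values copied from the listing: termFreq=1.0, docFreq=29, maxDocs=44421, fieldNorm=0.625
author_score = field_weight(freq=1.0, doc_freq=29, max_docs=44421, field_norm=0.625)
print(f"idf = {idf(29, 44421):.5f}, fieldWeight = {author_score:.6f}")
```

All five author matches share the same score because each has one occurrence of the same term and the same fieldNorm, so the three factors are identical.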
    

Similar documents (content)

  1. Srinivasan, P.: Thesaurus construction (1992) 0.27
    0.267775 = sum of:
      0.267775 = product of:
        0.95633924 = sum of:
          0.020988382 = weight(abstract_txt:terms in 4504) [ClassicSimilarity], result of:
            0.020988382 = score(doc=4504,freq=1.0), product of:
              0.0664368 = queryWeight, product of:
                1.0830364 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.015169993 = queryNorm
              0.31591502 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.02216185 = weight(abstract_txt:each in 4504) [ClassicSimilarity], result of:
            0.02216185 = score(doc=4504,freq=1.0), product of:
              0.06889062 = queryWeight, product of:
                1.1028559 = boost
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.015169993 = queryNorm
              0.32169622 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.029715585 = weight(abstract_txt:features in 4504) [ClassicSimilarity], result of:
            0.029715585 = score(doc=4504,freq=1.0), product of:
              0.083768144 = queryWeight, product of:
                1.2161249 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.015169993 = queryNorm
              0.3547361 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.021712178 = weight(abstract_txt:these in 4504) [ClassicSimilarity], result of:
            0.021712178 = score(doc=4504,freq=2.0), product of:
              0.06174172 = queryWeight, product of:
                1.2787149 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.015169993 = queryNorm
              0.35166138 = fieldWeight in 4504, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.033378843 = weight(abstract_txt:documents in 4504) [ClassicSimilarity], result of:
            0.033378843 = score(doc=4504,freq=1.0), product of:
              0.10361771 = queryWeight, product of:
                1.6565379 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.015169993 = queryNorm
              0.32213452 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.30596212 = weight(abstract_txt:thesauri in 4504) [ClassicSimilarity], result of:
            0.30596212 = score(doc=4504,freq=4.0), product of:
              0.3602093 = queryWeight, product of:
                4.3679414 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.015169993 = queryNorm
              0.849401 = fieldWeight in 4504, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.5224203 = weight(abstract_txt:thesaurus in 4504) [ClassicSimilarity], result of:
            0.5224203 = score(doc=4504,freq=7.0), product of:
              0.48881093 = queryWeight, product of:
                6.2318277 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.015169993 = queryNorm
              1.0687574 = fieldWeight in 4504, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
        0.28 = coord(7/25)
    
  2. Rada, R.: Connecting and evaluating thesauri : issues and cases (1987) 0.26
    0.26085937 = sum of:
      0.26085937 = product of:
        1.0869141 = sum of:
          0.035618465 = weight(abstract_txt:terms in 822) [ClassicSimilarity], result of:
            0.035618465 = score(doc=822,freq=2.0), product of:
              0.0664368 = queryWeight, product of:
                1.0830364 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.015169993 = queryNorm
              0.53612554 = fieldWeight in 822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
          0.018423393 = weight(abstract_txt:these in 822) [ClassicSimilarity], result of:
            0.018423393 = score(doc=822,freq=1.0), product of:
              0.06174172 = queryWeight, product of:
                1.2787149 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.015169993 = queryNorm
              0.29839456 = fieldWeight in 822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
          0.04005461 = weight(abstract_txt:documents in 822) [ClassicSimilarity], result of:
            0.04005461 = score(doc=822,freq=1.0), product of:
              0.10361771 = queryWeight, product of:
                1.6565379 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.015169993 = queryNorm
              0.38656145 = fieldWeight in 822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
          0.06317733 = weight(abstract_txt:relationships in 822) [ClassicSimilarity], result of:
            0.06317733 = score(doc=822,freq=1.0), product of:
              0.14040196 = queryWeight, product of:
                1.9282838 = boost
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.015169993 = queryNorm
              0.44997472 = fieldWeight in 822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
          0.51923496 = weight(abstract_txt:thesauri in 822) [ClassicSimilarity], result of:
            0.51923496 = score(doc=822,freq=8.0), product of:
              0.3602093 = queryWeight, product of:
                4.3679414 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.015169993 = queryNorm
              1.4414812 = fieldWeight in 822, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
          0.41040525 = weight(abstract_txt:thesaurus in 822) [ClassicSimilarity], result of:
            0.41040525 = score(doc=822,freq=3.0), product of:
              0.48881093 = queryWeight, product of:
                6.2318277 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.015169993 = queryNorm
              0.8395992 = fieldWeight in 822, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.09375 = fieldNorm(doc=822)
        0.24 = coord(6/25)
    
  3. Aitchison, J.: A classification as a source for a thesaurus : the bibliographic classification of H.E. Bliss as a source of thesaurus terms and structure (1986) 0.21
    0.21120013 = sum of:
      0.21120013 = product of:
        0.8800006 = sum of:
          0.029682051 = weight(abstract_txt:terms in 1568) [ClassicSimilarity], result of:
            0.029682051 = score(doc=1568,freq=2.0), product of:
              0.0664368 = queryWeight, product of:
                1.0830364 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.015169993 = queryNorm
              0.44677126 = fieldWeight in 1568, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
          0.015352828 = weight(abstract_txt:these in 1568) [ClassicSimilarity], result of:
            0.015352828 = score(doc=1568,freq=1.0), product of:
              0.06174172 = queryWeight, product of:
                1.2787149 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.015169993 = queryNorm
              0.24866214 = fieldWeight in 1568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
          0.033679727 = weight(abstract_txt:number in 1568) [ClassicSimilarity], result of:
            0.033679727 = score(doc=1568,freq=1.0), product of:
              0.10423947 = queryWeight, product of:
                1.6615005 = boost
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.015169993 = queryNorm
              0.32309955 = fieldWeight in 1568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1356745 = idf(docFreq=1930, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
          0.052647777 = weight(abstract_txt:relationships in 1568) [ClassicSimilarity], result of:
            0.052647777 = score(doc=1568,freq=1.0), product of:
              0.14040196 = queryWeight, product of:
                1.9282838 = boost
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.015169993 = queryNorm
              0.37497893 = fieldWeight in 1568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
          0.264971 = weight(abstract_txt:thesauri in 1568) [ClassicSimilarity], result of:
            0.264971 = score(doc=1568,freq=3.0), product of:
              0.3602093 = queryWeight, product of:
                4.3679414 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.015169993 = queryNorm
              0.73560286 = fieldWeight in 1568, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
          0.48366722 = weight(abstract_txt:thesaurus in 1568) [ClassicSimilarity], result of:
            0.48366722 = score(doc=1568,freq=6.0), product of:
              0.48881093 = queryWeight, product of:
                6.2318277 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.015169993 = queryNorm
              0.9894771 = fieldWeight in 1568, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.078125 = fieldNorm(doc=1568)
        0.24 = coord(6/25)
    
  4. Evens, M.: Thesaural relations in information retrieval (2002) 0.20
    0.19859034 = sum of:
      0.19859034 = product of:
        0.82745975 = sum of:
          0.043623533 = weight(abstract_txt:terms in 2201) [ClassicSimilarity], result of:
            0.043623533 = score(doc=2201,freq=3.0), product of:
              0.0664368 = queryWeight, product of:
                1.0830364 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.015169993 = queryNorm
              0.65661705 = fieldWeight in 2201, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
          0.018423393 = weight(abstract_txt:these in 2201) [ClassicSimilarity], result of:
            0.018423393 = score(doc=2201,freq=1.0), product of:
              0.06174172 = queryWeight, product of:
                1.2787149 = boost
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.015169993 = queryNorm
              0.29839456 = fieldWeight in 2201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1828754 = idf(docFreq=5006, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
          0.04005461 = weight(abstract_txt:documents in 2201) [ClassicSimilarity], result of:
            0.04005461 = score(doc=2201,freq=1.0), product of:
              0.10361771 = queryWeight, product of:
                1.6565379 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.015169993 = queryNorm
              0.38656145 = fieldWeight in 2201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
          0.22879313 = weight(abstract_txt:thousands in 2201) [ClassicSimilarity], result of:
            0.22879313 = score(doc=2201,freq=1.0), product of:
              0.33110136 = queryWeight, product of:
                2.9611802 = boost
                7.370734 = idf(docFreq=75, maxDocs=44421)
                0.015169993 = queryNorm
              0.6910063 = fieldWeight in 2201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.370734 = idf(docFreq=75, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
          0.25961748 = weight(abstract_txt:thesauri in 2201) [ClassicSimilarity], result of:
            0.25961748 = score(doc=2201,freq=2.0), product of:
              0.3602093 = queryWeight, product of:
                4.3679414 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.015169993 = queryNorm
              0.7207406 = fieldWeight in 2201, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
          0.23694758 = weight(abstract_txt:thesaurus in 2201) [ClassicSimilarity], result of:
            0.23694758 = score(doc=2201,freq=1.0), product of:
              0.48881093 = queryWeight, product of:
                6.2318277 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.015169993 = queryNorm
              0.48474282 = fieldWeight in 2201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.09375 = fieldNorm(doc=2201)
        0.24 = coord(6/25)
    
  5. Willis, C.; Losee, R.M.: A random walk on an ontology : using thesaurus structure for automatic subject indexing (2013) 0.20
    0.19669673 = sum of:
      0.19669673 = product of:
        0.7024883 = sum of:
          0.020777436 = weight(abstract_txt:terms in 2016) [ClassicSimilarity], result of:
            0.020777436 = score(doc=2016,freq=2.0), product of:
              0.0664368 = queryWeight, product of:
                1.0830364 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.015169993 = queryNorm
              0.31273988 = fieldWeight in 2016, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.015513295 = weight(abstract_txt:each in 2016) [ClassicSimilarity], result of:
            0.015513295 = score(doc=2016,freq=1.0), product of:
              0.06889062 = queryWeight, product of:
                1.1028559 = boost
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.015169993 = queryNorm
              0.22518735 = fieldWeight in 2016, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1177115 = idf(docFreq=1965, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.020800907 = weight(abstract_txt:features in 2016) [ClassicSimilarity], result of:
            0.020800907 = score(doc=2016,freq=1.0), product of:
              0.083768144 = queryWeight, product of:
                1.2161249 = boost
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.015169993 = queryNorm
              0.24831524 = fieldWeight in 2016, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5406218 = idf(docFreq=1287, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.03304337 = weight(abstract_txt:documents in 2016) [ClassicSimilarity], result of:
            0.03304337 = score(doc=2016,freq=2.0), product of:
              0.10361771 = queryWeight, product of:
                1.6565379 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.015169993 = queryNorm
              0.31889692 = fieldWeight in 2016, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.06383203 = weight(abstract_txt:relationships in 2016) [ClassicSimilarity], result of:
            0.06383203 = score(doc=2016,freq=3.0), product of:
              0.14040196 = queryWeight, product of:
                1.9282838 = boost
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.015169993 = queryNorm
              0.45463777 = fieldWeight in 2016, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7997303 = idf(docFreq=993, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.23945324 = weight(abstract_txt:thesauri in 2016) [ClassicSimilarity], result of:
            0.23945324 = score(doc=2016,freq=5.0), product of:
              0.3602093 = queryWeight, product of:
                4.3679414 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.015169993 = queryNorm
              0.6647614 = fieldWeight in 2016, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
          0.30906802 = weight(abstract_txt:thesaurus in 2016) [ClassicSimilarity], result of:
            0.30906802 = score(doc=2016,freq=5.0), product of:
              0.48881093 = queryWeight, product of:
                6.2318277 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.015169993 = queryNorm
              0.6322854 = fieldWeight in 2016, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2016)
        0.28 = coord(7/25)
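
For the content matches, each term score is queryWeight times fieldWeight, and the sum over matched terms is scaled by coord, the fraction of the 25 query terms that were found in the document. The sketch below recomputes entry 1 (doc 4504) from the freq, docFreq and boost values printed in its explain tree. It assumes the classic Lucene TF-IDF formulas; the function and variable names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.015169993  # queryNorm, as printed in the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=4504), as printed in the listing
QUERY_TERMS = 25          # denominator of coord(7/25)

def idf(doc_freq: int) -> float:
    """Assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """score = queryWeight * fieldWeight for one matched term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight

# (term, freq, docFreq, boost) copied from the explain tree for doc 4504
matched = [
    ("terms",     1.0, 2116, 1.0830364),
    ("each",      1.0, 1965, 1.1028559),
    ("features",  1.0, 1287, 1.2161249),
    ("these",     2.0, 5006, 1.2787149),
    ("documents", 1.0, 1954, 1.6565379),
    ("thesauri",  4.0,  525, 4.3679414),
    ("thesaurus", 7.0,  685, 6.2318277),
]

coord = len(matched) / QUERY_TERMS
total = coord * sum(term_score(f, df, b) for _, f, df, b in matched)
print(f"coord = {coord:.2f}, total = {total:.4f}")
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "thesaurus" and "thesauri" dominate the sum, while common words such as "these" contribute little.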