Document (#32925)

Author
Losee, R.M.
Title
Decisions in thesaurus construction and use
Source
Information processing and management. 43(2007) no.4, pp. 958-968
Year
2007
Abstract
A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of more than one set of possible subclasses should be used. Based on retrospective measurements or estimates of future performance when using thesaurus terms in document ordering, decisions are made so as to maximize performance. These decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus is described, consistent with the decision criteria developed here. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in "breadcrumb navigation".
Theme
Conception and application of the thesaurus principle

Similar documents (author)

  1. Losee, R.M.: A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 2334) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 2334, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=2334)
    
  2. Losee, R.M.: The relative shelf location of circulated books : a study of classification, users, and browsing (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4484) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4484)
    
  3. Losee, R.M.: Seven fundamental questions for the science of library classification (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4507) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4507, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4507)
    
  4. Losee, R.M.: Term dependence : truncating the Bahadur Lazarsfeld expansion (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7389) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7389)
    
  5. Losee, R.M.: Upper bounds for retrieval performance and their use measuring performance and generating optimal queries : can it get any better than this? (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7417) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7417, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7417)
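
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the ClassicSimilarity conventions that the listing itself suggests: tf is the square root of the term frequency, and idf is 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of any library; Lucene computes with 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the author_txt:losee entries in the listing.
author_idf = classic_idf(doc_freq=29, max_docs=44421)
author_score = field_weight(freq=1.0, doc_freq=29, max_docs=44421, field_norm=0.625)

print(f"idf   = {author_idf:.5f}")    # listing reports 8.30027
print(f"score = {author_score:.5f}")  # listing reports 5.187669
```

All five author matches share the same freq, docFreq, and fieldNorm, which is why each is listed with the identical score of 5.19.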
    

Similar documents (content)

  1. Park, Y.C.; Choi, K.-S.: Automatic thesaurus construction using Bayesian networks (1996) 0.18
    0.17555782 = sum of:
      0.17555782 = product of:
        0.73149097 = sum of:
          0.12127408 = weight(abstract_txt:deciding in 6649) [ClassicSimilarity], result of:
            0.12127408 = score(doc=6649,freq=1.0), product of:
              0.16691004 = queryWeight, product of:
                1.1938149 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.01803978 = queryNorm
              0.7265835 = fieldWeight in 6649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
          0.08121907 = weight(abstract_txt:term in 6649) [ClassicSimilarity], result of:
            0.08121907 = score(doc=6649,freq=2.0), product of:
              0.1277642 = queryWeight, product of:
                1.4771185 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.01803978 = queryNorm
              0.6356951 = fieldWeight in 6649, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
          0.11937584 = weight(abstract_txt:construction in 6649) [ClassicSimilarity], result of:
            0.11937584 = score(doc=6649,freq=2.0), product of:
              0.16516376 = queryWeight, product of:
                1.6794541 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01803978 = queryNorm
              0.7227726 = fieldWeight in 6649, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
          0.11555059 = weight(abstract_txt:terms in 6649) [ClassicSimilarity], result of:
            0.11555059 = score(doc=6649,freq=5.0), product of:
              0.1363125 = queryWeight, product of:
                1.8686337 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.01803978 = queryNorm
              0.8476889 = fieldWeight in 6649, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
          0.039429102 = weight(abstract_txt:used in 6649) [ClassicSimilarity], result of:
            0.039429102 = score(doc=6649,freq=1.0), product of:
              0.12527615 = queryWeight, product of:
                2.068521 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01803978 = queryNorm
              0.3147375 = fieldWeight in 6649, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
          0.25464225 = weight(abstract_txt:thesaurus in 6649) [ClassicSimilarity], result of:
            0.25464225 = score(doc=6649,freq=2.0), product of:
              0.3714532 = queryWeight, product of:
                3.9822893 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01803978 = queryNorm
              0.6855298 = fieldWeight in 6649, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.09375 = fieldNorm(doc=6649)
        0.24 = coord(6/25)
    
  2. Becker, C.; Rauber, A.: Decision criteria in digital preservation : what to measure and how (2011) 0.16
    0.1601358 = sum of:
      0.1601358 = product of:
        0.5719136 = sum of:
          0.12085738 = weight(abstract_txt:measurements in 456) [ClassicSimilarity], result of:
            0.12085738 = score(doc=456,freq=2.0), product of:
              0.17319557 = queryWeight, product of:
                1.2160856 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.01803978 = queryNorm
              0.69780874 = fieldWeight in 456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.024755906 = weight(abstract_txt:when in 456) [ClassicSimilarity], result of:
            0.024755906 = score(doc=456,freq=1.0), product of:
              0.09553456 = queryWeight, product of:
                1.2772944 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.01803978 = queryNorm
              0.25913036 = fieldWeight in 456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.03319947 = weight(abstract_txt:made in 456) [ClassicSimilarity], result of:
            0.03319947 = score(doc=456,freq=1.0), product of:
              0.116179295 = queryWeight, product of:
                1.4085592 = boost
                4.5721703 = idf(docFreq=1247, maxDocs=44421)
                0.01803978 = queryNorm
              0.28576064 = fieldWeight in 456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5721703 = idf(docFreq=1247, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.05414605 = weight(abstract_txt:term in 456) [ClassicSimilarity], result of:
            0.05414605 = score(doc=456,freq=2.0), product of:
              0.1277642 = queryWeight, product of:
                1.4771185 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.01803978 = queryNorm
              0.42379674 = fieldWeight in 456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.12163679 = weight(abstract_txt:decision in 456) [ClassicSimilarity], result of:
            0.12163679 = score(doc=456,freq=4.0), product of:
              0.17393939 = queryWeight, product of:
                1.7234937 = boost
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.01803978 = queryNorm
              0.6993056 = fieldWeight in 456, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5944448 = idf(docFreq=448, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.02628607 = weight(abstract_txt:used in 456) [ClassicSimilarity], result of:
            0.02628607 = score(doc=456,freq=1.0), product of:
              0.12527615 = queryWeight, product of:
                2.068521 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01803978 = queryNorm
              0.20982501 = fieldWeight in 456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
          0.19103195 = weight(abstract_txt:decisions in 456) [ClassicSimilarity], result of:
            0.19103195 = score(doc=456,freq=2.0), product of:
              0.37305996 = queryWeight, product of:
                3.5695632 = boost
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.01803978 = queryNorm
              0.5120677 = fieldWeight in 456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.0625 = fieldNorm(doc=456)
        0.28 = coord(7/25)
    
  3. Fidel, R.: Thesaurus requirements for an intermediary expert system (1992) 0.16
    0.15850827 = sum of:
      0.15850827 = product of:
        0.56610096 = sum of:
          0.024755906 = weight(abstract_txt:when in 2102) [ClassicSimilarity], result of:
            0.024755906 = score(doc=2102,freq=1.0), product of:
              0.09553456 = queryWeight, product of:
                1.2772944 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.01803978 = queryNorm
              0.25913036 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.03828704 = weight(abstract_txt:term in 2102) [ClassicSimilarity], result of:
            0.03828704 = score(doc=2102,freq=1.0), product of:
              0.1277642 = queryWeight, product of:
                1.4771185 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.01803978 = queryNorm
              0.29966956 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.05627431 = weight(abstract_txt:construction in 2102) [ClassicSimilarity], result of:
            0.05627431 = score(doc=2102,freq=1.0), product of:
              0.16516376 = queryWeight, product of:
                1.6794541 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01803978 = queryNorm
              0.34071827 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.03445053 = weight(abstract_txt:terms in 2102) [ClassicSimilarity], result of:
            0.03445053 = score(doc=2102,freq=1.0), product of:
              0.1363125 = queryWeight, product of:
                1.8686337 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.01803978 = queryNorm
              0.252732 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.037174117 = weight(abstract_txt:used in 2102) [ClassicSimilarity], result of:
            0.037174117 = score(doc=2102,freq=2.0), product of:
              0.12527615 = queryWeight, product of:
                2.068521 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01803978 = queryNorm
              0.29673737 = fieldWeight in 2102, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.13508 = weight(abstract_txt:decisions in 2102) [ClassicSimilarity], result of:
            0.13508 = score(doc=2102,freq=1.0), product of:
              0.37305996 = queryWeight, product of:
                3.5695632 = boost
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.01803978 = queryNorm
              0.36208653 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
          0.24007902 = weight(abstract_txt:thesaurus in 2102) [ClassicSimilarity], result of:
            0.24007902 = score(doc=2102,freq=4.0), product of:
              0.3714532 = queryWeight, product of:
                3.9822893 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01803978 = queryNorm
              0.64632374 = fieldWeight in 2102, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.0625 = fieldNorm(doc=2102)
        0.28 = coord(7/25)
    
  4. Losee, R.M.: The effect of assigning a metadata or indexing term on document ordering (2013) 0.14
    0.13692087 = sum of:
      0.13692087 = product of:
        0.57050365 = sum of:
          0.12338706 = weight(abstract_txt:ordering in 2100) [ClassicSimilarity], result of:
            0.12338706 = score(doc=2100,freq=3.0), product of:
              0.13219975 = queryWeight, product of:
                1.0624563 = boost
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.01803978 = queryNorm
              0.9333381 = fieldWeight in 2100, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8974466 = idf(docFreq=121, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
          0.043762673 = weight(abstract_txt:when in 2100) [ClassicSimilarity], result of:
            0.043762673 = score(doc=2100,freq=2.0), product of:
              0.09553456 = queryWeight, product of:
                1.2772944 = boost
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.01803978 = queryNorm
              0.45808208 = fieldWeight in 2100, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1460857 = idf(docFreq=1910, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
          0.09572314 = weight(abstract_txt:performance in 2100) [ClassicSimilarity], result of:
            0.09572314 = score(doc=2100,freq=5.0), product of:
              0.11861035 = queryWeight, product of:
                1.42322 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.01803978 = queryNorm
              0.80703866 = fieldWeight in 2100, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
          0.0957176 = weight(abstract_txt:term in 2100) [ClassicSimilarity], result of:
            0.0957176 = score(doc=2100,freq=4.0), product of:
              0.1277642 = queryWeight, product of:
                1.4771185 = boost
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.01803978 = queryNorm
              0.7491739 = fieldWeight in 2100, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.794713 = idf(docFreq=998, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
          0.043063167 = weight(abstract_txt:terms in 2100) [ClassicSimilarity], result of:
            0.043063167 = score(doc=2100,freq=1.0), product of:
              0.1363125 = queryWeight, product of:
                1.8686337 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.01803978 = queryNorm
              0.31591502 = fieldWeight in 2100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
          0.16884999 = weight(abstract_txt:decisions in 2100) [ClassicSimilarity], result of:
            0.16884999 = score(doc=2100,freq=1.0), product of:
              0.37305996 = queryWeight, product of:
                3.5695632 = boost
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.01803978 = queryNorm
              0.45260817 = fieldWeight in 2100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7933846 = idf(docFreq=367, maxDocs=44421)
                0.078125 = fieldNorm(doc=2100)
        0.24 = coord(6/25)
    
  5. Srinivasan, P.: Thesaurus construction (1992) 0.13
    0.13405302 = sum of:
      0.13405302 = product of:
        0.6702651 = sum of:
          0.14068577 = weight(abstract_txt:construction in 4504) [ClassicSimilarity], result of:
            0.14068577 = score(doc=4504,freq=4.0), product of:
              0.16516376 = queryWeight, product of:
                1.6794541 = boost
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.01803978 = queryNorm
              0.8517957 = fieldWeight in 4504, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4514923 = idf(docFreq=517, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.043063167 = weight(abstract_txt:terms in 4504) [ClassicSimilarity], result of:
            0.043063167 = score(doc=4504,freq=1.0), product of:
              0.1363125 = queryWeight, product of:
                1.8686337 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.01803978 = queryNorm
              0.31591502 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.056665145 = weight(abstract_txt:should in 4504) [ClassicSimilarity], result of:
            0.056665145 = score(doc=4504,freq=1.0), product of:
              0.16368507 = queryWeight, product of:
                2.0476744 = boost
                4.4311547 = idf(docFreq=1436, maxDocs=44421)
                0.01803978 = queryNorm
              0.34618396 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4311547 = idf(docFreq=1436, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.032857586 = weight(abstract_txt:used in 4504) [ClassicSimilarity], result of:
            0.032857586 = score(doc=4504,freq=1.0), product of:
              0.12527615 = queryWeight, product of:
                2.068521 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01803978 = queryNorm
              0.26228127 = fieldWeight in 4504, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
          0.39699337 = weight(abstract_txt:thesaurus in 4504) [ClassicSimilarity], result of:
            0.39699337 = score(doc=4504,freq=7.0), product of:
              0.3714532 = queryWeight, product of:
                3.9822893 = boost
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.01803978 = queryNorm
              1.0687574 = fieldWeight in 4504, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.17059 = idf(docFreq=685, maxDocs=44421)
                0.078125 = fieldNorm(doc=4504)
        0.2 = coord(5/25)
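
The content-match scores combine several per-term scores and then scale the sum by a coordination factor. The sketch below recomputes the last entry (doc=4504) from the numbers shown in its explain tree, assuming the same tf and idf conventions as the listing (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The constant and function names are illustrative only; the boosts, document frequencies, queryNorm, and fieldNorm are copied from the listing.

```python
import math

MAX_DOCS = 44421          # maxDocs from the listing
QUERY_NORM = 0.01803978   # queryNorm from the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=4504)
QUERY_CLAUSES = 25        # denominator of coord(5/25)

# (term, boost, docFreq, freq) for the terms matched in doc 4504.
MATCHED_TERMS = [
    ("construction", 1.6794541, 517, 4.0),
    ("terms",        1.8686337, 2116, 1.0),
    ("should",       2.0476744, 1436, 1.0),
    ("used",         2.068521,  4205, 1.0),
    ("thesaurus",    3.9822893, 685,  7.0),
]


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    """score = queryWeight * fieldWeight for one matched term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


# Sum the per-term scores, then scale by the fraction of query clauses matched.
raw_sum = sum(term_score(b, df, f) for _, b, df, f in MATCHED_TERMS)
coord = len(MATCHED_TERMS) / QUERY_CLAUSES
final_score = raw_sum * coord

print(f"sum   = {raw_sum:.5f}")      # listing reports 0.6702651
print(f"coord = {coord:.2f}")        # listing reports 0.2
print(f"score = {final_score:.5f}")  # listing reports 0.13405302
```

The coord factor explains why a document matching seven of the 25 query clauses (coord 0.28) can outrank one with a larger single-term weight but fewer matched clauses.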