Document (#2331)

Author
Salton, G.
Title
Fast document classification in automatic information retrieval
Source
Kooperation in der Klassifikation I. Proc. der Sekt.1-3 der 2. Fachtagung der Gesellschaft für Klassifikation, Frankfurt-Hoechst, 6.-7.4.1978. Bearb.: W. Dahlberg
Imprint
Frankfurt : Gesellschaft für Klassifikation
Year
1978
Pages
S.129-146
Series
Studien zur Klassifikation; Bd.2
Abstract
A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all items within a cluster are jointly retrievable. Clustered files are easily adapted to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given together with appropriate file search methods.
Theme
Automatisches Indexieren

Similar documents (author)

  1. Salton, G.: Another look at automatic text-retrieval systems (1986) 4.87
    4.8684025 = sum of:
      4.8684025 = weight(author_txt:salton in 1355) [ClassicSimilarity], result of:
        4.8684025 = score(doc=1355,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.12837885 = queryNorm
          4.868403 = fieldWeight in 1355, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.625 = fieldNorm(doc=1355)
    
  2. Salton, G.: A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART) (1972) 4.87
    4.8684025 = sum of:
      4.8684025 = weight(author_txt:salton in 2324) [ClassicSimilarity], result of:
        4.8684025 = score(doc=2324,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.12837885 = queryNorm
          4.868403 = fieldWeight in 2324, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.625 = fieldNorm(doc=2324)
    
  3. Salton, G.: Future prospects for text-based information retrieval (1990) 4.87
    4.8684025 = sum of:
      4.8684025 = weight(author_txt:salton in 2326) [ClassicSimilarity], result of:
        4.8684025 = score(doc=2326,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.12837885 = queryNorm
          4.868403 = fieldWeight in 2326, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.625 = fieldNorm(doc=2326)
    
  4. Salton, G.: Expert systems and information retrieval (1987) 4.87
    4.8684025 = sum of:
      4.8684025 = weight(author_txt:salton in 2836) [ClassicSimilarity], result of:
        4.8684025 = score(doc=2836,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.12837885 = queryNorm
          4.868403 = fieldWeight in 2836, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.625 = fieldNorm(doc=2836)
    
  5. Salton, G.: Historical note: the past thirty years in information retrieval (1987) 4.87
    4.8684025 = sum of:
      4.8684025 = weight(author_txt:salton in 3909) [ClassicSimilarity], result of:
        4.8684025 = score(doc=3909,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.12837885 = queryNorm
          4.868403 = fieldWeight in 3909, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            7.7894444 = idf(docFreq=49, maxDocs=44421)
            0.625 = fieldNorm(doc=3909)
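
The explain trees above all follow the same arithmetic, which can be reproduced by hand. The sketch below is a minimal illustration and not the search engine's own code: the function names `classic_idf` and `term_score` are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption that happens to match the values printed above. The `fieldNorm` value is taken as given, since it is a stored, lossy length norm that cannot be recomputed from the listing.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, idf, query_norm, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# Values copied from the author_txt:salton explanations above.
idf = classic_idf(doc_freq=49, max_docs=44421)
score = term_score(freq=1.0, idf=idf, query_norm=0.12837885, field_norm=0.625)
print(f"idf={idf:.4f} score={score:.4f}")
```

Because all five author matches share the same term frequency, idf, and field norm, they receive the same score, which is why every entry in this list shows 4.87.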
    

Similar documents (content)

  1. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2012) 0.17
    0.16774862 = sum of:
      0.16774862 = product of:
        0.69895256 = sum of:
          0.041568328 = weight(abstract_txt:simple in 1310) [ClassicSimilarity], result of:
            0.041568328 = score(doc=1310,freq=1.0), product of:
              0.100115865 = queryWeight, product of:
                1.0115006 = boost
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.01862375 = queryNorm
              0.4152022 = fieldWeight in 1310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
          0.041767 = weight(abstract_txt:appropriate in 1310) [ClassicSimilarity], result of:
            0.041767 = score(doc=1310,freq=1.0), product of:
              0.100434616 = queryWeight, product of:
                1.0131096 = boost
                5.3230414 = idf(docFreq=588, maxDocs=44421)
                0.01862375 = queryNorm
              0.41586262 = fieldWeight in 1310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3230414 = idf(docFreq=588, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
          0.053544793 = weight(abstract_txt:fast in 1310) [ClassicSimilarity], result of:
            0.053544793 = score(doc=1310,freq=1.0), product of:
              0.11852393 = queryWeight, product of:
                1.1005702 = boost
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.01862375 = queryNorm
              0.45176357 = fieldWeight in 1310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
          0.027033515 = weight(abstract_txt:search in 1310) [ClassicSimilarity], result of:
            0.027033515 = score(doc=1310,freq=1.0), product of:
              0.09468319 = queryWeight, product of:
                1.391125 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.01862375 = queryNorm
              0.28551546 = fieldWeight in 1310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
          0.103929006 = weight(abstract_txt:files in 1310) [ClassicSimilarity], result of:
            0.103929006 = score(doc=1310,freq=1.0), product of:
              0.23236054 = queryWeight, product of:
                2.1792693 = boost
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.01862375 = queryNorm
              0.44727474 = fieldWeight in 1310, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
          0.4311099 = weight(abstract_txt:file in 1310) [ClassicSimilarity], result of:
            0.4311099 = score(doc=1310,freq=5.0), product of:
              0.44199416 = queryWeight, product of:
                4.2506266 = boost
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.01862375 = queryNorm
              0.97537464 = fieldWeight in 1310, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.078125 = fieldNorm(doc=1310)
        0.24 = coord(6/25)
    
  2. Lee, D.L.; Ren, L.: Document ranking on weight-partitioned signature files (1996) 0.16
    0.15994695 = sum of:
      0.15994695 = product of:
        0.7997347 = sum of:
          0.04819981 = weight(abstract_txt:together in 3417) [ClassicSimilarity], result of:
            0.04819981 = score(doc=3417,freq=1.0), product of:
              0.0978522 = queryWeight, product of:
                5.254162 = idf(docFreq=630, maxDocs=44421)
                0.01862375 = queryNorm
              0.49257767 = fieldWeight in 3417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.254162 = idf(docFreq=630, maxDocs=44421)
                0.09375 = fieldNorm(doc=3417)
          0.03244022 = weight(abstract_txt:search in 3417) [ClassicSimilarity], result of:
            0.03244022 = score(doc=3417,freq=1.0), product of:
              0.09468319 = queryWeight, product of:
                1.391125 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.01862375 = queryNorm
              0.34261855 = fieldWeight in 3417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.09375 = fieldNorm(doc=3417)
          0.13166417 = weight(abstract_txt:grouped in 3417) [ClassicSimilarity], result of:
            0.13166417 = score(doc=3417,freq=1.0), product of:
              0.19121361 = queryWeight, product of:
                1.3978935 = boost
                7.344759 = idf(docFreq=77, maxDocs=44421)
                0.01862375 = queryNorm
              0.68857116 = fieldWeight in 3417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.344759 = idf(docFreq=77, maxDocs=44421)
                0.09375 = fieldNorm(doc=3417)
          0.1247148 = weight(abstract_txt:files in 3417) [ClassicSimilarity], result of:
            0.1247148 = score(doc=3417,freq=1.0), product of:
              0.23236054 = queryWeight, product of:
                2.1792693 = boost
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.01862375 = queryNorm
              0.5367297 = fieldWeight in 3417, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.09375 = fieldNorm(doc=3417)
          0.4627157 = weight(abstract_txt:file in 3417) [ClassicSimilarity], result of:
            0.4627157 = score(doc=3417,freq=4.0), product of:
              0.44199416 = queryWeight, product of:
                4.2506266 = boost
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.01862375 = queryNorm
              1.0468819 = fieldWeight in 3417, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.09375 = fieldNorm(doc=3417)
        0.2 = coord(5/25)
    
  3. O'Neill, E.T.; Bennett, R.; Kammerer, K.: Using authorities to improve subject searches (2014) 0.13
    0.1343838 = sum of:
      0.1343838 = product of:
        0.671919 = sum of:
          0.041568328 = weight(abstract_txt:simple in 2970) [ClassicSimilarity], result of:
            0.041568328 = score(doc=2970,freq=1.0), product of:
              0.100115865 = queryWeight, product of:
                1.0115006 = boost
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.01862375 = queryNorm
              0.4152022 = fieldWeight in 2970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.078125 = fieldNorm(doc=2970)
          0.041767 = weight(abstract_txt:appropriate in 2970) [ClassicSimilarity], result of:
            0.041767 = score(doc=2970,freq=1.0), product of:
              0.100434616 = queryWeight, product of:
                1.0131096 = boost
                5.3230414 = idf(docFreq=588, maxDocs=44421)
                0.01862375 = queryNorm
              0.41586262 = fieldWeight in 2970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3230414 = idf(docFreq=588, maxDocs=44421)
                0.078125 = fieldNorm(doc=2970)
          0.053544793 = weight(abstract_txt:fast in 2970) [ClassicSimilarity], result of:
            0.053544793 = score(doc=2970,freq=1.0), product of:
              0.11852393 = queryWeight, product of:
                1.1005702 = boost
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.01862375 = queryNorm
              0.45176357 = fieldWeight in 2970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.078125 = fieldNorm(doc=2970)
          0.103929006 = weight(abstract_txt:files in 2970) [ClassicSimilarity], result of:
            0.103929006 = score(doc=2970,freq=1.0), product of:
              0.23236054 = queryWeight, product of:
                2.1792693 = boost
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.01862375 = queryNorm
              0.44727474 = fieldWeight in 2970, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7251167 = idf(docFreq=393, maxDocs=44421)
                0.078125 = fieldNorm(doc=2970)
          0.4311099 = weight(abstract_txt:file in 2970) [ClassicSimilarity], result of:
            0.4311099 = score(doc=2970,freq=5.0), product of:
              0.44199416 = queryWeight, product of:
                4.2506266 = boost
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.01862375 = queryNorm
              0.97537464 = fieldWeight in 2970, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.58337 = idf(docFreq=453, maxDocs=44421)
                0.078125 = fieldNorm(doc=2970)
        0.2 = coord(5/25)
    
  4. Zamir, O.; Etzioni, O.: Grouper : a dynamic clustering interface to Web search results (1999) 0.13
    0.12553664 = sum of:
      0.12553664 = product of:
        0.5230693 = sum of:
          0.041568328 = weight(abstract_txt:simple in 207) [ClassicSimilarity], result of:
            0.041568328 = score(doc=207,freq=1.0), product of:
              0.100115865 = queryWeight, product of:
                1.0115006 = boost
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.01862375 = queryNorm
              0.4152022 = fieldWeight in 207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.314588 = idf(docFreq=593, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.053544793 = weight(abstract_txt:fast in 207) [ClassicSimilarity], result of:
            0.053544793 = score(doc=207,freq=1.0), product of:
              0.11852393 = queryWeight, product of:
                1.1005702 = boost
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.01862375 = queryNorm
              0.45176357 = fieldWeight in 207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7825737 = idf(docFreq=371, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.13333018 = weight(abstract_txt:clustering in 207) [ClassicSimilarity], result of:
            0.13333018 = score(doc=207,freq=4.0), product of:
              0.13717033 = queryWeight, product of:
                1.1839812 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.01862375 = queryNorm
              0.9720045 = fieldWeight in 207, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.1329211 = weight(abstract_txt:clusters in 207) [ClassicSimilarity], result of:
            0.1329211 = score(doc=207,freq=3.0), product of:
              0.15066652 = queryWeight, product of:
                1.2408608 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.01862375 = queryNorm
              0.88222057 = fieldWeight in 207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.1346714 = weight(abstract_txt:cluster in 207) [ClassicSimilarity], result of:
            0.1346714 = score(doc=207,freq=3.0), product of:
              0.15198629 = queryWeight, product of:
                1.2462837 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01862375 = queryNorm
              0.88607603 = fieldWeight in 207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
          0.027033515 = weight(abstract_txt:search in 207) [ClassicSimilarity], result of:
            0.027033515 = score(doc=207,freq=1.0), product of:
              0.09468319 = queryWeight, product of:
                1.391125 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.01862375 = queryNorm
              0.28551546 = fieldWeight in 207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=207)
        0.24 = coord(6/25)
    
  5. Rasmussen, E.: Clustering algorithms (1992) 0.12
    0.11889831 = sum of:
      0.11889831 = product of:
        0.59449154 = sum of:
          0.076937295 = weight(abstract_txt:items in 4513) [ClassicSimilarity], result of:
            0.076937295 = score(doc=4513,freq=4.0), product of:
              0.110324636 = queryWeight, product of:
                1.0618201 = boost
                5.5789747 = idf(docFreq=455, maxDocs=44421)
                0.01862375 = queryNorm
              0.69737184 = fieldWeight in 4513, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5789747 = idf(docFreq=455, maxDocs=44421)
                0.0625 = fieldNorm(doc=4513)
          0.0868237 = weight(abstract_txt:clusters in 4513) [ClassicSimilarity], result of:
            0.0868237 = score(doc=4513,freq=2.0), product of:
              0.15066652 = queryWeight, product of:
                1.2408608 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.01862375 = queryNorm
              0.5762641 = fieldWeight in 4513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=4513)
          0.13908803 = weight(abstract_txt:cluster in 4513) [ClassicSimilarity], result of:
            0.13908803 = score(doc=4513,freq=5.0), product of:
              0.15198629 = queryWeight, product of:
                1.2462837 = boost
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.01862375 = queryNorm
              0.9151354 = fieldWeight in 4513, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.548176 = idf(docFreq=172, maxDocs=44421)
                0.0625 = fieldNorm(doc=4513)
          0.07720453 = weight(abstract_txt:methods in 4513) [ClassicSimilarity], result of:
            0.07720453 = score(doc=4513,freq=6.0), product of:
              0.12170899 = queryWeight, product of:
                1.5772154 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.01862375 = queryNorm
              0.6343371 = fieldWeight in 4513, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=4513)
          0.21443798 = weight(abstract_txt:clustered in 4513) [ClassicSimilarity], result of:
            0.21443798 = score(doc=4513,freq=1.0), product of:
              0.43699756 = queryWeight, product of:
                2.9886098 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01862375 = queryNorm
              0.4907075 = fieldWeight in 4513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=4513)
        0.2 = coord(5/25)
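
For the content-based matches, each document's final score is the sum of its per-term scores multiplied by a coordination factor, shown as `coord(matched/total)` in the listing. The following sketch checks that arithmetic using the per-term values printed above; the helper name `combined_score` is hypothetical and the code is only an illustration of the displayed calculation.

```python
def combined_score(term_scores, total_query_terms):
    # coord = number of query terms matched / number of query terms
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord

# Per-term scores copied from the explanation for doc 1310 (six of 25 terms matched).
doc_1310_terms = [0.041568328, 0.041767, 0.053544793,
                  0.027033515, 0.103929006, 0.4311099]
doc_1310 = combined_score(doc_1310_terms, total_query_terms=25)

# Per-term scores copied from the explanation for doc 4513 (five of 25 terms matched).
doc_4513_terms = [0.076937295, 0.0868237, 0.13908803, 0.07720453, 0.21443798]
doc_4513 = combined_score(doc_4513_terms, total_query_terms=25)

print(f"doc 1310: {doc_1310:.4f}  doc 4513: {doc_4513:.4f}")
```

The coordination factor rewards documents that match more of the 25 query terms, so a document with a lower raw sum can still outrank one with a higher sum if it matches more terms.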