Document (#20301)

Author
Trybula, W.J.
Title
Data mining and knowledge discovery
Source
Annual review of information science and technology. 32(1997), p.197-229
Year
1997
Abstract
State-of-the-art review of the recently developed concepts of data mining (defined as the automated process of evaluating data and finding relationships) and knowledge discovery (defined as the automated process of extracting information, especially unpredicted relationships or previously unknown patterns among the data), with particular reference to numerical data. Includes: the knowledge acquisition process; data mining; evaluation methods; and knowledge discovery. Concludes that existing work in the field is confusing because the terminology is inconsistent and poorly defined. Although methods are available for analyzing and cleaning databases, better coordinated efforts should be directed toward providing users with improved means of structuring search mechanisms to explore the data for relationships.
Theme
Data Mining
Literature review

Similar documents (content)

  1. Benoit, G.: Data mining (2002) 0.62
    0.6163111 = sum of:
      0.6163111 = product of:
        1.2839814 = sum of:
          0.0478115 = weight(abstract_txt:previously in 5296) [ClassicSimilarity], result of:
            0.0478115 = score(doc=5296,freq=1.0), product of:
              0.12465221 = queryWeight, product of:
                1.0512366 = boost
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.019321779 = queryNorm
              0.3835592 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.06886873 = weight(abstract_txt:extracting in 5296) [ClassicSimilarity], result of:
            0.06886873 = score(doc=5296,freq=1.0), product of:
              0.15898632 = queryWeight, product of:
                1.1872177 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.019321779 = queryNorm
              0.43317392 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.086134404 = weight(abstract_txt:inconsistent in 5296) [ClassicSimilarity], result of:
            0.086134404 = score(doc=5296,freq=1.0), product of:
              0.18455656 = queryWeight, product of:
                1.2791317 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.019321779 = queryNorm
              0.46671006 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.094898194 = weight(abstract_txt:poorly in 5296) [ClassicSimilarity], result of:
            0.094898194 = score(doc=5296,freq=1.0), product of:
              0.19687188 = queryWeight, product of:
                1.3211201 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.019321779 = queryNorm
              0.4820302 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.105431974 = weight(abstract_txt:confusing in 5296) [ClassicSimilarity], result of:
            0.105431974 = score(doc=5296,freq=1.0), product of:
              0.21118349 = queryWeight, product of:
                1.3682973 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.019321779 = queryNorm
              0.49924347 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.029430568 = weight(abstract_txt:methods in 5296) [ClassicSimilarity], result of:
            0.029430568 = score(doc=5296,freq=1.0), product of:
              0.113646 = queryWeight, product of:
                1.4195235 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.019321779 = queryNorm
              0.25896704 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.041192174 = weight(abstract_txt:process in 5296) [ClassicSimilarity], result of:
            0.041192174 = score(doc=5296,freq=1.0), product of:
              0.16277784 = queryWeight, product of:
                2.0806966 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.019321779 = queryNorm
              0.25305763 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.073811784 = weight(abstract_txt:knowledge in 5296) [ClassicSimilarity], result of:
            0.073811784 = score(doc=5296,freq=4.0), product of:
              0.1665056 = queryWeight, product of:
                2.4299364 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.019321779 = queryNorm
              0.44329908 = fieldWeight in 5296, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.08865387 = weight(abstract_txt:defined in 5296) [ClassicSimilarity], result of:
            0.08865387 = score(doc=5296,freq=1.0), product of:
              0.27134216 = queryWeight, product of:
                2.6863942 = boost
                5.2275767 = idf(docFreq=647, maxDocs=44421)
                0.019321779 = queryNorm
              0.32672355 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2275767 = idf(docFreq=647, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.19048597 = weight(abstract_txt:discovery in 5296) [ClassicSimilarity], result of:
            0.19048597 = score(doc=5296,freq=3.0), product of:
              0.31327114 = queryWeight, product of:
                2.8864982 = boost
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.019321779 = queryNorm
              0.60805464 = fieldWeight in 5296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.32626268 = weight(abstract_txt:mining in 5296) [ClassicSimilarity], result of:
            0.32626268 = score(doc=5296,freq=5.0), product of:
              0.37824547 = queryWeight, product of:
                3.171743 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019321779 = queryNorm
              0.8625686 = fieldWeight in 5296, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.13099955 = weight(abstract_txt:data in 5296) [ClassicSimilarity], result of:
            0.13099955 = score(doc=5296,freq=6.0), product of:
              0.2569452 = queryWeight, product of:
                3.9931881 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019321779 = queryNorm
              0.5098346 = fieldWeight in 5296, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
        0.48 = coord(12/25)
    
  2. Fayyad, U.M.: Data mining and knowledge discovery : making sense out of data (1996) 0.25
    0.2533598 = sum of:
      0.2533598 = product of:
        1.0556659 = sum of:
          0.13773745 = weight(abstract_txt:extracting in 76) [ClassicSimilarity], result of:
            0.13773745 = score(doc=76,freq=1.0), product of:
              0.15898632 = queryWeight, product of:
                1.1872177 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.019321779 = queryNorm
              0.86634785 = fieldWeight in 76, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
          0.11650906 = weight(abstract_txt:process in 76) [ClassicSimilarity], result of:
            0.11650906 = score(doc=76,freq=2.0), product of:
              0.16277784 = queryWeight, product of:
                2.0806966 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.019321779 = queryNorm
              0.71575505 = fieldWeight in 76, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
          0.10438562 = weight(abstract_txt:knowledge in 76) [ClassicSimilarity], result of:
            0.10438562 = score(doc=76,freq=2.0), product of:
              0.1665056 = queryWeight, product of:
                2.4299364 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.019321779 = queryNorm
              0.62691957 = fieldWeight in 76, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
          0.21995425 = weight(abstract_txt:discovery in 76) [ClassicSimilarity], result of:
            0.21995425 = score(doc=76,freq=1.0), product of:
              0.31327114 = queryWeight, product of:
                2.8864982 = boost
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.019321779 = queryNorm
              0.702121 = fieldWeight in 76, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
          0.2918182 = weight(abstract_txt:mining in 76) [ClassicSimilarity], result of:
            0.2918182 = score(doc=76,freq=1.0), product of:
              0.37824547 = queryWeight, product of:
                3.171743 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019321779 = queryNorm
              0.7715048 = fieldWeight in 76, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
          0.18526134 = weight(abstract_txt:data in 76) [ClassicSimilarity], result of:
            0.18526134 = score(doc=76,freq=3.0), product of:
              0.2569452 = queryWeight, product of:
                3.9931881 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019321779 = queryNorm
              0.721015 = fieldWeight in 76, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.125 = fieldNorm(doc=76)
        0.24 = coord(6/25)
    
  3. Berry, M.W.; Esau, R.; Kiefer, B.: The use of text mining techniques in electronic discovery for legal matters (2012) 0.22
    0.22061211 = sum of:
      0.22061211 = product of:
        0.7879004 = sum of:
          0.0677386 = weight(abstract_txt:analyzing in 1091) [ClassicSimilarity], result of:
            0.0677386 = score(doc=1091,freq=1.0), product of:
              0.11999828 = queryWeight, product of:
                1.0314258 = boost
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.019321779 = queryNorm
              0.5644964 = fieldWeight in 1091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.07171725 = weight(abstract_txt:previously in 1091) [ClassicSimilarity], result of:
            0.07171725 = score(doc=1091,freq=1.0), product of:
              0.12465221 = queryWeight, product of:
                1.0512366 = boost
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.019321779 = queryNorm
              0.5753388 = fieldWeight in 1091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.136947 = idf(docFreq=260, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.044145852 = weight(abstract_txt:methods in 1091) [ClassicSimilarity], result of:
            0.044145852 = score(doc=1091,freq=1.0), product of:
              0.113646 = queryWeight, product of:
                1.4195235 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.019321779 = queryNorm
              0.38845056 = fieldWeight in 1091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.1070204 = weight(abstract_txt:process in 1091) [ClassicSimilarity], result of:
            0.1070204 = score(doc=1091,freq=3.0), product of:
              0.16277784 = queryWeight, product of:
                2.0806966 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.019321779 = queryNorm
              0.65746295 = fieldWeight in 1091, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.16496569 = weight(abstract_txt:discovery in 1091) [ClassicSimilarity], result of:
            0.16496569 = score(doc=1091,freq=1.0), product of:
              0.31327114 = queryWeight, product of:
                2.8864982 = boost
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.019321779 = queryNorm
              0.52659076 = fieldWeight in 1091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.21886365 = weight(abstract_txt:mining in 1091) [ClassicSimilarity], result of:
            0.21886365 = score(doc=1091,freq=1.0), product of:
              0.37824547 = queryWeight, product of:
                3.171743 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019321779 = queryNorm
              0.5786286 = fieldWeight in 1091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
          0.11344893 = weight(abstract_txt:data in 1091) [ClassicSimilarity], result of:
            0.11344893 = score(doc=1091,freq=2.0), product of:
              0.2569452 = queryWeight, product of:
                3.9931881 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019321779 = queryNorm
              0.4415297 = fieldWeight in 1091, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=1091)
        0.28 = coord(7/25)
    
  4. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.21
    0.2059129 = sum of:
      0.2059129 = product of:
        1.0295645 = sum of:
          0.087381795 = weight(abstract_txt:process in 3899) [ClassicSimilarity], result of:
            0.087381795 = score(doc=3899,freq=2.0), product of:
              0.16277784 = queryWeight, product of:
                2.0806966 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.019321779 = queryNorm
              0.5368163 = fieldWeight in 3899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.11071768 = weight(abstract_txt:knowledge in 3899) [ClassicSimilarity], result of:
            0.11071768 = score(doc=3899,freq=4.0), product of:
              0.1665056 = queryWeight, product of:
                2.4299364 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.019321779 = queryNorm
              0.66494864 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.23329672 = weight(abstract_txt:discovery in 3899) [ClassicSimilarity], result of:
            0.23329672 = score(doc=3899,freq=2.0), product of:
              0.31327114 = queryWeight, product of:
                2.8864982 = boost
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.019321779 = queryNorm
              0.7447118 = fieldWeight in 3899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.4377273 = weight(abstract_txt:mining in 3899) [ClassicSimilarity], result of:
            0.4377273 = score(doc=3899,freq=4.0), product of:
              0.37824547 = queryWeight, product of:
                3.171743 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019321779 = queryNorm
              1.1572572 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.16044103 = weight(abstract_txt:data in 3899) [ClassicSimilarity], result of:
            0.16044103 = score(doc=3899,freq=4.0), product of:
              0.2569452 = queryWeight, product of:
                3.9931881 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019321779 = queryNorm
              0.6244173 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
        0.2 = coord(5/25)
    
  5. Bath, P.A.: Data mining in health and medical information (2003) 0.20
    0.20336676 = sum of:
      0.20336676 = product of:
        0.72630984 = sum of:
          0.0677386 = weight(abstract_txt:analyzing in 5263) [ClassicSimilarity], result of:
            0.0677386 = score(doc=5263,freq=1.0), product of:
              0.11999828 = queryWeight, product of:
                1.0314258 = boost
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.019321779 = queryNorm
              0.5644964 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.021295 = idf(docFreq=292, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.044145852 = weight(abstract_txt:methods in 5263) [ClassicSimilarity], result of:
            0.044145852 = score(doc=5263,freq=1.0), product of:
              0.113646 = queryWeight, product of:
                1.4195235 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.019321779 = queryNorm
              0.38845056 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.061788265 = weight(abstract_txt:process in 5263) [ClassicSimilarity], result of:
            0.061788265 = score(doc=5263,freq=1.0), product of:
              0.16277784 = queryWeight, product of:
                2.0806966 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.019321779 = queryNorm
              0.37958646 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.05535884 = weight(abstract_txt:knowledge in 5263) [ClassicSimilarity], result of:
            0.05535884 = score(doc=5263,freq=1.0), product of:
              0.1665056 = queryWeight, product of:
                2.4299364 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.019321779 = queryNorm
              0.33247432 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.16496569 = weight(abstract_txt:discovery in 5263) [ClassicSimilarity], result of:
            0.16496569 = score(doc=5263,freq=1.0), product of:
              0.31327114 = queryWeight, product of:
                2.8864982 = boost
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.019321779 = queryNorm
              0.52659076 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.616968 = idf(docFreq=438, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.21886365 = weight(abstract_txt:mining in 5263) [ClassicSimilarity], result of:
            0.21886365 = score(doc=5263,freq=1.0), product of:
              0.37824547 = queryWeight, product of:
                3.171743 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019321779 = queryNorm
              0.5786286 = fieldWeight in 5263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
          0.11344893 = weight(abstract_txt:data in 5263) [ClassicSimilarity], result of:
            0.11344893 = score(doc=5263,freq=2.0), product of:
              0.2569452 = queryWeight, product of:
                3.9931881 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019321779 = queryNorm
              0.4415297 = fieldWeight in 5263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=5263)
        0.28 = coord(7/25)
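
The score trees above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions that the listed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with the sum of the term scores multiplied by coord. The function names are illustrative and not part of any library; boost, queryNorm and fieldNorm are taken from the listing as given rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the listing:
    # 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # One "weight(abstract_txt:term in doc)" node of the tree.
    tf = math.sqrt(freq)                            # tf(freq)
    term_idf = idf(doc_freq, max_docs)              # idf(docFreq, maxDocs)
    query_weight = boost * term_idf * query_norm    # queryWeight
    field_weight = tf * term_idf * field_norm       # fieldWeight
    return query_weight * field_weight


def document_score(term_scores, matched, total_clauses):
    # Top of the tree: sum of term scores times coord(matched/total).
    return sum(term_scores) * (matched / total_clauses)


# Values copied from the listing: the term "mining" in document 5296.
mining_5296 = term_score(
    freq=5.0, doc_freq=251, max_docs=44421,
    boost=3.171743, query_norm=0.019321779, field_norm=0.0625,
)
print(mining_5296)  # should land close to the 0.32626268 shown above

# Values copied from the listing: the five term scores of document 3899.
scores_3899 = [0.087381795, 0.11071768, 0.23329672, 0.4377273, 0.16044103]
print(document_score(scores_3899, matched=5, total_clauses=25))
```

Small differences in the last digits are to be expected, since the listing appears to use single-precision arithmetic while this sketch uses Python floats.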