Document (#20898)

Author
Wong, S.K.M.
Butz, C.J.
Xiang, X.
Title
Automated database schema design using mined data dependencies
Source
Journal of the American Society for Information Science. 49(1998) no.5, pp.455-470
Year
1998
Abstract
Data dependencies are used in database schema design to enforce the correctness of a database and to reduce redundant data. These dependencies are usually determined from the semantics of the attributes and are then enforced upon the relations. The paper describes a bottom-up procedure for discovering multivalued dependencies in observed data without a priori knowledge of the relationships among the attributes. The proposed algorithm applies a technique designed for learning conditional independencies in probabilistic reasoning. A prototype system for automated database schema design has been implemented, and experiments were carried out to demonstrate both the effectiveness and the efficiency of the method.
Footnote
Contribution to a special issue devoted to knowledge discovery and data mining
Theme
Data Mining
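
The abstract refers to discovering multivalued dependencies (MVDs) directly from observed data. As a minimal sketch of what testing one candidate MVD against a table involves (this is not the authors' algorithm; the function name `holds_mvd` and the sample relation are hypothetical), the check below uses the standard definition: X ->> Y holds if, within every group of rows sharing the same X value, the Y values and the remaining attribute values occur in all combinations.

```python
from collections import defaultdict


def holds_mvd(rows, x, y):
    """Return True if the multivalued dependency X ->> Y holds in `rows`.

    `rows` is a list of dicts sharing the same keys; `x` and `y` are
    disjoint lists of attribute names (an assumption of this sketch).
    """
    attrs = sorted({a for row in rows for a in row})
    z = [a for a in attrs if a not in x and a not in y]  # remaining attributes

    groups = defaultdict(lambda: {"y": set(), "z": set(), "yz": set()})
    for row in rows:
        key = tuple(row[a] for a in x)
        y_val = tuple(row[a] for a in y)
        z_val = tuple(row[a] for a in z)
        group = groups[key]
        group["y"].add(y_val)
        group["z"].add(z_val)
        group["yz"].add((y_val, z_val))

    # The observed (Y, Z) pairs are always a subset of the cross product,
    # so equal sizes mean every combination is present.
    return all(
        len(g["yz"]) == len(g["y"]) * len(g["z"]) for g in groups.values()
    )


# Hypothetical sample relation: teachers and books vary independently per course.
relation = [
    {"course": "DB", "teacher": "A", "book": "X"},
    {"course": "DB", "teacher": "A", "book": "Y"},
    {"course": "DB", "teacher": "B", "book": "X"},
    {"course": "DB", "teacher": "B", "book": "Y"},
]

print(holds_mvd(relation, ["course"], ["teacher"]))       # all combinations present
print(holds_mvd(relation[:-1], ["course"], ["teacher"]))  # one combination missing
```

A discovery procedure would have to apply such a test to many candidate dependencies; how the paper selects and prunes candidates is not stated in this record.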

Similar documents (author)

  1. Wong, S.K.M.: On modelling information retrieval with probabilistic inference (1995) 8.01
    8.008013 = sum of:
      8.008013 = sum of:
        2.8995771 = weight(author_txt:wong in 2006) [ClassicSimilarity], result of:
          2.8995771 = score(doc=2006,freq=1.0), product of:
            0.56542915 = queryWeight, product of:
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.068913095 = queryNorm
            5.1281 = fieldWeight in 2006, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.625 = fieldNorm(doc=2006)
        5.108435 = weight(author_txt:s.k.m in 2006) [ClassicSimilarity], result of:
          5.108435 = score(doc=2006,freq=1.0), product of:
            0.8247969 = queryWeight, product of:
              1.2077705 = boost
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.068913095 = queryNorm
            6.1935673 = fieldWeight in 2006, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.625 = fieldNorm(doc=2006)
    
  2. Wong, S.K.M.; Yao, Y.Y.: An information-theoretic measure of term specificity (1992) 6.41
    6.40641 = sum of:
      6.40641 = sum of:
        2.3196619 = weight(author_txt:wong in 4806) [ClassicSimilarity], result of:
          2.3196619 = score(doc=4806,freq=1.0), product of:
            0.56542915 = queryWeight, product of:
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.068913095 = queryNorm
            4.10248 = fieldWeight in 4806, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.5 = fieldNorm(doc=4806)
        4.086748 = weight(author_txt:s.k.m in 4806) [ClassicSimilarity], result of:
          4.086748 = score(doc=4806,freq=1.0), product of:
            0.8247969 = queryWeight, product of:
              1.2077705 = boost
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.068913095 = queryNorm
            4.954854 = fieldWeight in 4806, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.5 = fieldNorm(doc=4806)
    
  3. Wong, S.K.M.; Yao, Y.Y.: Query formulation in linear retrieval models (1990) 6.41
    6.40641 = sum of:
      6.40641 = sum of:
        2.3196619 = weight(author_txt:wong in 3639) [ClassicSimilarity], result of:
          2.3196619 = score(doc=3639,freq=1.0), product of:
            0.56542915 = queryWeight, product of:
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.068913095 = queryNorm
            4.10248 = fieldWeight in 3639, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.5 = fieldNorm(doc=3639)
        4.086748 = weight(author_txt:s.k.m in 3639) [ClassicSimilarity], result of:
          4.086748 = score(doc=3639,freq=1.0), product of:
            0.8247969 = queryWeight, product of:
              1.2077705 = boost
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.068913095 = queryNorm
            4.954854 = fieldWeight in 3639, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.5 = fieldNorm(doc=3639)
    
  4. Wong, S.K.M.; Yao, Y.Y.; Salton, G.; Buckley, C.: Evaluation of an adaptive linear model (1991) 4.00
    4.0040064 = sum of:
      4.0040064 = sum of:
        1.4497886 = weight(author_txt:wong in 4835) [ClassicSimilarity], result of:
          1.4497886 = score(doc=4835,freq=1.0), product of:
            0.56542915 = queryWeight, product of:
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.068913095 = queryNorm
            2.56405 = fieldWeight in 4835, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.20496 = idf(docFreq=32, maxDocs=44421)
              0.3125 = fieldNorm(doc=4835)
        2.5542176 = weight(author_txt:s.k.m in 4835) [ClassicSimilarity], result of:
          2.5542176 = score(doc=4835,freq=1.0), product of:
            0.8247969 = queryWeight, product of:
              1.2077705 = boost
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.068913095 = queryNorm
            3.0967836 = fieldWeight in 4835, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.909708 = idf(docFreq=5, maxDocs=44421)
              0.3125 = fieldNorm(doc=4835)
    
  5. Wong, K.: Frühe Spuren des menschlichen Geistes (2005) 1.45
    1.4497886 = sum of:
      1.4497886 = product of:
        2.8995771 = sum of:
          2.8995771 = weight(author_txt:wong in 1983) [ClassicSimilarity], result of:
            2.8995771 = score(doc=1983,freq=1.0), product of:
              0.56542915 = queryWeight, product of:
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.068913095 = queryNorm
              5.1281 = fieldWeight in 1983, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.625 = fieldNorm(doc=1983)
        0.5 = coord(1/2)
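
The score breakdowns in this list follow the pattern of Lucene's ClassicSimilarity (TF-IDF) explanation output. As a rough sketch of how one per-term weight is assembled from the listed factors, the Python below recomputes the two author terms of the first hit (doc 2006). The function names are hypothetical; the idf formula `1 + ln(maxDocs / (docFreq + 1))` and `tf = sqrt(freq)` are the ClassicSimilarity definitions, and all numeric inputs are copied from the listing.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # "queryWeight" in the listing
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight" in the listing
    return query_weight * field_weight


MAX_DOCS = 44421
QUERY_NORM = 0.068913095  # queryNorm of the author query, from the listing

# author_txt:wong in doc 2006 (docFreq=32, fieldNorm=0.625)
wong = term_score(1.0, 32, MAX_DOCS, QUERY_NORM, 0.625)
# author_txt:s.k.m in doc 2006 (docFreq=5, boost=1.2077705, fieldNorm=0.625)
skm = term_score(1.0, 5, MAX_DOCS, QUERY_NORM, 0.625, boost=1.2077705)

total = wong + skm
print(round(wong, 4), round(skm, 4), round(total, 4))  # about 2.8996, 5.1084, 8.008
```

Small differences in the last digits are expected, because Lucene computes these values in 32-bit floats while Python uses 64-bit floats.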
    

Similar documents (content)

  1. Bosc, P.; Dubois, D.; Prade, H.: Fuzzy functional dependencies and redundancy elimination (1998) 0.15
    0.15375201 = sum of:
      0.15375201 = product of:
        0.96095014 = sum of:
          0.050581016 = weight(abstract_txt:design in 1590) [ClassicSimilarity], result of:
            0.050581016 = score(doc=1590,freq=2.0), product of:
              0.11683988 = queryWeight, product of:
                1.9398652 = boost
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.01537192 = queryNorm
              0.43290883 = fieldWeight in 1590, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.078125 = fieldNorm(doc=1590)
          0.04140692 = weight(abstract_txt:data in 1590) [ClassicSimilarity], result of:
            0.04140692 = score(doc=1590,freq=2.0), product of:
              0.11253672 = queryWeight, product of:
                2.198328 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01537192 = queryNorm
              0.3679414 = fieldWeight in 1590, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.078125 = fieldNorm(doc=1590)
          0.10776597 = weight(abstract_txt:database in 1590) [ClassicSimilarity], result of:
            0.10776597 = score(doc=1590,freq=3.0), product of:
              0.18601 = queryWeight, product of:
                2.8262691 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.01537192 = queryNorm
              0.5793558 = fieldWeight in 1590, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.078125 = fieldNorm(doc=1590)
          0.76119626 = weight(abstract_txt:dependencies in 1590) [ClassicSimilarity], result of:
            0.76119626 = score(doc=1590,freq=4.0), product of:
              0.6221571 = queryWeight, product of:
                5.1688676 = boost
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.01537192 = queryNorm
              1.2234792 = fieldWeight in 1590, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.078125 = fieldNorm(doc=1590)
        0.16 = coord(4/25)
    
  2. Hassanien, A.-E.: Rough set approach for attribute reduction and rule generation : a case of patients with suspected breast cancer (2004) 0.10
    0.09538572 = sum of:
      0.09538572 = product of:
        0.59616077 = sum of:
          0.08156156 = weight(abstract_txt:redundant in 3883) [ClassicSimilarity], result of:
            0.08156156 = score(doc=3883,freq=1.0), product of:
              0.16286683 = queryWeight, product of:
                1.3223052 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.01537192 = queryNorm
              0.5007868 = fieldWeight in 3883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.0625 = fieldNorm(doc=3883)
          0.16327417 = weight(abstract_txt:attributes in 3883) [ClassicSimilarity], result of:
            0.16327417 = score(doc=3883,freq=5.0), product of:
              0.19060779 = queryWeight, product of:
                2.0230224 = boost
                6.1293135 = idf(docFreq=262, maxDocs=44421)
                0.01537192 = queryNorm
              0.8565976 = fieldWeight in 3883, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1293135 = idf(docFreq=262, maxDocs=44421)
                0.0625 = fieldNorm(doc=3883)
          0.046846583 = weight(abstract_txt:data in 3883) [ClassicSimilarity], result of:
            0.046846583 = score(doc=3883,freq=4.0), product of:
              0.11253672 = queryWeight, product of:
                2.198328 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01537192 = queryNorm
              0.41627818 = fieldWeight in 3883, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=3883)
          0.3044785 = weight(abstract_txt:dependencies in 3883) [ClassicSimilarity], result of:
            0.3044785 = score(doc=3883,freq=1.0), product of:
              0.6221571 = queryWeight, product of:
                5.1688676 = boost
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.01537192 = queryNorm
              0.48939165 = fieldWeight in 3883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.0625 = fieldNorm(doc=3883)
        0.16 = coord(4/25)
    
  3. Peckham, J.; MacKellar, B.; Vorback, J.: A unified approach to the design and generation of complex database schemata (1997) 0.09
    0.086623326 = sum of:
      0.086623326 = product of:
        0.5413958 = sum of:
          0.09751385 = weight(abstract_txt:automated in 2259) [ClassicSimilarity], result of:
            0.09751385 = score(doc=2259,freq=1.0), product of:
              0.15917364 = queryWeight, product of:
                1.8486979 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.01537192 = queryNorm
              0.6126256 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.070813425 = weight(abstract_txt:design in 2259) [ClassicSimilarity], result of:
            0.070813425 = score(doc=2259,freq=2.0), product of:
              0.11683988 = queryWeight, product of:
                1.9398652 = boost
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.01537192 = queryNorm
              0.60607237 = fieldWeight in 2259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.15087236 = weight(abstract_txt:database in 2259) [ClassicSimilarity], result of:
            0.15087236 = score(doc=2259,freq=3.0), product of:
              0.18601 = queryWeight, product of:
                2.8262691 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.01537192 = queryNorm
              0.8110981 = fieldWeight in 2259, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.22219616 = weight(abstract_txt:schema in 2259) [ClassicSimilarity], result of:
            0.22219616 = score(doc=2259,freq=1.0), product of:
              0.3155114 = queryWeight, product of:
                3.1877446 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.01537192 = queryNorm
              0.7042413 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
        0.16 = coord(4/25)
    
  4. Leazer, G.H.: A conceptual schema for the control of bibliographic works (1994) 0.08
    0.0807923 = sum of:
      0.0807923 = product of:
        0.40396148 = sum of:
          0.028612945 = weight(abstract_txt:design in 3101) [ClassicSimilarity], result of:
            0.028612945 = score(doc=3101,freq=1.0), product of:
              0.11683988 = queryWeight, product of:
                1.9398652 = boost
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.01537192 = queryNorm
              0.24489023 = fieldWeight in 3101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9182436 = idf(docFreq=2399, maxDocs=44421)
                0.0625 = fieldNorm(doc=3101)
          0.07301843 = weight(abstract_txt:attributes in 3101) [ClassicSimilarity], result of:
            0.07301843 = score(doc=3101,freq=1.0), product of:
              0.19060779 = queryWeight, product of:
                2.0230224 = boost
                6.1293135 = idf(docFreq=262, maxDocs=44421)
                0.01537192 = queryNorm
              0.3830821 = fieldWeight in 3101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1293135 = idf(docFreq=262, maxDocs=44421)
                0.0625 = fieldNorm(doc=3101)
          0.05237607 = weight(abstract_txt:data in 3101) [ClassicSimilarity], result of:
            0.05237607 = score(doc=3101,freq=5.0), product of:
              0.11253672 = queryWeight, product of:
                2.198328 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.01537192 = queryNorm
              0.46541315 = fieldWeight in 3101, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=3101)
          0.07039243 = weight(abstract_txt:database in 3101) [ClassicSimilarity], result of:
            0.07039243 = score(doc=3101,freq=2.0), product of:
              0.18601 = queryWeight, product of:
                2.8262691 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.01537192 = queryNorm
              0.3784336 = fieldWeight in 3101, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3101)
          0.17956161 = weight(abstract_txt:schema in 3101) [ClassicSimilarity], result of:
            0.17956161 = score(doc=3101,freq=2.0), product of:
              0.3155114 = queryWeight, product of:
                3.1877446 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.01537192 = queryNorm
              0.5691129 = fieldWeight in 3101, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.0625 = fieldNorm(doc=3101)
        0.2 = coord(5/25)
    
  5. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 0.08
    0.07551314 = sum of:
      0.07551314 = product of:
        0.62927616 = sum of:
          0.09789597 = weight(abstract_txt:discovering in 3909) [ClassicSimilarity], result of:
            0.09789597 = score(doc=3909,freq=1.0), product of:
              0.14037551 = queryWeight, product of:
                1.2276118 = boost
                7.438788 = idf(docFreq=70, maxDocs=44421)
                0.01537192 = queryNorm
              0.6973864 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.438788 = idf(docFreq=70, maxDocs=44421)
                0.09375 = fieldNorm(doc=3909)
          0.074662454 = weight(abstract_txt:database in 3909) [ClassicSimilarity], result of:
            0.074662454 = score(doc=3909,freq=1.0), product of:
              0.18601 = queryWeight, product of:
                2.8262691 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.01537192 = queryNorm
              0.40138945 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.09375 = fieldNorm(doc=3909)
          0.45671773 = weight(abstract_txt:dependencies in 3909) [ClassicSimilarity], result of:
            0.45671773 = score(doc=3909,freq=1.0), product of:
              0.6221571 = queryWeight, product of:
                5.1688676 = boost
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.01537192 = queryNorm
              0.73408747 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.09375 = fieldNorm(doc=3909)
        0.12 = coord(3/25)
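
In the content-based list, each final score is the sum of the matching term weights multiplied by a coordination factor `coord(m/n)`, the fraction of the n query terms (here 25) that the document matched. A minimal sketch of that last step, using the per-term weights listed for the fifth hit (doc 3909); the function name `coord_score` is hypothetical:

```python
def coord_score(term_scores, query_terms_total):
    """Sum the matched term weights and scale by coord = matched / total."""
    coord = len(term_scores) / query_terms_total
    return sum(term_scores) * coord


# Term weights for doc 3909, copied from the listing:
# "discovering", "database", "dependencies"
doc_3909 = [0.09789597, 0.074662454, 0.45671773]

score = coord_score(doc_3909, 25)
print(round(score, 5))  # about 0.07551, shown as 0.08 in the result line
```

The factor explains why a document matching only 3 of 25 terms ends up with a low score despite a high weight on a rare term such as "dependencies".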