Document (#32603)

Author
Perugini, S.
Ramakrishnan, N.
Title
Mining Web functional dependencies for flexible information access
Source
Journal of the American Society for Information Science and Technology. 58(2007) no.12, pp. 1805-1819
Year
2007
Abstract
We present an approach to enhancing information access through Web structure mining, in contrast to traditional approaches involving usage mining. Specifically, we mine the hardwired hierarchical hyperlink structure of Web sites to identify patterns of term-term co-occurrences we call Web functional dependencies (FDs). Intuitively, a Web FD x -> y declares that all paths through a site involving a hyperlink labeled x also contain a hyperlink labeled y. The complete set of FDs satisfied by a site helps characterize the (flexible and expressive) interaction paradigms supported by that site, where a paradigm is the set of explorable sequences therein. We describe algorithms for mining FDs, report results from mining several hierarchical Web sites, and present several interface designs that can exploit such FDs to provide compelling user experiences.
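The intuitive definition of a Web FD in the abstract can be illustrated with a brute-force check. This is only a minimal sketch and not the authors' mining algorithm, which the abstract does not specify. The toy site, its hyperlink labels, and the helper names `holds` and `mine_fds` are all hypothetical, and a site is assumed to be given as a list of root-to-leaf paths of hyperlink labels.

```python
from itertools import permutations

# Hypothetical toy site: each tuple is one root-to-leaf path,
# written as the sequence of hyperlink labels a user would follow.
PATHS = [
    ("cars", "toyota", "sedan"),
    ("cars", "honda", "sedan"),
    ("trucks", "toyota", "pickup"),
]


def holds(paths, x, y):
    """Return True if every path containing label x also contains label y."""
    return all(y in path for path in paths if x in path)


def mine_fds(paths):
    """Enumerate all single-label dependencies x -> y satisfied by the paths."""
    labels = sorted({label for path in paths for label in path})
    return [(x, y) for x, y in permutations(labels, 2) if holds(paths, x, y)]


if __name__ == "__main__":
    for x, y in mine_fds(PATHS):
        print(f"{x} -> {y}")
```

In this toy site, `honda -> cars` holds because the only path through `honda` also passes through `cars`, whereas `toyota -> cars` fails because `toyota` also appears under `trucks`. The exhaustive pairwise check is chosen for clarity only; it tests every ordered pair of labels against every path.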
Footnote
Contribution to a special topic section "Mining Web resources for enhancing information retrieval"
Theme
Data Mining
Object
WWW

Similar documents (content)

  1. Park, H.W.; Barnett, G.A.; Nam, I.-Y.: Hyperlink - affiliation network structure of top Web sites : examining affiliates with hyperlink in Korea (2002) 0.15
    0.14616424 = sum of:
      0.14616424 = product of:
        0.91352654 = sum of:
          0.051920142 = weight(abstract_txt:structure in 1584) [ClassicSimilarity], result of:
            0.051920142 = score(doc=1584,freq=2.0), product of:
              0.089858316 = queryWeight, product of:
                1.1936296 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.017274177 = queryNorm
              0.5778001 = fieldWeight in 1584, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.09375 = fieldNorm(doc=1584)
          0.17149638 = weight(abstract_txt:sites in 1584) [ClassicSimilarity], result of:
            0.17149638 = score(doc=1584,freq=6.0), product of:
              0.13818596 = queryWeight, product of:
                1.4802068 = boost
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.017274177 = queryNorm
              1.241055 = fieldWeight in 1584, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.09375 = fieldNorm(doc=1584)
          0.12305791 = weight(abstract_txt:site in 1584) [ClassicSimilarity], result of:
            0.12305791 = score(doc=1584,freq=1.0), product of:
              0.23038161 = queryWeight, product of:
                2.3407767 = boost
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.017274177 = queryNorm
              0.53414816 = fieldWeight in 1584, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.09375 = fieldNorm(doc=1584)
          0.56705207 = weight(abstract_txt:hyperlink in 1584) [ClassicSimilarity], result of:
            0.56705207 = score(doc=1584,freq=3.0), product of:
              0.4423333 = queryWeight, product of:
                3.2434776 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.017274177 = queryNorm
              1.2819566 = fieldWeight in 1584, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.09375 = fieldNorm(doc=1584)
        0.16 = coord(4/25)
    
  2. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.10
    0.10259721 = sum of:
      0.10259721 = product of:
        0.8549768 = sum of:
          0.034613427 = weight(abstract_txt:structure in 1354) [ClassicSimilarity], result of:
            0.034613427 = score(doc=1354,freq=2.0), product of:
              0.089858316 = queryWeight, product of:
                1.1936296 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.017274177 = queryNorm
              0.38520005 = fieldWeight in 1354, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.21825846 = weight(abstract_txt:hyperlink in 1354) [ClassicSimilarity], result of:
            0.21825846 = score(doc=1354,freq=1.0), product of:
              0.4423333 = queryWeight, product of:
                3.2434776 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.017274177 = queryNorm
              0.4934253 = fieldWeight in 1354, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.6021049 = weight(abstract_txt:mining in 1354) [ClassicSimilarity], result of:
            0.6021049 = score(doc=1354,freq=12.0), product of:
              0.45058104 = queryWeight, product of:
                4.2261696 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017274177 = queryNorm
              1.3362855 = fieldWeight in 1354, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
        0.12 = coord(3/25)
    
  3. Zuccala, A.; Thelwall, M.; Oppenheim, C.; Dhiensa, R.: Web intelligence analyses of digital libraries : a case study of the National electronic Library for Health (NeLH) (2007) 0.08
    0.08220358 = sum of:
      0.08220358 = product of:
        0.4110179 = sum of:
          0.010793992 = weight(abstract_txt:access in 1838) [ClassicSimilarity], result of:
            0.010793992 = score(doc=1838,freq=1.0), product of:
              0.063069455 = queryWeight, product of:
                3.6510832 = idf(docFreq=3134, maxDocs=44421)
                0.017274177 = queryNorm
              0.17114453 = fieldWeight in 1838, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6510832 = idf(docFreq=3134, maxDocs=44421)
                0.046875 = fieldNorm(doc=1838)
          0.018356543 = weight(abstract_txt:structure in 1838) [ClassicSimilarity], result of:
            0.018356543 = score(doc=1838,freq=1.0), product of:
              0.089858316 = queryWeight, product of:
                1.1936296 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.017274177 = queryNorm
              0.20428318 = fieldWeight in 1838, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.046875 = fieldNorm(doc=1838)
          0.04950674 = weight(abstract_txt:sites in 1838) [ClassicSimilarity], result of:
            0.04950674 = score(doc=1838,freq=2.0), product of:
              0.13818596 = queryWeight, product of:
                1.4802068 = boost
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.017274177 = queryNorm
              0.3582617 = fieldWeight in 1838, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.046875 = fieldNorm(doc=1838)
          0.10657128 = weight(abstract_txt:site in 1838) [ClassicSimilarity], result of:
            0.10657128 = score(doc=1838,freq=3.0), product of:
              0.23038161 = queryWeight, product of:
                2.3407767 = boost
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.017274177 = queryNorm
              0.46258587 = fieldWeight in 1838, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.046875 = fieldNorm(doc=1838)
          0.22578934 = weight(abstract_txt:mining in 1838) [ClassicSimilarity], result of:
            0.22578934 = score(doc=1838,freq=3.0), product of:
              0.45058104 = queryWeight, product of:
                4.2261696 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017274177 = queryNorm
              0.50110704 = fieldWeight in 1838, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.046875 = fieldNorm(doc=1838)
        0.2 = coord(5/25)
    
  4. Thelwall, M.; Wilkinson, D.; Uppal, S.: Data mining emotion in social network communication : gender differences in MySpace (2009) 0.08
    0.0797585 = sum of:
      0.0797585 = product of:
        0.49849063 = sum of:
          0.030337714 = weight(abstract_txt:present in 309) [ClassicSimilarity], result of:
            0.030337714 = score(doc=309,freq=1.0), product of:
              0.08935532 = queryWeight, product of:
                1.1902841 = boost
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.017274177 = queryNorm
              0.3395177 = fieldWeight in 309, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.078125 = fieldNorm(doc=309)
          0.058344256 = weight(abstract_txt:sites in 309) [ClassicSimilarity], result of:
            0.058344256 = score(doc=309,freq=1.0), product of:
              0.13818596 = queryWeight, product of:
                1.4802068 = boost
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.017274177 = queryNorm
              0.4222155 = fieldWeight in 309, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4043584 = idf(docFreq=542, maxDocs=44421)
                0.078125 = fieldNorm(doc=309)
          0.10254826 = weight(abstract_txt:site in 309) [ClassicSimilarity], result of:
            0.10254826 = score(doc=309,freq=1.0), product of:
              0.23038161 = queryWeight, product of:
                2.3407767 = boost
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.017274177 = queryNorm
              0.44512346 = fieldWeight in 309, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.078125 = fieldNorm(doc=309)
          0.3072604 = weight(abstract_txt:mining in 309) [ClassicSimilarity], result of:
            0.3072604 = score(doc=309,freq=2.0), product of:
              0.45058104 = queryWeight, product of:
                4.2261696 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017274177 = queryNorm
              0.68192035 = fieldWeight in 309, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.078125 = fieldNorm(doc=309)
        0.16 = coord(4/25)
    
  5. Chau, M.; Shiu, B.; Chan, M.; Chen, H.: Redips: backlink search and analysis on the Web for business intelligence analysis (2007) 0.08
    0.07974079 = sum of:
      0.07974079 = product of:
        0.49837995 = sum of:
          0.024270171 = weight(abstract_txt:present in 1142) [ClassicSimilarity], result of:
            0.024270171 = score(doc=1142,freq=1.0), product of:
              0.08935532 = queryWeight, product of:
                1.1902841 = boost
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.017274177 = queryNorm
              0.27161416 = fieldWeight in 1142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.0625 = fieldNorm(doc=1142)
          0.08203861 = weight(abstract_txt:site in 1142) [ClassicSimilarity], result of:
            0.08203861 = score(doc=1142,freq=1.0), product of:
              0.23038161 = queryWeight, product of:
                2.3407767 = boost
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.017274177 = queryNorm
              0.35609877 = fieldWeight in 1142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6975803 = idf(docFreq=404, maxDocs=44421)
                0.0625 = fieldNorm(doc=1142)
          0.21825846 = weight(abstract_txt:hyperlink in 1142) [ClassicSimilarity], result of:
            0.21825846 = score(doc=1142,freq=1.0), product of:
              0.4423333 = queryWeight, product of:
                3.2434776 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.017274177 = queryNorm
              0.4934253 = fieldWeight in 1142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0625 = fieldNorm(doc=1142)
          0.17381272 = weight(abstract_txt:mining in 1142) [ClassicSimilarity], result of:
            0.17381272 = score(doc=1142,freq=1.0), product of:
              0.45058104 = queryWeight, product of:
                4.2261696 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017274177 = queryNorm
              0.3857524 = fieldWeight in 1142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=1142)
        0.16 = coord(4/25)
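
The score trees above follow the arithmetic of Lucene's ClassicSimilarity (TF-IDF). As a rough sketch, the 0.14616424 shown for the first similar document (doc=1584) can be recomputed from the listed factors. The formulas assumed here are tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the values in the listing. The boost, queryNorm, fieldNorm, and coord values are copied from the listing rather than derived, and the function names are illustrative only.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.017274177   # copied from the listing
FIELD_NORM = 0.09375       # fieldNorm(doc=1584), copied from the listing
COORD = 4 / 25             # 4 of 25 query terms matched

# term -> (boost, docFreq, termFreq), all copied from the tree for doc 1584
TERMS = {
    "structure": (1.1936296, 1545, 2.0),
    "sites": (1.4802068, 542, 6.0),
    "site": (2.3407767, 404, 1.0),
    "hyperlink": (3.2434776, 44, 3.0),
}


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency in the form that matches the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq):
    """score = queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


# Sum the per-term scores, then scale by the coordination factor.
total = COORD * sum(term_score(*factors) for factors in TERMS.values())

if __name__ == "__main__":
    for term, factors in TERMS.items():
        print(f"{term}: {term_score(*factors):.6f}")
    print(f"total: {total:.6f}")
```

Because Lucene computes these values in single precision, a double-precision recomputation agrees with the listing only to about six decimal places. The rare term "hyperlink" (docFreq=44) contributes most of the sum, since its idf enters both queryWeight and fieldWeight.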