Document (#28096)

Author
Prime-Claverie, C.
Beigbeder, M.
Lafouge, T.
Title
Transposition of the cocitation method with a view to classifying Web pages
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.14, pp. 1282-1289
Year
2004
Abstract
The Web is a huge source of information, and one of the main problems facing users is finding documents which correspond to their requirements. Apart from the problem of thematic relevance, the documents retrieved by search engines do not always meet the users' expectations. The document may be too general, or conversely too specialized, or of a different type from what the user is looking for, and so forth. We think that adding metadata to pages can considerably improve the process of searching for information on the Web. This article presents a possible typology for Web sites and pages, as well as a method for propagating metadata values, based on the study of the Web graph and more specifically the method of cocitation in this graph.
Footnote
Contribution to a special issue on webometrics
Theme
Internet
Informetrics
Citation indexing

Similar documents (content)

  1. Villela Dantas, J.R.; Muniz Farias, P.F.: Conceptual navigation in knowledge management environments using NavCon (2010) 0.09
    0.093801774 = sum of:
      0.093801774 = product of:
        0.46900886 = sum of:
          0.046767518 = weight(abstract_txt:meet in 230) [ClassicSimilarity], result of:
            0.046767518 = score(doc=230,freq=1.0), product of:
              0.12544735 = queryWeight, product of:
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.02103094 = queryNorm
              0.37280595 = fieldWeight in 230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0625 = fieldNorm(doc=230)
          0.030896384 = weight(abstract_txt:documents in 230) [ClassicSimilarity], result of:
            0.030896384 = score(doc=230,freq=1.0), product of:
              0.11988929 = queryWeight, product of:
                1.3825296 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02103094 = queryNorm
              0.25770763 = fieldWeight in 230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=230)
          0.051195346 = weight(abstract_txt:metadata in 230) [ClassicSimilarity], result of:
            0.051195346 = score(doc=230,freq=1.0), product of:
              0.1678787 = queryWeight, product of:
                1.6359953 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.02103094 = queryNorm
              0.30495438 = fieldWeight in 230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.0625 = fieldNorm(doc=230)
          0.17546676 = weight(abstract_txt:graph in 230) [ClassicSimilarity], result of:
            0.17546676 = score(doc=230,freq=2.0), product of:
              0.30289716 = queryWeight, product of:
                2.197515 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.02103094 = queryNorm
              0.57929486 = fieldWeight in 230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.0625 = fieldNorm(doc=230)
          0.16468288 = weight(abstract_txt:pages in 230) [ClassicSimilarity], result of:
            0.16468288 = score(doc=230,freq=2.0), product of:
              0.33237472 = queryWeight, product of:
                2.8193166 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.02103094 = queryNorm
              0.49547353 = fieldWeight in 230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0625 = fieldNorm(doc=230)
        0.2 = coord(5/25)
    
  2. Yang, C.C.; Liu, N.: Web site topic-hierarchy generation based on link structure (2009) 0.08
    0.084250025 = sum of:
      0.084250025 = product of:
        0.5265627 = sum of:
          0.052420206 = weight(abstract_txt:always in 3738) [ClassicSimilarity], result of:
            0.052420206 = score(doc=3738,freq=1.0), product of:
              0.13536231 = queryWeight, product of:
                1.038767 = boost
                6.196136 = idf(docFreq=245, maxDocs=44421)
                0.02103094 = queryNorm
              0.3872585 = fieldWeight in 3738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.196136 = idf(docFreq=245, maxDocs=44421)
                0.0625 = fieldNorm(doc=3738)
          0.0945576 = weight(abstract_txt:correspond in 3738) [ClassicSimilarity], result of:
            0.0945576 = score(doc=3738,freq=1.0), product of:
              0.20058398 = queryWeight, product of:
                1.2644957 = boost
                7.5425844 = idf(docFreq=63, maxDocs=44421)
                0.02103094 = queryNorm
              0.47141153 = fieldWeight in 3738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5425844 = idf(docFreq=63, maxDocs=44421)
                0.0625 = fieldNorm(doc=3738)
          0.21490201 = weight(abstract_txt:graph in 3738) [ClassicSimilarity], result of:
            0.21490201 = score(doc=3738,freq=3.0), product of:
              0.30289716 = queryWeight, product of:
                2.197515 = boost
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.02103094 = queryNorm
              0.7094884 = fieldWeight in 3738, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.553973 = idf(docFreq=171, maxDocs=44421)
                0.0625 = fieldNorm(doc=3738)
          0.16468288 = weight(abstract_txt:pages in 3738) [ClassicSimilarity], result of:
            0.16468288 = score(doc=3738,freq=2.0), product of:
              0.33237472 = queryWeight, product of:
                2.8193166 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.02103094 = queryNorm
              0.49547353 = fieldWeight in 3738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0625 = fieldNorm(doc=3738)
        0.16 = coord(4/25)
    
  3. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.08
    0.07853973 = sum of:
      0.07853973 = product of:
        0.39269865 = sum of:
          0.07015128 = weight(abstract_txt:meet in 5142) [ClassicSimilarity], result of:
            0.07015128 = score(doc=5142,freq=1.0), product of:
              0.12544735 = queryWeight, product of:
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.02103094 = queryNorm
              0.5592089 = fieldWeight in 5142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.09375 = fieldNorm(doc=5142)
          0.08005781 = weight(abstract_txt:specialized in 5142) [ClassicSimilarity], result of:
            0.08005781 = score(doc=5142,freq=1.0), product of:
              0.13699569 = queryWeight, product of:
                1.0450155 = boost
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.02103094 = queryNorm
              0.58438194 = fieldWeight in 5142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.09375 = fieldNorm(doc=5142)
          0.105854824 = weight(abstract_txt:thematic in 5142) [ClassicSimilarity], result of:
            0.105854824 = score(doc=5142,freq=1.0), product of:
              0.16503581 = queryWeight, product of:
                1.1469866 = boost
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.02103094 = queryNorm
              0.64140517 = fieldWeight in 5142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.09375 = fieldNorm(doc=5142)
          0.04634458 = weight(abstract_txt:documents in 5142) [ClassicSimilarity], result of:
            0.04634458 = score(doc=5142,freq=1.0), product of:
              0.11988929 = queryWeight, product of:
                1.3825296 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02103094 = queryNorm
              0.38656145 = fieldWeight in 5142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.09375 = fieldNorm(doc=5142)
          0.09029015 = weight(abstract_txt:method in 5142) [ClassicSimilarity], result of:
            0.09029015 = score(doc=5142,freq=1.0), product of:
              0.21407788 = queryWeight, product of:
                2.2626417 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.02103094 = queryNorm
              0.42176312 = fieldWeight in 5142, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.09375 = fieldNorm(doc=5142)
        0.2 = coord(5/25)
    
  4. Baker, T.: Languages for Dublin Core (1998) 0.08
    0.07821346 = sum of:
      0.07821346 = product of:
        0.32588944 = sum of:
          0.041337036 = weight(abstract_txt:meet in 2257) [ClassicSimilarity], result of:
            0.041337036 = score(doc=2257,freq=2.0), product of:
              0.12544735 = queryWeight, product of:
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.02103094 = queryNorm
              0.329517 = fieldWeight in 2257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9648952 = idf(docFreq=309, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
          0.047174517 = weight(abstract_txt:specialized in 2257) [ClassicSimilarity], result of:
            0.047174517 = score(doc=2257,freq=2.0), product of:
              0.13699569 = queryWeight, product of:
                1.0450155 = boost
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.02103094 = queryNorm
              0.34435037 = fieldWeight in 2257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
          0.03676429 = weight(abstract_txt:looking in 2257) [ClassicSimilarity], result of:
            0.03676429 = score(doc=2257,freq=1.0), product of:
              0.14617151 = queryWeight, product of:
                1.0794452 = boost
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.02103094 = queryNorm
              0.25151473 = fieldWeight in 2257, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4387774 = idf(docFreq=192, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
          0.01931024 = weight(abstract_txt:documents in 2257) [ClassicSimilarity], result of:
            0.01931024 = score(doc=2257,freq=1.0), product of:
              0.11988929 = queryWeight, product of:
                1.3825296 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02103094 = queryNorm
              0.16106726 = fieldWeight in 2257, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
          0.07837655 = weight(abstract_txt:metadata in 2257) [ClassicSimilarity], result of:
            0.07837655 = score(doc=2257,freq=6.0), product of:
              0.1678787 = queryWeight, product of:
                1.6359953 = boost
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.02103094 = queryNorm
              0.46686414 = fieldWeight in 2257, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.87927 = idf(docFreq=917, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
          0.1029268 = weight(abstract_txt:pages in 2257) [ClassicSimilarity], result of:
            0.1029268 = score(doc=2257,freq=2.0), product of:
              0.33237472 = queryWeight, product of:
                2.8193166 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.02103094 = queryNorm
              0.30967095 = fieldWeight in 2257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0390625 = fieldNorm(doc=2257)
        0.24 = coord(6/25)
    
  5. Collins-Thompson, K.; Callan, J.: Predicting reading difficulty with statistical language models (2005) 0.08
    0.07794242 = sum of:
      0.07794242 = product of:
        0.3897121 = sum of:
          0.04827106 = weight(abstract_txt:weil in 5579) [ClassicSimilarity], result of:
            0.04827106 = score(doc=5579,freq=1.0), product of:
              0.12812184 = queryWeight, product of:
                1.0106035 = boost
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.02103094 = queryNorm
              0.37675902 = fieldWeight in 5579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0281444 = idf(docFreq=290, maxDocs=44421)
                0.0625 = fieldNorm(doc=5579)
          0.06305061 = weight(abstract_txt:classifying in 5579) [ClassicSimilarity], result of:
            0.06305061 = score(doc=5579,freq=1.0), product of:
              0.153094 = queryWeight, product of:
                1.1047101 = boost
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.02103094 = queryNorm
              0.4118425 = fieldWeight in 5579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.58948 = idf(docFreq=165, maxDocs=44421)
                0.0625 = fieldNorm(doc=5579)
          0.053514108 = weight(abstract_txt:documents in 5579) [ClassicSimilarity], result of:
            0.053514108 = score(doc=5579,freq=3.0), product of:
              0.11988929 = queryWeight, product of:
                1.3825296 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.02103094 = queryNorm
              0.4463627 = fieldWeight in 5579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=5579)
          0.060193434 = weight(abstract_txt:method in 5579) [ClassicSimilarity], result of:
            0.060193434 = score(doc=5579,freq=1.0), product of:
              0.21407788 = queryWeight, product of:
                2.2626417 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.02103094 = queryNorm
              0.2811754 = fieldWeight in 5579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=5579)
          0.16468288 = weight(abstract_txt:pages in 5579) [ClassicSimilarity], result of:
            0.16468288 = score(doc=5579,freq=2.0), product of:
              0.33237472 = queryWeight, product of:
                2.8193166 = boost
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.02103094 = queryNorm
              0.49547353 = fieldWeight in 5579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6056433 = idf(docFreq=443, maxDocs=44421)
                0.0625 = fieldNorm(doc=5579)
        0.2 = coord(5/25)
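
The score trees above can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm). The function names and the `TERMS_DOC_230` table are illustrative and not part of the record; the input values are copied from the first entry (doc=230). Lucene computes in 32-bit floats, so the result should agree with the listed score only approximately.

```python
import math

MAX_DOCS = 44421          # maxDocs, as shown in the explain output
QUERY_NORM = 0.02103094   # queryNorm, as shown in the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming ClassicSimilarity's formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float,
               boost: float = 1.0) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) copied from the tree for doc=230.
# "meet" shows no boost line in the output, so a boost of 1.0 is assumed.
TERMS_DOC_230 = [
    ("meet", 1.0, 309, 1.0),
    ("documents", 1.0, 1954, 1.3825296),
    ("metadata", 1.0, 917, 1.6359953),
    ("graph", 2.0, 171, 2.197515),
    ("pages", 2.0, 443, 2.8193166),
]
FIELD_NORM_230 = 0.0625

raw_sum = sum(term_score(freq, doc_freq, FIELD_NORM_230, boost)
              for _, freq, doc_freq, boost in TERMS_DOC_230)

# coord(5/25): 5 of the 25 query terms matched this document.
score_230 = raw_sum * (5 / 25)

print(f"sum of term scores: {raw_sum:.4f}")
print(f"final score:        {score_230:.4f}")
```

The same function reproduces the other entries when their freq, docFreq, boost, fieldNorm, and coord values are substituted. The differing fieldNorm values (0.0390625 to 0.09375) are the reason an identical term such as "pages" with freq=2.0 contributes different amounts across documents.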