Document (#24106)

Author
Munson, K.I.
Title
Internet search engines : understanding their design to improve information retrieval
Source
Journal of Internet cataloging. 2(2000) nos.3/4, pp.47-60
Year
2000
Abstract
The relationship between the methods currently used for indexing the World Wide Web and the programs, languages, and protocols on which the World Wide Web is based is examined. Two methods for indexing the Web are described: directories are discussed briefly, while search engines are considered in detail. The automated approach used to create these tools is examined, with special emphasis on the parts of a document used in indexing. Shortcomings of the approach are described. Suggestions for effective use of Web search engines are given.
Theme
Search engines

Similar documents (content)

  1. Hock, R.: Search engines (2009) 0.28
    0.2785532 = sum of:
      0.2785532 = product of:
        1.1606383 = sum of:
          0.054917 = weight(abstract_txt:special in 863) [ClassicSimilarity], result of:
            0.054917 = score(doc=863,freq=1.0), product of:
              0.11765423 = queryWeight, product of:
                4.978838 = idf(docFreq=830, maxDocs=44421)
                0.023630861 = queryNorm
              0.46676606 = fieldWeight in 863, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.978838 = idf(docFreq=830, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
          0.08839859 = weight(abstract_txt:briefly in 863) [ClassicSimilarity], result of:
            0.08839859 = score(doc=863,freq=1.0), product of:
              0.16159697 = queryWeight, product of:
                1.1719601 = boost
                5.8349996 = idf(docFreq=352, maxDocs=44421)
                0.023630861 = queryNorm
              0.5470312 = fieldWeight in 863, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8349996 = idf(docFreq=352, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
          0.15510383 = weight(abstract_txt:described in 863) [ClassicSimilarity], result of:
            0.15510383 = score(doc=863,freq=2.0), product of:
              0.23508127 = queryWeight, product of:
                1.9990343 = boost
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.023630861 = queryNorm
              0.65978813 = fieldWeight in 863, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
          0.15960209 = weight(abstract_txt:search in 863) [ClassicSimilarity], result of:
            0.15960209 = score(doc=863,freq=6.0), product of:
              0.1901744 = queryWeight, product of:
                2.2020788 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.023630861 = queryNorm
              0.8392407 = fieldWeight in 863, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
          0.10990172 = weight(abstract_txt:indexing in 863) [ClassicSimilarity], result of:
            0.10990172 = score(doc=863,freq=1.0), product of:
              0.26947165 = queryWeight, product of:
                2.62128 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.023630861 = queryNorm
              0.4078415 = fieldWeight in 863, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
          0.5927151 = weight(abstract_txt:engines in 863) [ClassicSimilarity], result of:
            0.5927151 = score(doc=863,freq=9.0), product of:
              0.39840665 = queryWeight, product of:
                3.1872795 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.023630861 = queryNorm
              1.4877138 = fieldWeight in 863, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.09375 = fieldNorm(doc=863)
        0.24 = coord(6/25)
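
The per-term lines in the breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the listing follows the classic TF-IDF scheme that its printed numbers suggest: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is boost * idf * queryNorm, and fieldWeight is tf * idf * fieldNorm. The function and variable names are illustrative and do not come from the listing; maxDocs, queryNorm, docFreq, fieldNorm, and boost are copied from it.

```python
import math

MAX_DOCS = 44421          # maxDocs, as printed in the listing
QUERY_NORM = 0.023630861  # queryNorm, as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; assumed form 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# "special" in doc 863: no boost line is printed, so boost is taken as 1.0.
special = term_score(freq=1.0, doc_freq=830, field_norm=0.09375)

# "engines" in doc 863: boost, docFreq, and freq copied from the listing.
engines = term_score(freq=9.0, doc_freq=608, field_norm=0.09375,
                     boost=3.1872795)

print(special, engines)
```

Under these assumptions the two results should agree with the printed 0.054917 and 0.5927151 to within rounding; the listing's values look like single-precision floats, so the last digits can differ.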
    
  2. Rudich, J.: Internet search engines on CD-ROM (1996) 0.26
    0.26464015 = sum of:
      0.26464015 = product of:
        1.3232007 = sum of:
          0.2795943 = weight(abstract_txt:directories in 5722) [ClassicSimilarity], result of:
            0.2795943 = score(doc=5722,freq=1.0), product of:
              0.24769813 = queryWeight, product of:
                1.4509672 = boost
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.023630861 = queryNorm
              1.1287704 = fieldWeight in 5722, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.15625 = fieldNorm(doc=5722)
          0.18198752 = weight(abstract_txt:world in 5722) [ClassicSimilarity], result of:
            0.18198752 = score(doc=5722,freq=2.0), product of:
              0.18603654 = queryWeight, product of:
                1.7783219 = boost
                4.426988 = idf(docFreq=1442, maxDocs=44421)
                0.023630861 = queryNorm
              0.97823536 = fieldWeight in 5722, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.426988 = idf(docFreq=1442, maxDocs=44421)
                0.15625 = fieldNorm(doc=5722)
          0.2423607 = weight(abstract_txt:wide in 5722) [ClassicSimilarity], result of:
            0.2423607 = score(doc=5722,freq=2.0), product of:
              0.22518803 = queryWeight, product of:
                1.9565182 = boost
                4.8705935 = idf(docFreq=925, maxDocs=44421)
                0.023630861 = queryNorm
              1.0762593 = fieldWeight in 5722, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8705935 = idf(docFreq=925, maxDocs=44421)
                0.15625 = fieldNorm(doc=5722)
          0.1535772 = weight(abstract_txt:search in 5722) [ClassicSimilarity], result of:
            0.1535772 = score(doc=5722,freq=2.0), product of:
              0.1901744 = queryWeight, product of:
                2.2020788 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.023630861 = queryNorm
              0.8075597 = fieldWeight in 5722, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.15625 = fieldNorm(doc=5722)
          0.46568096 = weight(abstract_txt:engines in 5722) [ClassicSimilarity], result of:
            0.46568096 = score(doc=5722,freq=2.0), product of:
              0.39840665 = queryWeight, product of:
                3.1872795 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.023630861 = queryNorm
              1.1688584 = fieldWeight in 5722, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.15625 = fieldNorm(doc=5722)
        0.2 = coord(5/25)
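
The document-level score shown beside each title appears to be the sum of the matching term scores multiplied by the coord factor, that is, the fraction of the 25 query terms found in the document. A minimal sketch under that reading, using the five term scores printed for the Rudich entry; the helper name `combined_score` is hypothetical.

```python
def combined_score(term_scores, query_terms):
    """Sum the per-term scores and scale by coord = matched / total query terms."""
    coord = len(term_scores) / query_terms
    return coord * sum(term_scores)


# Term scores for doc 5722, copied from the listing
# (directories, world, wide, search, engines).
rudich_terms = [0.2795943, 0.18198752, 0.2423607, 0.1535772, 0.46568096]

rudich = combined_score(rudich_terms, query_terms=25)
print(round(rudich, 2))
```

The coord factor explains why a document that matches few terms strongly (entry 2, 5 of 25) can still outrank one that matches more terms weakly (entry 3, 8 of 25).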
    
  3. McMurdo, G.: How the Internet was indexed (1995) 0.21
    0.21459283 = sum of:
      0.21459283 = product of:
        0.6706026 = sum of:
          0.04685557 = weight(abstract_txt:considered in 3411) [ClassicSimilarity], result of:
            0.04685557 = score(doc=3411,freq=1.0), product of:
              0.11951745 = queryWeight, product of:
                1.0078871 = boost
                5.0181065 = idf(docFreq=798, maxDocs=44421)
                0.023630861 = queryNorm
              0.39203957 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0181065 = idf(docFreq=798, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.059568934 = weight(abstract_txt:currently in 3411) [ClassicSimilarity], result of:
            0.059568934 = score(doc=3411,freq=1.0), product of:
              0.14026104 = queryWeight, product of:
                1.0918545 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.023630861 = queryNorm
              0.4247005 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.06515878 = weight(abstract_txt:automated in 3411) [ClassicSimilarity], result of:
            0.06515878 = score(doc=3411,freq=1.0), product of:
              0.1489038 = queryWeight, product of:
                1.1249912 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.023630861 = queryNorm
              0.43758973 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.07460707 = weight(abstract_txt:methods in 3411) [ClassicSimilarity], result of:
            0.07460707 = score(doc=3411,freq=2.0), product of:
              0.16297106 = queryWeight, product of:
                1.6644336 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.023630861 = queryNorm
              0.45779335 = fieldWeight in 3411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.09139581 = weight(abstract_txt:described in 3411) [ClassicSimilarity], result of:
            0.09139581 = score(doc=3411,freq=1.0), product of:
              0.23508127 = queryWeight, product of:
                1.9990343 = boost
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.023630861 = queryNorm
              0.38878387 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.0767886 = weight(abstract_txt:search in 3411) [ClassicSimilarity], result of:
            0.0767886 = score(doc=3411,freq=2.0), product of:
              0.1901744 = queryWeight, product of:
                2.2020788 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.023630861 = queryNorm
              0.40377986 = fieldWeight in 3411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.091584764 = weight(abstract_txt:indexing in 3411) [ClassicSimilarity], result of:
            0.091584764 = score(doc=3411,freq=1.0), product of:
              0.26947165 = queryWeight, product of:
                2.62128 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.023630861 = queryNorm
              0.33986792 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
          0.1646431 = weight(abstract_txt:engines in 3411) [ClassicSimilarity], result of:
            0.1646431 = score(doc=3411,freq=1.0), product of:
              0.39840665 = queryWeight, product of:
                3.1872795 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.023630861 = queryNorm
              0.41325387 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.078125 = fieldNorm(doc=3411)
        0.32 = coord(8/25)
    
  4. Köhler, J.; Philippi, S.; Specht, M.; Rüegg, A.: Ontology based text indexing and querying for the semantic web (2006) 0.19
    0.18768558 = sum of:
      0.18768558 = product of:
        0.58651745 = sum of:
          0.05212702 = weight(abstract_txt:automated in 267) [ClassicSimilarity], result of:
            0.05212702 = score(doc=267,freq=1.0), product of:
              0.1489038 = queryWeight, product of:
                1.1249912 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.023630861 = queryNorm
              0.3500718 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.043932892 = weight(abstract_txt:approach in 267) [ClassicSimilarity], result of:
            0.043932892 = score(doc=267,freq=2.0), product of:
              0.13285881 = queryWeight, product of:
                1.5028181 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.023630861 = queryNorm
              0.33067352 = fieldWeight in 267, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.0730997 = weight(abstract_txt:methods in 267) [ClassicSimilarity], result of:
            0.0730997 = score(doc=267,freq=3.0), product of:
              0.16297106 = queryWeight, product of:
                1.6644336 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.023630861 = queryNorm
              0.44854406 = fieldWeight in 267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.073116645 = weight(abstract_txt:described in 267) [ClassicSimilarity], result of:
            0.073116645 = score(doc=267,freq=1.0), product of:
              0.23508127 = queryWeight, product of:
                1.9990343 = boost
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.023630861 = queryNorm
              0.3110271 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.03367321 = weight(abstract_txt:used in 267) [ClassicSimilarity], result of:
            0.03367321 = score(doc=267,freq=1.0), product of:
              0.16048235 = queryWeight, product of:
                2.0228817 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.023630861 = queryNorm
              0.20982501 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.07523715 = weight(abstract_txt:search in 267) [ClassicSimilarity], result of:
            0.07523715 = score(doc=267,freq=3.0), product of:
              0.1901744 = queryWeight, product of:
                2.2020788 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.023630861 = queryNorm
              0.39562184 = fieldWeight in 267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.103616334 = weight(abstract_txt:indexing in 267) [ClassicSimilarity], result of:
            0.103616334 = score(doc=267,freq=2.0), product of:
              0.26947165 = queryWeight, product of:
                2.62128 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.023630861 = queryNorm
              0.38451666 = fieldWeight in 267, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
          0.13171448 = weight(abstract_txt:engines in 267) [ClassicSimilarity], result of:
            0.13171448 = score(doc=267,freq=1.0), product of:
              0.39840665 = queryWeight, product of:
                3.1872795 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.023630861 = queryNorm
              0.3306031 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.0625 = fieldNorm(doc=267)
        0.32 = coord(8/25)
    
  5. Frakes, W.B.: Stemming algorithms (1992) 0.18
    0.18153639 = sum of:
      0.18153639 = product of:
        0.75640166 = sum of:
          0.122183256 = weight(abstract_txt:detail in 4503) [ClassicSimilarity], result of:
            0.122183256 = score(doc=4503,freq=1.0), product of:
              0.16552044 = queryWeight, product of:
                1.186102 = boost
                5.90541 = idf(docFreq=328, maxDocs=44421)
                0.023630861 = queryNorm
              0.7381762 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.90541 = idf(docFreq=328, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
          0.12652968 = weight(abstract_txt:programs in 4503) [ClassicSimilarity], result of:
            0.12652968 = score(doc=4503,freq=1.0), product of:
              0.16942291 = queryWeight, product of:
                1.2000029 = boost
                5.97462 = idf(docFreq=306, maxDocs=44421)
                0.023630861 = queryNorm
              0.7468275 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.97462 = idf(docFreq=306, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
          0.14623329 = weight(abstract_txt:described in 4503) [ClassicSimilarity], result of:
            0.14623329 = score(doc=4503,freq=1.0), product of:
              0.23508127 = queryWeight, product of:
                1.9990343 = boost
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.023630861 = queryNorm
              0.6220542 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9764338 = idf(docFreq=832, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
          0.06734642 = weight(abstract_txt:used in 4503) [ClassicSimilarity], result of:
            0.06734642 = score(doc=4503,freq=1.0), product of:
              0.16048235 = queryWeight, product of:
                2.0228817 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.023630861 = queryNorm
              0.41965002 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
          0.08687637 = weight(abstract_txt:search in 4503) [ClassicSimilarity], result of:
            0.08687637 = score(doc=4503,freq=1.0), product of:
              0.1901744 = queryWeight, product of:
                2.2020788 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.023630861 = queryNorm
              0.45682475 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
          0.20723267 = weight(abstract_txt:indexing in 4503) [ClassicSimilarity], result of:
            0.20723267 = score(doc=4503,freq=2.0), product of:
              0.26947165 = queryWeight, product of:
                2.62128 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.023630861 = queryNorm
              0.7690333 = fieldWeight in 4503, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.125 = fieldNorm(doc=4503)
        0.24 = coord(6/25)