Document (#33046)

Author
Mowshowitz, A.
Kawaguchi, A.
Title
Measuring search engine bias
Source
Information processing and management. 41(2005) no.5, pp. 1193-1206
Year
2005
Abstract
This paper examines a real-time measure of bias in Web search engines. The measure captures the degree to which the distribution of URLs, retrieved in response to a query, deviates from an ideal or fair distribution for that query. This ideal is approximated by the distribution produced by a collection of search engines. Differences between bias and classical retrieval measures are highlighted by examining the possibilities for bias in four extreme cases of recall and precision. The results of experiments examining the influence on bias measurement of subject domains, search engines, and search terms are presented. Three general conclusions are drawn: (1) the performance of search engines can be distinguished with the aid of the bias measure; (2) bias values depend on the subject matter under consideration; (3) choice of search terms does not account for much of the variance in bias values. These conclusions underscore the need to develop "bias profiles" for search engines.
Theme
Search engines
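
The abstract describes bias as the degree to which one engine's URL distribution deviates from a norm pooled over a collection of engines. The record does not state the formula, so the following is only a minimal sketch under assumptions: it uses one minus cosine similarity between URL frequency vectors as an illustrative dissimilarity, and the engine names and URLs (`A`, `u1`, ...) as well as the helper names `cosine_similarity` and `bias` are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bias(engine_urls, pooled_results):
    """Illustrative bias: 1 - cosine similarity to the pooled URL distribution.

    Assumption: the pooled distribution over all engines stands in for the
    ideal ("fair") distribution mentioned in the abstract.
    """
    engine_vector = Counter(engine_urls)
    pooled_vector = Counter(url for urls in pooled_results for url in urls)
    return 1.0 - cosine_similarity(engine_vector, pooled_vector)


# Hypothetical result lists returned by three engines for one query.
results = {
    "A": ["u1", "u2"],
    "B": ["u1", "u3"],
    "C": ["u4"],
}
pool = list(results.values())
bias_by_engine = {name: bias(urls, pool) for name, urls in results.items()}
```

In this toy data, engine `C` returns a URL that no other engine returns, so it deviates more from the pooled distribution than `A` or `B` and receives a higher value.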

Similar documents (content)

  1. Mowshowitz, A.; Kawaguchi, A.: Assessing bias in search engines (2002) 0.56
    0.56403595 = sum of:
      0.56403595 = product of:
        1.5667665 = sum of:
          0.0643392 = weight(abstract_txt:measurement in 3574) [ClassicSimilarity], result of:
            0.0643392 = score(doc=3574,freq=2.0), product of:
              0.08412352 = queryWeight, product of:
                1.0117937 = boost
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.012010809 = queryNorm
              0.7648182 = fieldWeight in 3574, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.922344 = idf(docFreq=118, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.051708013 = weight(abstract_txt:urls in 3574) [ClassicSimilarity], result of:
            0.051708013 = score(doc=3574,freq=1.0), product of:
              0.09161831 = queryWeight, product of:
                1.0559039 = boost
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.012010809 = queryNorm
              0.5643852 = fieldWeight in 3574, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2241306 = idf(docFreq=87, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.016396193 = weight(abstract_txt:subject in 3574) [ClassicSimilarity], result of:
            0.016396193 = score(doc=3574,freq=1.0), product of:
              0.05367627 = queryWeight, product of:
                1.1429821 = boost
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.012010809 = queryNorm
              0.30546445 = fieldWeight in 3574, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.13347 = weight(abstract_txt:deviates in 3574) [ClassicSimilarity], result of:
            0.13347 = score(doc=3574,freq=1.0), product of:
              0.17239822 = queryWeight, product of:
                1.4484372 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.012010809 = queryNorm
              0.7741959 = fieldWeight in 3574, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.17054453 = weight(abstract_txt:ideal in 3574) [ClassicSimilarity], result of:
            0.17054453 = score(doc=3574,freq=4.0), product of:
              0.16112348 = queryWeight, product of:
                1.980285 = boost
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.012010809 = queryNorm
              1.058471 = fieldWeight in 3574, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.12493185 = weight(abstract_txt:distribution in 3574) [ClassicSimilarity], result of:
            0.12493185 = score(doc=3574,freq=3.0), product of:
              0.16496524 = queryWeight, product of:
                2.454088 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.012010809 = queryNorm
              0.75732225 = fieldWeight in 3574, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.1757974 = weight(abstract_txt:engines in 3574) [ClassicSimilarity], result of:
            0.1757974 = score(doc=3574,freq=3.0), product of:
              0.2456037 = queryWeight, product of:
                3.8657675 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.012010809 = queryNorm
              0.7157767 = fieldWeight in 3574, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.10711243 = weight(abstract_txt:search in 3574) [ClassicSimilarity], result of:
            0.10711243 = score(doc=3574,freq=4.0), product of:
              0.18757729 = queryWeight, product of:
                4.2733493 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.012010809 = queryNorm
              0.5710309 = fieldWeight in 3574, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
          0.72246695 = weight(abstract_txt:bias in 3574) [ClassicSimilarity], result of:
            0.72246695 = score(doc=3574,freq=3.0), product of:
              0.76653045 = queryWeight, product of:
                9.162611 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.012010809 = queryNorm
              0.9425156 = fieldWeight in 3574, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.078125 = fieldNorm(doc=3574)
        0.36 = coord(9/25)
    
  2. Jahani, H.; Azzopardi, L.; Sanderson, M.: Measuring the retrievability of digital library content using analytics data (2024) 0.15
    0.15266958 = sum of:
      0.15266958 = product of:
        0.76334786 = sum of:
          0.029481081 = weight(abstract_txt:query in 2386) [ClassicSimilarity], result of:
            0.029481081 = score(doc=2386,freq=1.0), product of:
              0.0793687 = queryWeight, product of:
                1.389866 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.012010809 = queryNorm
              0.37144467 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.078125 = fieldNorm(doc=2386)
          0.06609986 = weight(abstract_txt:measure in 2386) [ClassicSimilarity], result of:
            0.06609986 = score(doc=2386,freq=1.0), product of:
              0.15563877 = queryWeight, product of:
                2.3837066 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.012010809 = queryNorm
              0.4247005 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=2386)
          0.14353797 = weight(abstract_txt:engines in 2386) [ClassicSimilarity], result of:
            0.14353797 = score(doc=2386,freq=2.0), product of:
              0.2456037 = queryWeight, product of:
                3.8657675 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.012010809 = queryNorm
              0.5844292 = fieldWeight in 2386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.078125 = fieldNorm(doc=2386)
          0.10711243 = weight(abstract_txt:search in 2386) [ClassicSimilarity], result of:
            0.10711243 = score(doc=2386,freq=4.0), product of:
              0.18757729 = queryWeight, product of:
                4.2733493 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.012010809 = queryNorm
              0.5710309 = fieldWeight in 2386, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=2386)
          0.4171165 = weight(abstract_txt:bias in 2386) [ClassicSimilarity], result of:
            0.4171165 = score(doc=2386,freq=1.0), product of:
              0.76653045 = queryWeight, product of:
                9.162611 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.012010809 = queryNorm
              0.5441617 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.078125 = fieldNorm(doc=2386)
        0.2 = coord(5/25)
    
  3. Chau, M.; Lu, Y.; Fang, X.; Yang, C.C.: Characteristics of character usage in Chinese Web searching (2009) 0.14
    0.14202224 = sum of:
      0.14202224 = product of:
        0.5072223 = sum of:
          0.029019622 = weight(abstract_txt:terms in 3456) [ClassicSimilarity], result of:
            0.029019622 = score(doc=3456,freq=4.0), product of:
              0.057411846 = queryWeight, product of:
                1.1820859 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.012010809 = queryNorm
              0.505464 = fieldWeight in 3456, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.023584865 = weight(abstract_txt:query in 3456) [ClassicSimilarity], result of:
            0.023584865 = score(doc=3456,freq=1.0), product of:
              0.0793687 = queryWeight, product of:
                1.389866 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.012010809 = queryNorm
              0.29715574 = fieldWeight in 3456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.060772363 = weight(abstract_txt:values in 3456) [ClassicSimilarity], result of:
            0.060772363 = score(doc=3456,freq=2.0), product of:
              0.118400745 = queryWeight, product of:
                1.6975614 = boost
                5.807065 = idf(docFreq=362, maxDocs=44421)
                0.012010809 = queryNorm
              0.5132769 = fieldWeight in 3456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.807065 = idf(docFreq=362, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.05025085 = weight(abstract_txt:examining in 3456) [ClassicSimilarity], result of:
            0.05025085 = score(doc=3456,freq=1.0), product of:
              0.13141833 = queryWeight, product of:
                1.7884477 = boost
                6.1179714 = idf(docFreq=265, maxDocs=44421)
                0.012010809 = queryNorm
              0.3823732 = fieldWeight in 3456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1179714 = idf(docFreq=265, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.115407094 = weight(abstract_txt:distribution in 3456) [ClassicSimilarity], result of:
            0.115407094 = score(doc=3456,freq=4.0), product of:
              0.16496524 = queryWeight, product of:
                2.454088 = boost
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.012010809 = queryNorm
              0.6995843 = fieldWeight in 3456, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.5966744 = idf(docFreq=447, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.114830375 = weight(abstract_txt:engines in 3456) [ClassicSimilarity], result of:
            0.114830375 = score(doc=3456,freq=2.0), product of:
              0.2456037 = queryWeight, product of:
                3.8657675 = boost
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.012010809 = queryNorm
              0.46754336 = fieldWeight in 3456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2896495 = idf(docFreq=608, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
          0.11335714 = weight(abstract_txt:search in 3456) [ClassicSimilarity], result of:
            0.11335714 = score(doc=3456,freq=7.0), product of:
              0.18757729 = queryWeight, product of:
                4.2733493 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.012010809 = queryNorm
              0.6043223 = fieldWeight in 3456, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=3456)
        0.28 = coord(7/25)
    
  4. Keen, E.M.: Interactive ranked retrieval (1995) 0.13
    0.13364904 = sum of:
      0.13364904 = product of:
        0.66824514 = sum of:
          0.01967543 = weight(abstract_txt:subject in 2487) [ClassicSimilarity], result of:
            0.01967543 = score(doc=2487,freq=1.0), product of:
              0.05367627 = queryWeight, product of:
                1.1429821 = boost
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.012010809 = queryNorm
              0.36655733 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9099448 = idf(docFreq=2419, maxDocs=44421)
                0.09375 = fieldNorm(doc=2487)
          0.021764716 = weight(abstract_txt:terms in 2487) [ClassicSimilarity], result of:
            0.021764716 = score(doc=2487,freq=1.0), product of:
              0.057411846 = queryWeight, product of:
                1.1820859 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.012010809 = queryNorm
              0.379098 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.09375 = fieldNorm(doc=2487)
          0.035377298 = weight(abstract_txt:query in 2487) [ClassicSimilarity], result of:
            0.035377298 = score(doc=2487,freq=1.0), product of:
              0.0793687 = queryWeight, product of:
                1.389866 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.012010809 = queryNorm
              0.4457336 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.09375 = fieldNorm(doc=2487)
          0.09088792 = weight(abstract_txt:search in 2487) [ClassicSimilarity], result of:
            0.09088792 = score(doc=2487,freq=2.0), product of:
              0.18757729 = queryWeight, product of:
                4.2733493 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.012010809 = queryNorm
              0.4845358 = fieldWeight in 2487, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.09375 = fieldNorm(doc=2487)
          0.5005398 = weight(abstract_txt:bias in 2487) [ClassicSimilarity], result of:
            0.5005398 = score(doc=2487,freq=1.0), product of:
              0.76653045 = queryWeight, product of:
                9.162611 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.012010809 = queryNorm
              0.652994 = fieldWeight in 2487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.09375 = fieldNorm(doc=2487)
        0.2 = coord(5/25)
    
  5. Bashir, S.; Rauber, A.: On the relationship between query characteristics and IR functions retrieval bias (2011) 0.12
    0.11922239 = sum of:
      0.11922239 = product of:
        0.9935199 = sum of:
          0.057770886 = weight(abstract_txt:query in 628) [ClassicSimilarity], result of:
            0.057770886 = score(doc=628,freq=6.0), product of:
              0.0793687 = queryWeight, product of:
                1.389866 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.012010809 = queryNorm
              0.72787994 = fieldWeight in 628, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
          0.05287989 = weight(abstract_txt:measure in 628) [ClassicSimilarity], result of:
            0.05287989 = score(doc=628,freq=1.0), product of:
              0.15563877 = queryWeight, product of:
                2.3837066 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.012010809 = queryNorm
              0.3397604 = fieldWeight in 628, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
          0.8828691 = weight(abstract_txt:bias in 628) [ClassicSimilarity], result of:
            0.8828691 = score(doc=628,freq=7.0), product of:
              0.76653045 = queryWeight, product of:
                9.162611 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.012010809 = queryNorm
              1.1517731 = fieldWeight in 628, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
        0.12 = coord(3/25)
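
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) arithmetic: each term score is queryWeight (boost × idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the term scores are summed and multiplied by coord (matched terms / query terms). The sketch below recomputes some of the listed numbers from the values shown in the dump. The function and variable names are illustrative and not part of Lucene, and small differences in the last digits can arise because Lucene computes in single precision.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf as printed in the dump: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# First term of entry 1 ("measurement" in doc 3574), values copied from the dump.
measurement = term_score(
    freq=2.0, doc_freq=118, max_docs=44421,
    boost=1.0117937, query_norm=0.012010809, field_norm=0.078125,
)

# (doc id, sum of term scores, number of matched query terms) per entry.
entries = [
    (3574, 1.5667665, 9),
    (2386, 0.76334786, 5),
    (3456, 0.5072223, 7),
    (2487, 0.66824514, 5),
    (628, 0.9935199, 3),
]
QUERY_TERMS = 25  # denominator of coord(n/25) in the dump

# Final score = sum of term scores * coord.
final = {doc: raw * matched / QUERY_TERMS for doc, raw, matched in entries}
by_final = sorted(final, key=final.get, reverse=True)
by_raw_sum = [doc for doc, _, _ in sorted(entries, key=lambda e: e[1], reverse=True)]
```

Comparing `by_final` with `by_raw_sum` shows the effect of the coord factor: doc 628 has the second-largest sum of term scores but matches only 3 of the 25 query terms, so it ends up last in the list.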