Document (#44387)

Author
Jahani, H.
Azzopardi, L.
Sanderson, M.
Title
Measuring the retrievability of digital library content using analytics data
Source
Journal of the Association for Information Science and Technology. 75(2024) no.11, pp.1233-1248
Year
2024
Abstract
Digital libraries aim to provide value to users by housing content that is accessible and searchable. Often such access is afforded through external web search engines. In this article, we measure how easily digital library content can be retrieved (i.e., how retrievable it is) through a well-known search engine (Google) using its analytics platforms. Using two measures of document retrievability, we contrast our results with simulation-based studies that employed synthetic query sets. We determine that estimating the retrievability of content given a Digital Library index is not a strong predictor of how retrievable the content is in practice (via external search engines). Retrievability established the notion that search algorithms can be biased. In our work, we find that while such bias is present, much of the variation in retrievability appears to be strongly influenced by the queries submitted to the library, a side of retrievability less examined in past work.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24886. DOI: https://doi.org/10.1002/asi.24886.

Similar documents (author)

  1. Sanderson, M.: ¬The Reuters test collection (1996) 1.66
    1.662663 = sum of:
      1.662663 = product of:
        3.325326 = sum of:
          3.325326 = weight(author_txt:sanderson in 40) [ClassicSimilarity], result of:
            3.325326 = score(doc=40,freq=1.0), product of:
              0.62112284 = queryWeight, product of:
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.07251048 = queryNorm
              5.353733 = fieldWeight in 40, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.625 = fieldNorm(doc=40)
        0.5 = coord(1/2)
    
  2. Sanderson, M.: Revisiting h measured on UK LIS and IR academics (2008) 1.66
    1.662663 = sum of:
      1.662663 = product of:
        3.325326 = sum of:
          3.325326 = weight(author_txt:sanderson in 2867) [ClassicSimilarity], result of:
            3.325326 = score(doc=2867,freq=1.0), product of:
              0.62112284 = queryWeight, product of:
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.07251048 = queryNorm
              5.353733 = fieldWeight in 2867, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.625 = fieldNorm(doc=2867)
        0.5 = coord(1/2)
    
  3. Baillie, M.; Azzopardi, L.; Ruthven, I.: Evaluating epistemic uncertainty under incomplete assessments (2008) 1.41
    1.4139205 = sum of:
      1.4139205 = product of:
        2.827841 = sum of:
          2.827841 = weight(author_txt:azzopardi in 3065) [ClassicSimilarity], result of:
            2.827841 = score(doc=3065,freq=1.0), product of:
              0.7837132 = queryWeight, product of:
                1.1232847 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07251048 = queryNorm
              3.60826 = fieldWeight in 3065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.375 = fieldNorm(doc=3065)
        0.5 = coord(1/2)
    
  4. Balog, K.; Azzopardi, L.; Rijke, M. de: ¬A language modeling framework for expert finding (2009) 1.41
    1.4139205 = sum of:
      1.4139205 = product of:
        2.827841 = sum of:
          2.827841 = weight(author_txt:azzopardi in 3447) [ClassicSimilarity], result of:
            2.827841 = score(doc=3447,freq=1.0), product of:
              0.7837132 = queryWeight, product of:
                1.1232847 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07251048 = queryNorm
              3.60826 = fieldWeight in 3447, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.375 = fieldNorm(doc=3447)
        0.5 = coord(1/2)
    
  5. Layfield, C.; Azzopardi, J.; Staff, C.: Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback (2017) 1.41
    1.4139205 = sum of:
      1.4139205 = product of:
        2.827841 = sum of:
          2.827841 = weight(author_txt:azzopardi in 4478) [ClassicSimilarity], result of:
            2.827841 = score(doc=4478,freq=1.0), product of:
              0.7837132 = queryWeight, product of:
                1.1232847 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.07251048 = queryNorm
              3.60826 = fieldWeight in 4478, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.375 = fieldNorm(doc=4478)
        0.5 = coord(1/2)
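
The author-match score trees above all follow the same arithmetic: a query weight (idf times queryNorm, times a boost where one is listed) multiplied by a field weight (tf times idf times fieldNorm), then scaled by the coord factor. The sketch below recomputes the first entry (doc=40) from the numbers printed in its tree. The function names are illustrative, not Lucene API, and the idf form `1 + ln(maxDocs / (docFreq + 1))` is assumed because it reproduces the printed idf values; fieldNorm and queryNorm are taken from the output as given.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces the idf values printed in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm          # "queryWeight" in the tree
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight": tf = sqrt(freq)
    return query_weight * field_weight

# Values copied from the tree for author_txt:sanderson in doc 40.
raw = term_score(freq=1.0, doc_freq=22, max_docs=44421,
                 query_norm=0.07251048, field_norm=0.625)
final = raw * (1 / 2)  # coord(1/2): one of two query clauses matched

print(f"term score: {raw:.6f}, after coord: {final:.6f}")
```

Up to floating-point rounding, the two results agree with the 3.325326 and 1.662663 shown for that entry. The Azzopardi entries differ only in carrying an extra boost factor and a smaller fieldNorm.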
    

Similar documents (content)

  1. Bashir, S.; Rauber, A.: On the relationship between query characteristics and IR functions retrieval bias (2011) 0.15
    0.14848213 = sum of:
      0.14848213 = product of:
        1.2373511 = sum of:
          0.09051026 = weight(abstract_txt:bias in 628) [ClassicSimilarity], result of:
            0.09051026 = score(doc=628,freq=7.0), product of:
              0.07858341 = queryWeight, product of:
                1.0037432 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.011240105 = queryNorm
              1.1517731 = fieldWeight in 628, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
          0.05308448 = weight(abstract_txt:estimating in 628) [ClassicSimilarity], result of:
            0.05308448 = score(doc=628,freq=1.0), product of:
              0.1053279 = queryWeight, product of:
                1.1620609 = boost
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.011240105 = queryNorm
              0.5039926 = fieldWeight in 628, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
          1.0937563 = weight(abstract_txt:retrievability in 628) [ClassicSimilarity], result of:
            1.0937563 = score(doc=628,freq=5.0), product of:
              0.8412127 = queryWeight, product of:
                8.044252 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.011240105 = queryNorm
              1.3002138 = fieldWeight in 628, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=628)
        0.12 = coord(3/25)
    
  2. Kilgour, F.: ¬An experiment using coordinate title word searches (2004) 0.11
    0.10539091 = sum of:
      0.10539091 = product of:
        0.6586932 = sum of:
          0.015682518 = weight(abstract_txt:using in 3065) [ClassicSimilarity], result of:
            0.015682518 = score(doc=3065,freq=1.0), product of:
              0.05806877 = queryWeight, product of:
                1.4944766 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.011240105 = queryNorm
              0.27006802 = fieldWeight in 3065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=3065)
          0.008368986 = weight(abstract_txt:that in 3065) [ClassicSimilarity], result of:
            0.008368986 = score(doc=3065,freq=1.0), product of:
              0.04529639 = queryWeight, product of:
                1.7040172 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.011240105 = queryNorm
              0.18476056 = fieldWeight in 3065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=3065)
          0.023213288 = weight(abstract_txt:library in 3065) [ClassicSimilarity], result of:
            0.023213288 = score(doc=3065,freq=2.0), product of:
              0.065885946 = queryWeight, product of:
                1.8381611 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.011240105 = queryNorm
              0.35232532 = fieldWeight in 3065, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.078125 = fieldNorm(doc=3065)
          0.6114284 = weight(abstract_txt:retrievability in 3065) [ClassicSimilarity], result of:
            0.6114284 = score(doc=3065,freq=1.0), product of:
              0.8412127 = queryWeight, product of:
                8.044252 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.011240105 = queryNorm
              0.7268416 = fieldWeight in 3065, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=3065)
        0.16 = coord(4/25)
    
  3. Hider, P.M.: Search goal redefinition through user-system interaction (2007) 0.09
    0.086598106 = sum of:
      0.086598106 = product of:
        0.5412382 = sum of:
          0.012546015 = weight(abstract_txt:using in 1827) [ClassicSimilarity], result of:
            0.012546015 = score(doc=1827,freq=1.0), product of:
              0.05806877 = queryWeight, product of:
                1.4944766 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.011240105 = queryNorm
              0.21605442 = fieldWeight in 1827, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=1827)
          0.011596407 = weight(abstract_txt:that in 1827) [ClassicSimilarity], result of:
            0.011596407 = score(doc=1827,freq=3.0), product of:
              0.04529639 = queryWeight, product of:
                1.7040172 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.011240105 = queryNorm
              0.25601172 = fieldWeight in 1827, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=1827)
          0.02795303 = weight(abstract_txt:search in 1827) [ClassicSimilarity], result of:
            0.02795303 = score(doc=1827,freq=2.0), product of:
              0.08653549 = queryWeight, product of:
                2.1066108 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.011240105 = queryNorm
              0.3230239 = fieldWeight in 1827, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=1827)
          0.48914272 = weight(abstract_txt:retrievability in 1827) [ClassicSimilarity], result of:
            0.48914272 = score(doc=1827,freq=1.0), product of:
              0.8412127 = queryWeight, product of:
                8.044252 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.011240105 = queryNorm
              0.5814733 = fieldWeight in 1827, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=1827)
        0.16 = coord(4/25)
    
  4. Gnoli, C.: Classification transcends library business : the case of BiblioPhil (2010) 0.07
    0.07377872 = sum of:
      0.07377872 = product of:
        0.26349542 = sum of:
          0.03278372 = weight(abstract_txt:searchable in 685) [ClassicSimilarity], result of:
            0.03278372 = score(doc=685,freq=1.0), product of:
              0.0834959 = queryWeight, product of:
                1.0346411 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.011240105 = queryNorm
              0.39263868 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.010146856 = weight(abstract_txt:that in 685) [ClassicSimilarity], result of:
            0.010146856 = score(doc=685,freq=3.0), product of:
              0.04529639 = queryWeight, product of:
                1.7040172 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.011240105 = queryNorm
              0.22401026 = fieldWeight in 685, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.011489992 = weight(abstract_txt:library in 685) [ClassicSimilarity], result of:
            0.011489992 = score(doc=685,freq=1.0), product of:
              0.065885946 = queryWeight, product of:
                1.8381611 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.011240105 = queryNorm
              0.17439215 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.024458902 = weight(abstract_txt:search in 685) [ClassicSimilarity], result of:
            0.024458902 = score(doc=685,freq=2.0), product of:
              0.08653549 = queryWeight, product of:
                2.1066108 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.011240105 = queryNorm
              0.2826459 = fieldWeight in 685, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.028714685 = weight(abstract_txt:digital in 685) [ClassicSimilarity], result of:
            0.028714685 = score(doc=685,freq=1.0), product of:
              0.12133396 = queryWeight, product of:
                2.4944704 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.011240105 = queryNorm
              0.23665828 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.12356275 = weight(abstract_txt:retrievable in 685) [ClassicSimilarity], result of:
            0.12356275 = score(doc=685,freq=1.0), product of:
              0.25477767 = queryWeight, product of:
                2.555953 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.011240105 = queryNorm
              0.48498267 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
          0.032338504 = weight(abstract_txt:content in 685) [ClassicSimilarity], result of:
            0.032338504 = score(doc=685,freq=1.0), product of:
              0.14148039 = queryWeight, product of:
                3.0115514 = boost
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.011240105 = queryNorm
              0.22857234 = fieldWeight in 685, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.0546875 = fieldNorm(doc=685)
        0.28 = coord(7/25)
    
  5. Crane, G.: What do you do with a million books? (2006) 0.07
    0.07146119 = sum of:
      0.07146119 = product of:
        0.22331622 = sum of:
          0.03278372 = weight(abstract_txt:searchable in 2180) [ClassicSimilarity], result of:
            0.03278372 = score(doc=2180,freq=1.0), product of:
              0.0834959 = queryWeight, product of:
                1.0346411 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.011240105 = queryNorm
              0.39263868 = fieldWeight in 2180, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.033879783 = weight(abstract_txt:synthetic in 2180) [ClassicSimilarity], result of:
            0.033879783 = score(doc=2180,freq=1.0), product of:
              0.0853467 = queryWeight, product of:
                1.0460454 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.011240105 = queryNorm
              0.39696652 = fieldWeight in 2180, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.0096402485 = weight(abstract_txt:work in 2180) [ClassicSimilarity], result of:
            0.0096402485 = score(doc=2180,freq=1.0), product of:
              0.046518795 = queryWeight, product of:
                1.0921603 = boost
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.011240105 = queryNorm
              0.2072334 = fieldWeight in 2180, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7894108 = idf(docFreq=2729, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.011336864 = weight(abstract_txt:through in 2180) [ClassicSimilarity], result of:
            0.011336864 = score(doc=2180,freq=1.0), product of:
              0.051828057 = queryWeight, product of:
                1.1528018 = boost
                3.9998152 = idf(docFreq=2211, maxDocs=44421)
                0.011240105 = queryNorm
              0.2187399 = fieldWeight in 2180, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9998152 = idf(docFreq=2211, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.013099536 = weight(abstract_txt:that in 2180) [ClassicSimilarity], result of:
            0.013099536 = score(doc=2180,freq=5.0), product of:
              0.04529639 = queryWeight, product of:
                1.7040172 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.011240105 = queryNorm
              0.28919604 = fieldWeight in 2180, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.019901248 = weight(abstract_txt:library in 2180) [ClassicSimilarity], result of:
            0.019901248 = score(doc=2180,freq=3.0), product of:
              0.065885946 = queryWeight, product of:
                1.8381611 = boost
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.011240105 = queryNorm
              0.30205604 = fieldWeight in 2180, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.188885 = idf(docFreq=4976, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.07033633 = weight(abstract_txt:digital in 2180) [ClassicSimilarity], result of:
            0.07033633 = score(doc=2180,freq=6.0), product of:
              0.12133396 = queryWeight, product of:
                2.4944704 = boost
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.011240105 = queryNorm
              0.579692 = fieldWeight in 2180, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.3274655 = idf(docFreq=1593, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
          0.032338504 = weight(abstract_txt:content in 2180) [ClassicSimilarity], result of:
            0.032338504 = score(doc=2180,freq=1.0), product of:
              0.14148039 = queryWeight, product of:
                3.0115514 = boost
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.011240105 = queryNorm
              0.22857234 = fieldWeight in 2180, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1796083 = idf(docFreq=1847, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2180)
        0.32 = coord(8/25)
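
For the content-based matches, several abstract terms contribute to one score: the per-term scores are summed and the sum is multiplied by coord (matched terms over the 25 query terms). The following sketch recomputes the first content match (doc=628) from the values in its tree. The variable and function names are hypothetical, and the idf form is again an assumption chosen because it matches the printed values.

```python
import math

QUERY_NORM = 0.011240105   # queryNorm printed in the content trees
MAX_DOCS = 44421
FIELD_NORM = 0.0625        # fieldNorm(doc=628)
QUERY_TERMS = 25           # denominator of coord(3/25)

# (term, boost, docFreq, termFreq) copied from the tree for doc 628
matched_terms = [
    ("bias", 1.0037432, 113, 7.0),
    ("estimating", 1.1620609, 37, 1.0),
    ("retrievability", 8.044252, 10, 5.0),
]

def term_score(boost, doc_freq, freq):
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))  # assumed idf form
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

per_term = {term: term_score(boost, df, freq)
            for term, boost, df, freq in matched_terms}
coord = len(matched_terms) / QUERY_TERMS
total = sum(per_term.values()) * coord
share = per_term["retrievability"] / sum(per_term.values())

for term, score in per_term.items():
    print(f"{term:15s} {score:.6f}")
print(f"total after coord: {total:.6f}, share from 'retrievability': {share:.1%}")
```

The recomputed total agrees with the listed 0.14848213 up to rounding. Most of the pre-coord sum comes from the single rare, heavily boosted term "retrievability", which is also the dominant contributor in the second and third content matches.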