Document (#44073)

Author
Berg, A.
Nelimarkka, M.
Title
Do you see what I see? : measuring the semantic differences in image-recognition services' outputs
Source
Journal of the Association for Information Science and Technology. 74(2023) no.11, pp. 1307-1324
Year
2023
Abstract
As scholars increasingly undertake large-scale analysis of visual materials, advanced computational tools show promise for informing that process. One technique in the toolbox is image recognition, made readily accessible via Google Vision AI, Microsoft Azure Computer Vision, and Amazon's Rekognition service. However, concerns about such issues as bias factors and low reliability have led to warnings against research employing it. A systematic study of cross-service label agreement concretized such issues: using eight datasets, spanning professionally produced and user-generated images, the work showed that image-recognition services disagree on the most suitable labels for images. Beyond supporting caveats expressed in prior literature, the report articulates two mitigation strategies, both involving the use of multiple image-recognition services: Highly explorative research could include all the labels, accepting noisier but less restrictive analysis output. Alternatively, scholars may employ word-embedding-based approaches to identify concepts that are similar enough for their purposes, then focus on those labels filtered in.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/toc/23301643/current. https://doi.org/10.1002/asi.24827.
Field
Cognitive science
Computer science
Form
Images

Similar documents (author)

  1. Berg, O.: Current problems with MARC/ISBD formats in relation to online public access of bibliographic information (1991) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:berg in 468) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 468, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=468)
    
  2. Berg, S.: Auf dem Weg : Fallbeispiel: Vorbereitungen für einen elektronischen Katalog (1995) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:berg in 716) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 716, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=716)
    
  3. Berg, L.: Wie das Internet die Gesellschaft verändert : Google gründet ein Forschungsinstitut in Berlin (2011) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:berg in 552) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 552, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=552)
    
  4. Berg, L.: Pablo will es wissen : Lernen mit Salman Khan (2012) 5.41
    5.4105906 = sum of:
      5.4105906 = weight(author_txt:berg in 1228) [ClassicSimilarity], result of:
        5.4105906 = fieldWeight in 1228, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.625 = fieldNorm(doc=1228)
    
  5. Berg, J. van den: The ICONCLASS browser user's guide (1992) 4.33
    4.3284726 = sum of:
      4.3284726 = weight(author_txt:berg in 3269) [ClassicSimilarity], result of:
        4.3284726 = fieldWeight in 3269, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.656945 = idf(docFreq=20, maxDocs=44421)
          0.5 = fieldNorm(doc=3269)
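
The author-match scores above can be checked by hand. The sketch below assumes the printed numbers follow tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with every value in the listing; the function names are illustrative and not part of any library. The fieldNorm values (0.625, 0.5) are taken directly from the listing rather than derived.

```python
from math import log, sqrt

def idf(doc_freq, max_docs):
    # Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor, assumed form: square root of the raw count.
    return sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight as printed in the listing: tf * idf * fieldNorm.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# "berg" occurs in 20 of 44421 author fields; hits 1 to 4 use fieldNorm 0.625.
top_hit = field_weight(1.0, 20, 44421, 0.625)
# Hit 5 ("Berg, J. van den") has a longer author field, hence fieldNorm 0.5.
fifth_hit = field_weight(1.0, 20, 44421, 0.5)
print(f"{top_hit:.4f} {fifth_hit:.4f}")
```

The only difference between the first four hits and the fifth is the fieldNorm, which is why the fifth scores lower although it matches the same term once.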
    

Similar documents (content)

  1. Heidorn, P.B.: Image retrieval as linguistic and nonlinguistic visual model matching (1999) 0.14
    0.13673827 = sum of:
      0.13673827 = product of:
        0.6836913 = sum of:
          0.06891312 = weight(abstract_txt:images in 966) [ClassicSimilarity], result of:
            0.06891312 = score(doc=966,freq=2.0), product of:
              0.14362161 = queryWeight, product of:
                1.4890494 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.017767388 = queryNorm
              0.47982416 = fieldWeight in 966, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.0625 = fieldNorm(doc=966)
          0.07922661 = weight(abstract_txt:vision in 966) [ClassicSimilarity], result of:
            0.07922661 = score(doc=966,freq=1.0), product of:
              0.19858323 = queryWeight, product of:
                1.750937 = boost
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.017767388 = queryNorm
              0.39895922 = fieldWeight in 966, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3833475 = idf(docFreq=203, maxDocs=44421)
                0.0625 = fieldNorm(doc=966)
          0.16388105 = weight(abstract_txt:image in 966) [ClassicSimilarity], result of:
            0.16388105 = score(doc=966,freq=3.0), product of:
              0.28163326 = queryWeight, product of:
                2.9488738 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.017767388 = queryNorm
              0.58189523 = fieldWeight in 966, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=966)
          0.17474884 = weight(abstract_txt:labels in 966) [ClassicSimilarity], result of:
            0.17474884 = score(doc=966,freq=1.0), product of:
              0.3851842 = queryWeight, product of:
                2.986614 = boost
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.017767388 = queryNorm
              0.45367602 = fieldWeight in 966, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2588162 = idf(docFreq=84, maxDocs=44421)
                0.0625 = fieldNorm(doc=966)
          0.1969217 = weight(abstract_txt:recognition in 966) [ClassicSimilarity], result of:
            0.1969217 = score(doc=966,freq=2.0), product of:
              0.36438257 = queryWeight, product of:
                3.3542314 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.017767388 = queryNorm
              0.5404257 = fieldWeight in 966, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=966)
        0.2 = coord(5/25)
    
  2. McCain, K.W.: Assessing obliteration by incorporation : issues and caveats (2012) 0.11
    0.10648501 = sum of:
      0.10648501 = product of:
        0.8873751 = sum of:
          0.014759089 = weight(abstract_txt:analysis in 1485) [ClassicSimilarity], result of:
            0.014759089 = score(doc=1485,freq=1.0), product of:
              0.064774126 = queryWeight, product of:
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.017767388 = queryNorm
              0.2278547 = fieldWeight in 1485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0625 = fieldNorm(doc=1485)
          0.024211274 = weight(abstract_txt:issues in 1485) [ClassicSimilarity], result of:
            0.024211274 = score(doc=1485,freq=1.0), product of:
              0.0900963 = queryWeight, product of:
                1.1793771 = boost
                4.299626 = idf(docFreq=1638, maxDocs=44421)
                0.017767388 = queryNorm
              0.26872662 = fieldWeight in 1485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.299626 = idf(docFreq=1638, maxDocs=44421)
                0.0625 = fieldNorm(doc=1485)
          0.84840477 = weight(title_txt:caveats in 1485) [ClassicSimilarity], result of:
            0.84840477 = score(doc=1485,freq=1.0), product of:
              0.23191015 = queryWeight, product of:
                1.337963 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.017767388 = queryNorm
              3.6583338 = fieldWeight in 1485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.375 = fieldNorm(doc=1485)
        0.12 = coord(3/25)
    
  3. Berinstein, P.: Images in your future : the missing picture in an online search (1997) 0.10
    0.1022352 = sum of:
      0.1022352 = product of:
        0.63897 = sum of:
          0.17055127 = weight(abstract_txt:images in 556) [ClassicSimilarity], result of:
            0.17055127 = score(doc=556,freq=4.0), product of:
              0.14362161 = queryWeight, product of:
                1.4890494 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.017767388 = queryNorm
              1.1875043 = fieldWeight in 556, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.109375 = fieldNorm(doc=556)
          0.059161264 = weight(abstract_txt:services in 556) [ClassicSimilarity], result of:
            0.059161264 = score(doc=556,freq=1.0), product of:
              0.12884232 = queryWeight, product of:
                1.7273253 = boost
                4.198178 = idf(docFreq=1813, maxDocs=44421)
                0.017767388 = queryNorm
              0.4591757 = fieldWeight in 556, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.198178 = idf(docFreq=1813, maxDocs=44421)
                0.109375 = fieldNorm(doc=556)
          0.16557935 = weight(abstract_txt:image in 556) [ClassicSimilarity], result of:
            0.16557935 = score(doc=556,freq=1.0), product of:
              0.28163326 = queryWeight, product of:
                2.9488738 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.017767388 = queryNorm
              0.58792543 = fieldWeight in 556, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.109375 = fieldNorm(doc=556)
          0.24367818 = weight(abstract_txt:recognition in 556) [ClassicSimilarity], result of:
            0.24367818 = score(doc=556,freq=1.0), product of:
              0.36438257 = queryWeight, product of:
                3.3542314 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.017767388 = queryNorm
              0.6687427 = fieldWeight in 556, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.109375 = fieldNorm(doc=556)
        0.16 = coord(4/25)
    
  4. Town, C.; Harrison, K.: Large-scale grid computing for content-based image retrieval (2010) 0.09
    0.093646884 = sum of:
      0.093646884 = product of:
        0.58529305 = sum of:
          0.025828404 = weight(abstract_txt:analysis in 934) [ClassicSimilarity], result of:
            0.025828404 = score(doc=934,freq=4.0), product of:
              0.064774126 = queryWeight, product of:
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.017767388 = queryNorm
              0.39874572 = fieldWeight in 934, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0546875 = fieldNorm(doc=934)
          0.07385086 = weight(abstract_txt:images in 934) [ClassicSimilarity], result of:
            0.07385086 = score(doc=934,freq=3.0), product of:
              0.14362161 = queryWeight, product of:
                1.4890494 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.017767388 = queryNorm
              0.5142044 = fieldWeight in 934, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.0546875 = fieldNorm(doc=934)
          0.2745823 = weight(abstract_txt:image in 934) [ClassicSimilarity], result of:
            0.2745823 = score(doc=934,freq=11.0), product of:
              0.28163326 = queryWeight, product of:
                2.9488738 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.017767388 = queryNorm
              0.974964 = fieldWeight in 934, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0546875 = fieldNorm(doc=934)
          0.2110315 = weight(abstract_txt:recognition in 934) [ClassicSimilarity], result of:
            0.2110315 = score(doc=934,freq=3.0), product of:
              0.36438257 = queryWeight, product of:
                3.3542314 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.017767388 = queryNorm
              0.5791482 = fieldWeight in 934, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0546875 = fieldNorm(doc=934)
        0.16 = coord(4/25)
    
  5. Rabitti, F.; Savino, P.: Automatic image indexation to support content-based retrieval (1992) 0.09
    0.08917926 = sum of:
      0.08917926 = product of:
        0.5573704 = sum of:
          0.029518178 = weight(abstract_txt:analysis in 3031) [ClassicSimilarity], result of:
            0.029518178 = score(doc=3031,freq=4.0), product of:
              0.064774126 = queryWeight, product of:
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.017767388 = queryNorm
              0.4557094 = fieldWeight in 3031, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0625 = fieldNorm(doc=3031)
          0.11936102 = weight(abstract_txt:images in 3031) [ClassicSimilarity], result of:
            0.11936102 = score(doc=3031,freq=6.0), product of:
              0.14362161 = queryWeight, product of:
                1.4890494 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.017767388 = queryNorm
              0.83107984 = fieldWeight in 3031, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.0625 = fieldNorm(doc=3031)
          0.21156953 = weight(abstract_txt:image in 3031) [ClassicSimilarity], result of:
            0.21156953 = score(doc=3031,freq=5.0), product of:
              0.28163326 = queryWeight, product of:
                2.9488738 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.017767388 = queryNorm
              0.75122356 = fieldWeight in 3031, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=3031)
          0.1969217 = weight(abstract_txt:recognition in 3031) [ClassicSimilarity], result of:
            0.1969217 = score(doc=3031,freq=2.0), product of:
              0.36438257 = queryWeight, product of:
                3.3542314 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.017767388 = queryNorm
              0.5404257 = fieldWeight in 3031, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=3031)
        0.16 = coord(4/25)
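
The content-based scores combine several matching terms. As a minimal sketch of how the listing's numbers fit together, the code below recomputes the score of the first hit (Heidorn, doc 966) from the boost, idf, freq, fieldNorm, queryNorm and coord values printed above. The helper names are hypothetical, and tf = sqrt(freq) is an assumption that matches the printed tf values.

```python
from math import sqrt

QUERY_NORM = 0.017767388  # queryNorm as printed in the listing

def term_score(boost, idf, freq, field_norm, query_norm=QUERY_NORM):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf assumed to be sqrt(freq)
    field_weight = sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(terms, matched, total):
    # Sum of per-term scores, scaled by coord = matched / total query terms.
    return sum(term_score(*t) for t in terms) * (matched / total)

# (boost, idf, freq, fieldNorm) for each matched term of doc 966
heidorn_terms = [
    (1.4890494, 5.428591, 2.0, 0.0625),   # images
    (1.750937, 6.3833475, 1.0, 0.0625),   # vision
    (2.9488738, 5.375318, 3.0, 0.0625),   # image
    (2.986614, 7.2588162, 1.0, 0.0625),   # labels
    (3.3542314, 6.114219, 2.0, 0.0625),   # recognition
]
score = document_score(heidorn_terms, matched=5, total=25)
print(round(score, 4))
```

Terms listed without a boost line, such as "analysis", can be passed with a boost of 1.0. The coord factor explains why a document with one very strong title match (McCain, 3 of 25 terms) can still rank below one with five moderate abstract matches.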