Document (#41599)

Author
Gasser, M.
Wanger, R.
Prada, I.
Title
Wenn Algorithmen Zeitschriften lesen : vom Mehrwert automatisierter Textanreicherung
Source
o-bib: Das offene Bibliotheksjournal. 5(2018) no.4, pp.181-192
Year
2018
Abstract
In collaboration with the Institute of Computational Linguistics at the University of Zurich (ICL UZH), the ETH Library Zurich launched a pilot project in automated text enrichment. The pilot was based on full-text files from the Swiss journal platform E-Periodica. Using a selected corpus of this OCR data, automated methods were tested in three areas: OCR correction, recognition of person, place and country names, and linking of identified persons to the Integrated Authority File (Gemeinsame Normdatei, GND). Overall, very positive results were achieved. The system used now serves as the basis for further building up the ETH Library's expertise in this field. The entire existing content of the E-Periodica platform is to be enriched automatically and extended with new functionality, with the aim of offering researchers added value when searching for information. This article explains the project's content, methodology and results and outlines the next steps.
Content
Paper presented at the 107th Deutscher Bibliothekartag 2018 in Berlin, topic area "Fokus Erschließen & Bewahren". https://www.o-bib.de/article/view/5382. https://doi.org/10.5282/o-bib/2018H4S181-192.
Form
Newspapers
Location
CH

Similar documents (author)

  1. Palfrey, J.; Gasser, U.: Generation Internet : die Digital Natives: Wie sie leben - Was sie denken - Wie sie arbeiten (2008) 4.88
    4.8777785 = sum of:
      4.8777785 = weight(author_txt:gasser in 56) [ClassicSimilarity], result of:
        4.8777785 = fieldWeight in 56, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.5 = fieldNorm(doc=56)
    
  2. Stvilia, B.; Gasser, L.: Value-based metadata quality assessment (2008) 4.88
    4.8777785 = sum of:
      4.8777785 = weight(author_txt:gasser in 1252) [ClassicSimilarity], result of:
        4.8777785 = fieldWeight in 1252, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.5 = fieldNorm(doc=1252)
    
  3. Gasser, U.; Thurman, J.: Themen und Herausforderungen der Regulierung von Suchmaschinen (2007) 4.88
    4.8777785 = sum of:
      4.8777785 = weight(author_txt:gasser in 1382) [ClassicSimilarity], result of:
        4.8777785 = fieldWeight in 1382, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.5 = fieldNorm(doc=1382)
    
  4. Stvilia, B.; Gasser, L.; Twidale, M.B.; Smith, L.C.: A framework for information quality assessment (2007) 3.05
    3.0486116 = sum of:
      3.0486116 = weight(author_txt:gasser in 1610) [ClassicSimilarity], result of:
        3.0486116 = fieldWeight in 1610, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.3125 = fieldNorm(doc=1610)
    
  5. Stvilia, B.; Twidale, M.B.; Smith, L.C.; Gasser, L.: Information quality work organization in wikipedia (2008) 3.05
    3.0486116 = sum of:
      3.0486116 = weight(author_txt:gasser in 2859) [ClassicSimilarity], result of:
        3.0486116 = fieldWeight in 2859, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.3125 = fieldNorm(doc=2859)
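
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the tf and idf formulas that the figures in the listing are consistent with (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library API.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces idf(docFreq=6, maxDocs=44421) = 9.755557.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Assumed tf formula: square root of the raw term frequency.
    return math.sqrt(freq)


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1 (doc 56): author_txt:gasser, freq=1, docFreq=6, fieldNorm=0.5
score = field_weight(1.0, 6, 44421, 0.5)
print(round(score, 4))  # 4.8778
```

With fieldNorm=0.3125 (entries 4 and 5, which list more authors) the same call gives roughly 3.05, which is why those records rank lower although the matched term is identical.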
    

Similar documents (content)

  1. Bubenhofer, N.: Einführung in die Korpuslinguistik : Praktische Grundlagen und Werkzeuge (2006) 0.09
    0.093287125 = sum of:
      0.093287125 = product of:
        0.77739275 = sum of:
          0.15090765 = weight(abstract_txt:computerlinguistik in 113) [ClassicSimilarity], result of:
            0.15090765 = score(doc=113,freq=1.0), product of:
              0.16187495 = queryWeight, product of:
                1.0463159 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.018151112 = queryNorm
              0.93224835 = fieldWeight in 113, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.109375 = fieldNorm(doc=113)
          0.21710518 = weight(abstract_txt:korpus in 113) [ClassicSimilarity], result of:
            0.21710518 = score(doc=113,freq=1.0), product of:
              0.20629351 = queryWeight, product of:
                1.1811792 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.018151112 = queryNorm
              1.0524092 = fieldWeight in 113, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.109375 = fieldNorm(doc=113)
          0.40937993 = weight(abstract_txt:zürich in 113) [ClassicSimilarity], result of:
            0.40937993 = score(doc=113,freq=2.0), product of:
              0.3148641 = queryWeight, product of:
                2.0637143 = boost
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.018151112 = queryNorm
              1.3001797 = fieldWeight in 113, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.109375 = fieldNorm(doc=113)
        0.12 = coord(3/25)
    
  2. Jutzi, U.; Keller, A.: Dissertationen Online an der ETH-Bibliothek Zürich (2001) 0.08
    0.0783216 = sum of:
      0.0783216 = product of:
        0.65268 = sum of:
          0.11945393 = weight(abstract_txt:schweizer in 6670) [ClassicSimilarity], result of:
            0.11945393 = score(doc=6670,freq=1.0), product of:
              0.1535101 = queryWeight, product of:
                1.0189233 = boost
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.018151112 = queryNorm
              0.7781503 = fieldWeight in 6670, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.09375 = fieldNorm(doc=6670)
          0.103466675 = weight(abstract_txt:bibliothek in 6670) [ClassicSimilarity], result of:
            0.103466675 = score(doc=6670,freq=3.0), product of:
              0.12185403 = queryWeight, product of:
                1.2838312 = boost
                5.229121 = idf(docFreq=646, maxDocs=44421)
                0.018151112 = queryNorm
              0.84910345 = fieldWeight in 6670, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.229121 = idf(docFreq=646, maxDocs=44421)
                0.09375 = fieldNorm(doc=6670)
          0.42975938 = weight(abstract_txt:zürich in 6670) [ClassicSimilarity], result of:
            0.42975938 = score(doc=6670,freq=3.0), product of:
              0.3148641 = queryWeight, product of:
                2.0637143 = boost
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.018151112 = queryNorm
              1.3649044 = fieldWeight in 6670, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.09375 = fieldNorm(doc=6670)
        0.12 = coord(3/25)
    
  3. Jensen, N.: Evaluierung von mehrsprachigem Web-Retrieval : Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF) (2006) 0.08
    0.07618904 = sum of:
      0.07618904 = product of:
        0.4761815 = sum of:
          0.11292135 = weight(abstract_txt:erzielt in 964) [ClassicSimilarity], result of:
            0.11292135 = score(doc=964,freq=1.0), product of:
              0.14786112 = queryWeight, product of:
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.018151112 = queryNorm
              0.7636987 = fieldWeight in 964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.09375 = fieldNorm(doc=964)
          0.041236892 = weight(abstract_txt:sowie in 964) [ClassicSimilarity], result of:
            0.041236892 = score(doc=964,freq=1.0), product of:
              0.09517815 = queryWeight, product of:
                1.1346362 = boost
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.018151112 = queryNorm
              0.43326008 = fieldWeight in 964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.09375 = fieldNorm(doc=964)
          0.2631712 = weight(abstract_txt:korpus in 964) [ClassicSimilarity], result of:
            0.2631712 = score(doc=964,freq=2.0), product of:
              0.20629351 = queryWeight, product of:
                1.1811792 = boost
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.018151112 = queryNorm
              1.2757125 = fieldWeight in 964, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.622026 = idf(docFreq=7, maxDocs=44421)
                0.09375 = fieldNorm(doc=964)
          0.05885206 = weight(abstract_txt:wurden in 964) [ClassicSimilarity], result of:
            0.05885206 = score(doc=964,freq=1.0), product of:
              0.12064827 = queryWeight, product of:
                1.2774637 = boost
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.018151112 = queryNorm
              0.48779863 = fieldWeight in 964, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.09375 = fieldNorm(doc=964)
        0.16 = coord(4/25)
    
  4. Beck, C.: Die Qualität der Fremddatenanreicherung FRED (2021) 0.07
    0.07272086 = sum of:
      0.07272086 = product of:
        0.45450538 = sum of:
          0.027491262 = weight(abstract_txt:sowie in 1378) [ClassicSimilarity], result of:
            0.027491262 = score(doc=1378,freq=1.0), product of:
              0.09517815 = queryWeight, product of:
                1.1346362 = boost
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.018151112 = queryNorm
              0.28884006 = fieldWeight in 1378, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.0625 = fieldNorm(doc=1378)
          0.03923471 = weight(abstract_txt:wurden in 1378) [ClassicSimilarity], result of:
            0.03923471 = score(doc=1378,freq=1.0), product of:
              0.12064827 = queryWeight, product of:
                1.2774637 = boost
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.018151112 = queryNorm
              0.3251991 = fieldWeight in 1378, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.0625 = fieldNorm(doc=1378)
          0.153848 = weight(abstract_txt:resultate in 1378) [ClassicSimilarity], result of:
            0.153848 = score(doc=1378,freq=1.0), product of:
              0.30000976 = queryWeight, product of:
                2.0144463 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.018151112 = queryNorm
              0.51281 = fieldWeight in 1378, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=1378)
          0.23393139 = weight(abstract_txt:zürich in 1378) [ClassicSimilarity], result of:
            0.23393139 = score(doc=1378,freq=2.0), product of:
              0.3148641 = queryWeight, product of:
                2.0637143 = boost
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.018151112 = queryNorm
              0.74295986 = fieldWeight in 1378, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.0625 = fieldNorm(doc=1378)
        0.16 = coord(4/25)
    
  5. Buurman, G.M.: Wissenterritorien : ein Werkzeug zur Visualisierung wissenschaftlicher Diskurse (2001) 0.07
    0.06602641 = sum of:
      0.06602641 = product of:
        0.33013207 = sum of:
          0.075453825 = weight(abstract_txt:computerlinguistik in 6889) [ClassicSimilarity], result of:
            0.075453825 = score(doc=6889,freq=1.0), product of:
              0.16187495 = queryWeight, product of:
                1.0463159 = boost
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.018151112 = queryNorm
              0.46612418 = fieldWeight in 6889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6889)
          0.024054855 = weight(abstract_txt:sowie in 6889) [ClassicSimilarity], result of:
            0.024054855 = score(doc=6889,freq=1.0), product of:
              0.09517815 = queryWeight, product of:
                1.1346362 = boost
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.018151112 = queryNorm
              0.25273505 = fieldWeight in 6889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6889)
          0.03433037 = weight(abstract_txt:wurden in 6889) [ClassicSimilarity], result of:
            0.03433037 = score(doc=6889,freq=1.0), product of:
              0.12064827 = queryWeight, product of:
                1.2774637 = boost
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.018151112 = queryNorm
              0.2845492 = fieldWeight in 6889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2031856 = idf(docFreq=663, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6889)
          0.051555336 = weight(abstract_txt:grundlage in 6889) [ClassicSimilarity], result of:
            0.051555336 = score(doc=6889,freq=1.0), product of:
              0.15821628 = queryWeight, product of:
                1.4628965 = boost
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.018151112 = queryNorm
              0.32585353 = fieldWeight in 6889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9584646 = idf(docFreq=311, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6889)
          0.14473766 = weight(abstract_txt:zürich in 6889) [ClassicSimilarity], result of:
            0.14473766 = score(doc=6889,freq=1.0), product of:
              0.3148641 = queryWeight, product of:
                2.0637143 = boost
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.018151112 = queryNorm
              0.45968294 = fieldWeight in 6889, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.405631 = idf(docFreq=26, maxDocs=44421)
                0.0546875 = fieldNorm(doc=6889)
        0.2 = coord(5/25)
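
The content scores combine a per-term score (queryWeight times fieldWeight) with a coordination factor. The sketch below recomputes entry 1 (doc 113) from the values printed in its explain tree, under the same assumed tf and idf formulas as the figures suggest; helper names are illustrative, and the boost and queryNorm values are copied from the listing, not derived.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed idf formula, consistent with the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, boost, query_norm):
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm              # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm    # fieldWeight
    return query_weight * field_weight


MAX_DOCS = 44421
QUERY_NORM = 0.018151112   # taken from the listing
FIELD_NORM = 0.109375      # fieldNorm(doc=113)

# (freq, docFreq, boost) for the three query terms matched in doc 113
matched = [
    (1.0, 23, 1.0463159),  # abstract_txt:computerlinguistik
    (1.0, 7, 1.1811792),   # abstract_txt:korpus
    (2.0, 26, 2.0637143),  # abstract_txt:zürich
]

coord = len(matched) / 25  # coord(3/25): 3 of 25 query terms matched
total = coord * sum(
    term_score(f, df, MAX_DOCS, FIELD_NORM, b, QUERY_NORM)
    for f, df, b in matched
)
print(round(total, 4))  # 0.0933
```

The coord factor explains why entry 5 stays below entry 1 despite matching five terms: its smaller fieldNorm (0.0546875) outweighs the higher coord(5/25).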