Document (#40953)

Author
Drees, B.
Title
Text und data mining : Herausforderungen und Möglichkeiten für Bibliotheken
Source
Perspektive Bibliothek. 5(2016) no.1, pp.49-73
Year
2016
Abstract
Text and data mining (TDM) is gaining importance as a scientific method. It presents academic libraries with new challenges while also opening up new opportunities. This article gives an overview of TDM from a library perspective. To this end, it discusses the term text and data mining in the context of related terms and explains the goals, tasks, and methods of TDM. These are illustrated with example TDM applications in science and research. The article also sets out technical and legal problems and obstacles in the TDM context. Finally, it shows the relevance of TDM for libraries, both in their role as information intermediaries and providers and as users of TDM methods. In addition, a survey of the operators of document servers at libraries in Germany on their current handling of TDM was conducted as part of this work; it shows that there is still considerable room for expansion. The research data underlying the article are published under DOI 10.11588/data/10090.
Content
Cf.: http://journals.ub.uni-heidelberg.de/index.php/bibliothek/article/view/33691/pdf.
Theme
Data Mining
Location
D
Area
Academic libraries

Similar documents (content)

  1. Schöning-Walter, C.: Persistent Identifier für Netzpublikationen (2007) 0.12
    0.11779528 = sum of:
      0.11779528 = product of:
        0.49081367 = sum of:
          0.085011825 = weight(abstract_txt:gewinnt in 3409) [ClassicSimilarity], result of:
            0.085011825 = score(doc=3409,freq=1.0), product of:
              0.17324339 = queryWeight, product of:
                1.0405436 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.021205755 = queryNorm
              0.4907075 = fieldWeight in 3409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
          0.0912179 = weight(abstract_txt:publiziert in 3409) [ClassicSimilarity], result of:
            0.0912179 = score(doc=3409,freq=1.0), product of:
              0.18157545 = queryWeight, product of:
                1.065272 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.021205755 = queryNorm
              0.5023691 = fieldWeight in 3409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
          0.04003266 = weight(abstract_txt:neue in 3409) [ClassicSimilarity], result of:
            0.04003266 = score(doc=3409,freq=1.0), product of:
              0.13211639 = queryWeight, product of:
                1.2850655 = boost
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.021205755 = queryNorm
              0.30301052 = fieldWeight in 3409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
          0.10647971 = weight(abstract_txt:wissenschaftliche in 3409) [ClassicSimilarity], result of:
            0.10647971 = score(doc=3409,freq=2.0), product of:
              0.20130213 = queryWeight, product of:
                1.586248 = boost
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.021205755 = queryNorm
              0.52895474 = fieldWeight in 3409, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
          0.09310363 = weight(abstract_txt:herausforderungen in 3409) [ClassicSimilarity], result of:
            0.09310363 = score(doc=3409,freq=1.0), product of:
              0.23191287 = queryWeight, product of:
                1.7025871 = boost
                6.4233527 = idf(docFreq=195, maxDocs=44421)
                0.021205755 = queryNorm
              0.40145954 = fieldWeight in 3409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4233527 = idf(docFreq=195, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
          0.074967995 = weight(abstract_txt:bibliotheken in 3409) [ClassicSimilarity], result of:
            0.074967995 = score(doc=3409,freq=1.0), product of:
              0.25289544 = queryWeight, product of:
                2.5143888 = boost
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.021205755 = queryNorm
              0.2964387 = fieldWeight in 3409, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.0625 = fieldNorm(doc=3409)
        0.24 = coord(6/25)
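Each `weight(...)` entry in the tree above is the product of a query weight and a field weight. The following is a minimal sketch of that arithmetic, using the values listed for the term `gewinnt` in document 3409. The function name `term_score` and its parameter names are hypothetical and chosen for illustration only; they are not part of Lucene. The listing was produced with single-precision floats, so results in Python agree only up to rounding.

```python
def term_score(idf, query_norm, tf, field_norm, boost=1.0):
    """Recompute one weight(...) entry from its listed factors.

    Hypothetical helper mirroring the structure of the explain tree:
      queryWeight = boost * idf * queryNorm
      fieldWeight = tf * idf * fieldNorm
      score       = queryWeight * fieldWeight
    """
    query_weight = boost * idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


# Factors copied from the entry for abstract_txt:gewinnt in doc 3409.
gewinnt_score = term_score(
    idf=7.85132,
    query_norm=0.021205755,
    tf=1.0,
    field_norm=0.0625,
    boost=1.0405436,
)
print(gewinnt_score)  # close to the listed 0.085011825, up to float rounding
```

Entries that show no `boost` line, such as the term `wird` in later results, can be recomputed with the default `boost=1.0`.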
    
  2. Kreutzkam, E.: Neue Wege in alten Online-Katalogen : Catalog Enrichment als Methode der Sacherschließung? ; Stand, Entwicklung und Umsetzung in Bibliotheken Deutschlands (2007) 0.12
    0.11527158 = sum of:
      0.11527158 = product of:
        0.48029828 = sum of:
          0.033347525 = weight(abstract_txt:wird in 650) [ClassicSimilarity], result of:
            0.033347525 = score(doc=650,freq=2.0), product of:
              0.08000298 = queryWeight, product of:
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.021205755 = queryNorm
              0.41682854 = fieldWeight in 650, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
          0.106264785 = weight(abstract_txt:befragung in 650) [ClassicSimilarity], result of:
            0.106264785 = score(doc=650,freq=1.0), product of:
              0.17324339 = queryWeight, product of:
                1.0405436 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.021205755 = queryNorm
              0.61338437 = fieldWeight in 650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
          0.11190731 = weight(abstract_txt:rechtliche in 650) [ClassicSimilarity], result of:
            0.11190731 = score(doc=650,freq=1.0), product of:
              0.17932303 = queryWeight, product of:
                1.058644 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.021205755 = queryNorm
              0.6240543 = fieldWeight in 650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
          0.05004082 = weight(abstract_txt:neue in 650) [ClassicSimilarity], result of:
            0.05004082 = score(doc=650,freq=1.0), product of:
              0.13211639 = queryWeight, product of:
                1.2850655 = boost
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.021205755 = queryNorm
              0.37876314 = fieldWeight in 650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
          0.08502785 = weight(abstract_txt:methoden in 650) [ClassicSimilarity], result of:
            0.08502785 = score(doc=650,freq=1.0), product of:
              0.1881256 = queryWeight, product of:
                1.5334544 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.021205755 = queryNorm
              0.45197386 = fieldWeight in 650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
          0.09371 = weight(abstract_txt:bibliotheken in 650) [ClassicSimilarity], result of:
            0.09371 = score(doc=650,freq=1.0), product of:
              0.25289544 = queryWeight, product of:
                2.5143888 = boost
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.021205755 = queryNorm
              0.37054837 = fieldWeight in 650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.078125 = fieldNorm(doc=650)
        0.24 = coord(6/25)
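The top-level score of a result is the sum of its per-term scores multiplied by the coordination factor `coord(matched/total)`. The sketch below recomputes the score for document 650 from the six term scores listed above. The helper name `document_score` is an assumption made for illustration, and small differences from the listed value come from single-precision rounding in the original output.

```python
def document_score(term_scores, query_term_count):
    """Combine per-term scores as shown in the explain tree.

    coord = (number of matching query terms) / (number of query terms);
    the final score is sum(term_scores) * coord.
    """
    coord = len(term_scores) / query_term_count
    return sum(term_scores) * coord


# Per-term scores copied from the listing for doc 650.
doc_650_terms = [
    0.033347525,  # wird
    0.106264785,  # befragung
    0.11190731,   # rechtliche
    0.05004082,   # neue
    0.08502785,   # methoden
    0.09371,      # bibliotheken
]

# The query has 25 terms, 6 of which match: coord(6/25) = 0.24.
doc_650_score = document_score(doc_650_terms, query_term_count=25)
print(doc_650_score)  # close to the listed 0.11527158
```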
    
  3. Schoepe, B.: ¬Die Digitalisierung und ihre Effizienzgewinne aus pädagogischer Perspektive : Datenschutzrechtliche Probleme der Coronakrisen-induzierten "Digitalisierungsoffensive". Mit einer kritischen Bewertung zur Einführung der Microsoft 365 / Microsoft Teams-Software an der Sekundarstufe I und II der Fritz-Schumacher-Schule (FSS), Hamburg (2021) 0.11
    0.11385639 = sum of:
      0.11385639 = product of:
        0.35580122 = sum of:
          0.031193752 = weight(abstract_txt:wird in 1116) [ClassicSimilarity], result of:
            0.031193752 = score(doc=1116,freq=7.0), product of:
              0.08000298 = queryWeight, product of:
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.021205755 = queryNorm
              0.3899074 = fieldWeight in 1116, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.060640212 = weight(abstract_txt:dargelegt in 1116) [ClassicSimilarity], result of:
            0.060640212 = score(doc=1116,freq=1.0), product of:
              0.18920134 = queryWeight, product of:
                1.0874118 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.021205755 = queryNorm
              0.32050624 = fieldWeight in 1116, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.02502041 = weight(abstract_txt:neue in 1116) [ClassicSimilarity], result of:
            0.02502041 = score(doc=1116,freq=1.0), product of:
              0.13211639 = queryWeight, product of:
                1.2850655 = boost
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.021205755 = queryNorm
              0.18938157 = fieldWeight in 1116, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.042513926 = weight(abstract_txt:methoden in 1116) [ClassicSimilarity], result of:
            0.042513926 = score(doc=1116,freq=1.0), product of:
              0.1881256 = queryWeight, product of:
                1.5334544 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.021205755 = queryNorm
              0.22598693 = fieldWeight in 1116, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.048592154 = weight(abstract_txt:text in 1116) [ClassicSimilarity], result of:
            0.048592154 = score(doc=1116,freq=5.0), product of:
              0.13767178 = queryWeight, product of:
                1.6066269 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.021205755 = queryNorm
              0.35295653 = fieldWeight in 1116, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.07343243 = weight(abstract_txt:kontext in 1116) [ClassicSimilarity], result of:
            0.07343243 = score(doc=1116,freq=2.0), product of:
              0.21495242 = queryWeight, product of:
                1.6391478 = boost
                6.184015 = idf(docFreq=248, maxDocs=44421)
                0.021205755 = queryNorm
              0.3416218 = fieldWeight in 1116, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.184015 = idf(docFreq=248, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.058189772 = weight(abstract_txt:herausforderungen in 1116) [ClassicSimilarity], result of:
            0.058189772 = score(doc=1116,freq=1.0), product of:
              0.23191287 = queryWeight, product of:
                1.7025871 = boost
                6.4233527 = idf(docFreq=195, maxDocs=44421)
                0.021205755 = queryNorm
              0.25091222 = fieldWeight in 1116, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4233527 = idf(docFreq=195, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
          0.016218558 = weight(abstract_txt:data in 1116) [ClassicSimilarity], result of:
            0.016218558 = score(doc=1116,freq=1.0), product of:
              0.124674775 = queryWeight, product of:
                1.765433 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.021205755 = queryNorm
              0.13008693 = fieldWeight in 1116, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0390625 = fieldNorm(doc=1116)
        0.32 = coord(8/25)
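The `tf` and `idf` factors in the tree above can be recomputed from the listed `freq`, `docFreq`, and `maxDocs` values. The listed numbers appear consistent with `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`. The sketch below checks this against a few entries. The function names are hypothetical, and the idf formula is inferred from the numbers in this listing rather than taken from a specific Lucene version, so treat it as an assumption.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency, as inferred from the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Compare against factors listed for doc 1116 (maxDocs=44421 throughout).
print(classic_tf(7.0))           # listed: 2.6457512 for freq=7.0
print(classic_tf(5.0))           # listed: 2.236068 for freq=5.0
print(classic_idf(2775, 44421))  # listed: 3.7727013 for docFreq=2775
print(classic_idf(4320, 44421))  # listed: 3.3302255 for docFreq=4320
```

Rare terms receive a higher idf: `dargelegt` (docFreq=32) is weighted far more heavily than `data` (docFreq=4320), which is why a single match on a rare term can outweigh several matches on common ones.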
    
  4. Junger, U.; Schwens, U.: ¬Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.11
    0.10533282 = sum of:
      0.10533282 = product of:
        0.43888676 = sum of:
          0.04003266 = weight(abstract_txt:neue in 4780) [ClassicSimilarity], result of:
            0.04003266 = score(doc=4780,freq=1.0), product of:
              0.13211639 = queryWeight, product of:
                1.2850655 = boost
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.021205755 = queryNorm
              0.30301052 = fieldWeight in 4780, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8481684 = idf(docFreq=946, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
          0.084390014 = weight(abstract_txt:möglichkeiten in 4780) [ClassicSimilarity], result of:
            0.084390014 = score(doc=4780,freq=2.0), product of:
              0.17239757 = queryWeight, product of:
                1.4679542 = boost
                5.5381527 = idf(docFreq=474, maxDocs=44421)
                0.021205755 = queryNorm
              0.48950815 = fieldWeight in 4780, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5381527 = idf(docFreq=474, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
          0.034769714 = weight(abstract_txt:text in 4780) [ClassicSimilarity], result of:
            0.034769714 = score(doc=4780,freq=1.0), product of:
              0.13767178 = queryWeight, product of:
                1.6066269 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.021205755 = queryNorm
              0.25255513 = fieldWeight in 4780, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
          0.025949694 = weight(abstract_txt:data in 4780) [ClassicSimilarity], result of:
            0.025949694 = score(doc=4780,freq=1.0), product of:
              0.124674775 = queryWeight, product of:
                1.765433 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.021205755 = queryNorm
              0.20813909 = fieldWeight in 4780, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
          0.123896345 = weight(abstract_txt:mining in 4780) [ClassicSimilarity], result of:
            0.123896345 = score(doc=4780,freq=1.0), product of:
              0.321181 = queryWeight, product of:
                2.45396 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.021205755 = queryNorm
              0.3857524 = fieldWeight in 4780, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
          0.12984838 = weight(abstract_txt:bibliotheken in 4780) [ClassicSimilarity], result of:
            0.12984838 = score(doc=4780,freq=3.0), product of:
              0.25289544 = queryWeight, product of:
                2.5143888 = boost
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.021205755 = queryNorm
              0.51344687 = fieldWeight in 4780, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.743019 = idf(docFreq=1051, maxDocs=44421)
                0.0625 = fieldNorm(doc=4780)
        0.24 = coord(6/25)
    
  5. Mache, B.; Klaffki, L.: ¬Das DARIAH-DE Repository : Elementarer Teil einer modularen Infrastruktur für geistes- und kulturwissenschaftliche Forschungsdaten (2018) 0.10
    0.10316915 = sum of:
      0.10316915 = product of:
        0.6448072 = sum of:
          0.234375 = weight(abstract_txt:forschungsdaten in 485) [ClassicSimilarity], result of:
            0.234375 = score(doc=485,freq=3.0), product of:
              0.16263431 = queryWeight, product of:
                1.0081799 = boost
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.021205755 = queryNorm
              1.4411166 = fieldWeight in 485, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.607123 = idf(docFreq=59, maxDocs=44421)
                0.109375 = fieldNorm(doc=485)
          0.15963131 = weight(abstract_txt:publiziert in 485) [ClassicSimilarity], result of:
            0.15963131 = score(doc=485,freq=1.0), product of:
              0.18157545 = queryWeight, product of:
                1.065272 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.021205755 = queryNorm
              0.8791459 = fieldWeight in 485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.109375 = fieldNorm(doc=485)
          0.11903899 = weight(abstract_txt:methoden in 485) [ClassicSimilarity], result of:
            0.11903899 = score(doc=485,freq=1.0), product of:
              0.1881256 = queryWeight, product of:
                1.5334544 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.021205755 = queryNorm
              0.6327634 = fieldWeight in 485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.109375 = fieldNorm(doc=485)
          0.13176192 = weight(abstract_txt:wissenschaftliche in 485) [ClassicSimilarity], result of:
            0.13176192 = score(doc=485,freq=1.0), product of:
              0.20130213 = queryWeight, product of:
                1.586248 = boost
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.021205755 = queryNorm
              0.6545481 = fieldWeight in 485, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.98444 = idf(docFreq=303, maxDocs=44421)
                0.109375 = fieldNorm(doc=485)
        0.16 = coord(4/25)