Document (#43130)

Giesselbach, S.
Estler-Ziegler, T.
Dokumente schneller analysieren mit Künstlicher Intelligenz
E-mail to Inetbib, 06.02.2021, [from Tania Estler-Ziegler]
Artificial intelligence (AI) and natural language understanding (NLU) are changing many aspects of our everyday lives and the way we work. NLU gained particular prominence through voice assistants such as Siri, Alexa and Google Now. NLU offers companies and institutions the potential to make processes more efficient and to derive added value from textual content. NLU solutions are thus able to index the content of complex, unstructured documents. For semantic text analysis, the NLU team at IAIS has developed language models that are trained with deep learning methods. The NLU suite analyses documents, extracts key data and, if required, even produces a structured summary. These results, as well as the content of the documents themselves, can be used to compare documents or to find texts containing similar information. AI-based language models are clearly superior to classical keyword indexing, because they not only find texts with predefined keywords but also search intelligently for terms that occur in a similar context or are used as synonyms. The talk situates the terms "artificial intelligence" and "natural language understanding" and outlines possibilities, limitations, current research directions and methods. Practical examples then demonstrate how NLU can be used for the automated processing of receipts and vouchers, for cataloguing large data collections such as news and patents, and for the automated thematic grouping of social media posts and publications.
Talk given at the Berliner Arbeitskreis Information (BAK) on 25.02.2021.
Automatic indexing

Similar documents (author)

  1. Ziegler, R.A.; Ziegler, R.S.: ¬The National Film Registry : a videography (1995) 6.36
    6.3593063 = sum of:
      6.3593063 = weight(author_txt:ziegler in 3444) [ClassicSimilarity], result of:
        6.3593063 = fieldWeight in 3444, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.993418 = idf(docFreq=14, maxDocs=44421)
          0.5 = fieldNorm(doc=3444)
  2. Ziegler, J.: ¬Der Auskunftsbibliothekar : ein Zauberlehrling? (1991) 5.62
    5.620886 = sum of:
      5.620886 = weight(author_txt:ziegler in 4324) [ClassicSimilarity], result of:
        5.620886 = fieldWeight in 4324, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.993418 = idf(docFreq=14, maxDocs=44421)
          0.625 = fieldNorm(doc=4324)
  3. Ziegler, B.: ESS: ein schneller Algorithmus zur Mustersuche in Zeichenfolgen (1996) 5.62
    5.620886 = sum of:
      5.620886 = weight(author_txt:ziegler in 612) [ClassicSimilarity], result of:
        5.620886 = fieldWeight in 612, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.993418 = idf(docFreq=14, maxDocs=44421)
          0.625 = fieldNorm(doc=612)
  4. Ziegler, C.: Smartes Chaos : Web 2.0 versus Semantic Web (2006) 5.62
    5.620886 = sum of:
      5.620886 = weight(author_txt:ziegler in 5868) [ClassicSimilarity], result of:
        5.620886 = fieldWeight in 5868, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.993418 = idf(docFreq=14, maxDocs=44421)
          0.625 = fieldNorm(doc=5868)
  5. Ziegler, C.: Weltendämmerung : XML und Datenbanken: Einblick in Tamino (2001) 5.62
    5.620886 = sum of:
      5.620886 = weight(author_txt:ziegler in 6802) [ClassicSimilarity], result of:
        5.620886 = fieldWeight in 6802, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.993418 = idf(docFreq=14, maxDocs=44421)
          0.625 = fieldNorm(doc=6802)
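
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the tf and idf definitions that the listed values are consistent with (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and are not part of any library API; the inputs are copied from the first two entries.

```python
import math

def tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency, matching idf(docFreq=14, maxDocs=44421) = 8.993418.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Entry 1: "ziegler" occurs twice in the author field, fieldNorm 0.5.
score_1 = field_weight(freq=2.0, doc_freq=14, max_docs=44421, field_norm=0.5)
# Entry 2: "ziegler" occurs once, fieldNorm 0.625.
score_2 = field_weight(freq=1.0, doc_freq=14, max_docs=44421, field_norm=0.625)

print(round(score_1, 4), round(score_2, 4))
```

Entry 1 ranks first despite its lower fieldNorm because the author name matches twice. Small differences in the last digits against the listing are expected, since the listing uses single-precision floats.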

Similar documents (content)

  1. Sack, H.: Hybride Künstliche Intelligenz in der automatisierten Inhaltserschließung (2021) 0.14
    0.13786009 = sum of:
      0.13786009 = product of:
        0.86162555 = sum of:
          0.09627302 = weight(abstract_txt:verschlagwortung in 1373) [ClassicSimilarity], result of:
            0.09627302 = score(doc=1373,freq=1.0), product of:
              0.1445776 = queryWeight, product of:
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.016962405 = queryNorm
              0.6658917 = fieldWeight in 1373, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.523414 = idf(docFreq=23, maxDocs=44421)
                0.078125 = fieldNorm(doc=1373)
          0.26470155 = weight(abstract_txt:automatisierten in 1373) [ClassicSimilarity], result of:
            0.26470155 = score(doc=1373,freq=2.0), product of:
              0.28374982 = queryWeight, product of:
                1.9812181 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.016962405 = queryNorm
              0.93286943 = fieldWeight in 1373, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.078125 = fieldNorm(doc=1373)
          0.17117769 = weight(abstract_txt:intelligenz in 1373) [ClassicSimilarity], result of:
            0.17117769 = score(doc=1373,freq=2.0), product of:
              0.2428994 = queryWeight, product of:
                2.2450361 = boost
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.016962405 = queryNorm
              0.70472664 = fieldWeight in 1373, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.078125 = fieldNorm(doc=1373)
          0.32947335 = weight(abstract_txt:dokumente in 1373) [ClassicSimilarity], result of:
            0.32947335 = score(doc=1373,freq=3.0), product of:
              0.38927907 = queryWeight, product of:
                3.6691463 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.016962405 = queryNorm
              0.846368 = fieldWeight in 1373, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.078125 = fieldNorm(doc=1373)
        0.16 = coord(4/25)
  2. Nohr, H.: Theorie des Information Retrieval II : Automatische Indexierung (2004) 0.08
    0.07783988 = sum of:
      0.07783988 = product of:
        0.48649925 = sum of:
          0.09486283 = weight(abstract_txt:unstrukturierte in 1008) [ClassicSimilarity], result of:
            0.09486283 = score(doc=1008,freq=1.0), product of:
              0.16612513 = queryWeight, product of:
                1.0719318 = boost
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.016962405 = queryNorm
              0.5710324 = fieldWeight in 1008, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.0625 = fieldNorm(doc=1008)
          0.100162044 = weight(abstract_txt:textanalyse in 1008) [ClassicSimilarity], result of:
            0.100162044 = score(doc=1008,freq=1.0), product of:
              0.17225562 = queryWeight, product of:
                1.0915313 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.016962405 = queryNorm
              0.5814733 = fieldWeight in 1008, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=1008)
          0.027895676 = weight(abstract_txt:werden in 1008) [ClassicSimilarity], result of:
            0.027895676 = score(doc=1008,freq=3.0), product of:
              0.073461965 = queryWeight, product of:
                1.234643 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.016962405 = queryNorm
              0.3797295 = fieldWeight in 1008, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.0625 = fieldNorm(doc=1008)
          0.26357868 = weight(abstract_txt:dokumente in 1008) [ClassicSimilarity], result of:
            0.26357868 = score(doc=1008,freq=3.0), product of:
              0.38927907 = queryWeight, product of:
                3.6691463 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.016962405 = queryNorm
              0.6770944 = fieldWeight in 1008, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.0625 = fieldNorm(doc=1008)
        0.16 = coord(4/25)
  3. Fuhser, M.: Dirigent eines Chores an Fotografien : Boris Eldagsen spricht über Möglichkeiten und Probleme KI-gestützter Kunst - Mehr Interessenten als Plätze im Forum Alte Post (2024) 0.07
    0.07481219 = sum of:
      0.07481219 = product of:
        0.6234349 = sum of:
          0.08914884 = weight(abstract_txt:finden in 2086) [ClassicSimilarity], result of:
            0.08914884 = score(doc=2086,freq=1.0), product of:
              0.12650424 = queryWeight, product of:
                1.3228697 = boost
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.016962405 = queryNorm
              0.7047103 = fieldWeight in 2086, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.125 = fieldNorm(doc=2086)
          0.26040176 = weight(abstract_txt:künstliche in 2086) [ClassicSimilarity], result of:
            0.26040176 = score(doc=2086,freq=2.0), product of:
              0.20517002 = queryWeight, product of:
                1.6846956 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.016962405 = queryNorm
              1.2691998 = fieldWeight in 2086, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.125 = fieldNorm(doc=2086)
          0.2738843 = weight(abstract_txt:intelligenz in 2086) [ClassicSimilarity], result of:
            0.2738843 = score(doc=2086,freq=2.0), product of:
              0.2428994 = queryWeight, product of:
                2.2450361 = boost
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.016962405 = queryNorm
              1.1275626 = fieldWeight in 2086, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.125 = fieldNorm(doc=2086)
        0.12 = coord(3/25)
  4. Zilm, G.: "KI ist ein glorifizierter Taschenrechner" (2023) 0.07
    0.06797966 = sum of:
      0.06797966 = product of:
        0.5664972 = sum of:
          0.03221115 = weight(abstract_txt:werden in 2131) [ClassicSimilarity], result of:
            0.03221115 = score(doc=2131,freq=1.0), product of:
              0.073461965 = queryWeight, product of:
                1.234643 = boost
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.016962405 = queryNorm
              0.43847388 = fieldWeight in 2131, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.507791 = idf(docFreq=3617, maxDocs=44421)
                0.125 = fieldNorm(doc=2131)
          0.26040176 = weight(abstract_txt:künstliche in 2131) [ClassicSimilarity], result of:
            0.26040176 = score(doc=2131,freq=2.0), product of:
              0.20517002 = queryWeight, product of:
                1.6846956 = boost
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.016962405 = queryNorm
              1.2691998 = fieldWeight in 2131, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.179679 = idf(docFreq=91, maxDocs=44421)
                0.125 = fieldNorm(doc=2131)
          0.2738843 = weight(abstract_txt:intelligenz in 2131) [ClassicSimilarity], result of:
            0.2738843 = score(doc=2131,freq=2.0), product of:
              0.2428994 = queryWeight, product of:
                2.2450361 = boost
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.016962405 = queryNorm
              1.1275626 = fieldWeight in 2131, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3784575 = idf(docFreq=204, maxDocs=44421)
                0.125 = fieldNorm(doc=2131)
        0.12 = coord(3/25)
  5. Ehrmann, S.: ¬Die Nadel im Bytehaufen : Finden statt suchen: Text Retrieval, Multimediadatenbanken, Dokumentenmanagement (2000) 0.07
    0.06583997 = sum of:
      0.06583997 = product of:
        0.5486664 = sum of:
          0.08914884 = weight(abstract_txt:finden in 6317) [ClassicSimilarity], result of:
            0.08914884 = score(doc=6317,freq=1.0), product of:
              0.12650424 = queryWeight, product of:
                1.3228697 = boost
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.016962405 = queryNorm
              0.7047103 = fieldWeight in 6317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.125 = fieldNorm(doc=6317)
          0.15516314 = weight(abstract_txt:texte in 6317) [ClassicSimilarity], result of:
            0.15516314 = score(doc=6317,freq=1.0), product of:
              0.18304323 = queryWeight, product of:
                1.5912607 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.016962405 = queryNorm
              0.8476858 = fieldWeight in 6317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.125 = fieldNorm(doc=6317)
          0.30435443 = weight(abstract_txt:dokumente in 6317) [ClassicSimilarity], result of:
            0.30435443 = score(doc=6317,freq=1.0), product of:
              0.38927907 = queryWeight, product of:
                3.6691463 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.016962405 = queryNorm
              0.7818413 = fieldWeight in 6317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.125 = fieldNorm(doc=6317)
        0.12 = coord(3/25)
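
The content scores combine several matching terms. As a rough sketch of how the first result (doc 1373) arrives at 0.13786009: each term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms divided by the 25 query terms). The boosts, document frequencies, queryNorm and fieldNorm below are copied from the listing; the function and variable names are hypothetical, and the idf formula is the one the listed values are consistent with.

```python
import math

QUERY_NORM = 0.016962405   # queryNorm from the listing
MAX_DOCS = 44421           # maxDocs from the listing
FIELD_NORM = 0.078125      # fieldNorm(doc=1373)
QUERY_TERMS = 25           # denominator of coord(4/25)

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, field_norm: float, boost: float = 1.0) -> float:
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    # fieldWeight = sqrt(freq) * idf * fieldNorm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

# (term, freq in doc 1373, docFreq, boost); no boost is listed for the first term.
terms = [
    ("verschlagwortung", 1.0, 23, 1.0),
    ("automatisierten", 2.0, 25, 1.9812181),
    ("intelligenz", 2.0, 204, 2.2450361),
    ("dokumente", 3.0, 231, 3.6691463),
]

matched_sum = sum(term_score(f, df, FIELD_NORM, b) for _, f, df, b in terms)
coord = len(terms) / QUERY_TERMS
total = matched_sum * coord

print(round(matched_sum, 4), round(total, 4))
```

The coord factor explains why all five content scores are small: each result matches only three or four of the 25 query terms, so the summed term weights are scaled by 0.12 or 0.16.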