Document (#20572)

Author
Volk, M.
Mittermaier, H.
Schurig, A.
Biedassek, T.
Title
Halbautomatische Volltextanalyse, Datenbankaufbau und Document Retrieval
Source
Datenanalyse, Klassifikation und Informationsverarbeitung: Methoden und Anwendungen in verschiedenen Fachgebieten. Ed.: H. Goebl and M. Schader
Imprint
Heidelberg : Physica-Verlag
Year
1992
Pages
pp. 205-214
Abstract
In this paper we describe a system for analyzing short articles. The system works semi-automatically: it first analyzes the article and then presents the result to the user for post-editing. The information obtained in this way is stored in a database entry. The database, implemented in dBase IV, then allows efficient queries and efficient access to the original texts. The core of the paper concerns the semi-automatic analysis. We describe our method for parameterized pattern matching as well as linguistic heuristics for identifying noun phrases and prepositional phrases. The system was developed for practical use in the Bonn office of the 'Forum InformatikerInnen für Frieden und gesellschaftliche Verantwortung e.V. (FIFF)'.
Theme
Automatisches Indexieren
Computerlinguistik

Similar documents (content)

  1. Bernhardt, U.; Ruhmann, I.: Die Informationsgesellschaft ist keine Jobmaschine : Trotz der Dynamik im Medien- und Telekommunikationsmarkt werden die ökonomischen Erwartungen nicht erfüllt (1998) 0.06
    0.06475562 = sum of:
      0.06475562 = product of:
        0.5396302 = sum of:
          0.12980992 = weight(abstract_txt:gesellschaftliche in 3317) [ClassicSimilarity], result of:
            0.12980992 = score(doc=3317,freq=1.0), product of:
              0.13681135 = queryWeight, product of:
                7.590594 = idf(docFreq=60, maxDocs=44421)
                0.0180238 = queryNorm
              0.9488242 = fieldWeight in 3317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.590594 = idf(docFreq=60, maxDocs=44421)
                0.125 = fieldNorm(doc=3317)
          0.13424718 = weight(abstract_txt:verantwortung in 3317) [ClassicSimilarity], result of:
            0.13424718 = score(doc=3317,freq=1.0), product of:
              0.13991158 = queryWeight, product of:
                1.0112668 = boost
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.0180238 = queryNorm
              0.9595145 = fieldWeight in 3317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.125 = fieldNorm(doc=3317)
          0.2755731 = weight(abstract_txt:frieden in 3317) [ClassicSimilarity], result of:
            0.2755731 = score(doc=3317,freq=1.0), product of:
              0.22598247 = queryWeight, product of:
                1.2852166 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.0180238 = queryNorm
              1.2194446 = fieldWeight in 3317, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.125 = fieldNorm(doc=3317)
        0.12 = coord(3/25)
    
  2. Datenschutz-Folgenabschätzung (DSFA) für die Corona-App (2020) 0.05
    0.048566718 = sum of:
      0.048566718 = product of:
        0.40472266 = sum of:
          0.097357444 = weight(abstract_txt:gesellschaftliche in 827) [ClassicSimilarity], result of:
            0.097357444 = score(doc=827,freq=1.0), product of:
              0.13681135 = queryWeight, product of:
                7.590594 = idf(docFreq=60, maxDocs=44421)
                0.0180238 = queryNorm
              0.7116182 = fieldWeight in 827, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.590594 = idf(docFreq=60, maxDocs=44421)
                0.09375 = fieldNorm(doc=827)
          0.10068539 = weight(abstract_txt:verantwortung in 827) [ClassicSimilarity], result of:
            0.10068539 = score(doc=827,freq=1.0), product of:
              0.13991158 = queryWeight, product of:
                1.0112668 = boost
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.0180238 = queryNorm
              0.71963584 = fieldWeight in 827, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.676116 = idf(docFreq=55, maxDocs=44421)
                0.09375 = fieldNorm(doc=827)
          0.20667982 = weight(abstract_txt:frieden in 827) [ClassicSimilarity], result of:
            0.20667982 = score(doc=827,freq=1.0), product of:
              0.22598247 = queryWeight, product of:
                1.2852166 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.0180238 = queryNorm
              0.91458344 = fieldWeight in 827, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.09375 = fieldNorm(doc=827)
        0.12 = coord(3/25)
    
  3. Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.04
    0.04333726 = sum of:
      0.04333726 = product of:
        0.36114383 = sum of:
          0.058327798 = weight(abstract_txt:betrifft in 3487) [ClassicSimilarity], result of:
            0.058327798 = score(doc=3487,freq=1.0), product of:
              0.1392671 = queryWeight, product of:
                1.0089351 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0180238 = queryNorm
              0.41881964 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.050077606 = weight(abstract_txt:analyse in 3487) [ClassicSimilarity], result of:
            0.050077606 = score(doc=3487,freq=1.0), product of:
              0.15850289 = queryWeight, product of:
                1.5222028 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0180238 = queryNorm
              0.31594127 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
          0.25273842 = weight(abstract_txt:halbautomatische in 3487) [ClassicSimilarity], result of:
            0.25273842 = score(doc=3487,freq=1.0), product of:
              0.46636108 = queryWeight, product of:
                2.6110494 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0180238 = queryNorm
              0.5419372 = fieldWeight in 3487, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3487)
        0.12 = coord(3/25)
    
  4. Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.04
    0.04333726 = sum of:
      0.04333726 = product of:
        0.36114383 = sum of:
          0.058327798 = weight(abstract_txt:betrifft in 163) [ClassicSimilarity], result of:
            0.058327798 = score(doc=163,freq=1.0), product of:
              0.1392671 = queryWeight, product of:
                1.0089351 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0180238 = queryNorm
              0.41881964 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.050077606 = weight(abstract_txt:analyse in 163) [ClassicSimilarity], result of:
            0.050077606 = score(doc=163,freq=1.0), product of:
              0.15850289 = queryWeight, product of:
                1.5222028 = boost
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0180238 = queryNorm
              0.31594127 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7772117 = idf(docFreq=373, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
          0.25273842 = weight(abstract_txt:halbautomatische in 163) [ClassicSimilarity], result of:
            0.25273842 = score(doc=163,freq=1.0), product of:
              0.46636108 = queryWeight, product of:
                2.6110494 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0180238 = queryNorm
              0.5419372 = fieldWeight in 163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.0546875 = fieldNorm(doc=163)
        0.12 = coord(3/25)
    
  5. Braschoß, K.; Hansmann, S.; Hesse, T.; Joosten-Wilke, U.; Ristau, R.; Rusch, B.; Taylor, V.: Indexierung von Online-Katalogen : Ein gemeinsames Konzept der ALEPH-Anwender in Berlin (2004) 0.04
    0.036825005 = sum of:
      0.036825005 = product of:
        0.23015629 = sum of:
          0.061199374 = weight(abstract_txt:linguistische in 3875) [ClassicSimilarity], result of:
            0.061199374 = score(doc=3875,freq=1.0), product of:
              0.17996229 = queryWeight, product of:
                1.1469109 = boost
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0180238 = queryNorm
              0.34006777 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.705735 = idf(docFreq=19, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3875)
          0.06344837 = weight(abstract_txt:abgelegt in 3875) [ClassicSimilarity], result of:
            0.06344837 = score(doc=3875,freq=1.0), product of:
              0.18434462 = queryWeight, product of:
                1.1607914 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0180238 = queryNorm
              0.34418344 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3875)
          0.046100635 = weight(abstract_txt:dann in 3875) [ClassicSimilarity], result of:
            0.046100635 = score(doc=3875,freq=2.0), product of:
              0.1489892 = queryWeight, product of:
                1.475813 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.0180238 = queryNorm
              0.30942267 = fieldWeight in 3875, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3875)
          0.0594079 = weight(abstract_txt:beschreiben in 3875) [ClassicSimilarity], result of:
            0.0594079 = score(doc=3875,freq=1.0), product of:
              0.22229157 = queryWeight, product of:
                1.8026667 = boost
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.0180238 = queryNorm
              0.26725215 = fieldWeight in 3875, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.0390625 = fieldNorm(doc=3875)
        0.16 = coord(4/25)
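
The score trees above all follow the same arithmetic, which can be retraced by hand. The sketch below is only an illustration of that arithmetic, not the catalog's actual code: the function names (classic_idf, term_score, document_score) are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed because they reproduce the tf and idf values printed in the listing. The queryNorm, fieldNorm, boost, and docFreq values are copied from the entry for doc=3317.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as it appears in the 'idf(docFreq=..., maxDocs=...)' lines (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm             # the 'queryWeight' line
    field_weight = math.sqrt(freq) * idf * field_norm   # the 'fieldWeight' line
    return query_weight * field_weight


def document_score(terms, matched, query_terms, max_docs, query_norm, field_norm):
    """coord(matched/query_terms) times the sum of the per-term scores."""
    coord = matched / query_terms
    total = sum(
        term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost)
        for freq, doc_freq, boost in terms
    )
    return coord * total


# (freq, docFreq, boost) for the three terms matched in doc=3317.
DOC_3317_TERMS = [
    (1.0, 60, 1.0),        # gesellschaftliche (no boost line shown, so 1.0 is assumed)
    (1.0, 55, 1.0112668),  # verantwortung
    (1.0, 6, 1.2852166),   # frieden
]

score_3317 = document_score(
    DOC_3317_TERMS,
    matched=3,
    query_terms=25,
    max_docs=44421,
    query_norm=0.0180238,
    field_norm=0.125,
)
print(f"{score_3317:.6f}")
```

The printed value should agree with the 0.06475562 listed for the first similar document up to rounding; small differences in the last digits are expected because the listing appears to use single-precision floats. The only per-document inputs are fieldNorm and the term frequencies, which is why entries 3 and 4, sharing the same terms and fieldNorm, receive identical scores.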