Document (#43878)

Wiegmann, S.
Hättest du die Titanic überlebt? : Eine kurze Einführung in das Data Mining mit freier Software
API Magazin. 4(2023), no. 1
On April 10, 1912, Elisabeth Walton Allen boarded the "Titanic" to bring her belongings to England. One night she was woken by her distraught aunt, whose cabin was under water. What were Elisabeth's chances, and would you yourself have survived the disaster? The Titanic-Orakel is an algorithm-based app that produces such predictions; it was developed in the "Data Science" course at the Department Information of HAW Hamburg. This article shows step by step how the app was built using free software. Code and data are provided for reuse.
Data Mining

Similar documents (content)

  1. Lüttcher, B.; Zendel, O.: Eine kurze Geschichte Freier Software : Interview mit Oliver Zendel (2005) 0.17
    0.1712483 = sum of:
      0.1712483 = product of:
        0.61160105 = sum of:
          0.013010406 = weight(abstract_txt:eine in 4503) [ClassicSimilarity], result of:
            0.013010406 = score(doc=4503,freq=1.0), product of:
              0.05966538 = queryWeight, product of:
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.01710149 = queryNorm
              0.2180562 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.052768152 = weight(abstract_txt:entstanden in 4503) [ClassicSimilarity], result of:
            0.052768152 = score(doc=4503,freq=1.0), product of:
              0.1204388 = queryWeight, product of:
                1.004632 = boost
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.01710149 = queryNorm
              0.4381325 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.06341385 = weight(abstract_txt:damals in 4503) [ClassicSimilarity], result of:
            0.06341385 = score(doc=4503,freq=1.0), product of:
              0.1361365 = queryWeight, product of:
                1.068098 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01710149 = queryNorm
              0.46581078 = fieldWeight in 4503, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.06211196 = weight(abstract_txt:software in 4503) [ClassicSimilarity], result of:
            0.06211196 = score(doc=4503,freq=6.0), product of:
              0.0930954 = queryWeight, product of:
                1.2491164 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01710149 = queryNorm
              0.66718614 = fieldWeight in 4503, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.05765992 = weight(abstract_txt:wurde in 4503) [ClassicSimilarity], result of:
            0.05765992 = score(doc=4503,freq=3.0), product of:
              0.11161883 = queryWeight, product of:
                1.3677526 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.01710149 = queryNorm
              0.5165788 = fieldWeight in 4503, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.047488555 = weight(abstract_txt:unter in 4503) [ClassicSimilarity], result of:
            0.047488555 = score(doc=4503,freq=2.0), product of:
              0.11226503 = queryWeight, product of:
                1.371706 = boost
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.01710149 = queryNorm
              0.423004 = fieldWeight in 4503, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
          0.3151482 = weight(abstract_txt:freier in 4503) [ClassicSimilarity], result of:
            0.3151482 = score(doc=4503,freq=2.0), product of:
              0.39645606 = queryWeight, product of:
                2.5777235 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01710149 = queryNorm
              0.7949133 = fieldWeight in 4503, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=4503)
        0.28 = coord(7/25)
  2. Lützenkirchen, F.: Multimediale Dokumentenserver als E-Learning Content Repository (2006) 0.10
    0.09992676 = sum of:
      0.09992676 = product of:
        0.4163615 = sum of:
          0.018399492 = weight(abstract_txt:eine in 50) [ClassicSimilarity], result of:
            0.018399492 = score(doc=50,freq=2.0), product of:
              0.05966538 = queryWeight, product of:
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.01710149 = queryNorm
              0.30837804 = fieldWeight in 50, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
          0.052768152 = weight(abstract_txt:entstanden in 50) [ClassicSimilarity], result of:
            0.052768152 = score(doc=50,freq=1.0), product of:
              0.1204388 = queryWeight, product of:
                1.004632 = boost
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.01710149 = queryNorm
              0.4381325 = fieldWeight in 50, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
          0.06341385 = weight(abstract_txt:hamburg in 50) [ClassicSimilarity], result of:
            0.06341385 = score(doc=50,freq=1.0), product of:
              0.1361365 = queryWeight, product of:
                1.068098 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01710149 = queryNorm
              0.46581078 = fieldWeight in 50, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
          0.0253571 = weight(abstract_txt:software in 50) [ClassicSimilarity], result of:
            0.0253571 = score(doc=50,freq=1.0), product of:
              0.0930954 = queryWeight, product of:
                1.2491164 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01710149 = queryNorm
              0.27237758 = fieldWeight in 50, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
          0.03357948 = weight(abstract_txt:unter in 50) [ClassicSimilarity], result of:
            0.03357948 = score(doc=50,freq=1.0), product of:
              0.11226503 = queryWeight, product of:
                1.371706 = boost
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.01710149 = queryNorm
              0.299109 = fieldWeight in 50, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
          0.22284344 = weight(abstract_txt:freier in 50) [ClassicSimilarity], result of:
            0.22284344 = score(doc=50,freq=1.0), product of:
              0.39645606 = queryWeight, product of:
                2.5777235 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01710149 = queryNorm
              0.5620886 = fieldWeight in 50, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0625 = fieldNorm(doc=50)
        0.24 = coord(6/25)
  3. Lischka, K.: 128 Zeichen für die Welt : Vor 40 Jahren schrieben Fachleute das Alphabet des Computers - und schufen damit den ASCII-Standard (2003) 0.06
    0.063528165 = sum of:
      0.063528165 = product of:
        0.26470068 = sum of:
          0.017211149 = weight(abstract_txt:eine in 1391) [ClassicSimilarity], result of:
            0.017211149 = score(doc=1391,freq=7.0), product of:
              0.05966538 = queryWeight, product of:
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.01710149 = queryNorm
              0.28846124 = fieldWeight in 1391, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.026384076 = weight(abstract_txt:entstanden in 1391) [ClassicSimilarity], result of:
            0.026384076 = score(doc=1391,freq=1.0), product of:
              0.1204388 = queryWeight, product of:
                1.004632 = boost
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.01710149 = queryNorm
              0.21906625 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.01012 = idf(docFreq=108, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.021959892 = weight(abstract_txt:software in 1391) [ClassicSimilarity], result of:
            0.021959892 = score(doc=1391,freq=3.0), product of:
              0.0930954 = queryWeight, product of:
                1.2491164 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01710149 = queryNorm
              0.2358859 = fieldWeight in 1391, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.033289973 = weight(abstract_txt:wurde in 1391) [ClassicSimilarity], result of:
            0.033289973 = score(doc=1391,freq=4.0), product of:
              0.11161883 = queryWeight, product of:
                1.3677526 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.01710149 = queryNorm
              0.29824692 = fieldWeight in 1391, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.01678974 = weight(abstract_txt:unter in 1391) [ClassicSimilarity], result of:
            0.01678974 = score(doc=1391,freq=1.0), product of:
              0.11226503 = queryWeight, product of:
                1.371706 = boost
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.01710149 = queryNorm
              0.1495545 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
          0.14906584 = weight(abstract_txt:überlebt in 1391) [ClassicSimilarity], result of:
            0.14906584 = score(doc=1391,freq=1.0), product of:
              0.48135695 = queryWeight, product of:
                2.8403537 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.01710149 = queryNorm
              0.30967838 = fieldWeight in 1391, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.03125 = fieldNorm(doc=1391)
        0.24 = coord(6/25)
  4. Stoyan, H.: Information in der Informatik (2004) 0.06
    0.06264109 = sum of:
      0.06264109 = product of:
        0.26100457 = sum of:
          0.016901014 = weight(abstract_txt:eine in 3959) [ClassicSimilarity], result of:
            0.016901014 = score(doc=3959,freq=3.0), product of:
              0.05966538 = queryWeight, product of:
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.01710149 = queryNorm
              0.28326333 = fieldWeight in 3959, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
          0.042712368 = weight(abstract_txt:entsprechende in 3959) [ClassicSimilarity], result of:
            0.042712368 = score(doc=3959,freq=1.0), product of:
              0.12672047 = queryWeight, product of:
                1.0304981 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.01710149 = queryNorm
              0.33705974 = fieldWeight in 3959, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
          0.047560386 = weight(abstract_txt:damals in 3959) [ClassicSimilarity], result of:
            0.047560386 = score(doc=3959,freq=1.0), product of:
              0.1361365 = queryWeight, product of:
                1.068098 = boost
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.01710149 = queryNorm
              0.34935808 = fieldWeight in 3959, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4529724 = idf(docFreq=69, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
          0.047836374 = weight(abstract_txt:ging in 3959) [ClassicSimilarity], result of:
            0.047836374 = score(doc=3959,freq=1.0), product of:
              0.13666265 = queryWeight, product of:
                1.07016 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.01710149 = queryNorm
              0.35003254 = fieldWeight in 3959, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
          0.05605945 = weight(abstract_txt:england in 3959) [ClassicSimilarity], result of:
            0.05605945 = score(doc=3959,freq=1.0), product of:
              0.15190668 = queryWeight, product of:
                1.1282679 = boost
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.01710149 = queryNorm
              0.36903873 = fieldWeight in 3959, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.872826 = idf(docFreq=45, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
          0.04993496 = weight(abstract_txt:wurde in 3959) [ClassicSimilarity], result of:
            0.04993496 = score(doc=3959,freq=4.0), product of:
              0.11161883 = queryWeight, product of:
                1.3677526 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.01710149 = queryNorm
              0.44737038 = fieldWeight in 3959, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.046875 = fieldNorm(doc=3959)
        0.24 = coord(6/25)
  5. Wiesenmüller, H.: Zehn Jahre 'Functional Requirements for Bibliographic Records' (FRBR) : Vision, Theorie und praktische Anwendung (2008) 0.06
    0.060477417 = sum of:
      0.060477417 = product of:
        0.25198925 = sum of:
          0.021819113 = weight(abstract_txt:eine in 3616) [ClassicSimilarity], result of:
            0.021819113 = score(doc=3616,freq=5.0), product of:
              0.05966538 = queryWeight, product of:
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.01710149 = queryNorm
              0.36569136 = fieldWeight in 3616, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
          0.03973241 = weight(abstract_txt:chancen in 3616) [ClassicSimilarity], result of:
            0.03973241 = score(doc=3616,freq=1.0), product of:
              0.120755695 = queryWeight, product of:
                1.0059528 = boost
                7.019336 = idf(docFreq=107, maxDocs=44421)
                0.01710149 = queryNorm
              0.32903138 = fieldWeight in 3616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.019336 = idf(docFreq=107, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
          0.047836374 = weight(abstract_txt:ging in 3616) [ClassicSimilarity], result of:
            0.047836374 = score(doc=3616,freq=1.0), product of:
              0.13666265 = queryWeight, product of:
                1.07016 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.01710149 = queryNorm
              0.35003254 = fieldWeight in 3616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
          0.035309345 = weight(abstract_txt:wurde in 3616) [ClassicSimilarity], result of:
            0.035309345 = score(doc=3616,freq=2.0), product of:
              0.11161883 = queryWeight, product of:
                1.3677526 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.01710149 = queryNorm
              0.3163386 = fieldWeight in 3616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
          0.025184613 = weight(abstract_txt:unter in 3616) [ClassicSimilarity], result of:
            0.025184613 = score(doc=3616,freq=1.0), product of:
              0.11226503 = queryWeight, product of:
                1.371706 = boost
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.01710149 = queryNorm
              0.22433177 = fieldWeight in 3616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.785744 = idf(docFreq=1007, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
          0.08210739 = weight(abstract_txt:schritt in 3616) [ClassicSimilarity], result of:
            0.08210739 = score(doc=3616,freq=1.0), product of:
              0.24683638 = queryWeight, product of:
                2.0339646 = boost
                7.0962973 = idf(docFreq=99, maxDocs=44421)
                0.01710149 = queryNorm
              0.33263892 = fieldWeight in 3616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0962973 = idf(docFreq=99, maxDocs=44421)
                0.046875 = fieldNorm(doc=3616)
        0.24 = coord(6/25)
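
The score trees above all follow the same arithmetic, which can be retraced by hand. The following is a minimal sketch, not the catalog's actual implementation. The formulas are inferred from the numbers in the listing: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a document's score is coord times the sum of its term weights. The function names and the `terms_4503` table are illustrative assumptions; the numeric inputs are copied from the first hit (doc 4503).

```python
import math

MAX_DOCS = 44421         # maxDocs, as shown in the listing
QUERY_NORM = 0.01710149  # queryNorm, as shown in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, as inferred from the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term-frequency factor: square root of the raw count."""
    return math.sqrt(freq)


def term_weight(freq, doc_freq, field_norm, boost=1.0):
    """Weight of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) copied from the explain tree of doc 4503;
# "eine" has no boost line in the listing, so 1.0 is assumed.
terms_4503 = [
    ("eine", 1.0, 3686, 1.0),
    ("entstanden", 1.0, 108, 1.004632),
    ("damals", 1.0, 69, 1.068098),
    ("software", 6.0, 1545, 1.2491164),
    ("wurde", 3.0, 1021, 1.3677526),
    ("unter", 2.0, 1007, 1.371706),
    ("freier", 2.0, 14, 2.5777235),
]

FIELD_NORM_4503 = 0.0625  # fieldNorm(doc=4503)
COORD_4503 = 7 / 25       # 7 of 25 query terms matched

weights = [
    term_weight(freq, doc_freq, FIELD_NORM_4503, boost)
    for _, freq, doc_freq, boost in terms_4503
]
score = COORD_4503 * sum(weights)

for (term, *_), weight in zip(terms_4503, weights):
    print(f"{term:12s} {weight:.6f}")
print(f"score        {score:.6f}")
```

The result should land close to the 0.1712483 listed for the first hit. Small deviations in the last digits are expected, because the listing appears to be computed in single precision while Python uses double precision. The breakdown also shows why this hit ranks first: the rare term "freier" (docFreq=14) contributes roughly half of the summed weight.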