Document (#34435)

Title
Semantische Suche über 500 Millionen Web-Dokumente
Issue
[18.06.2009].
Source
http://www.heise.de/newsticker/Semantische-Suche-ueber-500-Millionen-Web-Dokumente--/meldung/140630
Year
2009
Content
"Researchers at the University of Washington have written a new search engine that can gather relationships and facts from more than 500 million individual web pages. The tool extracts information from billions of lines of text by analyzing the basic linguistic relationships between words. Experts believe that such systems for automatic information extraction will one day form the basis of search engines considerably smarter than those available today. To this end, an algorithm first assesses the most important pieces of data internally and then combines them intelligently, Technology Review reports in its online edition. The US researchers' project is a significant expansion of a technique called TextRunner that was previously developed at the same university. Both the number of pages that can be analyzed and the range of subject areas have been greatly extended. 'TextRunner is so significant because it scales without a human having to intervene,' says Peter Norvig, director of research at Google. The internet company donated to the project the huge database of individual web pages that TextRunner analyzes. 'The system can recognize and learn millions of relationships, and not just each one individually. The software does not need a supervisor; it determines the information on its own.'"
Footnote
Cf.: http://www.cs.washington.edu/research/textrunner/; http://www.heise.de/tr/artikel/140629.
Theme
Search engines
Object
TextRunner

Similar documents (content)

  1. Hauer, M.: Neue OPACs braucht das Land ... dandelon.com (2006) 0.83
    0.8294797 = sum of:
      0.8294797 = product of:
        1.0368496 = sum of:
          0.23721981 = weight(abstract_txt:suche in 47) [ClassicSimilarity], result of:
            0.23721981 = score(doc=47,freq=2.0), product of:
              0.38271093 = queryWeight, product of:
                1.4154856 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.048193708 = queryNorm
              0.6198407 = fieldWeight in 47, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.078125 = fieldNorm(doc=47)
          0.23245373 = weight(abstract_txt:dokumente in 47) [ClassicSimilarity], result of:
            0.23245373 = score(doc=47,freq=1.0), product of:
              0.4757052 = queryWeight, product of:
                1.5781163 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.048193708 = queryNorm
              0.4886508 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.078125 = fieldNorm(doc=47)
          0.26191214 = weight(abstract_txt:millionen in 47) [ClassicSimilarity], result of:
            0.26191214 = score(doc=47,freq=1.0), product of:
              0.515091 = queryWeight, product of:
                1.642147 = boost
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.048193708 = queryNorm
              0.5084774 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.078125 = fieldNorm(doc=47)
          0.3052639 = weight(abstract_txt:semantische in 47) [ClassicSimilarity], result of:
            0.3052639 = score(doc=47,freq=1.0), product of:
              0.57046705 = queryWeight, product of:
                1.7281654 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.048193708 = queryNorm
              0.53511226 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.078125 = fieldNorm(doc=47)
        0.8 = coord(4/5)
    
  2. Wolf, S.: Neuer Meilenstein für BASE : 90 Millionen Dokumente (2016) 0.71
    0.712515 = sum of:
      0.712515 = product of:
        0.8906437 = sum of:
          0.13225271 = weight(abstract_txt:über in 3872) [ClassicSimilarity], result of:
            0.13225271 = score(doc=3872,freq=5.0), product of:
              0.19101168 = queryWeight, product of:
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.048193708 = queryNorm
              0.6923802 = fieldWeight in 3872, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.078125 = fieldNorm(doc=3872)
          0.16773973 = weight(abstract_txt:suche in 3872) [ClassicSimilarity], result of:
            0.16773973 = score(doc=3872,freq=1.0), product of:
              0.38271093 = queryWeight, product of:
                1.4154856 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.048193708 = queryNorm
              0.43829355 = fieldWeight in 3872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.078125 = fieldNorm(doc=3872)
          0.3287392 = weight(abstract_txt:dokumente in 3872) [ClassicSimilarity], result of:
            0.3287392 = score(doc=3872,freq=2.0), product of:
              0.4757052 = queryWeight, product of:
                1.5781163 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.048193708 = queryNorm
              0.69105655 = fieldWeight in 3872, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.078125 = fieldNorm(doc=3872)
          0.26191214 = weight(abstract_txt:millionen in 3872) [ClassicSimilarity], result of:
            0.26191214 = score(doc=3872,freq=1.0), product of:
              0.515091 = queryWeight, product of:
                1.642147 = boost
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.048193708 = queryNorm
              0.5084774 = fieldWeight in 3872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.078125 = fieldNorm(doc=3872)
        0.8 = coord(4/5)
    
  3. Otto, A.: Ordnungssysteme als Wissensbasis für die Suche in textbasierten Datenbeständen : dargestellt am Beispiel einer soziologischen Bibliographie (1998) 0.69
    0.689063 = sum of:
      0.689063 = product of:
        0.8613287 = sum of:
          0.041401643 = weight(abstract_txt:über in 625) [ClassicSimilarity], result of:
            0.041401643 = score(doc=625,freq=1.0), product of:
              0.19101168 = queryWeight, product of:
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.048193708 = queryNorm
              0.21674928 = fieldWeight in 625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.0546875 = fieldNorm(doc=625)
          0.28761375 = weight(abstract_txt:suche in 625) [ClassicSimilarity], result of:
            0.28761375 = score(doc=625,freq=6.0), product of:
              0.38271093 = queryWeight, product of:
                1.4154856 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.048193708 = queryNorm
              0.75151694 = fieldWeight in 625, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.0546875 = fieldNorm(doc=625)
          0.23011744 = weight(abstract_txt:dokumente in 625) [ClassicSimilarity], result of:
            0.23011744 = score(doc=625,freq=2.0), product of:
              0.4757052 = queryWeight, product of:
                1.5781163 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.048193708 = queryNorm
              0.48373958 = fieldWeight in 625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.0546875 = fieldNorm(doc=625)
          0.30219588 = weight(abstract_txt:semantische in 625) [ClassicSimilarity], result of:
            0.30219588 = score(doc=625,freq=2.0), product of:
              0.57046705 = queryWeight, product of:
                1.7281654 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.048193708 = queryNorm
              0.52973413 = fieldWeight in 625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0546875 = fieldNorm(doc=625)
        0.8 = coord(4/5)
    
  4. Deutsche Patentdatenbank mit maschinellen Abstract-Übersetzungen (2005) 0.63
    0.62846935 = sum of:
      0.62846935 = product of:
        1.0474489 = sum of:
          0.11710153 = weight(abstract_txt:über in 4344) [ClassicSimilarity], result of:
            0.11710153 = score(doc=4344,freq=2.0), product of:
              0.19101168 = queryWeight, product of:
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.048193708 = queryNorm
              0.6130595 = fieldWeight in 4344, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.109375 = fieldNorm(doc=4344)
          0.56367034 = weight(abstract_txt:dokumente in 4344) [ClassicSimilarity], result of:
            0.56367034 = score(doc=4344,freq=3.0), product of:
              0.4757052 = queryWeight, product of:
                1.5781163 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.048193708 = queryNorm
              1.1849152 = fieldWeight in 4344, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.109375 = fieldNorm(doc=4344)
          0.366677 = weight(abstract_txt:millionen in 4344) [ClassicSimilarity], result of:
            0.366677 = score(doc=4344,freq=1.0), product of:
              0.515091 = queryWeight, product of:
                1.642147 = boost
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.048193708 = queryNorm
              0.71186835 = fieldWeight in 4344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5085106 = idf(docFreq=179, maxDocs=44421)
                0.109375 = fieldNorm(doc=4344)
        0.6 = coord(3/5)
    
  5. Spree, U.; Feißt, N.; Lühr, A.; Piesztal, B.; Schroeder, N.; Wollschläger, P.: Semantic search : State-of-the-Art-Überblick zu semantischen Suchlösungen im WWW (2011) 0.61
    0.6115817 = sum of:
      0.6115817 = product of:
        1.0193027 = sum of:
          0.08280329 = weight(abstract_txt:über in 1345) [ClassicSimilarity], result of:
            0.08280329 = score(doc=1345,freq=1.0), product of:
              0.19101168 = queryWeight, product of:
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.048193708 = queryNorm
              0.43349856 = fieldWeight in 1345, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9634154 = idf(docFreq=2293, maxDocs=44421)
                0.109375 = fieldNorm(doc=1345)
          0.33210772 = weight(abstract_txt:suche in 1345) [ClassicSimilarity], result of:
            0.33210772 = score(doc=1345,freq=2.0), product of:
              0.38271093 = queryWeight, product of:
                1.4154856 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.048193708 = queryNorm
              0.86777693 = fieldWeight in 1345, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.109375 = fieldNorm(doc=1345)
          0.60439175 = weight(abstract_txt:semantische in 1345) [ClassicSimilarity], result of:
            0.60439175 = score(doc=1345,freq=2.0), product of:
              0.57046705 = queryWeight, product of:
                1.7281654 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.048193708 = queryNorm
              1.0594683 = fieldWeight in 1345, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.109375 = fieldNorm(doc=1345)
        0.6 = coord(3/5)
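
The score trees above follow the ClassicSimilarity (TF-IDF) pattern: each matched term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. As a minimal sketch, the following Python recomputes the score of the first hit (doc 47) from the values printed in its tree. The function and variable names are hypothetical and chosen for illustration; the constants are copied from the explain output, and tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are the formulas that reproduce the printed tf and idf values.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the explain output:
    # 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # Contribution of one matched term: queryWeight * fieldWeight
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight

# Constants copied from the explain tree of hit 1 (doc 47)
MAX_DOCS = 44421
QUERY_NORM = 0.048193708
FIELD_NORM = 0.078125

# (termFreq, docFreq, boost) for each query term matched in doc 47
matched_terms = [
    (2.0, 441, 1.4154856),  # suche
    (1.0, 231, 1.5781163),  # dokumente
    (1.0, 179, 1.642147),   # millionen
    (1.0, 127, 1.7281654),  # semantische
]

raw_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, FIELD_NORM, QUERY_NORM, boost)
    for freq, doc_freq, boost in matched_terms
)

# coord(4/5): four of the five query terms matched this document
score = raw_sum * (4 / 5)

print(round(score, 2))  # prints 0.83
```

The same function should reproduce the other hits when their own termFreq, fieldNorm, and coord values are substituted; terms listed without a boost line (such as "über") correspond to the default boost of 1.0 in this sketch.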