Document (#34435)

Title
Semantische Suche über 500 Millionen Web-Dokumente [Semantic search across 500 million web documents]
Issue
[18.06.2009].
Source
http://www.heise.de/newsticker/Semantische-Suche-ueber-500-Millionen-Web-Dokumente--/meldung/140630
Year
2009
Content
"Researchers at the University of Washington have written a new search engine that can gather relationships and facts from more than 500 million individual web pages. The tool extracts information from billions of lines of text by analyzing the basic linguistic relationships between words. Experts believe that such systems for automatic information extraction will one day form the basis of search engines considerably smarter than those available today. To this end, the most important pieces of data are first assessed internally by an algorithm and then intelligently combined, Technology Review reports in its online edition. The US researchers' project is a substantial expansion of a technique called TextRunner that was previously developed at the same university. Both the number of pages that can be analyzed and the range of subject areas have been greatly extended. 'TextRunner is so significant because it scales without a human having to intervene,' says Peter Norvig, director of research at Google. The internet company donated to the project the huge database of individual web pages that TextRunner analyzes. 'The system can recognize and learn millions of relationships, and not just each one individually. The software needs no supervisor; it determines the information on its own.'"
Footnote
Cf.: http://www.cs.washington.edu/research/textrunner/; http://www.heise.de/tr/artikel/140629.
Theme
Search engines
Object
TextRunner

Similar documents (content)

  1. Hauer, M.: Neue OPACs braucht das Land ... dandelon.com (2006) 0.83
    0.8293524 = sum of:
      0.8293524 = product of:
        1.0366905 = sum of:
          0.23729838 = weight(abstract_txt:suche in 6047) [ClassicSimilarity], result of:
            0.23729838 = score(doc=6047,freq=2.0), product of:
              0.38284075 = queryWeight, product of:
                1.4150807 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.048224237 = queryNorm
              0.61983573 = fieldWeight in 6047, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.078125 = fieldNorm(doc=6047)
          0.23250721 = weight(abstract_txt:dokumente in 6047) [ClassicSimilarity], result of:
            0.23250721 = score(doc=6047,freq=1.0), product of:
              0.47583452 = queryWeight, product of:
                1.5776116 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.048224237 = queryNorm
              0.4886304 = fieldWeight in 6047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.078125 = fieldNorm(doc=6047)
          0.26212484 = weight(abstract_txt:millionen in 6047) [ClassicSimilarity], result of:
            0.26212484 = score(doc=6047,freq=1.0), product of:
              0.51543087 = queryWeight, product of:
                1.6419401 = boost
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.048224237 = queryNorm
              0.5085548 = fieldWeight in 6047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.078125 = fieldNorm(doc=6047)
          0.3047601 = weight(abstract_txt:semantische in 6047) [ClassicSimilarity], result of:
            0.3047601 = score(doc=6047,freq=1.0), product of:
              0.5699066 = queryWeight, product of:
                1.7265294 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.048224237 = queryNorm
              0.53475446 = fieldWeight in 6047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.078125 = fieldNorm(doc=6047)
        0.8 = coord(4/5)
    
  2. Wolf, S.: Neuer Meilenstein für BASE : 90 Millionen Dokumente (2016) 0.71
    0.71291614 = sum of:
      0.71291614 = product of:
        0.89114517 = sum of:
          0.13241014 = weight(abstract_txt:über in 2872) [ClassicSimilarity], result of:
            0.13241014 = score(doc=2872,freq=5.0), product of:
              0.19118586 = queryWeight, product of:
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.048224237 = queryNorm
              0.69257283 = fieldWeight in 2872, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.078125 = fieldNorm(doc=2872)
          0.1677953 = weight(abstract_txt:suche in 2872) [ClassicSimilarity], result of:
            0.1677953 = score(doc=2872,freq=1.0), product of:
              0.38284075 = queryWeight, product of:
                1.4150807 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.048224237 = queryNorm
              0.43829006 = fieldWeight in 2872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.078125 = fieldNorm(doc=2872)
          0.32881486 = weight(abstract_txt:dokumente in 2872) [ClassicSimilarity], result of:
            0.32881486 = score(doc=2872,freq=2.0), product of:
              0.47583452 = queryWeight, product of:
                1.5776116 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.048224237 = queryNorm
              0.69102776 = fieldWeight in 2872, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.078125 = fieldNorm(doc=2872)
          0.26212484 = weight(abstract_txt:millionen in 2872) [ClassicSimilarity], result of:
            0.26212484 = score(doc=2872,freq=1.0), product of:
              0.51543087 = queryWeight, product of:
                1.6419401 = boost
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.048224237 = queryNorm
              0.5085548 = fieldWeight in 2872, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.078125 = fieldNorm(doc=2872)
        0.8 = coord(4/5)
    
  3. Otto, A.: Ordnungssysteme als Wissensbasis für die Suche in textbasierten Datenbeständen : dargestellt am Beispiel einer soziologischen Bibliographie (1998) 0.69
    0.688822 = sum of:
      0.688822 = product of:
        0.8610274 = sum of:
          0.04145093 = weight(abstract_txt:über in 6625) [ClassicSimilarity], result of:
            0.04145093 = score(doc=6625,freq=1.0), product of:
              0.19118586 = queryWeight, product of:
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.048224237 = queryNorm
              0.21680959 = fieldWeight in 6625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.287709 = weight(abstract_txt:suche in 6625) [ClassicSimilarity], result of:
            0.287709 = score(doc=6625,freq=6.0), product of:
              0.38284075 = queryWeight, product of:
                1.4150807 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.048224237 = queryNorm
              0.7515109 = fieldWeight in 6625, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.23017041 = weight(abstract_txt:dokumente in 6625) [ClassicSimilarity], result of:
            0.23017041 = score(doc=6625,freq=2.0), product of:
              0.47583452 = queryWeight, product of:
                1.5776116 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.048224237 = queryNorm
              0.48371947 = fieldWeight in 6625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.30169708 = weight(abstract_txt:semantische in 6625) [ClassicSimilarity], result of:
            0.30169708 = score(doc=6625,freq=2.0), product of:
              0.5699066 = queryWeight, product of:
                1.7265294 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.048224237 = queryNorm
              0.52937984 = fieldWeight in 6625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
        0.8 = coord(4/5)
    
  4. Deutsche Patentdatenbank mit maschinellen Abstract-Übersetzungen (2005) 0.63
    0.6288094 = sum of:
      0.6288094 = product of:
        1.0480156 = sum of:
          0.11724093 = weight(abstract_txt:über in 3344) [ClassicSimilarity], result of:
            0.11724093 = score(doc=3344,freq=2.0), product of:
              0.19118586 = queryWeight, product of:
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.048224237 = queryNorm
              0.6132301 = fieldWeight in 3344, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.109375 = fieldNorm(doc=3344)
          0.5638 = weight(abstract_txt:dokumente in 3344) [ClassicSimilarity], result of:
            0.5638 = score(doc=3344,freq=3.0), product of:
              0.47583452 = queryWeight, product of:
                1.5776116 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.048224237 = queryNorm
              1.1848657 = fieldWeight in 3344, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.109375 = fieldNorm(doc=3344)
          0.36697477 = weight(abstract_txt:millionen in 3344) [ClassicSimilarity], result of:
            0.36697477 = score(doc=3344,freq=1.0), product of:
              0.51543087 = queryWeight, product of:
                1.6419401 = boost
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.048224237 = queryNorm
              0.7119767 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5095015 = idf(docFreq=178, maxDocs=44218)
                0.109375 = fieldNorm(doc=3344)
        0.6 = coord(3/5)
    
  5. Spree, U.; Feißt, N.; Lühr, A.; Piesztal, B.; Schroeder, N.; Wollschläger, P.: Semantic search : State-of-the-Art-Überblick zu semantischen Suchlösungen im WWW (2011) 0.61
    0.61110824 = sum of:
      0.61110824 = product of:
        1.0185137 = sum of:
          0.08290186 = weight(abstract_txt:über in 345) [ClassicSimilarity], result of:
            0.08290186 = score(doc=345,freq=1.0), product of:
              0.19118586 = queryWeight, product of:
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.048224237 = queryNorm
              0.43361917 = fieldWeight in 345, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.964518 = idf(docFreq=2280, maxDocs=44218)
                0.109375 = fieldNorm(doc=345)
          0.33221772 = weight(abstract_txt:suche in 345) [ClassicSimilarity], result of:
            0.33221772 = score(doc=345,freq=2.0), product of:
              0.38284075 = queryWeight, product of:
                1.4150807 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.048224237 = queryNorm
              0.86777 = fieldWeight in 345, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.109375 = fieldNorm(doc=345)
          0.60339415 = weight(abstract_txt:semantische in 345) [ClassicSimilarity], result of:
            0.60339415 = score(doc=345,freq=2.0), product of:
              0.5699066 = queryWeight, product of:
                1.7265294 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.048224237 = queryNorm
              1.0587597 = fieldWeight in 345, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.109375 = fieldNorm(doc=345)
        0.6 = coord(3/5)
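
The score breakdowns above follow the classic Lucene TF-IDF pattern: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. The sketch below recomputes the score of the first similar document (doc=6047) from the numbers in the listing. The tf = sqrt(freq) relation is visible in the listing itself; the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption that happens to reproduce the listed idf values, and all function and variable names are illustrative only.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    # tf is the square root of the term frequency, as shown in the listing.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm   # "queryWeight" line
    field_weight = tf * term_idf * field_norm      # "fieldWeight" line
    return query_weight * field_weight

# Values copied from the breakdown of result 1 (doc=6047).
MAX_DOCS = 44218
QUERY_NORM = 0.048224237
FIELD_NORM = 0.078125

# (termFreq, docFreq, boost) for each matching query term.
terms = {
    "suche":       (2.0, 439, 1.4150807),
    "dokumente":   (1.0, 230, 1.5776116),
    "millionen":   (1.0, 178, 1.6419401),
    "semantische": (1.0, 127, 1.7265294),
}

total = sum(
    term_score(freq, doc_freq, MAX_DOCS, FIELD_NORM, QUERY_NORM, boost)
    for freq, doc_freq, boost in terms.values()
)

# coord(4/5): four of the five query terms matched this document.
score = total * (4 / 5)
print(round(score, 2))  # 0.83
```

The same function applies to the other entries once their own termFreq, fieldNorm, and coord values are substituted; terms listed without a boost line (such as "über") use the default boost of 1.0.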