Document (#29334)

Author
Westermeyer, D.
Title
Adaptive Techniken zur Informationsgewinnung : der Webcrawler InfoSpiders
Imprint
Münster : Institut für Wirtschaftsinformatik der Westfälischen Wilhelms-Universität Münster
Year
2005
Pages
22 pp.
Abstract
Searching for information on the Internet usually leads the user straight to a search engine. Some of the results returned, however, do not contain what the user was looking for. This is where so-called adaptive agents come in: they try to learn their user's habits so that they can later make decisions on that basis independently, without having to consult the user. The foundations section first discusses adaptive techniques for information gathering and the basic properties of web crawlers. The main part then explains the web crawler InfoSpiders. This program works with several adaptive agents that search the Internet for information in parallel, starting from a set of seed links. The agents draw on a variety of techniques, for example statistical methods that analyze the content of web pages and neural networks that score that content. A further technique is the genetic algorithm, which lets the agents produce offspring with new mutations. A concrete implementation of the InfoSpiders algorithm is then illustrated using MySpiders. After that, the InfoSpiders algorithm and MySpiders are evaluated with respect to their added benefit over conventional search engines. A summary with an outlook on further developments in adaptive agents for Internet search concludes the paper.
Content
Seminar paper written for the seminar Suchmaschinen und Suchalgorithmen, Institut für Wirtschaftsinformatik, Praktische Informatik in der Wirtschaft, Westfälische Wilhelms-Universität Münster. - See: http://www-wi.uni-muenster.de/pi/lehre/ss05/seminarSuchen/Ausarbeitungen/DenisWestermeyer.pdf
Theme
Search engines
Object
InfoSpider

Similar documents (content)

  1. Lahme, N.: Information Retrieval im Wissensmanagement : ein am Vorwissen orientierter Ansatz zur Komposition von Informationsressourcen (2004) 0.18
    0.17702453 = sum of:
      0.17702453 = product of:
        0.5532017 = sum of:
          0.029042814 = weight(abstract_txt:nach in 5942) [ClassicSimilarity], result of:
            0.029042814 = score(doc=5942,freq=2.0), product of:
              0.0604691 = queryWeight, product of:
                1.0097308 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.013776146 = queryNorm
              0.48029184 = fieldWeight in 5942, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.030831343 = weight(abstract_txt:informationen in 5942) [ClassicSimilarity], result of:
            0.030831343 = score(doc=5942,freq=1.0), product of:
              0.079282865 = queryWeight, product of:
                1.1561881 = boost
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.013776146 = queryNorm
              0.38887775 = fieldWeight in 5942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.027582925 = weight(abstract_txt:eine in 5942) [ClassicSimilarity], result of:
            0.027582925 = score(doc=5942,freq=3.0), product of:
              0.05842534 = queryWeight, product of:
                1.2155843 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013776146 = queryNorm
              0.4721055 = fieldWeight in 5942, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.044141624 = weight(abstract_txt:suche in 5942) [ClassicSimilarity], result of:
            0.044141624 = score(doc=5942,freq=1.0), product of:
              0.10071247 = queryWeight, product of:
                1.3031081 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.013776146 = queryNorm
              0.43829355 = fieldWeight in 5942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.060303446 = weight(abstract_txt:dessen in 5942) [ClassicSimilarity], result of:
            0.060303446 = score(doc=5942,freq=1.0), product of:
              0.12399736 = queryWeight, product of:
                1.4459226 = boost
                6.225004 = idf(docFreq=238, maxDocs=44421)
                0.013776146 = queryNorm
              0.48632845 = fieldWeight in 5942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.225004 = idf(docFreq=238, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.052343365 = weight(abstract_txt:sowie in 5942) [ClassicSimilarity], result of:
            0.052343365 = score(doc=5942,freq=2.0), product of:
              0.10251307 = queryWeight, product of:
                1.6101787 = boost
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.013776146 = queryNorm
              0.5106019 = fieldWeight in 5942, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.621441 = idf(docFreq=1187, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.033559885 = weight(abstract_txt:wird in 5942) [ClassicSimilarity], result of:
            0.033559885 = score(doc=5942,freq=1.0), product of:
              0.11386179 = queryWeight, product of:
                2.1907754 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.013776146 = queryNorm
              0.2947423 = fieldWeight in 5942, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
          0.27539632 = weight(abstract_txt:algorithmus in 5942) [ClassicSimilarity], result of:
            0.27539632 = score(doc=5942,freq=2.0), product of:
              0.310106 = queryWeight, product of:
                2.8005257 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.013776146 = queryNorm
              0.88807154 = fieldWeight in 5942, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.078125 = fieldNorm(doc=5942)
        0.32 = coord(8/25)
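
The per-term lines in the listing above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code: the function names are hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are inferred from the numbers shown in the listing. It recomputes the score of the term "nach" in document 5942.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as implied by the listing (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: the square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term = queryWeight * fieldWeight, as in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # boost * idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the entry for abstract_txt:nach in document 5942.
nach_5942 = term_score(
    freq=2.0,
    doc_freq=1562,
    max_docs=44421,
    boost=1.0097308,
    query_norm=0.013776146,
    field_norm=0.078125,
)
print(nach_5942)
```

The result should agree with the listed 0.029042814 up to the rounding of the engine's single-precision arithmetic.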
    
  2. Voregger, M.: Angriff der Heinzelmännchen : Steuern in den Informationsfluten des Webs - mit Hilfe digitaler Agenten (1997) 0.17
    0.17484659 = sum of:
      0.17484659 = product of:
        2.1855824 = sum of:
          0.09330671 = weight(abstract_txt:internet in 379) [ClassicSimilarity], result of:
            0.09330671 = score(doc=379,freq=1.0), product of:
              0.066731244 = queryWeight, product of:
                1.2991195 = boost
                3.7286568 = idf(docFreq=2900, maxDocs=44421)
                0.013776146 = queryNorm
              1.3982463 = fieldWeight in 379, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7286568 = idf(docFreq=2900, maxDocs=44421)
                0.375 = fieldNorm(doc=379)
          2.0922756 = weight(abstract_txt:agenten in 379) [ClassicSimilarity], result of:
            2.0922756 = score(doc=379,freq=1.0), product of:
              0.6291432 = queryWeight, product of:
                5.1497197 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.013776146 = queryNorm
              3.3255954 = fieldWeight in 379, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.375 = fieldNorm(doc=379)
        0.08 = coord(2/25)
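
The document-level score is the sum of the matching term scores multiplied by the coordination factor, that is, the fraction of query terms that matched. A short sketch with the two term scores of document 379, copied from the listing; the function name `document_score` is hypothetical:

```python
def document_score(term_scores, matched_terms, query_terms):
    """Sum of the per-term scores, scaled by coord = matched / total query terms."""
    coord = matched_terms / query_terms
    return sum(term_scores) * coord


# Term scores for "internet" and "agenten" in document 379, as listed above.
score_379 = document_score(
    term_scores=[0.09330671, 2.0922756],
    matched_terms=2,
    query_terms=25,
)
print(score_379)
```

Only 2 of the 25 query terms match, so the large contribution of "agenten" is scaled by 0.08, which is why this entry ends up ranked next to documents with many weaker matches.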
    
  3. Weiner, M.: Die Agenten kommen (2002) 0.17
    0.17135005 = sum of:
      0.17135005 = product of:
        0.85675025 = sum of:
          0.035570037 = weight(abstract_txt:nach in 734) [ClassicSimilarity], result of:
            0.035570037 = score(doc=734,freq=3.0), product of:
              0.0604691 = queryWeight, product of:
                1.0097308 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.013776146 = queryNorm
              0.58823496 = fieldWeight in 734, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.078125 = fieldNorm(doc=734)
          0.030831343 = weight(abstract_txt:informationen in 734) [ClassicSimilarity], result of:
            0.030831343 = score(doc=734,freq=1.0), product of:
              0.079282865 = queryWeight, product of:
                1.1561881 = boost
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.013776146 = queryNorm
              0.38887775 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.078125 = fieldNorm(doc=734)
          0.01592501 = weight(abstract_txt:eine in 734) [ClassicSimilarity], result of:
            0.01592501 = score(doc=734,freq=1.0), product of:
              0.05842534 = queryWeight, product of:
                1.2155843 = boost
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.013776146 = queryNorm
              0.27257025 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4888992 = idf(docFreq=3686, maxDocs=44421)
                0.078125 = fieldNorm(doc=734)
          0.019438898 = weight(abstract_txt:internet in 734) [ClassicSimilarity], result of:
            0.019438898 = score(doc=734,freq=1.0), product of:
              0.066731244 = queryWeight, product of:
                1.2991195 = boost
                3.7286568 = idf(docFreq=2900, maxDocs=44421)
                0.013776146 = queryNorm
              0.2913013 = fieldWeight in 734, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7286568 = idf(docFreq=2900, maxDocs=44421)
                0.078125 = fieldNorm(doc=734)
          0.754985 = weight(abstract_txt:agenten in 734) [ClassicSimilarity], result of:
            0.754985 = score(doc=734,freq=3.0), product of:
              0.6291432 = queryWeight, product of:
                5.1497197 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.013776146 = queryNorm
              1.2000209 = fieldWeight in 734, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.078125 = fieldNorm(doc=734)
        0.2 = coord(5/25)
    
  4. Röscheisen, E.: Fin de such (2001) 0.17
    0.17092118 = sum of:
      0.17092118 = product of:
        1.0682573 = sum of:
          0.04107274 = weight(abstract_txt:nach in 496) [ClassicSimilarity], result of:
            0.04107274 = score(doc=496,freq=1.0), product of:
              0.0604691 = queryWeight, product of:
                1.0097308 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.013776146 = queryNorm
              0.6792352 = fieldWeight in 496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.15625 = fieldNorm(doc=496)
          0.08828325 = weight(abstract_txt:suche in 496) [ClassicSimilarity], result of:
            0.08828325 = score(doc=496,freq=1.0), product of:
              0.10071247 = queryWeight, product of:
                1.3031081 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.013776146 = queryNorm
              0.8765871 = fieldWeight in 496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.15625 = fieldNorm(doc=496)
          0.06711977 = weight(abstract_txt:wird in 496) [ClassicSimilarity], result of:
            0.06711977 = score(doc=496,freq=1.0), product of:
              0.11386179 = queryWeight, product of:
                2.1907754 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.013776146 = queryNorm
              0.5894846 = fieldWeight in 496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.15625 = fieldNorm(doc=496)
          0.8717816 = weight(abstract_txt:agenten in 496) [ClassicSimilarity], result of:
            0.8717816 = score(doc=496,freq=1.0), product of:
              0.6291432 = queryWeight, product of:
                5.1497197 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.013776146 = queryNorm
              1.3856648 = fieldWeight in 496, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.15625 = fieldNorm(doc=496)
        0.16 = coord(4/25)
    
  5. Göbel, R.: Semantic Web & Linked Data für professionelle Informationsangebote : Hoffnungsträger oder "alter Hut" - Eine Praxisbetrachtung für die Wirtschaftsinformationen (2010) 0.14
    0.1355904 = sum of:
      0.1355904 = product of:
        0.677952 = sum of:
          0.024643647 = weight(abstract_txt:nach in 258) [ClassicSimilarity], result of:
            0.024643647 = score(doc=258,freq=1.0), product of:
              0.0604691 = queryWeight, product of:
                1.0097308 = boost
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.013776146 = queryNorm
              0.40754116 = fieldWeight in 258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3471055 = idf(docFreq=1562, maxDocs=44421)
                0.09375 = fieldNorm(doc=258)
          0.036997613 = weight(abstract_txt:informationen in 258) [ClassicSimilarity], result of:
            0.036997613 = score(doc=258,freq=1.0), product of:
              0.079282865 = queryWeight, product of:
                1.1561881 = boost
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.013776146 = queryNorm
              0.46665332 = fieldWeight in 258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9776354 = idf(docFreq=831, maxDocs=44421)
                0.09375 = fieldNorm(doc=258)
          0.052969955 = weight(abstract_txt:suche in 258) [ClassicSimilarity], result of:
            0.052969955 = score(doc=258,freq=1.0), product of:
              0.10071247 = queryWeight, product of:
                1.3031081 = boost
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.013776146 = queryNorm
              0.5259523 = fieldWeight in 258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101575 = idf(docFreq=441, maxDocs=44421)
                0.09375 = fieldNorm(doc=258)
          0.040271863 = weight(abstract_txt:wird in 258) [ClassicSimilarity], result of:
            0.040271863 = score(doc=258,freq=1.0), product of:
              0.11386179 = queryWeight, product of:
                2.1907754 = boost
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.013776146 = queryNorm
              0.35369074 = fieldWeight in 258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7727013 = idf(docFreq=2775, maxDocs=44421)
                0.09375 = fieldNorm(doc=258)
          0.5230689 = weight(abstract_txt:agenten in 258) [ClassicSimilarity], result of:
            0.5230689 = score(doc=258,freq=1.0), product of:
              0.6291432 = queryWeight, product of:
                5.1497197 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.013776146 = queryNorm
              0.83139884 = fieldWeight in 258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.09375 = fieldNorm(doc=258)
        0.2 = coord(5/25)