Document (#30272)

Author
Avrahami, T.T.
Yau, L.
Si, L.
Callan, J.P.
Title
¬The FedLemur project : Federated search in the real world
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 347-358
Year
2006
Abstract
Federated search and distributed information retrieval systems provide a single user interface for searching multiple full-text search engines. They have been an active area of research for more than a decade, but in spite of their success as a research topic, they are still rare in operational environments. This article discusses a prototype federated search system developed for the U.S. government's FedStats Web portal, and the issues addressed in adapting research solutions to this operational environment. A series of experiments explores how well prior research results, parameter settings, and heuristics apply in the FedStats environment. The article concludes with a set of lessons learned from this technology transfer effort, including observations about search engine quality in the real world.
Theme
Distributed bibliographic databases

Similar documents (author)

  1. Callan, J.: Distributed information retrieval (2000) 5.94
    5.9401517 = sum of:
      5.9401517 = weight(author_txt:callan in 1031) [ClassicSimilarity], result of:
        5.9401517 = fieldWeight in 1031, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.504243 = idf(docFreq=8, maxDocs=44421)
          0.625 = fieldNorm(doc=1031)
    
  2. Robertson, S.; Callan, J.: Routing and filtering (2005) 4.75
    4.7521214 = sum of:
      4.7521214 = weight(author_txt:callan in 5688) [ClassicSimilarity], result of:
        4.7521214 = fieldWeight in 5688, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.504243 = idf(docFreq=8, maxDocs=44421)
          0.5 = fieldNorm(doc=5688)
    
  3. Collins-Thompson, K.; Callan, J.: Predicting reading difficulty with statistical language models (2005) 4.16
    4.1581063 = sum of:
      4.1581063 = weight(author_txt:callan in 5579) [ClassicSimilarity], result of:
        4.1581063 = fieldWeight in 5579, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.504243 = idf(docFreq=8, maxDocs=44421)
          0.4375 = fieldNorm(doc=5579)
    
  4. Callan, J.; Croft, W.B.; Broglio, J.: TREC and TIPSTER experiments with INQUERY (1995) 3.56
    3.5640912 = sum of:
      3.5640912 = weight(author_txt:callan in 2944) [ClassicSimilarity], result of:
        3.5640912 = fieldWeight in 2944, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.504243 = idf(docFreq=8, maxDocs=44421)
          0.375 = fieldNorm(doc=2944)
    
  5. Allan, J.; Croft, W.B.; Callan, J.: ¬The University of Massachusetts and a dozen TRECs (2005) 3.56
    3.5640912 = sum of:
      3.5640912 = weight(author_txt:callan in 86) [ClassicSimilarity], result of:
        3.5640912 = fieldWeight in 86, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.504243 = idf(docFreq=8, maxDocs=44421)
          0.375 = fieldNorm(doc=86)
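
The author scores above all follow one pattern: fieldWeight = tf * idf * fieldNorm. The sketch below reproduces that arithmetic in Python. It assumes the classic formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the tf and idf values printed in the listing; the function names are hypothetical and are not part of any search library. Results should agree with the listed scores up to single-precision rounding.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form, consistent with the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation trees."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# fieldNorm values copied from the five author matches above.
author_hits = [
    ("Distributed information retrieval", 0.625),
    ("Routing and filtering", 0.5),
    ("Predicting reading difficulty with statistical language models", 0.4375),
    ("TREC and TIPSTER experiments with INQUERY", 0.375),
    ("The University of Massachusetts and a dozen TRECs", 0.375),
]

for title, norm in author_hits:
    # Every hit matches the term "callan" once (freq=1.0, docFreq=8, maxDocs=44421).
    print(f"{field_weight(1.0, 8, 44421, norm):.4f}  {title}")
```

Because tf and idf are identical for every hit, the ranking in this list is determined by fieldNorm alone, which is larger for records with shorter author fields.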
    

Similar documents (content)

  1. Peckham, J.; MacKellar, B.; Vorback, J.: ¬A unified approach to the design and generation of complex database schemata (1997) 0.17
    0.16666792 = sum of:
      0.16666792 = product of:
        0.59524256 = sum of:
          0.119304664 = weight(abstract_txt:active in 2259) [ClassicSimilarity], result of:
            0.119304664 = score(doc=2259,freq=2.0), product of:
              0.12183266 = queryWeight, product of:
                1.0347923 = boost
                6.3308296 = idf(docFreq=214, maxDocs=44421)
                0.018597301 = queryNorm
              0.97925025 = fieldWeight in 2259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3308296 = idf(docFreq=214, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.093849756 = weight(abstract_txt:learned in 2259) [ClassicSimilarity], result of:
            0.093849756 = score(doc=2259,freq=1.0), product of:
              0.13080496 = queryWeight, product of:
                1.0722188 = boost
                6.559804 = idf(docFreq=170, maxDocs=44421)
                0.018597301 = queryNorm
              0.7174786 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.559804 = idf(docFreq=170, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.112350285 = weight(abstract_txt:lessons in 2259) [ClassicSimilarity], result of:
            0.112350285 = score(doc=2259,freq=1.0), product of:
              0.14747494 = queryWeight, product of:
                1.1384932 = boost
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.018597301 = queryNorm
              0.7618263 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.965269 = idf(docFreq=113, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.035004105 = weight(abstract_txt:they in 2259) [ClassicSimilarity], result of:
            0.035004105 = score(doc=2259,freq=1.0), product of:
              0.08539349 = queryWeight, product of:
                1.2251766 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.018597301 = queryNorm
              0.4099154 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.093535304 = weight(abstract_txt:environment in 2259) [ClassicSimilarity], result of:
            0.093535304 = score(doc=2259,freq=2.0), product of:
              0.13051261 = queryWeight, product of:
                1.514651 = boost
                4.6332955 = idf(docFreq=1173, maxDocs=44421)
                0.018597301 = queryNorm
              0.71667635 = fieldWeight in 2259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6332955 = idf(docFreq=1173, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.09925061 = weight(abstract_txt:real in 2259) [ClassicSimilarity], result of:
            0.09925061 = score(doc=2259,freq=1.0), product of:
              0.17106752 = queryWeight, product of:
                1.7340839 = boost
                5.304538 = idf(docFreq=599, maxDocs=44421)
                0.018597301 = queryNorm
              0.5801838 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.304538 = idf(docFreq=599, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
          0.04194782 = weight(abstract_txt:research in 2259) [ClassicSimilarity], result of:
            0.04194782 = score(doc=2259,freq=1.0), product of:
              0.12138407 = queryWeight, product of:
                2.0657709 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.018597301 = queryNorm
              0.34557927 = fieldWeight in 2259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.109375 = fieldNorm(doc=2259)
        0.28 = coord(7/25)
    
  2. Mitchell, A.M.; Thompson, J.M.; Wu, A.: Agile cataloging : staffing and skills for a bibliographic future (2010) 0.14
    0.14498314 = sum of:
      0.14498314 = product of:
        0.6040964 = sum of:
          0.014037272 = weight(abstract_txt:this in 159) [ClassicSimilarity], result of:
            0.014037272 = score(doc=159,freq=2.0), product of:
              0.052800704 = queryWeight, product of:
                1.1799179 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018597301 = queryNorm
              0.26585388 = fieldWeight in 159, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
          0.02604896 = weight(abstract_txt:article in 159) [ClassicSimilarity], result of:
            0.02604896 = score(doc=159,freq=1.0), product of:
              0.08775888 = queryWeight, product of:
                1.2420293 = boost
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.018597301 = queryNorm
              0.29682422 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
          0.04724246 = weight(abstract_txt:environment in 159) [ClassicSimilarity], result of:
            0.04724246 = score(doc=159,freq=1.0), product of:
              0.13051261 = queryWeight, product of:
                1.514651 = boost
                4.6332955 = idf(docFreq=1173, maxDocs=44421)
                0.018597301 = queryNorm
              0.3619762 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6332955 = idf(docFreq=1173, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
          0.029962728 = weight(abstract_txt:research in 159) [ClassicSimilarity], result of:
            0.029962728 = score(doc=159,freq=1.0), product of:
              0.12138407 = queryWeight, product of:
                2.0657709 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.018597301 = queryNorm
              0.24684234 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
          0.057959057 = weight(abstract_txt:search in 159) [ClassicSimilarity], result of:
            0.057959057 = score(doc=159,freq=1.0), product of:
              0.20299797 = queryWeight, product of:
                2.9867725 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018597301 = queryNorm
              0.28551546 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
          0.42884597 = weight(abstract_txt:federated in 159) [ClassicSimilarity], result of:
            0.42884597 = score(doc=159,freq=1.0), product of:
              0.6501229 = queryWeight, product of:
                4.14028 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.018597301 = queryNorm
              0.65963835 = fieldWeight in 159, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.078125 = fieldNorm(doc=159)
        0.24 = coord(6/25)
    
  3. Taylor, M.: Using the Google search appliance for federated searching : a case study (2005) 0.12
    0.122480355 = sum of:
      0.122480355 = product of:
        0.7655022 = sum of:
          0.013896191 = weight(abstract_txt:this in 1355) [ClassicSimilarity], result of:
            0.013896191 = score(doc=1355,freq=1.0), product of:
              0.052800704 = queryWeight, product of:
                1.1799179 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018597301 = queryNorm
              0.26318192 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.109375 = fieldNorm(doc=1355)
          0.036468543 = weight(abstract_txt:article in 1355) [ClassicSimilarity], result of:
            0.036468543 = score(doc=1355,freq=1.0), product of:
              0.08775888 = queryWeight, product of:
                1.2420293 = boost
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.018597301 = queryNorm
              0.4155539 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.109375 = fieldNorm(doc=1355)
          0.11475309 = weight(abstract_txt:search in 1355) [ClassicSimilarity], result of:
            0.11475309 = score(doc=1355,freq=2.0), product of:
              0.20299797 = queryWeight, product of:
                2.9867725 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018597301 = queryNorm
              0.5652918 = fieldWeight in 1355, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.109375 = fieldNorm(doc=1355)
          0.60038435 = weight(abstract_txt:federated in 1355) [ClassicSimilarity], result of:
            0.60038435 = score(doc=1355,freq=1.0), product of:
              0.6501229 = queryWeight, product of:
                4.14028 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.018597301 = queryNorm
              0.9234937 = fieldWeight in 1355, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.109375 = fieldNorm(doc=1355)
        0.16 = coord(4/25)
    
  4. Joint, N.: ¬The one-stop shop search engine : a transformational library technology? ANTAEUS (2010) 0.12
    0.1213155 = sum of:
      0.1213155 = product of:
        0.6065775 = sum of:
          0.012034454 = weight(abstract_txt:this in 201) [ClassicSimilarity], result of:
            0.012034454 = score(doc=201,freq=3.0), product of:
              0.052800704 = queryWeight, product of:
                1.1799179 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018597301 = queryNorm
              0.22792223 = fieldWeight in 201, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0546875 = fieldNorm(doc=201)
          0.049625304 = weight(abstract_txt:real in 201) [ClassicSimilarity], result of:
            0.049625304 = score(doc=201,freq=1.0), product of:
              0.17106752 = queryWeight, product of:
                1.7340839 = boost
                5.304538 = idf(docFreq=599, maxDocs=44421)
                0.018597301 = queryNorm
              0.2900919 = fieldWeight in 201, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.304538 = idf(docFreq=599, maxDocs=44421)
                0.0546875 = fieldNorm(doc=201)
          0.029661588 = weight(abstract_txt:research in 201) [ClassicSimilarity], result of:
            0.029661588 = score(doc=201,freq=2.0), product of:
              0.12138407 = queryWeight, product of:
                2.0657709 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.018597301 = queryNorm
              0.24436146 = fieldWeight in 201, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.0546875 = fieldNorm(doc=201)
          0.09072028 = weight(abstract_txt:search in 201) [ClassicSimilarity], result of:
            0.09072028 = score(doc=201,freq=5.0), product of:
              0.20299797 = queryWeight, product of:
                2.9867725 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018597301 = queryNorm
              0.4469024 = fieldWeight in 201, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0546875 = fieldNorm(doc=201)
          0.42453587 = weight(abstract_txt:federated in 201) [ClassicSimilarity], result of:
            0.42453587 = score(doc=201,freq=2.0), product of:
              0.6501229 = queryWeight, product of:
                4.14028 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.018597301 = queryNorm
              0.65300864 = fieldWeight in 201, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.0546875 = fieldNorm(doc=201)
        0.2 = coord(5/25)
    
  5. Shechtman, N.; Chung, M.; Roschelle, J.: Supporting member collaboration in the Math Tools digital library : a formative user study (2004) 0.12
    0.117696166 = sum of:
      0.117696166 = product of:
        0.73560107 = sum of:
          0.017192077 = weight(abstract_txt:this in 2163) [ClassicSimilarity], result of:
            0.017192077 = score(doc=2163,freq=3.0), product of:
              0.052800704 = queryWeight, product of:
                1.1799179 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018597301 = queryNorm
              0.3256032 = fieldWeight in 2163, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=2163)
          0.029962728 = weight(abstract_txt:research in 2163) [ClassicSimilarity], result of:
            0.029962728 = score(doc=2163,freq=1.0), product of:
              0.12138407 = queryWeight, product of:
                2.0657709 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.018597301 = queryNorm
              0.24684234 = fieldWeight in 2163, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.078125 = fieldNorm(doc=2163)
          0.08196649 = weight(abstract_txt:search in 2163) [ClassicSimilarity], result of:
            0.08196649 = score(doc=2163,freq=2.0), product of:
              0.20299797 = queryWeight, product of:
                2.9867725 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.018597301 = queryNorm
              0.40377986 = fieldWeight in 2163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=2163)
          0.60647976 = weight(abstract_txt:federated in 2163) [ClassicSimilarity], result of:
            0.60647976 = score(doc=2163,freq=2.0), product of:
              0.6501229 = queryWeight, product of:
                4.14028 = boost
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.018597301 = queryNorm
              0.93286943 = fieldWeight in 2163, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.443371 = idf(docFreq=25, maxDocs=44421)
                0.078125 = fieldNorm(doc=2163)
        0.16 = coord(4/25)
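
The content scores above combine three steps: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched terms divided by the 25 query terms). The sketch below recomputes the third hit (Taylor, doc 1355) from the boost, idf, freq, and fieldNorm values printed in the listing. The function and variable names are hypothetical, and the result should match the listed score only up to single-precision rounding.

```python
import math

QUERY_NORM = 0.018597301   # queryNorm, copied from the listing
QUERY_TERMS_TOTAL = 25     # denominator of coord(n/25) in the listing


def term_score(boost: float, idf: float, freq: float, field_norm: float,
               query_norm: float = QUERY_NORM) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float,
                   total_terms: int = QUERY_TERMS_TOTAL) -> float:
    """Sum the per-term scores, then scale by coord = matched / total."""
    matched_sum = sum(term_score(boost, idf, freq, field_norm)
                      for boost, idf, freq in terms)
    coord = len(terms) / total_terms
    return matched_sum * coord


# (boost, idf, freq) for the four terms matched in doc 1355.
taylor_terms = [
    (1.1799179, 2.4062347, 1.0),  # this
    (1.2420293, 3.79935, 1.0),    # article
    (2.9867725, 3.654598, 2.0),   # search
    (4.14028, 8.443371, 1.0),     # federated
]

taylor_score = document_score(taylor_terms, field_norm=0.109375)
print(f"{taylor_score:.6f}")
```

The rare term "federated" (docFreq=25) accounts for most of the sum, while the coord factor lowers the score of documents that match only a few of the 25 query terms.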