Document (#32918)

Author
Abdelali, A.
Cowie, J.
Soliman, H.S.
Title
Improving query precision using semantic expansion
Source
Information processing and management. 43(2007) no.3, pp. 705-716
Year
2007
Abstract
Query Expansion (QE) is one of the most important mechanisms in the information retrieval field. A typical short Internet query will go through a process of refinement to improve its retrieval power. Most existing QE techniques suffer from degraded retrieval performance because the terms added to the query during the QE process are chosen imprecisely. In this paper, we introduce a novel automated QE mechanism. The new expansion process is guided by the semantic relations between the original query and the expanding words, in the context of the utilized corpus. Experimental results of our "controlled" query expansion, using the Arabic TREC-10 data, show a significant enhancement of recall and precision over existing mechanisms in the field.
Footnote
Contribution to: Special issue on Heterogeneous and Distributed IR
Theme
Retrieval algorithms

Similar documents (content)

  1. He, B.; Ounis, I.: Combining fields for query expansion and adaptive query expansion (2007) 0.32
    0.3190759 = sum of:
      0.3190759 = product of:
        0.9971122 = sum of:
          0.08495175 = weight(abstract_txt:mechanism in 1926) [ClassicSimilarity], result of:
            0.08495175 = score(doc=1926,freq=3.0), product of:
              0.12431898 = queryWeight, product of:
                1.0361787 = boost
                6.312396 = idf(docFreq=218, maxDocs=44421)
                0.01900678 = queryNorm
              0.6833369 = fieldWeight in 1926, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.312396 = idf(docFreq=218, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.058759835 = weight(abstract_txt:trec in 1926) [ClassicSimilarity], result of:
            0.058759835 = score(doc=1926,freq=1.0), product of:
              0.14023294 = queryWeight, product of:
                1.1005023 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.01900678 = queryNorm
              0.41901594 = fieldWeight in 1926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.016110478 = weight(abstract_txt:using in 1926) [ClassicSimilarity], result of:
            0.016110478 = score(doc=1926,freq=1.0), product of:
              0.07456676 = queryWeight, product of:
                1.1348895 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.01900678 = queryNorm
              0.21605442 = fieldWeight in 1926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.03526501 = weight(abstract_txt:field in 1926) [ClassicSimilarity], result of:
            0.03526501 = score(doc=1926,freq=1.0), product of:
              0.12570976 = queryWeight, product of:
                1.4735519 = boost
                4.4884357 = idf(docFreq=1356, maxDocs=44421)
                0.01900678 = queryNorm
              0.28052723 = fieldWeight in 1926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4884357 = idf(docFreq=1356, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.034760974 = weight(abstract_txt:retrieval in 1926) [ClassicSimilarity], result of:
            0.034760974 = score(doc=1926,freq=2.0), product of:
              0.11312398 = queryWeight, product of:
                1.7120006 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01900678 = queryNorm
              0.3072821 = fieldWeight in 1926, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.054914054 = weight(abstract_txt:process in 1926) [ClassicSimilarity], result of:
            0.054914054 = score(doc=1926,freq=2.0), product of:
              0.15344371 = queryWeight, product of:
                1.9938896 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01900678 = queryNorm
              0.35787752 = fieldWeight in 1926, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.27724278 = weight(abstract_txt:query in 1926) [ClassicSimilarity], result of:
            0.27724278 = score(doc=1926,freq=7.0), product of:
              0.3526364 = queryWeight, product of:
                3.9022448 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.01900678 = queryNorm
              0.78620017 = fieldWeight in 1926, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
          0.4351073 = weight(abstract_txt:expansion in 1926) [ClassicSimilarity], result of:
            0.4351073 = score(doc=1926,freq=6.0), product of:
              0.465404 = queryWeight, product of:
                4.0096917 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.01900678 = queryNorm
              0.9349023 = fieldWeight in 1926, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=1926)
        0.32 = coord(8/25)
    
  2. Xu, B.; Lin, H.; Lin, Y.: Assessment of learning to rank methods for query expansion (2016) 0.25
    0.24868694 = sum of:
      0.24868694 = product of:
        0.7771467 = sum of:
          0.044570956 = weight(abstract_txt:introduce in 3929) [ClassicSimilarity], result of:
            0.044570956 = score(doc=3929,freq=1.0), product of:
              0.116635546 = queryWeight, product of:
                1.0036479 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.01900678 = queryNorm
              0.3821387 = fieldWeight in 3929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.058759835 = weight(abstract_txt:trec in 3929) [ClassicSimilarity], result of:
            0.058759835 = score(doc=3929,freq=1.0), product of:
              0.14023294 = queryWeight, product of:
                1.1005023 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.01900678 = queryNorm
              0.41901594 = fieldWeight in 3929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.016110478 = weight(abstract_txt:using in 3929) [ClassicSimilarity], result of:
            0.016110478 = score(doc=3929,freq=1.0), product of:
              0.07456676 = queryWeight, product of:
                1.1348895 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.01900678 = queryNorm
              0.21605442 = fieldWeight in 3929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.081195205 = weight(abstract_txt:refinement in 3929) [ClassicSimilarity], result of:
            0.081195205 = score(doc=3929,freq=1.0), product of:
              0.17397355 = queryWeight, product of:
                1.225766 = boost
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.01900678 = queryNorm
              0.46671006 = fieldWeight in 3929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.467361 = idf(docFreq=68, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.02389469 = weight(abstract_txt:most in 3929) [ClassicSimilarity], result of:
            0.02389469 = score(doc=3929,freq=1.0), product of:
              0.09697815 = queryWeight, product of:
                1.2942492 = boost
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.01900678 = queryNorm
              0.2463925 = fieldWeight in 3929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.94228 = idf(docFreq=2342, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.04915944 = weight(abstract_txt:retrieval in 3929) [ClassicSimilarity], result of:
            0.04915944 = score(doc=3929,freq=4.0), product of:
              0.11312398 = queryWeight, product of:
                1.7120006 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01900678 = queryNorm
              0.4345625 = fieldWeight in 3929, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.14819251 = weight(abstract_txt:query in 3929) [ClassicSimilarity], result of:
            0.14819251 = score(doc=3929,freq=2.0), product of:
              0.3526364 = queryWeight, product of:
                3.9022448 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.01900678 = queryNorm
              0.42024165 = fieldWeight in 3929, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
          0.3552636 = weight(abstract_txt:expansion in 3929) [ClassicSimilarity], result of:
            0.3552636 = score(doc=3929,freq=4.0), product of:
              0.465404 = queryWeight, product of:
                4.0096917 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.01900678 = queryNorm
              0.7633445 = fieldWeight in 3929, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=3929)
        0.32 = coord(8/25)
    
  3. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.25
    0.24722087 = sum of:
      0.24722087 = product of:
        1.030087 = sum of:
          0.044086713 = weight(abstract_txt:corpus in 2338) [ClassicSimilarity], result of:
            0.044086713 = score(doc=2338,freq=1.0), product of:
              0.11578922 = queryWeight, product of:
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.01900678 = queryNorm
              0.38074973 = fieldWeight in 2338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.114311665 = weight(abstract_txt:imprecise in 2338) [ClassicSimilarity], result of:
            0.114311665 = score(doc=2338,freq=1.0), product of:
              0.21853617 = queryWeight, product of:
                1.3738129 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.01900678 = queryNorm
              0.5230789 = fieldWeight in 2338, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.054961927 = weight(abstract_txt:retrieval in 2338) [ClassicSimilarity], result of:
            0.054961927 = score(doc=2338,freq=5.0), product of:
              0.11312398 = queryWeight, product of:
                1.7120006 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01900678 = queryNorm
              0.48585567 = fieldWeight in 2338, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.067255706 = weight(abstract_txt:process in 2338) [ClassicSimilarity], result of:
            0.067255706 = score(doc=2338,freq=3.0), product of:
              0.15344371 = queryWeight, product of:
                1.9938896 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01900678 = queryNorm
              0.43830866 = fieldWeight in 2338, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.31436378 = weight(abstract_txt:query in 2338) [ClassicSimilarity], result of:
            0.31436378 = score(doc=2338,freq=9.0), product of:
              0.3526364 = queryWeight, product of:
                3.9022448 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.01900678 = queryNorm
              0.8914672 = fieldWeight in 2338, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
          0.4351073 = weight(abstract_txt:expansion in 2338) [ClassicSimilarity], result of:
            0.4351073 = score(doc=2338,freq=6.0), product of:
              0.465404 = queryWeight, product of:
                4.0096917 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.01900678 = queryNorm
              0.9349023 = fieldWeight in 2338, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=2338)
        0.24 = coord(6/25)
    
  4. Efthimiadis, E.N.: Query expansion (1996) 0.24
    0.2416621 = sum of:
      0.2416621 = product of:
        1.5103881 = sum of:
          0.04915944 = weight(abstract_txt:retrieval in 4915) [ClassicSimilarity], result of:
            0.04915944 = score(doc=4915,freq=1.0), product of:
              0.11312398 = queryWeight, product of:
                1.7120006 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01900678 = queryNorm
              0.4345625 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.125 = fieldNorm(doc=4915)
          0.0776602 = weight(abstract_txt:process in 4915) [ClassicSimilarity], result of:
            0.0776602 = score(doc=4915,freq=1.0), product of:
              0.15344371 = queryWeight, product of:
                1.9938896 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01900678 = queryNorm
              0.50611526 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.125 = fieldNorm(doc=4915)
          0.51335394 = weight(abstract_txt:query in 4915) [ClassicSimilarity], result of:
            0.51335394 = score(doc=4915,freq=6.0), product of:
              0.3526364 = queryWeight, product of:
                3.9022448 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.01900678 = queryNorm
              1.4557599 = fieldWeight in 4915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.125 = fieldNorm(doc=4915)
          0.8702146 = weight(abstract_txt:expansion in 4915) [ClassicSimilarity], result of:
            0.8702146 = score(doc=4915,freq=6.0), product of:
              0.465404 = queryWeight, product of:
                4.0096917 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.01900678 = queryNorm
              1.8698046 = fieldWeight in 4915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.125 = fieldNorm(doc=4915)
        0.16 = coord(4/25)
    
  5. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.24
    0.237648 = sum of:
      0.237648 = product of:
        0.84874284 = sum of:
          0.044570956 = weight(abstract_txt:introduce in 2343) [ClassicSimilarity], result of:
            0.044570956 = score(doc=2343,freq=1.0), product of:
              0.116635546 = queryWeight, product of:
                1.0036479 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.01900678 = queryNorm
              0.3821387 = fieldWeight in 2343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.016110478 = weight(abstract_txt:using in 2343) [ClassicSimilarity], result of:
            0.016110478 = score(doc=2343,freq=1.0), product of:
              0.07456676 = queryWeight, product of:
                1.1348895 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.01900678 = queryNorm
              0.21605442 = fieldWeight in 2343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.03902774 = weight(abstract_txt:existing in 2343) [ClassicSimilarity], result of:
            0.03902774 = score(doc=2343,freq=1.0), product of:
              0.13449988 = queryWeight, product of:
                1.5241997 = boost
                4.6427093 = idf(docFreq=1162, maxDocs=44421)
                0.01900678 = queryNorm
              0.29016933 = fieldWeight in 2343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6427093 = idf(docFreq=1162, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.02457972 = weight(abstract_txt:retrieval in 2343) [ClassicSimilarity], result of:
            0.02457972 = score(doc=2343,freq=1.0), product of:
              0.11312398 = queryWeight, product of:
                1.7120006 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.01900678 = queryNorm
              0.21728125 = fieldWeight in 2343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.09294425 = weight(abstract_txt:precision in 2343) [ClassicSimilarity], result of:
            0.09294425 = score(doc=2343,freq=2.0), product of:
              0.19037561 = queryWeight, product of:
                1.8133707 = boost
                5.5235233 = idf(docFreq=481, maxDocs=44421)
                0.01900678 = queryNorm
              0.4882151 = fieldWeight in 2343, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5235233 = idf(docFreq=481, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.23431292 = weight(abstract_txt:query in 2343) [ClassicSimilarity], result of:
            0.23431292 = score(doc=2343,freq=5.0), product of:
              0.3526364 = queryWeight, product of:
                3.9022448 = boost
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.01900678 = queryNorm
              0.6644604 = fieldWeight in 2343, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.754492 = idf(docFreq=1039, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
          0.39719677 = weight(abstract_txt:expansion in 2343) [ClassicSimilarity], result of:
            0.39719677 = score(doc=2343,freq=5.0), product of:
              0.465404 = queryWeight, product of:
                4.0096917 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.01900678 = queryNorm
              0.8534451 = fieldWeight in 2343, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=2343)
        0.28 = coord(7/25)
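
The score trees above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with the summed term scores multiplied by coord = matched terms / query terms. The function names (idf, term_score, document_score) are illustrative and not part of Lucene; the input numbers are copied from the trees listed above.

```python
import math

QUERY_NORM = 0.01900678   # queryNorm as printed in every tree above
MAX_DOCS = 44421          # maxDocs as printed in every tree above


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, boost, field_norm):
    # One "weight(abstract_txt:term in doc)" node: queryWeight * fieldWeight
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, query_terms=25):
    # Sum the matching term scores, then apply coord(matched/query_terms)
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    return total * len(terms) / query_terms


# "mechanism" in doc 1926: freq=3.0, docFreq=218, boost=1.0361787
mechanism = term_score(3.0, 218, 1.0361787, 0.0625)

# Entry 4 (doc 4915): (freq, docFreq, boost) for each matching term
doc_4915 = [
    (1.0, 3732, 1.7120006),  # retrieval
    (1.0, 2105, 1.9938896),  # process
    (6.0, 1039, 3.9022448),  # query
    (6.0, 268, 4.0096917),   # expansion
]
score_4915 = document_score(doc_4915, field_norm=0.125)

print(round(mechanism, 5), round(score_4915, 4))
```

The results agree with the listed values (0.08495175 and 0.2416621) only up to small rounding differences, since Lucene computes in single precision. The sketch also shows why entry 4 ranks close to the others despite matching only 4 of 25 query terms: its fieldNorm of 0.125 is twice that of the other entries, which doubles every fieldWeight.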