Document (#36470)

Author
Cao, Y.
Duan, H.
Lin, C.-L.
Yu, Y.
Title
Re-ranking question search results by clustering questions
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.6, p.1177-1187
Year
2011
Abstract
In this article, we address the problem of question clustering and study its use for re-ranking question search results. In question clustering, we have to organize question search results into meaningful and condensed groups. Specifically, we propose to use a data structure consisting of question topic and question focus for modeling questions, and then cluster questions on the basis of that data structure. Experimental results show that our approach to question clustering improves the performance of question search significantly more than an approach that does not use the topic-focus structure.

Similar documents (content)

  1. Spink, A.; Ozmutlu, H.C.: Characteristics of question format web queries : an exploratory study (2002) 0.32
    0.319408 = sum of:
      0.319408 = product of:
        1.1407429 = sum of:
          0.012796434 = weight(abstract_txt:data in 4910) [ClassicSimilarity], result of:
            0.012796434 = score(doc=4910,freq=1.0), product of:
              0.0614802 = queryWeight, product of:
                1.2953408 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.014252058 = queryNorm
              0.20813909 = fieldWeight in 4910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.044721544 = weight(abstract_txt:topic in 4910) [ClassicSimilarity], result of:
            0.044721544 = score(doc=4910,freq=1.0), product of:
              0.14158607 = queryWeight, product of:
                1.9657426 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.014252058 = queryNorm
              0.3158612 = fieldWeight in 4910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.06083426 = weight(abstract_txt:structure in 4910) [ClassicSimilarity], result of:
            0.06083426 = score(doc=4910,freq=2.0), product of:
              0.157929 = queryWeight, product of:
                2.5426874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014252058 = queryNorm
              0.38520005 = fieldWeight in 4910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.02916947 = weight(abstract_txt:results in 4910) [ClassicSimilarity], result of:
            0.02916947 = score(doc=4910,freq=1.0), product of:
              0.13416472 = queryWeight, product of:
                2.706142 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.014252058 = queryNorm
              0.21741535 = fieldWeight in 4910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.08284999 = weight(abstract_txt:search in 4910) [ClassicSimilarity], result of:
            0.08284999 = score(doc=4910,freq=6.0), product of:
              0.14808026 = queryWeight, product of:
                2.8430204 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.014252058 = queryNorm
              0.5594938 = fieldWeight in 4910, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.060491953 = weight(abstract_txt:questions in 4910) [ClassicSimilarity], result of:
            0.060491953 = score(doc=4910,freq=1.0), product of:
              0.19823095 = queryWeight, product of:
                2.848707 = boost
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.014252058 = queryNorm
              0.30515897 = fieldWeight in 4910, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
          0.84987926 = weight(abstract_txt:question in 4910) [ClassicSimilarity], result of:
            0.84987926 = score(doc=4910,freq=15.0), product of:
              0.67497534 = queryWeight, product of:
                9.10472 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.014252058 = queryNorm
              1.2591264 = fieldWeight in 4910, product of:
                3.8729835 = tf(freq=15.0), with freq of:
                  15.0 = termFreq=15.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=4910)
        0.28 = coord(7/25)
    
  2. Chen, L.-C.: Next generation search engine for the result clustering technology (2012) 0.25
    0.24603654 = sum of:
      0.24603654 = product of:
        0.8787019 = sum of:
          0.03404159 = weight(abstract_txt:experimental in 1105) [ClassicSimilarity], result of:
            0.03404159 = score(doc=1105,freq=1.0), product of:
              0.08073575 = queryWeight, product of:
                1.0496254 = boost
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.014252058 = queryNorm
              0.4216421 = fieldWeight in 1105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.03580922 = weight(abstract_txt:significantly in 1105) [ClassicSimilarity], result of:
            0.03580922 = score(doc=1105,freq=1.0), product of:
              0.083506934 = queryWeight, product of:
                1.0674872 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.014252058 = queryNorm
              0.4288173 = fieldWeight in 1105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.0758814 = weight(abstract_txt:organize in 1105) [ClassicSimilarity], result of:
            0.0758814 = score(doc=1105,freq=2.0), product of:
              0.10934683 = queryWeight, product of:
                1.221531 = boost
                6.2809324 = idf(docFreq=225, maxDocs=44421)
                0.014252058 = queryNorm
              0.69395155 = fieldWeight in 1105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2809324 = idf(docFreq=225, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.07604282 = weight(abstract_txt:structure in 1105) [ClassicSimilarity], result of:
            0.07604282 = score(doc=1105,freq=2.0), product of:
              0.157929 = queryWeight, product of:
                2.5426874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014252058 = queryNorm
              0.48150006 = fieldWeight in 1105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.072923675 = weight(abstract_txt:results in 1105) [ClassicSimilarity], result of:
            0.072923675 = score(doc=1105,freq=4.0), product of:
              0.13416472 = queryWeight, product of:
                2.706142 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.014252058 = queryNorm
              0.5435384 = fieldWeight in 1105, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.07322973 = weight(abstract_txt:search in 1105) [ClassicSimilarity], result of:
            0.07322973 = score(doc=1105,freq=3.0), product of:
              0.14808026 = queryWeight, product of:
                2.8430204 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.014252058 = queryNorm
              0.49452728 = fieldWeight in 1105, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
          0.5107735 = weight(abstract_txt:clustering in 1105) [ClassicSimilarity], result of:
            0.5107735 = score(doc=1105,freq=6.0), product of:
              0.42905647 = queryWeight, product of:
                4.839368 = boost
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.014252058 = queryNorm
              1.1904575 = fieldWeight in 1105, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.2208285 = idf(docFreq=239, maxDocs=44421)
                0.078125 = fieldNorm(doc=1105)
        0.28 = coord(7/25)
    
  3. Le, L.T.; Shah, C.: Retrieving people : identifying potential answerers in Community Question-Answering (2018) 0.22
    0.21942578 = sum of:
      0.21942578 = product of:
        0.7836635 = sum of:
          0.027233273 = weight(abstract_txt:experimental in 467) [ClassicSimilarity], result of:
            0.027233273 = score(doc=467,freq=1.0), product of:
              0.08073575 = queryWeight, product of:
                1.0496254 = boost
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.014252058 = queryNorm
              0.33731368 = fieldWeight in 467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.025656432 = weight(abstract_txt:approach in 467) [ClassicSimilarity], result of:
            0.025656432 = score(doc=467,freq=2.0), product of:
              0.07758841 = queryWeight, product of:
                1.4551736 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.014252058 = queryNorm
              0.33067352 = fieldWeight in 467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.044721544 = weight(abstract_txt:topic in 467) [ClassicSimilarity], result of:
            0.044721544 = score(doc=467,freq=1.0), product of:
              0.14158607 = queryWeight, product of:
                1.9657426 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.014252058 = queryNorm
              0.3158612 = fieldWeight in 467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.02916947 = weight(abstract_txt:results in 467) [ClassicSimilarity], result of:
            0.02916947 = score(doc=467,freq=1.0), product of:
              0.13416472 = queryWeight, product of:
                2.706142 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.014252058 = queryNorm
              0.21741535 = fieldWeight in 467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.033823363 = weight(abstract_txt:search in 467) [ClassicSimilarity], result of:
            0.033823363 = score(doc=467,freq=1.0), product of:
              0.14808026 = queryWeight, product of:
                2.8430204 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.014252058 = queryNorm
              0.22841237 = fieldWeight in 467, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.08554854 = weight(abstract_txt:questions in 467) [ClassicSimilarity], result of:
            0.08554854 = score(doc=467,freq=2.0), product of:
              0.19823095 = queryWeight, product of:
                2.848707 = boost
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.014252058 = queryNorm
              0.43155995 = fieldWeight in 467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
          0.5375109 = weight(abstract_txt:question in 467) [ClassicSimilarity], result of:
            0.5375109 = score(doc=467,freq=6.0), product of:
              0.67497534 = queryWeight, product of:
                9.10472 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.014252058 = queryNorm
              0.7963415 = fieldWeight in 467, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=467)
        0.28 = coord(7/25)
    
  4. Luo, Z.; Yu, Y.; Osborne, M.; Wang, T.: Structuring tweets for improving Twitter search (2015) 0.21
    0.213635 = sum of:
      0.213635 = product of:
        0.5340875 = sum of:
          0.027233273 = weight(abstract_txt:experimental in 3335) [ClassicSimilarity], result of:
            0.027233273 = score(doc=3335,freq=1.0), product of:
              0.08073575 = queryWeight, product of:
                1.0496254 = boost
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.014252058 = queryNorm
              0.33731368 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.028647374 = weight(abstract_txt:significantly in 3335) [ClassicSimilarity], result of:
            0.028647374 = score(doc=3335,freq=1.0), product of:
              0.083506934 = queryWeight, product of:
                1.0674872 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.014252058 = queryNorm
              0.34305385 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.037375666 = weight(abstract_txt:modeling in 3335) [ClassicSimilarity], result of:
            0.037375666 = score(doc=3335,freq=1.0), product of:
              0.09970691 = queryWeight, product of:
                1.1664444 = boost
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.014252058 = queryNorm
              0.3748553 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.997685 = idf(docFreq=299, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.052520767 = weight(abstract_txt:improves in 3335) [ClassicSimilarity], result of:
            0.052520767 = score(doc=3335,freq=1.0), product of:
              0.12508926 = queryWeight, product of:
                1.306506 = boost
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.014252058 = queryNorm
              0.41986632 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.018141838 = weight(abstract_txt:approach in 3335) [ClassicSimilarity], result of:
            0.018141838 = score(doc=3335,freq=1.0), product of:
              0.07758841 = queryWeight, product of:
                1.4551736 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.014252058 = queryNorm
              0.2338215 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.044721544 = weight(abstract_txt:topic in 3335) [ClassicSimilarity], result of:
            0.044721544 = score(doc=3335,freq=1.0), product of:
              0.14158607 = queryWeight, product of:
                1.9657426 = boost
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.014252058 = queryNorm
              0.3158612 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.053779 = idf(docFreq=770, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.04301632 = weight(abstract_txt:structure in 3335) [ClassicSimilarity], result of:
            0.04301632 = score(doc=3335,freq=1.0), product of:
              0.157929 = queryWeight, product of:
                2.5426874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014252058 = queryNorm
              0.27237758 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.02916947 = weight(abstract_txt:results in 3335) [ClassicSimilarity], result of:
            0.02916947 = score(doc=3335,freq=1.0), product of:
              0.13416472 = queryWeight, product of:
                2.706142 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.014252058 = queryNorm
              0.21741535 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.033823363 = weight(abstract_txt:search in 3335) [ClassicSimilarity], result of:
            0.033823363 = score(doc=3335,freq=1.0), product of:
              0.14808026 = queryWeight, product of:
                2.8430204 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.014252058 = queryNorm
              0.22841237 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
          0.21943788 = weight(abstract_txt:question in 3335) [ClassicSimilarity], result of:
            0.21943788 = score(doc=3335,freq=1.0), product of:
              0.67497534 = queryWeight, product of:
                9.10472 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.014252058 = queryNorm
              0.32510504 = fieldWeight in 3335, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=3335)
        0.4 = coord(10/25)
    
  5. Bae, K.; Ko, Y.: Improving question retrieval in community question answering service using dependency relations and question classification (2019) 0.21
    0.21038216 = sum of:
      0.21038216 = product of:
        0.7513649 = sum of:
          0.02355032 = weight(abstract_txt:propose in 412) [ClassicSimilarity], result of:
            0.02355032 = score(doc=412,freq=1.0), product of:
              0.07328198 = queryWeight, product of:
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.014252058 = queryNorm
              0.32136577 = fieldWeight in 412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1418524 = idf(docFreq=705, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.027233273 = weight(abstract_txt:experimental in 412) [ClassicSimilarity], result of:
            0.027233273 = score(doc=412,freq=1.0), product of:
              0.08073575 = queryWeight, product of:
                1.0496254 = boost
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.014252058 = queryNorm
              0.33731368 = fieldWeight in 412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397019 = idf(docFreq=546, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.028647374 = weight(abstract_txt:significantly in 412) [ClassicSimilarity], result of:
            0.028647374 = score(doc=412,freq=1.0), product of:
              0.083506934 = queryWeight, product of:
                1.0674872 = boost
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.014252058 = queryNorm
              0.34305385 = fieldWeight in 412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4888616 = idf(docFreq=498, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.018141838 = weight(abstract_txt:approach in 412) [ClassicSimilarity], result of:
            0.018141838 = score(doc=412,freq=1.0), product of:
              0.07758841 = queryWeight, product of:
                1.4551736 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.014252058 = queryNorm
              0.2338215 = fieldWeight in 412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.05833894 = weight(abstract_txt:results in 412) [ClassicSimilarity], result of:
            0.05833894 = score(doc=412,freq=4.0), product of:
              0.13416472 = queryWeight, product of:
                2.706142 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.014252058 = queryNorm
              0.4348307 = fieldWeight in 412, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.10477514 = weight(abstract_txt:questions in 412) [ClassicSimilarity], result of:
            0.10477514 = score(doc=412,freq=3.0), product of:
              0.19823095 = queryWeight, product of:
                2.848707 = boost
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.014252058 = queryNorm
              0.52855086 = fieldWeight in 412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.8825436 = idf(docFreq=914, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
          0.490678 = weight(abstract_txt:question in 412) [ClassicSimilarity], result of:
            0.490678 = score(doc=412,freq=5.0), product of:
              0.67497534 = queryWeight, product of:
                9.10472 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.014252058 = queryNorm
              0.72695696 = fieldWeight in 412, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=412)
        0.28 = coord(7/25)
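
The score trees above follow the usual Lucene ClassicSimilarity (TF-IDF) pattern: each term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. The following Python sketch recomputes the first entry (doc 4910) from the values shown in its tree. It is an illustration under stated assumptions, not the catalog's own code: the function names `idf`, `term_score` and `document_score` are hypothetical, tf is assumed to be sqrt(freq), and idf is assumed to be 1 + ln(maxDocs / (docFreq + 1)), which agrees with the printed numbers. Lucene works in 32-bit floats, so the last digits may differ slightly.

```python
import math

# Constants copied from the explain output above.
QUERY_NORM = 0.014252058
MAX_DOCS = 44421


def idf(doc_freq, max_docs=MAX_DOCS):
    """Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one term = queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, query_terms):
    """Sum of term scores, scaled by coord = matched / query_terms."""
    total = sum(term_score(freq, doc_freq, boost, field_norm)
                for (_term, freq, doc_freq, boost) in terms)
    return total * (matched / query_terms)


# (term, freq, docFreq, boost) as listed for doc 4910 in entry 1.
terms_4910 = [
    ("data", 1.0, 4320, 1.2953408),
    ("topic", 1.0, 770, 1.9657426),
    ("structure", 2.0, 1545, 2.5426874),
    ("results", 1.0, 3724, 2.706142),
    ("search", 6.0, 3123, 2.8430204),
    ("questions", 1.0, 914, 2.848707),
    ("question", 15.0, 664, 9.10472),
]

# fieldNorm(doc=4910) = 0.0625 and coord(7/25) are taken from the tree.
score_4910 = document_score(terms_4910, field_norm=0.0625,
                            matched=7, query_terms=25)
print(round(score_4910, 4))
```

The result should land close to the 0.319408 reported for entry 1. The same functions apply to the other entries by substituting their freq, docFreq, boost, fieldNorm and coord values; a term with no boost line in its tree (such as "propose" in entry 5) can be treated as boost 1.0.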