Document (#37093)

Author
Biskri, I.
Rompré, L.
Title
Using association rules for query reformulation
Source
Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
Imprint
Hershey, PA : IGI Publishing
Year
2012
Pages
pp. 291-303
Abstract
In this paper the authors will present research on the combination of two methods of data mining: text classification and maximal association rules. Text classification has been the focus of interest of many researchers for a long time. However, the results take the form of lists of words (classes) that people often do not know what to do with. The use of maximal association rules induced a number of advantages: (1) the detection of dependencies and correlations between the relevant units of information (words) of different classes, (2) the extraction of hidden knowledge, often relevant, from a large volume of data. The authors will show how this combination can improve the process of information retrieval.
Footnote
Cf.: http://www.igi-global.com/book/next-generation-search-engines/64430.
Theme
Data Mining
Retrieval algorithms

Similar documents (content)

  1. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.16
    0.15747583 = sum of:
      0.15747583 = product of:
        0.49211198 = sum of:
          0.043708187 = weight(abstract_txt:mining in 3765) [ClassicSimilarity], result of:
            0.043708187 = score(doc=3765,freq=1.0), product of:
              0.12949294 = queryWeight, product of:
                1.0979278 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019109251 = queryNorm
              0.33753335 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.01373181 = weight(abstract_txt:data in 3765) [ClassicSimilarity], result of:
            0.01373181 = score(doc=3765,freq=1.0), product of:
              0.075399086 = queryWeight, product of:
                1.1848102 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019109251 = queryNorm
              0.18212171 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.15289827 = weight(abstract_txt:detection in 3765) [ClassicSimilarity], result of:
            0.15289827 = score(doc=3765,freq=7.0), product of:
              0.15599354 = queryWeight, product of:
                1.2050471 = boost
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.019109251 = queryNorm
              0.98015773 = fieldWeight in 3765, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.13175328 = weight(abstract_txt:hidden in 3765) [ClassicSimilarity], result of:
            0.13175328 = score(doc=3765,freq=4.0), product of:
              0.17022572 = queryWeight, product of:
                1.2588191 = boost
                7.0764947 = idf(docFreq=101, maxDocs=44421)
                0.019109251 = queryNorm
              0.7739916 = fieldWeight in 3765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.0764947 = idf(docFreq=101, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.02365541 = weight(abstract_txt:classification in 3765) [ClassicSimilarity], result of:
            0.02365541 = score(doc=3765,freq=1.0), product of:
              0.1083514 = queryWeight, product of:
                1.4203095 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.019109251 = queryNorm
              0.21832122 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.04906428 = weight(abstract_txt:text in 3765) [ClassicSimilarity], result of:
            0.04906428 = score(doc=3765,freq=4.0), product of:
              0.11101232 = queryWeight, product of:
                1.4376439 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.019109251 = queryNorm
              0.44197148 = fieldWeight in 3765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.036899466 = weight(abstract_txt:relevant in 3765) [ClassicSimilarity], result of:
            0.036899466 = score(doc=3765,freq=1.0), product of:
              0.14573401 = queryWeight, product of:
                1.6471995 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.019109251 = queryNorm
              0.25319734 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
          0.040401243 = weight(abstract_txt:often in 3765) [ClassicSimilarity], result of:
            0.040401243 = score(doc=3765,freq=1.0), product of:
              0.15481417 = queryWeight, product of:
                1.6977396 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.019109251 = queryNorm
              0.26096606 = fieldWeight in 3765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3765)
        0.32 = coord(8/25)
    
  2. Vlachidis, A.; Tudhope, D.: A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.16
    0.15694053 = sum of:
      0.15694053 = product of:
        0.56050193 = sum of:
          0.10088076 = weight(abstract_txt:extraction in 3895) [ClassicSimilarity], result of:
            0.10088076 = score(doc=3895,freq=4.0), product of:
              0.13033523 = queryWeight, product of:
                1.1014928 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.019109251 = queryNorm
              0.7740099 = fieldWeight in 3895, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.09340293 = weight(abstract_txt:detection in 3895) [ClassicSimilarity], result of:
            0.09340293 = score(doc=3895,freq=2.0), product of:
              0.15599354 = queryWeight, product of:
                1.2050471 = boost
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.019109251 = queryNorm
              0.59876156 = fieldWeight in 3895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.774214 = idf(docFreq=137, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.02803673 = weight(abstract_txt:text in 3895) [ClassicSimilarity], result of:
            0.02803673 = score(doc=3895,freq=1.0), product of:
              0.11101232 = queryWeight, product of:
                1.4376439 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.019109251 = queryNorm
              0.25255513 = fieldWeight in 3895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.042170815 = weight(abstract_txt:relevant in 3895) [ClassicSimilarity], result of:
            0.042170815 = score(doc=3895,freq=1.0), product of:
              0.14573401 = queryWeight, product of:
                1.6471995 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.019109251 = queryNorm
              0.2893684 = fieldWeight in 3895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.08239037 = weight(abstract_txt:combination in 3895) [ClassicSimilarity], result of:
            0.08239037 = score(doc=3895,freq=1.0), product of:
              0.22775638 = queryWeight, product of:
                2.0592117 = boost
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.019109251 = queryNorm
              0.3617478 = fieldWeight in 3895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.084048554 = weight(abstract_txt:classes in 3895) [ClassicSimilarity], result of:
            0.084048554 = score(doc=3895,freq=1.0), product of:
              0.2308021 = queryWeight, product of:
                2.0729346 = boost
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.019109251 = queryNorm
              0.36415854 = fieldWeight in 3895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8265367 = idf(docFreq=355, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
          0.12957181 = weight(abstract_txt:rules in 3895) [ClassicSimilarity], result of:
            0.12957181 = score(doc=3895,freq=2.0), product of:
              0.2798425 = queryWeight, product of:
                2.795556 = boost
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.019109251 = queryNorm
              0.4630169 = fieldWeight in 3895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.0625 = fieldNorm(doc=3895)
        0.28 = coord(7/25)
    
  3. Principles of data mining and knowledge discovery (1998) 0.15
    0.15421142 = sum of:
      0.15421142 = product of:
        0.6425476 = sum of:
          0.083908 = weight(abstract_txt:volume in 4822) [ClassicSimilarity], result of:
            0.083908 = score(doc=4822,freq=1.0), product of:
              0.12600462 = queryWeight, product of:
                1.0830387 = boost
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.019109251 = queryNorm
              0.6659121 = fieldWeight in 4822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
          0.12362542 = weight(abstract_txt:mining in 4822) [ClassicSimilarity], result of:
            0.12362542 = score(doc=4822,freq=2.0), product of:
              0.12949294 = queryWeight, product of:
                1.0979278 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019109251 = queryNorm
              0.9546885 = fieldWeight in 4822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
          0.02746362 = weight(abstract_txt:data in 4822) [ClassicSimilarity], result of:
            0.02746362 = score(doc=4822,freq=1.0), product of:
              0.075399086 = queryWeight, product of:
                1.1848102 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019109251 = queryNorm
              0.36424342 = fieldWeight in 4822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
          0.04906428 = weight(abstract_txt:text in 4822) [ClassicSimilarity], result of:
            0.04906428 = score(doc=4822,freq=1.0), product of:
              0.11101232 = queryWeight, product of:
                1.4376439 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.019109251 = queryNorm
              0.44197148 = fieldWeight in 4822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
          0.16033693 = weight(abstract_txt:rules in 4822) [ClassicSimilarity], result of:
            0.16033693 = score(doc=4822,freq=1.0), product of:
              0.2798425 = queryWeight, product of:
                2.795556 = boost
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.019109251 = queryNorm
              0.5729542 = fieldWeight in 4822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
          0.19814938 = weight(abstract_txt:association in 4822) [ClassicSimilarity], result of:
            0.19814938 = score(doc=4822,freq=1.0), product of:
              0.32226995 = queryWeight, product of:
                3.0 = boost
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.019109251 = queryNorm
              0.6148553 = fieldWeight in 4822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.109375 = fieldNorm(doc=4822)
        0.24 = coord(6/25)
    
  4. Cui, H.; Heidorn, P.B.: The reusability of induced knowledge for the automatic semantic markup of taxonomic descriptions (2007) 0.15
    0.15060516 = sum of:
      0.15060516 = product of:
        0.6275215 = sum of:
          0.015693497 = weight(abstract_txt:data in 1084) [ClassicSimilarity], result of:
            0.015693497 = score(doc=1084,freq=1.0), product of:
              0.075399086 = queryWeight, product of:
                1.1848102 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.019109251 = queryNorm
              0.20813909 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
          0.16596374 = weight(abstract_txt:induced in 1084) [ClassicSimilarity], result of:
            0.16596374 = score(doc=1084,freq=2.0), product of:
              0.22884516 = queryWeight, product of:
                1.4595588 = boost
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.019109251 = queryNorm
              0.7252228 = fieldWeight in 1084, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
          0.07377319 = weight(abstract_txt:authors in 1084) [ClassicSimilarity], result of:
            0.07377319 = score(doc=1084,freq=3.0), product of:
              0.14670499 = queryWeight, product of:
                1.6526778 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.019109251 = queryNorm
              0.50286764 = fieldWeight in 1084, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
          0.08239037 = weight(abstract_txt:combination in 1084) [ClassicSimilarity], result of:
            0.08239037 = score(doc=1084,freq=1.0), product of:
              0.22775638 = queryWeight, product of:
                2.0592117 = boost
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.019109251 = queryNorm
              0.3617478 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
          0.12957181 = weight(abstract_txt:rules in 1084) [ClassicSimilarity], result of:
            0.12957181 = score(doc=1084,freq=2.0), product of:
              0.2798425 = queryWeight, product of:
                2.795556 = boost
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.019109251 = queryNorm
              0.4630169 = fieldWeight in 1084, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.238438 = idf(docFreq=640, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
          0.16012889 = weight(abstract_txt:association in 1084) [ClassicSimilarity], result of:
            0.16012889 = score(doc=1084,freq=2.0), product of:
              0.32226995 = queryWeight, product of:
                3.0 = boost
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.019109251 = queryNorm
              0.49687812 = fieldWeight in 1084, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.0625 = fieldNorm(doc=1084)
        0.24 = coord(6/25)
    
  5. Dumais, S.T.: Latent semantic analysis (2003) 0.14
    0.14182068 = sum of:
      0.14182068 = product of:
        0.39394635 = sum of:
          0.024976106 = weight(abstract_txt:mining in 3462) [ClassicSimilarity], result of:
            0.024976106 = score(doc=3462,freq=1.0), product of:
              0.12949294 = queryWeight, product of:
                1.0979278 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.019109251 = queryNorm
              0.1928762 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.024467139 = weight(abstract_txt:will in 3462) [ClassicSimilarity], result of:
            0.024467139 = score(doc=3462,freq=4.0), product of:
              0.10137753 = queryWeight, product of:
                1.3738414 = boost
                3.8615482 = idf(docFreq=2539, maxDocs=44421)
                0.019109251 = queryNorm
              0.24134676 = fieldWeight in 3462, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8615482 = idf(docFreq=2539, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.034337845 = weight(abstract_txt:text in 3462) [ClassicSimilarity], result of:
            0.034337845 = score(doc=3462,freq=6.0), product of:
              0.11101232 = queryWeight, product of:
                1.4376439 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.019109251 = queryNorm
              0.30931562 = fieldWeight in 3462, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.036520995 = weight(abstract_txt:relevant in 3462) [ClassicSimilarity], result of:
            0.036520995 = score(doc=3462,freq=3.0), product of:
              0.14573401 = queryWeight, product of:
                1.6471995 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.019109251 = queryNorm
              0.25060037 = fieldWeight in 3462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.030117778 = weight(abstract_txt:authors in 3462) [ClassicSimilarity], result of:
            0.030117778 = score(doc=3462,freq=2.0), product of:
              0.14670499 = queryWeight, product of:
                1.6526778 = boost
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.019109251 = queryNorm
              0.20529485 = fieldWeight in 3462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6452923 = idf(docFreq=1159, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.032649133 = weight(abstract_txt:often in 3462) [ClassicSimilarity], result of:
            0.032649133 = score(doc=3462,freq=2.0), product of:
              0.15481417 = queryWeight, product of:
                1.6977396 = boost
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.019109251 = queryNorm
              0.21089241 = fieldWeight in 3462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7719507 = idf(docFreq=1021, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.11306808 = weight(abstract_txt:words in 3462) [ClassicSimilarity], result of:
            0.11306808 = score(doc=3462,freq=12.0), product of:
              0.19501702 = queryWeight, product of:
                1.9054695 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.019109251 = queryNorm
              0.5797857 = fieldWeight in 3462, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.041195184 = weight(abstract_txt:combination in 3462) [ClassicSimilarity], result of:
            0.041195184 = score(doc=3462,freq=1.0), product of:
              0.22775638 = queryWeight, product of:
                2.0592117 = boost
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.019109251 = queryNorm
              0.1808739 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.787965 = idf(docFreq=369, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
          0.056614112 = weight(abstract_txt:association in 3462) [ClassicSimilarity], result of:
            0.056614112 = score(doc=3462,freq=1.0), product of:
              0.32226995 = queryWeight, product of:
                3.0 = boost
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.019109251 = queryNorm
              0.17567295 = fieldWeight in 3462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6215343 = idf(docFreq=436, maxDocs=44421)
                0.03125 = fieldNorm(doc=3462)
        0.36 = coord(9/25)
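
The score trees above can be recomputed by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas that the listing's own numbers follow: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord (matched query terms / total query terms). The function and variable names are hypothetical and not part of any library; the input values are copied from the entry for document 3765. Lucene computes in 32-bit floats, so the results should agree with the listing only to several decimal places.

```python
import math

# Values copied from the listing above (shared by every term).
MAX_DOCS = 44421
QUERY_NORM = 0.019109251


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """Sum of the term scores, scaled by coord = matched / total."""
    raw = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    return raw * matched / total


# (freq, docFreq, boost) for the eight terms matched in document 3765.
DOC_3765_TERMS = [
    (1.0, 251, 1.0979278),   # mining
    (1.0, 4320, 1.1848102),  # data
    (7.0, 137, 1.2050471),   # detection
    (4.0, 101, 1.2588191),   # hidden
    (1.0, 2228, 1.4203095),  # classification
    (4.0, 2122, 1.4376439),  # text
    (1.0, 1177, 1.6471995),  # relevant
    (1.0, 1021, 1.6977396),  # often
]

# Single term: "detection" in document 3765 (listing shows 0.15289827).
detection_score = term_score(7.0, 137, 1.2050471, 0.0546875)

# Whole document: coord(8/25) applied to the sum (listing shows 0.15747583).
doc_3765_score = document_score(DOC_3765_TERMS, 0.0546875, matched=8, total=25)

print(f"detection: {detection_score:.6f}")
print(f"doc 3765:  {doc_3765_score:.6f}")
```

The coord factor explains why document 5, which matches nine of the 25 query terms, can still rank below document 1: its individual term scores are small because its fieldNorm (0.03125) is low, which indicates a longer abstract field.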