Document (#25061)

Larouk, O.
Modelling users need : schemas of interrogation and filtering of answers from the WEB in co-operative mode
Structures and relations in knowledge organization: Proceedings of the 5th International ISKO-Conference, Lille, 25.-29.8.1998. Ed.: W. Mustafa el Hadi et al.
Würzburg : Ergon
Advances in knowledge organization; vol.6
Textual analysis is part of information processing systems. Access to digital data through Web servers is facilitated by search engines. Following a request, the user is presented with a long list of Web page references. Selecting the relevant documents efficiently is very difficult because precision in the list is low. Generally, the user visits the first page referenced in the list but does not consult the hundredth. As it is difficult to assess the pertinence of all the references obtained, the searcher needs tools to filter the list. The aim of the present paper is to suggest a filtering method based on URL addresses, titles and abstracts. This filtering enables the searcher to build a set of pages and so improve on the initial search formulation. The process falls within the scope of modelling the user's profile as a means of improving access to more relevant information. It uses classification algorithms to extract the more relevant 'terms' in titles and abstracts from the texts that the user interactively accepts or rejects during filtering. The problem of searching for information in texts is mainly linguistic. The objective is to construct an automatic indexing system using the Noun Phrase (NP) model. The intensional predicate/NP instances are built from the retrieval, navigation and filtering of the references captured from the Web. The questions now posed are: Can they play the role of descriptors in textual databases? How should they be organized in a documentary indexing system for future information retrieval?

Similar documents (content)

  1. García Cumbreras, M.A.; Perea-Ortega, J.M.; García Vega, M.; Ureña López, L.A.: Information retrieval with geographical references : relevant documents filtering vs. query expansion (2009) 0.18
    0.17652892 = sum of:
      0.17652892 = product of:
        0.88264453 = sum of:
          0.018191898 = weight(abstract_txt:information in 222) [ClassicSimilarity], result of:
            0.018191898 = score(doc=222,freq=2.0), product of:
              0.056724925 = queryWeight, product of:
                1.1688051 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020063838 = queryNorm
              0.3207038 = fieldWeight in 222, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.09375 = fieldNorm(doc=222)
          0.05536606 = weight(abstract_txt:improve in 222) [ClassicSimilarity], result of:
            0.05536606 = score(doc=222,freq=1.0), product of:
              0.11912905 = queryWeight, product of:
                1.1977025 = boost
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.020063838 = queryNorm
              0.46475703 = fieldWeight in 222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.09375 = fieldNorm(doc=222)
          0.11717761 = weight(abstract_txt:relevant in 222) [ClassicSimilarity], result of:
            0.11717761 = score(doc=222,freq=3.0), product of:
              0.15586251 = queryWeight, product of:
                1.6778634 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.020063838 = queryNorm
              0.75180113 = fieldWeight in 222, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.09375 = fieldNorm(doc=222)
          0.1423021 = weight(abstract_txt:list in 222) [ClassicSimilarity], result of:
            0.1423021 = score(doc=222,freq=1.0), product of:
              0.28162602 = queryWeight, product of:
                2.604303 = boost
                5.389733 = idf(docFreq=550, maxDocs=44421)
                0.020063838 = queryNorm
              0.50528747 = fieldWeight in 222, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.389733 = idf(docFreq=550, maxDocs=44421)
                0.09375 = fieldNorm(doc=222)
          0.54960686 = weight(abstract_txt:filtering in 222) [ClassicSimilarity], result of:
            0.54960686 = score(doc=222,freq=3.0), product of:
              0.51780105 = queryWeight, product of:
                3.9481313 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.020063838 = queryNorm
              1.0614247 = fieldWeight in 222, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.09375 = fieldNorm(doc=222)
        0.2 = coord(5/25)
  2. Mukhopadhyay, S.; Peng, S.; Raje, R.; Mostafa, J.; Palakal, M.: Distributed multi-agent information filtering : a comparative study (2005) 0.15
    0.15454431 = sum of:
      0.15454431 = product of:
        0.7727215 = sum of:
          0.011935292 = weight(abstract_txt:from in 4559) [ClassicSimilarity], result of:
            0.011935292 = score(doc=4559,freq=1.0), product of:
              0.055364132 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.020063838 = queryNorm
              0.21557805 = fieldWeight in 4559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=4559)
          0.028361605 = weight(abstract_txt:information in 4559) [ClassicSimilarity], result of:
            0.028361605 = score(doc=4559,freq=7.0), product of:
              0.056724925 = queryWeight, product of:
                1.1688051 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020063838 = queryNorm
              0.4999849 = fieldWeight in 4559, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=4559)
          0.028329603 = weight(abstract_txt:user in 4559) [ClassicSimilarity], result of:
            0.028329603 = score(doc=4559,freq=1.0), product of:
              0.0985145 = queryWeight, product of:
                1.3339386 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.020063838 = queryNorm
              0.28756785 = fieldWeight in 4559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.078125 = fieldNorm(doc=4559)
          0.056377105 = weight(abstract_txt:relevant in 4559) [ClassicSimilarity], result of:
            0.056377105 = score(doc=4559,freq=1.0), product of:
              0.15586251 = queryWeight, product of:
                1.6778634 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.020063838 = queryNorm
              0.3617105 = fieldWeight in 4559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.078125 = fieldNorm(doc=4559)
          0.64771795 = weight(abstract_txt:filtering in 4559) [ClassicSimilarity], result of:
            0.64771795 = score(doc=4559,freq=6.0), product of:
              0.51780105 = queryWeight, product of:
                3.9481313 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.020063838 = queryNorm
              1.2509012 = fieldWeight in 4559, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=4559)
        0.2 = coord(5/25)
  3. Furner, J.: On Recommending (2002) 0.14
    0.14211272 = sum of:
      0.14211272 = product of:
        0.5075454 = sum of:
          0.009548233 = weight(abstract_txt:from in 243) [ClassicSimilarity], result of:
            0.009548233 = score(doc=243,freq=1.0), product of:
              0.055364132 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.020063838 = queryNorm
              0.17246243 = fieldWeight in 243, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.012127932 = weight(abstract_txt:information in 243) [ClassicSimilarity], result of:
            0.012127932 = score(doc=243,freq=2.0), product of:
              0.056724925 = queryWeight, product of:
                1.1688051 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020063838 = queryNorm
              0.21380253 = fieldWeight in 243, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.039254647 = weight(abstract_txt:user in 243) [ClassicSimilarity], result of:
            0.039254647 = score(doc=243,freq=3.0), product of:
              0.0985145 = queryWeight, product of:
                1.3339386 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.020063838 = queryNorm
              0.3984657 = fieldWeight in 243, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.09510106 = weight(abstract_txt:searcher in 243) [ClassicSimilarity], result of:
            0.09510106 = score(doc=243,freq=1.0), product of:
              0.22389255 = queryWeight, product of:
                1.6419501 = boost
                6.7961926 = idf(docFreq=134, maxDocs=44421)
                0.020063838 = queryNorm
              0.42476204 = fieldWeight in 243, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7961926 = idf(docFreq=134, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.045101684 = weight(abstract_txt:relevant in 243) [ClassicSimilarity], result of:
            0.045101684 = score(doc=243,freq=1.0), product of:
              0.15586251 = queryWeight, product of:
                1.6778634 = boost
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.020063838 = queryNorm
              0.2893684 = fieldWeight in 243, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6298943 = idf(docFreq=1177, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.094868064 = weight(abstract_txt:list in 243) [ClassicSimilarity], result of:
            0.094868064 = score(doc=243,freq=1.0), product of:
              0.28162602 = queryWeight, product of:
                2.604303 = boost
                5.389733 = idf(docFreq=550, maxDocs=44421)
                0.020063838 = queryNorm
              0.3368583 = fieldWeight in 243, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.389733 = idf(docFreq=550, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
          0.21154378 = weight(abstract_txt:filtering in 243) [ClassicSimilarity], result of:
            0.21154378 = score(doc=243,freq=1.0), product of:
              0.51780105 = queryWeight, product of:
                3.9481313 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.020063838 = queryNorm
              0.4085426 = fieldWeight in 243, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.0625 = fieldNorm(doc=243)
        0.28 = coord(7/25)
  4. Elovici, Y.; Shapira, Y.B.; Kantor, P.B.: A decision theoretic approach to combining information filters : an analytical and empirical evaluation. (2006) 0.14
    0.13793053 = sum of:
      0.13793053 = product of:
        0.6896526 = sum of:
          0.014322349 = weight(abstract_txt:from in 267) [ClassicSimilarity], result of:
            0.014322349 = score(doc=267,freq=1.0), product of:
              0.055364132 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.020063838 = queryNorm
              0.25869364 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.09375 = fieldNorm(doc=267)
          0.022280432 = weight(abstract_txt:information in 267) [ClassicSimilarity], result of:
            0.022280432 = score(doc=267,freq=3.0), product of:
              0.056724925 = queryWeight, product of:
                1.1688051 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020063838 = queryNorm
              0.3927803 = fieldWeight in 267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.09375 = fieldNorm(doc=267)
          0.05536606 = weight(abstract_txt:improve in 267) [ClassicSimilarity], result of:
            0.05536606 = score(doc=267,freq=1.0), product of:
              0.11912905 = queryWeight, product of:
                1.1977025 = boost
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.020063838 = queryNorm
              0.46475703 = fieldWeight in 267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.09375 = fieldNorm(doc=267)
          0.048076928 = weight(abstract_txt:user in 267) [ClassicSimilarity], result of:
            0.048076928 = score(doc=267,freq=2.0), product of:
              0.0985145 = queryWeight, product of:
                1.3339386 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.020063838 = queryNorm
              0.4880188 = fieldWeight in 267, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.09375 = fieldNorm(doc=267)
          0.54960686 = weight(abstract_txt:filtering in 267) [ClassicSimilarity], result of:
            0.54960686 = score(doc=267,freq=3.0), product of:
              0.51780105 = queryWeight, product of:
                3.9481313 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.020063838 = queryNorm
              1.0614247 = fieldWeight in 267, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.09375 = fieldNorm(doc=267)
        0.2 = coord(5/25)
  5. Orrico, E.G.D.: Metaphorical representations of the thematic identity of social Groups in the assistance of information retrieval (2003) 0.13
    0.12982684 = sum of:
      0.12982684 = product of:
        0.5409452 = sum of:
          0.011935292 = weight(abstract_txt:from in 3776) [ClassicSimilarity], result of:
            0.011935292 = score(doc=3776,freq=1.0), product of:
              0.055364132 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.020063838 = queryNorm
              0.21557805 = fieldWeight in 3776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
          0.01856703 = weight(abstract_txt:information in 3776) [ClassicSimilarity], result of:
            0.01856703 = score(doc=3776,freq=3.0), product of:
              0.056724925 = queryWeight, product of:
                1.1688051 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.020063838 = queryNorm
              0.32731694 = fieldWeight in 3776, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
          0.046138387 = weight(abstract_txt:improve in 3776) [ClassicSimilarity], result of:
            0.046138387 = score(doc=3776,freq=1.0), product of:
              0.11912905 = queryWeight, product of:
                1.1977025 = boost
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.020063838 = queryNorm
              0.38729754 = fieldWeight in 3776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9574084 = idf(docFreq=848, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
          0.062014777 = weight(abstract_txt:difficult in 3776) [ClassicSimilarity], result of:
            0.062014777 = score(doc=3776,freq=1.0), product of:
              0.14509068 = queryWeight, product of:
                1.3217822 = boost
                5.4709864 = idf(docFreq=507, maxDocs=44421)
                0.020063838 = queryNorm
              0.4274208 = fieldWeight in 3776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4709864 = idf(docFreq=507, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
          0.028329603 = weight(abstract_txt:user in 3776) [ClassicSimilarity], result of:
            0.028329603 = score(doc=3776,freq=1.0), product of:
              0.0985145 = queryWeight, product of:
                1.3339386 = boost
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.020063838 = queryNorm
              0.28756785 = fieldWeight in 3776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6808684 = idf(docFreq=3042, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
          0.3739601 = weight(abstract_txt:filtering in 3776) [ClassicSimilarity], result of:
            0.3739601 = score(doc=3776,freq=2.0), product of:
              0.51780105 = queryWeight, product of:
                3.9481313 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.020063838 = queryNorm
              0.7222081 = fieldWeight in 3776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=3776)
        0.24 = coord(6/25)
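
The score trees above all follow the same arithmetic, which can be checked by hand. The sketch below is a minimal reconstruction, assuming the classic TF-IDF formulas that the listed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord. The function and variable names are illustrative only and are not part of any library; queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44421          # maxDocs, copied from the explain output
QUERY_NORM = 0.020063838  # queryNorm, copied from the explain output (not derived here)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM  # boost is 1.0 when the tree lists none
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total_clauses):
    """Sum the per-term scores, then apply coord = matched / total_clauses."""
    raw = sum(term_score(freq, doc_freq, field_norm, boost)
              for boost, doc_freq, freq in terms)
    return raw * matched / total_clauses


# (boost, docFreq, termFreq) for result 1 (doc=222), copied from the listing
DOC_222_TERMS = [
    (1.1688051, 10748, 2),  # information
    (1.1977025, 848, 1),    # improve
    (1.6778634, 1177, 3),   # relevant
    (2.604303, 550, 1),     # list
    (3.9481313, 174, 3),    # filtering
]

score_222 = document_score(DOC_222_TERMS, field_norm=0.09375,
                           matched=5, total_clauses=25)
print(score_222)  # should agree with the listed 0.17652892 up to float rounding
```

The listing reports single-precision values, so a recomputation in double precision can differ in the last printed digits. The tree also shows why "filtering" dominates every result: it combines the highest boost with the rarest term (docFreq=174), and idf enters the product twice, once in queryWeight and once in fieldWeight.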