Document (#37100)

Author
Djioua, B.
Desclés, J.-P.
Alrahabi, M.
Title
Searching and mining with semantic categories
Source
Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
Imprint
Hershey, PA : IGI Publishing
Year
2012
Pages
pp. 115-137
Abstract
A new model is proposed for retrieving information by automatically building a semantic metatext structure for texts, which allows searching for and extracting discourse and semantic information according to certain linguistic categorizations. This paper presents approaches for searching and mining full text with semantic categories. The model is built from two engines. The first, called EXCOM (Djioua et al., 2006; Alrahabi, 2010), is an automatic system for text annotation related to discourse and semantic maps, which are specifications of general linguistic ontologies founded on the Applicative and Cognitive Grammar. The annotation layer uses a linguistic method called Contextual Exploration, which handles the polysemic values of a term in texts. Several 'semantic maps' underlying 'points of view' for text mining guide this automatic annotation process. The second engine uses the semantically annotated texts produced previously to create a semantic inverted index, which can retrieve relevant documents for queries associated with discourse and semantic categories such as definition, quotation, causality, relations between concepts, etc. (Djioua & Desclés, 2007). This semantic indexation process builds a metatext layer for textual contents. Some data and linguistic rule sets, as well as the general architecture that extends third-party software, are provided as supplementary information.
Footnote
Cf.: http://www.igi-global.com/book/next-generation-search-engines/64423.
Theme
Semantic Web
Knowledge representation

Similar documents (content)

  1. Rindflesch, T.C.; Fiszman, M.: The interaction of domain knowledge and linguistic structure in natural language processing : interpreting hypernymic propositions in biomedical text (2003) 0.23
    0.2253477 = sum of:
      0.2253477 = product of:
        0.7042116 = sum of:
          0.018302217 = weight(abstract_txt:process in 3097) [ClassicSimilarity], result of:
            0.018302217 = score(doc=3097,freq=1.0), product of:
              0.072324306 = queryWeight, product of:
                1.0137644 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01762008 = queryNorm
              0.25305763 = fieldWeight in 3097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.031697553 = weight(abstract_txt:general in 3097) [ClassicSimilarity], result of:
            0.031697553 = score(doc=3097,freq=2.0), product of:
              0.08278576 = queryWeight, product of:
                1.0846077 = boost
                4.3318667 = idf(docFreq=1586, maxDocs=44421)
                0.01762008 = queryNorm
              0.38288653 = fieldWeight in 3097, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3318667 = idf(docFreq=1586, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.020463558 = weight(abstract_txt:which in 3097) [ClassicSimilarity], result of:
            0.020463558 = score(doc=3097,freq=4.0), product of:
              0.056184046 = queryWeight, product of:
                1.0943267 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.01762008 = queryNorm
              0.36422366 = fieldWeight in 3097, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.03867351 = weight(abstract_txt:automatic in 3097) [ClassicSimilarity], result of:
            0.03867351 = score(doc=3097,freq=1.0), product of:
              0.119094275 = queryWeight, product of:
                1.3008893 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01762008 = queryNorm
              0.32473022 = fieldWeight in 3097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.03859404 = weight(abstract_txt:text in 3097) [ClassicSimilarity], result of:
            0.03859404 = score(doc=3097,freq=2.0), product of:
              0.108056046 = queryWeight, product of:
                1.517627 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01762008 = queryNorm
              0.3571669 = fieldWeight in 3097, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.10078903 = weight(abstract_txt:discourse in 3097) [ClassicSimilarity], result of:
            0.10078903 = score(doc=3097,freq=1.0), product of:
              0.2581791 = queryWeight, product of:
                2.3458543 = boost
                6.2461467 = idf(docFreq=233, maxDocs=44421)
                0.01762008 = queryNorm
              0.39038417 = fieldWeight in 3097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2461467 = idf(docFreq=233, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.1531579 = weight(abstract_txt:linguistic in 3097) [ClassicSimilarity], result of:
            0.1531579 = score(doc=3097,freq=2.0), product of:
              0.2981088 = queryWeight, product of:
                2.910699 = boost
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.01762008 = queryNorm
              0.51376516 = fieldWeight in 3097, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
          0.30253378 = weight(abstract_txt:semantic in 3097) [ClassicSimilarity], result of:
            0.30253378 = score(doc=3097,freq=6.0), product of:
              0.44164228 = queryWeight, product of:
                5.60164 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.01762008 = queryNorm
              0.68501997 = fieldWeight in 3097, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0625 = fieldNorm(doc=3097)
        0.32 = coord(8/25)
    
  2. Sembok, T.M.T.; Rijsbergen, C.J. van: SILOL: a simple logical-linguistic document retrieval system (1990) 0.20
    0.19569528 = sum of:
      0.19569528 = product of:
        0.6989117 = sum of:
          0.027453328 = weight(abstract_txt:process in 6683) [ClassicSimilarity], result of:
            0.027453328 = score(doc=6683,freq=1.0), product of:
              0.072324306 = queryWeight, product of:
                1.0137644 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01762008 = queryNorm
              0.37958646 = fieldWeight in 6683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.021704882 = weight(abstract_txt:which in 6683) [ClassicSimilarity], result of:
            0.021704882 = score(doc=6683,freq=2.0), product of:
              0.056184046 = queryWeight, product of:
                1.0943267 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.01762008 = queryNorm
              0.38631755 = fieldWeight in 6683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.052578293 = weight(abstract_txt:uses in 6683) [ClassicSimilarity], result of:
            0.052578293 = score(doc=6683,freq=1.0), product of:
              0.111538626 = queryWeight, product of:
                1.2589473 = boost
                5.0281696 = idf(docFreq=790, maxDocs=44421)
                0.01762008 = queryNorm
              0.4713909 = fieldWeight in 6683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0281696 = idf(docFreq=790, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.061559327 = weight(abstract_txt:called in 6683) [ClassicSimilarity], result of:
            0.061559327 = score(doc=6683,freq=1.0), product of:
              0.1239035 = queryWeight, product of:
                1.3268954 = boost
                5.2995505 = idf(docFreq=602, maxDocs=44421)
                0.01762008 = queryNorm
              0.49683285 = fieldWeight in 6683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2995505 = idf(docFreq=602, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.11116546 = weight(abstract_txt:texts in 6683) [ClassicSimilarity], result of:
            0.11116546 = score(doc=6683,freq=1.0), product of:
              0.21032842 = queryWeight, product of:
                2.1173346 = boost
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.01762008 = queryNorm
              0.52853274 = fieldWeight in 6683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.16244851 = weight(abstract_txt:linguistic in 6683) [ClassicSimilarity], result of:
            0.16244851 = score(doc=6683,freq=1.0), product of:
              0.2981088 = queryWeight, product of:
                2.910699 = boost
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.01762008 = queryNorm
              0.5449303 = fieldWeight in 6683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
          0.26200193 = weight(abstract_txt:semantic in 6683) [ClassicSimilarity], result of:
            0.26200193 = score(doc=6683,freq=2.0), product of:
              0.44164228 = queryWeight, product of:
                5.60164 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.01762008 = queryNorm
              0.5932447 = fieldWeight in 6683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.09375 = fieldNorm(doc=6683)
        0.28 = coord(7/25)
    
  3. Park, J.-r.: Evolution of concept networks and implications for knowledge representation (2007) 0.19
    0.18760341 = sum of:
      0.18760341 = product of:
        0.7816809 = sum of:
          0.018302217 = weight(abstract_txt:process in 1847) [ClassicSimilarity], result of:
            0.018302217 = score(doc=1847,freq=1.0), product of:
              0.072324306 = queryWeight, product of:
                1.0137644 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01762008 = queryNorm
              0.25305763 = fieldWeight in 1847, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
          0.010231779 = weight(abstract_txt:which in 1847) [ClassicSimilarity], result of:
            0.010231779 = score(doc=1847,freq=1.0), product of:
              0.056184046 = queryWeight, product of:
                1.0943267 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.01762008 = queryNorm
              0.18211183 = fieldWeight in 1847, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
          0.03859404 = weight(abstract_txt:text in 1847) [ClassicSimilarity], result of:
            0.03859404 = score(doc=1847,freq=2.0), product of:
              0.108056046 = queryWeight, product of:
                1.517627 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01762008 = queryNorm
              0.3571669 = fieldWeight in 1847, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
          0.22537112 = weight(abstract_txt:discourse in 1847) [ClassicSimilarity], result of:
            0.22537112 = score(doc=1847,freq=5.0), product of:
              0.2581791 = queryWeight, product of:
                2.3458543 = boost
                6.2461467 = idf(docFreq=233, maxDocs=44421)
                0.01762008 = queryNorm
              0.8729255 = fieldWeight in 1847, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.2461467 = idf(docFreq=233, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
          0.24216394 = weight(abstract_txt:linguistic in 1847) [ClassicSimilarity], result of:
            0.24216394 = score(doc=1847,freq=5.0), product of:
              0.2981088 = queryWeight, product of:
                2.910699 = boost
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.01762008 = queryNorm
              0.8123341 = fieldWeight in 1847, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
          0.2470178 = weight(abstract_txt:semantic in 1847) [ClassicSimilarity], result of:
            0.2470178 = score(doc=1847,freq=4.0), product of:
              0.44164228 = queryWeight, product of:
                5.60164 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.01762008 = queryNorm
              0.55931646 = fieldWeight in 1847, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0625 = fieldNorm(doc=1847)
        0.24 = coord(6/25)
    
  4. Ibekwe-SanJuan, F.: Constructing and maintaining knowledge organization tools : a symbolic approach (2006) 0.17
    0.17002901 = sum of:
      0.17002901 = product of:
        0.60724646 = sum of:
          0.012661181 = weight(abstract_txt:which in 595) [ClassicSimilarity], result of:
            0.012661181 = score(doc=595,freq=2.0), product of:
              0.056184046 = queryWeight, product of:
                1.0943267 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.01762008 = queryNorm
              0.2253519 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.04785603 = weight(abstract_txt:automatic in 595) [ClassicSimilarity], result of:
            0.04785603 = score(doc=595,freq=2.0), product of:
              0.119094275 = queryWeight, product of:
                1.3008893 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01762008 = queryNorm
              0.40183315 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.0493827 = weight(abstract_txt:maps in 595) [ClassicSimilarity], result of:
            0.0493827 = score(doc=595,freq=1.0), product of:
              0.15322384 = queryWeight, product of:
                1.475564 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.01762008 = queryNorm
              0.32229123 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.033769786 = weight(abstract_txt:text in 595) [ClassicSimilarity], result of:
            0.033769786 = score(doc=595,freq=2.0), product of:
              0.108056046 = queryWeight, product of:
                1.517627 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01762008 = queryNorm
              0.31252104 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.06484651 = weight(abstract_txt:texts in 595) [ClassicSimilarity], result of:
            0.06484651 = score(doc=595,freq=1.0), product of:
              0.21032842 = queryWeight, product of:
                2.1173346 = boost
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.01762008 = queryNorm
              0.30831075 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.13401318 = weight(abstract_txt:linguistic in 595) [ClassicSimilarity], result of:
            0.13401318 = score(doc=595,freq=2.0), product of:
              0.2981088 = queryWeight, product of:
                2.910699 = boost
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.01762008 = queryNorm
              0.44954452 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
          0.26471707 = weight(abstract_txt:semantic in 595) [ClassicSimilarity], result of:
            0.26471707 = score(doc=595,freq=6.0), product of:
              0.44164228 = queryWeight, product of:
                5.60164 = boost
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.01762008 = queryNorm
              0.5993925 = fieldWeight in 595, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.4745317 = idf(docFreq=1375, maxDocs=44421)
                0.0546875 = fieldNorm(doc=595)
        0.28 = coord(7/25)
    
  5. Wang, W.M.; Cheung, C.F.; Lee, W.B.; Kwok, S.K.: Mining knowledge from natural language texts using fuzzy associated concept mapping (2008) 0.17
    0.16559868 = sum of:
      0.16559868 = product of:
        0.5174959 = sum of:
          0.01601444 = weight(abstract_txt:process in 3121) [ClassicSimilarity], result of:
            0.01601444 = score(doc=3121,freq=1.0), product of:
              0.072324306 = queryWeight, product of:
                1.0137644 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.01762008 = queryNorm
              0.22142543 = fieldWeight in 3121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.017905613 = weight(abstract_txt:which in 3121) [ClassicSimilarity], result of:
            0.017905613 = score(doc=3121,freq=4.0), product of:
              0.056184046 = queryWeight, product of:
                1.0943267 = boost
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.01762008 = queryNorm
              0.3186957 = fieldWeight in 3121, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.9137893 = idf(docFreq=6552, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.033839323 = weight(abstract_txt:automatic in 3121) [ClassicSimilarity], result of:
            0.033839323 = score(doc=3121,freq=1.0), product of:
              0.119094275 = queryWeight, product of:
                1.3008893 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01762008 = queryNorm
              0.28413895 = fieldWeight in 3121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.08553335 = weight(abstract_txt:maps in 3121) [ClassicSimilarity], result of:
            0.08553335 = score(doc=3121,freq=3.0), product of:
              0.15322384 = queryWeight, product of:
                1.475564 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.01762008 = queryNorm
              0.5582248 = fieldWeight in 3121, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.053394724 = weight(abstract_txt:text in 3121) [ClassicSimilarity], result of:
            0.053394724 = score(doc=3121,freq=5.0), product of:
              0.108056046 = queryWeight, product of:
                1.517627 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01762008 = queryNorm
              0.49413916 = fieldWeight in 3121, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.09170682 = weight(abstract_txt:texts in 3121) [ClassicSimilarity], result of:
            0.09170682 = score(doc=3121,freq=2.0), product of:
              0.21032842 = queryWeight, product of:
                2.1173346 = boost
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.01762008 = queryNorm
              0.43601727 = fieldWeight in 3121, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6376824 = idf(docFreq=429, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.08508846 = weight(abstract_txt:mining in 3121) [ClassicSimilarity], result of:
            0.08508846 = score(doc=3121,freq=1.0), product of:
              0.25208905 = queryWeight, product of:
                2.3180218 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.01762008 = queryNorm
              0.33753335 = fieldWeight in 3121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
          0.13401318 = weight(abstract_txt:linguistic in 3121) [ClassicSimilarity], result of:
            0.13401318 = score(doc=3121,freq=2.0), product of:
              0.2981088 = queryWeight, product of:
                2.910699 = boost
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.01762008 = queryNorm
              0.44954452 = fieldWeight in 3121, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8125896 = idf(docFreq=360, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3121)
        0.32 = coord(8/25)
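
The score trees above follow the pattern of Lucene's ClassicSimilarity: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The following is a minimal sketch that recomputes the score of result 1 (doc 3097) from the values printed in the listing. The function and variable names are illustrative assumptions, not part of the catalog software, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is the standard ClassicSimilarity definition rather than something stated in this record.

```python
import math

# Constants copied from the listing above.
MAX_DOCS = 44421
QUERY_NORM = 0.01762008


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched: int, total_clauses: int, field_norm: float) -> float:
    """Sum of the term scores, scaled by coord(matched / total_clauses)."""
    raw = sum(term_score(freq, df, boost, field_norm) for _, freq, df, boost in terms)
    return raw * (matched / total_clauses)


# (term, termFreq, docFreq, boost) for result 1, doc 3097, fieldNorm 0.0625.
TERMS_DOC_3097 = [
    ("process", 1.0, 2105, 1.0137644),
    ("general", 2.0, 1586, 1.0846077),
    ("which", 4.0, 6552, 1.0943267),
    ("automatic", 1.0, 668, 1.3008893),
    ("text", 2.0, 2122, 1.517627),
    ("discourse", 1.0, 233, 2.3458543),
    ("linguistic", 2.0, 360, 2.910699),
    ("semantic", 6.0, 1375, 5.60164),
]

score_3097 = document_score(TERMS_DOC_3097, matched=8, total_clauses=25, field_norm=0.0625)
print(f"doc 3097: {score_3097:.4f}")
```

Under these assumptions the result should agree with the listed 0.2253477 up to small rounding differences, since the listing was computed in single precision. The same functions apply to the other four results once their term lists, fieldNorm, and coord values are substituted.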