Document (#40622)

Author
Collovini de Abreu, S.
Vieira, R.
Title
RelP: Portuguese open relation extraction
Source
Knowledge organization. 44(2017) no.3, pp.163-177
Year
2017
Abstract
Natural language texts are valuable data sources in many human activities. NLP techniques are widely used to help find the right information for specific needs. In this paper, we present one such technique: relation extraction from texts. This task aims at identifying and classifying semantic relations that occur between entities in a text. For example, the sentence "Roberto Marinho is the founder of Rede Globo" expresses a relation between "Roberto Marinho" and "Rede Globo." This work presents a system for Portuguese Open Relation Extraction, named RelP, which extracts any relation descriptor that describes an explicit relation between named entities in the organisation domain by applying Conditional Random Fields. To implement RelP, we define the representation scheme, features based on previous work, and a reference corpus. RelP achieved state-of-the-art results for open relation extraction; the F-measure was around 60% for relations between the named entities person, organisation and place. To make the output easier to understand, we present a way of organizing the output obtained by mining the extracted relation descriptors. This organization can be useful to classify relation types, to cluster the entities involved in a common relation and to populate datasets.
Content
Contribution to a special issue "New Trends for Knowledge Organization", guest editor: Renato Rocha Souza.
Theme
Computational linguistics

Similar documents (author)

  1. Vieira, L.: Modèle d'analyse pour une classification du document iconographique (1999) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:vieira in 6320) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 6320, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=6320)
    
  2. Vieira, S. Bastos => Bastos Vieira, S.: 5.04
    5.0379567 = sum of:
      5.0379567 = weight(author_txt:vieira in 4728) [ClassicSimilarity], result of:
        5.0379567 = fieldWeight in 4728, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.375 = fieldNorm(doc=4728)
    
  3. Vieira, E.S.; Cabral, J.A.S.; Gomes, J.A.N.F.: Definition of a model based on bibliometric indicators for assessing applicants to academic positions (2014) 3.56
    3.5623734 = sum of:
      3.5623734 = weight(author_txt:vieira in 1221) [ClassicSimilarity], result of:
        3.5623734 = fieldWeight in 1221, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.375 = fieldNorm(doc=1221)
    
  4. Carvalho, J.R. de; Cordeiro, M.I.; Lopes, A.; Vieira, M.: Meta-information about MARC : an XML framework for validation, explanation and help systems (2004) 2.97
    2.9686446 = sum of:
      2.9686446 = weight(author_txt:vieira in 2848) [ClassicSimilarity], result of:
        2.9686446 = fieldWeight in 2848, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.3125 = fieldNorm(doc=2848)
    
  5. Bastos Vieira, S.; DeBrito, M.; Mustafa El Hadi, W.; Zumer, M.: Developing imaged KOS with the FRSAD Model : a conceptual methodology (2016) 2.37
    2.3749156 = sum of:
      2.3749156 = weight(author_txt:vieira in 3109) [ClassicSimilarity], result of:
        2.3749156 = fieldWeight in 3109, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.25 = fieldNorm(doc=3109)
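
The author-match scores above can be recomputed from the values printed in each tree. The following is a minimal sketch, not the search engine's own code: it assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the form that reproduces the printed idf of 9.499662. The function names and the `author_hits` table are hypothetical; the numbers in the table are copied from the trees.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Assumed idf form, inferred from the printed values:
    # idf(docFreq=8, maxDocs=44218) comes out at about 9.499662.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in each "product of:" line.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# (doc id, termFreq, fieldNorm) for the five author hits listed above.
author_hits = [
    (6320, 1.0, 0.625),
    (4728, 2.0, 0.375),
    (1221, 1.0, 0.375),
    (2848, 1.0, 0.3125),
    (3109, 1.0, 0.25),
]

author_scores = {
    doc: field_weight(freq, doc_freq=8, max_docs=44218, field_norm=norm)
    for doc, freq, norm in author_hits
}

for doc, score in author_scores.items():
    print(doc, round(score, 4))
```

Since the idf is the same for all five hits, the ranking is driven by termFreq and fieldNorm alone; a smaller fieldNorm generally reflects a longer author field.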
    

Similar documents (content)

  1. Vo, D.-T.; Bagheri, E.: Feature-enriched matrix factorization for relation extraction (2019) 0.28
    0.27952403 = sum of:
      0.27952403 = product of:
        0.87351257 = sum of:
          0.022699112 = weight(abstract_txt:work in 5105) [ClassicSimilarity], result of:
            0.022699112 = score(doc=5105,freq=4.0), product of:
              0.05469501 = queryWeight, product of:
                1.1394706 = boost
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.012650319 = queryNorm
              0.41501248 = fieldWeight in 5105, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.008256122 = weight(abstract_txt:this in 5105) [ClassicSimilarity], result of:
            0.008256122 = score(doc=5105,freq=2.0), product of:
              0.044239733 = queryWeight, product of:
                1.4492741 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.012650319 = queryNorm
              0.1866223 = fieldWeight in 5105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.024411619 = weight(abstract_txt:between in 5105) [ClassicSimilarity], result of:
            0.024411619 = score(doc=5105,freq=2.0), product of:
              0.091136605 = queryWeight, product of:
                2.080131 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.012650319 = queryNorm
              0.26785746 = fieldWeight in 5105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.03497744 = weight(abstract_txt:open in 5105) [ClassicSimilarity], result of:
            0.03497744 = score(doc=5105,freq=1.0), product of:
              0.13259207 = queryWeight, product of:
                2.1728697 = boost
                4.8237233 = idf(docFreq=965, maxDocs=44218)
                0.012650319 = queryNorm
              0.26379737 = fieldWeight in 5105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8237233 = idf(docFreq=965, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.098927446 = weight(abstract_txt:named in 5105) [ClassicSimilarity], result of:
            0.098927446 = score(doc=5105,freq=1.0), product of:
              0.26517755 = queryWeight, product of:
                3.0728636 = boost
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.012650319 = queryNorm
              0.37306118 = fieldWeight in 5105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.08247318 = weight(abstract_txt:entities in 5105) [ClassicSimilarity], result of:
            0.08247318 = score(doc=5105,freq=1.0), product of:
              0.2585316 = queryWeight, product of:
                3.5034916 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.012650319 = queryNorm
              0.3190062 = fieldWeight in 5105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.1708203 = weight(abstract_txt:extraction in 5105) [ClassicSimilarity], result of:
            0.1708203 = score(doc=5105,freq=3.0), product of:
              0.29126683 = queryWeight, product of:
                3.7186885 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012650319 = queryNorm
              0.5864736 = fieldWeight in 5105, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
          0.43094733 = weight(abstract_txt:relation in 5105) [ClassicSimilarity], result of:
            0.43094733 = score(doc=5105,freq=7.0), product of:
              0.5523342 = queryWeight, product of:
                8.096834 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012650319 = queryNorm
              0.78022933 = fieldWeight in 5105, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5105)
        0.32 = coord(8/25)
    
  2. Li, J.; Zhang, Z.; Li, X.; Chen, H.: Kernel-based learning for biomedical relation extraction (2008) 0.27
    0.26547438 = sum of:
      0.26547438 = product of:
        1.1061432 = sum of:
          0.0083399415 = weight(abstract_txt:this in 1611) [ClassicSimilarity], result of:
            0.0083399415 = score(doc=1611,freq=1.0), product of:
              0.044239733 = queryWeight, product of:
                1.4492741 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.012650319 = queryNorm
              0.18851699 = fieldWeight in 1611, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
          0.024659459 = weight(abstract_txt:between in 1611) [ClassicSimilarity], result of:
            0.024659459 = score(doc=1611,freq=1.0), product of:
              0.091136605 = queryWeight, product of:
                2.080131 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.012650319 = queryNorm
              0.2705769 = fieldWeight in 1611, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
          0.14132494 = weight(abstract_txt:named in 1611) [ClassicSimilarity], result of:
            0.14132494 = score(doc=1611,freq=1.0), product of:
              0.26517755 = queryWeight, product of:
                3.0728636 = boost
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.012650319 = queryNorm
              0.53294456 = fieldWeight in 1611, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.82169 = idf(docFreq=130, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
          0.11781883 = weight(abstract_txt:entities in 1611) [ClassicSimilarity], result of:
            0.11781883 = score(doc=1611,freq=1.0), product of:
              0.2585316 = queryWeight, product of:
                3.5034916 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.012650319 = queryNorm
              0.45572314 = fieldWeight in 1611, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
          0.24402902 = weight(abstract_txt:extraction in 1611) [ClassicSimilarity], result of:
            0.24402902 = score(doc=1611,freq=3.0), product of:
              0.29126683 = queryWeight, product of:
                3.7186885 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012650319 = queryNorm
              0.83781946 = fieldWeight in 1611, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
          0.5699711 = weight(abstract_txt:relation in 1611) [ClassicSimilarity], result of:
            0.5699711 = score(doc=1611,freq=6.0), product of:
              0.5523342 = queryWeight, product of:
                8.096834 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012650319 = queryNorm
              1.0319315 = fieldWeight in 1611, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.078125 = fieldNorm(doc=1611)
        0.24 = coord(6/25)
    
  3. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.24
    0.23922136 = sum of:
      0.23922136 = product of:
        0.9967557 = sum of:
          0.053841803 = weight(abstract_txt:sentence in 5055) [ClassicSimilarity], result of:
            0.053841803 = score(doc=5055,freq=2.0), product of:
              0.0889939 = queryWeight, product of:
                1.0277663 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.012650319 = queryNorm
              0.60500556 = fieldWeight in 5055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.006671953 = weight(abstract_txt:this in 5055) [ClassicSimilarity], result of:
            0.006671953 = score(doc=5055,freq=1.0), product of:
              0.044239733 = queryWeight, product of:
                1.4492741 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.012650319 = queryNorm
              0.1508136 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.027898993 = weight(abstract_txt:between in 5055) [ClassicSimilarity], result of:
            0.027898993 = score(doc=5055,freq=2.0), product of:
              0.091136605 = queryWeight, product of:
                2.080131 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.012650319 = queryNorm
              0.3061228 = fieldWeight in 5055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.09425507 = weight(abstract_txt:entities in 5055) [ClassicSimilarity], result of:
            0.09425507 = score(doc=5055,freq=1.0), product of:
              0.2585316 = queryWeight, product of:
                3.5034916 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.012650319 = queryNorm
              0.36457852 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.22542435 = weight(abstract_txt:extraction in 5055) [ClassicSimilarity], result of:
            0.22542435 = score(doc=5055,freq=4.0), product of:
              0.29126683 = queryWeight, product of:
                3.7186885 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012650319 = queryNorm
              0.77394444 = fieldWeight in 5055, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.5886635 = weight(abstract_txt:relation in 5055) [ClassicSimilarity], result of:
            0.5886635 = score(doc=5055,freq=10.0), product of:
              0.5523342 = queryWeight, product of:
                8.096834 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012650319 = queryNorm
              1.0657742 = fieldWeight in 5055, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
        0.24 = coord(6/25)
    
  4. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 0.18
    0.17630236 = sum of:
      0.17630236 = product of:
        0.88151175 = sum of:
          0.013343906 = weight(abstract_txt:this in 927) [ClassicSimilarity], result of:
            0.013343906 = score(doc=927,freq=4.0), product of:
              0.044239733 = queryWeight, product of:
                1.4492741 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.012650319 = queryNorm
              0.3016272 = fieldWeight in 927, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.019727567 = weight(abstract_txt:between in 927) [ClassicSimilarity], result of:
            0.019727567 = score(doc=927,freq=1.0), product of:
              0.091136605 = queryWeight, product of:
                2.080131 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.012650319 = queryNorm
              0.21646151 = fieldWeight in 927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.09425507 = weight(abstract_txt:entities in 927) [ClassicSimilarity], result of:
            0.09425507 = score(doc=927,freq=1.0), product of:
              0.2585316 = queryWeight, product of:
                3.5034916 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.012650319 = queryNorm
              0.36457852 = fieldWeight in 927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.29820836 = weight(abstract_txt:extraction in 927) [ClassicSimilarity], result of:
            0.29820836 = score(doc=927,freq=7.0), product of:
              0.29126683 = queryWeight, product of:
                3.7186885 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012650319 = queryNorm
              1.0238322 = fieldWeight in 927, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.45597684 = weight(abstract_txt:relation in 927) [ClassicSimilarity], result of:
            0.45597684 = score(doc=927,freq=6.0), product of:
              0.5523342 = queryWeight, product of:
                8.096834 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012650319 = queryNorm
              0.8255452 = fieldWeight in 927, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
        0.2 = coord(5/25)
    
  5. Zhang, M.; Zhou, G.D.; Aw, A.: Exploring syntactic structured features over parse trees for relation extraction using kernel methods (2008) 0.16
    0.1631507 = sum of:
      0.1631507 = product of:
        0.81575346 = sum of:
          0.009435567 = weight(abstract_txt:this in 2055) [ClassicSimilarity], result of:
            0.009435567 = score(doc=2055,freq=2.0), product of:
              0.044239733 = queryWeight, product of:
                1.4492741 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.012650319 = queryNorm
              0.21328263 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=2055)
          0.019727567 = weight(abstract_txt:between in 2055) [ClassicSimilarity], result of:
            0.019727567 = score(doc=2055,freq=1.0), product of:
              0.091136605 = queryWeight, product of:
                2.080131 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.012650319 = queryNorm
              0.21646151 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0625 = fieldNorm(doc=2055)
          0.09425507 = weight(abstract_txt:entities in 2055) [ClassicSimilarity], result of:
            0.09425507 = score(doc=2055,freq=1.0), product of:
              0.2585316 = queryWeight, product of:
                3.5034916 = boost
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.012650319 = queryNorm
              0.36457852 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8332562 = idf(docFreq=351, maxDocs=44218)
                0.0625 = fieldNorm(doc=2055)
          0.2760873 = weight(abstract_txt:extraction in 2055) [ClassicSimilarity], result of:
            0.2760873 = score(doc=2055,freq=6.0), product of:
              0.29126683 = queryWeight, product of:
                3.7186885 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.012650319 = queryNorm
              0.9478845 = fieldWeight in 2055, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=2055)
          0.416248 = weight(abstract_txt:relation in 2055) [ClassicSimilarity], result of:
            0.416248 = score(doc=2055,freq=5.0), product of:
              0.5523342 = queryWeight, product of:
                8.096834 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012650319 = queryNorm
              0.7536162 = fieldWeight in 2055, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.0625 = fieldNorm(doc=2055)
        0.2 = coord(5/25)
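
The content-similarity scores combine a query-side weight, a field-side weight and a coordination factor. As a rough sketch under the same assumed tf and idf forms (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the first hit (doc 5105) can be recomputed as follows. The helper names and the `terms_5105` table are hypothetical; the boosts, document frequencies, queryNorm and fieldNorm are copied from the tree for that hit.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.012650319   # queryNorm printed in every term block
FIELD_NORM_5105 = 0.0546875

def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf form that matches the printed values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight

# (term, termFreq, docFreq, boost) for the eight matching terms in doc 5105.
terms_5105 = [
    ("work", 4.0, 2703, 1.1394706),
    ("this", 2.0, 10762, 1.4492741),
    ("between", 2.0, 3764, 2.080131),
    ("open", 1.0, 965, 2.1728697),
    ("named", 1.0, 130, 3.0728636),
    ("entities", 1.0, 351, 3.5034916),
    ("extraction", 3.0, 245, 3.7186885),
    ("relation", 7.0, 546, 8.096834),
]

raw_sum = sum(
    term_score(freq, doc_freq, boost, FIELD_NORM_5105)
    for _, freq, doc_freq, boost in terms_5105
)
coord = 8 / 25              # 8 of the 25 query terms matched
score_5105 = raw_sum * coord

print(round(raw_sum, 4), round(score_5105, 4))
```

The coord factor explains why the second hit ranks below the first despite a larger raw sum (1.1061432 against 0.87351257): it matches only 6 of the 25 query terms, so its sum is scaled by 0.24 rather than 0.32.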