Document (#16321)

Author
Kuikka, E.
Salminen, A.
Title
Two-dimensional filters for structured text
Source
Information processing and management. 33(1997) no.1, p.37-54
Year
1997
Abstract
Introduces a method for defining filters for structured text. The text structure is defined by a grammar consisting of a set of productions. To describe the information interests, a two-dimensional template is first created interactively from the grammar to show the structure of a set of textual elements at a chosen level of detail. The template depicts the hierarchical structure of the elements and also indicates optionality, alternatives and iteration in the structure. The template is filled with constraints and annotations. The constraints allow conditions to be placed on the content of parts, on the position of parts in an ordered set of parts, and on the number of parts satisfying a specified property. In a compound filter, several templates are connected by annotations. The method is intended to be used as a theoretical framework for developing flexible and powerful graphical interfaces for filtering structured text. Describes a prototype implementation.
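
The three kinds of constraints named in the abstract (content, position, number of parts) can be illustrated with a minimal sketch. This is not the authors' notation or prototype; the Part class, the helper names and the sample article are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Part:
    """A node in a structured text (hypothetical model, not from the paper)."""
    name: str
    text: str = ""
    children: List["Part"] = field(default_factory=list)


def content_contains(word: str) -> Callable[[Part], bool]:
    """Content constraint: the part's text mentions the given word."""
    return lambda part: word.lower() in part.text.lower()


def at_position(parts: List[Part], index: int, pred: Callable[[Part], bool]) -> bool:
    """Position constraint: the part at a given place in the ordered set satisfies pred."""
    return 0 <= index < len(parts) and pred(parts[index])


def at_least(parts: List[Part], n: int, pred: Callable[[Part], bool]) -> bool:
    """Count constraint: at least n parts satisfy pred."""
    return sum(1 for p in parts if pred(p)) >= n


# Illustrative document: an article with three sections.
article = Part("article", children=[
    Part("section", "Grammars for structured text"),
    Part("section", "Filtering with templates"),
    Part("section", "A prototype for structured text"),
])

sections = article.children
# A filter combining a position constraint and a count constraint.
matches = (at_position(sections, 0, content_contains("grammar"))
           and at_least(sections, 2, content_contains("structured")))
```

Here the filter accepts the article only if its first section mentions "grammar" and at least two sections mention "structured".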

Similar documents (author)

  1. Salminen, A.: Modeling documents in their context (2009) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:salminen in 834) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 834, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=834)
    
  2. Salminen, A.: Markup languages (2009) 6.10
    6.0972233 = sum of:
      6.0972233 = weight(author_txt:salminen in 836) [ClassicSimilarity], result of:
        6.0972233 = fieldWeight in 836, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.625 = fieldNorm(doc=836)
    
  3. Salminen, A.; Kauppinen, K.; Lehtovaara, M.: Towards a methodology for document analysis (1997) 3.66
    3.6583338 = sum of:
      3.6583338 = weight(author_txt:salminen in 3642) [ClassicSimilarity], result of:
        3.6583338 = fieldWeight in 3642, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.375 = fieldNorm(doc=3642)
    
  4. Salminen, A.; Tague-Sutcliffe, J.; McClellan, C.: From text to hypertext by indexing (1995) 3.66
    3.6583338 = sum of:
      3.6583338 = weight(author_txt:salminen in 1931) [ClassicSimilarity], result of:
        3.6583338 = fieldWeight in 1931, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.375 = fieldNorm(doc=1931)
    
  5. Salminen, A.; Jauhiainen, E.; Nurmeksela, R.: ¬A life cycle model of XML documents (2014) 3.66
    3.6583338 = sum of:
      3.6583338 = weight(author_txt:salminen in 2553) [ClassicSimilarity], result of:
        3.6583338 = fieldWeight in 2553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.755557 = idf(docFreq=6, maxDocs=44421)
          0.375 = fieldNorm(doc=2553)
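
The author scores above follow the Lucene ClassicSimilarity pattern fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The following is a small sketch that recomputes them from the numbers listed; the function names are illustrative and not part of any Lucene API.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain output above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: docFreq=6, maxDocs=44421.
single_author = field_weight(1.0, 6, 44421, 0.625)   # entries 1 and 2
three_authors = field_weight(1.0, 6, 44421, 0.375)   # entries 3 to 5
print(round(single_author, 4), round(three_authors, 4))
```

The only factor that differs between the entries is fieldNorm, which is smaller for records with more authors; that is why single-author records rank first.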
    

Similar documents (content)

  1. Darányi, S.; Wittek, P.: Demonstrating conceptual dynamics in an evolving text collection (2013) 0.10
    0.09847771 = sum of:
      0.09847771 = product of:
        0.4103238 = sum of:
          0.07757025 = weight(abstract_txt:filter in 2137) [ClassicSimilarity], result of:
            0.07757025 = score(doc=2137,freq=2.0), product of:
              0.120705456 = queryWeight, product of:
                1.1122847 = boost
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.014925802 = queryNorm
              0.6426408 = fieldWeight in 2137, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
          0.025988566 = weight(abstract_txt:method in 2137) [ClassicSimilarity], result of:
            0.025988566 = score(doc=2137,freq=1.0), product of:
              0.092428304 = queryWeight, product of:
                1.3764802 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.014925802 = queryNorm
              0.2811754 = fieldWeight in 2137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
          0.041475117 = weight(abstract_txt:elements in 2137) [ClassicSimilarity], result of:
            0.041475117 = score(doc=2137,freq=1.0), product of:
              0.12622398 = queryWeight, product of:
                1.6085643 = boost
                5.257336 = idf(docFreq=628, maxDocs=44421)
                0.014925802 = queryNorm
              0.3285835 = fieldWeight in 2137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.257336 = idf(docFreq=628, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
          0.18037473 = weight(abstract_txt:dimensional in 2137) [ClassicSimilarity], result of:
            0.18037473 = score(doc=2137,freq=4.0), product of:
              0.21185915 = queryWeight, product of:
                2.0839682 = boost
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.014925802 = queryNorm
              0.8513898 = fieldWeight in 2137, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
          0.03766595 = weight(abstract_txt:text in 2137) [ClassicSimilarity], result of:
            0.03766595 = score(doc=2137,freq=1.0), product of:
              0.14913951 = queryWeight, product of:
                2.472742 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014925802 = queryNorm
              0.25255513 = fieldWeight in 2137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
          0.04724919 = weight(abstract_txt:structure in 2137) [ClassicSimilarity], result of:
            0.04724919 = score(doc=2137,freq=1.0), product of:
              0.17346945 = queryWeight, product of:
                2.6668217 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014925802 = queryNorm
              0.27237758 = fieldWeight in 2137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=2137)
        0.24 = coord(6/25)
    
  2. Taniguchi, S.: ¬A system for analyzing cataloguing rules : a feasibility study (1996) 0.10
    0.098323025 = sum of:
      0.098323025 = product of:
        0.61451894 = sum of:
          0.08064547 = weight(abstract_txt:templates in 4266) [ClassicSimilarity], result of:
            0.08064547 = score(doc=4266,freq=1.0), product of:
              0.15607263 = queryWeight, product of:
                1.2647825 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.014925802 = queryNorm
              0.51671755 = fieldWeight in 4266, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.0625 = fieldNorm(doc=4266)
          0.06682044 = weight(abstract_txt:structure in 4266) [ClassicSimilarity], result of:
            0.06682044 = score(doc=4266,freq=2.0), product of:
              0.17346945 = queryWeight, product of:
                2.6668217 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014925802 = queryNorm
              0.38520005 = fieldWeight in 4266, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=4266)
          0.12882997 = weight(abstract_txt:parts in 4266) [ClassicSimilarity], result of:
            0.12882997 = score(doc=4266,freq=1.0), product of:
              0.33856186 = queryWeight, product of:
                3.7256453 = boost
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.014925802 = queryNorm
              0.3805212 = fieldWeight in 4266, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.0625 = fieldNorm(doc=4266)
          0.33822304 = weight(abstract_txt:template in 4266) [ClassicSimilarity], result of:
            0.33822304 = score(doc=4266,freq=2.0), product of:
              0.46462864 = queryWeight, product of:
                3.7797763 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.014925802 = queryNorm
              0.72794276 = fieldWeight in 4266, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.0625 = fieldNorm(doc=4266)
        0.16 = coord(4/25)
    
  3. Shuldberg, H.K.; Macpherson, M.; Humphrey, P.: Distilling information from text : the EDS TemplateFiller system (1993) 0.08
    0.082586505 = sum of:
      0.082586505 = product of:
        0.5161657 = sum of:
          0.049824435 = weight(abstract_txt:filtering in 5641) [ClassicSimilarity], result of:
            0.049824435 = score(doc=5641,freq=1.0), product of:
              0.09756522 = queryWeight, product of:
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.014925802 = queryNorm
              0.51067823 = fieldWeight in 5641, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=5641)
          0.10080683 = weight(abstract_txt:templates in 5641) [ClassicSimilarity], result of:
            0.10080683 = score(doc=5641,freq=1.0), product of:
              0.15607263 = queryWeight, product of:
                1.2647825 = boost
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.014925802 = queryNorm
              0.6458969 = fieldWeight in 5641, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.267481 = idf(docFreq=30, maxDocs=44421)
                0.078125 = fieldNorm(doc=5641)
          0.06658462 = weight(abstract_txt:text in 5641) [ClassicSimilarity], result of:
            0.06658462 = score(doc=5641,freq=2.0), product of:
              0.14913951 = queryWeight, product of:
                2.472742 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014925802 = queryNorm
              0.4464586 = fieldWeight in 5641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=5641)
          0.29894978 = weight(abstract_txt:template in 5641) [ClassicSimilarity], result of:
            0.29894978 = score(doc=5641,freq=1.0), product of:
              0.46462864 = queryWeight, product of:
                3.7797763 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.014925802 = queryNorm
              0.6434166 = fieldWeight in 5641, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.078125 = fieldNorm(doc=5641)
        0.16 = coord(4/25)
    
  4. Smith, D.A.; Shadbolt, N.R.: FacetOntology : expressive descriptions of facets in the Semantic Web (2012) 0.08
    0.077900015 = sum of:
      0.077900015 = product of:
        0.38950008 = sum of:
          0.049752336 = weight(abstract_txt:specified in 3208) [ClassicSimilarity], result of:
            0.049752336 = score(doc=3208,freq=1.0), product of:
              0.11310516 = queryWeight, product of:
                1.0766975 = boost
                7.0380287 = idf(docFreq=105, maxDocs=44421)
                0.014925802 = queryNorm
              0.4398768 = fieldWeight in 3208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0380287 = idf(docFreq=105, maxDocs=44421)
                0.0625 = fieldNorm(doc=3208)
          0.05485045 = weight(abstract_txt:filter in 3208) [ClassicSimilarity], result of:
            0.05485045 = score(doc=3208,freq=1.0), product of:
              0.120705456 = queryWeight, product of:
                1.1122847 = boost
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.014925802 = queryNorm
              0.45441568 = fieldWeight in 3208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.270651 = idf(docFreq=83, maxDocs=44421)
                0.0625 = fieldNorm(doc=3208)
          0.025988566 = weight(abstract_txt:method in 3208) [ClassicSimilarity], result of:
            0.025988566 = score(doc=3208,freq=1.0), product of:
              0.092428304 = queryWeight, product of:
                1.3764802 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.014925802 = queryNorm
              0.2811754 = fieldWeight in 3208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=3208)
          0.21165952 = weight(abstract_txt:filters in 3208) [ClassicSimilarity], result of:
            0.21165952 = score(doc=3208,freq=2.0), product of:
              0.29696047 = queryWeight, product of:
                2.4672709 = boost
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.014925802 = queryNorm
              0.7127532 = fieldWeight in 3208, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.063882 = idf(docFreq=37, maxDocs=44421)
                0.0625 = fieldNorm(doc=3208)
          0.04724919 = weight(abstract_txt:structure in 3208) [ClassicSimilarity], result of:
            0.04724919 = score(doc=3208,freq=1.0), product of:
              0.17346945 = queryWeight, product of:
                2.6668217 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014925802 = queryNorm
              0.27237758 = fieldWeight in 3208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=3208)
        0.2 = coord(5/25)
    
  5. Crestani, F.; Vegas, J.; Fuente, P. de la: ¬A graphical user interface for the retrieval of hierarchically structured documents (2004) 0.08
    0.077624016 = sum of:
      0.077624016 = product of:
        0.4851501 = sum of:
          0.073608786 = weight(abstract_txt:graphical in 3555) [ClassicSimilarity], result of:
            0.073608786 = score(doc=3555,freq=2.0), product of:
              0.100448444 = queryWeight, product of:
                1.0146683 = boost
                6.6325636 = idf(docFreq=158, maxDocs=44421)
                0.014925802 = queryNorm
              0.7328016 = fieldWeight in 3555, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6325636 = idf(docFreq=158, maxDocs=44421)
                0.078125 = fieldNorm(doc=3555)
          0.19144233 = weight(abstract_txt:structured in 3555) [ClassicSimilarity], result of:
            0.19144233 = score(doc=3555,freq=5.0), product of:
              0.20187187 = queryWeight, product of:
                2.4914434 = boost
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.014925802 = queryNorm
              0.9483358 = fieldWeight in 3555, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.428591 = idf(docFreq=529, maxDocs=44421)
                0.078125 = fieldNorm(doc=3555)
          0.05906149 = weight(abstract_txt:structure in 3555) [ClassicSimilarity], result of:
            0.05906149 = score(doc=3555,freq=1.0), product of:
              0.17346945 = queryWeight, product of:
                2.6668217 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014925802 = queryNorm
              0.34047198 = fieldWeight in 3555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.078125 = fieldNorm(doc=3555)
          0.16103746 = weight(abstract_txt:parts in 3555) [ClassicSimilarity], result of:
            0.16103746 = score(doc=3555,freq=1.0), product of:
              0.33856186 = queryWeight, product of:
                3.7256453 = boost
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.014925802 = queryNorm
              0.4756515 = fieldWeight in 3555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0883393 = idf(docFreq=273, maxDocs=44421)
                0.078125 = fieldNorm(doc=3555)
        0.16 = coord(4/25)
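
The content scores above combine, for each matching term, queryWeight (boost * idf * queryNorm) with fieldWeight (tf * idf * fieldNorm), sum the products, and multiply by coord (matched terms / query terms). The sketch below recomputes entry 5 from the values in the listing; the function names and the tuple layout are illustrative assumptions, not Lucene API.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.014925802


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm: float, total_query_terms: int) -> float:
    """Sum of term scores multiplied by coord(matched / total)."""
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    return total * len(terms) / total_query_terms


# Entry 5 (doc 3555): (termFreq, docFreq, boost) per matching term, from the listing.
doc_3555_terms = [
    (2.0, 158, 1.0146683),   # graphical
    (5.0, 529, 2.4914434),   # structured
    (1.0, 1545, 2.6668217),  # structure
    (1.0, 273, 3.7256453),   # parts
]
score = document_score(doc_3555_terms, field_norm=0.078125, total_query_terms=25)
print(round(score, 4))
```

The result should agree, up to floating-point rounding, with the 0.077624016 shown for entry 5. The coord factor explains why documents matching only 4 to 6 of the 25 query terms end up with low overall scores despite high individual term weights.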