Document (#16777)

Author
Saeed, K.
Dardzinska, A.
Title
Natural language processing : word recognition without segmentation
Source
Journal of the American Society for Information Science and Technology. 52(2001) no.14, p.1275-1279
Year
2001
Abstract
In an earlier article about the methods of recognition of machine and hand-written cursive letters, we presented a model showing the possibility of processing, classifying, and hence recognizing such scripts as images. The practical results we obtained encouraged us to extend the theory to an algorithm for word recognition. In this article, we introduce our ideas, describe our achievements, and present our results of testing words for recognition without segmentation. This would lead to the possibility of applying the methods used in this work, together with other previously developed algorithms, to process whole sentences and, hence, written and spoken texts with the goal of automatic recognition.
Theme
Computational linguistics

Similar documents (content)

  1. Xinglin, L.: Automatic summarization method based on compound word recognition (2015) 0.18
    0.1792769 = sum of:
      0.1792769 = product of:
        0.64027464 = sum of:
          0.015525397 = weight(abstract_txt:results in 2841) [ClassicSimilarity], result of:
            0.015525397 = score(doc=2841,freq=1.0), product of:
              0.07140893 = queryWeight, product of:
                1.133673 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.018107336 = queryNorm
              0.21741535 = fieldWeight in 2841, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.08949009 = weight(abstract_txt:sentences in 2841) [ClassicSimilarity], result of:
            0.08949009 = score(doc=2841,freq=2.0), product of:
              0.14461745 = queryWeight, product of:
                1.1407931 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.018107336 = queryNorm
              0.61880565 = fieldWeight in 2841, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.010900149 = weight(abstract_txt:this in 2841) [ClassicSimilarity], result of:
            0.010900149 = score(doc=2841,freq=2.0), product of:
              0.051250655 = queryWeight, product of:
                1.1762695 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018107336 = queryNorm
              0.21268311 = fieldWeight in 2841, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.026236486 = weight(abstract_txt:methods in 2841) [ClassicSimilarity], result of:
            0.026236486 = score(doc=2841,freq=1.0), product of:
              0.10131206 = queryWeight, product of:
                1.3503368 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.018107336 = queryNorm
              0.25896704 = fieldWeight in 2841, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.1027323 = weight(abstract_txt:word in 2841) [ClassicSimilarity], result of:
            0.1027323 = score(doc=2841,freq=3.0), product of:
              0.17451054 = queryWeight, product of:
                1.7722392 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.018107336 = queryNorm
              0.58868825 = fieldWeight in 2841, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.18463652 = weight(abstract_txt:segmentation in 2841) [ClassicSimilarity], result of:
            0.18463652 = score(doc=2841,freq=1.0), product of:
              0.37205097 = queryWeight, product of:
                2.587693 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.018107336 = queryNorm
              0.49626672 = fieldWeight in 2841, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
          0.21075371 = weight(abstract_txt:recognition in 2841) [ClassicSimilarity], result of:
            0.21075371 = score(doc=2841,freq=1.0), product of:
              0.55151105 = queryWeight, product of:
                4.981483 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.018107336 = queryNorm
              0.3821387 = fieldWeight in 2841, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=2841)
        0.28 = coord(7/25)
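
The tree above can be recomputed from its leaf values. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of the catalog software. Boosts, document frequencies, and term frequencies are copied from the explanation for doc 2841.

```python
import math

MAX_DOCS = 44421          # maxDocs from the explanation
QUERY_NORM = 0.018107336  # queryNorm from the explanation
FIELD_NORM = 0.0625       # fieldNorm(doc=2841)

# (term, boost, docFreq, termFreq) copied from the tree for doc 2841
TERMS = [
    ("results", 1.133673, 3724, 1),
    ("sentences", 1.1407931, 109, 2),
    ("this", 1.1762695, 10885, 2),
    ("methods", 1.3503368, 1915, 1),
    ("word", 1.7722392, 524, 3),
    ("segmentation", 2.587693, 42, 1),
    ("recognition", 4.981483, 266, 1),
]


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm=FIELD_NORM):
    """queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(terms, matched, total_clauses):
    """coord(matched/total) times the sum of the per-term scores."""
    coord = matched / total_clauses
    return coord * sum(term_score(b, df, f) for _, b, df, f in terms)


total = document_score(TERMS, matched=7, total_clauses=25)
print(round(total, 4))  # 0.1793, which the hit list displays as 0.18
```

Because the index stores these values as 32-bit floats, a recomputation in double precision agrees with the listed numbers only to about six significant digits.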
    
  2. Giannella, C.: An improved algorithm for unsupervised decomposition of a multi-author document (2016) 0.17
    0.17382884 = sum of:
      0.17382884 = product of:
        0.6208173 = sum of:
          0.019406747 = weight(abstract_txt:results in 3642) [ClassicSimilarity], result of:
            0.019406747 = score(doc=3642,freq=1.0), product of:
              0.07140893 = queryWeight, product of:
                1.133673 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.018107336 = queryNorm
              0.2717692 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.11186262 = weight(abstract_txt:sentences in 3642) [ClassicSimilarity], result of:
            0.11186262 = score(doc=3642,freq=2.0), product of:
              0.14461745 = queryWeight, product of:
                1.1407931 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.018107336 = queryNorm
              0.77350706 = fieldWeight in 3642, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.016687376 = weight(abstract_txt:this in 3642) [ClassicSimilarity], result of:
            0.016687376 = score(doc=3642,freq=3.0), product of:
              0.051250655 = queryWeight, product of:
                1.1762695 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018107336 = queryNorm
              0.3256032 = fieldWeight in 3642, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.025284255 = weight(abstract_txt:article in 3642) [ClassicSimilarity], result of:
            0.025284255 = score(doc=3642,freq=1.0), product of:
              0.085182585 = queryWeight, product of:
                1.238189 = boost
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.018107336 = queryNorm
              0.29682422 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.08901859 = weight(abstract_txt:written in 3642) [ClassicSimilarity], result of:
            0.08901859 = score(doc=3642,freq=1.0), product of:
              0.19713838 = queryWeight, product of:
                1.8836366 = boost
                5.779889 = idf(docFreq=372, maxDocs=44421)
                0.018107336 = queryNorm
              0.45155382 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.779889 = idf(docFreq=372, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.12776203 = weight(abstract_txt:hence in 3642) [ClassicSimilarity], result of:
            0.12776203 = score(doc=3642,freq=1.0), product of:
              0.25083333 = queryWeight, product of:
                2.124732 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.018107336 = queryNorm
              0.5093503 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
          0.23079565 = weight(abstract_txt:segmentation in 3642) [ClassicSimilarity], result of:
            0.23079565 = score(doc=3642,freq=1.0), product of:
              0.37205097 = queryWeight, product of:
                2.587693 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.018107336 = queryNorm
              0.62033343 = fieldWeight in 3642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.078125 = fieldNorm(doc=3642)
        0.28 = coord(7/25)
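
Docs 2841 and 3642 both contain "sentences" twice, yet the term scores differ (0.08949009 against 0.11186262). The only differing factor is fieldNorm, which in ClassicSimilarity reflects field length, so a shorter abstract gets a larger value. A small sketch of that proportionality, with the hypothetical helper `rescale` and values copied from the two trees:

```python
# Values copied from the explanations for "sentences" in docs 2841 and 3642.
SCORE_2841 = 0.08949009
NORM_2841 = 0.0625
NORM_3642 = 0.078125


def rescale(score, old_norm, new_norm):
    """Score the same term would receive if only fieldNorm changed."""
    return score * new_norm / old_norm


predicted_3642 = rescale(SCORE_2841, NORM_2841, NORM_3642)
print(predicted_3642)  # close to the 0.11186262 reported for doc 3642
```

The ratio 0.078125 / 0.0625 = 1.25 accounts for the whole difference, since boost, idf, queryNorm, and tf are identical for this term in both documents.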
    
  3. Khoo, C.S.G.; Dai, D.; Loh, T.E.: Using statistical and contextual information to identify two- and three-character words in Chinese text (2002) 0.16
    0.1601057 = sum of:
      0.1601057 = product of:
        0.6671071 = sum of:
          0.021956226 = weight(abstract_txt:results in 206) [ClassicSimilarity], result of:
            0.021956226 = score(doc=206,freq=2.0), product of:
              0.07140893 = queryWeight, product of:
                1.133673 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.018107336 = queryNorm
              0.30747172 = fieldWeight in 206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.063279055 = weight(abstract_txt:sentences in 206) [ClassicSimilarity], result of:
            0.063279055 = score(doc=206,freq=1.0), product of:
              0.14461745 = queryWeight, product of:
                1.1407931 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.018107336 = queryNorm
              0.4375617 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.026236486 = weight(abstract_txt:methods in 206) [ClassicSimilarity], result of:
            0.026236486 = score(doc=206,freq=1.0), product of:
              0.10131206 = queryWeight, product of:
                1.3503368 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.018107336 = queryNorm
              0.25896704 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.044057563 = weight(abstract_txt:processing in 206) [ClassicSimilarity], result of:
            0.044057563 = score(doc=206,freq=1.0), product of:
              0.14313231 = queryWeight, product of:
                1.6050197 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.018107336 = queryNorm
              0.30781004 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.059312526 = weight(abstract_txt:word in 206) [ClassicSimilarity], result of:
            0.059312526 = score(doc=206,freq=1.0), product of:
              0.17451054 = queryWeight, product of:
                1.7722392 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.018107336 = queryNorm
              0.33987933 = fieldWeight in 206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
          0.45226526 = weight(abstract_txt:segmentation in 206) [ClassicSimilarity], result of:
            0.45226526 = score(doc=206,freq=6.0), product of:
              0.37205097 = queryWeight, product of:
                2.587693 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.018107336 = queryNorm
              1.2156003 = fieldWeight in 206, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=206)
        0.24 = coord(6/25)
    
  4. Lin, M.; Zhang, Z.: Question-driven segmentation of lecture speech text : towards intelligent e-learning systems (2008) 0.16
    0.15981735 = sum of:
      0.15981735 = product of:
        0.66590565 = sum of:
          0.015525397 = weight(abstract_txt:results in 2351) [ClassicSimilarity], result of:
            0.015525397 = score(doc=2351,freq=1.0), product of:
              0.07140893 = queryWeight, product of:
                1.133673 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.018107336 = queryNorm
              0.21741535 = fieldWeight in 2351, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
          0.063279055 = weight(abstract_txt:sentences in 2351) [ClassicSimilarity], result of:
            0.063279055 = score(doc=2351,freq=1.0), product of:
              0.14461745 = queryWeight, product of:
                1.1407931 = boost
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.018107336 = queryNorm
              0.4375617 = fieldWeight in 2351, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.000987 = idf(docFreq=109, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
          0.0077075693 = weight(abstract_txt:this in 2351) [ClassicSimilarity], result of:
            0.0077075693 = score(doc=2351,freq=1.0), product of:
              0.051250655 = queryWeight, product of:
                1.1762695 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018107336 = queryNorm
              0.15038967 = fieldWeight in 2351, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
          0.020227404 = weight(abstract_txt:article in 2351) [ClassicSimilarity], result of:
            0.020227404 = score(doc=2351,freq=1.0), product of:
              0.085182585 = queryWeight, product of:
                1.238189 = boost
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.018107336 = queryNorm
              0.23745938 = fieldWeight in 2351, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
          0.26111546 = weight(abstract_txt:segmentation in 2351) [ClassicSimilarity], result of:
            0.26111546 = score(doc=2351,freq=2.0), product of:
              0.37205097 = queryWeight, product of:
                2.587693 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.018107336 = queryNorm
              0.7018271 = fieldWeight in 2351, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
          0.29805076 = weight(abstract_txt:recognition in 2351) [ClassicSimilarity], result of:
            0.29805076 = score(doc=2351,freq=2.0), product of:
              0.55151105 = queryWeight, product of:
                4.981483 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.018107336 = queryNorm
              0.5404257 = fieldWeight in 2351, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0625 = fieldNorm(doc=2351)
        0.24 = coord(6/25)
    
  5. Shaalan, K.; Raza, H.: NERA: Named Entity Recognition for Arabic (2009) 0.16
    0.15938857 = sum of:
      0.15938857 = product of:
        0.5692449 = sum of:
          0.023529429 = weight(abstract_txt:results in 3953) [ClassicSimilarity], result of:
            0.023529429 = score(doc=3953,freq=3.0), product of:
              0.07140893 = queryWeight, product of:
                1.133673 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.018107336 = queryNorm
              0.3295026 = fieldWeight in 3953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.009537631 = weight(abstract_txt:this in 3953) [ClassicSimilarity], result of:
            0.009537631 = score(doc=3953,freq=2.0), product of:
              0.051250655 = queryWeight, product of:
                1.1762695 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.018107336 = queryNorm
              0.18609773 = fieldWeight in 3953, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.017698977 = weight(abstract_txt:article in 3953) [ClassicSimilarity], result of:
            0.017698977 = score(doc=3953,freq=1.0), product of:
              0.085182585 = queryWeight, product of:
                1.238189 = boost
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.018107336 = queryNorm
              0.20777695 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.79935 = idf(docFreq=2702, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.08224081 = weight(abstract_txt:recognizing in 3953) [ClassicSimilarity], result of:
            0.08224081 = score(doc=3953,freq=1.0), product of:
              0.18826385 = queryWeight, product of:
                1.3016074 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.018107336 = queryNorm
              0.43683803 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.05451845 = weight(abstract_txt:processing in 3953) [ClassicSimilarity], result of:
            0.05451845 = score(doc=3953,freq=2.0), product of:
              0.14313231 = queryWeight, product of:
                1.6050197 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.018107336 = queryNorm
              0.38089547 = fieldWeight in 3953, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.062313017 = weight(abstract_txt:written in 3953) [ClassicSimilarity], result of:
            0.062313017 = score(doc=3953,freq=1.0), product of:
              0.19713838 = queryWeight, product of:
                1.8836366 = boost
                5.779889 = idf(docFreq=372, maxDocs=44421)
                0.018107336 = queryNorm
              0.3160877 = fieldWeight in 3953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.779889 = idf(docFreq=372, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
          0.31940663 = weight(abstract_txt:recognition in 3953) [ClassicSimilarity], result of:
            0.31940663 = score(doc=3953,freq=3.0), product of:
              0.55151105 = queryWeight, product of:
                4.981483 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.018107336 = queryNorm
              0.5791482 = fieldWeight in 3953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3953)
        0.28 = coord(7/25)
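
The coord factor explains why the list is not ordered by the raw sum of term scores: hits 3 and 4 have the largest sums but match only 6 of the 25 query clauses, so they fall behind hits that match 7. A short sketch using the sums and coord values reported above (the names `HITS` and `final_score` are illustrative):

```python
# (doc id, sum of term scores, matched clauses) as reported for the five hits
HITS = [
    (2841, 0.64027464, 7),
    (3642, 0.6208173, 7),
    (206, 0.6671071, 6),
    (2351, 0.66590565, 6),
    (3953, 0.5692449, 7),
]
TOTAL_CLAUSES = 25


def final_score(term_sum, matched, total=TOTAL_CLAUSES):
    """Apply coord(matched/total) to the summed term scores."""
    return term_sum * matched / total


# Ranking by raw sum versus ranking by the coord-adjusted score.
by_raw_sum = [doc for doc, _, _ in sorted(HITS, key=lambda h: h[1], reverse=True)]
by_final = [
    doc
    for doc, _, _ in sorted(HITS, key=lambda h: final_score(h[1], h[2]), reverse=True)
]

print(by_raw_sum)  # [206, 2351, 2841, 3642, 3953]
print(by_final)    # [2841, 3642, 206, 2351, 3953]
```

The second ordering matches the order in which the five similar documents are listed.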