Document (#27207)

Al-Sughaiyer, I.A.
Al-Kharashi, I.A.
Arabic morphological analysis techniques : a comprehensive survey
Journal of the American Society for Information Science and Technology. 55(2004) no.3, pp.189-213
After several decades of heavy research activity on English stemmers, Arabic morphological analysis techniques have become a popular area of research. The Arabic language is one of the Semitic languages; it exhibits a very systematic but complex morphological structure based on root-pattern schemes. As a consequence, a survey of such techniques proves to be more necessary. The aim of this paper is to summarize and organize the information available in the literature in an attempt to motivate researchers to look into these techniques and try to develop more advanced ones. This paper introduces, classifies, and surveys Arabic morphological analysis techniques. Furthermore, conclusions, open areas, and future directions are provided at the end.

Similar documents (content)

  1. Kanaan, G.; Al-Shalabi, R.; Ghwanmeh, S.; Al-Ma'adeed, H.: A comparison of text-classification techniques applied to Arabic text (2009) 0.20
    0.19942266 = sum of:
      0.19942266 = product of:
        0.9971133 = sum of:
          0.019327285 = weight(abstract_txt:research in 83) [ClassicSimilarity], result of:
            0.019327285 = score(doc=83,freq=2.0), product of:
              0.046137594 = queryWeight, product of:
                1.0416185 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.014018987 = queryNorm
              0.41890535 = fieldWeight in 83, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.09375 = fieldNorm(doc=83)
          0.016972996 = weight(abstract_txt:more in 83) [ClassicSimilarity], result of:
            0.016972996 = score(doc=83,freq=1.0), product of:
              0.053307716 = queryWeight, product of:
                1.1196344 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.014018987 = queryNorm
              0.31839663 = fieldWeight in 83, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.09375 = fieldNorm(doc=83)
          0.017972209 = weight(abstract_txt:paper in 83) [ClassicSimilarity], result of:
            0.017972209 = score(doc=83,freq=1.0), product of:
              0.05537988 = queryWeight, product of:
                1.141188 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.014018987 = queryNorm
              0.32452595 = fieldWeight in 83, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.09375 = fieldNorm(doc=83)
          0.10083479 = weight(abstract_txt:techniques in 83) [ClassicSimilarity], result of:
            0.10083479 = score(doc=83,freq=1.0), product of:
              0.23732199 = queryWeight, product of:
                3.7352545 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.014018987 = queryNorm
              0.424886 = fieldWeight in 83, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.09375 = fieldNorm(doc=83)
          0.842006 = weight(abstract_txt:arabic in 83) [ClassicSimilarity], result of:
            0.842006 = score(doc=83,freq=5.0), product of:
              0.5302913 = queryWeight, product of:
                4.9940567 = boost
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.014018987 = queryNorm
              1.5878179 = fieldWeight in 83, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.09375 = fieldNorm(doc=83)
        0.2 = coord(5/25)
  2. Anizi, M.; Dichy, J.: Improving information retrieval in Arabic through a multi-agent approach and a rich lexical resource (2011) 0.17
    0.1673507 = sum of:
      0.1673507 = product of:
        0.8367535 = sum of:
          0.0091109695 = weight(abstract_txt:research in 738) [ClassicSimilarity], result of:
            0.0091109695 = score(doc=738,freq=1.0), product of:
              0.046137594 = queryWeight, product of:
                1.0416185 = boost
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.014018987 = queryNorm
              0.19747387 = fieldWeight in 738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.159582 = idf(docFreq=5124, maxDocs=44421)
                0.0625 = fieldNorm(doc=738)
          0.011981472 = weight(abstract_txt:paper in 738) [ClassicSimilarity], result of:
            0.011981472 = score(doc=738,freq=1.0), product of:
              0.05537988 = queryWeight, product of:
                1.141188 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.014018987 = queryNorm
              0.21635064 = fieldWeight in 738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.0625 = fieldNorm(doc=738)
          0.03636315 = weight(abstract_txt:analysis in 738) [ClassicSimilarity], result of:
            0.03636315 = score(doc=738,freq=3.0), product of:
              0.09213887 = queryWeight, product of:
                1.8028029 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.014018987 = queryNorm
              0.3946559 = fieldWeight in 738, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0625 = fieldNorm(doc=738)
          0.3550209 = weight(abstract_txt:arabic in 738) [ClassicSimilarity], result of:
            0.3550209 = score(doc=738,freq=2.0), product of:
              0.5302913 = queryWeight, product of:
                4.9940567 = boost
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.014018987 = queryNorm
              0.66948277 = fieldWeight in 738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.0625 = fieldNorm(doc=738)
          0.42427695 = weight(abstract_txt:morphological in 738) [ClassicSimilarity], result of:
            0.42427695 = score(doc=738,freq=2.0), product of:
              0.5971886 = queryWeight, product of:
                5.299708 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.014018987 = queryNorm
              0.7104572 = fieldWeight in 738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.0625 = fieldNorm(doc=738)
        0.2 = coord(5/25)
  3. Mansour, N.; Haraty, R.A.; Daher, W.; Houri, M.: An auto-indexing method for Arabic text (2008) 0.17
    0.16645701 = sum of:
      0.16645701 = product of:
        0.83228505 = sum of:
          0.011315331 = weight(abstract_txt:more in 3103) [ClassicSimilarity], result of:
            0.011315331 = score(doc=3103,freq=1.0), product of:
              0.053307716 = queryWeight, product of:
                1.1196344 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.014018987 = queryNorm
              0.21226442 = fieldWeight in 3103, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.0625 = fieldNorm(doc=3103)
          0.011981472 = weight(abstract_txt:paper in 3103) [ClassicSimilarity], result of:
            0.011981472 = score(doc=3103,freq=1.0), product of:
              0.05537988 = queryWeight, product of:
                1.141188 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.014018987 = queryNorm
              0.21635064 = fieldWeight in 3103, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.0625 = fieldNorm(doc=3103)
          0.029690387 = weight(abstract_txt:analysis in 3103) [ClassicSimilarity], result of:
            0.029690387 = score(doc=3103,freq=2.0), product of:
              0.09213887 = queryWeight, product of:
                1.8028029 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.014018987 = queryNorm
              0.3222352 = fieldWeight in 3103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.0625 = fieldNorm(doc=3103)
          0.3550209 = weight(abstract_txt:arabic in 3103) [ClassicSimilarity], result of:
            0.3550209 = score(doc=3103,freq=2.0), product of:
              0.5302913 = queryWeight, product of:
                4.9940567 = boost
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.014018987 = queryNorm
              0.66948277 = fieldWeight in 3103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.0625 = fieldNorm(doc=3103)
          0.42427695 = weight(abstract_txt:morphological in 3103) [ClassicSimilarity], result of:
            0.42427695 = score(doc=3103,freq=2.0), product of:
              0.5971886 = queryWeight, product of:
                5.299708 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.014018987 = queryNorm
              0.7104572 = fieldWeight in 3103, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.0625 = fieldNorm(doc=3103)
        0.2 = coord(5/25)
  4. Abdelali, A.: Localization in modern standard Arabic (2004) 0.12
    0.121125676 = sum of:
      0.121125676 = product of:
        0.7570355 = sum of:
          0.0141441645 = weight(abstract_txt:more in 3066) [ClassicSimilarity], result of:
            0.0141441645 = score(doc=3066,freq=1.0), product of:
              0.053307716 = queryWeight, product of:
                1.1196344 = boost
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.014018987 = queryNorm
              0.26533052 = fieldWeight in 3066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3962307 = idf(docFreq=4044, maxDocs=44421)
                0.078125 = fieldNorm(doc=3066)
          0.014976841 = weight(abstract_txt:paper in 3066) [ClassicSimilarity], result of:
            0.014976841 = score(doc=3066,freq=1.0), product of:
              0.05537988 = queryWeight, product of:
                1.141188 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.014018987 = queryNorm
              0.2704383 = fieldWeight in 3066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.078125 = fieldNorm(doc=3066)
          0.026242845 = weight(abstract_txt:analysis in 3066) [ClassicSimilarity], result of:
            0.026242845 = score(doc=3066,freq=1.0), product of:
              0.09213887 = queryWeight, product of:
                1.8028029 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.014018987 = queryNorm
              0.28481838 = fieldWeight in 3066, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.078125 = fieldNorm(doc=3066)
          0.70167166 = weight(abstract_txt:arabic in 3066) [ClassicSimilarity], result of:
            0.70167166 = score(doc=3066,freq=5.0), product of:
              0.5302913 = queryWeight, product of:
                4.9940567 = boost
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.014018987 = queryNorm
              1.3231815 = fieldWeight in 3066, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.574333 = idf(docFreq=61, maxDocs=44421)
                0.078125 = fieldNorm(doc=3066)
        0.16 = coord(4/25)
  5. Fautsch, C.; Savoy, J.: Algorithmic stemmers or morphological analysis? : an evaluation (2009) 0.11
    0.10520074 = sum of:
      0.10520074 = product of:
        0.87667286 = sum of:
          0.19002113 = weight(abstract_txt:stemmers in 3950) [ClassicSimilarity], result of:
            0.19002113 = score(doc=3950,freq=2.0), product of:
              0.18978117 = queryWeight, product of:
                1.4938011 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.014018987 = queryNorm
              1.0012643 = fieldWeight in 3950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.037112985 = weight(abstract_txt:analysis in 3950) [ClassicSimilarity], result of:
            0.037112985 = score(doc=3950,freq=2.0), product of:
              0.09213887 = queryWeight, product of:
                1.8028029 = boost
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.014018987 = queryNorm
              0.402794 = fieldWeight in 3950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6456752 = idf(docFreq=3151, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
          0.64953876 = weight(abstract_txt:morphological in 3950) [ClassicSimilarity], result of:
            0.64953876 = score(doc=3950,freq=3.0), product of:
              0.5971886 = queryWeight, product of:
                5.299708 = boost
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.014018987 = queryNorm
              1.087661 = fieldWeight in 3950, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.037906 = idf(docFreq=38, maxDocs=44421)
                0.078125 = fieldNorm(doc=3950)
        0.12 = coord(3/25)
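
The score trees above follow one arithmetic pattern, which can be sketched in a few lines. The sketch below recomputes the last entry (doc 3950) and assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the numbers in the listing. The boost, queryNorm, fieldNorm, and coord values are copied from the listing rather than derived; the function and variable names are illustrative and not part of the catalog.

```python
import math

# Constants copied from the explanation tree for doc 3950 (not derived here).
MAX_DOCS = 44421
QUERY_NORM = 0.014018987
FIELD_NORM = 0.078125   # fieldNorm(doc=3950)
COORD = 3 / 25          # 3 of the 25 query terms matched

# term -> (boost, docFreq, termFreq), as shown in the listing.
TERMS = {
    "stemmers": (1.4938011, 13, 2.0),
    "analysis": (1.8028029, 3151, 2.0),
    "morphological": (5.299708, 38, 3.0),
}


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


def document_score(terms=TERMS):
    """Sum of the term scores, scaled by the coordination factor."""
    return COORD * sum(term_score(*values) for values in terms.values())


if __name__ == "__main__":
    for name, values in TERMS.items():
        print(name, term_score(*values))
    print("total", document_score())
```

The total should land close to the 0.10520074 reported for entry 5; small differences in the last digits are expected because Lucene computes in single precision.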