Document (#4891)

Author
Haas, S.
He, S.
Title
Toward the automatic identification of sublanguage vocabulary
Source
Information processing and management. 29(1993) no.6, p.721-744
Year
1993
Abstract
Describes a method for automatically identifying sublanguage vocabulary words as they occur in abstracts. Describes the identification procedures, using abstracts from computer science and from library and information science as sublanguage sources. Evaluates the results using three criteria. Discusses the practical and theoretical significance of this research and plans for further experiments.
Theme
Automatic indexing

Similar documents (author)

  1. Haas, S.W.: A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:haas in 8070) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 8070, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=8070)
    
  2. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:haas in 6568) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 6568, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=6568)
    
  3. Haas, S.W.: A text filter for the automatic identification of empirical articles (1996) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:haas in 6866) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 6866, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=6866)
    
  4. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:haas in 484) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=484)
    
  5. Haas, S.: Metadata mania : an overview (1998) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:haas in 3222) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 3222, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=3222)
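
The author scores above all follow the same pattern: fieldWeight = tf * idf * fieldNorm. The sketch below reproduces that arithmetic in Python. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are the formulas that match the values printed in this listing; other Lucene versions may define idf slightly differently. The function names are illustrative and are not part of any Lucene API, and fieldNorm is taken directly from the listing rather than derived.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches this listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm (fieldNorm is copied from the listing)."""
    return tf(freq) * idf(doc_freq, max_docs) * field_norm


# Values from the author_txt:haas entries above.
author_score = field_weight(freq=1.0, doc_freq=17, max_docs=44421, field_norm=0.625)
print(author_score)
```

Because every author match has freq=1.0, the same docFreq, and the same fieldNorm, all five documents receive an identical score, which is why the list shows 5.51 for each of them.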
    

Similar documents (content)

  1. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 0.30
    0.30251405 = sum of:
      0.30251405 = product of:
        0.8403168 = sum of:
          0.018484553 = weight(abstract_txt:method in 6568) [ClassicSimilarity], result of:
            0.018484553 = score(doc=6568,freq=2.0), product of:
              0.046485405 = queryWeight, product of:
                1.0419986 = boost
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.009916357 = queryNorm
              0.39764208 = fieldWeight in 6568, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4988065 = idf(docFreq=1342, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.015842881 = weight(abstract_txt:practical in 6568) [ClassicSimilarity], result of:
            0.015842881 = score(doc=6568,freq=1.0), product of:
              0.052845754 = queryWeight, product of:
                1.1109996 = boost
                4.7967167 = idf(docFreq=996, maxDocs=44421)
                0.009916357 = queryNorm
              0.2997948 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7967167 = idf(docFreq=996, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.016241565 = weight(abstract_txt:theoretical in 6568) [ClassicSimilarity], result of:
            0.016241565 = score(doc=6568,freq=1.0), product of:
              0.053728648 = queryWeight, product of:
                1.1202419 = boost
                4.83662 = idf(docFreq=957, maxDocs=44421)
                0.009916357 = queryNorm
              0.30228874 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.83662 = idf(docFreq=957, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.042492047 = weight(abstract_txt:occur in 6568) [ClassicSimilarity], result of:
            0.042492047 = score(doc=6568,freq=1.0), product of:
              0.10201384 = queryWeight, product of:
                1.543613 = boost
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.009916357 = queryNorm
              0.4165322 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.016102878 = weight(abstract_txt:describes in 6568) [ClassicSimilarity], result of:
            0.016102878 = score(doc=6568,freq=1.0), product of:
              0.06730794 = queryWeight, product of:
                1.773197 = boost
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.009916357 = queryNorm
              0.23924187 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82787 = idf(docFreq=2626, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.016391253 = weight(abstract_txt:science in 6568) [ClassicSimilarity], result of:
            0.016391253 = score(doc=6568,freq=1.0), product of:
              0.06810915 = queryWeight, product of:
                1.7837195 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.009916357 = queryNorm
              0.24066156 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.06083266 = weight(abstract_txt:abstracts in 6568) [ClassicSimilarity], result of:
            0.06083266 = score(doc=6568,freq=1.0), product of:
              0.16326328 = queryWeight, product of:
                2.7616467 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.009916357 = queryNorm
              0.37260467 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.12081723 = weight(abstract_txt:identification in 6568) [ClassicSimilarity], result of:
            0.12081723 = score(doc=6568,freq=2.0), product of:
              0.23437089 = queryWeight, product of:
                4.052484 = boost
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.009916357 = queryNorm
              0.5154959 = fieldWeight in 6568, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.5331117 = weight(abstract_txt:sublanguage in 6568) [ClassicSimilarity], result of:
            0.5331117 = score(doc=6568,freq=1.0), product of:
              0.8743516 = queryWeight, product of:
                9.038199 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.009916357 = queryNorm
              0.6097223 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
        0.36 = coord(9/25)
    
  2. Losee, R.M.; Haas, S.W.: Sublanguage terms : dictionaries, usage, and automatic classification (1995) 0.28
    0.28332117 = sum of:
      0.28332117 = product of:
        1.7707574 = sum of:
          0.017789867 = weight(abstract_txt:using in 2718) [ClassicSimilarity], result of:
            0.017789867 = score(doc=2718,freq=1.0), product of:
              0.054893166 = queryWeight, product of:
                1.6013379 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.009916357 = queryNorm
              0.32408163 = fieldWeight in 2718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.09375 = fieldNorm(doc=2718)
          0.02458688 = weight(abstract_txt:science in 2718) [ClassicSimilarity], result of:
            0.02458688 = score(doc=2718,freq=1.0), product of:
              0.06810915 = queryWeight, product of:
                1.7837195 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.009916357 = queryNorm
              0.36099234 = fieldWeight in 2718, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.09375 = fieldNorm(doc=2718)
          0.12904556 = weight(abstract_txt:abstracts in 2718) [ClassicSimilarity], result of:
            0.12904556 = score(doc=2718,freq=2.0), product of:
              0.16326328 = queryWeight, product of:
                2.7616467 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.009916357 = queryNorm
              0.79041386 = fieldWeight in 2718, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.09375 = fieldNorm(doc=2718)
          1.5993351 = weight(abstract_txt:sublanguage in 2718) [ClassicSimilarity], result of:
            1.5993351 = score(doc=2718,freq=4.0), product of:
              0.8743516 = queryWeight, product of:
                9.038199 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.009916357 = queryNorm
              1.8291669 = fieldWeight in 2718, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.09375 = fieldNorm(doc=2718)
        0.16 = coord(4/25)
    
  3. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.09
    0.09174075 = sum of:
      0.09174075 = product of:
        1.1467594 = sum of:
          0.080536 = weight(abstract_txt:automatic in 4777) [ClassicSimilarity], result of:
            0.080536 = score(doc=4777,freq=1.0), product of:
              0.124004476 = queryWeight, product of:
                2.406814 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.009916357 = queryNorm
              0.64946043 = fieldWeight in 4777, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.125 = fieldNorm(doc=4777)
          1.0662234 = weight(abstract_txt:sublanguage in 4777) [ClassicSimilarity], result of:
            1.0662234 = score(doc=4777,freq=1.0), product of:
              0.8743516 = queryWeight, product of:
                9.038199 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.009916357 = queryNorm
              1.2194446 = fieldWeight in 4777, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.125 = fieldNorm(doc=4777)
        0.08 = coord(2/25)
    
  4. Salton, G.: Automatic processing of foreign language documents (1985) 0.09
    0.087982096 = sum of:
      0.087982096 = product of:
        0.21995524 = sum of:
          0.008664704 = weight(abstract_txt:computer in 4650) [ClassicSimilarity], result of:
            0.008664704 = score(doc=4650,freq=1.0), product of:
              0.042813655 = queryWeight, product of:
                4.317478 = idf(docFreq=1609, maxDocs=44421)
                0.009916357 = queryNorm
              0.20238179 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.317478 = idf(docFreq=1609, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.0108891055 = weight(abstract_txt:further in 4650) [ClassicSimilarity], result of:
            0.0108891055 = score(doc=4650,freq=1.0), product of:
              0.049858738 = queryWeight, product of:
                1.0791442 = boost
                4.6591816 = idf(docFreq=1143, maxDocs=44421)
                0.009916357 = queryNorm
              0.21839914 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6591816 = idf(docFreq=1143, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.011882162 = weight(abstract_txt:practical in 4650) [ClassicSimilarity], result of:
            0.011882162 = score(doc=4650,freq=1.0), product of:
              0.052845754 = queryWeight, product of:
                1.1109996 = boost
                4.7967167 = idf(docFreq=996, maxDocs=44421)
                0.009916357 = queryNorm
              0.2248461 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7967167 = idf(docFreq=996, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.012181173 = weight(abstract_txt:theoretical in 4650) [ClassicSimilarity], result of:
            0.012181173 = score(doc=4650,freq=1.0), product of:
              0.053728648 = queryWeight, product of:
                1.1202419 = boost
                4.83662 = idf(docFreq=957, maxDocs=44421)
                0.009916357 = queryNorm
              0.22671655 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.83662 = idf(docFreq=957, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.01654032 = weight(abstract_txt:words in 4650) [ClassicSimilarity], result of:
            0.01654032 = score(doc=4650,freq=1.0), product of:
              0.06588336 = queryWeight, product of:
                1.2404999 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.009916357 = queryNorm
              0.25105458 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.018542314 = weight(abstract_txt:criteria in 4650) [ClassicSimilarity], result of:
            0.018542314 = score(doc=4650,freq=1.0), product of:
              0.07109773 = queryWeight, product of:
                1.2886552 = boost
                5.5637407 = idf(docFreq=462, maxDocs=44421)
                0.009916357 = queryNorm
              0.26080036 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5637407 = idf(docFreq=462, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.012579335 = weight(abstract_txt:using in 4650) [ClassicSimilarity], result of:
            0.012579335 = score(doc=4650,freq=2.0), product of:
              0.054893166 = queryWeight, product of:
                1.6013379 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.009916357 = queryNorm
              0.22916031 = fieldWeight in 4650, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.01229344 = weight(abstract_txt:science in 4650) [ClassicSimilarity], result of:
            0.01229344 = score(doc=4650,freq=1.0), product of:
              0.06810915 = queryWeight, product of:
                1.7837195 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.009916357 = queryNorm
              0.18049617 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.052309666 = weight(abstract_txt:automatic in 4650) [ClassicSimilarity], result of:
            0.052309666 = score(doc=4650,freq=3.0), product of:
              0.124004476 = queryWeight, product of:
                2.406814 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.009916357 = queryNorm
              0.4218369 = fieldWeight in 4650, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
          0.06407301 = weight(abstract_txt:identification in 4650) [ClassicSimilarity], result of:
            0.06407301 = score(doc=4650,freq=1.0), product of:
              0.23437089 = queryWeight, product of:
                4.052484 = boost
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.009916357 = queryNorm
              0.273383 = fieldWeight in 4650, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8321705 = idf(docFreq=353, maxDocs=44421)
                0.046875 = fieldNorm(doc=4650)
        0.4 = coord(10/25)
    
  5. Hmeidi, I.; Kanaan, G.; Evens, M.: Design and implementation of automatic indexing for information retrieval with Arabic documents (1997) 0.08
    0.08175317 = sum of:
      0.08175317 = product of:
        0.29197562 = sum of:
          0.0144411735 = weight(abstract_txt:computer in 2660) [ClassicSimilarity], result of:
            0.0144411735 = score(doc=2660,freq=1.0), product of:
              0.042813655 = queryWeight, product of:
                4.317478 = idf(docFreq=1609, maxDocs=44421)
                0.009916357 = queryNorm
              0.33730298 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.317478 = idf(docFreq=1609, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.027089903 = weight(abstract_txt:experiments in 2660) [ClassicSimilarity], result of:
            0.027089903 = score(doc=2660,freq=1.0), product of:
              0.06512068 = queryWeight, product of:
                1.2332989 = boost
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.009916357 = queryNorm
              0.4159954 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.324741 = idf(docFreq=587, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.0275672 = weight(abstract_txt:words in 2660) [ClassicSimilarity], result of:
            0.0275672 = score(doc=2660,freq=1.0), product of:
              0.06588336 = queryWeight, product of:
                1.2404999 = boost
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.009916357 = queryNorm
              0.4184243 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.355831 = idf(docFreq=569, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.025677461 = weight(abstract_txt:using in 2660) [ClassicSimilarity], result of:
            0.025677461 = score(doc=2660,freq=3.0), product of:
              0.054893166 = queryWeight, product of:
                1.6013379 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.009916357 = queryNorm
              0.46777156 = fieldWeight in 2660, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.020489069 = weight(abstract_txt:science in 2660) [ClassicSimilarity], result of:
            0.020489069 = score(doc=2660,freq=1.0), product of:
              0.06810915 = queryWeight, product of:
                1.7837195 = boost
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.009916357 = queryNorm
              0.30082697 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.850585 = idf(docFreq=2567, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.100669995 = weight(abstract_txt:automatic in 2660) [ClassicSimilarity], result of:
            0.100669995 = score(doc=2660,freq=4.0), product of:
              0.124004476 = queryWeight, product of:
                2.406814 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.009916357 = queryNorm
              0.8118255 = fieldWeight in 2660, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
          0.07604082 = weight(abstract_txt:abstracts in 2660) [ClassicSimilarity], result of:
            0.07604082 = score(doc=2660,freq=1.0), product of:
              0.16326328 = queryWeight, product of:
                2.7616467 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.009916357 = queryNorm
              0.46575582 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.078125 = fieldNorm(doc=2660)
        0.28 = coord(7/25)
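
The content scores above combine three steps: each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by coord (matched query terms divided by total query terms). The sketch below reproduces entry 3 (doc 4777) under the same assumptions as the listing: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, and a boost of 1.0 where the listing prints none. The function names are illustrative only, and boost, queryNorm, and fieldNorm are copied from the listing rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches this listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               field_norm: float, query_norm: float, boost: float = 1.0) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total_terms: int) -> float:
    """Sum of term scores scaled by coord = matched / total_terms."""
    return sum(term_scores) * (matched / total_terms)


MAX_DOCS = 44421
QUERY_NORM = 0.009916357  # copied from the listing

# Entry 3 (doc 4777): two of the 25 query terms match.
automatic = term_score(1.0, 668, MAX_DOCS, 0.125, QUERY_NORM, boost=2.406814)
sublanguage = term_score(1.0, 6, MAX_DOCS, 0.125, QUERY_NORM, boost=9.038199)
total = document_score([automatic, sublanguage], matched=2, total_terms=25)
print(automatic, sublanguage, total)
```

The coord factor explains why entry 4 (ten matching terms, coord 0.4) still ranks below entry 3: it matches more terms, but none of them is the rare, heavily boosted term "sublanguage".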