Document (#11719)

Author
Losee, R.M.
Haas, S.W.
Title
Sublanguage terms : dictionaries, usage, and automatic classification
Source
Journal of the American Society for Information Science. 46(1995) no.7, pp. 519-529
Year
1995
Abstract
The use of terms from natural and social science titles and abstracts is studied from the perspective of sublanguages and their specialized dictionaries. Explores different notions of sublanguage distinctiveness. Objective methods for separating hard and soft sciences are suggested based on measures of sublanguage use, dictionary characteristics, and sublanguage distinctiveness. Abstracts were automatically classified with a high degree of accuracy by using a formula that considers the degree of uniqueness of terms in each sublanguage. This may prove useful for text filtering or information retrieval systems.
Theme
Automatic classification

Similar documents (author)

  1. Haas, S.W.; Losee, R.M.: Looking in text windows : their size and composition (1994) 6.05
    6.049801 = sum of:
      6.049801 = sum of:
        2.754635 = weight(author_txt:losee in 139) [ClassicSimilarity], result of:
          2.754635 = score(doc=139,freq=1.0), product of:
            0.6637459 = queryWeight, product of:
              8.30027 = idf(docFreq=29, maxDocs=44421)
              0.07996678 = queryNorm
            4.150135 = fieldWeight in 139, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.30027 = idf(docFreq=29, maxDocs=44421)
              0.5 = fieldNorm(doc=139)
        3.2951655 = weight(author_txt:haas in 139) [ClassicSimilarity], result of:
          3.2951655 = score(doc=139,freq=1.0), product of:
            0.7479581 = queryWeight, product of:
              1.0615433 = boost
              8.811096 = idf(docFreq=17, maxDocs=44421)
              0.07996678 = queryNorm
            4.405548 = fieldWeight in 139, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.811096 = idf(docFreq=17, maxDocs=44421)
              0.5 = fieldNorm(doc=139)
    
  2. Haas, S.W.: A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 2.06
    2.0594785 = sum of:
      2.0594785 = product of:
        4.118957 = sum of:
          4.118957 = weight(author_txt:haas in 8070) [ClassicSimilarity], result of:
            4.118957 = score(doc=8070,freq=1.0), product of:
              0.7479581 = queryWeight, product of:
                1.0615433 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.07996678 = queryNorm
              5.506935 = fieldWeight in 8070, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.625 = fieldNorm(doc=8070)
        0.5 = coord(1/2)
    
  3. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 2.06
    2.0594785 = sum of:
      2.0594785 = product of:
        4.118957 = sum of:
          4.118957 = weight(author_txt:haas in 6568) [ClassicSimilarity], result of:
            4.118957 = score(doc=6568,freq=1.0), product of:
              0.7479581 = queryWeight, product of:
                1.0615433 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.07996678 = queryNorm
              5.506935 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.625 = fieldNorm(doc=6568)
        0.5 = coord(1/2)
    
  4. Haas, S.W.: A text filter for the automatic identification of empirical articles (1996) 2.06
    2.0594785 = sum of:
      2.0594785 = product of:
        4.118957 = sum of:
          4.118957 = weight(author_txt:haas in 6866) [ClassicSimilarity], result of:
            4.118957 = score(doc=6866,freq=1.0), product of:
              0.7479581 = queryWeight, product of:
                1.0615433 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.07996678 = queryNorm
              5.506935 = fieldWeight in 6866, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.625 = fieldNorm(doc=6866)
        0.5 = coord(1/2)
    
  5. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 2.06
    2.0594785 = sum of:
      2.0594785 = product of:
        4.118957 = sum of:
          4.118957 = weight(author_txt:haas in 484) [ClassicSimilarity], result of:
            4.118957 = score(doc=484,freq=1.0), product of:
              0.7479581 = queryWeight, product of:
                1.0615433 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.07996678 = queryNorm
              5.506935 = fieldWeight in 484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.625 = fieldNorm(doc=484)
        0.5 = coord(1/2)
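
The author scores above can be checked by hand. The sketch below is a minimal reconstruction, not the catalog's own code: it assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the idf values printed in the trees, and it copies queryNorm, boost and fieldNorm from the listing. The function and variable names are made up for illustration. Hit 1 matches both author terms and its tree shows no coord line; hits 2 to 5 match only "haas" and are multiplied by coord(1/2).

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in the explain trees
QUERY_NORM = 0.07996678   # queryNorm of the author query, copied from the listing
HAAS_BOOST = 1.0615433    # boost shown for author_txt:haas


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf form; it reproduces 8.30027 (docFreq=29) and 8.811096 (docFreq=17).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as in each "product of:" node.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Hit 1 (doc 139): both author terms match, fieldNorm 0.5.
losee = term_score(1.0, 29, 0.5)
haas = term_score(1.0, 17, 0.5, boost=HAAS_BOOST)
hit1 = losee + haas

# Hits 2 to 5: only "haas" matches, fieldNorm 0.625, then coord(1/2).
hit2 = term_score(1.0, 17, 0.625, boost=HAAS_BOOST) * (1 / 2)

print(f"hit 1: {hit1:.6f}")
print(f"hit 2: {hit2:.6f}")
```

The two printed values should agree with 6.049801 and 2.0594785 in the listing up to floating-point rounding, since the original figures appear to be computed in single precision.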
    

Similar documents (content)

  1. Haas, S.; He, S.: Toward the automatic identification of sublanguage vocabulary (1993) 0.24
    0.23881769 = sum of:
      0.23881769 = product of:
        1.9901475 = sum of:
          0.009632611 = weight(abstract_txt:from in 4890) [ClassicSimilarity], result of:
            0.009632611 = score(doc=4890,freq=1.0), product of:
              0.027926693 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.010120571 = queryNorm
              0.34492487 = fieldWeight in 4890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.125 = fieldNorm(doc=4890)
          0.13737899 = weight(abstract_txt:abstracts in 4890) [ClassicSimilarity], result of:
            0.13737899 = score(doc=4890,freq=2.0), product of:
              0.1303548 = queryWeight, product of:
                2.1604977 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.010120571 = queryNorm
              1.0538851 = fieldWeight in 4890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.125 = fieldNorm(doc=4890)
          1.8431358 = weight(abstract_txt:sublanguage in 4890) [ClassicSimilarity], result of:
            1.8431358 = score(doc=4890,freq=3.0), product of:
              0.87263906 = queryWeight, product of:
                8.83848 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.010120571 = queryNorm
              2.11214 = fieldWeight in 4890, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.125 = fieldNorm(doc=4890)
        0.12 = coord(3/25)
    
  2. Haas, S.W.: Disciplinary variation in automatic sublanguage term identification (1997) 0.18
    0.1752314 = sum of:
      0.1752314 = product of:
        0.73013085 = sum of:
          0.0048163054 = weight(abstract_txt:from in 6568) [ClassicSimilarity], result of:
            0.0048163054 = score(doc=6568,freq=1.0), product of:
              0.027926693 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.010120571 = queryNorm
              0.17246243 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.01929302 = weight(abstract_txt:automatically in 6568) [ClassicSimilarity], result of:
            0.01929302 = score(doc=6568,freq=1.0), product of:
              0.0559071 = queryWeight, product of:
                1.0004808 = boost
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.010120571 = queryNorm
              0.3450907 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.521451 = idf(docFreq=482, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.06523097 = weight(abstract_txt:hard in 6568) [ClassicSimilarity], result of:
            0.06523097 = score(doc=6568,freq=4.0), product of:
              0.079338275 = queryWeight, product of:
                1.1918364 = boost
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.010120571 = queryNorm
              0.82218796 = fieldWeight in 6568, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.048570808 = weight(abstract_txt:abstracts in 6568) [ClassicSimilarity], result of:
            0.048570808 = score(doc=6568,freq=1.0), product of:
              0.1303548 = queryWeight, product of:
                2.1604977 = boost
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.010120571 = queryNorm
              0.37260467 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9616747 = idf(docFreq=310, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.060152188 = weight(abstract_txt:terms in 6568) [ClassicSimilarity], result of:
            0.060152188 = score(doc=6568,freq=7.0), product of:
              0.0899585 = queryWeight, product of:
                2.1981483 = boost
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.010120571 = queryNorm
              0.668666 = fieldWeight in 6568, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.043712 = idf(docFreq=2116, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
          0.53206754 = weight(abstract_txt:sublanguage in 6568) [ClassicSimilarity], result of:
            0.53206754 = score(doc=6568,freq=1.0), product of:
              0.87263906 = queryWeight, product of:
                8.83848 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.010120571 = queryNorm
              0.6097223 = fieldWeight in 6568, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.0625 = fieldNorm(doc=6568)
        0.24 = coord(6/25)
    
  3. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.09
    0.08590141 = sum of:
      0.08590141 = product of:
        1.0737677 = sum of:
          0.009632611 = weight(abstract_txt:from in 4777) [ClassicSimilarity], result of:
            0.009632611 = score(doc=4777,freq=1.0), product of:
              0.027926693 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.010120571 = queryNorm
              0.34492487 = fieldWeight in 4777, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.125 = fieldNorm(doc=4777)
          1.0641351 = weight(abstract_txt:sublanguage in 4777) [ClassicSimilarity], result of:
            1.0641351 = score(doc=4777,freq=1.0), product of:
              0.87263906 = queryWeight, product of:
                8.83848 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.010120571 = queryNorm
              1.2194446 = fieldWeight in 4777, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.125 = fieldNorm(doc=4777)
        0.08 = coord(2/25)
    
  4. Hutchins, J.: A new era in machine translation research (1995) 0.08
    0.084696546 = sum of:
      0.084696546 = product of:
        0.7058046 = sum of:
          0.006020382 = weight(abstract_txt:from in 3914) [ClassicSimilarity], result of:
            0.006020382 = score(doc=3914,freq=1.0), product of:
              0.027926693 = queryWeight, product of:
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.010120571 = queryNorm
              0.21557805 = fieldWeight in 3914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.078125 = fieldNorm(doc=3914)
          0.034699813 = weight(abstract_txt:specialized in 3914) [ClassicSimilarity], result of:
            0.034699813 = score(doc=3914,freq=1.0), product of:
              0.07125438 = queryWeight, product of:
                1.1294864 = boost
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.010120571 = queryNorm
              0.48698497 = fieldWeight in 3914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2334075 = idf(docFreq=236, maxDocs=44421)
                0.078125 = fieldNorm(doc=3914)
          0.6650844 = weight(abstract_txt:sublanguage in 3914) [ClassicSimilarity], result of:
            0.6650844 = score(doc=3914,freq=1.0), product of:
              0.87263906 = queryWeight, product of:
                8.83848 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.010120571 = queryNorm
              0.7621529 = fieldWeight in 3914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.078125 = fieldNorm(doc=3914)
        0.12 = coord(3/25)
    
  5. Ananiadou, S.; McNaught, J.: Terms are not alone : term choice and choice terms (1995) 0.08
    0.080304846 = sum of:
      0.080304846 = product of:
        1.0038106 = sum of:
          0.072692506 = weight(abstract_txt:degree in 1859) [ClassicSimilarity], result of:
            0.072692506 = score(doc=1859,freq=1.0), product of:
              0.11744771 = queryWeight, product of:
                2.050749 = boost
                5.658835 = idf(docFreq=420, maxDocs=44421)
                0.010120571 = queryNorm
              0.61893505 = fieldWeight in 1859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.658835 = idf(docFreq=420, maxDocs=44421)
                0.109375 = fieldNorm(doc=1859)
          0.9311182 = weight(abstract_txt:sublanguage in 1859) [ClassicSimilarity], result of:
            0.9311182 = score(doc=1859,freq=1.0), product of:
              0.87263906 = queryWeight, product of:
                8.83848 = boost
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.010120571 = queryNorm
              1.0670141 = fieldWeight in 1859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.755557 = idf(docFreq=6, maxDocs=44421)
                0.109375 = fieldNorm(doc=1859)
        0.08 = coord(2/25)
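
The content scores follow the same pattern, with two extra effects visible in the trees: a term that occurs several times contributes tf = sqrt(freq), and the sum is scaled by coord(matched/25), the share of the 25 query terms found in the abstract. The sketch below is an illustrative reconstruction under the same assumed tf and idf formulas, not the system's implementation. The helper name and the tuple layout are hypothetical, and all numeric inputs are copied from the listing.

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in the explain trees
QUERY_NORM = 0.010120571  # queryNorm of the content query, copied from the listing
QUERY_TERMS = 25          # denominator of the coord(n/25) lines


def idf(doc_freq):
    # Assumed idf form, consistent with the printed values (e.g. 9.755557 for docFreq=6).
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def content_score(matches, field_norm):
    # matches: list of (freq, doc_freq, boost) for each query term found in the abstract.
    total = 0.0
    for freq, doc_freq, boost in matches:
        term_idf = idf(doc_freq)
        query_weight = boost * term_idf * QUERY_NORM
        field_weight = math.sqrt(freq) * term_idf * field_norm
        total += query_weight * field_weight
    # coord rewards documents that match a larger share of the query terms.
    return total * len(matches) / QUERY_TERMS


# Hit 1 (doc 4890): "from", "abstracts" (freq 2), "sublanguage" (freq 3).
doc_4890 = content_score(
    [(1.0, 7646, 1.0), (2.0, 310, 2.1604977), (3.0, 6, 8.83848)],
    field_norm=0.125,
)

# Hit 3 (doc 4777): "from" and "sublanguage", each occurring once.
doc_4777 = content_score(
    [(1.0, 7646, 1.0), (1.0, 6, 8.83848)],
    field_norm=0.125,
)

print(f"doc 4890: {doc_4890:.6f}")
print(f"doc 4777: {doc_4777:.6f}")
```

In both documents nearly all of the score comes from "sublanguage", which has the lowest document frequency (6 of 44421) and the highest boost in the query.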