Document (#2315)

Author
Renouf, A.
Title
Sticking to the text : a corpus linguist's view of language
Source
Aslib proceedings. 45(1993) no.5, pp. 131-136
Year
1993
Abstract
Corpus linguistics is the study of large, computer-held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end, and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon, of word repetition and word positioning in automatic abstracting, and of word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from the hard copy.
Theme
Automatic indexing
Computational linguistics

Similar documents (content)

  1. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.22
    0.21530128 = sum of:
      0.21530128 = product of:
        0.89708865 = sum of:
          0.09061565 = weight(abstract_txt:linguistics in 5896) [ClassicSimilarity], result of:
            0.09061565 = score(doc=5896,freq=2.0), product of:
              0.1233873 = queryWeight, product of:
                1.0066475 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.018440187 = queryNorm
              0.73440015 = fieldWeight in 5896, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
          0.031916432 = weight(abstract_txt:language in 5896) [ClassicSimilarity], result of:
            0.031916432 = score(doc=5896,freq=1.0), product of:
              0.09768575 = queryWeight, product of:
                1.2666972 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018440187 = queryNorm
              0.32672557 = fieldWeight in 5896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
          0.15845852 = weight(abstract_txt:linguists in 5896) [ClassicSimilarity], result of:
            0.15845852 = score(doc=5896,freq=1.0), product of:
              0.22564308 = queryWeight, product of:
                1.361298 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.018440187 = queryNorm
              0.7022529 = fieldWeight in 5896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
          0.17543356 = weight(abstract_txt:sake in 5896) [ClassicSimilarity], result of:
            0.17543356 = score(doc=5896,freq=1.0), product of:
              0.24148312 = queryWeight, product of:
                1.4082688 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.018440187 = queryNorm
              0.72648376 = fieldWeight in 5896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
          0.2427276 = weight(abstract_txt:word in 5896) [ClassicSimilarity], result of:
            0.2427276 = score(doc=5896,freq=3.0), product of:
              0.33001778 = queryWeight, product of:
                3.2926142 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.018440187 = queryNorm
              0.73549855 = fieldWeight in 5896, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
          0.19793688 = weight(abstract_txt:corpus in 5896) [ClassicSimilarity], result of:
            0.19793688 = score(doc=5896,freq=1.0), product of:
              0.41544747 = queryWeight, product of:
                3.6942837 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.018440187 = queryNorm
              0.4764426 = fieldWeight in 5896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.078125 = fieldNorm(doc=5896)
        0.24 = coord(6/25)
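
The per-term lines in the listing above follow the classic Lucene TF-IDF pattern: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and a term's score is queryWeight times fieldWeight. The following is a minimal sketch that recomputes the first term ("linguistics" in doc 5896) from the numbers shown. The function name and parameter names are illustrative and are not part of Lucene; the engine works in 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Recompute one weight(...) node of the explain listing (hypothetical helper)."""
    tf = math.sqrt(freq)                               # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # idf(docFreq=..., maxDocs=...)
    query_weight = boost * idf * query_norm            # queryWeight line
    field_weight = tf * idf * field_norm               # fieldWeight line
    return query_weight * field_weight

# Inputs copied from the "linguistics" entry for doc 5896.
score = classic_term_score(
    freq=2.0,
    doc_freq=155,
    max_docs=44218,
    boost=1.0066475,
    query_norm=0.018440187,
    field_norm=0.078125,
)
print(f"{score:.8f}")  # the listing reports 0.09061565 for this term
```

Because idf appears in both queryWeight and fieldWeight, it enters the term score squared, which is why rare terms such as "sake" (docFreq=10) contribute so much despite occurring only once.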
    
  2. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.15
    0.15036993 = sum of:
      0.15036993 = product of:
        0.6265414 = sum of:
          0.06407494 = weight(abstract_txt:linguistics in 3015) [ClassicSimilarity], result of:
            0.06407494 = score(doc=3015,freq=1.0), product of:
              0.1233873 = queryWeight, product of:
                1.0066475 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.018440187 = queryNorm
              0.5192993 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.04513665 = weight(abstract_txt:language in 3015) [ClassicSimilarity], result of:
            0.04513665 = score(doc=3015,freq=2.0), product of:
              0.09768575 = queryWeight, product of:
                1.2666972 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018440187 = queryNorm
              0.46205974 = fieldWeight in 3015, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.026269615 = weight(abstract_txt:study in 3015) [ClassicSimilarity], result of:
            0.026269615 = score(doc=3015,freq=1.0), product of:
              0.09820973 = queryWeight, product of:
                1.555536 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.018440187 = queryNorm
              0.26748484 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.061198615 = weight(abstract_txt:automatic in 3015) [ClassicSimilarity], result of:
            0.061198615 = score(doc=3015,freq=1.0), product of:
              0.15077038 = queryWeight, product of:
                1.5736755 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.018440187 = queryNorm
              0.40590608 = fieldWeight in 3015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.1499365 = weight(abstract_txt:text in 3015) [ClassicSimilarity], result of:
            0.1499365 = score(doc=3015,freq=3.0), product of:
              0.2740059 = queryWeight, product of:
                3.6744957 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018440187 = queryNorm
              0.54720175 = fieldWeight in 3015, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
          0.27992505 = weight(abstract_txt:corpus in 3015) [ClassicSimilarity], result of:
            0.27992505 = score(doc=3015,freq=2.0), product of:
              0.41544747 = queryWeight, product of:
                3.6942837 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.018440187 = queryNorm
              0.67379165 = fieldWeight in 3015, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.078125 = fieldNorm(doc=3015)
        0.24 = coord(6/25)
    
  3. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.11
    0.1142179 = sum of:
      0.1142179 = product of:
        0.5710895 = sum of:
          0.14089566 = weight(abstract_txt:lexicon in 5089) [ClassicSimilarity], result of:
            0.14089566 = score(doc=5089,freq=4.0), product of:
              0.16672142 = queryWeight, product of:
                1.1701401 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.018440187 = queryNorm
              0.84509635 = fieldWeight in 5089, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.11092096 = weight(abstract_txt:wording in 5089) [ClassicSimilarity], result of:
            0.11092096 = score(doc=5089,freq=1.0), product of:
              0.22564308 = queryWeight, product of:
                1.361298 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.018440187 = queryNorm
              0.49157703 = fieldWeight in 5089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.018388731 = weight(abstract_txt:study in 5089) [ClassicSimilarity], result of:
            0.018388731 = score(doc=5089,freq=1.0), product of:
              0.09820973 = queryWeight, product of:
                1.555536 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.018440187 = queryNorm
              0.1872394 = fieldWeight in 5089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.24028806 = weight(abstract_txt:word in 5089) [ClassicSimilarity], result of:
            0.24028806 = score(doc=5089,freq=6.0), product of:
              0.33001778 = queryWeight, product of:
                3.2926142 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.018440187 = queryNorm
              0.72810644 = fieldWeight in 5089, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.060596116 = weight(abstract_txt:text in 5089) [ClassicSimilarity], result of:
            0.060596116 = score(doc=5089,freq=1.0), product of:
              0.2740059 = queryWeight, product of:
                3.6744957 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018440187 = queryNorm
              0.22114895 = fieldWeight in 5089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
        0.2 = coord(5/25)
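
The fieldNorm value acts as a length-normalization factor: entries with fieldNorm 0.0546875 are damped more than entries with 0.078125. The sketch below, using only numbers from the listings, compares the fieldWeight of the term "word" in doc 5896 (freq=3.0) and doc 5089 (freq=6.0). The helper name is an assumption made for illustration.

```python
import math

def field_weight(freq, idf, field_norm):
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq) (hypothetical helper)."""
    return math.sqrt(freq) * idf * field_norm

IDF_WORD = 5.4353957  # idf(docFreq=523, maxDocs=44218), as printed in the listing

fw_5896 = field_weight(freq=3.0, idf=IDF_WORD, field_norm=0.078125)
fw_5089 = field_weight(freq=6.0, idf=IDF_WORD, field_norm=0.0546875)

# Listing values: 0.73549855 (doc 5896) and 0.72810644 (doc 5089).
print(f"doc 5896: {fw_5896:.6f}")
print(f"doc 5089: {fw_5089:.6f}")
```

Although doc 5089 contains the term twice as often, its smaller fieldNorm leaves its fieldWeight marginally below that of doc 5896.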
    
  4. Lund, K.; Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence (1996) 0.11
    0.1104339 = sum of:
      0.1104339 = product of:
        0.6902119 = sum of:
          0.04513665 = weight(abstract_txt:language in 1704) [ClassicSimilarity], result of:
            0.04513665 = score(doc=1704,freq=2.0), product of:
              0.09768575 = queryWeight, product of:
                1.2666972 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018440187 = queryNorm
              0.46205974 = fieldWeight in 1704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=1704)
          0.2427276 = weight(abstract_txt:word in 1704) [ClassicSimilarity], result of:
            0.2427276 = score(doc=1704,freq=3.0), product of:
              0.33001778 = queryWeight, product of:
                3.2926142 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.018440187 = queryNorm
              0.73549855 = fieldWeight in 1704, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.078125 = fieldNorm(doc=1704)
          0.12242264 = weight(abstract_txt:text in 1704) [ClassicSimilarity], result of:
            0.12242264 = score(doc=1704,freq=2.0), product of:
              0.2740059 = queryWeight, product of:
                3.6744957 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018440187 = queryNorm
              0.44678837 = fieldWeight in 1704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1704)
          0.27992505 = weight(abstract_txt:corpus in 1704) [ClassicSimilarity], result of:
            0.27992505 = score(doc=1704,freq=2.0), product of:
              0.41544747 = queryWeight, product of:
                3.6942837 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.018440187 = queryNorm
              0.67379165 = fieldWeight in 1704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.078125 = fieldNorm(doc=1704)
        0.16 = coord(4/25)
    
  5. Altinel, B.; Ganiz, M.C.: Semantic text classification : a survey of past and recent advances (2018) 0.10
    0.10406319 = sum of:
      0.10406319 = product of:
        0.52031595 = sum of:
          0.022341503 = weight(abstract_txt:language in 5051) [ClassicSimilarity], result of:
            0.022341503 = score(doc=5051,freq=1.0), product of:
              0.09768575 = queryWeight, product of:
                1.2666972 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.018440187 = queryNorm
              0.22870791 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5051)
          0.042839028 = weight(abstract_txt:automatic in 5051) [ClassicSimilarity], result of:
            0.042839028 = score(doc=5051,freq=1.0), product of:
              0.15077038 = queryWeight, product of:
                1.5736755 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.018440187 = queryNorm
              0.28413424 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5051)
          0.0980972 = weight(abstract_txt:word in 5051) [ClassicSimilarity], result of:
            0.0980972 = score(doc=5051,freq=1.0), product of:
              0.33001778 = queryWeight, product of:
                3.2926142 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.018440187 = queryNorm
              0.2972482 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5051)
          0.2184824 = weight(abstract_txt:text in 5051) [ClassicSimilarity], result of:
            0.2184824 = score(doc=5051,freq=13.0), product of:
              0.2740059 = queryWeight, product of:
                3.6744957 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.018440187 = queryNorm
              0.7973639 = fieldWeight in 5051, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5051)
          0.13855582 = weight(abstract_txt:corpus in 5051) [ClassicSimilarity], result of:
            0.13855582 = score(doc=5051,freq=1.0), product of:
              0.41544747 = queryWeight, product of:
                3.6942837 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.018440187 = queryNorm
              0.33350983 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5051)
        0.2 = coord(5/25)
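
Each document's final score in the listings above is the sum of its matching term scores multiplied by coord(k/25), where k is the number of query terms found in the document and 25 is the total number of query terms. The following sketch recombines the per-term scores printed above and compares the result with the totals shown. The dictionary and function names are illustrative only, and small floating-point differences from the 32-bit values in the listing are expected.

```python
QUERY_TERMS = 25  # denominator of coord(k/25) in the listings

# Per-term scores copied from the explain listings, keyed by doc id.
RESULTS = {
    5896: [0.09061565, 0.031916432, 0.15845852, 0.17543356, 0.2427276, 0.19793688],
    3015: [0.06407494, 0.04513665, 0.026269615, 0.061198615, 0.1499365, 0.27992505],
    5089: [0.14089566, 0.11092096, 0.018388731, 0.24028806, 0.060596116],
    1704: [0.04513665, 0.2427276, 0.12242264, 0.27992505],
    5051: [0.022341503, 0.042839028, 0.0980972, 0.2184824, 0.13855582],
}

# Totals as printed at the top of each listing.
LISTED = {
    5896: 0.21530128,
    3015: 0.15036993,
    5089: 0.1142179,
    1704: 0.1104339,
    5051: 0.10406319,
}

def total_score(term_scores, query_terms=QUERY_TERMS):
    """Sum of term scores scaled by the fraction of query terms matched."""
    coord = len(term_scores) / query_terms
    return sum(term_scores) * coord

for doc_id, term_scores in RESULTS.items():
    recomputed = total_score(term_scores)
    print(f"doc {doc_id}: recomputed {recomputed:.8f}, listed {LISTED[doc_id]:.8f}")
```

The coord factor explains why doc 1704, whose four matched terms sum to more than the five of doc 5089, still ranks just below it: 4/25 scales its sum down more than 5/25 does.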