Document (#2314)

Author
Renouf, A.
Title
Sticking to the text : a corpus linguist's view of language
Source
Aslib proceedings. 45(1993) no.5, pp.131-136
Year
1993
Abstract
Corpus linguistics is the study of large, computer-held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon, of word repetition and word positioning in automatic abstracting, and of word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from the hard copy.
Theme
Automatic indexing
Computational linguistics

Similar documents (content)

  1. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.22
    0.21535975 = sum of:
      0.21535975 = product of:
        0.8973323 = sum of:
          0.09057601 = weight(abstract_txt:linguistics in 896) [ClassicSimilarity], result of:
            0.09057601 = score(doc=896,freq=2.0), product of:
              0.12336691 = queryWeight, product of:
                1.0056758 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.01845998 = queryNorm
              0.7342002 = fieldWeight in 896, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
          0.031674992 = weight(abstract_txt:language in 896) [ClassicSimilarity], result of:
            0.031674992 = score(doc=896,freq=1.0), product of:
              0.09720477 = queryWeight, product of:
                1.2624594 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01845998 = queryNorm
              0.3258584 = fieldWeight in 896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
          0.15876108 = weight(abstract_txt:linguists in 896) [ClassicSimilarity], result of:
            0.15876108 = score(doc=896,freq=1.0), product of:
              0.22595881 = queryWeight, product of:
                1.3610475 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01845998 = queryNorm
              0.70261073 = fieldWeight in 896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
          0.1757596 = weight(abstract_txt:sake in 896) [ClassicSimilarity], result of:
            0.1757596 = score(doc=896,freq=1.0), product of:
              0.24181278 = queryWeight, product of:
                1.4079858 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.01845998 = queryNorm
              0.7268416 = fieldWeight in 896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
          0.24317819 = weight(abstract_txt:word in 896) [ClassicSimilarity], result of:
            0.24317819 = score(doc=896,freq=3.0), product of:
              0.33046785 = queryWeight, product of:
                3.2919502 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.01845998 = queryNorm
              0.73586035 = fieldWeight in 896, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
          0.19738245 = weight(abstract_txt:corpus in 896) [ClassicSimilarity], result of:
            0.19738245 = score(doc=896,freq=1.0), product of:
              0.41472375 = queryWeight, product of:
                3.6878064 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.01845998 = queryNorm
              0.47593716 = fieldWeight in 896, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=896)
        0.24 = coord(6/25)
    
  2. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.15
    0.14998831 = sum of:
      0.14998831 = product of:
        0.6249513 = sum of:
          0.06404691 = weight(abstract_txt:linguistics in 4015) [ClassicSimilarity], result of:
            0.06404691 = score(doc=4015,freq=1.0), product of:
              0.12336691 = queryWeight, product of:
                1.0056758 = boost
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.01845998 = queryNorm
              0.51915795 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6452217 = idf(docFreq=156, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.0447952 = weight(abstract_txt:language in 4015) [ClassicSimilarity], result of:
            0.0447952 = score(doc=4015,freq=2.0), product of:
              0.09720477 = queryWeight, product of:
                1.2624594 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01845998 = queryNorm
              0.46083337 = fieldWeight in 4015, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.026081868 = weight(abstract_txt:study in 4015) [ClassicSimilarity], result of:
            0.026081868 = score(doc=4015,freq=1.0), product of:
              0.0977536 = queryWeight, product of:
                1.5505496 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01845998 = queryNorm
              0.26681235 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.061224863 = weight(abstract_txt:automatic in 4015) [ClassicSimilarity], result of:
            0.061224863 = score(doc=4015,freq=1.0), product of:
              0.15083256 = queryWeight, product of:
                1.5726106 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01845998 = queryNorm
              0.40591276 = fieldWeight in 4015, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.14966153 = weight(abstract_txt:text in 4015) [ClassicSimilarity], result of:
            0.14966153 = score(doc=4015,freq=3.0), product of:
              0.2737054 = queryWeight, product of:
                3.6692386 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01845998 = queryNorm
              0.5467979 = fieldWeight in 4015, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
          0.27914095 = weight(abstract_txt:corpus in 4015) [ClassicSimilarity], result of:
            0.27914095 = score(doc=4015,freq=2.0), product of:
              0.41472375 = queryWeight, product of:
                3.6878064 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.01845998 = queryNorm
              0.6730768 = fieldWeight in 4015, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=4015)
        0.24 = coord(6/25)
    
  3. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.11
    0.11436184 = sum of:
      0.11436184 = product of:
        0.5718092 = sum of:
          0.14119993 = weight(abstract_txt:lexicon in 89) [ClassicSimilarity], result of:
            0.14119993 = score(doc=89,freq=4.0), product of:
              0.16698246 = queryWeight, product of:
                1.1700221 = boost
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.01845998 = queryNorm
              0.8455974 = fieldWeight in 89, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.11113276 = weight(abstract_txt:wording in 89) [ClassicSimilarity], result of:
            0.11113276 = score(doc=89,freq=1.0), product of:
              0.22595881 = queryWeight, product of:
                1.3610475 = boost
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.01845998 = queryNorm
              0.49182755 = fieldWeight in 89, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.993418 = idf(docFreq=14, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.018257309 = weight(abstract_txt:study in 89) [ClassicSimilarity], result of:
            0.018257309 = score(doc=89,freq=1.0), product of:
              0.0977536 = queryWeight, product of:
                1.5505496 = boost
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.01845998 = queryNorm
              0.18676865 = fieldWeight in 89, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.415198 = idf(docFreq=3968, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.24073413 = weight(abstract_txt:word in 89) [ClassicSimilarity], result of:
            0.24073413 = score(doc=89,freq=6.0), product of:
              0.33046785 = queryWeight, product of:
                3.2919502 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.01845998 = queryNorm
              0.7284646 = fieldWeight in 89, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
          0.06048499 = weight(abstract_txt:text in 89) [ClassicSimilarity], result of:
            0.06048499 = score(doc=89,freq=1.0), product of:
              0.2737054 = queryWeight, product of:
                3.6692386 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01845998 = queryNorm
              0.22098574 = fieldWeight in 89, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=89)
        0.2 = coord(5/25)
    
  4. Lund, K.; Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence (1996) 0.11
    0.11028999 = sum of:
      0.11028999 = product of:
        0.68931246 = sum of:
          0.0447952 = weight(abstract_txt:language in 2704) [ClassicSimilarity], result of:
            0.0447952 = score(doc=2704,freq=2.0), product of:
              0.09720477 = queryWeight, product of:
                1.2624594 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01845998 = queryNorm
              0.46083337 = fieldWeight in 2704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.078125 = fieldNorm(doc=2704)
          0.24317819 = weight(abstract_txt:word in 2704) [ClassicSimilarity], result of:
            0.24317819 = score(doc=2704,freq=3.0), product of:
              0.33046785 = queryWeight, product of:
                3.2919502 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.01845998 = queryNorm
              0.73586035 = fieldWeight in 2704, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.078125 = fieldNorm(doc=2704)
          0.12219813 = weight(abstract_txt:text in 2704) [ClassicSimilarity], result of:
            0.12219813 = score(doc=2704,freq=2.0), product of:
              0.2737054 = queryWeight, product of:
                3.6692386 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01845998 = queryNorm
              0.4464586 = fieldWeight in 2704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2704)
          0.27914095 = weight(abstract_txt:corpus in 2704) [ClassicSimilarity], result of:
            0.27914095 = score(doc=2704,freq=2.0), product of:
              0.41472375 = queryWeight, product of:
                3.6878064 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.01845998 = queryNorm
              0.6730768 = fieldWeight in 2704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=2704)
        0.16 = coord(4/25)
    
  5. Altinel, B.; Ganiz, M.C.: Semantic text classification : a survey of past and recent advances (2018) 0.10
    0.103911735 = sum of:
      0.103911735 = product of:
        0.51955867 = sum of:
          0.022172494 = weight(abstract_txt:language in 51) [ClassicSimilarity], result of:
            0.022172494 = score(doc=51,freq=1.0), product of:
              0.09720477 = queryWeight, product of:
                1.2624594 = boost
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.01845998 = queryNorm
              0.22810088 = fieldWeight in 51, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1709876 = idf(docFreq=1863, maxDocs=44421)
                0.0546875 = fieldNorm(doc=51)
          0.042857405 = weight(abstract_txt:automatic in 51) [ClassicSimilarity], result of:
            0.042857405 = score(doc=51,freq=1.0), product of:
              0.15083256 = queryWeight, product of:
                1.5726106 = boost
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.01845998 = queryNorm
              0.28413895 = fieldWeight in 51, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1956835 = idf(docFreq=668, maxDocs=44421)
                0.0546875 = fieldNorm(doc=51)
          0.0982793 = weight(abstract_txt:word in 51) [ClassicSimilarity], result of:
            0.0982793 = score(doc=51,freq=1.0), product of:
              0.33046785 = queryWeight, product of:
                3.2919502 = boost
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.01845998 = queryNorm
              0.29739442 = fieldWeight in 51, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4380693 = idf(docFreq=524, maxDocs=44421)
                0.0546875 = fieldNorm(doc=51)
          0.21808173 = weight(abstract_txt:text in 51) [ClassicSimilarity], result of:
            0.21808173 = score(doc=51,freq=13.0), product of:
              0.2737054 = queryWeight, product of:
                3.6692386 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01845998 = queryNorm
              0.7967754 = fieldWeight in 51, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=51)
          0.13816771 = weight(abstract_txt:corpus in 51) [ClassicSimilarity], result of:
            0.13816771 = score(doc=51,freq=1.0), product of:
              0.41472375 = queryWeight, product of:
                3.6878064 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.01845998 = queryNorm
              0.33315602 = fieldWeight in 51, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.0546875 = fieldNorm(doc=51)
        0.2 = coord(5/25)
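
The score breakdowns above can be reproduced by hand. The following is a minimal sketch, assuming the relationships visible in the listing itself: tf is the square root of termFreq, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is boost * idf * queryNorm, fieldWeight is tf * idf * fieldNorm, and the document score is coord times the sum of the term scores. The function names (idf, term_score, document_score) are hypothetical helpers for illustration, not part of any library. The search engine works in single precision, so the results agree with the listing only to about six significant digits.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the listing:
    # idf(docFreq=156, maxDocs=44421) is roughly 6.6452217.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # Score of one query term in one document (hypothetical helper).
    tf = math.sqrt(freq)                          # tf(freq=2.0) = 1.4142135
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm  # "queryWeight, product of"
    field_weight = tf * term_idf * field_norm     # "fieldWeight in ..., product of"
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    # coord(6/25) means 6 of the 25 query terms matched the document.
    coord = len(term_scores) / total_query_terms
    return coord * sum(term_scores)


# Values copied from the explanation for document 896 (result 1).
linguistics_896 = term_score(
    freq=2.0, doc_freq=156, max_docs=44421,
    boost=1.0056758, query_norm=0.01845998, field_norm=0.078125,
)

# The six per-term scores listed for document 896.
scores_896 = [0.09057601, 0.031674992, 0.15876108,
              0.1757596, 0.24317819, 0.19738245]
total_896 = document_score(scores_896, total_query_terms=25)

print(round(linguistics_896, 6))
print(round(total_896, 6))
```

The same two functions apply to the other four results by substituting their termFreq, docFreq, boost and fieldNorm values. Rare terms such as "sake" (docFreq=10) contribute strongly despite a low boost because idf enters the product twice, once in queryWeight and once in fieldWeight.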