Document (#16032)

Author
Losee, R.M.
Title
Text windows and phrases differing by discipline, location in document, and syntactic structure
Source
Information processing and management. 32(1996) no.6, pp. 747-767
Year
1996
Abstract
Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or to place a document on a theory vs. practice spectrum. Examines characteristics of phrases and text windows, including their number, location in documents, and grammatical construction, and studies how these window characteristics vary across disciplines. Examines some of the linguistic regularities for individual disciplines, and suggests families of regularities that may prove helpful for the automatic classification of documents, as well as for information retrieval and filtering applications.
Theme
Automatic classification

Similar documents (author)

  1. Losee, R.M.: ¬A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 2334) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 2334, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=2334)
    
  2. Losee, R.M.: ¬The relative shelf location of circulated books : a study of classification, users, and browsing (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4484) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4484)
    
  3. Losee, R.M.: Seven fundamental questions for the science of library classification (1993) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 4507) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 4507, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=4507)
    
  4. Losee, R.M.: Term dependence : truncating the Bahadur Lazarsfeld expansion (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7389) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7389)
    
  5. Losee, R.M.: Upper bounds for retrieval performance and their use measuring performance and generating optimal queries : can it get any better than this? (1994) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:losee in 7417) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 7417, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=7417)
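
The author scores above can be checked by hand. The following Python sketch assumes the classic Lucene TF-IDF definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); this form is inferred from the printed values rather than confirmed for this catalog. The function names are illustrative and are not part of any Lucene API.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the author_txt:losee explanations in the listing.
author_score = field_weight(freq=1.0, doc_freq=29, max_docs=44421, field_norm=0.625)
print(author_score)
```

Under these assumptions the result agrees with the listed 5.187669 up to floating-point rounding. All five author matches share the same freq, docFreq, and fieldNorm, which is why they receive identical scores.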
    

Similar documents (content)

  1. Losee, R.M.: Learning syntactic rules and tags with genetic algorithms for information retrieval and filtering : an empirical basis for grammatical rules (1996) 0.17
    0.1731218 = sum of:
      0.1731218 = product of:
        0.72134084 = sum of:
          0.061581757 = weight(abstract_txt:syntactic in 4136) [ClassicSimilarity], result of:
            0.061581757 = score(doc=4136,freq=1.0), product of:
              0.12069338 = queryWeight, product of:
                1.0017332 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.01844815 = queryNorm
              0.5102331 = fieldWeight in 4136, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
          0.0873179 = weight(abstract_txt:filtering in 4136) [ClassicSimilarity], result of:
            0.0873179 = score(doc=4136,freq=2.0), product of:
              0.12090407 = queryWeight, product of:
                1.0026071 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.01844815 = queryNorm
              0.7222081 = fieldWeight in 4136, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
          0.01672931 = weight(abstract_txt:used in 4136) [ClassicSimilarity], result of:
            0.01672931 = score(doc=4136,freq=1.0), product of:
              0.063783854 = queryWeight, product of:
                1.029866 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01844815 = queryNorm
              0.26228127 = fieldWeight in 4136, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
          0.04951 = weight(abstract_txt:document in 4136) [ClassicSimilarity], result of:
            0.04951 = score(doc=4136,freq=2.0), product of:
              0.10435438 = queryWeight, product of:
                1.3172878 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01844815 = queryNorm
              0.47444102 = fieldWeight in 4136, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
          0.051323958 = weight(abstract_txt:characteristics in 4136) [ClassicSimilarity], result of:
            0.051323958 = score(doc=4136,freq=1.0), product of:
              0.1346704 = queryWeight, product of:
                1.4964472 = boost
                4.8781815 = idf(docFreq=918, maxDocs=44421)
                0.01844815 = queryNorm
              0.38110793 = fieldWeight in 4136, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8781815 = idf(docFreq=918, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
          0.4548779 = weight(abstract_txt:grammatical in 4136) [ClassicSimilarity], result of:
            0.4548779 = score(doc=4136,freq=4.0), product of:
              0.3633306 = queryWeight, product of:
                2.4579685 = boost
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.01844815 = queryNorm
              1.251967 = fieldWeight in 4136, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.0125885 = idf(docFreq=39, maxDocs=44421)
                0.078125 = fieldNorm(doc=4136)
        0.24 = coord(6/25)
    
  2. Haas, S.W.; Losee, R.M.: Looking in text windows : their size and composition (1994) 0.14
    0.13878027 = sum of:
      0.13878027 = product of:
        0.5782511 = sum of:
          0.061262667 = weight(abstract_txt:style in 139) [ClassicSimilarity], result of:
            0.061262667 = score(doc=139,freq=1.0), product of:
              0.1202761 = queryWeight, product of:
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.01844815 = queryNorm
              0.5093503 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
          0.01672931 = weight(abstract_txt:used in 139) [ClassicSimilarity], result of:
            0.01672931 = score(doc=139,freq=1.0), product of:
              0.063783854 = queryWeight, product of:
                1.029866 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01844815 = queryNorm
              0.26228127 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
          0.065231875 = weight(abstract_txt:text in 139) [ClassicSimilarity], result of:
            0.065231875 = score(doc=139,freq=5.0), product of:
              0.0924078 = queryWeight, product of:
                1.2395945 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01844815 = queryNorm
              0.70591307 = fieldWeight in 139, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
          0.036594883 = weight(abstract_txt:structure in 139) [ClassicSimilarity], result of:
            0.036594883 = score(doc=139,freq=1.0), product of:
              0.107482806 = queryWeight, product of:
                1.3368874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01844815 = queryNorm
              0.34047198 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
          0.19560389 = weight(abstract_txt:windows in 139) [ClassicSimilarity], result of:
            0.19560389 = score(doc=139,freq=3.0), product of:
              0.22782601 = queryWeight, product of:
                1.946377 = boost
                6.3448815 = idf(docFreq=211, maxDocs=44421)
                0.01844815 = queryNorm
              0.858567 = fieldWeight in 139, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3448815 = idf(docFreq=211, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
          0.20282853 = weight(abstract_txt:window in 139) [ClassicSimilarity], result of:
            0.20282853 = score(doc=139,freq=1.0), product of:
              0.33662376 = queryWeight, product of:
                2.3659072 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.01844815 = queryNorm
              0.60253775 = fieldWeight in 139, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.078125 = fieldNorm(doc=139)
        0.24 = coord(6/25)
    
  3. Zhu, J.; Song, D.; Rüger, S.: Integrating multiple windows and document features for expert finding (2009) 0.12
    0.12476379 = sum of:
      0.12476379 = product of:
        0.5198491 = sum of:
          0.013383448 = weight(abstract_txt:used in 3755) [ClassicSimilarity], result of:
            0.013383448 = score(doc=3755,freq=1.0), product of:
              0.063783854 = queryWeight, product of:
                1.029866 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01844815 = queryNorm
              0.20982501 = fieldWeight in 3755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
          0.06860307 = weight(abstract_txt:document in 3755) [ClassicSimilarity], result of:
            0.06860307 = score(doc=3755,freq=6.0), product of:
              0.10435438 = queryWeight, product of:
                1.3172878 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01844815 = queryNorm
              0.6574048 = fieldWeight in 3755, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
          0.029275907 = weight(abstract_txt:structure in 3755) [ClassicSimilarity], result of:
            0.029275907 = score(doc=3755,freq=1.0), product of:
              0.107482806 = queryWeight, product of:
                1.3368874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01844815 = queryNorm
              0.27237758 = fieldWeight in 3755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
          0.037193697 = weight(abstract_txt:documents in 3755) [ClassicSimilarity], result of:
            0.037193697 = score(doc=3755,freq=1.0), product of:
              0.14432517 = queryWeight, product of:
                1.8973261 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01844815 = queryNorm
              0.25770763 = fieldWeight in 3755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
          0.09034557 = weight(abstract_txt:windows in 3755) [ClassicSimilarity], result of:
            0.09034557 = score(doc=3755,freq=1.0), product of:
              0.22782601 = queryWeight, product of:
                1.946377 = boost
                6.3448815 = idf(docFreq=211, maxDocs=44421)
                0.01844815 = queryNorm
              0.3965551 = fieldWeight in 3755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3448815 = idf(docFreq=211, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
          0.28104743 = weight(abstract_txt:window in 3755) [ClassicSimilarity], result of:
            0.28104743 = score(doc=3755,freq=3.0), product of:
              0.33662376 = queryWeight, product of:
                2.3659072 = boost
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.01844815 = queryNorm
              0.8349008 = fieldWeight in 3755, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.7124834 = idf(docFreq=53, maxDocs=44421)
                0.0625 = fieldNorm(doc=3755)
        0.24 = coord(6/25)
    
  4. Losee, R.: ¬A performance model of the length and number of subject headings and index phrases (2004) 0.12
    0.12446551 = sum of:
      0.12446551 = product of:
        0.5186063 = sum of:
          0.023658818 = weight(abstract_txt:used in 4725) [ClassicSimilarity], result of:
            0.023658818 = score(doc=4725,freq=2.0), product of:
              0.063783854 = queryWeight, product of:
                1.029866 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01844815 = queryNorm
              0.37092173 = fieldWeight in 4725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
          0.041256256 = weight(abstract_txt:text in 4725) [ClassicSimilarity], result of:
            0.041256256 = score(doc=4725,freq=2.0), product of:
              0.0924078 = queryWeight, product of:
                1.2395945 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01844815 = queryNorm
              0.4464586 = fieldWeight in 4725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
          0.04951 = weight(abstract_txt:document in 4725) [ClassicSimilarity], result of:
            0.04951 = score(doc=4725,freq=2.0), product of:
              0.10435438 = queryWeight, product of:
                1.3172878 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01844815 = queryNorm
              0.47444102 = fieldWeight in 4725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
          0.051323958 = weight(abstract_txt:characteristics in 4725) [ClassicSimilarity], result of:
            0.051323958 = score(doc=4725,freq=1.0), product of:
              0.1346704 = queryWeight, product of:
                1.4964472 = boost
                4.8781815 = idf(docFreq=918, maxDocs=44421)
                0.01844815 = queryNorm
              0.38110793 = fieldWeight in 4725, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8781815 = idf(docFreq=918, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
          0.06574979 = weight(abstract_txt:documents in 4725) [ClassicSimilarity], result of:
            0.06574979 = score(doc=4725,freq=2.0), product of:
              0.14432517 = queryWeight, product of:
                1.8973261 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01844815 = queryNorm
              0.455567 = fieldWeight in 4725, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
          0.28710747 = weight(abstract_txt:phrases in 4725) [ClassicSimilarity], result of:
            0.28710747 = score(doc=4725,freq=4.0), product of:
              0.26734275 = queryWeight, product of:
                2.1084316 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.01844815 = queryNorm
              1.0739303 = fieldWeight in 4725, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.078125 = fieldNorm(doc=4725)
        0.24 = coord(6/25)
    
  5. Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.10
    0.10457635 = sum of:
      0.10457635 = product of:
        0.4357348 = sum of:
          0.09853081 = weight(abstract_txt:syntactic in 2845) [ClassicSimilarity], result of:
            0.09853081 = score(doc=2845,freq=4.0), product of:
              0.12069338 = queryWeight, product of:
                1.0017332 = boost
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.01844815 = queryNorm
              0.81637293 = fieldWeight in 2845, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5309834 = idf(docFreq=175, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.013383448 = weight(abstract_txt:used in 2845) [ClassicSimilarity], result of:
            0.013383448 = score(doc=2845,freq=1.0), product of:
              0.063783854 = queryWeight, product of:
                1.029866 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.01844815 = queryNorm
              0.20982501 = fieldWeight in 2845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.033005007 = weight(abstract_txt:text in 2845) [ClassicSimilarity], result of:
            0.033005007 = score(doc=2845,freq=2.0), product of:
              0.0924078 = queryWeight, product of:
                1.2395945 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.01844815 = queryNorm
              0.3571669 = fieldWeight in 2845, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.06262574 = weight(abstract_txt:document in 2845) [ClassicSimilarity], result of:
            0.06262574 = score(doc=2845,freq=5.0), product of:
              0.10435438 = queryWeight, product of:
                1.3172878 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.01844815 = queryNorm
              0.6001257 = fieldWeight in 2845, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.029275907 = weight(abstract_txt:structure in 2845) [ClassicSimilarity], result of:
            0.029275907 = score(doc=2845,freq=1.0), product of:
              0.107482806 = queryWeight, product of:
                1.3368874 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.01844815 = queryNorm
              0.27237758 = fieldWeight in 2845, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
          0.19891389 = weight(abstract_txt:phrases in 2845) [ClassicSimilarity], result of:
            0.19891389 = score(doc=2845,freq=3.0), product of:
              0.26734275 = queryWeight, product of:
                2.1084316 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.01844815 = queryNorm
              0.7440407 = fieldWeight in 2845, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0625 = fieldNorm(doc=2845)
        0.24 = coord(6/25)
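
The content scores combine per-term scores with a coordination factor. The sketch below recomputes entry 5 (doc 2845) from the numbers in the listing, assuming each term score is queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the document score is the sum of term scores multiplied by coord. The helper names are hypothetical and only mirror the labels in the explanation tree.

```python
import math

QUERY_NORM = 0.01844815  # queryNorm as printed in the listing

def term_score(freq, idf, boost, field_norm, query_norm=QUERY_NORM):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, total):
    # coord rewards documents that match more of the query terms.
    coord = matched / total
    return sum(term_scores) * coord

# "phrases" in doc 2845: freq=3, idf=6.8731537, boost=2.1084316, fieldNorm=0.0625
phrases = term_score(freq=3.0, idf=6.8731537, boost=2.1084316, field_norm=0.0625)

# The six per-term scores listed for doc 2845
# (syntactic, used, text, document, structure, phrases).
scores_2845 = [0.09853081, 0.013383448, 0.033005007,
               0.06262574, 0.029275907, 0.19891389]
total_2845 = document_score(scores_2845, matched=6, total=25)

print(phrases, total_2845)
```

With these inputs the "phrases" term score matches the listed 0.19891389 and the document score matches the listed 0.10457635, both up to floating-point rounding. Every entry in the list has coord(6/25), so the ranking is decided entirely by the sum of the term scores.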