Document (#21770)

Author
Humphrey, S.M.
Title
Automatic indexing of documents from journal descriptors : a preliminary investigation
Source
Journal of the American Society for Information Science. 50(1999) no.8, pp.661-674
Year
1999
Abstract
A new, fully automated approach for indexing documents is presented, based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing takes the form of a consistent, timely set of journal descriptors (JDs) that index the individual journals themselves, and it is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather on the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals, for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and could therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results.
Theme
Automatic indexing
Field
Medicine
Object
Medline
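
The idea summarized in the abstract, associating textwords in training citations with the descriptors of the journals they appeared in, can be illustrated with a minimal sketch. This is not the author's implementation: the journals, descriptors, citations, and the scoring rule (share of a word's training citations that fall under each JD, averaged over the words of a new document) are all assumptions chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict

# Hypothetical journal-level indexing: journal -> journal descriptors (JDs).
JOURNAL_JDS = {
    "J Cardiol": ["Cardiology"],
    "Neuro Rep": ["Neurology"],
}

# Hypothetical training citations: (journal, text words of the citation).
TRAINING = [
    ("J Cardiol", "heart valve surgery"),
    ("J Cardiol", "heart failure therapy"),
    ("Neuro Rep", "brain surgery outcomes"),
]


def train(citations, journal_jds):
    """For each word, compute the share of its citations that fall under each JD."""
    word_docs = Counter()                 # citations containing the word
    word_jd_docs = defaultdict(Counter)   # citations containing the word, per JD
    for journal, text in citations:
        for word in set(text.lower().split()):
            word_docs[word] += 1
            for jd in journal_jds[journal]:
                word_jd_docs[word][jd] += 1
    return {
        word: {jd: n / word_docs[word] for jd, n in jds.items()}
        for word, jds in word_jd_docs.items()
    }


def rank_jds(text, word_jd_scores):
    """Average the word-level JD scores over the known words of a new document."""
    words = [w for w in text.lower().split() if w in word_jd_scores]
    totals = Counter()
    for word in words:
        for jd, score in word_jd_scores[word].items():
            totals[jd] += score
    ranked = ((jd, total / len(words)) for jd, total in totals.items())
    return sorted(ranked, key=lambda pair: -pair[1])


model = train(TRAINING, JOURNAL_JDS)
print(rank_jds("heart surgery", model))
```

Note that no document-level indexing is used anywhere in the sketch: the only human input is the journal-to-descriptor table, which is the point the abstract makes about the training set.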

Similar documents (author)

  1. Humphrey, S.M.: Use and management of classification systems for knowledge-based indexing (1992) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:humphrey in 2094) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 2094, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=2094)
    
  2. Humphrey, S.M.: Indexing biomedical documents : from thesaural to knowledge-based retrieval systems (1992) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:humphrey in 7641) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 7641, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=7641)
    
  3. Humphrey, S.M.: The MedIndEx prototype for computer assisted MEDLINE database indexing (1993) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:humphrey in 7819) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 7819, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=7819)
    
  4. Humphrey, S.M.: Knowledge-based systems for indexing (1994) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:humphrey in 2987) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 2987, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=2987)
    
  5. Humphrey, J.: Manuscripts and metadata : Descriptive metadata in three manuscript catalogs: DigCIM, MALVINE, & Digital Scriptorium (2007) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:humphrey in 783) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 783, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=783)
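
The author-match scores above all reduce to one product, tf × idf × fieldNorm. The sketch below recomputes it, assuming the ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any library API.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as listed in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for author_txt:humphrey.
idf_value = classic_idf(doc_freq=10, max_docs=44218)
score = field_weight(freq=1.0, doc_freq=10, max_docs=44218, field_norm=0.625)
print(f"idf={idf_value:.6f} score={score:.5f}")
```

Because all five hits share the same freq, docFreq, and fieldNorm, they receive the same score of about 5.81; the ranking among them carries no information beyond the author surname match.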
    

Similar documents (content)

  1. Ferber, R.: Automated indexing with thesaurus descriptors : a co-occurrence based approach to multilingual retrieval (1997) 0.18
    0.18237129 = sum of:
      0.18237129 = product of:
        0.75988036 = sum of:
          0.011974213 = weight(abstract_txt:this in 4144) [ClassicSimilarity], result of:
            0.011974213 = score(doc=4144,freq=3.0), product of:
              0.045840133 = queryWeight, product of:
                1.0890656 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01744341 = queryNorm
              0.2612168 = fieldWeight in 4144, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
          0.017233886 = weight(abstract_txt:approach in 4144) [ClassicSimilarity], result of:
            0.017233886 = score(doc=4144,freq=1.0), product of:
              0.07362297 = queryWeight, product of:
                1.1269175 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01744341 = queryNorm
              0.234083 = fieldWeight in 4144, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
          0.23735544 = weight(abstract_txt:descriptors in 4144) [ClassicSimilarity], result of:
            0.23735544 = score(doc=4144,freq=6.0), product of:
              0.23279496 = queryWeight, product of:
                2.0038824 = boost
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.01744341 = queryNorm
              1.0195901 = fieldWeight in 4144, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
          0.1239526 = weight(abstract_txt:training in 4144) [ClassicSimilarity], result of:
            0.1239526 = score(doc=4144,freq=2.0), product of:
              0.27432263 = queryWeight, product of:
                3.0763183 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.01744341 = queryNorm
              0.4518497 = fieldWeight in 4144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
          0.099430084 = weight(abstract_txt:documents in 4144) [ClassicSimilarity], result of:
            0.099430084 = score(doc=4144,freq=3.0), product of:
              0.22286542 = queryWeight, product of:
                3.100108 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.01744341 = queryNorm
              0.44614407 = fieldWeight in 4144, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
          0.26993415 = weight(abstract_txt:indexing in 4144) [ClassicSimilarity], result of:
            0.26993415 = score(doc=4144,freq=4.0), product of:
              0.4964777 = queryWeight, product of:
                6.543654 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.01744341 = queryNorm
              0.54369843 = fieldWeight in 4144, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=4144)
        0.24 = coord(6/25)
    
  2. Humphrey, S.M.; Névéol, A.; Browne, A.; Gobeil, J.; Ruch, P.; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty (2009) 0.18
    0.17948708 = sum of:
      0.17948708 = product of:
        0.7478629 = sum of:
          0.009776904 = weight(abstract_txt:this in 3300) [ClassicSimilarity], result of:
            0.009776904 = score(doc=3300,freq=2.0), product of:
              0.045840133 = queryWeight, product of:
                1.0890656 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01744341 = queryNorm
              0.21328263 = fieldWeight in 3300, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
          0.09689995 = weight(abstract_txt:descriptors in 3300) [ClassicSimilarity], result of:
            0.09689995 = score(doc=3300,freq=1.0), product of:
              0.23279496 = queryWeight, product of:
                2.0038824 = boost
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.01744341 = queryNorm
              0.4162459 = fieldWeight in 3300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
          0.06649193 = weight(abstract_txt:journals in 3300) [ClassicSimilarity], result of:
            0.06649193 = score(doc=3300,freq=1.0), product of:
              0.20731668 = queryWeight, product of:
                2.3160515 = boost
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.01744341 = queryNorm
              0.3207264 = fieldWeight in 3300, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
          0.11481198 = weight(abstract_txt:documents in 3300) [ClassicSimilarity], result of:
            0.11481198 = score(doc=3300,freq=4.0), product of:
              0.22286542 = queryWeight, product of:
                3.100108 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.01744341 = queryNorm
              0.5151628 = fieldWeight in 3300, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
          0.18994796 = weight(abstract_txt:journal in 3300) [ClassicSimilarity], result of:
            0.18994796 = score(doc=3300,freq=2.0), product of:
              0.41739258 = queryWeight, product of:
                4.6474895 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.01744341 = queryNorm
              0.45508227 = fieldWeight in 3300, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
          0.26993415 = weight(abstract_txt:indexing in 3300) [ClassicSimilarity], result of:
            0.26993415 = score(doc=3300,freq=4.0), product of:
              0.4964777 = queryWeight, product of:
                6.543654 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.01744341 = queryNorm
              0.54369843 = fieldWeight in 3300, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=3300)
        0.24 = coord(6/25)
    
  3. Harter, S.P.; Nisonger, T.E.; Weng, A.: Semantic relationships between cited and citing articles in library and information science journals (1993) 0.17
    0.16643502 = sum of:
      0.16643502 = product of:
        0.69347924 = sum of:
          0.006913315 = weight(abstract_txt:this in 5644) [ClassicSimilarity], result of:
            0.006913315 = score(doc=5644,freq=1.0), product of:
              0.045840133 = queryWeight, product of:
                1.0890656 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01744341 = queryNorm
              0.1508136 = fieldWeight in 5644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
          0.09689995 = weight(abstract_txt:descriptors in 5644) [ClassicSimilarity], result of:
            0.09689995 = score(doc=5644,freq=1.0), product of:
              0.23279496 = queryWeight, product of:
                2.0038824 = boost
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.01744341 = queryNorm
              0.4162459 = fieldWeight in 5644, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
          0.09403379 = weight(abstract_txt:journals in 5644) [ClassicSimilarity], result of:
            0.09403379 = score(doc=5644,freq=2.0), product of:
              0.20731668 = queryWeight, product of:
                2.3160515 = boost
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.01744341 = queryNorm
              0.4535756 = fieldWeight in 5644, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
          0.11481198 = weight(abstract_txt:documents in 5644) [ClassicSimilarity], result of:
            0.11481198 = score(doc=5644,freq=4.0), product of:
              0.22286542 = queryWeight, product of:
                3.100108 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.01744341 = queryNorm
              0.5151628 = fieldWeight in 5644, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
          0.18994796 = weight(abstract_txt:journal in 5644) [ClassicSimilarity], result of:
            0.18994796 = score(doc=5644,freq=2.0), product of:
              0.41739258 = queryWeight, product of:
                4.6474895 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.01744341 = queryNorm
              0.45508227 = fieldWeight in 5644, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
          0.19087227 = weight(abstract_txt:indexing in 5644) [ClassicSimilarity], result of:
            0.19087227 = score(doc=5644,freq=2.0), product of:
              0.4964777 = queryWeight, product of:
                6.543654 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.01744341 = queryNorm
              0.38445285 = fieldWeight in 5644, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=5644)
        0.24 = coord(6/25)
    
  4. Lu, K.; Mao, J.: An automatic approach to weighted subject indexing : an empirical study in the biomedical domain (2015) 0.15
    0.14707147 = sum of:
      0.14707147 = product of:
        0.73535734 = sum of:
          0.009776904 = weight(abstract_txt:this in 4005) [ClassicSimilarity], result of:
            0.009776904 = score(doc=4005,freq=2.0), product of:
              0.045840133 = queryWeight, product of:
                1.0890656 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01744341 = queryNorm
              0.21328263 = fieldWeight in 4005, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4005)
          0.062309798 = weight(abstract_txt:feasible in 4005) [ClassicSimilarity], result of:
            0.062309798 = score(doc=4005,freq=1.0), product of:
              0.13765292 = queryWeight, product of:
                1.0895902 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.01744341 = queryNorm
              0.45265874 = fieldWeight in 4005, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=4005)
          0.13703722 = weight(abstract_txt:descriptors in 4005) [ClassicSimilarity], result of:
            0.13703722 = score(doc=4005,freq=2.0), product of:
              0.23279496 = queryWeight, product of:
                2.0038824 = boost
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.01744341 = queryNorm
              0.5886606 = fieldWeight in 4005, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6599345 = idf(docFreq=153, maxDocs=44218)
                0.0625 = fieldNorm(doc=4005)
          0.099430084 = weight(abstract_txt:documents in 4005) [ClassicSimilarity], result of:
            0.099430084 = score(doc=4005,freq=3.0), product of:
              0.22286542 = queryWeight, product of:
                3.100108 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.01744341 = queryNorm
              0.44614407 = fieldWeight in 4005, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=4005)
          0.42680335 = weight(abstract_txt:indexing in 4005) [ClassicSimilarity], result of:
            0.42680335 = score(doc=4005,freq=10.0), product of:
              0.4964777 = queryWeight, product of:
                6.543654 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.01744341 = queryNorm
              0.8596627 = fieldWeight in 4005, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=4005)
        0.2 = coord(5/25)
    
  5. Nebelong-Bonnevie, E.; Frandsen, T.F.: Journal citation identity and journal citation image : a portrait of the Journal of Documentation (2006) 0.13
    0.12874524 = sum of:
      0.12874524 = product of:
        0.64372617 = sum of:
          0.009776904 = weight(abstract_txt:this in 5586) [ClassicSimilarity], result of:
            0.009776904 = score(doc=5586,freq=2.0), product of:
              0.045840133 = queryWeight, product of:
                1.0890656 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01744341 = queryNorm
              0.21328263 = fieldWeight in 5586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5586)
          0.017233886 = weight(abstract_txt:approach in 5586) [ClassicSimilarity], result of:
            0.017233886 = score(doc=5586,freq=1.0), product of:
              0.07362297 = queryWeight, product of:
                1.1269175 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.01744341 = queryNorm
              0.234083 = fieldWeight in 5586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=5586)
          0.09403379 = weight(abstract_txt:journals in 5586) [ClassicSimilarity], result of:
            0.09403379 = score(doc=5586,freq=2.0), product of:
              0.20731668 = queryWeight, product of:
                2.3160515 = boost
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.01744341 = queryNorm
              0.4535756 = fieldWeight in 5586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1316223 = idf(docFreq=709, maxDocs=44218)
                0.0625 = fieldNorm(doc=5586)
          0.05740599 = weight(abstract_txt:documents in 5586) [ClassicSimilarity], result of:
            0.05740599 = score(doc=5586,freq=1.0), product of:
              0.22286542 = queryWeight, product of:
                3.100108 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.01744341 = queryNorm
              0.2575814 = fieldWeight in 5586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=5586)
          0.46527562 = weight(abstract_txt:journal in 5586) [ClassicSimilarity], result of:
            0.46527562 = score(doc=5586,freq=12.0), product of:
              0.41739258 = queryWeight, product of:
                4.6474895 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.01744341 = queryNorm
              1.1147194 = fieldWeight in 5586, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=5586)
        0.2 = coord(5/25)
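
The content-based scores combine per-term products with a coordination factor. As a minimal sketch, the code below recomputes the last entry (doc 5586) from the boost, docFreq, and termFreq values in its explain tree, again assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The constant and function names are illustrative; the figure of 25 query terms is taken from the coord(5/25) line.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.01744341
FIELD_NORM = 0.0625
QUERY_TERM_COUNT = 25  # denominator of coord(5/25)

# (term, boost, docFreq, termFreq) copied from the explain tree for doc 5586.
MATCHED_TERMS = [
    ("this", 1.0890656, 10762, 2.0),
    ("approach", 1.1269175, 2839, 1.0),
    ("journals", 2.3160515, 709, 2.0),
    ("documents", 3.100108, 1949, 1.0),
    ("journal", 4.6474895, 697, 12.0),
]


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, term_freq: float) -> float:
    """score = queryWeight * fieldWeight for one matched term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


raw_sum = sum(term_score(b, df, tf) for _, b, df, tf in MATCHED_TERMS)
coord = len(MATCHED_TERMS) / QUERY_TERM_COUNT  # share of query terms matched
final_score = raw_sum * coord
print(f"sum={raw_sum:.5f} coord={coord:.2f} score={final_score:.5f}")
```

The coord factor explains why entries 4 and 5 rank below the first three despite comparable raw sums: they match 5 of the 25 query terms rather than 6, so their sums are scaled by 0.2 instead of 0.24.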