Document (#21770)

Author
Humphrey, S.M.
Title
Automatic indexing of documents from journal descriptors : a preliminary investigation
Source
Journal of the American Society for Information Science. 50(1999) no.8, pp. 661-674
Year
1999
Abstract
A new, fully automated approach for indexing documents is presented, based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather on the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and could therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results.
Theme
Automatic indexing
Field
Medicine
Object
Medline
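
The abstract describes associating textwords in training citations with the journal descriptors (JDs) of the journals those citations appeared in, then using the associations to categorize new documents. The following is a minimal sketch of that general idea, not the paper's actual formula: the function names, the journal names, the toy data, and the simple averaging of word-to-JD proportions are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_word_jd_table(citations, journal_jds):
    """Build word -> {JD: share of that word's training citations carrying the JD}.

    citations:   list of (journal, [words]) pairs (hypothetical training set)
    journal_jds: journal -> list of journal descriptors (hypothetical)
    """
    word_count = Counter()
    word_jd_count = defaultdict(Counter)
    for journal, words in citations:
        for word in set(words):              # count each word once per citation
            word_count[word] += 1
            for jd in journal_jds[journal]:  # the citation inherits its journal's JDs
                word_jd_count[word][jd] += 1
    return {
        word: {jd: n / word_count[word] for jd, n in jds.items()}
        for word, jds in word_jd_count.items()
    }

def rank_jds(words, table):
    """Average the per-word JD shares over the words seen in training."""
    known = [w for w in words if w in table]
    scores = Counter()
    for word in known:
        for jd, share in table[word].items():
            scores[jd] += share / len(known)
    return scores.most_common()

# Toy data: two invented journals, one JD each.
journal_jds = {"J Cardiol": ["Cardiology"], "J Neuro": ["Neurology"]}
citations = [
    ("J Cardiol", ["heart", "valve", "patient"]),
    ("J Cardiol", ["heart", "artery"]),
    ("J Neuro", ["brain", "patient"]),
]
table = train_word_jd_table(citations, journal_jds)
ranking = rank_jds(["heart", "patient", "unseen"], table)
print(ranking)
```

Note that no document-level indexing enters the training step; only the journal-level descriptors are used, which is the property the abstract emphasizes.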

Similar documents (author)

  1. Humphrey, S.M.: Use and management of classification systems for knowledge-based indexing (1992) 5.81
    5.814733 = sum of:
      5.814733 = weight(author_txt:humphrey in 2093) [ClassicSimilarity], result of:
        5.814733 = fieldWeight in 2093, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.625 = fieldNorm(doc=2093)
    
  2. Humphrey, S.M.: Indexing biomedical documents : from thesaural to knowledge-based retrieval systems (1992) 5.81
    5.814733 = sum of:
      5.814733 = weight(author_txt:humphrey in 7640) [ClassicSimilarity], result of:
        5.814733 = fieldWeight in 7640, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.625 = fieldNorm(doc=7640)
    
  3. Humphrey, S.M.: The MedIndEx prototype for computer assisted MEDLINE database indexing (1993) 5.81
    5.814733 = sum of:
      5.814733 = weight(author_txt:humphrey in 7818) [ClassicSimilarity], result of:
        5.814733 = fieldWeight in 7818, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.625 = fieldNorm(doc=7818)
    
  4. Humphrey, S.M.: Knowledge-based systems for indexing (1994) 5.81
    5.814733 = sum of:
      5.814733 = weight(author_txt:humphrey in 3055) [ClassicSimilarity], result of:
        5.814733 = fieldWeight in 3055, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.625 = fieldNorm(doc=3055)
    
  5. Humphrey, J.: Manuscripts and metadata : Descriptive metadata in three manuscript catalogs: DigCIM, MALVINE, & Digital Scriptorium (2007) 5.81
    5.814733 = sum of:
      5.814733 = weight(author_txt:humphrey in 1783) [ClassicSimilarity], result of:
        5.814733 = fieldWeight in 1783, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.303573 = idf(docFreq=10, maxDocs=44421)
          0.625 = fieldNorm(doc=1783)
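
The author scores above can be reproduced from the values in the listing. The sketch below assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative, and fieldNorm is taken from the listing rather than recomputed, since Lucene stores it in a lossy encoded form.

```python
import math

def classic_tf(freq):
    # ClassicSimilarity term-frequency factor: square root of the raw count
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values from the author_txt:humphrey entries in the listing
author_score = field_weight(freq=1.0, doc_freq=10, max_docs=44421, field_norm=0.625)
print(round(author_score, 4))
```

All five entries share the same inputs (one match, docFreq=10, fieldNorm=0.625), which is why they tie at 5.81 and why an author-name match alone cannot separate S.M. Humphrey from J. Humphrey.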
    

Similar documents (content)

  1. Ferber, R.: Automated indexing with thesaurus descriptors : a co-occurrence based approach to multilingual retrieval (1997) 0.18
    0.18251376 = sum of:
      0.18251376 = product of:
        0.760474 = sum of:
          0.011882051 = weight(abstract_txt:this in 5144) [ClassicSimilarity], result of:
            0.011882051 = score(doc=5144,freq=3.0), product of:
              0.045615535 = queryWeight, product of:
                1.0873389 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017434515 = queryNorm
              0.26048255 = fieldWeight in 5144, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
          0.017188532 = weight(abstract_txt:approach in 5144) [ClassicSimilarity], result of:
            0.017188532 = score(doc=5144,freq=1.0), product of:
              0.07351134 = queryWeight, product of:
                1.1270419 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017434515 = queryNorm
              0.2338215 = fieldWeight in 5144, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
          0.23801638 = weight(abstract_txt:descriptors in 5144) [ClassicSimilarity], result of:
            0.23801638 = score(doc=5144,freq=6.0), product of:
              0.23328276 = queryWeight, product of:
                2.007725 = boost
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.017434515 = queryNorm
              1.0202913 = fieldWeight in 5144, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
          0.12347661 = weight(abstract_txt:training in 5144) [ClassicSimilarity], result of:
            0.12347661 = score(doc=5144,freq=2.0), product of:
              0.2736854 = queryWeight, product of:
                3.075415 = boost
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.017434515 = queryNorm
              0.45116258 = fieldWeight in 5144, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
          0.09964785 = weight(abstract_txt:documents in 5144) [ClassicSimilarity], result of:
            0.09964785 = score(doc=5144,freq=3.0), product of:
              0.22324412 = queryWeight, product of:
                3.1054382 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017434515 = queryNorm
              0.4463627 = fieldWeight in 5144, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
          0.27026263 = weight(abstract_txt:indexing in 5144) [ClassicSimilarity], result of:
            0.27026263 = score(doc=5144,freq=4.0), product of:
              0.49699938 = queryWeight, product of:
                6.5527835 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.017434515 = queryNorm
              0.5437887 = fieldWeight in 5144, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=5144)
        0.24 = coord(6/25)
    
  2. Humphrey, S.M.; Névéol, A.; Browne, A.; Gobeil, J.; Ruch, P.; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty (2009) 0.18
    0.17953806 = sum of:
      0.17953806 = product of:
        0.74807525 = sum of:
          0.009701654 = weight(abstract_txt:this in 287) [ClassicSimilarity], result of:
            0.009701654 = score(doc=287,freq=2.0), product of:
              0.045615535 = queryWeight, product of:
                1.0873389 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017434515 = queryNorm
              0.21268311 = fieldWeight in 287, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
          0.09716978 = weight(abstract_txt:descriptors in 287) [ClassicSimilarity], result of:
            0.09716978 = score(doc=287,freq=1.0), product of:
              0.23328276 = queryWeight, product of:
                2.007725 = boost
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.017434515 = queryNorm
              0.4165322 = fieldWeight in 287, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
          0.06639064 = weight(abstract_txt:journals in 287) [ClassicSimilarity], result of:
            0.06639064 = score(doc=287,freq=1.0), product of:
              0.20715567 = queryWeight, product of:
                2.317165 = boost
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.017434515 = queryNorm
              0.32048672 = fieldWeight in 287, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
          0.11506342 = weight(abstract_txt:documents in 287) [ClassicSimilarity], result of:
            0.11506342 = score(doc=287,freq=4.0), product of:
              0.22324412 = queryWeight, product of:
                3.1054382 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017434515 = queryNorm
              0.51541525 = fieldWeight in 287, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
          0.18948714 = weight(abstract_txt:journal in 287) [ClassicSimilarity], result of:
            0.18948714 = score(doc=287,freq=2.0), product of:
              0.41681698 = queryWeight, product of:
                4.648322 = boost
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.017434515 = queryNorm
              0.45460513 = fieldWeight in 287, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
          0.27026263 = weight(abstract_txt:indexing in 287) [ClassicSimilarity], result of:
            0.27026263 = score(doc=287,freq=4.0), product of:
              0.49699938 = queryWeight, product of:
                6.5527835 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.017434515 = queryNorm
              0.5437887 = fieldWeight in 287, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=287)
        0.24 = coord(6/25)
    
  3. Harter, S.P.; Nisonger, T.E.; Weng, A.: Semantic relationships between cited and citing articles in library and information science journals (1993) 0.17
    0.16645813 = sum of:
      0.16645813 = product of:
        0.69357556 = sum of:
          0.0068601053 = weight(abstract_txt:this in 5643) [ClassicSimilarity], result of:
            0.0068601053 = score(doc=5643,freq=1.0), product of:
              0.045615535 = queryWeight, product of:
                1.0873389 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017434515 = queryNorm
              0.15038967 = fieldWeight in 5643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
          0.09716978 = weight(abstract_txt:descriptors in 5643) [ClassicSimilarity], result of:
            0.09716978 = score(doc=5643,freq=1.0), product of:
              0.23328276 = queryWeight, product of:
                2.007725 = boost
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.017434515 = queryNorm
              0.4165322 = fieldWeight in 5643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
          0.09389055 = weight(abstract_txt:journals in 5643) [ClassicSimilarity], result of:
            0.09389055 = score(doc=5643,freq=2.0), product of:
              0.20715567 = queryWeight, product of:
                2.317165 = boost
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.017434515 = queryNorm
              0.45323667 = fieldWeight in 5643, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
          0.11506342 = weight(abstract_txt:documents in 5643) [ClassicSimilarity], result of:
            0.11506342 = score(doc=5643,freq=4.0), product of:
              0.22324412 = queryWeight, product of:
                3.1054382 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017434515 = queryNorm
              0.51541525 = fieldWeight in 5643, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
          0.18948714 = weight(abstract_txt:journal in 5643) [ClassicSimilarity], result of:
            0.18948714 = score(doc=5643,freq=2.0), product of:
              0.41681698 = queryWeight, product of:
                4.648322 = boost
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.017434515 = queryNorm
              0.45460513 = fieldWeight in 5643, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
          0.19110455 = weight(abstract_txt:indexing in 5643) [ClassicSimilarity], result of:
            0.19110455 = score(doc=5643,freq=2.0), product of:
              0.49699938 = queryWeight, product of:
                6.5527835 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.017434515 = queryNorm
              0.38451666 = fieldWeight in 5643, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=5643)
        0.24 = coord(6/25)
    
  4. Lu, K.; Mao, J.: An automatic approach to weighted subject indexing : an empirical study in the biomedical domain (2015) 0.15
    0.1473128 = sum of:
      0.1473128 = product of:
        0.736564 = sum of:
          0.009701654 = weight(abstract_txt:this in 5) [ClassicSimilarity], result of:
            0.009701654 = score(doc=5,freq=2.0), product of:
              0.045615535 = queryWeight, product of:
                1.0873389 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017434515 = queryNorm
              0.21268311 = fieldWeight in 5, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=5)
          0.062472943 = weight(abstract_txt:feasible in 5) [ClassicSimilarity], result of:
            0.062472943 = score(doc=5,freq=1.0), product of:
              0.1379261 = queryWeight, product of:
                1.0916191 = boost
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.017434515 = queryNorm
              0.45294502 = fieldWeight in 5, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.0625 = fieldNorm(doc=5)
          0.1374188 = weight(abstract_txt:descriptors in 5) [ClassicSimilarity], result of:
            0.1374188 = score(doc=5,freq=2.0), product of:
              0.23328276 = queryWeight, product of:
                2.007725 = boost
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.017434515 = queryNorm
              0.58906543 = fieldWeight in 5, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.664515 = idf(docFreq=153, maxDocs=44421)
                0.0625 = fieldNorm(doc=5)
          0.09964785 = weight(abstract_txt:documents in 5) [ClassicSimilarity], result of:
            0.09964785 = score(doc=5,freq=3.0), product of:
              0.22324412 = queryWeight, product of:
                3.1054382 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017434515 = queryNorm
              0.4463627 = fieldWeight in 5, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=5)
          0.42732275 = weight(abstract_txt:indexing in 5) [ClassicSimilarity], result of:
            0.42732275 = score(doc=5,freq=10.0), product of:
              0.49699938 = queryWeight, product of:
                6.5527835 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.017434515 = queryNorm
              0.8598054 = fieldWeight in 5, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=5)
        0.2 = coord(5/25)
    
  5. Nebelong-Bonnevie, E.; Frandsen, T.F.: Journal citation identity and journal citation image : a portrait of the Journal of Documentation (2006) 0.13
    0.12849185 = sum of:
      0.12849185 = product of:
        0.6424592 = sum of:
          0.009701654 = weight(abstract_txt:this in 586) [ClassicSimilarity], result of:
            0.009701654 = score(doc=586,freq=2.0), product of:
              0.045615535 = queryWeight, product of:
                1.0873389 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017434515 = queryNorm
              0.21268311 = fieldWeight in 586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=586)
          0.017188532 = weight(abstract_txt:approach in 586) [ClassicSimilarity], result of:
            0.017188532 = score(doc=586,freq=1.0), product of:
              0.07351134 = queryWeight, product of:
                1.1270419 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.017434515 = queryNorm
              0.2338215 = fieldWeight in 586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=586)
          0.09389055 = weight(abstract_txt:journals in 586) [ClassicSimilarity], result of:
            0.09389055 = score(doc=586,freq=2.0), product of:
              0.20715567 = queryWeight, product of:
                2.317165 = boost
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.017434515 = queryNorm
              0.45323667 = fieldWeight in 586, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1277876 = idf(docFreq=715, maxDocs=44421)
                0.0625 = fieldNorm(doc=586)
          0.05753171 = weight(abstract_txt:documents in 586) [ClassicSimilarity], result of:
            0.05753171 = score(doc=586,freq=1.0), product of:
              0.22324412 = queryWeight, product of:
                3.1054382 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.017434515 = queryNorm
              0.25770763 = fieldWeight in 586, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=586)
          0.46414676 = weight(abstract_txt:journal in 586) [ClassicSimilarity], result of:
            0.46414676 = score(doc=586,freq=12.0), product of:
              0.41681698 = queryWeight, product of:
                4.648322 = boost
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.017434515 = queryNorm
              1.1135505 = fieldWeight in 586, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.14327 = idf(docFreq=704, maxDocs=44421)
                0.0625 = fieldNorm(doc=586)
        0.2 = coord(5/25)
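
The content scores above combine a per-term queryWeight and fieldWeight, sum the term scores, and scale the sum by the coord factor (matched query terms over total query terms). The sketch below recomputes the first entry (doc 5144) under the assumption that the standard ClassicSimilarity formulas apply; boost, queryNorm, and fieldNorm are copied from the listing rather than derived, and the function and variable names are illustrative.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.017434515   # copied from the listing
FIELD_NORM = 0.0625        # fieldNorm(doc=5144), copied from the listing

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost, doc_freq, freq):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM           # queryWeight in the tree
    field_weight = math.sqrt(freq) * idf * FIELD_NORM # fieldWeight in the tree
    return query_weight * field_weight

# (term, boost, docFreq, termFreq) for the six terms matched in doc 5144
matched_terms = [
    ("this",        1.0873389, 10885, 3.0),
    ("approach",    1.1270419,  2864, 1.0),
    ("descriptors", 2.007725,    153, 6.0),
    ("training",    3.075415,    732, 2.0),
    ("documents",   3.1054382,  1954, 3.0),
    ("indexing",    6.5527835,  1557, 4.0),
]

term_sum = sum(term_score(b, df, f) for _, b, df, f in matched_terms)
coord = len(matched_terms) / 25   # 6 of the 25 query terms matched
total = coord * term_sum
print(round(total, 4))
```

The coord factor explains the ordering at the bottom of the list: entries 4 and 5 match only 5 of 25 query terms, so their sums are scaled by 0.2 instead of 0.24.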