Document (#44103)

Author
Grabus, S.
Logan, P.M.
Greenberg, J.
Title
Temporal concept drift and alignment : an empirical approach to comparing knowledge organization systems over time
Source
Knowledge organization. 49(2022) no.2, pp.69-78
Year
2022
Abstract
This research explores temporal concept drift and temporal alignment in knowledge organization systems (KOS). A comparative analysis is pursued using the 1910 Library of Congress Subject Headings (LCSH), 2020 FAST Topical, and automatic indexing. The use case involves a sample of 90 nineteenth-century Encyclopedia Britannica entries. The entries were indexed using two approaches: 1) full-text indexing; 2) Named Entity Recognition, performed upon the entries with Stanza, Stanford's NLP toolkit, after which the entities were automatically indexed with the Helping Interdisciplinary Vocabulary application (HIVE), using both 1910 LCSH and FAST Topical. The analysis focused on three goals: 1) identifying results that were exclusive to the 1910 LCSH output; 2) identifying terms in the exclusive set that have been deprecated from the contemporary LCSH, demonstrating temporal concept drift; and 3) exploring the historical significance of these deprecated terms. Results confirm that historical vocabularies can be used to generate anachronistic subject headings representing conceptual drift across time in KOS and historical resources. A methodological contribution is made demonstrating how to study changes in KOS over time and improve the contextualization of historical humanities resources.
Content
Cf.: https://www.nomos-elibrary.de/10.5771/0943-7444-2022-2/ko-knowledge-organization-jahrgang-49-2022-heft-2?page=1.
Object
LCSH
FAST

Similar documents (author)

  1. Logan, E.: Cognitive styles and online behaviour of novice searchers (1990) 2.55
    2.5542176 = sum of:
      2.5542176 = product of:
        5.108435 = sum of:
          5.108435 = weight(author_txt:logan in 6890) [ClassicSimilarity], result of:
            5.108435 = score(doc=6890,freq=1.0), product of:
              0.8247969 = queryWeight, product of:
                1.2077705 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.068913095 = queryNorm
              6.1935673 = fieldWeight in 6890, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=6890)
        0.5 = coord(1/2)
    
  2. Logan, E.: The Internet challenge (1995) 2.55
    2.5542176 = sum of:
      2.5542176 = product of:
        5.108435 = sum of:
          5.108435 = weight(author_txt:logan in 2799) [ClassicSimilarity], result of:
            5.108435 = score(doc=2799,freq=1.0), product of:
              0.8247969 = queryWeight, product of:
                1.2077705 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.068913095 = queryNorm
              6.1935673 = fieldWeight in 2799, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=2799)
        0.5 = coord(1/2)
    
  3. Logan, E.: The Internet challenge accepted (1996) 2.55
    2.5542176 = sum of:
      2.5542176 = product of:
        5.108435 = sum of:
          5.108435 = weight(author_txt:logan in 5927) [ClassicSimilarity], result of:
            5.108435 = score(doc=5927,freq=1.0), product of:
              0.8247969 = queryWeight, product of:
                1.2077705 = boost
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.068913095 = queryNorm
              6.1935673 = fieldWeight in 5927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.909708 = idf(docFreq=5, maxDocs=44421)
                0.625 = fieldNorm(doc=5927)
        0.5 = coord(1/2)
    
  4. Greenberg, A.M.: An author index to Library of Congress Classification: class P, subclasses PN, PR, PS, PZ; general literature, English, juvenile belles lettres (1981) 1.45
    1.4497886 = sum of:
      1.4497886 = product of:
        2.8995771 = sum of:
          2.8995771 = weight(author_txt:greenberg in 3518) [ClassicSimilarity], result of:
            2.8995771 = score(doc=3518,freq=1.0), product of:
              0.56542915 = queryWeight, product of:
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.068913095 = queryNorm
              5.1281 = fieldWeight in 3518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.625 = fieldNorm(doc=3518)
        0.5 = coord(1/2)
    
  5. Greenberg, J.: Subject control of ephemera : MARC format options (1996) 1.45
    1.4497886 = sum of:
      1.4497886 = product of:
        2.8995771 = sum of:
          2.8995771 = weight(author_txt:greenberg in 1543) [ClassicSimilarity], result of:
            2.8995771 = score(doc=1543,freq=1.0), product of:
              0.56542915 = queryWeight, product of:
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.068913095 = queryNorm
              5.1281 = fieldWeight in 1543, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.20496 = idf(docFreq=32, maxDocs=44421)
                0.625 = fieldNorm(doc=1543)
        0.5 = coord(1/2)
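
The author-match breakdowns above can be checked by hand. The following is a minimal sketch, assuming the relationships are exactly those visible in the trace (queryWeight = boost x idf x queryNorm, fieldWeight = tf x idf x fieldNorm, tf = sqrt(freq)) and assuming an idf formula that happens to reproduce the printed idf values. The Python function names are illustrative and not part of any search library.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces the idf values printed in the trace
    # (e.g. docFreq=5, maxDocs=44421 gives roughly 9.9097).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # The trace shows tf = sqrt(freq), e.g. tf(freq=1.0) = 1.0.
    return math.sqrt(freq)

def term_score(freq, term_idf, boost, query_norm, field_norm):
    # queryWeight and fieldWeight as labelled in the trace.
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the first entry (author_txt:logan in doc 6890).
logan_idf = idf(doc_freq=5, max_docs=44421)
logan_score = term_score(
    freq=1.0,
    term_idf=logan_idf,
    boost=1.2077705,
    query_norm=0.068913095,
    field_norm=0.625,
)
final_score = logan_score * 0.5  # coord(1/2): one of two query clauses matched

print(round(logan_idf, 4), round(logan_score, 4), round(final_score, 4))
```

The printed values can be compared against the trace lines 9.909708, 5.108435, and 2.5542176. Small differences in the last digits are to be expected, since the trace appears to use single-precision arithmetic.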
    

Similar documents (content)

  1. Li, W.; Wong, K.-F.; Yuan, C.: Toward automatic Chinese temporal information extraction (2001) 0.08
    0.08102169 = sum of:
      0.08102169 = product of:
        0.50638556 = sum of:
          0.015265568 = weight(abstract_txt:over in 29) [ClassicSimilarity], result of:
            0.015265568 = score(doc=29,freq=1.0), product of:
              0.05756919 = queryWeight, product of:
                1.0416505 = boost
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.013026425 = queryNorm
              0.26516905 = fieldWeight in 29, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.242705 = idf(docFreq=1734, maxDocs=44421)
                0.0625 = fieldNorm(doc=29)
          0.030278098 = weight(abstract_txt:time in 29) [ClassicSimilarity], result of:
            0.030278098 = score(doc=29,freq=2.0), product of:
              0.08256975 = queryWeight, product of:
                1.5278584 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.013026425 = queryNorm
              0.36669722 = fieldWeight in 29, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0625 = fieldNorm(doc=29)
          0.047474176 = weight(abstract_txt:concept in 29) [ClassicSimilarity], result of:
            0.047474176 = score(doc=29,freq=3.0), product of:
              0.0973516 = queryWeight, product of:
                1.6589916 = boost
                4.5047812 = idf(docFreq=1334, maxDocs=44421)
                0.013026425 = queryNorm
              0.48765686 = fieldWeight in 29, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5047812 = idf(docFreq=1334, maxDocs=44421)
                0.0625 = fieldNorm(doc=29)
          0.41336775 = weight(abstract_txt:temporal in 29) [ClassicSimilarity], result of:
            0.41336775 = score(doc=29,freq=10.0), product of:
              0.30358657 = queryWeight, product of:
                3.3828542 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.013026425 = queryNorm
              1.3616141 = fieldWeight in 29, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.0625 = fieldNorm(doc=29)
        0.16 = coord(4/25)
    
  2. Jatowt, A.; Yeung, C.M.A.; Tanaka, K.: Generic method for detecting focus time of documents (2015) 0.08
    0.07751804 = sum of:
      0.07751804 = product of:
        0.38759017 = sum of:
          0.015063092 = weight(abstract_txt:resources in 3668) [ClassicSimilarity], result of:
            0.015063092 = score(doc=3668,freq=1.0), product of:
              0.05705901 = queryWeight, product of:
                1.0370247 = boost
                4.2238636 = idf(docFreq=1767, maxDocs=44421)
                0.013026425 = queryNorm
              0.26399148 = fieldWeight in 3668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2238636 = idf(docFreq=1767, maxDocs=44421)
                0.0625 = fieldNorm(doc=3668)
          0.012385832 = weight(abstract_txt:using in 3668) [ClassicSimilarity], result of:
            0.012385832 = score(doc=3668,freq=1.0), product of:
              0.05732737 = queryWeight, product of:
                1.2730739 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.013026425 = queryNorm
              0.21605442 = fieldWeight in 3668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=3668)
          0.06422954 = weight(abstract_txt:time in 3668) [ClassicSimilarity], result of:
            0.06422954 = score(doc=3668,freq=9.0), product of:
              0.08256975 = queryWeight, product of:
                1.5278584 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.013026425 = queryNorm
              0.7778823 = fieldWeight in 3668, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.0625 = fieldNorm(doc=3668)
          0.06950087 = weight(abstract_txt:historical in 3668) [ClassicSimilarity], result of:
            0.06950087 = score(doc=3668,freq=1.0), product of:
              0.19924387 = queryWeight, product of:
                2.7405295 = boost
                5.58117 = idf(docFreq=454, maxDocs=44421)
                0.013026425 = queryNorm
              0.34882313 = fieldWeight in 3668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.58117 = idf(docFreq=454, maxDocs=44421)
                0.0625 = fieldNorm(doc=3668)
          0.22641085 = weight(abstract_txt:temporal in 3668) [ClassicSimilarity], result of:
            0.22641085 = score(doc=3668,freq=3.0), product of:
              0.30358657 = queryWeight, product of:
                3.3828542 = boost
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.013026425 = queryNorm
              0.7457868 = fieldWeight in 3668, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.889283 = idf(docFreq=122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3668)
        0.2 = coord(5/25)
    
  3. Carlyle, A.: Matching LCSH and user vocabulary in the library catalog (1989) 0.07
    0.07382683 = sum of:
      0.07382683 = product of:
        0.36913413 = sum of:
          0.06838608 = weight(abstract_txt:headings in 574) [ClassicSimilarity], result of:
            0.06838608 = score(doc=574,freq=4.0), product of:
              0.08493107 = queryWeight, product of:
                1.2652032 = boost
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.013026425 = queryNorm
              0.80519503 = fieldWeight in 574, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.078125 = fieldNorm(doc=574)
          0.01548229 = weight(abstract_txt:using in 574) [ClassicSimilarity], result of:
            0.01548229 = score(doc=574,freq=1.0), product of:
              0.05732737 = queryWeight, product of:
                1.2730739 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.013026425 = queryNorm
              0.27006802 = fieldWeight in 574, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=574)
          0.018488033 = weight(abstract_txt:were in 574) [ClassicSimilarity], result of:
            0.018488033 = score(doc=574,freq=1.0), product of:
              0.06452565 = queryWeight, product of:
                1.3506376 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.013026425 = queryNorm
              0.28652224 = fieldWeight in 574, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.078125 = fieldNorm(doc=574)
          0.03784762 = weight(abstract_txt:time in 574) [ClassicSimilarity], result of:
            0.03784762 = score(doc=574,freq=2.0), product of:
              0.08256975 = queryWeight, product of:
                1.5278584 = boost
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.013026425 = queryNorm
              0.45837152 = fieldWeight in 574, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1487055 = idf(docFreq=1905, maxDocs=44421)
                0.078125 = fieldNorm(doc=574)
          0.22893009 = weight(abstract_txt:lcsh in 574) [ClassicSimilarity], result of:
            0.22893009 = score(doc=574,freq=6.0), product of:
              0.1900597 = queryWeight, product of:
                2.3180225 = boost
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.013026425 = queryNorm
              1.2045166 = fieldWeight in 574, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.078125 = fieldNorm(doc=574)
        0.2 = coord(5/25)
    
  4. Yi, K.; Chan, L.M.: Revisiting the syntactical and structural analysis of Library of Congress Subject Headings for the digital environment (2010) 0.07
    0.071928345 = sum of:
      0.071928345 = product of:
        0.35964173 = sum of:
          0.03765773 = weight(abstract_txt:resources in 418) [ClassicSimilarity], result of:
            0.03765773 = score(doc=418,freq=4.0), product of:
              0.05705901 = queryWeight, product of:
                1.0370247 = boost
                4.2238636 = idf(docFreq=1767, maxDocs=44421)
                0.013026425 = queryNorm
              0.6599787 = fieldWeight in 418, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2238636 = idf(docFreq=1767, maxDocs=44421)
                0.078125 = fieldNorm(doc=418)
          0.022030182 = weight(abstract_txt:organization in 418) [ClassicSimilarity], result of:
            0.022030182 = score(doc=418,freq=1.0), product of:
              0.06335587 = queryWeight, product of:
                1.092749 = boost
                4.450832 = idf(docFreq=1408, maxDocs=44421)
                0.013026425 = queryNorm
              0.34772125 = fieldWeight in 418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.450832 = idf(docFreq=1408, maxDocs=44421)
                0.078125 = fieldNorm(doc=418)
          0.03419304 = weight(abstract_txt:headings in 418) [ClassicSimilarity], result of:
            0.03419304 = score(doc=418,freq=1.0), product of:
              0.08493107 = queryWeight, product of:
                1.2652032 = boost
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.013026425 = queryNorm
              0.40259752 = fieldWeight in 418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.078125 = fieldNorm(doc=418)
          0.018488033 = weight(abstract_txt:were in 418) [ClassicSimilarity], result of:
            0.018488033 = score(doc=418,freq=1.0), product of:
              0.06452565 = queryWeight, product of:
                1.3506376 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.013026425 = queryNorm
              0.28652224 = fieldWeight in 418, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.078125 = fieldNorm(doc=418)
          0.24727273 = weight(abstract_txt:lcsh in 418) [ClassicSimilarity], result of:
            0.24727273 = score(doc=418,freq=7.0), product of:
              0.1900597 = queryWeight, product of:
                2.3180225 = boost
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.013026425 = queryNorm
              1.3010266 = fieldWeight in 418, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.078125 = fieldNorm(doc=418)
        0.2 = coord(5/25)
    
  5. Frost, C.O.; Dede, B.A.: Subject heading compatibility between LCSH and catalog files of a large research library : a suggested model for analysis (1988) 0.07
    0.067323156 = sum of:
      0.067323156 = product of:
        0.42076972 = sum of:
          0.07106889 = weight(abstract_txt:headings in 654) [ClassicSimilarity], result of:
            0.07106889 = score(doc=654,freq=3.0), product of:
              0.08493107 = queryWeight, product of:
                1.2652032 = boost
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.013026425 = queryNorm
              0.83678323 = fieldWeight in 654, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1532483 = idf(docFreq=697, maxDocs=44421)
                0.09375 = fieldNorm(doc=654)
          0.031375233 = weight(abstract_txt:were in 654) [ClassicSimilarity], result of:
            0.031375233 = score(doc=654,freq=2.0), product of:
              0.06452565 = queryWeight, product of:
                1.3506376 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.013026425 = queryNorm
              0.48624435 = fieldWeight in 654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.09375 = fieldNorm(doc=654)
          0.124071985 = weight(abstract_txt:topical in 654) [ClassicSimilarity], result of:
            0.124071985 = score(doc=654,freq=2.0), product of:
              0.1409591 = queryWeight, product of:
                1.6299473 = boost
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.013026425 = queryNorm
              0.8801985 = fieldWeight in 654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6388726 = idf(docFreq=157, maxDocs=44421)
                0.09375 = fieldNorm(doc=654)
          0.19425361 = weight(abstract_txt:lcsh in 654) [ClassicSimilarity], result of:
            0.19425361 = score(doc=654,freq=3.0), product of:
              0.1900597 = queryWeight, product of:
                2.3180225 = boost
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.013026425 = queryNorm
              1.0220662 = fieldWeight in 654, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.294296 = idf(docFreq=222, maxDocs=44421)
                0.09375 = fieldNorm(doc=654)
        0.16 = coord(4/25)
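
The content-match scores combine several abstract terms: the per-term scores are summed and the sum is scaled by the coord factor (matched terms divided by the 25 query terms). As a rough sketch under the same assumptions the trace itself suggests (tf = sqrt(freq), with boost, idf, queryNorm, and fieldNorm copied verbatim from the last entry above, doc 654), the arithmetic can be reproduced as follows. The variable and function names are illustrative only.

```python
import math

QUERY_NORM = 0.013026425   # queryNorm shared by all content terms in the trace
FIELD_NORM = 0.09375       # fieldNorm(doc=654)
QUERY_TERMS = 25           # denominator of coord(4/25)

# (term, freq, boost, idf) copied from the trace for doc 654.
matched = [
    ("headings", 3.0, 1.2652032, 5.1532483),
    ("were",     2.0, 1.3506376, 3.6674848),
    ("topical",  2.0, 1.6299473, 6.6388726),
    ("lcsh",     3.0, 2.3180225, 6.294296),
]

def term_score(freq, boost, idf):
    # queryWeight * fieldWeight, as labelled in the trace.
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

total = sum(term_score(freq, boost, idf) for _, freq, boost, idf in matched)
coord = len(matched) / QUERY_TERMS
score = total * coord

print(round(total, 4), coord, round(score, 4))
```

The results can be compared with the trace lines 0.42076972 (sum), 0.16 (coord), and 0.067323156 (final score). The coord factor is why a document matching only 4 of 25 terms ends up with a displayed score of 0.07 despite a much larger raw sum.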