Document (#30654)

Author
Haravu, L.J.
Neelameghan, A.
Title
Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products
Source
Cataloging and classification quarterly. 37(2003) nos.1/2, p.96-114
Year
2003
Abstract
Discusses the importance of knowledge organization in the context of the information overload caused by the vast quantities of data and information accessible on the internal and external networks of an organization. Defines the characteristics of a knowledge-based product. Elaborates on the techniques and applications of text mining in developing knowledge products. Presents two approaches, as case studies, to the making of knowledge products: (1) steps and processes in the planning, design and development of a composite multilingual multimedia CD product, with potential international, inter-cultural end users in view, and (2) application of natural language processing software in text mining. Using text mining software, it is possible to link concept terms from a processed text to a related thesaurus, glossary, schedules of a classification scheme, and facet-structured subject representations. Concludes that the products of text mining and data mining could be made more useful if the features of a faceted scheme for subject classification are incorporated into text mining techniques and products.
Content
Contribution to a special issue on "Knowledge organization and classification in international information retrieval"
Footnote
See also: http://catalogingandclassificationquarterly.com/
Theme
Data Mining

Similar documents (author)

  1. Neelameghan, A.: Interdisciplinary research and classification problems : a case study (1974) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:neelameghan in 1816) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 1816, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=1816)
    
  2. Neelameghan, A.: Classification, theory of (1971) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:neelameghan in 1987) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 1987, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=1987)
    
  3. Neelameghan, A.: Design of scheme for classification (1969) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:neelameghan in 1989) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 1989, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=1989)
    
  4. Neelameghan, A.: Use of computer for the synthesis of class number : a case study with a freely faceted version of Colon Classification (1968) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:neelameghan in 1990) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 1990, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=1990)
    
  5. Neelameghan, A.: Application of Ranganathan's general theory of knowledge classification in designing specialized databases (1992) 5.11
    5.1094418 = sum of:
      5.1094418 = weight(author_txt:neelameghan in 2962) [ClassicSimilarity], result of:
        5.1094418 = fieldWeight in 2962, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.175107 = idf(docFreq=33, maxDocs=44421)
          0.625 = fieldNorm(doc=2962)
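
The five author-match explanations above all reduce to the same arithmetic: fieldWeight = tf x idf x fieldNorm. The following is a minimal Python sketch that recomputes those figures. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), a formula inferred from the printed numbers rather than confirmed by the record itself. The function names are illustrative, and fieldNorm is taken as printed, not derived.

```python
import math

def classic_tf(freq):
    # Assumed term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Assumed inverse document frequency; this form reproduces the printed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight as broken down in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the author_txt:neelameghan explanations.
idf = classic_idf(doc_freq=33, max_docs=44421)
weight = field_weight(freq=1.0, doc_freq=33, max_docs=44421, field_norm=0.625)
print(round(idf, 4), round(weight, 4))
```

Because every one of the five records matches the same single author term with the same frequency and field norm, they all receive the identical score of 5.11.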
    

Similar documents (content)

  1. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.31
    0.3137668 = sum of:
      0.3137668 = product of:
        0.9805213 = sum of:
          0.054131508 = weight(abstract_txt:processed in 3291) [ClassicSimilarity], result of:
            0.054131508 = score(doc=3291,freq=1.0), product of:
              0.11970106 = queryWeight, product of:
                1.1365675 = boost
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.014555618 = queryNorm
              0.45222247 = fieldWeight in 3291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2355595 = idf(docFreq=86, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.033454362 = weight(abstract_txt:software in 3291) [ClassicSimilarity], result of:
            0.033454362 = score(doc=3291,freq=2.0), product of:
              0.08684932 = queryWeight, product of:
                1.3691292 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.014555618 = queryNorm
              0.38520005 = fieldWeight in 3291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.037625495 = weight(abstract_txt:techniques in 3291) [ClassicSimilarity], result of:
            0.037625495 = score(doc=3291,freq=2.0), product of:
              0.09392605 = queryWeight, product of:
                1.4238173 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.014555618 = queryNorm
              0.40058637 = fieldWeight in 3291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.035583485 = weight(abstract_txt:making in 3291) [ClassicSimilarity], result of:
            0.035583485 = score(doc=3291,freq=1.0), product of:
              0.11401804 = queryWeight, product of:
                1.5687293 = boost
                4.9933834 = idf(docFreq=818, maxDocs=44421)
                0.014555618 = queryNorm
              0.31208646 = fieldWeight in 3291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9933834 = idf(docFreq=818, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.05708837 = weight(abstract_txt:data in 3291) [ClassicSimilarity], result of:
            0.05708837 = score(doc=3291,freq=13.0), product of:
              0.07607156 = queryWeight, product of:
                1.5693436 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.014555618 = queryNorm
              0.75045615 = fieldWeight in 3291, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.038242433 = weight(abstract_txt:knowledge in 3291) [ClassicSimilarity], result of:
            0.038242433 = score(doc=3291,freq=1.0), product of:
              0.17253558 = queryWeight, product of:
                3.3424213 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.014555618 = queryNorm
              0.22164954 = fieldWeight in 3291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.066002496 = weight(abstract_txt:text in 3291) [ClassicSimilarity], result of:
            0.066002496 = score(doc=3291,freq=1.0), product of:
              0.26133895 = queryWeight, product of:
                4.443215 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014555618 = queryNorm
              0.25255513 = fieldWeight in 3291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
          0.65839314 = weight(abstract_txt:mining in 3291) [ClassicSimilarity], result of:
            0.65839314 = score(doc=3291,freq=6.0), product of:
              0.6967886 = queryWeight, product of:
                7.75607 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.014555618 = queryNorm
              0.9448966 = fieldWeight in 3291, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=3291)
        0.32 = coord(8/25)
    
  2. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.24
    0.23617457 = sum of:
      0.23617457 = product of:
        1.1808728 = sum of:
          0.037625495 = weight(abstract_txt:techniques in 1354) [ClassicSimilarity], result of:
            0.037625495 = score(doc=1354,freq=2.0), product of:
              0.09392605 = queryWeight, product of:
                1.4238173 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.014555618 = queryNorm
              0.40058637 = fieldWeight in 1354, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.04189141 = weight(abstract_txt:data in 1354) [ClassicSimilarity], result of:
            0.04189141 = score(doc=1354,freq=7.0), product of:
              0.07607156 = queryWeight, product of:
                1.5693436 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.014555618 = queryNorm
              0.5506843 = fieldWeight in 1354, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.038242433 = weight(abstract_txt:knowledge in 1354) [ClassicSimilarity], result of:
            0.038242433 = score(doc=1354,freq=1.0), product of:
              0.17253558 = queryWeight, product of:
                3.3424213 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.014555618 = queryNorm
              0.22164954 = fieldWeight in 1354, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.13200499 = weight(abstract_txt:text in 1354) [ClassicSimilarity], result of:
            0.13200499 = score(doc=1354,freq=4.0), product of:
              0.26133895 = queryWeight, product of:
                4.443215 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014555618 = queryNorm
              0.50511026 = fieldWeight in 1354, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
          0.9311085 = weight(abstract_txt:mining in 1354) [ClassicSimilarity], result of:
            0.9311085 = score(doc=1354,freq=12.0), product of:
              0.6967886 = queryWeight, product of:
                7.75607 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.014555618 = queryNorm
              1.3362855 = fieldWeight in 1354, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=1354)
        0.2 = coord(5/25)
    
  3. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.22
    0.2222817 = sum of:
      0.2222817 = product of:
        1.1114085 = sum of:
          0.10290931 = weight(abstract_txt:quantities in 3899) [ClassicSimilarity], result of:
            0.10290931 = score(doc=3899,freq=1.0), product of:
              0.14018671 = queryWeight, product of:
                1.2299845 = boost
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.014555618 = queryNorm
              0.73408747 = fieldWeight in 3899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8302665 = idf(docFreq=47, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.03990786 = weight(abstract_txt:techniques in 3899) [ClassicSimilarity], result of:
            0.03990786 = score(doc=3899,freq=1.0), product of:
              0.09392605 = queryWeight, product of:
                1.4238173 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.014555618 = queryNorm
              0.424886 = fieldWeight in 3899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.047500398 = weight(abstract_txt:data in 3899) [ClassicSimilarity], result of:
            0.047500398 = score(doc=3899,freq=4.0), product of:
              0.07607156 = queryWeight, product of:
                1.5693436 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.014555618 = queryNorm
              0.6244173 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.1147273 = weight(abstract_txt:knowledge in 3899) [ClassicSimilarity], result of:
            0.1147273 = score(doc=3899,freq=4.0), product of:
              0.17253558 = queryWeight, product of:
                3.3424213 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.014555618 = queryNorm
              0.66494864 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.80636364 = weight(abstract_txt:mining in 3899) [ClassicSimilarity], result of:
            0.80636364 = score(doc=3899,freq=4.0), product of:
              0.6967886 = queryWeight, product of:
                7.75607 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.014555618 = queryNorm
              1.1572572 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
        0.2 = coord(5/25)
    
  4. Srinivasan, P.: Text mining in biomedicine : challenges and opportunities (2006) 0.21
    0.21251264 = sum of:
      0.21251264 = product of:
        1.0625632 = sum of:
          0.011521644 = weight(abstract_txt:based in 2497) [ClassicSimilarity], result of:
            0.011521644 = score(doc=2497,freq=1.0), product of:
              0.046331625 = queryWeight, product of:
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.014555618 = queryNorm
              0.24867775 = fieldWeight in 2497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.078125 = fieldNorm(doc=2497)
          0.053186633 = weight(abstract_txt:vast in 2497) [ClassicSimilarity], result of:
            0.053186633 = score(doc=2497,freq=1.0), product of:
              0.10195133 = queryWeight, product of:
                1.0489208 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.014555618 = queryNorm
              0.5216865 = fieldWeight in 2497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.078125 = fieldNorm(doc=2497)
          0.04447936 = weight(abstract_txt:making in 2497) [ClassicSimilarity], result of:
            0.04447936 = score(doc=2497,freq=1.0), product of:
              0.11401804 = queryWeight, product of:
                1.5687293 = boost
                4.9933834 = idf(docFreq=818, maxDocs=44421)
                0.014555618 = queryNorm
              0.39010808 = fieldWeight in 2497, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9933834 = idf(docFreq=818, maxDocs=44421)
                0.078125 = fieldNorm(doc=2497)
          0.20209056 = weight(abstract_txt:text in 2497) [ClassicSimilarity], result of:
            0.20209056 = score(doc=2497,freq=6.0), product of:
              0.26133895 = queryWeight, product of:
                4.443215 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014555618 = queryNorm
              0.7732891 = fieldWeight in 2497, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=2497)
          0.751285 = weight(abstract_txt:mining in 2497) [ClassicSimilarity], result of:
            0.751285 = score(doc=2497,freq=5.0), product of:
              0.6967886 = queryWeight, product of:
                7.75607 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.014555618 = queryNorm
              1.0782108 = fieldWeight in 2497, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.078125 = fieldNorm(doc=2497)
        0.2 = coord(5/25)
    
  5. Joo, S.; Choi, I.; Choi, N.: Topic analysis of the research domain in knowledge organization : a Latent Dirichlet Allocation approach (2018) 0.21
    0.21148872 = sum of:
      0.21148872 = product of:
        0.75531685 = sum of:
          0.009217315 = weight(abstract_txt:based in 304) [ClassicSimilarity], result of:
            0.009217315 = score(doc=304,freq=1.0), product of:
              0.046331625 = queryWeight, product of:
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.014555618 = queryNorm
              0.1989422 = fieldWeight in 304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.018183915 = weight(abstract_txt:classification in 304) [ClassicSimilarity], result of:
            0.018183915 = score(doc=304,freq=1.0), product of:
              0.07287851 = queryWeight, product of:
                1.2541832 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.014555618 = queryNorm
              0.24950996 = fieldWeight in 304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.047210135 = weight(abstract_txt:scheme in 304) [ClassicSimilarity], result of:
            0.047210135 = score(doc=304,freq=1.0), product of:
              0.13766749 = queryWeight, product of:
                1.7237605 = boost
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.014555618 = queryNorm
              0.3429287 = fieldWeight in 304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4868593 = idf(docFreq=499, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.09258791 = weight(abstract_txt:organization in 304) [ClassicSimilarity], result of:
            0.09258791 = score(doc=304,freq=6.0), product of:
              0.13588059 = queryWeight, product of:
                2.0974207 = boost
                4.450832 = idf(docFreq=1408, maxDocs=44421)
                0.014555618 = queryNorm
              0.6813917 = fieldWeight in 304, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.450832 = idf(docFreq=1408, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.09367444 = weight(abstract_txt:knowledge in 304) [ClassicSimilarity], result of:
            0.09367444 = score(doc=304,freq=6.0), product of:
              0.17253558 = queryWeight, product of:
                3.3424213 = boost
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.014555618 = queryNorm
              0.5429283 = fieldWeight in 304, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.5463927 = idf(docFreq=3480, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.11431967 = weight(abstract_txt:text in 304) [ClassicSimilarity], result of:
            0.11431967 = score(doc=304,freq=3.0), product of:
              0.26133895 = queryWeight, product of:
                4.443215 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.014555618 = queryNorm
              0.4374383 = fieldWeight in 304, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
          0.38012347 = weight(abstract_txt:mining in 304) [ClassicSimilarity], result of:
            0.38012347 = score(doc=304,freq=2.0), product of:
              0.6967886 = queryWeight, product of:
                7.75607 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.014555618 = queryNorm
              0.5455363 = fieldWeight in 304, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=304)
        0.28 = coord(7/25)
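
The content-similarity trees above follow one pattern: each matched term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms divided by the 25 terms in the query). The sketch below recomputes entry 5 (doc=304) from the printed inputs. It again assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), a boost of 1.0 where the tree prints none, and illustrative function and variable names that do not come from the record.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.014555618   # printed queryNorm, shared by all terms
FIELD_NORM = 0.0625        # printed fieldNorm for doc=304

def term_score(freq, doc_freq, boost=1.0):
    # Assumed idf formula; it reproduces the idf values printed in the tree.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

# (term, freq, docFreq, boost) copied from the explanation for doc=304.
matched = [
    ("based", 1.0, 5005, 1.0),            # no boost line printed; 1.0 assumed
    ("classification", 1.0, 2228, 1.2541832),
    ("scheme", 1.0, 499, 1.7237605),
    ("organization", 6.0, 1408, 2.0974207),
    ("knowledge", 6.0, 3480, 3.3424213),
    ("text", 3.0, 2122, 4.443215),
    ("mining", 2.0, 251, 7.75607),
]

raw_sum = sum(term_score(freq, df, boost) for _, freq, df, boost in matched)
coord = len(matched) / 25          # 7 of 25 query terms matched
final_score = raw_sum * coord
print(round(raw_sum, 4), round(final_score, 4))
```

The coord factor explains why entry 1, with a lower raw sum (0.98) than entries 2 to 4, still ranks first: it matches 8 of the 25 query terms, while those entries match only 5.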