Document (#33385)

Author
Britt, B.L.
Berry, M.W.
Browne, M.
Merrell, M.A.
Kolpack, J.
Title
Document classification techniques for automated technology readiness level analysis
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.4, pp. 675-680
Year
2008
Abstract
The overhead of assessing technology readiness for deployment and investment purposes can be costly to both large and small businesses. Recent advances in the automatic interpretation of technology readiness levels (TRLs) of a given technology can substantially reduce the risk and associated cost of bringing these new technologies to market. Using vector-space information-retrieval models, such as latent semantic indexing, it is feasible to group similar technology descriptions by exploiting the latent structure of term usage within textual documents. Once the documents have been semantically clustered (or grouped), they can be classified based on the TRL scores of (known) nearest-neighbor documents. Three automated (no human curation) strategies for assigning TRLs to documents are discussed, with accuracies as high as 86% achieved for two-class problems.
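
The general idea in the abstract (project documents into a latent semantic space, then label a new document from its nearest labeled neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy term-by-document matrix, the "low"/"high" two-class labels, the rank k=2, the cosine measure, the majority vote, and the function names `lsi_fit` and `knn_label` are all assumptions made for illustration.

```python
from collections import Counter

import numpy as np


def lsi_fit(term_doc, k):
    """Rank-k LSI via SVD of a term-by-document matrix.

    Returns U_k (term basis) and one k-dimensional row vector per document.
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    u_k = u[:, :k]
    doc_vecs = (s[:k, None] * vt[:k, :]).T  # equals (U_k^T A)^T
    return u_k, doc_vecs


def knn_label(doc_vecs, labels, query_vec, n_neighbors=3):
    """Majority label among the n_neighbors most cosine-similar documents."""
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (doc_vecs @ query_vec) / np.where(denom == 0, 1.0, denom)
    nearest = np.argsort(-sims)[:n_neighbors]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy data: rows are terms, columns are documents.
# Terms: prototype, laboratory, deployed, operational
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])
labels = ["low", "low", "high", "high"]  # assumed two-class TRL labels

u_k, doc_vecs = lsi_fit(term_doc, k=2)

# Fold unseen term vectors into the latent space with U_k^T q.
early_stage = u_k.T @ np.array([1.0, 1.0, 0.0, 0.0])
fielded = u_k.T @ np.array([0.0, 0.0, 1.0, 1.0])

predicted_early = knn_label(doc_vecs, labels, early_stage)
predicted_fielded = knn_label(doc_vecs, labels, fielded)
```

Folding a query in with U_k^T q keeps it in the same coordinates as the document vectors, so cosine similarities between them are meaningful.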

Similar documents (author)

  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 6.49
    6.4887934 = sum of:
      6.4887934 = sum of:
        3.2019496 = weight(author_txt:browne in 777) [ClassicSimilarity], result of:
          3.2019496 = score(doc=777,freq=1.0), product of:
            0.7009124 = queryWeight, product of:
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.076715484 = queryNorm
            4.5682592 = fieldWeight in 777, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.5 = fieldNorm(doc=777)
        3.286844 = weight(author_txt:berry in 777) [ClassicSimilarity], result of:
          3.286844 = score(doc=777,freq=1.0), product of:
            0.71324736 = queryWeight, product of:
              1.0087608 = boost
              9.216561 = idf(docFreq=11, maxDocs=44421)
              0.076715484 = queryNorm
            4.6082807 = fieldWeight in 777, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.216561 = idf(docFreq=11, maxDocs=44421)
              0.5 = fieldNorm(doc=777)
    
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 6.49
    6.4887934 = sum of:
      6.4887934 = sum of:
        3.2019496 = weight(author_txt:browne in 1007) [ClassicSimilarity], result of:
          3.2019496 = score(doc=1007,freq=1.0), product of:
            0.7009124 = queryWeight, product of:
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.076715484 = queryNorm
            4.5682592 = fieldWeight in 1007, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.1365185 = idf(docFreq=12, maxDocs=44421)
              0.5 = fieldNorm(doc=1007)
        3.286844 = weight(author_txt:berry in 1007) [ClassicSimilarity], result of:
          3.286844 = score(doc=1007,freq=1.0), product of:
            0.71324736 = queryWeight, product of:
              1.0087608 = boost
              9.216561 = idf(docFreq=11, maxDocs=44421)
              0.076715484 = queryNorm
            4.6082807 = fieldWeight in 1007, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.216561 = idf(docFreq=11, maxDocs=44421)
              0.5 = fieldNorm(doc=1007)
    
  3. Berry, J.: CD-ROM: the medium for the moment (1992) 2.05
    2.0542774 = sum of:
      2.0542774 = product of:
        4.108555 = sum of:
          4.108555 = weight(author_txt:berry in 3634) [ClassicSimilarity], result of:
            4.108555 = score(doc=3634,freq=1.0), product of:
              0.71324736 = queryWeight, product of:
                1.0087608 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.076715484 = queryNorm
              5.7603507 = fieldWeight in 3634, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.625 = fieldNorm(doc=3634)
        0.5 = coord(1/2)
    
  4. Browne, G.: Scope notes for LISA subject headings (1992) 2.00
    2.0012186 = sum of:
      2.0012186 = product of:
        4.002437 = sum of:
          4.002437 = weight(author_txt:browne in 1498) [ClassicSimilarity], result of:
            4.002437 = score(doc=1498,freq=1.0), product of:
              0.7009124 = queryWeight, product of:
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.076715484 = queryNorm
              5.7103243 = fieldWeight in 1498, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.625 = fieldNorm(doc=1498)
        0.5 = coord(1/2)
    
  5. Browne, G.: Professional liability of indexers (1996) 2.00
    2.0012186 = sum of:
      2.0012186 = product of:
        4.002437 = sum of:
          4.002437 = weight(author_txt:browne in 4643) [ClassicSimilarity], result of:
            4.002437 = score(doc=4643,freq=1.0), product of:
              0.7009124 = queryWeight, product of:
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.076715484 = queryNorm
              5.7103243 = fieldWeight in 4643, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.1365185 = idf(docFreq=12, maxDocs=44421)
                0.625 = fieldNorm(doc=4643)
        0.5 = coord(1/2)
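
The per-term scores in the listings above can be reproduced from the printed factors. The sketch below assumes the idf form 1 + ln(maxDocs / (docFreq + 1)), which matches the printed idf values, and tf = sqrt(freq); the function names are hypothetical and the constants are copied from the entry for doc 777.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces the idf values printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """score = queryWeight * fieldWeight, as in the explanation tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm           # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


MAX_DOCS = 44421
QUERY_NORM = 0.076715484

# Values for doc 777 (both author terms match, fieldNorm = 0.5).
browne = term_score(1.0, 12, MAX_DOCS, QUERY_NORM, 0.5)
berry = term_score(1.0, 11, MAX_DOCS, QUERY_NORM, 0.5, boost=1.0087608)
total_777 = browne + berry

print(round(browne, 4), round(berry, 4), round(total_777, 4))
```

The results agree with the listed 3.2019496, 3.286844 and 6.4887934 up to floating-point rounding.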
    

Similar documents (content)

  1. Sun, J.: Why different people prefer different systems for different tasks : an activity perspective on technology adoption in a dynamic user environment (2012) 0.07
    0.07224301 = sum of:
      0.07224301 = product of:
        0.90303767 = sum of:
          0.07880649 = weight(abstract_txt:technology in 961) [ClassicSimilarity], result of:
            0.07880649 = score(doc=961,freq=1.0), product of:
              0.2356011 = queryWeight, product of:
                3.2367485 = boost
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.01700097 = queryNorm
              0.3344912 = fieldWeight in 961, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2814875 = idf(docFreq=1668, maxDocs=44421)
                0.078125 = fieldNorm(doc=961)
          0.8242312 = weight(abstract_txt:readiness in 961) [ClassicSimilarity], result of:
            0.8242312 = score(doc=961,freq=4.0), product of:
              0.598686 = queryWeight, product of:
                3.996644 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.01700097 = queryNorm
              1.3767338 = fieldWeight in 961, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.078125 = fieldNorm(doc=961)
        0.08 = coord(2/25)
    
  2. Kim, J.-H.; Choi, K.-S.: Patent document categorization based on semantic structural information (2007) 0.06
    0.0575208 = sum of:
      0.0575208 = product of:
        0.359505 = sum of:
          0.052346528 = weight(abstract_txt:semantically in 1933) [ClassicSimilarity], result of:
            0.052346528 = score(doc=1933,freq=1.0), product of:
              0.12171513 = queryWeight, product of:
                1.0404173 = boost
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.01700097 = queryNorm
              0.43007413 = fieldWeight in 1933, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.881186 = idf(docFreq=123, maxDocs=44421)
                0.0625 = fieldNorm(doc=1933)
          0.109961584 = weight(abstract_txt:clustered in 1933) [ClassicSimilarity], result of:
            0.109961584 = score(doc=1933,freq=2.0), product of:
              0.15845403 = queryWeight, product of:
                1.187099 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01700097 = queryNorm
              0.6939652 = fieldWeight in 1933, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=1933)
          0.08684609 = weight(abstract_txt:nearest in 1933) [ClassicSimilarity], result of:
            0.08684609 = score(doc=1933,freq=1.0), product of:
              0.17057662 = queryWeight, product of:
                1.2316719 = boost
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.01700097 = queryNorm
              0.50913244 = fieldWeight in 1933, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.0625 = fieldNorm(doc=1933)
          0.11035078 = weight(abstract_txt:documents in 1933) [ClassicSimilarity], result of:
            0.11035078 = score(doc=1933,freq=6.0), product of:
              0.1748125 = queryWeight, product of:
                2.493742 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01700097 = queryNorm
              0.6312522 = fieldWeight in 1933, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=1933)
        0.16 = coord(4/25)
    
  3. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.06
    0.05673252 = sum of:
      0.05673252 = product of:
        0.35457826 = sum of:
          0.09197095 = weight(abstract_txt:investment in 393) [ClassicSimilarity], result of:
            0.09197095 = score(doc=393,freq=2.0), product of:
              0.14066188 = queryWeight, product of:
                1.1184678 = boost
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.01700097 = queryNorm
              0.6538442 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3974023 = idf(docFreq=73, maxDocs=44421)
                0.0625 = fieldNorm(doc=393)
          0.08684609 = weight(abstract_txt:nearest in 393) [ClassicSimilarity], result of:
            0.08684609 = score(doc=393,freq=1.0), product of:
              0.17057662 = queryWeight, product of:
                1.2316719 = boost
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.01700097 = queryNorm
              0.50913244 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.0625 = fieldNorm(doc=393)
          0.11205016 = weight(abstract_txt:neighbor in 393) [ClassicSimilarity], result of:
            0.11205016 = score(doc=393,freq=1.0), product of:
              0.20215957 = queryWeight, product of:
                1.3408569 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.01700097 = queryNorm
              0.5542659 = fieldWeight in 393, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.0625 = fieldNorm(doc=393)
          0.06371105 = weight(abstract_txt:documents in 393) [ClassicSimilarity], result of:
            0.06371105 = score(doc=393,freq=2.0), product of:
              0.1748125 = queryWeight, product of:
                2.493742 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01700097 = queryNorm
              0.3644536 = fieldWeight in 393, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=393)
        0.16 = coord(4/25)
    
  4. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.05
    0.054457903 = sum of:
      0.054457903 = product of:
        0.3403619 = sum of:
          0.07775458 = weight(abstract_txt:clustered in 1979) [ClassicSimilarity], result of:
            0.07775458 = score(doc=1979,freq=1.0), product of:
              0.15845403 = queryWeight, product of:
                1.187099 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.01700097 = queryNorm
              0.4907075 = fieldWeight in 1979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=1979)
          0.08684609 = weight(abstract_txt:nearest in 1979) [ClassicSimilarity], result of:
            0.08684609 = score(doc=1979,freq=1.0), product of:
              0.17057662 = queryWeight, product of:
                1.2316719 = boost
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.01700097 = queryNorm
              0.50913244 = fieldWeight in 1979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.146119 = idf(docFreq=34, maxDocs=44421)
                0.0625 = fieldNorm(doc=1979)
          0.11205016 = weight(abstract_txt:neighbor in 1979) [ClassicSimilarity], result of:
            0.11205016 = score(doc=1979,freq=1.0), product of:
              0.20215957 = queryWeight, product of:
                1.3408569 = boost
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.01700097 = queryNorm
              0.5542659 = fieldWeight in 1979, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.868255 = idf(docFreq=16, maxDocs=44421)
                0.0625 = fieldNorm(doc=1979)
          0.06371105 = weight(abstract_txt:documents in 1979) [ClassicSimilarity], result of:
            0.06371105 = score(doc=1979,freq=2.0), product of:
              0.1748125 = queryWeight, product of:
                2.493742 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.01700097 = queryNorm
              0.3644536 = fieldWeight in 1979, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=1979)
        0.16 = coord(4/25)
    
  5. Jones, L.M.; Wright, K.D.; Wallace, M.K.; Veinot, T.: "Take an opportunity whenever you get it" : information sharing among African-American women with hypertension (2018) 0.05
    0.05274979 = sum of:
      0.05274979 = product of:
        0.4395816 = sum of:
          0.047836196 = weight(abstract_txt:assessing in 20) [ClassicSimilarity], result of:
            0.047836196 = score(doc=20,freq=1.0), product of:
              0.11461912 = queryWeight, product of:
                1.0096337 = boost
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.01700097 = queryNorm
              0.4173492 = fieldWeight in 20, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.677587 = idf(docFreq=151, maxDocs=44421)
                0.0625 = fieldNorm(doc=20)
          0.062052917 = weight(abstract_txt:risk in 20) [ClassicSimilarity], result of:
            0.062052917 = score(doc=20,freq=1.0), product of:
              0.13633084 = queryWeight, product of:
                1.1011142 = boost
                7.282627 = idf(docFreq=82, maxDocs=44421)
                0.01700097 = queryNorm
              0.4551642 = fieldWeight in 20, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.282627 = idf(docFreq=82, maxDocs=44421)
                0.0625 = fieldNorm(doc=20)
          0.32969248 = weight(abstract_txt:readiness in 20) [ClassicSimilarity], result of:
            0.32969248 = score(doc=20,freq=1.0), product of:
              0.598686 = queryWeight, product of:
                3.996644 = boost
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.01700097 = queryNorm
              0.5506935 = fieldWeight in 20, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.811096 = idf(docFreq=17, maxDocs=44421)
                0.0625 = fieldNorm(doc=20)
        0.12 = coord(3/25)
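
For the content-based matches, the document score is the sum of the matching term scores multiplied by a coordination factor (matched query terms over total query terms). A minimal sketch of that combination for the first content match (doc 961), taking idf, boost, queryNorm and fieldNorm directly from its listing; the helper name `clause_score` is hypothetical.

```python
import math


def clause_score(freq, idf, boost, query_norm, field_norm):
    """One term clause: (boost * idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


QUERY_NORM = 0.01700097
FIELD_NORM = 0.078125  # fieldNorm(doc=961)

technology = clause_score(1.0, 4.2814875, 3.2367485, QUERY_NORM, FIELD_NORM)
readiness = clause_score(4.0, 8.811096, 3.996644, QUERY_NORM, FIELD_NORM)

coord = 2 / 25  # 2 of the 25 query terms occur in doc 961
total_961 = (technology + readiness) * coord

print(round(technology, 5), round(readiness, 5), round(total_961, 5))
```

Because only 2 of 25 query terms match, the coordination factor reduces the summed term scores (about 0.903) to the displayed document score of about 0.072.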