Document (#31308)

Author
Ng, K.B.
Kantor, P.B.
Strzalkowski, T.
Wacholder, N.
Tang, R.
Bai, B.
Rittman, R.
Song, P.
Sun, Y.
Title
Automated judgment of document qualities
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.9, p.1155-1164
Year
2006
Abstract
The authors report on a series of experiments to automate the assessment of document qualities such as depth and objectivity. The primary purpose is to develop a quality-sensitive functionality, orthogonal to relevance, to select documents for an interactive question-answering system. The study consisted of two stages. In the classifier construction stage, nine document qualities deemed important by information professionals were identified and classifiers were developed to predict their values. In the confirmative evaluation stage, the performance of the developed methods was checked using a different document collection. The quality prediction methods worked well in the second stage. The results strongly suggest that the best way to predict document qualities automatically is to construct classifiers on a person-by-person basis.

Similar documents (author)

  1. Kelly, D.; Wacholder, N.; Rittman, R.; Sun, Y.; Kantor, P.; Small, S.; Strzalkowski, T.: Using interview data to identify evaluation criteria for interactive, analytical question-answering systems (2007) 1.96
    1.9642693 = sum of:
      1.9642693 = product of:
        3.273782 = sum of:
          0.81976324 = weight(author_txt:kantor in 1332) [ClassicSimilarity], result of:
            0.81976324 = score(doc=1332,freq=1.0), product of:
              0.40110216 = queryWeight, product of:
                1.0325654 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.047516454 = queryNorm
              2.0437768 = fieldWeight in 1332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.25 = fieldNorm(doc=1332)
          1.2082517 = weight(author_txt:strzalkowski in 1332) [ClassicSimilarity], result of:
            1.2082517 = score(doc=1332,freq=1.0), product of:
              0.5194786 = queryWeight, product of:
                1.1750975 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.047516454 = queryNorm
              2.3258932 = fieldWeight in 1332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.25 = fieldNorm(doc=1332)
          1.2457671 = weight(author_txt:wacholder in 1332) [ClassicSimilarity], result of:
            1.2457671 = score(doc=1332,freq=1.0), product of:
              0.5301767 = queryWeight, product of:
                1.1871357 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.047516454 = queryNorm
              2.3497207 = fieldWeight in 1332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.25 = fieldNorm(doc=1332)
        0.6 = coord(3/5)
    
  2. Wacholder, N.; Kelly, D.; Kantor, P.; Rittman, R.; Sun, Y.; Bai, B.; Small, S.; Yamrom, B.; Strzalkowski, T.: A model for quantitative evaluation of an end-to-end question-answering system (2007) 1.72
    1.7187358 = sum of:
      1.7187358 = product of:
        2.8645597 = sum of:
          0.71729285 = weight(author_txt:kantor in 1435) [ClassicSimilarity], result of:
            0.71729285 = score(doc=1435,freq=1.0), product of:
              0.40110216 = queryWeight, product of:
                1.0325654 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.047516454 = queryNorm
              1.7883047 = fieldWeight in 1435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.21875 = fieldNorm(doc=1435)
          1.0572203 = weight(author_txt:strzalkowski in 1435) [ClassicSimilarity], result of:
            1.0572203 = score(doc=1435,freq=1.0), product of:
              0.5194786 = queryWeight, product of:
                1.1750975 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.047516454 = queryNorm
              2.0351565 = fieldWeight in 1435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.21875 = fieldNorm(doc=1435)
          1.0900463 = weight(author_txt:wacholder in 1435) [ClassicSimilarity], result of:
            1.0900463 = score(doc=1435,freq=1.0), product of:
              0.5301767 = queryWeight, product of:
                1.1871357 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.047516454 = queryNorm
              2.0560057 = fieldWeight in 1435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.21875 = fieldNorm(doc=1435)
        0.6 = coord(3/5)
    
  3. Tang, X.; Yang, C.C.; Song, M.: Understanding the evolution of multiple scientific research domains using a content and network approach (2013) 0.91
    0.9056082 = sum of:
      0.9056082 = product of:
        2.2640204 = sum of:
          1.1169329 = weight(author_txt:song in 1744) [ClassicSimilarity], result of:
            1.1169329 = score(doc=1744,freq=1.0), product of:
              0.37620097 = queryWeight, product of:
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.047516454 = queryNorm
              2.9689791 = fieldWeight in 1744, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.917278 = idf(docFreq=43, maxDocs=44421)
                0.375 = fieldNorm(doc=1744)
          1.1470875 = weight(author_txt:tang in 1744) [ClassicSimilarity], result of:
            1.1470875 = score(doc=1744,freq=1.0), product of:
              0.3829419 = queryWeight, product of:
                1.0089195 = boost
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.047516454 = queryNorm
              2.9954607 = fieldWeight in 1744, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9878955 = idf(docFreq=40, maxDocs=44421)
                0.375 = fieldNorm(doc=1744)
        0.4 = coord(2/5)
    
  4. Wacholder, N.: Interactive query formulation (2011) 0.62
    0.6228836 = sum of:
      0.6228836 = product of:
        3.114418 = sum of:
          3.114418 = weight(author_txt:wacholder in 196) [ClassicSimilarity], result of:
            3.114418 = score(doc=196,freq=1.0), product of:
              0.5301767 = queryWeight, product of:
                1.1871357 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.047516454 = queryNorm
              5.874302 = fieldWeight in 196, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.625 = fieldNorm(doc=196)
        0.2 = coord(1/5)
    
  5. Strzalkowski, T.: Natural language information retrieval (1995) 0.60
    0.6041259 = sum of:
      0.6041259 = product of:
        3.0206294 = sum of:
          3.0206294 = weight(author_txt:strzalkowski in 1982) [ClassicSimilarity], result of:
            3.0206294 = score(doc=1982,freq=1.0), product of:
              0.5194786 = queryWeight, product of:
                1.1750975 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.047516454 = queryNorm
              5.814733 = fieldWeight in 1982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.625 = fieldNorm(doc=1982)
        0.2 = coord(1/5)
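
The score trees in this list all follow the same arithmetic: a term score is queryWeight (boost × idf × queryNorm) multiplied by fieldWeight (tf × idf × fieldNorm), and the summed term scores are then scaled by coord. As a minimal sketch, the Python below recomputes entry 4 (doc 196) from the values printed in its tree. The function name `term_score` is hypothetical and not part of any library; small differences from the listing are to be expected, since the listed numbers look like single-precision floats.

```python
import math


def term_score(boost, idf, query_norm, field_norm, freq):
    """Recompute one 'weight(...)' node from its printed leaf values.

    Hypothetical helper for illustration only, not a Lucene API.
    """
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq) as in the trees
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# Leaf values copied from entry 4 (author_txt:wacholder in doc 196).
raw = term_score(
    boost=1.1871357,
    idf=9.398883,
    query_norm=0.047516454,
    field_norm=0.625,
    freq=1.0,
)

# coord(1/5): one of the five author terms in the query matched.
final = raw * (1 / 5)

print(f"term score: {raw:.6f}")
print(f"document score after coord: {final:.6f}")
```

The two printed values should land close to the 3.114418 and 0.6228836 shown for entry 4. For a term with no boost line (such as `song` in entry 3), the same sketch applies with `boost=1.0`.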
    

Similar documents (content)

  1. Barry, C.L.: Document representations and clues to document relevance (1998) 0.10
    0.09840584 = sum of:
      0.09840584 = product of:
        0.8200487 = sum of:
          0.13649386 = weight(abstract_txt:predict in 3325) [ClassicSimilarity], result of:
            0.13649386 = score(doc=3325,freq=2.0), product of:
              0.22820352 = queryWeight, product of:
                2.129874 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.015833344 = queryNorm
              0.5981234 = fieldWeight in 3325, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0625 = fieldNorm(doc=3325)
          0.18497103 = weight(abstract_txt:document in 3325) [ClassicSimilarity], result of:
            0.18497103 = score(doc=3325,freq=9.0), product of:
              0.22973397 = queryWeight, product of:
                3.3789003 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015833344 = queryNorm
              0.80515313 = fieldWeight in 3325, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=3325)
          0.49858376 = weight(abstract_txt:qualities in 3325) [ClassicSimilarity], result of:
            0.49858376 = score(doc=3325,freq=3.0), product of:
              0.5957334 = queryWeight, product of:
                4.8666906 = boost
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.015833344 = queryNorm
              0.8369243 = fieldWeight in 3325, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.731176 = idf(docFreq=52, maxDocs=44421)
                0.0625 = fieldNorm(doc=3325)
        0.12 = coord(3/25)
    
  2. Lykourentzou, I.; Giannoukos, I.; Mpardis, G.; Nikolopoulos, V.; Loumos, V.: Early and dynamic student achievement prediction in e-learning courses using neural networks (2009) 0.10
    0.09527814 = sum of:
      0.09527814 = product of:
        0.47639066 = sum of:
          0.11579976 = weight(abstract_txt:prediction in 3715) [ClassicSimilarity], result of:
            0.11579976 = score(doc=3715,freq=4.0), product of:
              0.12883446 = queryWeight, product of:
                1.1316022 = boost
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.015833344 = queryNorm
              0.898826 = fieldWeight in 3715, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.190608 = idf(docFreq=90, maxDocs=44421)
                0.0625 = fieldNorm(doc=3715)
          0.026611932 = weight(abstract_txt:were in 3715) [ClassicSimilarity], result of:
            0.026611932 = score(doc=3715,freq=3.0), product of:
              0.06702973 = queryWeight, product of:
                1.1543207 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.015833344 = queryNorm
              0.39701685 = fieldWeight in 3715, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0625 = fieldNorm(doc=3715)
          0.086993635 = weight(abstract_txt:objectivity in 3715) [ClassicSimilarity], result of:
            0.086993635 = score(doc=3715,freq=1.0), product of:
              0.16900721 = queryWeight, product of:
                1.2960757 = boost
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.015833344 = queryNorm
              0.51473325 = fieldWeight in 3715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.235732 = idf(docFreq=31, maxDocs=44421)
                0.0625 = fieldNorm(doc=3715)
          0.09651574 = weight(abstract_txt:predict in 3715) [ClassicSimilarity], result of:
            0.09651574 = score(doc=3715,freq=1.0), product of:
              0.22820352 = queryWeight, product of:
                2.129874 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.015833344 = queryNorm
              0.4229371 = fieldWeight in 3715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0625 = fieldNorm(doc=3715)
          0.15046962 = weight(abstract_txt:stage in 3715) [ClassicSimilarity], result of:
            0.15046962 = score(doc=3715,freq=2.0), product of:
              0.2787682 = queryWeight, product of:
                2.883102 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.015833344 = queryNorm
              0.5397661 = fieldWeight in 3715, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=3715)
        0.2 = coord(5/25)
    
  3. Kishida, K.: High-speed rough clustering for very large document collections (2010) 0.09
    0.088608496 = sum of:
      0.088608496 = product of:
        0.4430425 = sum of:
          0.02215666 = weight(abstract_txt:methods in 450) [ClassicSimilarity], result of:
            0.02215666 = score(doc=450,freq=1.0), product of:
              0.08555783 = queryWeight, product of:
                1.3041353 = boost
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.015833344 = queryNorm
              0.25896704 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1434727 = idf(docFreq=1915, maxDocs=44421)
                0.0625 = fieldNorm(doc=450)
          0.02300963 = weight(abstract_txt:developed in 450) [ClassicSimilarity], result of:
            0.02300963 = score(doc=450,freq=1.0), product of:
              0.08773981 = queryWeight, product of:
                1.3206602 = boost
                4.1959753 = idf(docFreq=1817, maxDocs=44421)
                0.015833344 = queryNorm
              0.26224846 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1959753 = idf(docFreq=1817, maxDocs=44421)
                0.0625 = fieldNorm(doc=450)
          0.09788384 = weight(abstract_txt:checked in 450) [ClassicSimilarity], result of:
            0.09788384 = score(doc=450,freq=1.0), product of:
              0.18283287 = queryWeight, product of:
                1.3480465 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.015833344 = queryNorm
              0.53537333 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0625 = fieldNorm(doc=450)
          0.21279618 = weight(abstract_txt:stage in 450) [ClassicSimilarity], result of:
            0.21279618 = score(doc=450,freq=4.0), product of:
              0.2787682 = queryWeight, product of:
                2.883102 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.015833344 = queryNorm
              0.7633445 = fieldWeight in 450, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=450)
          0.08719618 = weight(abstract_txt:document in 450) [ClassicSimilarity], result of:
            0.08719618 = score(doc=450,freq=2.0), product of:
              0.22973397 = queryWeight, product of:
                3.3789003 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015833344 = queryNorm
              0.3795528 = fieldWeight in 450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0625 = fieldNorm(doc=450)
        0.2 = coord(5/25)
    
  4. Tang, R.; Solomon, P.: Use of relevance criteria across stages of document evaluation : on the complementarity of experimental and naturalistic studies (2001) 0.09
    0.085034 = sum of:
      0.085034 = product of:
        0.42516997 = sum of:
          0.034962654 = weight(abstract_txt:functionality in 213) [ClassicSimilarity], result of:
            0.034962654 = score(doc=213,freq=1.0), product of:
              0.10061077 = queryWeight, product of:
                6.35436 = idf(docFreq=209, maxDocs=44421)
                0.015833344 = queryNorm
              0.34750408 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.35436 = idf(docFreq=209, maxDocs=44421)
                0.0546875 = fieldNorm(doc=213)
          0.023285441 = weight(abstract_txt:were in 213) [ClassicSimilarity], result of:
            0.023285441 = score(doc=213,freq=3.0), product of:
              0.06702973 = queryWeight, product of:
                1.1543207 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.015833344 = queryNorm
              0.34738976 = fieldWeight in 213, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0546875 = fieldNorm(doc=213)
          0.027303375 = weight(abstract_txt:quality in 213) [ClassicSimilarity], result of:
            0.027303375 = score(doc=213,freq=1.0), product of:
              0.107496865 = queryWeight, product of:
                1.461809 = boost
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.015833344 = queryNorm
              0.2539923 = fieldWeight in 213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.0546875 = fieldNorm(doc=213)
          0.26332185 = weight(abstract_txt:stage in 213) [ClassicSimilarity], result of:
            0.26332185 = score(doc=213,freq=8.0), product of:
              0.2787682 = queryWeight, product of:
                2.883102 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.015833344 = queryNorm
              0.9445906 = fieldWeight in 213, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0546875 = fieldNorm(doc=213)
          0.07629665 = weight(abstract_txt:document in 213) [ClassicSimilarity], result of:
            0.07629665 = score(doc=213,freq=2.0), product of:
              0.22973397 = queryWeight, product of:
                3.3789003 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015833344 = queryNorm
              0.3321087 = fieldWeight in 213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.0546875 = fieldNorm(doc=213)
        0.2 = coord(5/25)
    
  5. Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.08
    0.08393596 = sum of:
      0.08393596 = product of:
        0.5245998 = sum of:
          0.104785755 = weight(abstract_txt:classifier in 3697) [ClassicSimilarity], result of:
            0.104785755 = score(doc=3697,freq=2.0), product of:
              0.13086748 = queryWeight, product of:
                1.1404957 = boost
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.015833344 = queryNorm
              0.80070126 = fieldWeight in 3697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2471204 = idf(docFreq=85, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.05516115 = weight(abstract_txt:quality in 3697) [ClassicSimilarity], result of:
            0.05516115 = score(doc=3697,freq=2.0), product of:
              0.107496865 = queryWeight, product of:
                1.461809 = boost
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.015833344 = queryNorm
              0.51314193 = fieldWeight in 3697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6444306 = idf(docFreq=1160, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.28758165 = weight(abstract_txt:classifiers in 3697) [ClassicSimilarity], result of:
            0.28758165 = score(doc=3697,freq=3.0), product of:
              0.2823475 = queryWeight, product of:
                2.3691072 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.015833344 = queryNorm
              1.018538 = fieldWeight in 3697, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
          0.07707126 = weight(abstract_txt:document in 3697) [ClassicSimilarity], result of:
            0.07707126 = score(doc=3697,freq=1.0), product of:
              0.22973397 = queryWeight, product of:
                3.3789003 = boost
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.015833344 = queryNorm
              0.33548045 = fieldWeight in 3697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.29415 = idf(docFreq=1647, maxDocs=44421)
                0.078125 = fieldNorm(doc=3697)
        0.16 = coord(4/25)
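
The content-based trees add two details: tf grows as the square root of the term frequency, and several term scores are summed before coord is applied. The sketch below recomputes entry 1 (doc 3325) starting from docFreq and maxDocs rather than from the printed idf. It assumes idf = 1 + ln(maxDocs / (docFreq + 1)); that form is inferred from the printed numbers, not stated in this record. The names `idf` and `term_score` are hypothetical helpers, not library functions.

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in every idf line
QUERY_NORM = 0.015833344  # queryNorm of the content-based query
FIELD_NORM = 0.0625       # fieldNorm(doc=3325)


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed form; it reproduces the idf values printed in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq):
    """Hypothetical helper: queryWeight * fieldWeight for one term."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * FIELD_NORM  # tf = sqrt(freq)
    return query_weight * field_weight


# (boost, docFreq, freq) copied from entry 1 of the content-based list.
terms = {
    "predict": (2.129874, 138, 2.0),
    "document": (3.3789003, 1647, 9.0),
    "qualities": (4.8666906, 52, 3.0),
}

total = sum(term_score(*values) for values in terms.values())

# coord(3/25): three of the 25 query terms matched this abstract.
doc_score = total * (3 / 25)

print(f"sum of term scores: {total:.6f}")
print(f"document score after coord: {doc_score:.6f}")
```

The result should be close to the 0.09840584 listed for entry 1. Note how coord penalizes documents that match only a few of the 25 query terms, which is why the content-based scores are so much lower than the author-based ones.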