Document (#20327)

Author
Mostafa, J.
Quiroga, L.M.
Palakal, M.
Title
Filtering medical documents using automated and human classification methods
Source
Journal of the American Society for Information Science. 49(1998) no.14, pp. 1304-1318
Year
1998
Abstract
The goal of this research is to clarify the role of document classification in information filtering. An important function of classification, in managing computational complexity, is described and illustrated in the context of an existing filtering system. A parameter called classification homogeneity is presented for analyzing unsupervised automated classification by employing human classification as a control. Two significant components of the automated classification approach, vocabulary discovery and classification scheme generation, are described in detail. Results of classification performance revealed considerable variability in the homogeneity of automatically produced classes. Based on the classification performance, different types of interest profiles were created. Subsequently, these profiles were used to perform filtering sessions. The filtering results showed that with increasing homogeneity, filtering performance improves, and, conversely, with decreasing homogeneity, filtering performance degrades.
Theme
Automatic classification

Similar documents (author)

  1. Mostafa, J.: Digital image representation and access (1994) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:mostafa in 1170) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 1170, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=1170)
    
  2. Mostafa, S.P.: Enfoqies paradigmaticos de bibliotecologia : unidade na diversidad na unidad (1996) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:mostafa in 829) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 829, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=829)
    
  3. Mostafa, J.: Document search interface design : background and introduction to special topic section (2004) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:mostafa in 3503) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 3503, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=3503)
    
  4. Mostafa, J.: Bessere Suchmaschinen für das Web (2006) 5.51
    5.506935 = sum of:
      5.506935 = weight(author_txt:mostafa in 5871) [ClassicSimilarity], result of:
        5.506935 = fieldWeight in 5871, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.625 = fieldNorm(doc=5871)
    
  5. Sugimoto, C.R.; Mostafa, J.: A note of concern and context : on careful use of terminologies (2018) 4.41
    4.405548 = sum of:
      4.405548 = weight(author_txt:mostafa in 7277) [ClassicSimilarity], result of:
        4.405548 = fieldWeight in 7277, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.811096 = idf(docFreq=17, maxDocs=44421)
          0.5 = fieldNorm(doc=7277)
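
The author-match explanations above all follow one pattern: the score is the product tf x idf x fieldNorm. The following is a minimal Python sketch of that arithmetic, assuming the classic Lucene TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which appear to reproduce the printed values. The function names are illustrative and not part of Lucene, and fieldNorm is taken as given from the listing rather than derived.

```python
import math

MAX_DOCS = 44421  # maxDocs as printed in the listing


def tf(freq):
    # Term-frequency factor: square root of the raw frequency (assumed formula).
    return math.sqrt(freq)


def idf(doc_freq, max_docs=MAX_DOCS):
    # Inverse document frequency (assumed formula): 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, field_norm, max_docs=MAX_DOCS):
    # fieldWeight = tf * idf * fieldNorm, as in the explain output.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm


# Entries 1 to 4: freq=1, docFreq=17, fieldNorm=0.625
score_single_author = field_weight(1.0, 17, 0.625)
# Entry 5: same term statistics, but fieldNorm=0.5
score_two_authors = field_weight(1.0, 17, 0.5)

print(round(score_single_author, 4), round(score_two_authors, 4))
```

Under these assumptions the two values should land close to the listed 5.506935 and 4.405548. The only difference between them is fieldNorm; the smaller value for the co-authored record is consistent with a length normalization that gives longer fields a smaller norm.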
    

Similar documents (content)

  1. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.18
    0.17560117 = sum of:
      0.17560117 = product of:
        0.7316716 = sum of:
          0.013494375 = weight(abstract_txt:results in 3579) [ClassicSimilarity], result of:
            0.013494375 = score(doc=3579,freq=2.0), product of:
              0.04388818 = queryWeight, product of:
                1.1357882 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011108107 = queryNorm
              0.30747172 = fieldWeight in 3579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.04050164 = weight(abstract_txt:clarify in 3579) [ClassicSimilarity], result of:
            0.04050164 = score(doc=3579,freq=1.0), product of:
              0.09131893 = queryWeight, product of:
                1.1584811 = boost
                7.0962973 = idf(docFreq=99, maxDocs=44421)
                0.011108107 = queryNorm
              0.44351858 = fieldWeight in 3579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0962973 = idf(docFreq=99, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.019367466 = weight(abstract_txt:were in 3579) [ClassicSimilarity], result of:
            0.019367466 = score(doc=3579,freq=3.0), product of:
              0.048782475 = queryWeight, product of:
                1.1974447 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.011108107 = queryNorm
              0.39701685 = fieldWeight in 3579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.12573361 = weight(abstract_txt:profiles in 3579) [ClassicSimilarity], result of:
            0.12573361 = score(doc=3579,freq=3.0), product of:
              0.16976556 = queryWeight, product of:
                2.233821 = boost
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.011108107 = queryNorm
              0.74063087 = fieldWeight in 3579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8416553 = idf(docFreq=128, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.08939747 = weight(abstract_txt:performance in 3579) [ClassicSimilarity], result of:
            0.08939747 = score(doc=3579,freq=4.0), product of:
              0.15480888 = queryWeight, product of:
                3.0167303 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.011108107 = queryNorm
              0.5774699 = fieldWeight in 3579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
          0.44317704 = weight(abstract_txt:filtering in 3579) [ClassicSimilarity], result of:
            0.44317704 = score(doc=3579,freq=4.0), product of:
              0.5423878 = queryWeight, product of:
                7.469861 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.011108107 = queryNorm
              0.8170852 = fieldWeight in 3579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.0625 = fieldNorm(doc=3579)
        0.24 = coord(6/25)
    
  2. Díaz, I.; Ranilla, J.; Montañes, E.; Fernández, J.; Combarro, E.F.: Improving performance of text categorization by combining filtering and support vector machines (2004) 0.12
    0.124138415 = sum of:
      0.124138415 = product of:
        0.6206921 = sum of:
          0.0429517 = weight(abstract_txt:improves in 3234) [ClassicSimilarity], result of:
            0.0429517 = score(doc=3234,freq=1.0), product of:
              0.08183881 = queryWeight, product of:
                1.0967009 = boost
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.011108107 = queryNorm
              0.5248329 = fieldWeight in 3234, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.717861 = idf(docFreq=145, maxDocs=44421)
                0.078125 = fieldNorm(doc=3234)
          0.01686797 = weight(abstract_txt:results in 3234) [ClassicSimilarity], result of:
            0.01686797 = score(doc=3234,freq=2.0), product of:
              0.04388818 = queryWeight, product of:
                1.1357882 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011108107 = queryNorm
              0.38433966 = fieldWeight in 3234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.078125 = fieldNorm(doc=3234)
          0.079016946 = weight(abstract_txt:performance in 3234) [ClassicSimilarity], result of:
            0.079016946 = score(doc=3234,freq=2.0), product of:
              0.15480888 = queryWeight, product of:
                3.0167303 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.011108107 = queryNorm
              0.5104161 = fieldWeight in 3234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.078125 = fieldNorm(doc=3234)
          0.09013861 = weight(abstract_txt:classification in 3234) [ClassicSimilarity], result of:
            0.09013861 = score(doc=3234,freq=1.0), product of:
              0.28901005 = queryWeight, product of:
                6.5172596 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011108107 = queryNorm
              0.31188744 = fieldWeight in 3234, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.078125 = fieldNorm(doc=3234)
          0.39171684 = weight(abstract_txt:filtering in 3234) [ClassicSimilarity], result of:
            0.39171684 = score(doc=3234,freq=2.0), product of:
              0.5423878 = queryWeight, product of:
                7.469861 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.011108107 = queryNorm
              0.7222081 = fieldWeight in 3234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.078125 = fieldNorm(doc=3234)
        0.2 = coord(5/25)
    
  3. Sebastiani, F.: Classification of text, automatic (2006) 0.10
    0.09773082 = sum of:
      0.09773082 = product of:
        0.6108177 = sum of:
          0.043522064 = weight(abstract_txt:computational in 3) [ClassicSimilarity], result of:
            0.043522064 = score(doc=3,freq=1.0), product of:
              0.07311243 = queryWeight, product of:
                1.0365832 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.011108107 = queryNorm
              0.5952759 = fieldWeight in 3, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.09375 = fieldNorm(doc=3)
          0.12674652 = weight(abstract_txt:automated in 3) [ClassicSimilarity], result of:
            0.12674652 = score(doc=3,freq=2.0), product of:
              0.17067608 = queryWeight, product of:
                2.7431877 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.011108107 = queryNorm
              0.7426144 = fieldWeight in 3, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.09375 = fieldNorm(doc=3)
          0.10816633 = weight(abstract_txt:classification in 3) [ClassicSimilarity], result of:
            0.10816633 = score(doc=3,freq=1.0), product of:
              0.28901005 = queryWeight, product of:
                6.5172596 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011108107 = queryNorm
              0.37426496 = fieldWeight in 3, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.09375 = fieldNorm(doc=3)
          0.33238277 = weight(abstract_txt:filtering in 3) [ClassicSimilarity], result of:
            0.33238277 = score(doc=3,freq=1.0), product of:
              0.5423878 = queryWeight, product of:
                7.469861 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.011108107 = queryNorm
              0.6128139 = fieldWeight in 3, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.09375 = fieldNorm(doc=3)
        0.16 = coord(4/25)
    
  4. Kenter, T.; Balog, K.; Rijke, M. de: Evaluating document filtering systems over time (2015) 0.09
    0.093999766 = sum of:
      0.093999766 = product of:
        0.58749855 = sum of:
          0.033016585 = weight(abstract_txt:employing in 3672) [ClassicSimilarity], result of:
            0.033016585 = score(doc=3672,freq=1.0), product of:
              0.08710875 = queryWeight, product of:
                1.1314607 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.011108107 = queryNorm
              0.3790272 = fieldWeight in 3672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3672)
          0.011807578 = weight(abstract_txt:results in 3672) [ClassicSimilarity], result of:
            0.011807578 = score(doc=3672,freq=2.0), product of:
              0.04388818 = queryWeight, product of:
                1.1357882 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011108107 = queryNorm
              0.26903775 = fieldWeight in 3672, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3672)
          0.06774292 = weight(abstract_txt:performance in 3672) [ClassicSimilarity], result of:
            0.06774292 = score(doc=3672,freq=3.0), product of:
              0.15480888 = queryWeight, product of:
                3.0167303 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.011108107 = queryNorm
              0.43759066 = fieldWeight in 3672, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3672)
          0.47493148 = weight(abstract_txt:filtering in 3672) [ClassicSimilarity], result of:
            0.47493148 = score(doc=3672,freq=6.0), product of:
              0.5423878 = queryWeight, product of:
                7.469861 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.011108107 = queryNorm
              0.87563086 = fieldWeight in 3672, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.0546875 = fieldNorm(doc=3672)
        0.16 = coord(4/25)
    
  5. Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.09
    0.09065363 = sum of:
      0.09065363 = product of:
        0.45326814 = sum of:
          0.016527167 = weight(abstract_txt:results in 2083) [ClassicSimilarity], result of:
            0.016527167 = score(doc=2083,freq=3.0), product of:
              0.04388818 = queryWeight, product of:
                1.1357882 = boost
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.011108107 = queryNorm
              0.37657443 = fieldWeight in 2083, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4786456 = idf(docFreq=3724, maxDocs=44421)
                0.0625 = fieldNorm(doc=2083)
          0.011181812 = weight(abstract_txt:were in 2083) [ClassicSimilarity], result of:
            0.011181812 = score(doc=2083,freq=1.0), product of:
              0.048782475 = queryWeight, product of:
                1.1974447 = boost
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.011108107 = queryNorm
              0.2292178 = fieldWeight in 2083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6674848 = idf(docFreq=3083, maxDocs=44421)
                0.0625 = fieldNorm(doc=2083)
          0.05974888 = weight(abstract_txt:automated in 2083) [ClassicSimilarity], result of:
            0.05974888 = score(doc=2083,freq=1.0), product of:
              0.17067608 = queryWeight, product of:
                2.7431877 = boost
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.011108107 = queryNorm
              0.3500718 = fieldWeight in 2083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6011486 = idf(docFreq=445, maxDocs=44421)
                0.0625 = fieldNorm(doc=2083)
          0.14422177 = weight(abstract_txt:classification in 2083) [ClassicSimilarity], result of:
            0.14422177 = score(doc=2083,freq=4.0), product of:
              0.28901005 = queryWeight, product of:
                6.5172596 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.011108107 = queryNorm
              0.49901992 = fieldWeight in 2083, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0625 = fieldNorm(doc=2083)
          0.22158852 = weight(abstract_txt:filtering in 2083) [ClassicSimilarity], result of:
            0.22158852 = score(doc=2083,freq=1.0), product of:
              0.5423878 = queryWeight, product of:
                7.469861 = boost
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.011108107 = queryNorm
              0.4085426 = fieldWeight in 2083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5366817 = idf(docFreq=174, maxDocs=44421)
                0.0625 = fieldNorm(doc=2083)
        0.2 = coord(5/25)
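
The content-similarity explanations above combine more factors than the author matches: each matching term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms divided by the 25 query terms). The sketch below recomputes the first result (doc 3579) from the boost, docFreq, freq, fieldNorm and queryNorm values printed in the listing. It assumes the same tf and idf formulas as the classic Lucene TF-IDF model; the function names are hypothetical helpers, not Lucene API.

```python
import math

MAX_DOCS = 44421          # maxDocs as printed in the listing
QUERY_NORM = 0.011108107  # queryNorm as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    # score = queryWeight * fieldWeight
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, total_query_terms):
    # Sum the per-term scores, then scale by coord = matched / total.
    raw = sum(term_score(b, df, f, field_norm) for b, df, f in terms)
    coord = len(terms) / total_query_terms
    return raw * coord


# (boost, docFreq, freq) for the six terms matched in doc 3579
terms_3579 = [
    (1.1357882, 3724, 2.0),  # results
    (1.1584811, 99, 1.0),    # clarify
    (1.1974447, 3083, 3.0),  # were
    (2.233821, 128, 3.0),    # profiles
    (3.0167303, 1189, 4.0),  # performance
    (7.469861, 174, 4.0),    # filtering
]

score_3579 = document_score(terms_3579, field_norm=0.0625, total_query_terms=25)
print(round(score_3579, 4))
```

Because the listing was produced with single-precision arithmetic, the result should agree with the printed 0.17560117 only to within a small rounding difference. The term "filtering" dominates the sum because it carries the largest boost and a high idf, which matches the pattern across all five results.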