Document (#37506)

Author
Doran, D.
Gokhale, S.S.
Title
¬A classification framework for web robots
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.12, pp.2549-2554
Year
2012
Series
Brief communication
Abstract
The behavior of modern web robots varies widely when they crawl for different purposes. In this article, we present a framework to classify these web robots from two orthogonal perspectives, namely, their functionality and the types of resources they consume. Applying the classification framework to a year-long access log from the UConn SoE web server, we present trends that point to significant differences in their crawling behavior.
Theme
Internet
Data Mining

Similar documents (author)

  1. Doran, K.: Unified disparity : theory and practice of union listing (1996) 6.19
    6.1935673 = sum of:
      6.1935673 = weight(author_txt:doran in 4794) [ClassicSimilarity], result of:
        6.1935673 = fieldWeight in 4794, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.625 = fieldNorm(doc=4794)
    
  2. Doran, C.; Martin, C.: Measuring success in outsourced cataloging : a data-driven investigation (2017) 4.95
    4.954854 = sum of:
      4.954854 = weight(author_txt:doran in 150) [ClassicSimilarity], result of:
        4.954854 = fieldWeight in 150, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.5 = fieldNorm(doc=150)
    
  3. Rittschof, K.A.; Kulhavy, R.W.; Stock, W.A.; Verdi, M.P.; Doran, J.M.: Thematic maps improve memory for facts and inferences : a test of the stimulus order hypothesis (1994) 3.10
    3.0967836 = sum of:
      3.0967836 = weight(author_txt:doran in 2157) [ClassicSimilarity], result of:
        3.0967836 = fieldWeight in 2157, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.3125 = fieldNorm(doc=2157)
    
  4. Monireh, E.; Sarker, M.K.; Bianchi, F.; Hitzler, P.; Doran, D.; Xie, N.: Reasoning over RDF knowledge bases using deep learning (2018) 2.48
    2.477427 = sum of:
      2.477427 = weight(author_txt:doran in 553) [ClassicSimilarity], result of:
        2.477427 = fieldWeight in 553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.25 = fieldNorm(doc=553)
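
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the TF-IDF formulas commonly documented for Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which agree with the figures shown here. The function names and the AUTHOR_HITS table are illustrative and not part of the catalog; the numeric inputs are copied from the trees above.

```python
import math

def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain output."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Hypothetical lookup table: doc id -> fieldNorm, copied from the trees above.
AUTHOR_HITS = {4794: 0.625, 150: 0.5, 2157: 0.3125, 553: 0.25}

for doc_id, norm in AUTHOR_HITS.items():
    # Every hit matches author_txt:doran once (freq=1.0, docFreq=5, maxDocs=44421).
    print(doc_id, round(field_weight(1.0, 5, 44421, norm), 4))
```

Because every hit has the same tf and idf, the ranking in this list is driven by fieldNorm alone, the per-document length normalization stored for the author field: records with fewer authors receive a larger norm and rank higher.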
    

Similar documents (content)

  1. Byers, D.: Full-text indexing of non-textual resources (1998) 0.13
    0.13120675 = sum of:
      0.13120675 = product of:
        0.8200422 = sum of:
          0.029783875 = weight(abstract_txt:from in 4606) [ClassicSimilarity], result of:
            0.029783875 = score(doc=4606,freq=3.0), product of:
              0.04985355 = queryWeight, product of:
                1.0528916 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017159235 = queryNorm
              0.59742737 = fieldWeight in 4606, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.125 = fieldNorm(doc=4606)
          0.090414606 = weight(abstract_txt:server in 4606) [ClassicSimilarity], result of:
            0.090414606 = score(doc=4606,freq=1.0), product of:
              0.11964597 = queryWeight, product of:
                1.1533726 = boost
                6.045476 = idf(docFreq=285, maxDocs=44421)
                0.017159235 = queryNorm
              0.7556845 = fieldWeight in 4606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.045476 = idf(docFreq=285, maxDocs=44421)
                0.125 = fieldNorm(doc=4606)
          0.04308296 = weight(abstract_txt:they in 4606) [ClassicSimilarity], result of:
            0.04308296 = score(doc=4606,freq=1.0), product of:
              0.09196432 = queryWeight, product of:
                1.4300305 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.017159235 = queryNorm
              0.46847472 = fieldWeight in 4606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.125 = fieldNorm(doc=4606)
          0.65676075 = weight(abstract_txt:robots in 4606) [ClassicSimilarity], result of:
            0.65676075 = score(doc=4606,freq=1.0), product of:
              0.64721847 = queryWeight, product of:
                4.646294 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.017159235 = queryNorm
              1.0147436 = fieldWeight in 4606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.125 = fieldNorm(doc=4606)
        0.16 = coord(4/25)
    
  2. Kimmel, S.: Robot-generated databases on the World Wide Web (1996) 0.12
    0.11868944 = sum of:
      0.11868944 = product of:
        0.9890787 = sum of:
          0.017195728 = weight(abstract_txt:from in 4792) [ClassicSimilarity], result of:
            0.017195728 = score(doc=4792,freq=1.0), product of:
              0.04985355 = queryWeight, product of:
                1.0528916 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017159235 = queryNorm
              0.34492487 = fieldWeight in 4792, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.125 = fieldNorm(doc=4792)
          0.04308296 = weight(abstract_txt:they in 4792) [ClassicSimilarity], result of:
            0.04308296 = score(doc=4792,freq=1.0), product of:
              0.09196432 = queryWeight, product of:
                1.4300305 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.017159235 = queryNorm
              0.46847472 = fieldWeight in 4792, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.125 = fieldNorm(doc=4792)
          0.9288 = weight(abstract_txt:robots in 4792) [ClassicSimilarity], result of:
            0.9288 = score(doc=4792,freq=2.0), product of:
              0.64721847 = queryWeight, product of:
                4.646294 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.017159235 = queryNorm
              1.4350641 = fieldWeight in 4792, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.125 = fieldNorm(doc=4792)
        0.12 = coord(3/25)
    
  3. Moya Anegón, F. de; López-Huertas, M.J.: ¬An automatic model for updating the conceptual structure of a scientific discipline (2000) 0.10
    0.09742872 = sum of:
      0.09742872 = product of:
        0.405953 = sum of:
          0.015046262 = weight(abstract_txt:from in 1126) [ClassicSimilarity], result of:
            0.015046262 = score(doc=1126,freq=4.0), product of:
              0.04985355 = queryWeight, product of:
                1.0528916 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017159235 = queryNorm
              0.30180925 = fieldWeight in 1126, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
          0.0366443 = weight(abstract_txt:applying in 1126) [ClassicSimilarity], result of:
            0.0366443 = score(doc=1126,freq=1.0), product of:
              0.11369933 = queryWeight, product of:
                1.124345 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.017159235 = queryNorm
              0.32229123 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
          0.015863145 = weight(abstract_txt:their in 1126) [ClassicSimilarity], result of:
            0.015863145 = score(doc=1126,freq=2.0), product of:
              0.06506486 = queryWeight, product of:
                1.2028428 = boost
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.017159235 = queryNorm
              0.2438051 = fieldWeight in 1126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
          0.018848795 = weight(abstract_txt:they in 1126) [ClassicSimilarity], result of:
            0.018848795 = score(doc=1126,freq=1.0), product of:
              0.09196432 = queryWeight, product of:
                1.4300305 = boost
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.017159235 = queryNorm
              0.2049577 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7477977 = idf(docFreq=2845, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
          0.03221764 = weight(abstract_txt:classification in 1126) [ClassicSimilarity], result of:
            0.03221764 = score(doc=1126,freq=2.0), product of:
              0.10434768 = queryWeight, product of:
                1.5232705 = boost
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.017159235 = queryNorm
              0.3087528 = fieldWeight in 1126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9921594 = idf(docFreq=2228, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
          0.28733283 = weight(abstract_txt:robots in 1126) [ClassicSimilarity], result of:
            0.28733283 = score(doc=1126,freq=1.0), product of:
              0.64721847 = queryWeight, product of:
                4.646294 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.017159235 = queryNorm
              0.4439503 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.0546875 = fieldNorm(doc=1126)
        0.24 = coord(6/25)
    
  4. Day, R.E.: Indexing it all : the subject in the age of documentation, information, and data (2014) 0.09
    0.0926008 = sum of:
      0.0926008 = product of:
        0.463004 = sum of:
          0.014891937 = weight(abstract_txt:from in 4024) [ClassicSimilarity], result of:
            0.014891937 = score(doc=4024,freq=3.0), product of:
              0.04985355 = queryWeight, product of:
                1.0528916 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017159235 = queryNorm
              0.29871368 = fieldWeight in 4024, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=4024)
          0.038292814 = weight(abstract_txt:purposes in 4024) [ClassicSimilarity], result of:
            0.038292814 = score(doc=4024,freq=1.0), product of:
              0.107111774 = queryWeight, product of:
                1.0912876 = boost
                5.720053 = idf(docFreq=395, maxDocs=44421)
                0.017159235 = queryNorm
              0.35750332 = fieldWeight in 4024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.720053 = idf(docFreq=395, maxDocs=44421)
                0.0625 = fieldNorm(doc=4024)
          0.068619505 = weight(abstract_txt:modern in 4024) [ClassicSimilarity], result of:
            0.068619505 = score(doc=4024,freq=3.0), product of:
              0.109567985 = queryWeight, product of:
                1.103729 = boost
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.017159235 = queryNorm
              0.62627333 = fieldWeight in 4024, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7852654 = idf(docFreq=370, maxDocs=44421)
                0.0625 = fieldNorm(doc=4024)
          0.012819357 = weight(abstract_txt:their in 4024) [ClassicSimilarity], result of:
            0.012819357 = score(doc=4024,freq=1.0), product of:
              0.06506486 = queryWeight, product of:
                1.2028428 = boost
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.017159235 = queryNorm
              0.19702427 = fieldWeight in 4024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.0625 = fieldNorm(doc=4024)
          0.32838038 = weight(abstract_txt:robots in 4024) [ClassicSimilarity], result of:
            0.32838038 = score(doc=4024,freq=1.0), product of:
              0.64721847 = queryWeight, product of:
                4.646294 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.017159235 = queryNorm
              0.5073718 = fieldWeight in 4024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.0625 = fieldNorm(doc=4024)
        0.2 = coord(5/25)
    
  5. Hidalgo, C.: Why information grows : the evolution of order, from atoms to economies (2015) 0.09
    0.088019826 = sum of:
      0.088019826 = product of:
        0.44009912 = sum of:
          0.011168953 = weight(abstract_txt:from in 3154) [ClassicSimilarity], result of:
            0.011168953 = score(doc=3154,freq=3.0), product of:
              0.04985355 = queryWeight, product of:
                1.0528916 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017159235 = queryNorm
              0.22403526 = fieldWeight in 3154, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.046875 = fieldNorm(doc=3154)
          0.031409398 = weight(abstract_txt:applying in 3154) [ClassicSimilarity], result of:
            0.031409398 = score(doc=3154,freq=1.0), product of:
              0.11369933 = queryWeight, product of:
                1.124345 = boost
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.017159235 = queryNorm
              0.27624962 = fieldWeight in 3154, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8933253 = idf(docFreq=332, maxDocs=44421)
                0.046875 = fieldNorm(doc=3154)
          0.013596981 = weight(abstract_txt:their in 3154) [ClassicSimilarity], result of:
            0.013596981 = score(doc=3154,freq=2.0), product of:
              0.06506486 = queryWeight, product of:
                1.2028428 = boost
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.017159235 = queryNorm
              0.20897579 = fieldWeight in 3154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1523883 = idf(docFreq=5161, maxDocs=44421)
                0.046875 = fieldNorm(doc=3154)
          0.035623826 = weight(abstract_txt:present in 3154) [ClassicSimilarity], result of:
            0.035623826 = score(doc=3154,freq=2.0), product of:
              0.12365506 = queryWeight, product of:
                1.6582178 = boost
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.017159235 = queryNorm
              0.28809032 = fieldWeight in 3154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3458266 = idf(docFreq=1564, maxDocs=44421)
                0.046875 = fieldNorm(doc=3154)
          0.34829998 = weight(abstract_txt:robots in 3154) [ClassicSimilarity], result of:
            0.34829998 = score(doc=3154,freq=2.0), product of:
              0.64721847 = queryWeight, product of:
                4.646294 = boost
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.017159235 = queryNorm
              0.538149 = fieldWeight in 3154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.117949 = idf(docFreq=35, maxDocs=44421)
                0.046875 = fieldNorm(doc=3154)
        0.2 = coord(5/25)
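
The content scores combine three factors per matching term (queryWeight, fieldWeight, and a final coord multiplier). As a rough illustration, the sketch below recomputes the score of entry 2 (doc 4792) from the inputs listed in its tree. It assumes the same ClassicSimilarity formulas that the figures above are consistent with; the function names and the TERMS_4792 table are hypothetical, and small differences in the last digits are expected because Lucene computes in single precision.

```python
import math

MAX_DOCS = 44421          # maxDocs, as shown in every idf line
QUERY_NORM = 0.017159235  # queryNorm, constant across all query terms
QUERY_TERMS = 25          # denominator of coord(n/25)

def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM           # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

def document_score(terms: dict) -> float:
    """Sum the term scores, then scale by coord = matched terms / query terms."""
    coord = len(terms) / QUERY_TERMS
    return coord * sum(term_score(*args) for args in terms.values())

# Hypothetical table: term -> (boost, freq, docFreq, fieldNorm), copied from doc 4792.
TERMS_4792 = {
    "from":   (1.0528916, 1.0, 7646, 0.125),
    "they":   (1.4300305, 1.0, 2845, 0.125),
    "robots": (4.646294,  2.0, 35,   0.125),
}

print(round(document_score(TERMS_4792), 4))
```

Note that idf enters each term score twice, once through queryWeight and once through fieldWeight, which is why the rare term "robots" (docFreq=35) dominates every result in this list while common words such as "from" and "they" contribute very little.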