Document (#34687)

Author
Stenmark, D.
Title
Identifying clusters of user behavior in intranet search engine log files
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.14, pp.2232-2243
Year
2008
Abstract
When studying how ordinary Web users interact with Web search engines, researchers tend to either treat the users as a homogeneous group or group them according to search experience. Neither approach is sufficient, we argue, to capture the variety in behavior that is known to exist among searchers. By applying an automatic clustering technique based on self-organizing maps to search engine log files from a corporate intranet, we show that users can be usefully separated into distinguishable segments based on their actual search behavior. Based on these segments, future tools for information seeking and retrieval can be targeted to specific segments rather than just made to fit the average user. The exact number of clusters, and to some extent their characteristics, can be expected to vary between intranets, but our results indicate that some more generic groups may exist. In our study, a large group of users appeared to be fact seekers who would benefit from higher precision, a smaller group of users were more holistically oriented and would likely benefit from higher recall, and a third category of users seemed to constitute the knowledgeable users. These three groups may raise different design implications for search-tool developers.

Similar documents (content)

  1. Zhang, Y.; Broussard, R.; Ke, W.; Gong, X.: Evaluation of a scatter/gather interface for supporting distinct health information search tasks (2014) 0.17
    0.1734854 = sum of:
      0.1734854 = product of:
        0.6195907 = sum of:
          0.015316701 = weight(abstract_txt:based in 2261) [ClassicSimilarity], result of:
            0.015316701 = score(doc=2261,freq=1.0), product of:
              0.07699071 = queryWeight, product of:
                1.2497311 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.019354183 = queryNorm
              0.1989422 = fieldWeight in 2261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.05935709 = weight(abstract_txt:groups in 2261) [ClassicSimilarity], result of:
            0.05935709 = score(doc=2261,freq=2.0), product of:
              0.13170516 = queryWeight, product of:
                1.3346063 = boost
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.019354183 = queryNorm
              0.45068157 = fieldWeight in 2261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.08774308 = weight(abstract_txt:clusters in 2261) [ClassicSimilarity], result of:
            0.08774308 = score(doc=2261,freq=1.0), product of:
              0.21533088 = queryWeight, product of:
                1.7064947 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.019354183 = queryNorm
              0.40748024 = fieldWeight in 2261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.0945302 = weight(abstract_txt:behavior in 2261) [ClassicSimilarity], result of:
            0.0945302 = score(doc=2261,freq=2.0), product of:
              0.20560417 = queryWeight, product of:
                2.042271 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.019354183 = queryNorm
              0.45976794 = fieldWeight in 2261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.127491 = weight(abstract_txt:group in 2261) [ClassicSimilarity], result of:
            0.127491 = score(doc=2261,freq=3.0), product of:
              0.2413165 = queryWeight, product of:
                2.5548198 = boost
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.019354183 = queryNorm
              0.5283145 = fieldWeight in 2261, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.12266549 = weight(abstract_txt:search in 2261) [ClassicSimilarity], result of:
            0.12266549 = score(doc=2261,freq=7.0), product of:
              0.20298024 = queryWeight, product of:
                2.8697183 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.019354183 = queryNorm
              0.6043223 = fieldWeight in 2261, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
          0.11248712 = weight(abstract_txt:users in 2261) [ClassicSimilarity], result of:
            0.11248712 = score(doc=2261,freq=5.0), product of:
              0.22563109 = queryWeight, product of:
                3.2680242 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.019354183 = queryNorm
              0.49854442 = fieldWeight in 2261, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.0625 = fieldNorm(doc=2261)
        0.28 = coord(7/25)
    
  2. Chen, H.-M.; Cooper, M.D.: Stochastic modeling of usage patterns in a Web-based information system (2002) 0.16
    0.15636344 = sum of:
      0.15636344 = product of:
        0.48863578 = sum of:
          0.010583863 = weight(abstract_txt:from in 1577) [ClassicSimilarity], result of:
            0.010583863 = score(doc=1577,freq=2.0), product of:
              0.057859335 = queryWeight, product of:
                1.0833882 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.019354183 = queryNorm
              0.18292403 = fieldWeight in 1577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.06960271 = weight(abstract_txt:knowledgeable in 1577) [ClassicSimilarity], result of:
            0.06960271 = score(doc=1577,freq=1.0), product of:
              0.17741801 = queryWeight, product of:
                1.0953064 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.019354183 = queryNorm
              0.3923092 = fieldWeight in 1577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.022975052 = weight(abstract_txt:based in 1577) [ClassicSimilarity], result of:
            0.022975052 = score(doc=1577,freq=4.0), product of:
              0.07699071 = queryWeight, product of:
                1.2497311 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.019354183 = queryNorm
              0.2984133 = fieldWeight in 1577, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.070388846 = weight(abstract_txt:groups in 1577) [ClassicSimilarity], result of:
            0.070388846 = score(doc=1577,freq=5.0), product of:
              0.13170516 = queryWeight, product of:
                1.3346063 = boost
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.019354183 = queryNorm
              0.53444254 = fieldWeight in 1577, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.05013221 = weight(abstract_txt:behavior in 1577) [ClassicSimilarity], result of:
            0.05013221 = score(doc=1577,freq=1.0), product of:
              0.20560417 = queryWeight, product of:
                2.042271 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.019354183 = queryNorm
              0.24382877 = fieldWeight in 1577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.13522464 = weight(abstract_txt:group in 1577) [ClassicSimilarity], result of:
            0.13522464 = score(doc=1577,freq=6.0), product of:
              0.2413165 = queryWeight, product of:
                2.5548198 = boost
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.019354183 = queryNorm
              0.56036216 = fieldWeight in 1577, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.09199911 = weight(abstract_txt:search in 1577) [ClassicSimilarity], result of:
            0.09199911 = score(doc=1577,freq=7.0), product of:
              0.20298024 = queryWeight, product of:
                2.8697183 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.019354183 = queryNorm
              0.45324174 = fieldWeight in 1577, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
          0.037729327 = weight(abstract_txt:users in 1577) [ClassicSimilarity], result of:
            0.037729327 = score(doc=1577,freq=1.0), product of:
              0.22563109 = queryWeight, product of:
                3.2680242 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.019354183 = queryNorm
              0.16721688 = fieldWeight in 1577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.046875 = fieldNorm(doc=1577)
        0.32 = coord(8/25)
    
  3. Chen, H.-M.; Cooper, M.D.: Using clustering techniques to detect usage patterns in a Web-based information system (2001) 0.14
    0.139352 = sum of:
      0.139352 = product of:
        0.435475 = sum of:
          0.009978562 = weight(abstract_txt:from in 526) [ClassicSimilarity], result of:
            0.009978562 = score(doc=526,freq=1.0), product of:
              0.057859335 = queryWeight, product of:
                1.0833882 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.019354183 = queryNorm
              0.17246243 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.09280362 = weight(abstract_txt:knowledgeable in 526) [ClassicSimilarity], result of:
            0.09280362 = score(doc=526,freq=1.0), product of:
              0.17741801 = queryWeight, product of:
                1.0953064 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.019354183 = queryNorm
              0.5230789 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.015316701 = weight(abstract_txt:based in 526) [ClassicSimilarity], result of:
            0.015316701 = score(doc=526,freq=1.0), product of:
              0.07699071 = queryWeight, product of:
                1.2497311 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.019354183 = queryNorm
              0.1989422 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.05935709 = weight(abstract_txt:groups in 526) [ClassicSimilarity], result of:
            0.05935709 = score(doc=526,freq=2.0), product of:
              0.13170516 = queryWeight, product of:
                1.3346063 = boost
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.019354183 = queryNorm
              0.45068157 = fieldWeight in 526, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.08774308 = weight(abstract_txt:clusters in 526) [ClassicSimilarity], result of:
            0.08774308 = score(doc=526,freq=1.0), product of:
              0.21533088 = queryWeight, product of:
                1.7064947 = boost
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.019354183 = queryNorm
              0.40748024 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.519684 = idf(docFreq=177, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.07360696 = weight(abstract_txt:group in 526) [ClassicSimilarity], result of:
            0.07360696 = score(doc=526,freq=1.0), product of:
              0.2413165 = queryWeight, product of:
                2.5548198 = boost
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.019354183 = queryNorm
              0.3050225 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.046363197 = weight(abstract_txt:search in 526) [ClassicSimilarity], result of:
            0.046363197 = score(doc=526,freq=1.0), product of:
              0.20298024 = queryWeight, product of:
                2.8697183 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.019354183 = queryNorm
              0.22841237 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
          0.05030577 = weight(abstract_txt:users in 526) [ClassicSimilarity], result of:
            0.05030577 = score(doc=526,freq=1.0), product of:
              0.22563109 = queryWeight, product of:
                3.2680242 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.019354183 = queryNorm
              0.22295584 = fieldWeight in 526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.0625 = fieldNorm(doc=526)
        0.32 = coord(8/25)
    
  4. Shneiderman, B.; Byrd, D.; Croft, W.B.: Clarifying search : a user-interface framework for text searches (1997) 0.13
    0.12523088 = sum of:
      0.12523088 = product of:
        0.52179533 = sum of:
          0.017462483 = weight(abstract_txt:from in 2258) [ClassicSimilarity], result of:
            0.017462483 = score(doc=2258,freq=1.0), product of:
              0.057859335 = queryWeight, product of:
                1.0833882 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.019354183 = queryNorm
              0.30180925 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
          0.08579972 = weight(abstract_txt:higher in 2258) [ClassicSimilarity], result of:
            0.08579972 = score(doc=2258,freq=1.0), product of:
              0.14608185 = queryWeight, product of:
                1.4055617 = boost
                5.3699656 = idf(docFreq=561, maxDocs=44421)
                0.019354183 = queryNorm
              0.58734 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3699656 = idf(docFreq=561, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
          0.12055026 = weight(abstract_txt:benefit in 2258) [ClassicSimilarity], result of:
            0.12055026 = score(doc=2258,freq=1.0), product of:
              0.18325302 = queryWeight, product of:
                1.5742632 = boost
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.019354183 = queryNorm
              0.65783507 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.014492 = idf(docFreq=294, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
          0.1288122 = weight(abstract_txt:group in 2258) [ClassicSimilarity], result of:
            0.1288122 = score(doc=2258,freq=1.0), product of:
              0.2413165 = queryWeight, product of:
                2.5548198 = boost
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.019354183 = queryNorm
              0.5337894 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
          0.08113559 = weight(abstract_txt:search in 2258) [ClassicSimilarity], result of:
            0.08113559 = score(doc=2258,freq=1.0), product of:
              0.20298024 = queryWeight, product of:
                2.8697183 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.019354183 = queryNorm
              0.39972165 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
          0.08803509 = weight(abstract_txt:users in 2258) [ClassicSimilarity], result of:
            0.08803509 = score(doc=2258,freq=1.0), product of:
              0.22563109 = queryWeight, product of:
                3.2680242 = boost
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.019354183 = queryNorm
              0.39017272 = fieldWeight in 2258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5672934 = idf(docFreq=3408, maxDocs=44421)
                0.109375 = fieldNorm(doc=2258)
        0.24 = coord(6/25)
    
  5. Hyldegård, J.: Beyond the search process : exploring group members' information behavior in context (2009) 0.12
    0.12460184 = sum of:
      0.12460184 = product of:
        0.51917434 = sum of:
          0.014111817 = weight(abstract_txt:from in 3458) [ClassicSimilarity], result of:
            0.014111817 = score(doc=3458,freq=2.0), product of:
              0.057859335 = queryWeight, product of:
                1.0833882 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.019354183 = queryNorm
              0.2438987 = fieldWeight in 3458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
          0.021661084 = weight(abstract_txt:based in 3458) [ClassicSimilarity], result of:
            0.021661084 = score(doc=3458,freq=2.0), product of:
              0.07699071 = queryWeight, product of:
                1.2497311 = boost
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.019354183 = queryNorm
              0.28134674 = fieldWeight in 3458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1830752 = idf(docFreq=5005, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
          0.05935709 = weight(abstract_txt:groups in 3458) [ClassicSimilarity], result of:
            0.05935709 = score(doc=3458,freq=2.0), product of:
              0.13170516 = queryWeight, product of:
                1.3346063 = boost
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.019354183 = queryNorm
              0.45068157 = fieldWeight in 3458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.09888 = idf(docFreq=736, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
          0.16373113 = weight(abstract_txt:behavior in 3458) [ClassicSimilarity], result of:
            0.16373113 = score(doc=3458,freq=6.0), product of:
              0.20560417 = queryWeight, product of:
                2.042271 = boost
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.019354183 = queryNorm
              0.7963415 = fieldWeight in 3458, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.2016807 = idf(docFreq=664, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
          0.19474572 = weight(abstract_txt:group in 3458) [ClassicSimilarity], result of:
            0.19474572 = score(doc=3458,freq=7.0), product of:
              0.2413165 = queryWeight, product of:
                2.5548198 = boost
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.019354183 = queryNorm
              0.8070137 = fieldWeight in 3458, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.88036 = idf(docFreq=916, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
          0.06556746 = weight(abstract_txt:search in 3458) [ClassicSimilarity], result of:
            0.06556746 = score(doc=3458,freq=2.0), product of:
              0.20298024 = queryWeight, product of:
                2.8697183 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.019354183 = queryNorm
              0.3230239 = fieldWeight in 3458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=3458)
        0.24 = coord(6/25)
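
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) arithmetic: each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the term scores are summed and multiplied by the coord factor. The following is a minimal sketch that recomputes the score of the first similar document (doc 2261) from the values printed in its tree. The function and variable names are illustrative assumptions, not part of Lucene or of this database, and the sketch uses double precision where Lucene uses single precision, so the last digits may differ slightly.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44421          # maxDocs
QUERY_NORM = 0.019354183  # queryNorm


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """Score of one query term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """Sum of term scores, scaled by coord = matched / total query terms."""
    raw = sum(term_score(boost, doc_freq, freq, field_norm)
              for _, boost, doc_freq, freq in terms)
    return raw * (matched / total)


# (term, boost, docFreq, freq) as printed for doc 2261.
TERMS_2261 = [
    ("based",    1.2497311, 5005, 1.0),
    ("groups",   1.3346063,  736, 2.0),
    ("clusters", 1.7064947,  177, 1.0),
    ("behavior", 2.042271,   664, 2.0),
    ("group",    2.5548198,  916, 3.0),
    ("search",   2.8697183, 3123, 7.0),
    ("users",    3.2680242, 3408, 5.0),
]

# fieldNorm(doc=2261) = 0.0625 and coord(7/25), both taken from the tree.
score = document_score(TERMS_2261, field_norm=0.0625, matched=7, total=25)
print(f"{score:.7f}")  # should be close to the 0.1734854 reported above
```

The same functions can be applied to the other four documents by substituting their term lists, fieldNorm values, and coord fractions. The fieldNorm is taken as printed because Lucene stores it in a lossy encoded form, so it is not recomputed here.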