Document (#38847)

Author
Miah, M.W.R.
Yearwood, J.
Kulkarni, S.
Title
Constructing an inter-post similarity measure to differentiate the psychological stages in offensive chats
Source
Journal of the Association for Information Science and Technology. 66(2015) no.5, p.1065-1081
Year
2015
Abstract
Offensive Internet chats, particularly the child-exploiting type, tend to follow a documented psychological behavioral pattern. Researchers have identified some important stages in this pattern. The psychological stages broadly include befriending, information exchange, grooming, and approach. Similarities among the posts of a chat play an important role in differentiating as well as in identifying these stages. In this article, a novel similarity measure is constructed which gives high inter-post similarity among the chat posts within a particular behavioral stage and low inter-post similarity across different behavioral stages. A psychological stage corpus-based dictionary is constructed by mining the terms associated with each stage. The dictionary works as a background knowledge base to support the similarity measure. To find the inter-post similarity, a modified sentence similarity measure is used. The proposed measure gives improved recognition of inter-stage and intra-stage similarity among the chat posts compared with other types of similarity measures. The pairwise inter-post similarity is used for clustering chat posts into the psychological stages. Results of experiments demonstrate that the new clustering method gives better results than some current clustering methods.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23247/abstract.

Similar documents (content)

  1. Zhang, J.; Korfhage, R.R.: ¬A distance and angle similarity measure method (1999) 0.12
    0.116753325 = sum of:
      0.116753325 = product of:
        0.9729444 = sum of:
          0.06160232 = weight(abstract_txt:gives in 4915) [ClassicSimilarity], result of:
            0.06160232 = score(doc=4915,freq=1.0), product of:
              0.105297856 = queryWeight, product of:
                2.1770577 = boost
                5.3488383 = idf(docFreq=573, maxDocs=44421)
                0.009042533 = queryNorm
              0.5850292 = fieldWeight in 4915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3488383 = idf(docFreq=573, maxDocs=44421)
                0.109375 = fieldNorm(doc=4915)
          0.26401055 = weight(abstract_txt:measure in 4915) [ClassicSimilarity], result of:
            0.26401055 = score(doc=4915,freq=6.0), product of:
              0.18127371 = queryWeight, product of:
                3.6876695 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.009042533 = queryNorm
              1.4564193 = fieldWeight in 4915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.109375 = fieldNorm(doc=4915)
          0.64733154 = weight(abstract_txt:similarity in 4915) [ClassicSimilarity], result of:
            0.64733154 = score(doc=4915,freq=6.0), product of:
              0.41528717 = queryWeight, product of:
                7.8935766 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.009042533 = queryNorm
              1.5587565 = fieldWeight in 4915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.109375 = fieldNorm(doc=4915)
        0.12 = coord(3/25)
    
  2. Ellis, D.; Furner-Hines, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of inter-linker consistency (1994) 0.11
    0.10745456 = sum of:
      0.10745456 = product of:
        0.5372728 = sum of:
          0.01001524 = weight(abstract_txt:important in 7492) [ClassicSimilarity], result of:
            0.01001524 = score(doc=7492,freq=1.0), product of:
              0.043496266 = queryWeight, product of:
                1.1424588 = boost
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.009042533 = queryNorm
              0.23025516 = fieldWeight in 7492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.0546875 = fieldNorm(doc=7492)
          0.053890925 = weight(abstract_txt:measure in 7492) [ClassicSimilarity], result of:
            0.053890925 = score(doc=7492,freq=1.0), product of:
              0.18127371 = queryWeight, product of:
                3.6876695 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.009042533 = queryNorm
              0.29729035 = fieldWeight in 7492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.0546875 = fieldNorm(doc=7492)
          0.07639574 = weight(abstract_txt:stage in 7492) [ClassicSimilarity], result of:
            0.07639574 = score(doc=7492,freq=1.0), product of:
              0.22875497 = queryWeight, product of:
                4.14257 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.009042533 = queryNorm
              0.33396322 = fieldWeight in 7492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0546875 = fieldNorm(doc=7492)
          0.21010236 = weight(abstract_txt:inter in 7492) [ClassicSimilarity], result of:
            0.21010236 = score(doc=7492,freq=3.0), product of:
              0.33085042 = queryWeight, product of:
                5.457466 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.009042533 = queryNorm
              0.6350373 = fieldWeight in 7492, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0546875 = fieldNorm(doc=7492)
          0.18686852 = weight(abstract_txt:similarity in 7492) [ClassicSimilarity], result of:
            0.18686852 = score(doc=7492,freq=2.0), product of:
              0.41528717 = queryWeight, product of:
                7.8935766 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.009042533 = queryNorm
              0.4499742 = fieldWeight in 7492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0546875 = fieldNorm(doc=7492)
        0.2 = coord(5/25)
    
  3. Ellis, D.; Furner-Hines, J.; Willett, P.: ¬The creation of hypertext links in full-text documents (1994) 0.09
    0.09443398 = sum of:
      0.09443398 = product of:
        0.47216988 = sum of:
          0.011445988 = weight(abstract_txt:important in 1152) [ClassicSimilarity], result of:
            0.011445988 = score(doc=1152,freq=1.0), product of:
              0.043496266 = queryWeight, product of:
                1.1424588 = boost
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.009042533 = queryNorm
              0.26314875 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.0625 = fieldNorm(doc=1152)
          0.021218853 = weight(abstract_txt:among in 1152) [ClassicSimilarity], result of:
            0.021218853 = score(doc=1152,freq=1.0), product of:
              0.075138316 = queryWeight, product of:
                1.839039 = boost
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.009042533 = queryNorm
              0.28239724 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.0625 = fieldNorm(doc=1152)
          0.08730943 = weight(abstract_txt:stage in 1152) [ClassicSimilarity], result of:
            0.08730943 = score(doc=1152,freq=1.0), product of:
              0.22875497 = queryWeight, product of:
                4.14257 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.009042533 = queryNorm
              0.38167226 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=1152)
          0.1386316 = weight(abstract_txt:inter in 1152) [ClassicSimilarity], result of:
            0.1386316 = score(doc=1152,freq=1.0), product of:
              0.33085042 = queryWeight, product of:
                5.457466 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.009042533 = queryNorm
              0.41901594 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0625 = fieldNorm(doc=1152)
          0.21356402 = weight(abstract_txt:similarity in 1152) [ClassicSimilarity], result of:
            0.21356402 = score(doc=1152,freq=2.0), product of:
              0.41528717 = queryWeight, product of:
                7.8935766 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.009042533 = queryNorm
              0.51425624 = fieldWeight in 1152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0625 = fieldNorm(doc=1152)
        0.2 = coord(5/25)
    
  4. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.09
    0.08944317 = sum of:
      0.08944317 = product of:
        0.44721583 = sum of:
          0.01001524 = weight(abstract_txt:important in 4282) [ClassicSimilarity], result of:
            0.01001524 = score(doc=4282,freq=1.0), product of:
              0.043496266 = queryWeight, product of:
                1.1424588 = boost
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.009042533 = queryNorm
              0.23025516 = fieldWeight in 4282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.21038 = idf(docFreq=1791, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4282)
          0.018566497 = weight(abstract_txt:among in 4282) [ClassicSimilarity], result of:
            0.018566497 = score(doc=4282,freq=1.0), product of:
              0.075138316 = queryWeight, product of:
                1.839039 = boost
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.009042533 = queryNorm
              0.24709758 = fieldWeight in 4282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4282)
          0.07639574 = weight(abstract_txt:stage in 4282) [ClassicSimilarity], result of:
            0.07639574 = score(doc=4282,freq=1.0), product of:
              0.22875497 = queryWeight, product of:
                4.14257 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.009042533 = queryNorm
              0.33396322 = fieldWeight in 4282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4282)
          0.21010236 = weight(abstract_txt:inter in 4282) [ClassicSimilarity], result of:
            0.21010236 = score(doc=4282,freq=3.0), product of:
              0.33085042 = queryWeight, product of:
                5.457466 = boost
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.009042533 = queryNorm
              0.6350373 = fieldWeight in 4282, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.704255 = idf(docFreq=147, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4282)
          0.13213599 = weight(abstract_txt:similarity in 4282) [ClassicSimilarity], result of:
            0.13213599 = score(doc=4282,freq=1.0), product of:
              0.41528717 = queryWeight, product of:
                7.8935766 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.009042533 = queryNorm
              0.31817982 = fieldWeight in 4282, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4282)
        0.2 = coord(5/25)
    
  5. Huang, M.-H.; Wu, L.-L.; Wu, Y.-C.: ¬A study of research collaboration in the pre-web and post-web stages : a coauthorship analysis of the information systems discipline (2015) 0.09
    0.08795059 = sum of:
      0.08795059 = product of:
        0.5496912 = sum of:
          0.021218853 = weight(abstract_txt:among in 2729) [ClassicSimilarity], result of:
            0.021218853 = score(doc=2729,freq=1.0), product of:
              0.075138316 = queryWeight, product of:
                1.839039 = boost
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.009042533 = queryNorm
              0.28239724 = fieldWeight in 2729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.518356 = idf(docFreq=1316, maxDocs=44421)
                0.0625 = fieldNorm(doc=2729)
          0.21386355 = weight(abstract_txt:stage in 2729) [ClassicSimilarity], result of:
            0.21386355 = score(doc=2729,freq=6.0), product of:
              0.22875497 = queryWeight, product of:
                4.14257 = boost
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.009042533 = queryNorm
              0.9349023 = fieldWeight in 2729, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.106756 = idf(docFreq=268, maxDocs=44421)
                0.0625 = fieldNorm(doc=2729)
          0.1996394 = weight(abstract_txt:post in 2729) [ClassicSimilarity], result of:
            0.1996394 = score(doc=2729,freq=5.0), product of:
              0.23218666 = queryWeight, product of:
                4.173527 = boost
                6.1523914 = idf(docFreq=256, maxDocs=44421)
                0.009042533 = queryNorm
              0.85982287 = fieldWeight in 2729, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1523914 = idf(docFreq=256, maxDocs=44421)
                0.0625 = fieldNorm(doc=2729)
          0.11496937 = weight(abstract_txt:stages in 2729) [ClassicSimilarity], result of:
            0.11496937 = score(doc=2729,freq=1.0), product of:
              0.29204178 = queryWeight, product of:
                5.1274056 = boost
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.009042533 = queryNorm
              0.3936744 = fieldWeight in 2729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2987905 = idf(docFreq=221, maxDocs=44421)
                0.0625 = fieldNorm(doc=2729)
        0.16 = coord(4/25)
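
The score trees above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed in the listing. The function names (idf, term_score, doc_score) and the constant N_QUERY_TERMS are illustrative only and are not part of the record; all input numbers are copied from the trees for hits 1 (doc 4915) and 5 (doc 2729).

```python
import math

MAX_DOCS = 44421          # maxDocs, as printed in every idf(...) line
QUERY_NORM = 0.009042533  # queryNorm, identical for all terms of the query
N_QUERY_TERMS = 25        # denominator of coord(n/25) in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one matched term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def doc_score(terms, field_norm):
    """Sum of matched term scores, scaled by coord = matched / query terms."""
    total = sum(term_score(f, df, b, field_norm) for f, df, b in terms)
    coord = len(terms) / N_QUERY_TERMS
    return total * coord


# (termFreq, docFreq, boost) per matched term, copied from the listing.
doc_4915 = [  # hit 1, fieldNorm = 0.109375
    (1.0, 573, 2.1770577),   # gives
    (6.0, 525, 3.6876695),   # measure
    (6.0, 358, 7.8935766),   # similarity
]
doc_2729 = [  # hit 5, fieldNorm = 0.0625
    (1.0, 1316, 1.839039),   # among
    (6.0, 268, 4.14257),     # stage
    (5.0, 256, 4.173527),    # post
    (1.0, 221, 5.1274056),   # stages
]

score_4915 = doc_score(doc_4915, 0.109375)
score_2729 = doc_score(doc_2729, 0.0625)
print(round(score_4915, 4), round(score_2729, 4))
```

The results agree with the listed 0.116753325 and 0.08795059 up to floating-point rounding (the listing appears to use single precision). Note that hit 1 matches only 3 of the 25 query terms, so its coord factor is the lowest in the list; it still ranks first because its short abstract yields a large fieldNorm and the heavily boosted terms "measure" and "similarity" each occur six times.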