Document (#36753)

Author
Herdagdelen, A.
Baroni, M.
Title
Stereotypical gender actions can be extracted from web text
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.9, pp.1741-1749
Year
2011
Abstract
We extracted gender-specific actions from text corpora and Twitter, and compared them with people's stereotypical expectations. We used Open Mind Common Sense (OMCS), a common sense knowledge repository, to focus on actions that are pertinent to common sense and the daily life of humans. We used the gender information of Twitter users and web-corpus-based pronoun/name gender heuristics to compute the gender bias of the actions. With high recall, we obtained a Spearman correlation of 0.47 between corpus-based predictions and a human gold standard, and an area under the ROC curve of 0.76 when predicting the polarity of the gold standard. We conclude that it is feasible to use natural text (and a Twitter-derived corpus in particular) to augment common sense repositories with the stereotypical gender expectations of actions. We also present a dataset of 441 common sense actions with human judges' ratings on whether the action is typically/slightly masculine/feminine (or neutral), and another larger dataset of 21,442 actions automatically rated by the methods we investigated in this study.
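
The abstract reports two evaluation figures: a Spearman correlation against the human gold standard and an area under the ROC curve for predicting its polarity. The following is a minimal sketch of how such figures can be computed, not the authors' code. The toy ratings, the sign convention (negative = feminine, positive = masculine), and all function names are assumptions made for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def roc_auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: human ratings and corpus-based bias scores per action.
human_rating = [-2, -1, 1, 2, -1, 1]
corpus_bias = [-0.8, 0.1, 0.3, 0.9, -0.2, -0.1]

print("Spearman:", spearman(corpus_bias, human_rating))
# Polarity: treat a positive human rating as the positive class (assumed coding).
print("AUC:", roc_auc(corpus_bias, [r > 0 for r in human_rating]))
```

The AUC here uses the pairwise (Mann-Whitney) formulation, which avoids picking a decision threshold for the corpus-based score.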

Similar documents (content)

  1. Wolfe, E.W.: A case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.11
    0.109662555 = sum of:
      0.109662555 = product of:
        0.5483128 = sum of:
          0.0145054925 = weight(abstract_txt:with in 236) [ClassicSimilarity], result of:
            0.0145054925 = score(doc=236,freq=3.0), product of:
              0.04294503 = queryWeight, product of:
                1.3153858 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.013079491 = queryNorm
              0.33776882 = fieldWeight in 236, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=236)
          0.037685122 = weight(abstract_txt:text in 236) [ClassicSimilarity], result of:
            0.037685122 = score(doc=236,freq=2.0), product of:
              0.08440899 = queryWeight, product of:
                1.5970616 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.013079491 = queryNorm
              0.4464586 = fieldWeight in 236, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=236)
          0.09130723 = weight(abstract_txt:corpus in 236) [ClassicSimilarity], result of:
            0.09130723 = score(doc=236,freq=1.0), product of:
              0.19184723 = queryWeight, product of:
                2.407715 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.013079491 = queryNorm
              0.47593716 = fieldWeight in 236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.078125 = fieldNorm(doc=236)
          0.13294722 = weight(abstract_txt:sense in 236) [ClassicSimilarity], result of:
            0.13294722 = score(doc=236,freq=1.0), product of:
              0.29220515 = queryWeight, product of:
                3.8361504 = boost
                5.823732 = idf(docFreq=356, maxDocs=44421)
                0.013079491 = queryNorm
              0.45497906 = fieldWeight in 236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.823732 = idf(docFreq=356, maxDocs=44421)
                0.078125 = fieldNorm(doc=236)
          0.27186766 = weight(abstract_txt:actions in 236) [ClassicSimilarity], result of:
            0.27186766 = score(doc=236,freq=1.0), product of:
              0.5266427 = queryWeight, product of:
                6.093597 = boost
                6.6077175 = idf(docFreq=162, maxDocs=44421)
                0.013079491 = queryNorm
              0.51622796 = fieldWeight in 236, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6077175 = idf(docFreq=162, maxDocs=44421)
                0.078125 = fieldNorm(doc=236)
        0.2 = coord(5/25)
    
  2. Muresan, S.; Gonzalez-Ibanez, R.; Ghosh, D.; Wacholder, N.: Identification of nonliteral language in social media : a case study on sarcasm (2016) 0.10
    0.09994667 = sum of:
      0.09994667 = product of:
        0.41644448 = sum of:
          0.05703339 = weight(abstract_txt:judges in 4155) [ClassicSimilarity], result of:
            0.05703339 = score(doc=4155,freq=1.0), product of:
              0.112790145 = queryWeight, product of:
                1.0658652 = boost
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.013079491 = queryNorm
              0.50565934 = fieldWeight in 4155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.090549 = idf(docFreq=36, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
          0.06769012 = weight(abstract_txt:polarity in 4155) [ClassicSimilarity], result of:
            0.06769012 = score(doc=4155,freq=1.0), product of:
              0.12643535 = queryWeight, product of:
                1.1284984 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.013079491 = queryNorm
              0.53537333 = fieldWeight in 4155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
          0.03124878 = weight(abstract_txt:human in 4155) [ClassicSimilarity], result of:
            0.03124878 = score(doc=4155,freq=2.0), product of:
              0.07552204 = queryWeight, product of:
                1.2334415 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.013079491 = queryNorm
              0.41377032 = fieldWeight in 4155, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
          0.0066998 = weight(abstract_txt:with in 4155) [ClassicSimilarity], result of:
            0.0066998 = score(doc=4155,freq=1.0), product of:
              0.04294503 = queryWeight, product of:
                1.3153858 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.013079491 = queryNorm
              0.15600874 = fieldWeight in 4155, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
          0.10330234 = weight(abstract_txt:corpus in 4155) [ClassicSimilarity], result of:
            0.10330234 = score(doc=4155,freq=2.0), product of:
              0.19184723 = queryWeight, product of:
                2.407715 = boost
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.013079491 = queryNorm
              0.53846145 = fieldWeight in 4155, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0919957 = idf(docFreq=272, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
          0.15047006 = weight(abstract_txt:twitter in 4155) [ClassicSimilarity], result of:
            0.15047006 = score(doc=4155,freq=2.0), product of:
              0.2465181 = queryWeight, product of:
                2.729303 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.013079491 = queryNorm
              0.61038136 = fieldWeight in 4155, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.0625 = fieldNorm(doc=4155)
        0.24 = coord(6/25)
    
  3. Singh, V.K.; Chayko, M.; Inamdar, R.; Floegel, D.: Female librarians and male computer programmers? : gender bias in occupational images on digital media platforms (2020) 0.09
    0.09112337 = sum of:
      0.09112337 = product of:
        0.56952107 = sum of:
          0.03827178 = weight(abstract_txt:human in 1007) [ClassicSimilarity], result of:
            0.03827178 = score(doc=1007,freq=3.0), product of:
              0.07552204 = queryWeight, product of:
                1.2334415 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.013079491 = queryNorm
              0.50676304 = fieldWeight in 1007, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.0625 = fieldNorm(doc=1007)
          0.0066998 = weight(abstract_txt:with in 1007) [ClassicSimilarity], result of:
            0.0066998 = score(doc=1007,freq=1.0), product of:
              0.04294503 = queryWeight, product of:
                1.3153858 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.013079491 = queryNorm
              0.15600874 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=1007)
          0.106398396 = weight(abstract_txt:twitter in 1007) [ClassicSimilarity], result of:
            0.106398396 = score(doc=1007,freq=1.0), product of:
              0.2465181 = queryWeight, product of:
                2.729303 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.013079491 = queryNorm
              0.4316048 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.0625 = fieldNorm(doc=1007)
          0.4181511 = weight(abstract_txt:gender in 1007) [ClassicSimilarity], result of:
            0.4181511 = score(doc=1007,freq=4.0), product of:
              0.48727143 = queryWeight, product of:
                5.4266 = boost
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.013079491 = queryNorm
              0.8581482 = fieldWeight in 1007, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.0625 = fieldNorm(doc=1007)
        0.16 = coord(4/25)
    
  4. Lee, C.H.; Zhao, J.L.: Social media engagement and crowdfunding performance : the moderating role of product type and entrepreneurs' characteristics (2022) 0.08
    0.082821436 = sum of:
      0.082821436 = product of:
        0.517634 = sum of:
          0.0066998 = weight(abstract_txt:with in 1749) [ClassicSimilarity], result of:
            0.0066998 = score(doc=1749,freq=1.0), product of:
              0.04294503 = queryWeight, product of:
                1.3153858 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.013079491 = queryNorm
              0.15600874 = fieldWeight in 1749, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=1749)
          0.06394458 = weight(abstract_txt:dataset in 1749) [ClassicSimilarity], result of:
            0.06394458 = score(doc=1749,freq=1.0), product of:
              0.15336661 = queryWeight, product of:
                1.7577095 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.013079491 = queryNorm
              0.41693935 = fieldWeight in 1749, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.0625 = fieldNorm(doc=1749)
          0.23791404 = weight(abstract_txt:twitter in 1749) [ClassicSimilarity], result of:
            0.23791404 = score(doc=1749,freq=5.0), product of:
              0.2465181 = queryWeight, product of:
                2.729303 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.013079491 = queryNorm
              0.96509767 = fieldWeight in 1749, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.0625 = fieldNorm(doc=1749)
          0.20907556 = weight(abstract_txt:gender in 1749) [ClassicSimilarity], result of:
            0.20907556 = score(doc=1749,freq=1.0), product of:
              0.48727143 = queryWeight, product of:
                5.4266 = boost
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.013079491 = queryNorm
              0.4290741 = fieldWeight in 1749, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8651857 = idf(docFreq=125, maxDocs=44421)
                0.0625 = fieldNorm(doc=1749)
        0.16 = coord(4/25)
    
  5. Montejo-Ráez, A.; Martínez-Cámara, E.; Martín-Valdivia, M.T.; Ureña-López, L.A.: A knowledge-based approach for polarity classification in Twitter (2014) 0.08
    0.077468775 = sum of:
      0.077468775 = product of:
        0.48417985 = sum of:
          0.072561 = weight(abstract_txt:compute in 2204) [ClassicSimilarity], result of:
            0.072561 = score(doc=2204,freq=1.0), product of:
              0.1010632 = queryWeight, product of:
                1.0089351 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.013079491 = queryNorm
              0.7179765 = fieldWeight in 2204, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.09375 = fieldNorm(doc=2204)
          0.17586407 = weight(abstract_txt:polarity in 2204) [ClassicSimilarity], result of:
            0.17586407 = score(doc=2204,freq=3.0), product of:
              0.12643535 = queryWeight, product of:
                1.1284984 = boost
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.013079491 = queryNorm
              1.3909407 = fieldWeight in 2204, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.565973 = idf(docFreq=22, maxDocs=44421)
                0.09375 = fieldNorm(doc=2204)
          0.010049701 = weight(abstract_txt:with in 2204) [ClassicSimilarity], result of:
            0.010049701 = score(doc=2204,freq=1.0), product of:
              0.04294503 = queryWeight, product of:
                1.3153858 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.013079491 = queryNorm
              0.23401311 = fieldWeight in 2204, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.09375 = fieldNorm(doc=2204)
          0.22570509 = weight(abstract_txt:twitter in 2204) [ClassicSimilarity], result of:
            0.22570509 = score(doc=2204,freq=2.0), product of:
              0.2465181 = queryWeight, product of:
                2.729303 = boost
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.013079491 = queryNorm
              0.91557205 = fieldWeight in 2204, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.905677 = idf(docFreq=120, maxDocs=44421)
                0.09375 = fieldNorm(doc=2204)
        0.16 = coord(4/25)
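
The score trees above all follow the same arithmetic, which can be read directly off the nesting: each term score is queryWeight (boost × idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), with tf equal to the square root of the term frequency, and the summed term scores are scaled by coord (matched terms / query terms). The sketch below recomputes the first entry (doc=236) from the listed values. The function and variable names are illustrative and not part of any Lucene API.

```python
from math import sqrt

def term_score(freq, boost, idf, query_norm, field_norm):
    """Score of one matching term, mirroring the nesting of the explain output."""
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    tf = sqrt(freq)                           # e.g. tf(freq=3.0) = 1.7320508
    field_weight = tf * idf * field_norm      # "fieldWeight in <doc>, product of"
    return query_weight * field_weight

def document_score(terms, query_norm, field_norm, matched, total_query_terms):
    """Sum the term scores, then apply the coord factor."""
    summed = sum(term_score(freq, boost, idf, query_norm, field_norm)
                 for _, freq, boost, idf in terms)
    return summed * (matched / total_query_terms)

QUERY_NORM = 0.013079491   # queryNorm, identical in every tree above
FIELD_NORM_236 = 0.078125  # fieldNorm(doc=236)

# (term, freq, boost, idf) copied from the entry for doc=236
terms_236 = [
    ("with", 3.0, 1.3153858, 2.4961398),
    ("text", 2.0, 1.5970616, 4.040882),
    ("corpus", 1.0, 2.407715, 6.0919957),
    ("sense", 1.0, 3.8361504, 5.823732),
    ("actions", 1.0, 6.093597, 6.6077175),
]

score_236 = document_score(terms_236, QUERY_NORM, FIELD_NORM_236,
                           matched=5, total_query_terms=25)
# Should land near the listed 0.109662555, up to floating-point rounding.
print(score_236)
```

The same two functions reproduce the other entries when given their own freq, boost, idf, fieldNorm, and coord values.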