Document (#44306)

Author
Mens, G. Le
Kovács, B.
Hannan, M.T.
Pros, G.
Title
Uncovering the semantics of concepts using GPT-4
Source
Proceedings of the National Academy of Sciences. 120(2023) no.49 [https://doi.org/10.1073/pnas.2309350120]
Year
2023
Abstract
The ability of recent Large Language Models (LLMs) such as GPT-3.5 and GPT-4 to generate human-like texts suggests that social scientists could use these LLMs to construct measures of semantic similarity that match human judgment. In this article, we provide an empirical test of this intuition. We use GPT-4 to construct a measure of typicality, that is, the similarity of a text document to a concept. We evaluate its performance against other model-based typicality measures in terms of the correlation with human typicality ratings. We conduct this comparative analysis in two domains: the typicality of books in literary genres (using an existing dataset of book descriptions) and the typicality of tweets authored by US Congress members in the Democratic and Republican parties (using a novel dataset). The typicality measure produced with GPT-4 meets or exceeds the performance of the previous state-of-the-art typicality measure we introduced in a recent paper [G. Le Mens, B. Kovács, M. T. Hannan, G. Pros Rius, Sociol. Sci. 10, 82-117 (2023)]. It accomplishes this without any training on the research data (it is zero-shot learning). This is a breakthrough because the previous state-of-the-art measure required fine-tuning an LLM on hundreds of thousands of text documents to achieve its performance.
We use GPT-4 to create "typicality measures" that quantitatively assess how closely text documents align with a specific concept or category. Unlike previous methods that required extensive training on large text datasets, the GPT-4-based measures achieve state-of-the-art correlation with human judgments without such training. Because no training data is needed, the data requirements for obtaining high-performing model-based typicality measures are dramatically reduced. Our analysis spans two domains: judging the typicality of books in literary genres and the typicality of tweets in the Democratic and Republican parties. Our results demonstrate that modern Large Language Models (LLMs) can be used for text analysis in the social sciences beyond simple classification or labelling.
Theme
Computerlinguistik
Object
ChatGPT

Similar documents (content)

  1. Gao, T.; Yen, H.; Yu, J.; Chen, D.: Enabling large language models to generate text with citations (2023) 0.38
    0.38493586 = sum of:
      0.38493586 = product of:
        0.96233964 = sum of:
          0.008025401 = weight(abstract_txt:that in 2295) [ClassicSimilarity], result of:
            0.008025401 = score(doc=2295,freq=1.0), product of:
              0.054295957 = queryWeight, product of:
                1.0601697 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02165573 = queryNorm
              0.14780845 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.021100886 = weight(abstract_txt:with in 2295) [ClassicSimilarity], result of:
            0.021100886 = score(doc=2295,freq=5.0), product of:
              0.06048766 = queryWeight, product of:
                1.1189871 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.02165573 = queryNorm
              0.34884614 = fieldWeight in 2295, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.061854728 = weight(abstract_txt:correlation in 2295) [ClassicSimilarity], result of:
            0.061854728 = score(doc=2295,freq=1.0), product of:
              0.15609594 = queryWeight, product of:
                1.1368874 = boost
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.02165573 = queryNorm
              0.39626098 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3401756 = idf(docFreq=212, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.031915307 = weight(abstract_txt:large in 2295) [ClassicSimilarity], result of:
            0.031915307 = score(doc=2295,freq=1.0), product of:
              0.114949234 = queryWeight, product of:
                1.194869 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.02165573 = queryNorm
              0.27764696 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.07205226 = weight(abstract_txt:dataset in 2295) [ClassicSimilarity], result of:
            0.07205226 = score(doc=2295,freq=1.0), product of:
              0.17281234 = queryWeight, product of:
                1.1962144 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.02165573 = queryNorm
              0.41693935 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.0101438835 = weight(abstract_txt:this in 2295) [ClassicSimilarity], result of:
            0.0101438835 = score(doc=2295,freq=1.0), product of:
              0.067450665 = queryWeight, product of:
                1.2944206 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.02165573 = queryNorm
              0.15038967 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.040924724 = weight(abstract_txt:state in 2295) [ClassicSimilarity], result of:
            0.040924724 = score(doc=2295,freq=1.0), product of:
              0.1356742 = queryWeight, product of:
                1.2981232 = boost
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.02165573 = queryNorm
              0.3016397 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.07042178 = weight(abstract_txt:human in 2295) [ClassicSimilarity], result of:
            0.07042178 = score(doc=2295,freq=2.0), product of:
              0.17019534 = queryWeight, product of:
                1.6788446 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.02165573 = queryNorm
              0.41377032 = fieldWeight in 2295, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.040034793 = weight(abstract_txt:text in 2295) [ClassicSimilarity], result of:
            0.040034793 = score(doc=2295,freq=1.0), product of:
              0.15851903 = queryWeight, product of:
                1.811475 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.02165573 = queryNorm
              0.25255513 = fieldWeight in 2295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
          0.60586584 = weight(abstract_txt:llms in 2295) [ClassicSimilarity], result of:
            0.60586584 = score(doc=2295,freq=5.0), product of:
              0.47837418 = queryWeight, product of:
                2.4375367 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.02165573 = queryNorm
              1.2665104 = fieldWeight in 2295, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.0625 = fieldNorm(doc=2295)
        0.4 = coord(10/25)
    
  2. Ghali, M.-K.; Farrag, A.; Won, D.; Jin, Y.: Enhancing knowledge retrieval with in-context learning and semantic search through Generative AI (2024) 0.38
    0.37505627 = sum of:
      0.37505627 = product of:
        0.8524006 = sum of:
          0.03683228 = weight(abstract_txt:domains in 2367) [ClassicSimilarity], result of:
            0.03683228 = score(doc=2367,freq=1.0), product of:
              0.12076934 = queryWeight, product of:
                5.576784 = idf(docFreq=456, maxDocs=44421)
                0.02165573 = queryNorm
              0.3049804 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.576784 = idf(docFreq=456, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.007022226 = weight(abstract_txt:that in 2367) [ClassicSimilarity], result of:
            0.007022226 = score(doc=2367,freq=1.0), product of:
              0.054295957 = queryWeight, product of:
                1.0601697 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02165573 = queryNorm
              0.1293324 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.016514057 = weight(abstract_txt:with in 2367) [ClassicSimilarity], result of:
            0.016514057 = score(doc=2367,freq=4.0), product of:
              0.06048766 = queryWeight, product of:
                1.1189871 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.02165573 = queryNorm
              0.2730153 = fieldWeight in 2367, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.04836906 = weight(abstract_txt:large in 2367) [ClassicSimilarity], result of:
            0.04836906 = score(doc=2367,freq=3.0), product of:
              0.114949234 = queryWeight, product of:
                1.194869 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.02165573 = queryNorm
              0.4207863 = fieldWeight in 2367, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.08916013 = weight(abstract_txt:dataset in 2367) [ClassicSimilarity], result of:
            0.08916013 = score(doc=2367,freq=2.0), product of:
              0.17281234 = queryWeight, product of:
                1.1962144 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.02165573 = queryNorm
              0.51593614 = fieldWeight in 2367, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.03140699 = weight(abstract_txt:performance in 2367) [ClassicSimilarity], result of:
            0.03140699 = score(doc=2367,freq=1.0), product of:
              0.12431368 = queryWeight, product of:
                1.2425869 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.02165573 = queryNorm
              0.25264308 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.008875898 = weight(abstract_txt:this in 2367) [ClassicSimilarity], result of:
            0.008875898 = score(doc=2367,freq=1.0), product of:
              0.067450665 = queryWeight, product of:
                1.2944206 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.02165573 = queryNorm
              0.13159096 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.035809133 = weight(abstract_txt:state in 2367) [ClassicSimilarity], result of:
            0.035809133 = score(doc=2367,freq=1.0), product of:
              0.1356742 = queryWeight, product of:
                1.2981232 = boost
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.02165573 = queryNorm
              0.26393473 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8262353 = idf(docFreq=967, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.043571252 = weight(abstract_txt:human in 2367) [ClassicSimilarity], result of:
            0.043571252 = score(doc=2367,freq=1.0), product of:
              0.17019534 = queryWeight, product of:
                1.6788446 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.02165573 = queryNorm
              0.2560073 = fieldWeight in 2367, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.06067451 = weight(abstract_txt:text in 2367) [ClassicSimilarity], result of:
            0.06067451 = score(doc=2367,freq=3.0), product of:
              0.15851903 = queryWeight, product of:
                1.811475 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.02165573 = queryNorm
              0.38275853 = fieldWeight in 2367, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
          0.47416505 = weight(abstract_txt:llms in 2367) [ClassicSimilarity], result of:
            0.47416505 = score(doc=2367,freq=4.0), product of:
              0.47837418 = queryWeight, product of:
                2.4375367 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.02165573 = queryNorm
              0.99120116 = fieldWeight in 2367, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2367)
        0.44 = coord(11/25)
    
  3. Hou, Y.; Pascale, A.; Carnerero-Cano, J.; Sattigeri, P.; Tchrakian, T.; Marinescu, R.; Daly, E.; Padhi, I.: WikiContradict : a benchmark for evaluating LLMs on real-world knowledge conflicts from Wikipedia (2024) 0.32
    0.32014778 = sum of:
      0.32014778 = product of:
        0.8892994 = sum of:
          0.012162851 = weight(abstract_txt:that in 2368) [ClassicSimilarity], result of:
            0.012162851 = score(doc=2368,freq=3.0), product of:
              0.054295957 = queryWeight, product of:
                1.0601697 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02165573 = queryNorm
              0.22401026 = fieldWeight in 2368, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.016514057 = weight(abstract_txt:with in 2368) [ClassicSimilarity], result of:
            0.016514057 = score(doc=2368,freq=4.0), product of:
              0.06048766 = queryWeight, product of:
                1.1189871 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.02165573 = queryNorm
              0.2730153 = fieldWeight in 2368, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.027925892 = weight(abstract_txt:large in 2368) [ClassicSimilarity], result of:
            0.027925892 = score(doc=2368,freq=1.0), product of:
              0.114949234 = queryWeight, product of:
                1.194869 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.02165573 = queryNorm
              0.24294108 = fieldWeight in 2368, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.06304573 = weight(abstract_txt:dataset in 2368) [ClassicSimilarity], result of:
            0.06304573 = score(doc=2368,freq=1.0), product of:
              0.17281234 = queryWeight, product of:
                1.1962144 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.02165573 = queryNorm
              0.36482194 = fieldWeight in 2368, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.044416193 = weight(abstract_txt:performance in 2368) [ClassicSimilarity], result of:
            0.044416193 = score(doc=2368,freq=2.0), product of:
              0.12431368 = queryWeight, product of:
                1.2425869 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.02165573 = queryNorm
              0.35729125 = fieldWeight in 2368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.012552415 = weight(abstract_txt:this in 2368) [ClassicSimilarity], result of:
            0.012552415 = score(doc=2368,freq=2.0), product of:
              0.067450665 = queryWeight, product of:
                1.2944206 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.02165573 = queryNorm
              0.18609773 = fieldWeight in 2368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.075467624 = weight(abstract_txt:human in 2368) [ClassicSimilarity], result of:
            0.075467624 = score(doc=2368,freq=3.0), product of:
              0.17019534 = queryWeight, product of:
                1.6788446 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.02165573 = queryNorm
              0.44341767 = fieldWeight in 2368, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.05648346 = weight(abstract_txt:training in 2368) [ClassicSimilarity], result of:
            0.05648346 = score(doc=2368,freq=1.0), product of:
              0.20234624 = queryWeight, product of:
                1.830561 = boost
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.02165573 = queryNorm
              0.27914262 = fieldWeight in 2368, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
          0.5807312 = weight(abstract_txt:llms in 2368) [ClassicSimilarity], result of:
            0.5807312 = score(doc=2368,freq=6.0), product of:
              0.47837418 = queryWeight, product of:
                2.4375367 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.02165573 = queryNorm
              1.2139685 = fieldWeight in 2368, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2368)
        0.36 = coord(9/25)
    
  4. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.30
    0.2964865 = sum of:
      0.2964865 = product of:
        0.74121624 = sum of:
          0.08449825 = weight(abstract_txt:similarity in 1372) [ClassicSimilarity], result of:
            0.08449825 = score(doc=1372,freq=2.0), product of:
              0.13144925 = queryWeight, product of:
                1.0432796 = boost
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.02165573 = queryNorm
              0.6428203 = fieldWeight in 1372, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181453 = idf(docFreq=358, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.020063503 = weight(abstract_txt:that in 1372) [ClassicSimilarity], result of:
            0.020063503 = score(doc=1372,freq=4.0), product of:
              0.054295957 = queryWeight, product of:
                1.0601697 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02165573 = queryNorm
              0.3695211 = fieldWeight in 1372, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.016681716 = weight(abstract_txt:with in 1372) [ClassicSimilarity], result of:
            0.016681716 = score(doc=1372,freq=2.0), product of:
              0.06048766 = queryWeight, product of:
                1.1189871 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.02165573 = queryNorm
              0.2757871 = fieldWeight in 1372, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.039894134 = weight(abstract_txt:large in 1372) [ClassicSimilarity], result of:
            0.039894134 = score(doc=1372,freq=1.0), product of:
              0.114949234 = queryWeight, product of:
                1.194869 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.02165573 = queryNorm
              0.3470587 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.044867128 = weight(abstract_txt:performance in 1372) [ClassicSimilarity], result of:
            0.044867128 = score(doc=1372,freq=1.0), product of:
              0.12431368 = queryWeight, product of:
                1.2425869 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.02165573 = queryNorm
              0.36091867 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.113840364 = weight(abstract_txt:genres in 1372) [ClassicSimilarity], result of:
            0.113840364 = score(doc=1372,freq=1.0), product of:
              0.20202285 = queryWeight, product of:
                1.2933674 = boost
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.02165573 = queryNorm
              0.56350243 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.212831 = idf(docFreq=88, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.062244646 = weight(abstract_txt:human in 1372) [ClassicSimilarity], result of:
            0.062244646 = score(doc=1372,freq=1.0), product of:
              0.17019534 = queryWeight, product of:
                1.6788446 = boost
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.02165573 = queryNorm
              0.36572474 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.681277 = idf(docFreq=1118, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.050043494 = weight(abstract_txt:text in 1372) [ClassicSimilarity], result of:
            0.050043494 = score(doc=1372,freq=1.0), product of:
              0.15851903 = queryWeight, product of:
                1.811475 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.02165573 = queryNorm
              0.3156939 = fieldWeight in 1372, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.13784856 = weight(abstract_txt:measure in 1372) [ClassicSimilarity], result of:
            0.13784856 = score(doc=1372,freq=2.0), product of:
              0.22951151 = queryWeight, product of:
                1.9495703 = boost
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.02165573 = queryNorm
              0.6006172 = fieldWeight in 1372, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4361663 = idf(docFreq=525, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
          0.17123441 = weight(abstract_txt:measures in 1372) [ClassicSimilarity], result of:
            0.17123441 = score(doc=1372,freq=2.0), product of:
              0.2856935 = queryWeight, product of:
                2.4318783 = boost
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.02165573 = queryNorm
              0.59936404 = fieldWeight in 1372, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.424824 = idf(docFreq=531, maxDocs=44421)
                0.078125 = fieldNorm(doc=1372)
        0.4 = coord(10/25)
    
  5. El Hamdani, R.; Bonald, T.; Malliaros, F.; Suchanek, F.; Holzenberger, N.: The factuality of Large Language Models in the legal domain (2024) 0.26
    0.26036915 = sum of:
      0.26036915 = product of:
        0.81365365 = sum of:
          0.017375503 = weight(abstract_txt:that in 2383) [ClassicSimilarity], result of:
            0.017375503 = score(doc=2383,freq=3.0), product of:
              0.054295957 = queryWeight, product of:
                1.0601697 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.02165573 = queryNorm
              0.32001466 = fieldWeight in 2383, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.011795755 = weight(abstract_txt:with in 2383) [ClassicSimilarity], result of:
            0.011795755 = score(doc=2383,freq=1.0), product of:
              0.06048766 = queryWeight, product of:
                1.1189871 = boost
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.02165573 = queryNorm
              0.19501092 = fieldWeight in 2383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4961398 = idf(docFreq=9949, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.039894134 = weight(abstract_txt:large in 2383) [ClassicSimilarity], result of:
            0.039894134 = score(doc=2383,freq=1.0), product of:
              0.114949234 = queryWeight, product of:
                1.194869 = boost
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.02165573 = queryNorm
              0.3470587 = fieldWeight in 2383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4423513 = idf(docFreq=1420, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.12737161 = weight(abstract_txt:dataset in 2383) [ClassicSimilarity], result of:
            0.12737161 = score(doc=2383,freq=2.0), product of:
              0.17281234 = queryWeight, product of:
                1.1962144 = boost
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.02165573 = queryNorm
              0.7370516 = fieldWeight in 2383, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6710296 = idf(docFreq=152, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.044867128 = weight(abstract_txt:performance in 2383) [ClassicSimilarity], result of:
            0.044867128 = score(doc=2383,freq=1.0), product of:
              0.12431368 = queryWeight, product of:
                1.2425869 = boost
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.02165573 = queryNorm
              0.36091867 = fieldWeight in 2383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.012679854 = weight(abstract_txt:this in 2383) [ClassicSimilarity], result of:
            0.012679854 = score(doc=2383,freq=1.0), product of:
              0.067450665 = queryWeight, product of:
                1.2944206 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.02165573 = queryNorm
              0.18798709 = fieldWeight in 2383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.08069065 = weight(abstract_txt:training in 2383) [ClassicSimilarity], result of:
            0.08069065 = score(doc=2383,freq=1.0), product of:
              0.20234624 = queryWeight, product of:
                1.830561 = boost
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.02165573 = queryNorm
              0.39877516 = fieldWeight in 2383, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.104322 = idf(docFreq=732, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
          0.47897902 = weight(abstract_txt:llms in 2383) [ClassicSimilarity], result of:
            0.47897902 = score(doc=2383,freq=2.0), product of:
              0.47837418 = queryWeight, product of:
                2.4375367 = boost
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.02165573 = queryNorm
              1.0012643 = fieldWeight in 2383, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.06241 = idf(docFreq=13, maxDocs=44421)
                0.078125 = fieldNorm(doc=2383)
        0.32 = coord(8/25)
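
The score explanations above all follow the same arithmetic, which can be reproduced by hand. The sketch below is a minimal illustration, not the search engine's own code: it assumes the classic TF-IDF formulas that the listed values are consistent with (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by the coord factor). The function names are hypothetical; the numbers are copied from the explanation for document 2383.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    """Sum of the matching term scores, scaled by coord = matched / total."""
    return sum(term_scores) * (matched / total)


# Term "llms" in doc 2383 (values taken from the listing above).
llms = term_score(
    freq=2.0,
    doc_freq=13,
    max_docs=44421,
    boost=2.4375367,
    query_norm=0.02165573,
    field_norm=0.078125,
)

# The eight per-term scores listed for doc 2383, in listing order.
scores_2383 = [
    0.017375503,  # that
    0.011795755,  # with
    0.039894134,  # large
    0.12737161,   # dataset
    0.044867128,  # performance
    0.012679854,  # this
    0.08069065,   # training
    0.47897902,   # llms
]
total_2383 = document_score(scores_2383, matched=8, total=25)

print(f"idf(llms)  = {idf(13, 44421):.5f}")
print(f"llms score = {llms:.6f}")
print(f"doc score  = {total_2383:.6f}")
```

The recomputed values agree with the listing up to single-precision rounding. The example also shows why the rare term "llms" (docFreq=13) dominates every ranking here: its idf enters the product twice, once through queryWeight and once through fieldWeight.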