Document (#40887)

Author
Maaten, L. van den
Title
Accelerating t-SNE using Tree-Based Algorithms
Source
Journal of machine learning research. 15(2014), pp. 3221-3245
Year
2014
Abstract
The paper investigates the acceleration of t-SNE (an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots) using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
Content
See also: https://lvdmaaten.github.io/tsne/.
Theme
Data Mining
Visualization
Object
tSNE

Similar documents (content)

  1. Zhu, Y.; Quan, L.; Chen, P.-Y.; Kim, M.C.; Che, C.: Predicting coauthorship using bibliographic network embedding (2023) 0.29
    0.2883266 = sum of:
      0.2883266 = product of:
        0.8009072 = sum of:
          0.022039883 = weight(abstract_txt:data in 1918) [ClassicSimilarity], result of:
            0.022039883 = score(doc=1918,freq=4.0), product of:
              0.052945085 = queryWeight, product of:
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015898349 = queryNorm
              0.41627818 = fieldWeight in 1918, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.0112899 = weight(abstract_txt:used in 1918) [ClassicSimilarity], result of:
            0.0112899 = score(doc=1918,freq=1.0), product of:
              0.053806264 = queryWeight, product of:
                1.0080999 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.015898349 = queryNorm
              0.20982501 = fieldWeight in 1918, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.047139365 = weight(abstract_txt:dimensional in 1918) [ClassicSimilarity], result of:
            0.047139365 = score(doc=1918,freq=1.0), product of:
              0.110735096 = queryWeight, product of:
                1.0226213 = boost
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.015898349 = queryNorm
              0.4256949 = fieldWeight in 1918, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8111186 = idf(docFreq=132, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.012325593 = weight(abstract_txt:using in 1918) [ClassicSimilarity], result of:
            0.012325593 = score(doc=1918,freq=1.0), product of:
              0.05704856 = queryWeight, product of:
                1.0380291 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015898349 = queryNorm
              0.21605442 = fieldWeight in 1918, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.1444064 = weight(abstract_txt:embedding in 1918) [ClassicSimilarity], result of:
            0.1444064 = score(doc=1918,freq=4.0), product of:
              0.14714101 = queryWeight, product of:
                1.178797 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.015898349 = queryNorm
              0.981415 = fieldWeight in 1918, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.116797924 = weight(abstract_txt:gradient in 1918) [ClassicSimilarity], result of:
            0.116797924 = score(doc=1918,freq=1.0), product of:
              0.20276183 = queryWeight, product of:
                1.3837744 = boost
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.015898349 = queryNorm
              0.5760351 = fieldWeight in 1918, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.216561 = idf(docFreq=11, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.013671203 = weight(abstract_txt:that in 1918) [ClassicSimilarity], result of:
            0.013671203 = score(doc=1918,freq=3.0), product of:
              0.053400688 = queryWeight, product of:
                1.4202853 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015898349 = queryNorm
              0.25601172 = fieldWeight in 1918, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.08288698 = weight(abstract_txt:algorithms in 1918) [ClassicSimilarity], result of:
            0.08288698 = score(doc=1918,freq=1.0), product of:
              0.23266305 = queryWeight, product of:
                2.5674176 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.015898349 = queryNorm
              0.3562533 = fieldWeight in 1918, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
          0.35034996 = weight(abstract_txt:embeddings in 1918) [ClassicSimilarity], result of:
            0.35034996 = score(doc=1918,freq=2.0), product of:
              0.42172644 = queryWeight, product of:
                2.8222961 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.015898349 = queryNorm
              0.8307517 = fieldWeight in 1918, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=1918)
        0.36 = coord(9/25)
    
  2. Safder, I.; Ali, M.; Aljohani, N.R.; Nawaz, R.; Hassan, S.-U.: Neural machine translation for in-text citation classification (2023) 0.15
    0.15072553 = sum of:
      0.15072553 = product of:
        0.53830546 = sum of:
          0.011019941 = weight(abstract_txt:data in 2055) [ClassicSimilarity], result of:
            0.011019941 = score(doc=2055,freq=1.0), product of:
              0.052945085 = queryWeight, product of:
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015898349 = queryNorm
              0.20813909 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.012325593 = weight(abstract_txt:using in 2055) [ClassicSimilarity], result of:
            0.012325593 = score(doc=2055,freq=1.0), product of:
              0.05704856 = queryWeight, product of:
                1.0380291 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015898349 = queryNorm
              0.21605442 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.017502816 = weight(abstract_txt:paper in 2055) [ClassicSimilarity], result of:
            0.017502816 = score(doc=2055,freq=2.0), product of:
              0.057205096 = queryWeight, product of:
                1.0394522 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.015898349 = queryNorm
              0.30596602 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.06701088 = weight(abstract_txt:outperform in 2055) [ClassicSimilarity], result of:
            0.06701088 = score(doc=2055,freq=1.0), product of:
              0.13999945 = queryWeight, product of:
                1.1498345 = boost
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.015898349 = queryNorm
              0.47865102 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.6584163 = idf(docFreq=56, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.0722032 = weight(abstract_txt:embedding in 2055) [ClassicSimilarity], result of:
            0.0722032 = score(doc=2055,freq=1.0), product of:
              0.14714101 = queryWeight, product of:
                1.178797 = boost
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.015898349 = queryNorm
              0.4907075 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.85132 = idf(docFreq=46, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.007893072 = weight(abstract_txt:that in 2055) [ClassicSimilarity], result of:
            0.007893072 = score(doc=2055,freq=1.0), product of:
              0.053400688 = queryWeight, product of:
                1.4202853 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015898349 = queryNorm
              0.14780845 = fieldWeight in 2055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
          0.35034996 = weight(abstract_txt:embeddings in 2055) [ClassicSimilarity], result of:
            0.35034996 = score(doc=2055,freq=2.0), product of:
              0.42172644 = queryWeight, product of:
                2.8222961 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.015898349 = queryNorm
              0.8307517 = fieldWeight in 2055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=2055)
        0.28 = coord(7/25)
    
  3. Su, S.; Li, X.; Cheng, X.; Sun, C.: Location-aware targeted influence maximization in social networks (2018) 0.15
    0.15069768 = sum of:
      0.15069768 = product of:
        0.62790704 = sum of:
          0.011019941 = weight(abstract_txt:data in 34) [ClassicSimilarity], result of:
            0.011019941 = score(doc=34,freq=1.0), product of:
              0.052945085 = queryWeight, product of:
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015898349 = queryNorm
              0.20813909 = fieldWeight in 34, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
          0.01237636 = weight(abstract_txt:paper in 34) [ClassicSimilarity], result of:
            0.01237636 = score(doc=34,freq=1.0), product of:
              0.057205096 = queryWeight, product of:
                1.0394522 = boost
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.015898349 = queryNorm
              0.21635064 = fieldWeight in 34, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4616103 = idf(docFreq=3788, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
          0.073409565 = weight(abstract_txt:approximate in 34) [ClassicSimilarity], result of:
            0.073409565 = score(doc=34,freq=1.0), product of:
              0.14877543 = queryWeight, product of:
                1.1853259 = boost
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.015898349 = queryNorm
              0.4934253 = fieldWeight in 34, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.894805 = idf(docFreq=44, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
          0.055402443 = weight(abstract_txt:algorithm in 34) [ClassicSimilarity], result of:
            0.055402443 = score(doc=34,freq=1.0), product of:
              0.15537891 = queryWeight, product of:
                1.713102 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.015898349 = queryNorm
              0.35656348 = fieldWeight in 34, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
          0.14356446 = weight(abstract_txt:algorithms in 34) [ClassicSimilarity], result of:
            0.14356446 = score(doc=34,freq=3.0), product of:
              0.23266305 = queryWeight, product of:
                2.5674176 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.015898349 = queryNorm
              0.6170488 = fieldWeight in 34, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
          0.33213428 = weight(abstract_txt:tree in 34) [ClassicSimilarity], result of:
            0.33213428 = score(doc=34,freq=3.0), product of:
              0.44793826 = queryWeight, product of:
                4.1134977 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.015898349 = queryNorm
              0.7414733 = fieldWeight in 34, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=34)
        0.24 = coord(6/25)
    
  4. White, K.J.; Sutcliffe, R.F.E.: Applying incremental tree induction to retrieval : from manuals and medical texts (2006) 0.15
    0.14933872 = sum of:
      0.14933872 = product of:
        0.6222447 = sum of:
          0.019480688 = weight(abstract_txt:data in 44) [ClassicSimilarity], result of:
            0.019480688 = score(doc=44,freq=2.0), product of:
              0.052945085 = queryWeight, product of:
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.015898349 = queryNorm
              0.3679414 = fieldWeight in 44, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
          0.030813983 = weight(abstract_txt:using in 44) [ClassicSimilarity], result of:
            0.030813983 = score(doc=44,freq=4.0), product of:
              0.05704856 = queryWeight, product of:
                1.0380291 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015898349 = queryNorm
              0.54013604 = fieldWeight in 44, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
          0.0677965 = weight(abstract_txt:substantially in 44) [ClassicSimilarity], result of:
            0.0677965 = score(doc=44,freq=1.0), product of:
              0.121589 = queryWeight, product of:
                1.0715669 = boost
                7.1371193 = idf(docFreq=95, maxDocs=44421)
                0.015898349 = queryNorm
              0.55758744 = fieldWeight in 44, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1371193 = idf(docFreq=95, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
          0.019732682 = weight(abstract_txt:that in 44) [ClassicSimilarity], result of:
            0.019732682 = score(doc=44,freq=4.0), product of:
              0.053400688 = queryWeight, product of:
                1.4202853 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015898349 = queryNorm
              0.3695211 = fieldWeight in 44, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
          0.06925306 = weight(abstract_txt:algorithm in 44) [ClassicSimilarity], result of:
            0.06925306 = score(doc=44,freq=1.0), product of:
              0.15537891 = queryWeight, product of:
                1.713102 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.015898349 = queryNorm
              0.44570434 = fieldWeight in 44, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
          0.41516784 = weight(abstract_txt:tree in 44) [ClassicSimilarity], result of:
            0.41516784 = score(doc=44,freq=3.0), product of:
              0.44793826 = queryWeight, product of:
                4.1134977 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.015898349 = queryNorm
              0.9268416 = fieldWeight in 44, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.078125 = fieldNorm(doc=44)
        0.24 = coord(6/25)
    
  5. French, J.C.; Brown, D.E.; Kim, N.-H.: A classification approach to Boolean query reformulation (1997) 0.15
    0.14528047 = sum of:
      0.14528047 = product of:
        0.7264023 = sum of:
          0.012325593 = weight(abstract_txt:using in 197) [ClassicSimilarity], result of:
            0.012325593 = score(doc=197,freq=1.0), product of:
              0.05704856 = queryWeight, product of:
                1.0380291 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015898349 = queryNorm
              0.21605442 = fieldWeight in 197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=197)
          0.011162491 = weight(abstract_txt:that in 197) [ClassicSimilarity], result of:
            0.011162491 = score(doc=197,freq=2.0), product of:
              0.053400688 = queryWeight, product of:
                1.4202853 = boost
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.015898349 = queryNorm
              0.20903271 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3649352 = idf(docFreq=11344, maxDocs=44421)
                0.0625 = fieldNorm(doc=197)
          0.07835089 = weight(abstract_txt:algorithm in 197) [ClassicSimilarity], result of:
            0.07835089 = score(doc=197,freq=2.0), product of:
              0.15537891 = queryWeight, product of:
                1.713102 = boost
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.015898349 = queryNorm
              0.5042569 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7050157 = idf(docFreq=401, maxDocs=44421)
                0.0625 = fieldNorm(doc=197)
          0.11721988 = weight(abstract_txt:algorithms in 197) [ClassicSimilarity], result of:
            0.11721988 = score(doc=197,freq=2.0), product of:
              0.23266305 = queryWeight, product of:
                2.5674176 = boost
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.015898349 = queryNorm
              0.5038182 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7000527 = idf(docFreq=403, maxDocs=44421)
                0.0625 = fieldNorm(doc=197)
          0.5073435 = weight(abstract_txt:tree in 197) [ClassicSimilarity], result of:
            0.5073435 = score(doc=197,freq=7.0), product of:
              0.44793826 = queryWeight, product of:
                4.1134977 = boost
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.015898349 = queryNorm
              1.1326191 = fieldWeight in 197, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.849437 = idf(docFreq=127, maxDocs=44421)
                0.0625 = fieldNorm(doc=197)
        0.2 = coord(5/25)
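
The score trees above follow the TF-IDF pattern that the listing itself spells out: each term score is queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), and the sum over matching terms is multiplied by the coord factor. The following is a minimal sketch that recomputes two term scores and the total for the first entry (doc 1918). It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values shown in the listing; the function and variable names are hypothetical and are not part of any library API.

```python
import math

# Constants copied from the listing above.
MAX_DOCS = 44421
QUERY_NORM = 0.015898349


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; assumed form, consistent with the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# Two terms of doc 1918, with inputs read off the listing.
data_score = term_score(freq=4.0, doc_freq=4320, field_norm=0.0625)
embeddings_score = term_score(freq=2.0, doc_freq=9, field_norm=0.0625,
                              boost=2.8222961)

# Per-term scores of doc 1918 as printed in the listing (nine matching terms).
listed_term_scores = [
    0.022039883, 0.0112899, 0.047139365, 0.012325593, 0.1444064,
    0.116797924, 0.013671203, 0.08288698, 0.35034996,
]

# coord(9/25): nine of the 25 query terms matched this document.
total_score = sum(listed_term_scores) * (9 / 25)

print(f"data: {data_score:.6f}")
print(f"embeddings: {embeddings_score:.6f}")
print(f"total for doc 1918: {total_score:.6f}")
```

The recomputed values can be compared with the listed 0.022039883, 0.35034996 and 0.2883266; small differences in the last digits are expected because the listing appears to use single-precision arithmetic.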