Document (#38653)

Author
Líska, M.
Sojka, P.
Title
MIaS 1.5
Source
http://www.muni.cz/research/publications/1173732
Year
2014
Abstract
A math-aware search engine based on full-text indexing that enables users to search for mathematical formulae inside documents. The search engine is unique in that it can index and search structural information such as the representation of mathematical formulae. There is no other software or IR system that is able to store three billion formulae in its index and search it with a response time below one second. MIaS processes documents containing mathematical notation in MathML format. The system is built as an extension to any full-text indexing engine and has been verified on the state-of-the-art Lucene core. It is scalable: it was verified to index almost the whole of arxiv.org (440,000 papers), containing more than 160,000,000 formulae. The software is used in EuDML (eudml.org) and other digital libraries. For more details, see the papers in peer-reviewed conferences: [1] Sojka, Petr; Líska, Martin. In Matthew R. B. Hardy, Frank Wm. Tompa. Proceedings of the 2011 ACM Symposium on Document Engineering. Mountain View, CA, USA: ACM, 2011, pp. 57--60. [2] Sojka, Petr; Líska, Martin. In J. H. Davenport, W. M. Farmer, J. Urban, F. Rabe. Intelligent Computer Mathematics, LNCS 6824. Springer, 2011, pp. 228--243.
Content
Cf.: https://mir.fi.muni.cz/mias/.
Field
Mathematics

Similar documents (author)

  1. Sojka, P.: Exploiting semantic annotations in math information retrieval (2012) 6.19
    6.1935673 = sum of:
      6.1935673 = weight(author_txt:sojka in 1032) [ClassicSimilarity], result of:
        6.1935673 = fieldWeight in 1032, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.625 = fieldNorm(doc=1032)
    
  2. Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010) 4.95
    4.954854 = sum of:
      4.954854 = weight(author_txt:sojka in 2058) [ClassicSimilarity], result of:
        4.954854 = fieldWeight in 2058, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.5 = fieldNorm(doc=2058)
    
  3. Sojka, P.; Liska, M.: ¬The art of mathematics retrieval (2011) 4.95
    4.954854 = sum of:
      4.954854 = weight(author_txt:sojka in 4450) [ClassicSimilarity], result of:
        4.954854 = fieldWeight in 4450, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.5 = fieldNorm(doc=4450)
    
  4. Sojka, P.; Lee, M.; Rehurek, R.; Hatlapatka, R.; Kucbel, M.; Bouche, T.; Goutorbe, C.; Anghelache, R.; Wojciechowski, K.: Toolset for entity and semantic associations : Final Release (2013) 2.17
    2.1677487 = sum of:
      2.1677487 = weight(author_txt:sojka in 2057) [ClassicSimilarity], result of:
        2.1677487 = fieldWeight in 2057, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.909708 = idf(docFreq=5, maxDocs=44421)
          0.21875 = fieldNorm(doc=2057)
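
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and are not part of Lucene's API, and the fieldNorm values are taken from the listing rather than computed.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as assumed for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explanation tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Entry 1 (doc 1032): freq=1, docFreq=5, maxDocs=44421, fieldNorm=0.625
score_1032 = field_weight(1.0, 5, 44421, 0.625)
# Entry 2 (doc 2058): same term statistics, fieldNorm=0.5
score_2058 = field_weight(1.0, 5, 44421, 0.5)
print(round(score_1032, 4), round(score_2058, 4))
```

Because tf and idf are identical for all four entries, the ranking in this list is driven entirely by fieldNorm, which shrinks as the author field gets longer.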
    

Similar documents (content)

  1. Sojka, P.; Liska, M.: ¬The art of mathematics retrieval (2011) 0.40
    0.3954164 = sum of:
      0.3954164 = product of:
        1.4122014 = sum of:
          0.0324775 = weight(abstract_txt:documents in 4450) [ClassicSimilarity], result of:
            0.0324775 = score(doc=4450,freq=1.0), product of:
              0.072014056 = queryWeight, product of:
                1.061393 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.016454844 = queryNorm
              0.45098835 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.23519768 = weight(abstract_txt:math in 4450) [ClassicSimilarity], result of:
            0.23519768 = score(doc=4450,freq=3.0), product of:
              0.14834303 = queryWeight, product of:
                1.0771748 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.016454844 = queryNorm
              1.5854987 = fieldWeight in 4450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.26380014 = weight(abstract_txt:lucene in 4450) [ClassicSimilarity], result of:
            0.26380014 = score(doc=4450,freq=2.0), product of:
              0.18331258 = queryWeight, product of:
                1.1974262 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.016454844 = queryNorm
              1.4390728 = fieldWeight in 4450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.11645126 = weight(abstract_txt:engine in 4450) [ClassicSimilarity], result of:
            0.11645126 = score(doc=4450,freq=1.0), product of:
              0.19311771 = queryWeight, product of:
                2.1287482 = boost
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.016454844 = queryNorm
              0.60300666 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.05653269 = weight(abstract_txt:search in 4450) [ClassicSimilarity], result of:
            0.05653269 = score(doc=4450,freq=1.0), product of:
              0.14143014 = queryWeight, product of:
                2.3518445 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.016454844 = queryNorm
              0.39972165 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.17789885 = weight(abstract_txt:mathematical in 4450) [ClassicSimilarity], result of:
            0.17789885 = score(doc=4450,freq=1.0), product of:
              0.25615808 = queryWeight, product of:
                2.4516997 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.016454844 = queryNorm
              0.6944885 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
          0.5298434 = weight(abstract_txt:formulae in 4450) [ClassicSimilarity], result of:
            0.5298434 = score(doc=4450,freq=1.0), product of:
              0.5836295 = queryWeight, product of:
                4.2731805 = boost
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.016454844 = queryNorm
              0.90784204 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.109375 = fieldNorm(doc=4450)
        0.28 = coord(7/25)
    
  2. Sojka, P.: Exploiting semantic annotations in math information retrieval (2012) 0.29
    0.28725106 = sum of:
      0.28725106 = product of:
        0.89765954 = sum of:
          0.024702806 = weight(abstract_txt:text in 1032) [ClassicSimilarity], result of:
            0.024702806 = score(doc=1032,freq=2.0), product of:
              0.0691632 = queryWeight, product of:
                1.040172 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.016454844 = queryNorm
              0.3571669 = fieldWeight in 1032, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.026245782 = weight(abstract_txt:documents in 1032) [ClassicSimilarity], result of:
            0.026245782 = score(doc=1032,freq=2.0), product of:
              0.072014056 = queryWeight, product of:
                1.061393 = boost
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.016454844 = queryNorm
              0.3644536 = fieldWeight in 1032, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.123322 = idf(docFreq=1954, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.13439867 = weight(abstract_txt:math in 1032) [ClassicSimilarity], result of:
            0.13439867 = score(doc=1032,freq=3.0), product of:
              0.14834303 = queryWeight, product of:
                1.0771748 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.016454844 = queryNorm
              0.90599924 = fieldWeight in 1032, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.0377506 = weight(abstract_txt:indexing in 1032) [ClassicSimilarity], result of:
            0.0377506 = score(doc=1032,freq=3.0), product of:
              0.08016099 = queryWeight, product of:
                1.1198224 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.016454844 = queryNorm
              0.4709348 = fieldWeight in 1032, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.106591366 = weight(abstract_txt:lucene in 1032) [ClassicSimilarity], result of:
            0.106591366 = score(doc=1032,freq=1.0), product of:
              0.18331258 = queryWeight, product of:
                1.1974262 = boost
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.016454844 = queryNorm
              0.5814733 = fieldWeight in 1032, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.303573 = idf(docFreq=10, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.09410683 = weight(abstract_txt:engine in 1032) [ClassicSimilarity], result of:
            0.09410683 = score(doc=1032,freq=2.0), product of:
              0.19311771 = queryWeight, product of:
                2.1287482 = boost
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.016454844 = queryNorm
              0.48730296 = fieldWeight in 1032, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.045685314 = weight(abstract_txt:search in 1032) [ClassicSimilarity], result of:
            0.045685314 = score(doc=1032,freq=2.0), product of:
              0.14143014 = queryWeight, product of:
                2.3518445 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.016454844 = queryNorm
              0.3230239 = fieldWeight in 1032, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
          0.42817813 = weight(abstract_txt:formulae in 1032) [ClassicSimilarity], result of:
            0.42817813 = score(doc=1032,freq=2.0), product of:
              0.5836295 = queryWeight, product of:
                4.2731805 = boost
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.016454844 = queryNorm
              0.73364717 = fieldWeight in 1032, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.0625 = fieldNorm(doc=1032)
        0.32 = coord(8/25)
    
  3. Fife, E.D.; Husch, L.: ¬The Mathematics Archives : making mathematics easy to find on the Web (1999) 0.15
    0.1515325 = sum of:
      0.1515325 = product of:
        0.5411875 = sum of:
          0.017467521 = weight(abstract_txt:text in 2239) [ClassicSimilarity], result of:
            0.017467521 = score(doc=2239,freq=1.0), product of:
              0.0691632 = queryWeight, product of:
                1.040172 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.016454844 = queryNorm
              0.25255513 = fieldWeight in 2239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.07759511 = weight(abstract_txt:math in 2239) [ClassicSimilarity], result of:
            0.07759511 = score(doc=2239,freq=1.0), product of:
              0.14834303 = queryWeight, product of:
                1.0771748 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.016454844 = queryNorm
              0.5230789 = fieldWeight in 2239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.03795224 = weight(abstract_txt:software in 2239) [ClassicSimilarity], result of:
            0.03795224 = score(doc=2239,freq=3.0), product of:
              0.08044618 = queryWeight, product of:
                1.1218127 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.016454844 = queryNorm
              0.4717718 = fieldWeight in 2239, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.031623583 = weight(abstract_txt:full in 2239) [ClassicSimilarity], result of:
            0.031623583 = score(doc=2239,freq=1.0), product of:
              0.10273733 = queryWeight, product of:
                1.2677445 = boost
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.016454844 = queryNorm
              0.30781004 = fieldWeight in 2239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9249606 = idf(docFreq=876, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.09410683 = weight(abstract_txt:engine in 2239) [ClassicSimilarity], result of:
            0.09410683 = score(doc=2239,freq=2.0), product of:
              0.19311771 = queryWeight, product of:
                2.1287482 = boost
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.016454844 = queryNorm
              0.48730296 = fieldWeight in 2239, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.079129286 = weight(abstract_txt:search in 2239) [ClassicSimilarity], result of:
            0.079129286 = score(doc=2239,freq=6.0), product of:
              0.14143014 = queryWeight, product of:
                2.3518445 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.016454844 = queryNorm
              0.5594938 = fieldWeight in 2239, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
          0.20331298 = weight(abstract_txt:mathematical in 2239) [ClassicSimilarity], result of:
            0.20331298 = score(doc=2239,freq=4.0), product of:
              0.25615808 = queryWeight, product of:
                2.4516997 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.016454844 = queryNorm
              0.7937012 = fieldWeight in 2239, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0625 = fieldNorm(doc=2239)
        0.28 = coord(7/25)
    
  4. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.12
    0.12079932 = sum of:
      0.12079932 = product of:
        0.75499576 = sum of:
          0.07759511 = weight(abstract_txt:math in 499) [ClassicSimilarity], result of:
            0.07759511 = score(doc=499,freq=1.0), product of:
              0.14834303 = queryWeight, product of:
                1.0771748 = boost
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.016454844 = queryNorm
              0.5230789 = fieldWeight in 499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.369263 = idf(docFreq=27, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.021911737 = weight(abstract_txt:software in 499) [ClassicSimilarity], result of:
            0.021911737 = score(doc=499,freq=1.0), product of:
              0.08044618 = queryWeight, product of:
                1.1218127 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.016454844 = queryNorm
              0.27237758 = fieldWeight in 499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.2273108 = weight(abstract_txt:mathematical in 499) [ClassicSimilarity], result of:
            0.2273108 = score(doc=499,freq=5.0), product of:
              0.25615808 = queryWeight, product of:
                2.4516997 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.016454844 = queryNorm
              0.8873849 = fieldWeight in 499, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.42817813 = weight(abstract_txt:formulae in 499) [ClassicSimilarity], result of:
            0.42817813 = score(doc=499,freq=2.0), product of:
              0.5836295 = queryWeight, product of:
                4.2731805 = boost
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.016454844 = queryNorm
              0.73364717 = fieldWeight in 499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.30027 = idf(docFreq=29, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
        0.16 = coord(4/25)
    
  5. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.12
    0.12031877 = sum of:
      0.12031877 = product of:
        0.42970988 = sum of:
          0.030878507 = weight(abstract_txt:text in 1007) [ClassicSimilarity], result of:
            0.030878507 = score(doc=1007,freq=2.0), product of:
              0.0691632 = queryWeight, product of:
                1.040172 = boost
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.016454844 = queryNorm
              0.4464586 = fieldWeight in 1007, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.040882 = idf(docFreq=2122, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.027244149 = weight(abstract_txt:indexing in 1007) [ClassicSimilarity], result of:
            0.027244149 = score(doc=1007,freq=1.0), product of:
              0.08016099 = queryWeight, product of:
                1.1198224 = boost
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.016454844 = queryNorm
              0.33986792 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3503094 = idf(docFreq=1557, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.027389672 = weight(abstract_txt:software in 1007) [ClassicSimilarity], result of:
            0.027389672 = score(doc=1007,freq=1.0), product of:
              0.08044618 = queryWeight, product of:
                1.1218127 = boost
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.016454844 = queryNorm
              0.34047198 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3580413 = idf(docFreq=1545, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.053186487 = weight(abstract_txt:index in 1007) [ClassicSimilarity], result of:
            0.053186487 = score(doc=1007,freq=1.0), product of:
              0.14333278 = queryWeight, product of:
                1.8339437 = boost
                4.7496953 = idf(docFreq=1044, maxDocs=44421)
                0.016454844 = queryNorm
              0.37106994 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7496953 = idf(docFreq=1044, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.08317947 = weight(abstract_txt:engine in 1007) [ClassicSimilarity], result of:
            0.08317947 = score(doc=1007,freq=1.0), product of:
              0.19311771 = queryWeight, product of:
                2.1287482 = boost
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.016454844 = queryNorm
              0.43071902 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5132036 = idf(docFreq=486, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.080760986 = weight(abstract_txt:search in 1007) [ClassicSimilarity], result of:
            0.080760986 = score(doc=1007,freq=4.0), product of:
              0.14143014 = queryWeight, product of:
                2.3518445 = boost
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.016454844 = queryNorm
              0.5710309 = fieldWeight in 1007, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.654598 = idf(docFreq=3123, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
          0.1270706 = weight(abstract_txt:mathematical in 1007) [ClassicSimilarity], result of:
            0.1270706 = score(doc=1007,freq=1.0), product of:
              0.25615808 = queryWeight, product of:
                2.4516997 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.016454844 = queryNorm
              0.49606323 = fieldWeight in 1007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.078125 = fieldNorm(doc=1007)
        0.28 = coord(7/25)
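
The content-similarity scores combine several term matches. The sketch below recomputes the score of entry 4 (doc 499), assuming each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and that the sum is scaled by coord (matched terms / query terms). The boost, queryNorm, and fieldNorm values are copied from the listing. The helper names are illustrative only.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.016454844  # copied from the listing, not derived here

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(boost, doc_freq, freq, field_norm):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM          # query-side factor
    field_weight = math.sqrt(freq) * idf * field_norm  # document-side factor
    return query_weight * field_weight

# (boost, docFreq, freq, fieldNorm) for the four terms matched in doc 499
matches_499 = [
    (1.0771748, 27, 1.0, 0.0625),    # math
    (1.1218127, 1545, 1.0, 0.0625),  # software
    (2.4516997, 210, 5.0, 0.0625),   # mathematical
    (4.2731805, 29, 2.0, 0.0625),    # formulae
]

coord = len(matches_499) / 25  # 4 of 25 query terms matched
score_499 = coord * sum(term_score(*m) for m in matches_499)
print(round(score_499, 4))
```

The coord factor explains why entry 2 (8 of 25 terms matched) gets a larger multiplier than entry 1 (7 of 25) yet still ranks lower: the per-term sum for entry 1 is larger because its fieldNorm is higher.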