Document (#40450)

Author
Zanibbi, R.
Yuan, B.
Title
Keyword and image-based retrieval for mathematical expressions
Source
https://www.cs.rit.edu/~rlaz/files/ZanibbiYuanDRR2011.pdf
Year
2011
Abstract
Two new methods for retrieving mathematical expressions using conventional keyword search and expression images are presented. An expression-level TF-IDF (term frequency-inverse document frequency) approach is used for keyword search, where queries and indexed expressions are represented by keywords taken from LaTeX strings. TF-IDF is computed at the level of individual expressions rather than documents to increase the precision of matching. The second retrieval technique is a form of Content-Based Image Retrieval (CBIR). Expressions are segmented into connected components, and then components in the query expression and each expression in the collection are matched using contour and density features, aspect ratios, and relative positions. In an experiment using ten randomly sampled queries from a corpus of over 22,000 expressions, precision-at-k (k = 20) for the keyword-based approach was higher (keyword: µ = 84.0, s = 19.0; image-based: µ = 32.0, s = 30.7), but for a few of the queries better results were obtained using a combination of the two techniques.
Field
Mathematics
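
The abstract reports precision-at-k (k = 20) as a mean and standard deviation over ten queries. A minimal sketch of that measure in Python follows. The function name, the result lists, and the relevance sets are hypothetical and are not taken from the paper, and the use of the sample standard deviation is an assumption about what "s" denotes.

```python
import statistics

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked results that are relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k

# Hypothetical rankings and relevance judgements for three queries.
runs = [
    (["e1", "e2", "e3", "e4"], {"e1", "e3"}),
    (["e5", "e6", "e7", "e8"], {"e5", "e6", "e7"}),
    (["e9", "e10", "e11", "e12"], {"e12"}),
]

# Precision at k = 4 per query, expressed in percent.
scores = [100.0 * precision_at_k(ranked, relevant, 4) for ranked, relevant in runs]
mean_p = statistics.mean(scores)
# Sample standard deviation (assumed interpretation of "s").
std_p = statistics.stdev(scores)
```

With real judgements, `ranked` would hold the 20 top-ranked expressions per query and `k` would be 20.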

Similar documents (author)

  1. Yuan, W.: End-user searching behavior in information retrieval : a longitudinal study (1997) 5.19
    5.187669 = sum of:
      5.187669 = weight(author_txt:yuan in 394) [ClassicSimilarity], result of:
        5.187669 = fieldWeight in 394, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.625 = fieldNorm(doc=394)
    
  2. Yuan, W.; Meadow, C.T.: A study of the use of variables in information retrieval user studies (1999) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 3943) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 3943, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=3943)
    
  3. Jin, Z.; Yuan, C.: On the ambiguity of information retrieval for visualization (1998) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 4216) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 4216, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=4216)
    
  4. Yuan, X.; Belkin, N.J.: Investigating information retrieval support techniques for different information-seeking strategies (2010) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 686) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 686, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=686)
    
  5. Yuan, X.; Belkin, N.J.: Evaluating an integrated system supporting multiple information-seeking strategies (2010) 4.15
    4.150135 = sum of:
      4.150135 = weight(author_txt:yuan in 979) [ClassicSimilarity], result of:
        4.150135 = fieldWeight in 979, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.30027 = idf(docFreq=29, maxDocs=44421)
          0.5 = fieldNorm(doc=979)
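
The author-match scores above can be reproduced from the values shown in each explanation. A minimal sketch in Python, assuming the ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative, and fieldNorm is copied from the output rather than recomputed from field length.

```python
import math

def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the explain output."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values for author_txt:yuan taken from the listing above.
score_394 = field_weight(1.0, 29, 44421, 0.625)   # first hit, fieldNorm 0.625
score_3943 = field_weight(1.0, 29, 44421, 0.5)    # second hit, fieldNorm 0.5
```

The only factor that differs between the first hit and the others is fieldNorm, which is why the first record ranks higher even though the term frequency and idf are identical.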
    

Similar documents (content)

  1. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.22
    0.22422601 = sum of:
      0.22422601 = product of:
        0.9342751 = sum of:
          0.019017894 = weight(abstract_txt:approach in 499) [ClassicSimilarity], result of:
            0.019017894 = score(doc=499,freq=2.0), product of:
              0.0575126 = queryWeight, product of:
                1.0204948 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015064259 = queryNorm
              0.33067352 = fieldWeight in 499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.18466878 = weight(abstract_txt:latex in 499) [ClassicSimilarity], result of:
            0.18466878 = score(doc=499,freq=3.0), product of:
              0.18149999 = queryWeight, product of:
                1.2818954 = boost
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.015064259 = queryNorm
              1.0174589 = fieldWeight in 499, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.398883 = idf(docFreq=9, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.1470145 = weight(abstract_txt:mathematical in 499) [ClassicSimilarity], result of:
            0.1470145 = score(doc=499,freq=5.0), product of:
              0.16567163 = queryWeight, product of:
                1.7320219 = boost
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.015064259 = queryNorm
              0.8873849 = fieldWeight in 499, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.3496094 = idf(docFreq=210, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.021218447 = weight(abstract_txt:using in 499) [ClassicSimilarity], result of:
            0.021218447 = score(doc=499,freq=1.0), product of:
              0.09820881 = queryWeight, product of:
                1.8859037 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015064259 = queryNorm
              0.21605442 = fieldWeight in 499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.14616635 = weight(abstract_txt:expression in 499) [ClassicSimilarity], result of:
            0.14616635 = score(doc=499,freq=1.0), product of:
              0.3555546 = queryWeight, product of:
                3.588372 = boost
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.015064259 = queryNorm
              0.41109398 = fieldWeight in 499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
          0.41618913 = weight(abstract_txt:expressions in 499) [ClassicSimilarity], result of:
            0.41618913 = score(doc=499,freq=3.0), product of:
              0.56692445 = queryWeight, product of:
                5.549483 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.015064259 = queryNorm
              0.73411745 = fieldWeight in 499, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.0625 = fieldNorm(doc=499)
        0.24 = coord(6/25)
    
  2. Yoon, J.W.; Chung, E.K.: Understanding image needs in daily life by analyzing questions in a social Q&A site (2011) 0.19
    0.18943006 = sum of:
      0.18943006 = product of:
        0.6765359 = sum of:
          0.013447682 = weight(abstract_txt:approach in 922) [ClassicSimilarity], result of:
            0.013447682 = score(doc=922,freq=1.0), product of:
              0.0575126 = queryWeight, product of:
                1.0204948 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015064259 = queryNorm
              0.2338215 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.01618647 = weight(abstract_txt:retrieval in 922) [ClassicSimilarity], result of:
            0.01618647 = score(doc=922,freq=1.0), product of:
              0.07449547 = queryWeight, product of:
                1.4224594 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015064259 = queryNorm
              0.21728125 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.044861075 = weight(abstract_txt:components in 922) [ClassicSimilarity], result of:
            0.044861075 = score(doc=922,freq=1.0), product of:
              0.12840378 = queryWeight, product of:
                1.5248187 = boost
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.015064259 = queryNorm
              0.349375 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.59 = idf(docFreq=450, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.051149804 = weight(abstract_txt:queries in 922) [ClassicSimilarity], result of:
            0.051149804 = score(doc=922,freq=1.0), product of:
              0.16041973 = queryWeight, product of:
                2.0873911 = boost
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.015064259 = queryNorm
              0.31884983 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.16923216 = weight(abstract_txt:image in 922) [ClassicSimilarity], result of:
            0.16923216 = score(doc=922,freq=8.0), product of:
              0.17809582 = queryWeight, product of:
                2.1993876 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.015064259 = queryNorm
              0.95023096 = fieldWeight in 922, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.14137183 = weight(abstract_txt:keyword in 922) [ClassicSimilarity], result of:
            0.14137183 = score(doc=922,freq=1.0), product of:
              0.3745875 = queryWeight, product of:
                4.1179013 = boost
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.015064259 = queryNorm
              0.3774067 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.038507 = idf(docFreq=287, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
          0.2402869 = weight(abstract_txt:expressions in 922) [ClassicSimilarity], result of:
            0.2402869 = score(doc=922,freq=1.0), product of:
              0.56692445 = queryWeight, product of:
                5.549483 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.015064259 = queryNorm
              0.4238429 = fieldWeight in 922, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.0625 = fieldNorm(doc=922)
        0.28 = coord(7/25)
    
  3. D'Ambrosio, D.M.: Conceptualizing metadata via repertory grids : exploring a method for the development of domain-specific systems for knowledge organization (2007) 0.13
    0.1275051 = sum of:
      0.1275051 = product of:
        0.7969068 = sum of:
          0.022891125 = weight(abstract_txt:retrieval in 1662) [ClassicSimilarity], result of:
            0.022891125 = score(doc=1662,freq=2.0), product of:
              0.07449547 = queryWeight, product of:
                1.4224594 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015064259 = queryNorm
              0.3072821 = fieldWeight in 1662, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=1662)
          0.030007415 = weight(abstract_txt:using in 1662) [ClassicSimilarity], result of:
            0.030007415 = score(doc=1662,freq=2.0), product of:
              0.09820881 = queryWeight, product of:
                1.8859037 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.015064259 = queryNorm
              0.3055471 = fieldWeight in 1662, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=1662)
          0.20671044 = weight(abstract_txt:expression in 1662) [ClassicSimilarity], result of:
            0.20671044 = score(doc=1662,freq=2.0), product of:
              0.3555546 = queryWeight, product of:
                3.588372 = boost
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.015064259 = queryNorm
              0.58137465 = fieldWeight in 1662, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.0625 = fieldNorm(doc=1662)
          0.53729784 = weight(abstract_txt:expressions in 1662) [ClassicSimilarity], result of:
            0.53729784 = score(doc=1662,freq=5.0), product of:
              0.56692445 = queryWeight, product of:
                5.549483 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.015064259 = queryNorm
              0.94774157 = fieldWeight in 1662, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.0625 = fieldNorm(doc=1662)
        0.16 = coord(4/25)
    
  4. Eerola, J.; Vakkari, P.: How a general and a specific thesaurus cover expressions in patients' questions and physicians' answers (2008) 0.13
    0.12598139 = sum of:
      0.12598139 = product of:
        0.78738374 = sum of:
          0.016809601 = weight(abstract_txt:approach in 2732) [ClassicSimilarity], result of:
            0.016809601 = score(doc=2732,freq=1.0), product of:
              0.0575126 = queryWeight, product of:
                1.0204948 = boost
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.015064259 = queryNorm
              0.29227686 = fieldWeight in 2732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.741144 = idf(docFreq=2864, maxDocs=44421)
                0.078125 = fieldNorm(doc=2732)
          0.06762973 = weight(abstract_txt:matched in 2732) [ClassicSimilarity], result of:
            0.06762973 = score(doc=2732,freq=1.0), product of:
              0.11547106 = queryWeight, product of:
                1.0224707 = boost
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.015064259 = queryNorm
              0.58568555 = fieldWeight in 2732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.496775 = idf(docFreq=66, maxDocs=44421)
                0.078125 = fieldNorm(doc=2732)
          0.18270795 = weight(abstract_txt:expression in 2732) [ClassicSimilarity], result of:
            0.18270795 = score(doc=2732,freq=1.0), product of:
              0.3555546 = queryWeight, product of:
                3.588372 = boost
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.015064259 = queryNorm
              0.5138675 = fieldWeight in 2732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.078125 = fieldNorm(doc=2732)
          0.52023643 = weight(abstract_txt:expressions in 2732) [ClassicSimilarity], result of:
            0.52023643 = score(doc=2732,freq=3.0), product of:
              0.56692445 = queryWeight, product of:
                5.549483 = boost
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.015064259 = queryNorm
              0.9176468 = fieldWeight in 2732, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7814865 = idf(docFreq=136, maxDocs=44421)
                0.078125 = fieldNorm(doc=2732)
        0.16 = coord(4/25)
    
  5. Corridoni, J.M.; Bimbo, A. del; Vicario, E.: Image retrieval by color semantics with incomplete knowledge (1998) 0.13
    0.1253731 = sum of:
      0.1253731 = product of:
        0.5223879 = sum of:
          0.046652988 = weight(abstract_txt:level in 1594) [ClassicSimilarity], result of:
            0.046652988 = score(doc=1594,freq=4.0), product of:
              0.08302923 = queryWeight, product of:
                1.2261534 = boost
                4.4950905 = idf(docFreq=1347, maxDocs=44421)
                0.015064259 = queryNorm
              0.5618863 = fieldWeight in 1594, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4950905 = idf(docFreq=1347, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
          0.028035786 = weight(abstract_txt:retrieval in 1594) [ClassicSimilarity], result of:
            0.028035786 = score(doc=1594,freq=3.0), product of:
              0.07449547 = queryWeight, product of:
                1.4224594 = boost
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.015064259 = queryNorm
              0.37634215 = fieldWeight in 1594, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4765 = idf(docFreq=3732, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
          0.04327955 = weight(abstract_txt:precision in 1594) [ClassicSimilarity], result of:
            0.04327955 = score(doc=1594,freq=1.0), product of:
              0.12536795 = queryWeight, product of:
                1.5066854 = boost
                5.5235233 = idf(docFreq=481, maxDocs=44421)
                0.015064259 = queryNorm
              0.3452202 = fieldWeight in 1594, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5235233 = idf(docFreq=481, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
          0.051149804 = weight(abstract_txt:queries in 1594) [ClassicSimilarity], result of:
            0.051149804 = score(doc=1594,freq=1.0), product of:
              0.16041973 = queryWeight, product of:
                2.0873911 = boost
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.015064259 = queryNorm
              0.31884983 = fieldWeight in 1594, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1015973 = idf(docFreq=734, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
          0.14655936 = weight(abstract_txt:image in 1594) [ClassicSimilarity], result of:
            0.14655936 = score(doc=1594,freq=6.0), product of:
              0.17809582 = queryWeight, product of:
                2.1993876 = boost
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.015064259 = queryNorm
              0.8229242 = fieldWeight in 1594, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.375318 = idf(docFreq=558, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
          0.20671044 = weight(abstract_txt:expression in 1594) [ClassicSimilarity], result of:
            0.20671044 = score(doc=1594,freq=2.0), product of:
              0.3555546 = queryWeight, product of:
                3.588372 = boost
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.015064259 = queryNorm
              0.58137465 = fieldWeight in 1594, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5775037 = idf(docFreq=167, maxDocs=44421)
                0.0625 = fieldNorm(doc=1594)
        0.24 = coord(6/25)
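
The content-match scores combine more factors than the author matches. Each matching term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched terms / query terms). A minimal sketch in Python that recomputes the score of the last entry (doc=1594) from its listed boost, freq, and docFreq values. It assumes the same tf and idf definitions as ClassicSimilarity; the function names are illustrative, and queryNorm and fieldNorm are copied from the output, since they depend on data not shown here.

```python
import math

MAX_DOCS = 44421
QUERY_NORM = 0.015064259  # copied from the explain output

def idf(doc_freq):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost, freq, doc_freq, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight

# (boost, freq, docFreq) for each term matched in doc=1594.
terms_1594 = [
    (1.2261534, 4.0, 1347),  # level
    (1.4224594, 3.0, 3732),  # retrieval
    (1.5066854, 1.0, 481),   # precision
    (2.0873911, 1.0, 734),   # queries
    (2.1993876, 6.0, 558),   # image
    (3.588372, 2.0, 167),    # expression
]

term_sum = sum(term_score(b, f, df, 0.0625) for b, f, df in terms_1594)
coord = len(terms_1594) / 25  # 6 of 25 query terms matched
score_1594 = term_sum * coord
```

The coord factor penalizes records that match only a few of the 25 query terms, which is why all content scores are well below the raw term sums.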