Document (#33352)

Author
Lin, M.
Zhang, Z.
Title
Question-driven segmentation of lecture speech text : towards intelligent e-learning systems
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.2, pp.186-200
Year
2008
Abstract
Recently, lecture videos have been widely used in e-learning systems. Envisioning intelligent e-learning systems, this article addresses the challenge of information seeking in lecture videos by retrieving relevant video segments based on user queries, through dynamic segmentation of lecture speech text. In the proposed approach, shallow parsing techniques such as part-of-speech tagging and noun phrase chunking are used to parse both questions and Automated Speech Recognition (ASR) transcripts. A sliding-window algorithm is proposed to identify the start and end boundaries of returned segments. Phonetic and partial matching are used to correct the errors from automated speech recognition and noun phrase chunking. Furthermore, extra knowledge such as lecture slides is used to facilitate ASR transcript error correction. The approach also makes use of proximity to approximate deep parsing and structure matching between questions and sentences in ASR transcripts. The experimental results showed that both phonetic and partial matching improved the segmentation performance, that slides-based ASR transcript correction improved information coverage, and that proximity was also effective in improving the overall performance.
Theme
Computer Based Training

Similar documents (author)

  1. Zhang, M.; Zhang, Y.: Professional organizations in Twittersphere : an empirical study of U.S. library and information science professional organizations-related Tweets (2020) 4.53
    4.5277104 = sum of:
      4.5277104 = weight(author_txt:zhang in 775) [ClassicSimilarity], result of:
        4.5277104 = score(doc=775,freq=2.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.527711 = fieldWeight in 775, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.5 = fieldNorm(doc=775)
    
  2. Zhang, Y.; Zhang, C.: Enhancing keyphrase extraction from microblogs using human reading time (2021) 4.53
    4.5277104 = sum of:
      4.5277104 = weight(author_txt:zhang in 1238) [ClassicSimilarity], result of:
        4.5277104 = score(doc=1238,freq=2.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.527711 = fieldWeight in 1238, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.5 = fieldNorm(doc=1238)
    
  3. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 7710) [ClassicSimilarity], result of:
        4.0019684 = score(doc=7710,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 7710, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=7710)
    
  4. Zhang, A.: Multimedia file formats on the Internet : a beginner's guide for PC users (1995) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 3280) [ClassicSimilarity], result of:
        4.0019684 = score(doc=3280,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 3280, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=3280)
    
  5. Zhang, J.: A representational analysis of relational information displays (1996) 4.00
    4.0019684 = sum of:
      4.0019684 = weight(author_txt:zhang in 6471) [ClassicSimilarity], result of:
        4.0019684 = score(doc=6471,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.15617312 = queryNorm
          4.001969 = fieldWeight in 6471, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            6.40315 = idf(docFreq=199, maxDocs=44421)
            0.625 = fieldNorm(doc=6471)
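
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the figures in the listing. The function names are illustrative and not part of any library, and the listing's single-precision values mean agreement is only approximate.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor, assumed to be the square root of the raw count."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Single-term score = queryWeight * fieldWeight, as in the explain tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm       # "queryWeight, product of"
    field_weight = tf(freq) * term_idf * field_norm    # "fieldWeight ..., product of"
    return query_weight * field_weight


# Values copied from the listing for author_txt:zhang.
hit_1 = explain_score(freq=2.0, doc_freq=199, max_docs=44421,
                      field_norm=0.5, query_norm=0.15617312)
hit_3 = explain_score(freq=1.0, doc_freq=199, max_docs=44421,
                      field_norm=0.625, query_norm=0.15617312)
print(hit_1, hit_3)
```

Under these assumptions the two values come out close to the 4.53 and 4.00 shown for hits 1 and 3. Hits 1 and 2 rank higher because the author term occurs twice (freq=2.0), which outweighs their smaller fieldNorm.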
    

Similar documents (content)

  1. Brill, E.: An overview of empirical natural language processing (1997) 0.13
    0.13139871 = sum of:
      0.13139871 = product of:
        0.6569935 = sum of:
          0.022672337 = weight(abstract_txt:systems in 4249) [ClassicSimilarity], result of:
            0.022672337 = score(doc=4249,freq=1.0), product of:
              0.05317191 = queryWeight, product of:
                1.1075821 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.014073508 = queryNorm
              0.42639688 = fieldWeight in 4249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.125 = fieldNorm(doc=4249)
          0.08703943 = weight(abstract_txt:recognition in 4249) [ClassicSimilarity], result of:
            0.08703943 = score(doc=4249,freq=1.0), product of:
              0.11388461 = queryWeight, product of:
                1.3234931 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.014073508 = queryNorm
              0.7642774 = fieldWeight in 4249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.125 = fieldNorm(doc=4249)
          0.060909912 = weight(abstract_txt:learning in 4249) [ClassicSimilarity], result of:
            0.060909912 = score(doc=4249,freq=1.0), product of:
              0.10275668 = queryWeight, product of:
                1.5397131 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014073508 = queryNorm
              0.59275866 = fieldWeight in 4249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.125 = fieldNorm(doc=4249)
          0.17727023 = weight(abstract_txt:parsing in 4249) [ClassicSimilarity], result of:
            0.17727023 = score(doc=4249,freq=1.0), product of:
              0.18298334 = queryWeight, product of:
                1.6776252 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.014073508 = queryNorm
              0.968778 = fieldWeight in 4249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.125 = fieldNorm(doc=4249)
          0.30910158 = weight(abstract_txt:speech in 4249) [ClassicSimilarity], result of:
            0.30910158 = score(doc=4249,freq=1.0), product of:
              0.35977846 = queryWeight, product of:
                3.719433 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014073508 = queryNorm
              0.8591442 = fieldWeight in 4249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.125 = fieldNorm(doc=4249)
        0.2 = coord(5/25)
    
  2. Stolcke, A.: Linguistic knowledge and empirical methods in speech recognition (1997) 0.13
    0.12931705 = sum of:
      0.12931705 = product of:
        0.6465852 = sum of:
          0.046931133 = weight(abstract_txt:performance in 3660) [ClassicSimilarity], result of:
            0.046931133 = score(doc=3660,freq=1.0), product of:
              0.06501622 = queryWeight, product of:
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.014073508 = queryNorm
              0.72183734 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.15625 = fieldNorm(doc=3660)
          0.028340422 = weight(abstract_txt:systems in 3660) [ClassicSimilarity], result of:
            0.028340422 = score(doc=3660,freq=1.0), product of:
              0.05317191 = queryWeight, product of:
                1.1075821 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.014073508 = queryNorm
              0.5329961 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.15625 = fieldNorm(doc=3660)
          0.10879929 = weight(abstract_txt:recognition in 3660) [ClassicSimilarity], result of:
            0.10879929 = score(doc=3660,freq=1.0), product of:
              0.11388461 = queryWeight, product of:
                1.3234931 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.014073508 = queryNorm
              0.95534676 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.15625 = fieldNorm(doc=3660)
          0.076137386 = weight(abstract_txt:learning in 3660) [ClassicSimilarity], result of:
            0.076137386 = score(doc=3660,freq=1.0), product of:
              0.10275668 = queryWeight, product of:
                1.5397131 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014073508 = queryNorm
              0.7409483 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.15625 = fieldNorm(doc=3660)
          0.38637698 = weight(abstract_txt:speech in 3660) [ClassicSimilarity], result of:
            0.38637698 = score(doc=3660,freq=1.0), product of:
              0.35977846 = queryWeight, product of:
                3.719433 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014073508 = queryNorm
              1.0739303 = fieldWeight in 3660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.15625 = fieldNorm(doc=3660)
        0.2 = coord(5/25)
    
  3. Benitez, A.B.; Zhong, D.; Chang, S.-F.: Enabling MPEG-7 structural and semantic descriptions in retrieval applications (2007) 0.11
    0.10973553 = sum of:
      0.10973553 = product of:
        0.4572314 = sum of:
          0.013508156 = weight(abstract_txt:used in 1518) [ClassicSimilarity], result of:
            0.013508156 = score(doc=1518,freq=1.0), product of:
              0.051502556 = queryWeight, product of:
                1.0900569 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.014073508 = queryNorm
              0.26228127 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
          0.014170211 = weight(abstract_txt:systems in 1518) [ClassicSimilarity], result of:
            0.014170211 = score(doc=1518,freq=1.0), product of:
              0.05317191 = queryWeight, product of:
                1.1075821 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.014073508 = queryNorm
              0.26649806 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
          0.058236975 = weight(abstract_txt:intelligent in 1518) [ClassicSimilarity], result of:
            0.058236975 = score(doc=1518,freq=1.0), product of:
              0.11917913 = queryWeight, product of:
                1.3539083 = boost
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.014073508 = queryNorm
              0.4886508 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.25473 = idf(docFreq=231, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
          0.079235524 = weight(abstract_txt:videos in 1518) [ClassicSimilarity], result of:
            0.079235524 = score(doc=1518,freq=1.0), product of:
              0.1463348 = queryWeight, product of:
                1.5002477 = boost
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.014073508 = queryNorm
              0.5414674 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.930783 = idf(docFreq=117, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
          0.11336195 = weight(abstract_txt:segments in 1518) [ClassicSimilarity], result of:
            0.11336195 = score(doc=1518,freq=1.0), product of:
              0.18580006 = queryWeight, product of:
                1.690488 = boost
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.014073508 = queryNorm
              0.6101287 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.809647 = idf(docFreq=48, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
          0.1787186 = weight(abstract_txt:segmentation in 1518) [ClassicSimilarity], result of:
            0.1787186 = score(doc=1518,freq=1.0), product of:
              0.28810087 = queryWeight, product of:
                2.5781434 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.014073508 = queryNorm
              0.62033343 = fieldWeight in 1518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.078125 = fieldNorm(doc=1518)
        0.24 = coord(6/25)
    
  4. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.11
    0.10888587 = sum of:
      0.10888587 = product of:
        0.45369112 = sum of:
          0.016425896 = weight(abstract_txt:performance in 68) [ClassicSimilarity], result of:
            0.016425896 = score(doc=68,freq=1.0), product of:
              0.06501622 = queryWeight, product of:
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.014073508 = queryNorm
              0.25264308 = fieldWeight in 68, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.619759 = idf(docFreq=1189, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
          0.013372391 = weight(abstract_txt:used in 68) [ClassicSimilarity], result of:
            0.013372391 = score(doc=68,freq=2.0), product of:
              0.051502556 = queryWeight, product of:
                1.0900569 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.014073508 = queryNorm
              0.2596452 = fieldWeight in 68, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
          0.014027792 = weight(abstract_txt:systems in 68) [ClassicSimilarity], result of:
            0.014027792 = score(doc=68,freq=2.0), product of:
              0.05317191 = queryWeight, product of:
                1.1075821 = boost
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.014073508 = queryNorm
              0.2638196 = fieldWeight in 68, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.411175 = idf(docFreq=3984, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
          0.06595606 = weight(abstract_txt:recognition in 68) [ClassicSimilarity], result of:
            0.06595606 = score(doc=68,freq=3.0), product of:
              0.11388461 = queryWeight, product of:
                1.3234931 = boost
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.014073508 = queryNorm
              0.5791482 = fieldWeight in 68, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.114219 = idf(docFreq=266, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
          0.10968036 = weight(abstract_txt:parsing in 68) [ClassicSimilarity], result of:
            0.10968036 = score(doc=68,freq=2.0), product of:
              0.18298334 = queryWeight, product of:
                1.6776252 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.014073508 = queryNorm
              0.5994008 = fieldWeight in 68, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
          0.23422861 = weight(abstract_txt:speech in 68) [ClassicSimilarity], result of:
            0.23422861 = score(doc=68,freq=3.0), product of:
              0.35977846 = queryWeight, product of:
                3.719433 = boost
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.014073508 = queryNorm
              0.65103567 = fieldWeight in 68, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8731537 = idf(docFreq=124, maxDocs=44421)
                0.0546875 = fieldNorm(doc=68)
        0.24 = coord(6/25)
    
  5. Çelebi, A.; Özgür, A.: Segmenting hashtags and analyzing their grammatical structure (2018) 0.10
    0.101661205 = sum of:
      0.101661205 = product of:
        0.508306 = sum of:
          0.015282733 = weight(abstract_txt:used in 221) [ClassicSimilarity], result of:
            0.015282733 = score(doc=221,freq=2.0), product of:
              0.051502556 = queryWeight, product of:
                1.0900569 = boost
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.014073508 = queryNorm
              0.29673737 = fieldWeight in 221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3572001 = idf(docFreq=4205, maxDocs=44421)
                0.0625 = fieldNorm(doc=221)
          0.030454956 = weight(abstract_txt:learning in 221) [ClassicSimilarity], result of:
            0.030454956 = score(doc=221,freq=1.0), product of:
              0.10275668 = queryWeight, product of:
                1.5397131 = boost
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.014073508 = queryNorm
              0.29637933 = fieldWeight in 221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7420692 = idf(docFreq=1052, maxDocs=44421)
                0.0625 = fieldNorm(doc=221)
          0.08863512 = weight(abstract_txt:parsing in 221) [ClassicSimilarity], result of:
            0.08863512 = score(doc=221,freq=1.0), product of:
              0.18298334 = queryWeight, product of:
                1.6776252 = boost
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.014073508 = queryNorm
              0.484389 = fieldWeight in 221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.750224 = idf(docFreq=51, maxDocs=44421)
                0.0625 = fieldNorm(doc=221)
          0.12629353 = weight(abstract_txt:noun in 221) [ClassicSimilarity], result of:
            0.12629353 = score(doc=221,freq=2.0), product of:
              0.1839014 = queryWeight, product of:
                1.6818284 = boost
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.014073508 = queryNorm
              0.6867458 = fieldWeight in 221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.769642 = idf(docFreq=50, maxDocs=44421)
                0.0625 = fieldNorm(doc=221)
          0.24763975 = weight(abstract_txt:segmentation in 221) [ClassicSimilarity], result of:
            0.24763975 = score(doc=221,freq=3.0), product of:
              0.28810087 = queryWeight, product of:
                2.5781434 = boost
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.014073508 = queryNorm
              0.8595592 = fieldWeight in 221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.9402676 = idf(docFreq=42, maxDocs=44421)
                0.0625 = fieldNorm(doc=221)
        0.2 = coord(5/25)
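
The content-based scores combine several boosted terms and then scale the sum by a coordination factor. The sketch below recomputes the score of the first content hit (doc=4249) from the boost, freq, docFreq and fieldNorm values in its explain tree. It assumes the same classic TF-IDF definitions that the listing's figures are consistent with; the helper names and the tuple layout are hypothetical, chosen only for illustration.

```python
import math

MAX_DOCS = 44421          # maxDocs, as shown in the listing
QUERY_NORM = 0.014073508  # queryNorm for the content query, as shown in the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    """queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm)."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, query_term_count: int) -> float:
    """Sum of matching term scores, scaled by coord = matched / total query terms."""
    total = sum(term_score(*t) for t in terms)
    coord = len(terms) / query_term_count
    return total * coord


# (boost, freq, docFreq, fieldNorm) for each matching term in doc=4249.
doc_4249_terms = [
    (1.1075821, 1.0, 3984, 0.125),  # systems
    (1.3234931, 1.0, 266, 0.125),   # recognition
    (1.5397131, 1.0, 1052, 0.125),  # learning
    (1.6776252, 1.0, 51, 0.125),    # parsing
    (3.719433, 1.0, 124, 0.125),    # speech
]
score_4249 = document_score(doc_4249_terms, query_term_count=25)
print(score_4249)
```

With 5 of the 25 query terms matching, coord is 0.2, and the result should land near the 0.13 reported for this hit. The coord factor explains why content scores are so much lower than the author scores: most query terms are absent from each candidate abstract.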