Document (#40932)

Author
Calvanese, D.
Kalayci, T.E.
Montali, M.
Santoso, A.
Title
OBDA for log extraction in process mining
Source
Reasoning Web: Semantic Interoperability on the Web, 13th International Summer School 2017, London, UK, July 7-11, 2017, Tutorial Lectures. Eds.: Ianni, G. et al.
Imprint
Cham : Springer International Publishing
Year
2017
Pages
pp. 292-345
Series
(Lecture Notes in Computer Science; 10370) (Information Systems and Applications, incl. Internet/Web, and HCI)
Abstract
Process mining is an emerging area that synergically combines model-based and data-oriented analysis techniques to obtain useful insights on how business processes are executed within an organization. Through process mining, decision makers can discover process models from data, compare expected and actual behaviors, and enrich models with key information about their actual execution. To be applicable, process mining techniques require the input data to be explicitly structured in the form of an event log, which lists when and by whom different case objects (i.e., process instances) have been subject to the execution of tasks. Unfortunately, in many real world set-ups, such event logs are not explicitly given, but are instead implicitly represented in legacy information systems. To apply process mining in this widespread setting, there is a pressing need for techniques able to support various process stakeholders in data preparation and log extraction from legacy information systems. The purpose of this paper is to single out this challenging, open issue, and didactically introduce how techniques from intelligent data management, and in particular ontology-based data access, provide a viable solution with a solid theoretical basis.

Similar documents (content)

  1. Barrio, P.; Gravano, L.: Sampling strategies for information extraction over the deep web (2017) 0.30
    0.2956445 = sum of:
      0.2956445 = product of:
        0.8212347 = sum of:
          0.009650153 = weight(abstract_txt:this in 4412) [ClassicSimilarity], result of:
            0.009650153 = score(doc=4412,freq=3.0), product of:
              0.042339675 = queryWeight, product of:
                1.0039073 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017527334 = queryNorm
              0.22792223 = fieldWeight in 4412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.016008774 = weight(abstract_txt:information in 4412) [ClassicSimilarity], result of:
            0.016008774 = score(doc=4412,freq=8.0), product of:
              0.042786542 = queryWeight, product of:
                1.0091913 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.017527334 = queryNorm
              0.37415442 = fieldWeight in 4412, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.008402393 = weight(abstract_txt:from in 4412) [ClassicSimilarity], result of:
            0.008402393 = score(doc=4412,freq=1.0), product of:
              0.05568016 = queryWeight, product of:
                1.1512512 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017527334 = queryNorm
              0.15090463 = fieldWeight in 4412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.18988867 = weight(abstract_txt:extraction in 4412) [ClassicSimilarity], result of:
            0.18988867 = score(doc=4412,freq=9.0), product of:
              0.18691891 = queryWeight, product of:
                1.7222686 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.017527334 = queryNorm
              1.015888 = fieldWeight in 4412, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.2059983 = weight(abstract_txt:execution in 4412) [ClassicSimilarity], result of:
            0.2059983 = score(doc=4412,freq=2.0), product of:
              0.32581204 = queryWeight, product of:
                2.2738292 = boost
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.017527334 = queryNorm
              0.63226116 = fieldWeight in 4412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.175107 = idf(docFreq=33, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.07019672 = weight(abstract_txt:techniques in 4412) [ClassicSimilarity], result of:
            0.07019672 = score(doc=4412,freq=2.0), product of:
              0.20026848 = queryWeight, product of:
                2.5211318 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.017527334 = queryNorm
              0.35051307 = fieldWeight in 4412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.04177587 = weight(abstract_txt:data in 4412) [ClassicSimilarity], result of:
            0.04177587 = score(doc=4412,freq=2.0), product of:
              0.16219923 = queryWeight, product of:
                2.778813 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.017527334 = queryNorm
              0.257559 = fieldWeight in 4412, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.15670909 = weight(abstract_txt:mining in 4412) [ClassicSimilarity], result of:
            0.15670909 = score(doc=4412,freq=1.0), product of:
              0.46427736 = queryWeight, product of:
                4.291736 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017527334 = queryNorm
              0.33753335 = fieldWeight in 4412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
          0.12260471 = weight(abstract_txt:process in 4412) [ClassicSimilarity], result of:
            0.12260471 = score(doc=4412,freq=3.0), product of:
              0.31968266 = queryWeight, product of:
                4.5046782 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.017527334 = queryNorm
              0.38352007 = fieldWeight in 4412, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0546875 = fieldNorm(doc=4412)
        0.36 = coord(9/25)
    
  2. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.24
    0.24436377 = sum of:
      0.24436377 = product of:
        1.0181824 = sum of:
          0.014404103 = weight(abstract_txt:from in 3899) [ClassicSimilarity], result of:
            0.014404103 = score(doc=3899,freq=1.0), product of:
              0.05568016 = queryWeight, product of:
                1.1512512 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017527334 = queryNorm
              0.25869364 = fieldWeight in 3899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.10850781 = weight(abstract_txt:extraction in 3899) [ClassicSimilarity], result of:
            0.10850781 = score(doc=3899,freq=1.0), product of:
              0.18691891 = queryWeight, product of:
                1.7222686 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.017527334 = queryNorm
              0.5805074 = fieldWeight in 3899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.08509127 = weight(abstract_txt:techniques in 3899) [ClassicSimilarity], result of:
            0.08509127 = score(doc=3899,freq=1.0), product of:
              0.20026848 = queryWeight, product of:
                2.5211318 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.017527334 = queryNorm
              0.424886 = fieldWeight in 3899, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.101280004 = weight(abstract_txt:data in 3899) [ClassicSimilarity], result of:
            0.101280004 = score(doc=3899,freq=4.0), product of:
              0.16219923 = queryWeight, product of:
                2.778813 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.017527334 = queryNorm
              0.6244173 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.5372883 = weight(abstract_txt:mining in 3899) [ClassicSimilarity], result of:
            0.5372883 = score(doc=3899,freq=4.0), product of:
              0.46427736 = queryWeight, product of:
                4.291736 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017527334 = queryNorm
              1.1572572 = fieldWeight in 3899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
          0.17161086 = weight(abstract_txt:process in 3899) [ClassicSimilarity], result of:
            0.17161086 = score(doc=3899,freq=2.0), product of:
              0.31968266 = queryWeight, product of:
                4.5046782 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.017527334 = queryNorm
              0.5368163 = fieldWeight in 3899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.09375 = fieldNorm(doc=3899)
        0.24 = coord(6/25)
    
  3. Benoit, G.: Data mining (2002) 0.23
    0.23315868 = sum of:
      0.23315868 = product of:
        0.7286209 = sum of:
          0.0063674496 = weight(abstract_txt:this in 5296) [ClassicSimilarity], result of:
            0.0063674496 = score(doc=5296,freq=1.0), product of:
              0.042339675 = queryWeight, product of:
                1.0039073 = boost
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.017527334 = queryNorm
              0.15038967 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4062347 = idf(docFreq=10885, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.009147871 = weight(abstract_txt:information in 5296) [ClassicSimilarity], result of:
            0.009147871 = score(doc=5296,freq=2.0), product of:
              0.042786542 = queryWeight, product of:
                1.0091913 = boost
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.017527334 = queryNorm
              0.21380253 = fieldWeight in 5296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4188995 = idf(docFreq=10748, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.016632425 = weight(abstract_txt:from in 5296) [ClassicSimilarity], result of:
            0.016632425 = score(doc=5296,freq=3.0), product of:
              0.05568016 = queryWeight, product of:
                1.1512512 = boost
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.017527334 = queryNorm
              0.29871368 = fieldWeight in 5296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.759399 = idf(docFreq=7646, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.030106999 = weight(abstract_txt:models in 5296) [ClassicSimilarity], result of:
            0.030106999 = score(doc=5296,freq=1.0), product of:
              0.10419616 = queryWeight, product of:
                1.285879 = boost
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.017527334 = queryNorm
              0.28894538 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.623126 = idf(docFreq=1185, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.10230214 = weight(abstract_txt:extraction in 5296) [ClassicSimilarity], result of:
            0.10230214 = score(doc=5296,freq=2.0), product of:
              0.18691891 = queryWeight, product of:
                1.7222686 = boost
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.017527334 = queryNorm
              0.5473076 = fieldWeight in 5296, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.192079 = idf(docFreq=246, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.08269478 = weight(abstract_txt:data in 5296) [ClassicSimilarity], result of:
            0.08269478 = score(doc=5296,freq=6.0), product of:
              0.16219923 = queryWeight, product of:
                2.778813 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.017527334 = queryNorm
              0.5098346 = fieldWeight in 5296, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.4004711 = weight(abstract_txt:mining in 5296) [ClassicSimilarity], result of:
            0.4004711 = score(doc=5296,freq=5.0), product of:
              0.46427736 = queryWeight, product of:
                4.291736 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017527334 = queryNorm
              0.8625686 = fieldWeight in 5296, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
          0.080898136 = weight(abstract_txt:process in 5296) [ClassicSimilarity], result of:
            0.080898136 = score(doc=5296,freq=1.0), product of:
              0.31968266 = queryWeight, product of:
                4.5046782 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.017527334 = queryNorm
              0.25305763 = fieldWeight in 5296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.0625 = fieldNorm(doc=5296)
        0.32 = coord(8/25)
    
  4. Saz, J.T.: Perspectivas en recuperacion y explotacion de informacion electronica : el 'data mining' (1997) 0.20
    0.2025213 = sum of:
      0.2025213 = product of:
        1.2657582 = sum of:
          0.14181879 = weight(abstract_txt:techniques in 4723) [ClassicSimilarity], result of:
            0.14181879 = score(doc=4723,freq=1.0), product of:
              0.20026848 = queryWeight, product of:
                2.5211318 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.017527334 = queryNorm
              0.70814335 = fieldWeight in 4723, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.15625 = fieldNorm(doc=4723)
          0.14618509 = weight(abstract_txt:data in 4723) [ClassicSimilarity], result of:
            0.14618509 = score(doc=4723,freq=3.0), product of:
              0.16219923 = queryWeight, product of:
                2.778813 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.017527334 = queryNorm
              0.9012687 = fieldWeight in 4723, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.15625 = fieldNorm(doc=4723)
          0.7755089 = weight(abstract_txt:mining in 4723) [ClassicSimilarity], result of:
            0.7755089 = score(doc=4723,freq=3.0), product of:
              0.46427736 = queryWeight, product of:
                4.291736 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017527334 = queryNorm
              1.6703569 = fieldWeight in 4723, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.15625 = fieldNorm(doc=4723)
          0.20224534 = weight(abstract_txt:process in 4723) [ClassicSimilarity], result of:
            0.20224534 = score(doc=4723,freq=1.0), product of:
              0.31968266 = queryWeight, product of:
                4.5046782 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.017527334 = queryNorm
              0.63264406 = fieldWeight in 4723, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.15625 = fieldNorm(doc=4723)
        0.16 = coord(4/25)
    
  5. Garcia Marco, F.J.: El factor humano en los sistemas de información (2003) 0.20
    0.2025213 = sum of:
      0.2025213 = product of:
        1.2657582 = sum of:
          0.14181879 = weight(abstract_txt:techniques in 1929) [ClassicSimilarity], result of:
            0.14181879 = score(doc=1929,freq=1.0), product of:
              0.20026848 = queryWeight, product of:
                2.5211318 = boost
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.017527334 = queryNorm
              0.70814335 = fieldWeight in 1929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5321174 = idf(docFreq=1298, maxDocs=44421)
                0.15625 = fieldNorm(doc=1929)
          0.14618509 = weight(abstract_txt:data in 1929) [ClassicSimilarity], result of:
            0.14618509 = score(doc=1929,freq=3.0), product of:
              0.16219923 = queryWeight, product of:
                2.778813 = boost
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.017527334 = queryNorm
              0.9012687 = fieldWeight in 1929, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3302255 = idf(docFreq=4320, maxDocs=44421)
                0.15625 = fieldNorm(doc=1929)
          0.7755089 = weight(abstract_txt:mining in 1929) [ClassicSimilarity], result of:
            0.7755089 = score(doc=1929,freq=3.0), product of:
              0.46427736 = queryWeight, product of:
                4.291736 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.017527334 = queryNorm
              1.6703569 = fieldWeight in 1929, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.15625 = fieldNorm(doc=1929)
          0.20224534 = weight(abstract_txt:process in 1929) [ClassicSimilarity], result of:
            0.20224534 = score(doc=1929,freq=1.0), product of:
              0.31968266 = queryWeight, product of:
                4.5046782 = boost
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.017527334 = queryNorm
              0.63264406 = fieldWeight in 1929, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.048922 = idf(docFreq=2105, maxDocs=44421)
                0.15625 = fieldNorm(doc=1929)
        0.16 = coord(4/25)
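
The score breakdowns above follow the pattern printed in each tree: a term's score is queryWeight (boost × idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), with tf equal to the square root of the term frequency, and the sum over matching terms is multiplied by the coord factor. The sketch below recomputes the first entry (doc 4412) from the values listed for it. The function and variable names are illustrative assumptions, not part of any library, and small rounding differences are expected because the listing appears to use single-precision values.

```python
import math

def term_score(freq, idf, boost, query_norm, field_norm):
    """Recompute one 'weight(...)' line from the values shown in the listing."""
    tf = math.sqrt(freq)                      # tf(freq) as printed in the tree
    query_weight = boost * idf * query_norm   # "queryWeight, product of"
    field_weight = tf * idf * field_norm      # "fieldWeight ..., product of"
    return query_weight * field_weight

# Values copied from the explanation for doc 4412.
QUERY_NORM = 0.017527334
FIELD_NORM = 0.0546875

# (term, freq, boost, idf)
terms = [
    ("this",        3.0, 1.0039073, 2.4062347),
    ("information", 8.0, 1.0091913, 2.4188995),
    ("from",        1.0, 1.1512512, 2.759399),
    ("extraction",  9.0, 1.7222686, 6.192079),
    ("execution",   2.0, 2.2738292, 8.175107),
    ("techniques",  2.0, 2.5211318, 4.5321174),
    ("data",        2.0, 2.778813,  3.3302255),
    ("mining",      1.0, 4.291736,  6.1720386),
    ("process",     3.0, 4.5046782, 4.048922),
]

# Sum of the per-term scores, then scale by coord(9/25):
# 9 of the 25 query terms matched this document.
total = sum(term_score(f, idf, b, QUERY_NORM, FIELD_NORM) for _, f, b, idf in terms)
coord = 9 / 25
score = total * coord

print(f"sum of term scores: {total:.7f}")
print(f"final score:        {score:.7f}")
```

The same function can be applied to the other entries by substituting their freq, fieldNorm, and coord values; only the per-document inputs change, while boost, idf, and queryNorm stay fixed for a given query term.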