Document (#43503)

Author
Goldberg, D.M.
Zaman, N.
Brahma, A.
Aloiso, M.
Title
Are mortgage loan closing delay risks predictable? : A predictive analysis using text mining on discussion threads
Source
Journal of the Association for Information Science and Technology. 73(2022) no.3, pp.419-437
Year
2022
Abstract
Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out-of-the-box sentiment analysis techniques, two dictionary-based and two rule-based, to predict delays. We contrast these approaches with domain-specific approaches, including firm-provided keyword searches and "smoke terms" derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top-ranking records well, performance quickly declines thereafter. The firm-provided keyword searches perform at the rate of random chance. We observe that the domain-specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24559.
Theme
Data Mining

Similar documents (author)

  1. Goldberg, M.: CD-ROM periodical indexes : better evaluation necessary (1992) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:goldberg in 8302) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 8302, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=8302)
    
  2. Goldberg, J.E.: Library of Congress Classification : shelving device for collections or organization of knowledge fields? (1996) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:goldberg in 4647) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 4647, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=4647)
    
  3. Goldberg, E.: ¬The retrieval problem in photography (1932) (1992) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:goldberg in 322) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 322, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=322)
    
  4. Goldberg, J.: Classification of religion in LCC (2000) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:goldberg in 6402) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 6402, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=6402)
    
  5. Goldberg, E.: ¬Die Regie im Gehirn : Wo wir Pläne schmieden und Entscheidungen treffen (2002) 5.87
    5.874302 = sum of:
      5.874302 = weight(author_txt:goldberg in 2343) [ClassicSimilarity], result of:
        5.874302 = fieldWeight in 2343, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.398883 = idf(docFreq=9, maxDocs=44421)
          0.625 = fieldNorm(doc=2343)
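
The author scores above all reduce to one product, tf * idf * fieldNorm. The following minimal sketch recomputes that value, assuming the standard ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and not part of any Lucene API, and fieldNorm is taken directly from the listing rather than derived.

```python
import math

def classic_tf(freq):
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight as shown in the explain tree: tf * idf * fieldNorm.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the author_txt:goldberg entries above.
score = field_weight(freq=1.0, doc_freq=9, max_docs=44421, field_norm=0.625)
print(f"{score:.4f}")
```

Because all five author matches share the same termFreq, docFreq, and fieldNorm, they receive the same score, so the ranking among them is arbitrary.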
    

Similar documents (content)

  1. Taylor, N.J.; Dennis, A.R.; Cummings, J.W.: Situation normality and the shape of search : the effects of time delays and information presentation on search behavior (2013) 0.13
    0.1250869 = sum of:
      0.1250869 = product of:
        1.042391 = sum of:
          0.0214425 = weight(abstract_txt:searches in 1741) [ClassicSimilarity], result of:
            0.0214425 = score(doc=1741,freq=1.0), product of:
              0.0656095 = queryWeight, product of:
                1.18694 = boost
                5.229121 = idf(docFreq=646, maxDocs=44421)
                0.010570833 = queryNorm
              0.32682008 = fieldWeight in 1741, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.229121 = idf(docFreq=646, maxDocs=44421)
                0.0625 = fieldNorm(doc=1741)
          0.17362279 = weight(abstract_txt:delay in 1741) [ClassicSimilarity], result of:
            0.17362279 = score(doc=1741,freq=4.0), product of:
              0.1666611 = queryWeight, product of:
                1.8917447 = boost
                8.334172 = idf(docFreq=28, maxDocs=44421)
                0.010570833 = queryNorm
              1.0417715 = fieldWeight in 1741, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.334172 = idf(docFreq=28, maxDocs=44421)
                0.0625 = fieldNorm(doc=1741)
          0.8473256 = weight(abstract_txt:delays in 1741) [ClassicSimilarity], result of:
            0.8473256 = score(doc=1741,freq=7.0), product of:
              0.57388437 = queryWeight, product of:
                6.080205 = boost
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.010570833 = queryNorm
              1.4764745 = fieldWeight in 1741, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                8.928879 = idf(docFreq=15, maxDocs=44421)
                0.0625 = fieldNorm(doc=1741)
        0.12 = coord(3/25)
    
  2. Zhang, C.; Zeng, D.; Li, J.; Wang, F.-Y.; Zuo, W.: Sentiment analysis of Chinese documents : from sentence to document level (2009) 0.09
    0.09199372 = sum of:
      0.09199372 = product of:
        0.4599686 = sum of:
          0.011615555 = weight(abstract_txt:using in 283) [ClassicSimilarity], result of:
            0.011615555 = score(doc=283,freq=1.0), product of:
              0.04300974 = queryWeight, product of:
                1.1769946 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.010570833 = queryNorm
              0.27006802 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=283)
          0.044074282 = weight(abstract_txt:mining in 283) [ClassicSimilarity], result of:
            0.044074282 = score(doc=283,freq=1.0), product of:
              0.09140429 = queryWeight, product of:
                1.4009695 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.010570833 = queryNorm
              0.48219052 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.078125 = fieldNorm(doc=283)
          0.08713204 = weight(abstract_txt:predict in 283) [ClassicSimilarity], result of:
            0.08713204 = score(doc=283,freq=1.0), product of:
              0.16481322 = queryWeight, product of:
                2.3040242 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.010570833 = queryNorm
              0.5286714 = fieldWeight in 283, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.078125 = fieldNorm(doc=283)
          0.23982817 = weight(abstract_txt:sentiment in 283) [ClassicSimilarity], result of:
            0.23982817 = score(doc=283,freq=4.0), product of:
              0.20391709 = queryWeight, product of:
                2.5628185 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.010570833 = queryNorm
              1.1761063 = fieldWeight in 283, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=283)
          0.07731854 = weight(abstract_txt:approaches in 283) [ClassicSimilarity], result of:
            0.07731854 = score(doc=283,freq=2.0), product of:
              0.15219344 = queryWeight, product of:
                3.131151 = boost
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.010570833 = queryNorm
              0.5080281 = fieldWeight in 283, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.078125 = fieldNorm(doc=283)
        0.2 = coord(5/25)
    
  3. Thelwall, M.; Buckley, K.; Paltoglou, G.; Cai, D.; Kappas, A.: Sentiment strength detection in short informal text (2010) 0.07
    0.07355492 = sum of:
      0.07355492 = product of:
        0.36777458 = sum of:
          0.009292444 = weight(abstract_txt:using in 200) [ClassicSimilarity], result of:
            0.009292444 = score(doc=200,freq=1.0), product of:
              0.04300974 = queryWeight, product of:
                1.1769946 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.010570833 = queryNorm
              0.21605442 = fieldWeight in 200, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.0625 = fieldNorm(doc=200)
          0.030529685 = weight(abstract_txt:discussion in 200) [ClassicSimilarity], result of:
            0.030529685 = score(doc=200,freq=1.0), product of:
              0.0950521 = queryWeight, product of:
                1.7497336 = boost
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.010570833 = queryNorm
              0.32118896 = fieldWeight in 200, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.0625 = fieldNorm(doc=200)
          0.06970563 = weight(abstract_txt:predict in 200) [ClassicSimilarity], result of:
            0.06970563 = score(doc=200,freq=1.0), product of:
              0.16481322 = queryWeight, product of:
                2.3040242 = boost
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.010570833 = queryNorm
              0.4229371 = fieldWeight in 200, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7669935 = idf(docFreq=138, maxDocs=44421)
                0.0625 = fieldNorm(doc=200)
          0.21450885 = weight(abstract_txt:sentiment in 200) [ClassicSimilarity], result of:
            0.21450885 = score(doc=200,freq=5.0), product of:
              0.20391709 = queryWeight, product of:
                2.5628185 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.010570833 = queryNorm
              1.0519415 = fieldWeight in 200, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0625 = fieldNorm(doc=200)
          0.043737974 = weight(abstract_txt:approaches in 200) [ClassicSimilarity], result of:
            0.043737974 = score(doc=200,freq=1.0), product of:
              0.15219344 = queryWeight, product of:
                3.131151 = boost
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.010570833 = queryNorm
              0.2873841 = fieldWeight in 200, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.0625 = fieldNorm(doc=200)
        0.2 = coord(5/25)
    
  4. Wei, W.; Liu, Y.-P.; Wei, L.-R.: Feature-level sentiment analysis based on rules and fine-grained domain ontology (2020) 0.07
    0.06516594 = sum of:
      0.06516594 = product of:
        0.40728712 = sum of:
          0.034334417 = weight(abstract_txt:domain in 876) [ClassicSimilarity], result of:
            0.034334417 = score(doc=876,freq=3.0), product of:
              0.053656515 = queryWeight, product of:
                1.0733877 = boost
                4.7288613 = idf(docFreq=1066, maxDocs=44421)
                0.010570833 = queryNorm
              0.6398928 = fieldWeight in 876, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7288613 = idf(docFreq=1066, maxDocs=44421)
                0.078125 = fieldNorm(doc=876)
          0.011615555 = weight(abstract_txt:using in 876) [ClassicSimilarity], result of:
            0.011615555 = score(doc=876,freq=1.0), product of:
              0.04300974 = queryWeight, product of:
                1.1769946 = boost
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.010570833 = queryNorm
              0.27006802 = fieldWeight in 876, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4568708 = idf(docFreq=3806, maxDocs=44421)
                0.078125 = fieldNorm(doc=876)
          0.044074282 = weight(abstract_txt:mining in 876) [ClassicSimilarity], result of:
            0.044074282 = score(doc=876,freq=1.0), product of:
              0.09140429 = queryWeight, product of:
                1.4009695 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.010570833 = queryNorm
              0.48219052 = fieldWeight in 876, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.078125 = fieldNorm(doc=876)
          0.31726286 = weight(abstract_txt:sentiment in 876) [ClassicSimilarity], result of:
            0.31726286 = score(doc=876,freq=7.0), product of:
              0.20391709 = queryWeight, product of:
                2.5628185 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.010570833 = queryNorm
              1.5558424 = fieldWeight in 876, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.078125 = fieldNorm(doc=876)
        0.16 = coord(4/25)
    
  5. Pang, B.; Lee, L.: Opinion mining and sentiment analysis (2008) 0.06
    0.06433244 = sum of:
      0.06433244 = product of:
        0.3216622 = sum of:
          0.015545524 = weight(abstract_txt:provided in 2171) [ClassicSimilarity], result of:
            0.015545524 = score(doc=2171,freq=1.0), product of:
              0.057878144 = queryWeight, product of:
                1.1148148 = boost
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.010570833 = queryNorm
              0.26859057 = fieldWeight in 2171, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9113703 = idf(docFreq=888, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2171)
          0.05343722 = weight(abstract_txt:mining in 2171) [ClassicSimilarity], result of:
            0.05343722 = score(doc=2171,freq=3.0), product of:
              0.09140429 = queryWeight, product of:
                1.4009695 = boost
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.010570833 = queryNorm
              0.5846249 = fieldWeight in 2171, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1720386 = idf(docFreq=251, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2171)
          0.026713476 = weight(abstract_txt:discussion in 2171) [ClassicSimilarity], result of:
            0.026713476 = score(doc=2171,freq=1.0), product of:
              0.0950521 = queryWeight, product of:
                1.7497336 = boost
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.010570833 = queryNorm
              0.28104034 = fieldWeight in 2171, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1390233 = idf(docFreq=707, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2171)
          0.18769525 = weight(abstract_txt:sentiment in 2171) [ClassicSimilarity], result of:
            0.18769525 = score(doc=2171,freq=5.0), product of:
              0.20391709 = queryWeight, product of:
                2.5628185 = boost
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.010570833 = queryNorm
              0.92044884 = fieldWeight in 2171, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.5270805 = idf(docFreq=64, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2171)
          0.038270727 = weight(abstract_txt:approaches in 2171) [ClassicSimilarity], result of:
            0.038270727 = score(doc=2171,freq=1.0), product of:
              0.15219344 = queryWeight, product of:
                3.131151 = boost
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.010570833 = queryNorm
              0.2514611 = fieldWeight in 2171, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5981455 = idf(docFreq=1215, maxDocs=44421)
                0.0546875 = fieldNorm(doc=2171)
        0.2 = coord(5/25)
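
The content scores combine a per-term product, queryWeight * fieldWeight, with a coordination factor for the share of query terms matched. The sketch below recomputes the total for the first content match (doc 1741) from the values in the listing. It assumes the standard ClassicSimilarity formulas. The helper names are hypothetical, and boost, queryNorm, and fieldNorm are copied from the explain output rather than derived.

```python
import math

QUERY_NORM = 0.010570833  # queryNorm from the listing
MAX_DOCS = 44421          # maxDocs from the listing

def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight for a single matching term.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# (freq, docFreq, boost, fieldNorm) for the three terms matched in doc 1741.
terms = [
    (1.0, 646, 1.18694, 0.0625),   # searches
    (4.0, 28, 1.8917447, 0.0625),  # delay
    (7.0, 15, 6.080205, 0.0625),   # delays
]

coord = 3 / 25  # 3 of the 25 query terms matched
total = sum(term_score(*t) for t in terms) * coord
print(f"{total:.5f}")
```

The rare, heavily boosted term "delays" contributes most of the sum, which is why a document about search time delays outranks the sentiment analysis papers despite matching fewer query terms.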