ML models for Credit Risk Analysis

Our team has developed a solution that helps assess the credit risk of multiple companies on a daily basis.

The problem

To grant business loans, our client assesses the credit risk of multiple companies on a daily basis. To speed up and automate the process and to reduce human error, the client needs a solution that processes raw financial data and quickly generates risk ratios, which are then used to determine the insurance premium of each loan.

The solution

Our solution accepts raw data in .xlc format. The data is processed repeatedly, and a Loss Given Default (LGD) model trained by us is applied to the newly obtained data. LGD represents the share of the exposure that the financial institution would lose if the borrower defaulted, and estimating it involves comprehensive methods and calculations. The purpose of the LGD model is to predict this loss, taking into account factors such as the type of collateral, the loan segment, market conditions and others. Training and evaluating the LGD model requires careful data collection, selection of appropriate modeling algorithms, and the use of metrics to evaluate performance.
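To make the LGD definition concrete, here is a minimal sketch of how a realized LGD could be computed for a single defaulted loan and used as a training target. The function name, its parameters, and the figures are hypothetical and chosen for illustration only; the actual target definition used in the project is not described here.

```python
def realized_lgd(exposure_at_default: float, recovered_amount: float) -> float:
    """Return the realized LGD as a fraction of the exposure, clamped to [0, 1]."""
    if exposure_at_default <= 0:
        raise ValueError("exposure_at_default must be positive")
    loss = exposure_at_default - recovered_amount
    # Clamp so that over-recovery or negative recovery cannot leave [0, 1].
    return min(max(loss / exposure_at_default, 0.0), 1.0)


# Hypothetical defaulted loan: 100,000 outstanding, 35,000 recovered from collateral.
example_lgd = realized_lgd(100_000.0, 35_000.0)
print(f"{example_lgd:.2%}")  # prints 65.00%
```

Clamping to the [0, 1] range is a design choice for this sketch; some institutions allow values outside that range when recovery costs or over-recoveries are included.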

The second model is Probability of Default (PD). It represents the probability that a borrower defaults within a given period of time. This model is key to credit risk assessment and is the basis for calculating the capital needed to cover risks.
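One common way to combine PD and LGD is the expected loss formula, EL = PD x LGD x EAD, where EAD (exposure at default) is the amount outstanding when the borrower defaults. The sketch below illustrates that formula only; EAD is not mentioned in the project description, and whether the client's premium or capital calculation uses this exact formula is an assumption. All names and numbers are hypothetical.

```python
def expected_loss(pd_: float, lgd: float, ead: float) -> float:
    """Expected loss = PD * LGD * EAD (PD and LGD are fractions in [0, 1])."""
    for name, value in (("pd_", pd_), ("lgd", lgd)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd_ * lgd * ead


# Hypothetical loan: 2% one-year PD, 45% LGD, 200,000 exposure at default.
el = expected_loss(0.02, 0.45, 200_000.0)
print(f"{el:.2f}")  # prints 1800.00
```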

The whole process will be automated, and at the end our client will receive a PDF report summarizing the key values.

Our team was responsible for

Identifying and gathering requirements
UI/UX
Implementing the web user interface
Developing and training ML models

The client

The client is one of the largest Bulgarian commercial banks, and the solution was built for its Business Credits Department. Other credit and insurance organizations, along with other institutions that need to predict and assess risk, are potential users as well.

Frontend technologies

web UI – Angular

How we used ML

To process the raw data effectively, we use powerful machine learning tools and Python libraries. Scikit-learn helps with tasks such as data preprocessing and evaluating the performance of the models. XGBRegressor (from the XGBoost library) is used to build and train the regression models. For efficient data manipulation and numerical computations, we use Pandas and NumPy. Matplotlib is used for visualizing the results.