Scorecard - Credit Engine

Credit risk scorecards are models that give a quantitative estimate of the probability that a customer will show delinquent behavior, such as bankruptcy, loan default, or other nonpayment, given their current or proposed credit position with a lender. To deliver fast, automated decisions, we have developed an automated credit risk engine. This post and later ones introduce the most common ML models used in the consumer finance credit risk space.

Executive Summary

The credit model uses information from the application, such as salary, credit commitments, and past loan performance, to determine the credit score of an applicant or an existing customer. The score represents how likely the lender is to be repaid on time if it grants that person a loan or a credit card.

Building a Credit Scorecard

The target variable is usually binary: depending on the data, it can be coded 0 for performing (non-defaulted) customers and 1 for defaulted customers. Many credit scoring techniques can be used to build a scorecard engine. Logistic regression is the most commonly used because it is robust, transparent, and easy to interpret. The steps for building the scorecard engine are listed below.

Step 1: Data Exploration, Analysis and Cleaning
Step 2: Data Transformation using the Weight of Evidence (WoE) Method
Step 3: Feature Selection using Information Value (IV)
Step 4: Model Fitting & Interpreting Results

Step 1: Data Exploration, Analysis and Cleaning

While building the credit scoring model, we performed an exploratory analysis of the data and computed summary statistics for the variables. The data were then split into training and testing datasets. During data cleaning, the continuous and discrete variables were processed and checked for missing values.
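
As a rough illustration of the missing-value check and the train/test split described above, here is a minimal standard-library sketch. The record layout, the column names (`age`, `income`, `default`), the 30% test fraction, and the helper names `missing_counts` and `train_test_split` are all assumptions made for this example; they are not taken from the engine itself.

```python
import random
from collections import Counter

def missing_counts(rows):
    """Count missing (None) values per column in a list of dict records."""
    counts = Counter()
    for row in rows:
        for col, value in row.items():
            if value is None:
                counts[col] += 1
    return dict(counts)

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical applicant records; None marks a missing value.
applicants = [
    {"age": 25, "income": 32000, "default": 0},
    {"age": 41, "income": None, "default": 1},
    {"age": 37, "income": 54000, "default": 0},
    {"age": None, "income": 47000, "default": 0},
    {"age": 52, "income": 61000, "default": 1},
    {"age": 29, "income": 28000, "default": 0},
    {"age": 45, "income": 75000, "default": 0},
    {"age": 33, "income": 39000, "default": 1},
    {"age": 58, "income": 82000, "default": 0},
    {"age": 23, "income": 21000, "default": 1},
]

missing = missing_counts(applicants)
train, test = train_test_split(applicants, test_fraction=0.3)
```

In practice the split would probably be stratified on the target so that both datasets keep a similar default rate; this sketch leaves that out for brevity.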

Step 2: Data Transformation using the Weight of Evidence (WoE) Method

A default threshold first had to be defined to distinguish defaulted from non-defaulted clients. Because it is highly interpretable, logistic regression was used to model the probability of default. All independent variables (age, income, years at current job, accommodation status, etc.) were transformed with the WoE method before being used to predict the dependent variable.
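
To make the idea of a default threshold concrete, the sketch below turns a customer's worst arrears into the binary target (0 for performing, 1 for defaulted). The 90-days-past-due cut-off and the function name `default_flag` are assumptions for illustration only; the threshold actually used is not stated here.

```python
def default_flag(days_past_due, threshold_days=90):
    """Return 1 (defaulted) if arrears reach the threshold, else 0 (performing)."""
    return 1 if days_past_due >= threshold_days else 0

# Hypothetical worst arrears per customer, in days.
worst_arrears = [0, 15, 95, 30, 120]
targets = [default_flag(d) for d in worst_arrears]
```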

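WoE is calculated for each bin i of a variable. Under the common good-over-bad convention (some references invert the ratio, which flips the sign):

WoE_i = ln(% of non-defaulted customers in bin i / % of defaulted customers in bin i)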
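
As a minimal sketch, the WoE of each bin could be computed as follows. It assumes the good-over-bad convention, ln(share of non-defaulted customers in the bin / share of defaulted customers in the bin), and a variable that has already been binned. The function name `woe_by_bin` and the age bands are hypothetical.

```python
import math
from collections import defaultdict

def woe_by_bin(bins, targets):
    """Return {bin: WoE}, with WoE = ln(share of goods / share of bads)."""
    goods = defaultdict(int)
    bads = defaultdict(int)
    for b, y in zip(bins, targets):
        if y == 1:
            bads[b] += 1
        else:
            goods[b] += 1
    total_good = sum(goods.values())
    total_bad = sum(bads.values())
    woe = {}
    for b in sorted(set(bins)):
        # Assumes every bin holds at least one good and one bad customer;
        # otherwise the logarithm is undefined and bins should be merged.
        good_share = goods[b] / total_good
        bad_share = bads[b] / total_bad
        woe[b] = math.log(good_share / bad_share)
    return woe

# Hypothetical age bands and default flags (1 = defaulted).
age_band = ["<30"] * 4 + ["30-50"] * 6 + [">50"] * 4
default = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
woe = woe_by_bin(age_band, default)
```

Under this convention a negative WoE (here the "<30" band) marks a bin with a higher-than-average share of defaults, and a positive WoE marks a lower-than-average share.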

Step 3: Feature Selection using Information Value (IV)

Feature selection is the process of choosing a subset of the available features for use in a scorecard by eliminating those that are redundant or carry little predictive information. IV measures the predictive power of an independent variable, which makes it useful for feature selection: it shows how much information the original variable contributes to explaining the dependent variable. It is good practice to perform feature selection rather than include every feature in the model, because weak features add little predictive power and should be eliminated.

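IV is calculated as the sum over all bins i of a variable:

IV = Σ (% of non-defaulted customers in bin i - % of defaulted customers in bin i) × WoE_i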
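
The following sketch computes IV for a single binned variable from per-bin counts of non-defaulted ("good") and defaulted ("bad") customers, summing (good share - bad share) × WoE over the bins. The function name `information_value` and the counts are hypothetical, and any cut-off used to discard weak features would be the modeller's choice.

```python
import math

def information_value(bin_counts):
    """bin_counts maps bin -> (n_good, n_bad); returns the IV of the variable."""
    total_good = sum(g for g, _ in bin_counts.values())
    total_bad = sum(b for _, b in bin_counts.values())
    iv = 0.0
    for n_good, n_bad in bin_counts.values():
        # Assumes no empty cells; a bin with zero goods or bads must be merged first.
        good_share = n_good / total_good
        bad_share = n_bad / total_bad
        iv += (good_share - bad_share) * math.log(good_share / bad_share)
    return iv

# Hypothetical counts per age band: (non-defaulted, defaulted).
age_counts = {"<30": (2, 2), "30-50": (5, 1), ">50": (3, 1)}
iv_age = information_value(age_counts)

# A variable whose bins all have the same good/bad mix carries no information.
iv_flat = information_value({"a": (5, 2), "b": (5, 2)})
```

Each term of the sum is non-negative, so IV grows as the good and bad distributions across bins diverge, and it is zero when they are identical.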

Step 4: Model Fitting & Interpreting Results

After feature selection, each attribute is replaced with its corresponding WoE value. A logistic regression model is then fitted on the selected features to estimate the borrower's probability of default (PD). Scaling the model into a scorecard requires both the logistic regression coefficients from model fitting and the transformed WoE values. Standardized scorecards are easier for non-technical audiences to interpret and allow different PD models to be compared. Note that applying the scorecard is equivalent to applying the PD model itself: the scorecard produces an individual creditworthiness assessment that corresponds directly to a specific probability of default. The table below shows how the credit score is constructed.

Total Score = Σ Score_i
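
One widely used way to obtain the per-feature scores is "points to double the odds" (PDO) scaling. The scaling method used by the engine is not stated here, so the sketch below is an assumption, not a description of it. It further assumes that the logistic regression models the log-odds of default, that WoE follows the good-over-bad convention, and that the intercept and offset are spread evenly across features. The coefficients, WoE values, and scaling parameters (600 points at 50:1 good-to-bad odds, 20 points to double the odds) are hypothetical.

```python
import math

def scaling_constants(base_score=600, base_odds=50, pdo=20):
    """Factor and offset such that base_odds (good:bad) maps to base_score
    and the score rises by pdo points each time the odds double."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset

def attribute_points(coef, woe, intercept, n_features, factor, offset):
    """Score_i for one feature, spreading intercept and offset evenly."""
    return -(coef * woe + intercept / n_features) * factor + offset / n_features

# Hypothetical fitted model: log-odds of default = intercept + sum(coef * WoE).
intercept = -3.0
coefs = {"age": -0.8, "income": -1.1}
# Hypothetical WoE values of the bins one applicant falls into.
applicant_woe = {"age": 0.69, "income": -0.25}

factor, offset = scaling_constants()
points = {
    name: attribute_points(coefs[name], applicant_woe[name],
                           intercept, len(coefs), factor, offset)
    for name in coefs
}
total_score = sum(points.values())  # Total Score = sum of Score_i

# The same model expressed as a probability of default.
log_odds_bad = intercept + sum(coefs[n] * applicant_woe[n] for n in coefs)
pd_estimate = 1 / (1 + math.exp(-log_odds_bad))
```

With this scaling, the total score equals offset + factor × ln((1 - PD) / PD), which is why a given score corresponds to exactly one probability of default.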