Predicting Mobile Financial Service Adoption with Machine Learning in 2021


Mobile money in Africa has quickly evolved from its traditional role as a payment service into a gateway through which millions of people across the continent access a growing range of financial products and services. For banks and other traditional financial and marketing services companies, future profitability will depend largely on their ability to work with mobile service providers and target network subscribers with relevant financial services.

There are attractive business opportunities in effective customer targeting and cross-selling: for banks, digital channels offer a low-cost way to grow deposits and loans; for mobile service providers, offering digital finance products that meet subscribers' needs increases engagement and retention.

In this post, I explore how machine learning can classify individuals into one of four categories based on the types of financial services they are likely to use. This is an example of multiclass classification, where an algorithm learns a mapping function between a given set of input features and a categorical target variable that takes on more than two values.

The data

The dataset for this project was originally prepared for the Tanzania Mobile Money and Financial Inclusion Challenge and provided by Zindi. The training data contains 36 socioeconomic and demographic characteristics of approximately 7,100 individuals in Tanzania. The target variable describes the types of financial services used by an individual, grouped into four mutually exclusive categories:

0. No_financial_services: individuals who do not use mobile money, do not save, have no credit, and are not insured

1. Other_only: individuals who do not use mobile money, but use at least one of the other financial services (savings, credit, insurance)

2. Mm_only: individuals who use only mobile money

3. Mm_plus: individuals who use mobile money and also use at least one of the other financial services (savings, credit, insurance)

To enrich the training data, a geographic map of all financial access points in the country (for example, ATMs, bank branches, and mobile money agents) was provided. ArcGIS also provided regional, demographic, economic, and other data for creating additional input features.

Exploratory data analysis

EDA is an important first step that gives us a feel for the data and lets us form initial hypotheses about the functional form of the relationship between the input features and the target.

Read the data

First, read the data and replace the raw column codes with more intuitive column names for easier analysis.
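A minimal sketch of this step with pandas is shown here. The inline CSV, the raw codes (Q1, Q2) and the readable names are hypothetical stand-ins, since the post does not list the real column codes.

```python
import io

import pandas as pd

# Hypothetical stand-in for the raw training file; the real file has many more columns.
raw_csv = io.StringIO(
    "ID,Q1,Q2,Latitude,Longitude\n"
    "1,34,1,-6.8,39.3\n"
    "2,51,2,-3.4,36.7\n"
)

# Hypothetical mapping from raw codes to readable names.
column_names = {"Q1": "age", "Q2": "gender"}

# Read the data, then rename only the columns listed in the mapping.
df = pd.read_csv(raw_csv).rename(columns=column_names)
print(df.columns.tolist())
```

With a real file, the io.StringIO object would simply be replaced by the path to the CSV.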

Data pre-processing

When dealing with categorical features, category levels with few training examples carry little information value and add unnecessarily to the number of features once we encode them, so we merge rare levels. We will talk about encoding later.
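One way to handle rare category levels is to fold them into a single catch-all level. The sketch below assumes a hypothetical "occupation" column and an arbitrary threshold of two examples; neither comes from the original project.

```python
import pandas as pd


def merge_rare_levels(series, min_count, other_label="Other"):
    """Replace levels seen fewer than min_count times with a single label."""
    counts = series.value_counts()
    rare = counts[counts < min_count].index
    # Keep values that are not rare; replace the rest with other_label.
    return series.where(~series.isin(rare), other_label)


# Hypothetical categorical column.
occupation = pd.Series(
    ["farmer", "farmer", "trader", "farmer", "pilot", "trader", "nurse"]
)
merged = merge_rare_levels(occupation, min_count=2)
print(merged.tolist())
```

The threshold is a judgment call: a higher value gives fewer, better-populated levels at the cost of losing detail.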


We have seen that much of our dataset consists of nominal features, with some ordinal data. Now that we have reduced the number of categorical features and rare category levels, we use pandas' pd.get_dummies to convert them into dummy variables, remembering to pass the drop_first=True argument to avoid multicollinearity.
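As a small illustration of the dummy-variable step, the sketch below encodes a hypothetical "marital_status" column. The column names and values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data frame with one nominal and one numeric column.
df = pd.DataFrame(
    {
        "marital_status": ["married", "single", "widowed", "single"],
        "age": [34, 22, 61, 45],
    }
)

# drop_first=True drops the first level ("married"), which becomes the
# reference category, so the remaining dummies are not perfectly collinear.
encoded = pd.get_dummies(df, columns=["marital_status"], drop_first=True)
print(encoded.columns.tolist())
```

A row with zeros in both dummy columns therefore represents the dropped reference level.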


For continuous features, such as distance to the nearest financial access point and “Age”, we apply equal-frequency binning to reduce the effect of skew and outliers.
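One common way to bin a continuous feature so that each bin holds roughly the same number of rows is pandas' qcut. The sketch below uses made-up distances and four bins; the number of bins used in the original project is not stated, so this is an assumption.

```python
import pandas as pd

# Hypothetical, heavily right-skewed distances (in km) to the nearest access point.
distance_km = pd.Series(
    [0.2, 0.5, 0.9, 1.4, 2.0, 3.5, 7.0, 12.0, 40.0, 95.0, 150.0, 300.0]
)

# qcut places bin edges at quantiles, so each bin receives a similar number of rows.
# labels=False returns integer bin codes instead of interval labels.
distance_bin = pd.qcut(distance_km, q=4, labels=False)
print(distance_bin.value_counts().sort_index())
```

Because the edges follow the quantiles, the few very large distances end up sharing a bin instead of stretching the scale.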

Target encoding

The location features we obtained through reverse geocoding, such as ‘district’, are high-cardinality nominal features.

Dummy-encoding ‘district’ would increase the dimensionality of the data by 168 features. With limited training examples, this can lead to overfitting. Instead, we use the M-estimate encoder from the category_encoders library to calculate the mean target value for each ‘district’ and replace each ‘district’ with the calculated mean. To reduce overfitting, the M-estimate uses additive smoothing to combine the ‘district’ mean with the mean across all districts. For ‘district’ values with few training examples, the M-estimate relies more heavily on the overall mean, and vice versa.
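To make the smoothing concrete, here is a hand-rolled sketch of additive smoothing in plain Python. It is not the library's implementation. It assumes a binary target (for example, an indicator for one class), made-up district names, and a smoothing weight m = 1.

```python
from collections import defaultdict


def m_estimate_encode(categories, targets, m=1.0):
    """Return {category: smoothed target mean} using additive smoothing."""
    prior = sum(targets) / len(targets)  # overall target mean
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # With few rows, the prior term dominates; with many rows, the category mean does.
    return {c: (sums[c] + prior * m) / (counts[c] + m) for c in counts}


# Hypothetical districts: "A" has four rows, "B" has only one.
districts = ["A", "A", "A", "A", "B"]
target = [1, 1, 0, 1, 0]
encoding = m_estimate_encode(districts, target, m=1.0)
print(encoding)
```

In this toy data the raw mean for "A" is 0.75 and for "B" is 0.0, while the overall mean is 0.6. The single-row district "B" is pulled much further towards the overall mean than "A" is.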


Now let’s look at the relative performance of three well-known classifiers on our final dataset: support vector machines, logistic regression, and random forests.

SVMs classify observations by looking for the optimal hyperplane in n-dimensional space that separates positive and negative examples. When classes cannot be separated linearly, SVMs use kernel functions to map the original input space to a higher-dimensional feature space in which to search for an optimal decision boundary.

Logistic regression assumes a linear relationship between the input features and the log-odds of class membership. The optimal feature weights maximize the likelihood of the observed training data.
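A comparison of this kind can be sketched with scikit-learn as follows. The data is synthetic (four classes generated with make_classification), and the hyperparameters are defaults or arbitrary choices, so the scores say nothing about the results on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic four-class data standing in for the prepared feature matrix.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, n_classes=4, random_state=0
)

models = {
    # Scaling matters for SVMs and helps logistic regression converge.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean() for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Cross-validation is used here so that each model is scored on data it was not trained on.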

Baseline accuracy

To establish a baseline score, we use the Zero Rule classifier. This model simply predicts the majority class for all samples. This baseline strategy yields 44% accuracy.
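The majority-class baseline is simple enough to sketch in plain Python. The labels below are made up for illustration and do not reproduce the class distribution of the real data.

```python
from collections import Counter


def zero_rule_accuracy(train_labels, test_labels):
    """Predict the most frequent training label for every test row."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


# Hypothetical labels.
train = (
    ["mm_plus"] * 4
    + ["no_financial_services"] * 3
    + ["mm_only"] * 2
    + ["other_only"]
)
test = ["mm_plus", "mm_only", "mm_plus", "no_financial_services", "other_only"]

label, accuracy = zero_rule_accuracy(train, test)
print(label, accuracy)
```

Any trained model should beat this score; if it does not, it has learned nothing beyond the class frequencies.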

Class imbalance

In imbalanced datasets, accuracy is not always an appropriate measure of performance. With extreme class imbalance, even weak models can score high. The confusion matrix for the SVM model gives an idea of how the model performed on each class label.

The confusion matrix shows that ‘Mm_only’ is almost always misclassified as ‘Mm_plus’, the majority label, so recall for ‘Mm_only’ is close to zero. The model also misclassifies nearly 50% of ‘No_financial_services’ as ‘Other_only’.

Returning to our previous discussion of the trade-offs that come with imbalanced datasets, we noted that the goal of the challenge was to accurately predict ‘Mm_plus’, the majority class label. This is analogous to a business strategy that prioritizes a specific customer segment. Here the model achieves 78% precision and 91% recall, significantly higher than for the other class labels, which is a satisfying result for the task.
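Per-class precision and recall can be read directly off the confusion counts. The sketch below uses a handful of made-up predictions, not the SVM's actual output, to show how a minority class can score zero while the majority class looks reasonable.

```python
from collections import Counter


def precision_recall(y_true, y_pred, label):
    """Compute precision and recall for one class from (true, predicted) pairs."""
    counts = Counter(zip(y_true, y_pred))  # confusion-matrix cells
    true_positives = counts[(label, label)]
    predicted = sum(n for (_, p), n in counts.items() if p == label)
    actual = sum(n for (t, _), n in counts.items() if t == label)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical labels: every "mm_only" row is predicted as "mm_plus".
y_true = ["mm_plus", "mm_plus", "mm_plus", "mm_only", "mm_only", "other_only"]
y_pred = ["mm_plus", "mm_plus", "other_only", "mm_plus", "mm_plus", "other_only"]

plus_precision, plus_recall = precision_recall(y_true, y_pred, "mm_plus")
only_precision, only_recall = precision_recall(y_true, y_pred, "mm_only")
print(plus_precision, plus_recall, only_precision, only_recall)
```

Precision answers "of the rows predicted as this class, how many were right", while recall answers "of the rows truly in this class, how many were found".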


The purpose of this post was to provide an end-to-end view of how we can use supervised learning to refine customer segmentation and cross-selling strategies for mobile network operators and financial services providers. Support vector machines, logistic regression, and random forests have shown promising results in splitting individuals into mutually exclusive categories representing different levels of financial inclusion. However, there are several ways to improve model performance, including tuning hyperparameters and engineering additional features.
