Customer churn prediction with Google AutoML
In today’s highly competitive business landscape, customer retention is crucial for the sustained success of any company. Customer churn, the rate at which customers discontinue doing business with a company, is a major challenge that businesses face. Google’s AutoML, a suite of ML tools that enable developers to build and deploy machine learning models with minimal effort, can help your business identify patterns and factors that lead to customer churn.
By using AutoML, your business can build accurate machine learning models that forecast customer attrition, allowing you to take proactive steps to improve customer satisfaction and retention.
In this article, written by Dawid Marczak, AI Data Scientist, based on a solution prepared by Piotr Zakrzewski, Senior AI Data Scientist, we cover how machine learning models and Google AutoML tools can help you predict customer churn, and prepare to implement retention strategies effectively.
Data preparation and exploration – the first step to predict customer churn
The first step is to examine the available data and what can be learned about the customers. Google Cloud Platform provides a great tool for business insights and visualisation of data trends – Looker. The data used here is a telecom dataset of roughly 7,000 users based in the UK, of which 1,869 are labelled as “churn”.
The dataset contains features such as Tenure, Contract (type), Total charges, Monthly charges, Payment method, Gender, and many indicators of which telecom services are being used, such as TV, Internet, Streaming Movies, etc.
In the sample dataset that we are analysing, one of the most important, if not the biggest, driving factors for customer retention is tenure – how long a customer has been using the company services. This sparks a potential “what comes first?” chicken and egg debate regarding retention and tenure, but it is not counter-intuitive that customers who have been loyal to a company for many years continue to be so. This also holds true for the IBM telecom data when comparing “churn” and “non-churn” users for different tenure values.
Further examination of the data reveals that the company offers three types of contract periods:
- Month-to-month
- One year
- Two year
This split into categories could also be an important churn driver, as becomes clear when we compare the average monthly charges with the average tenure of each group. The month-to-month users pay the most of the three contract types and tend to stay with the company for the shortest time – less than 20 months on average. The users who stay the longest are those on “Two-year” contracts, who also have the lowest monthly charges of the three groups – a bit less than $61.
A simple business insight to draw is the average revenue generated by a user of each group:
Contract type | Average revenue per user (average tenure in months x average monthly charge) |
Month-to-month | $1,188 (18 x $66) |
One year | $2,730 (42 x $65) |
Two year | $3,416 (56 x $61) |
Firstly, let’s note that a “One-year” user generates about $1,500 more revenue than a “Month-to-month” user. The monthly users tend to stay with the company for 18 months (more than a year!) on average, so upgrading their contract to a yearly subscription would benefit both sides: the users through lower monthly charges, and the company through increased revenue over a longer period. The case is similar for a “One-year” to “Two-year” conversion: on average, a “One-year” user generates $2,730 in revenue and stays for 42 months, so converting them to “Two-year” could increase revenue by roughly $700 per user.
Regardless of the contract type, the key point remains – the longer a user stays, the more revenue for the company. To retain customers, it is vital to predict which users are likely to churn and come up with an incentive to convince them to keep using the company’s services. This is where machine learning comes in.
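The arithmetic behind the table can be reproduced in a few lines. The following is a minimal sketch that uses only the rounded averages quoted in the table; the names `CONTRACT_AVERAGES` and `average_revenue_per_user` are illustrative and are not part of the original solution.

```python
# Rounded averages from the table above:
# contract type -> (average tenure in months, average monthly charge in USD).
CONTRACT_AVERAGES = {
    "Month-to-month": (18, 66),
    "One year": (42, 65),
    "Two year": (56, 61),
}


def average_revenue_per_user(tenure_months: int, monthly_charge: int) -> int:
    """Approximate revenue per user as average tenure times average monthly charge."""
    return tenure_months * monthly_charge


revenue = {
    contract: average_revenue_per_user(tenure, charge)
    for contract, (tenure, charge) in CONTRACT_AVERAGES.items()
}

# Revenue difference between a contract type and the next longer one.
uplift_monthly_to_one_year = revenue["One year"] - revenue["Month-to-month"]
uplift_one_to_two_year = revenue["Two year"] - revenue["One year"]

for contract, value in revenue.items():
    print(f"{contract}: ${value:,}")
print(f"Month-to-month -> One year: ${uplift_monthly_to_one_year:,}")
print(f"One year -> Two year: ${uplift_one_to_two_year:,}")
```

With these rounded inputs, the differences come to $1,542 and $686, which correspond to the $1,500 and $700 figures quoted in the text.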
Machine Learning model for predicting customer churn
AutoML, or Automated Machine Learning, is revolutionising the way businesses approach machine learning and predicting customer churn. AutoML automates some of the time-consuming tasks that a traditional machine learning workflow requires, such as feature engineering, hyperparameter tuning, and model selection. It allows for quick experimentation with high-quality models that are specific to the business needs.
Google Cloud provides an excellent tool which can easily carry out AutoML tasks – Vertex AI. Once the data is loaded into Google Cloud, there are a few steps needed to train a new model:
1. Log in to Google Cloud Platform and navigate to Vertex AI > Training (you might need to find it in “MORE PRODUCTS” when using it for the first time).
2. Create a new model and train it.
3. Select your dataset, which is “auto-ml-churn-test” in our case. Select objective – “Classification” and then “AutoML.”
4. Enter all necessary details and select the correct Target column, such as “Churn.” No need to go into Advanced Options.
5. Check that you want to use all columns and that the target column is correct.
6. The best part – select the number of budget hours you wish to spend on training. In our case, we entered only 2 hours as the churn dataset is small. Additionally, there is an option to “Enable early stopping” which ends the model training if it cannot learn anything new.
After the budgeted training time, your model will be ready, and its results will be available in the Model Registry in Vertex AI.
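The console steps above can also be scripted. The following is a hedged sketch using the Vertex AI SDK for Python (`google-cloud-aiplatform`), not the setup used in the original solution. It assumes the churn data sits in a CSV file in Cloud Storage; the project, location, URI, and the job and model display names are hypothetical placeholders. The training budget is passed in milli node hours, so 2 hours becomes 2,000.

```python
def budget_to_milli_node_hours(budget_hours: float) -> int:
    """Convert node hours to the milli node hours the SDK expects (1 hour = 1,000)."""
    return int(round(budget_hours * 1000))


def train_churn_model(project: str, location: str, gcs_csv_uri: str, budget_hours: float = 2):
    """Create a tabular dataset and train an AutoML classification model on it."""
    # Imported lazily so the helper above works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)

    # Assumption: the data is a CSV file in Cloud Storage.
    dataset = aiplatform.TabularDataset.create(
        display_name="auto-ml-churn-test",
        gcs_source=[gcs_csv_uri],
    )

    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="churn-automl-training",  # hypothetical job name
        optimization_prediction_type="classification",
    )

    # Blocks until training finishes and returns the trained model.
    return job.run(
        dataset=dataset,
        target_column="Churn",
        budget_milli_node_hours=budget_to_milli_node_hours(budget_hours),
        disable_early_stopping=False,  # keeps early stopping enabled
        model_display_name="churn-automl-model",  # hypothetical model name
    )
```

Calling `train_churn_model("my-project", "europe-west2", "gs://my-bucket/churn.csv")` with real values in place of these placeholders would start a training run with a 2-hour budget, and it requires authenticated Google Cloud credentials.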
Results & Insights
For our telecom churn dataset, we achieved a ROC AUC score of 0.889, indicating that the model quality is very good. Additionally, the model obtained a Precision score of 88.6% at a 0.7 confidence threshold, meaning that roughly 9 out of 10 model predictions of the “Churn” label on the test set were correct.
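To make the link between the confidence threshold and precision concrete, here is a small illustrative sketch. The labels and scores are made up for the example and are not the article's test set; `precision_at_threshold` is a hypothetical helper, not a Vertex AI function.

```python
def precision_at_threshold(y_true, churn_scores, threshold):
    """Precision for the "churn" class: the share of customers predicted
    as churn (score >= threshold) who actually churned."""
    predicted_churn = [
        label for label, score in zip(y_true, churn_scores) if score >= threshold
    ]
    if not predicted_churn:
        return None  # no positive predictions, so precision is undefined
    return sum(predicted_churn) / len(predicted_churn)


# Hypothetical data: 1 = churned, 0 = stayed, with the model's churn scores.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
churn_scores = [0.95, 0.85, 0.80, 0.75, 0.60, 0.40, 0.30, 0.10]

precision = precision_at_threshold(y_true, churn_scores, threshold=0.7)
print(f"Precision at 0.7: {precision:.2f}")
```

In this toy data, four customers score at least 0.7 and three of them actually churned, so precision is 0.75. Raising the threshold typically makes churn predictions more reliable but flags fewer customers.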
At Spyrosoft, we are set on the practical application of AI in business to generate the most value for our clients. Combining the power of the model feature importance in Vertex AI and the visual powers of Looker dashboards, we were able to determine that the Tenure and Contract features are the two most important churn drivers in the data. The bar chart on the left indicates how important each feature in the data is according to the model and, consequently, how much impact a feature has on user retention. As seen in the line chart below, the lower the tenure, the more likely the user is to churn, with the highest likelihood at a tenure of zero and the lowest between 60 and 70 months.
Similarly, the Contract feature importance bar chart indicates that users on Two-year contracts are much more likely to stay with the company than those on Month-to-month contracts. This only adds more weight to the insights we drew while initially looking at the data.
The next step in the analysis and churn prediction would be to look at specific customers with a high likelihood to churn. By looking at a customer’s churn score, service call log, monthly payments, and data usage, we can get a better picture of where the customer currently stands and what their potential pain points could be. Below is an example comparison of two of our customer profiles: Evie Goddard and Max Knight (fictional names).
Evie Goddard is an example of a user who has already churned. Many of her issues remained unresolved, and her monthly payments showed a dramatic increase, which eventually resulted in her dropping the company’s services. Although there is not much that can be done in Evie’s case, the company can learn from her example in its future churn prediction activities. Max Knight is still using the company’s services, but it is worth noticing that his churn score is high. This means that giving Max some incentive could make the difference between him staying with the company and leaving.
Finally, maintaining our focus on creating value for a client, we made a list of the top 25 customers with the highest churn score. Such a list allows the telecom company to proactively contact the customers who are in the high-risk group and offer them incentives or discounts to prevent churn and keep their customers happy. With more data on possible actions, we can create a tailored recommendation for each customer. In our demo, we didn’t have such data, so the recommendation is the same for everyone.
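Building such a list is straightforward once every customer has a churn score. The sketch below assumes the scores are already available as simple records; the `CustomerScore` type, the customer IDs, and the scores are hypothetical and only illustrate the ranking step.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerScore:
    customer_id: str
    churn_score: float  # model's predicted churn likelihood, between 0 and 1


def top_churn_risks(customers, n=25):
    """Return the n customers with the highest churn score, highest first."""
    return heapq.nlargest(n, customers, key=lambda c: c.churn_score)


# Hypothetical scores for customers who are still active.
active_customers = [
    CustomerScore("C001", 0.12),
    CustomerScore("C002", 0.91),
    CustomerScore("C003", 0.47),
    CustomerScore("C004", 0.78),
]

for customer in top_churn_risks(active_customers, n=2):
    print(customer.customer_id, customer.churn_score)
```

Only customers who are still active should be passed in, since those who have already churned can no longer be retained with an incentive.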
Conclusion: Benefit from customer churn prediction using a machine learning model
Customer churn prediction remains a critical topic in business. With the help of Google Cloud Platform’s AutoML and Looker, companies can leverage machine learning algorithms to create predictive models that accurately identify customers who are likely to leave. This technology allows businesses to take proactive measures to retain customers and increase customer lifetime value.
Google AutoML is an incredibly simple tool which can be used by businesses of all sizes. By adopting this technology, companies can employ effective churn prediction and, consequently, increase customer retention, boost revenue, and gain a competitive edge in their industry. With the help of Spyrosoft, businesses can take full advantage of Google’s machine learning solutions and keep up in this fast-paced world of data-driven decision-making.
Harness an Automated Machine Learning model to predict churn of customers
At Spyrosoft, we are set on leveraging AI solutions and applying them in practice to maximise the value offered to our clients. Customer churn prediction is a great use case where a machine learning model and the AutoML approach can make a significant impact, but there are many other examples of its usefulness, such as propensity modelling, predictive maintenance, and sales and demand forecasting.
Reach out to us if you would like to learn more about how AI and machine learning algorithms could help your business.