Develop a model for customer churn prediction using a decision tree

Key takeaways

  • Customer churn prediction helps businesses identify which customers are likely to stop using their products or services. This is done by analyzing past behavior and patterns in customer data to understand signs of potential churn.

  • By predicting churn, businesses can take proactive measures like targeted campaigns and personalized offers to retain at-risk customers, boosting overall customer satisfaction and profitability.

  • Machine learning models such as decision trees can efficiently predict churn, offering actionable insights into customer retention strategies, while model tuning can further improve prediction accuracy and business outcomes.

Customer churn prediction involves identifying individuals who may discontinue their usage of a product or service. This is achieved by analyzing past customer data to recognize patterns and behaviors indicating potential churn. By utilizing machine learning algorithms, businesses can predict which customers are at risk of churning. The objective is to implement preemptive measures, like targeted marketing campaigns and personalized offers, to retain customers and enhance satisfaction, thereby bolstering business profitability.

Step-by-step guide

Developing the model involves the following steps.

Initialize DecisionTreeClassifier

We import the DecisionTreeClassifier from scikit-learn and train_test_split for data splitting, then initialize a DecisionTreeClassifier object, and finally display the first few rows of the DataFrame df.

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
dectree=DecisionTreeClassifier()
df.head()
Importing and initializing the DecisionTreeClassifier
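The snippet above assumes a DataFrame df with a binary Exited column already exists. scikit-learn's decision trees expect numeric inputs, so any text columns have to be encoded before fitting. The following is a minimal sketch of that preparation on a small made-up DataFrame; every column name and value other than Exited is hypothetical and not taken from the article's dataset.

```python
import pandas as pd

# Hypothetical toy data; only the 'Exited' column name comes from the article.
toy_df = pd.DataFrame({
    "Age": [42, 35, 51, 29, 46, 38],
    "Balance": [0.0, 83807.9, 159660.8, 0.0, 125510.8, 113755.8],
    "Geography": ["France", "Spain", "France", "Germany", "Spain", "Germany"],
    "Exited": [1, 0, 1, 0, 0, 1],
})

# One-hot encode the text column so that every feature is numeric.
encoded_df = pd.get_dummies(toy_df, columns=["Geography"])
print(encoded_df.head())
```

One-hot encoding is only one option; for a tree, ordinal codes can also work, so treat this as an illustration rather than the required preprocessing.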

Split data into training and testing sets

We split the dataset into features (X) and the target variable (y), then further split the data into training and testing sets using a 70-30 split ratio. Setting test_size=0.3 means that 30% of the data is used for testing and the remaining 70% for training, while random_state=101 makes the split reproducible. We then fit the DecisionTreeClassifier model to the training data and make predictions on the test data, storing them in the variable dectree_predict.

X=df.drop('Exited',axis=1)
y=df['Exited']
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=101)
dectree.fit(X_train,y_train)
# Making predictions
dectree_predict=dectree.predict(X_test)
Splitting data into training and testing sets
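Churn datasets are often imbalanced, with far fewer churners than non-churners. As an optional variation on the split above (not something the original code does), train_test_split accepts a stratify argument that keeps the class ratio the same in both subsets. The sketch below uses synthetic labels, which are an assumption made purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels: 20% positives (hypothetical data).
X = np.arange(200).reshape(100, 2)
y = np.array([1] * 20 + [0] * 80)

# stratify=y preserves the share of each class in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101, stratify=y
)

print("Churn rate, full data:", y.mean())
print("Churn rate, train:", y_train.mean())
print("Churn rate, test:", y_test.mean())
```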

Evaluate classifier performance

We compute and print a classification report, which includes precision, recall, F1-score, and support for each class, based on the predictions made by the decision tree model (dectree_predict) on the test data (y_test). Additionally, we calculate and print the accuracy and f1_score for the test set predictions.

from sklearn.metrics import classification_report,confusion_matrix,accuracy_score,f1_score
print(f" Classification report :\n {classification_report(y_test,dectree_predict)}")
print("Accuracy (Test Set): %.2f" % accuracy_score(y_test, dectree_predict))
print("F1-Score (Test Set): %.2f" % f1_score(y_test, dectree_predict))
Evaluating decision tree classifier performance with classification report
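To make the numbers in the report easier to interpret, the sketch below recomputes precision, recall, and F1-score for the positive class by hand from the confusion matrix counts and compares the result with scikit-learn's f1_score. The label lists are hypothetical toy values, not outputs of the article's model.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)   # share of predicted churners who really churned
recall = tp / (tp + fn)      # share of actual churners the model caught
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"sklearn f1_score={f1_score(y_true, y_pred):.2f}")
```

Note that accuracy alone can look good on imbalanced churn data even when few churners are caught, which is why the F1-score is reported alongside it.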

Visualize confusion matrix

We create a DataFrame matrix_df containing the confusion matrix computed from the actual labels (y_test) and the predictions (dectree_predict). We then plot the confusion matrix as a heatmap using Seaborn, annotating each cell with its count. Finally, we set the title and the x-axis and y-axis labels and display the plot.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Build the confusion matrix as a DataFrame
matrix_df = pd.DataFrame(confusion_matrix(y_test, dectree_predict))
# Create the sized figure and its axes together so the heatmap is drawn on it
sns.set(font_scale=1.3)
fig, ax = plt.subplots(figsize=(10, 7))
sns.heatmap(matrix_df, annot=True, fmt="g", ax=ax, cmap="magma")
# Set the title and axis labels
ax.set_title('Confusion Matrix - Decision Tree')
ax.set_xlabel("Predicted label", fontsize=15)
ax.set_ylabel("True label", fontsize=15)
plt.show()
Visualizing the confusion matrix for the decision tree classifier
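Raw counts can be hard to compare when the classes differ in size. As an optional complement to the heatmap, the sketch below builds a row-normalized confusion matrix with readable labels; the display names "Stayed" and "Exited" and the toy label lists are assumptions made for illustration.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# Hypothetical display names for classes 0 and 1.
labels = ["Stayed", "Exited"]

# normalize="true" divides each row by the number of true samples in that class.
cm = confusion_matrix(y_true, y_pred, normalize="true")
cm_df = pd.DataFrame(cm, index=labels, columns=labels)
print(cm_df.round(2))
```

Each row now sums to 1, so the diagonal entry of the "Exited" row is the recall for churners.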

Tune the parameters

We initialize a new DecisionTreeClassifier with specified hyperparameters (criterion='entropy', min_samples_split=10, min_samples_leaf=6, max_features='sqrt', random_state=1), train it on the training data, and make predictions on the test data. Then, we print the classification report, confusion matrix, accuracy, and F1-score for the new decision tree classifier (dectreeclasfier_new).

dectreeclasfier_new = DecisionTreeClassifier(criterion = 'entropy', min_samples_split = 10, min_samples_leaf = 6 , max_features = 'sqrt', random_state = 1)
dectreeclasfier_new.fit(X_train,y_train)
dectreeclasfier_predict=dectreeclasfier_new.predict(X_test)
print(f" Classification report :\n {classification_report(y_test,dectreeclasfier_predict)}")
print(f" Confusion Matrix :\n {confusion_matrix(y_test,dectreeclasfier_predict)}")
print("Accuracy (Test Set): %.2f" % accuracy_score(y_test, dectreeclasfier_predict))
print("F1-Score (Test Set): %.2f" % f1_score(y_test, dectreeclasfier_predict))
Evaluating the model's performance after tuning the parameters
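The hyperparameter values above are set by hand. One common way to search for values more systematically is cross-validated grid search, sketched below with scikit-learn's GridSearchCV. This is an illustrative alternative, not the article's method: the synthetic dataset from make_classification and the particular grid values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a churn dataset (about 20% positives).
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.8, 0.2], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101, stratify=y
)

# Hypothetical grid; widen or narrow it to suit the real data.
param_grid = {
    "criterion": ["gini", "entropy"],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 6],
}

# 5-fold cross-validation on the training set, scored by F1 for the positive class.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=1), param_grid, scoring="f1", cv=5
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Cross-validated F1: %.2f" % search.best_score_)
print("Test-set F1: %.2f" % search.score(X_test, y_test))
```

Because the search only sees the training data, the test set stays untouched until the final score, which keeps the comparison with the untuned model fair.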


Frequently asked questions


What is the best model for customer churn prediction?

No single model is best for every dataset. Commonly used options include logistic regression, decision trees, random forests, and gradient boosting machines, and the usual approach is to compare several of them on held-out data.


What regression model is used for churn prediction?

Logistic regression is typically used for churn prediction. Despite its name, it is a classification model, which makes it a natural fit for a binary outcome such as churned versus retained.


What is a decision tree?

A decision tree is a supervised learning algorithm that splits data into branches based on feature values to make predictions or classifications.
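To see what "splitting data into branches" looks like in practice, the sketch below fits a shallow tree and prints its learned rules with scikit-learn's export_text. The synthetic data and the feature names f0 to f3 are hypothetical placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data with four numeric features (hypothetical stand-in).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# A depth limit of 2 keeps the printed rules short enough to read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Placeholder feature names used only for display.
feature_names = ["f0", "f1", "f2", "f3"]
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Each indented line is a threshold test on one feature, and each leaf ends with the predicted class.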


Copyright ©2025 Educative, Inc. All rights reserved