"Innovation is taking things that exist and using them in a new way" – Tom Freston, Co-founder of MTV
In this story, I will introduce you to stacked data exploration, an advanced way to explore your data. As far as I could find, the term "stacked data exploration" is not yet in use, so consider this story a primer on the subject.
What is Stacked Data Exploration
Stacked data exploration combines different data exploration techniques to get more advanced exploration results. The output of one technique becomes the input to the next. The combined results are generally more powerful than those of the individual techniques.

Why is this useful
Stacking has proven very useful in machine learning, where a learner is trained to combine the individual learners. You can apply a similar concept to data exploration.
During the data exploration phase, data scientists generally use each exploration technique on its own. Histograms, correlation matrices, dimensionality reduction, clustering, and so on are all used individually to explore the data. If they give powerful results individually, imagine what they can do when we combine them!
Let us see this in action!
Stacked Data Exploration Example
Let us take a dataset of a telecommunication company's customers as an example. The dataset has demographic information, services, billing information, and whether or not the customer has churned.

In this example, we will attempt the following stacked data exploration.

Here is a description of what each of these steps does. The final result will be revealed at the end.
Step 1 – Dimensionality Reduction (TSNE)
In this step, we will reduce the high-dimensional data to two dimensions. This will help us visualize the data exploration results. This step uses TSNE (t-distributed stochastic neighbor embedding), as it does a very good job of keeping points that are close in the high-dimensional space close to each other in the lower-dimensional space.
Here is the result of TSNE applied to the telecommunication dataset, where we reduce the data to two dimensions.

Each point represents a customer. We can go one step further and color the points by the customer churn field.

Now we will take the result of TSNE and input it into the next step of clustering.
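Here is a minimal sketch of what this step could look like, assuming scikit-learn is used. The data is a synthetic stand-in generated with make_blobs, not the telecommunication dataset, and the perplexity value is only illustrative. With the real dataset, the categorical fields would first need to be encoded as numbers.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed customer table (assumption):
# 150 rows and 10 numeric features.
X, _ = make_blobs(n_samples=150, n_features=10, centers=4, random_state=0)

# Scale the features so that no single column dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

# Reduce to two dimensions. The perplexity must be smaller than the
# number of rows; 30 is an illustrative value, not a tuned one.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(X_scaled)

print(embedding.shape)  # one (x, y) pair per customer
```

The two columns of embedding are the coordinates you would plot. To color the points by churn, you could pass a 0/1 churn array to the c argument of matplotlib's scatter function.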
Step 2 – Clustering (DBSCAN)
In the previous step, we can observe nice cluster formation. We can take advantage of this fact and use a clustering algorithm on the TSNE output. This will help us assign a cluster number to each of the visually formed clusters in the visualization above.
The clustering technique used here is DBSCAN (Density-Based Spatial Clustering of Applications with Noise). The advantage of this technique is that it does not require specifying the number of clusters beforehand. Here is the result of DBSCAN applied to TSNE output.

We can observe 5 clearly identified clusters. As DBSCAN is a density-based technique, the clusters correspond to the dense regions. The cluster on the extreme right is not very dense, so we can ignore it for the time being.
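The clustering step can be sketched as follows, again assuming scikit-learn. The 2-D points are synthetic and only stand in for the TSNE output, and the eps and min_samples values are illustrative assumptions. DBSCAN does not need the number of clusters, but it does need these two density parameters, and they would have to be tuned for real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Synthetic 2-D points standing in for the TSNE output (assumption).
embedding, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# eps and min_samples are illustrative values, not tuned for real data.
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(embedding)

# DBSCAN marks low-density points as noise with the label -1.
n_clusters = len(set(labels) - {-1})
n_noise = int(np.sum(labels == -1))
print(f"clusters: {n_clusters}, noise points: {n_noise}")
```

Each customer now carries a cluster number in labels, which is what the next stacking step consumes.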
Insights after 2 stacking steps
Just with two steps in the data exploration stacking process, we have some very interesting insights such as:
- We observe nice cluster formation and we have been able to tag the densely formed clusters. This signifies that customers can be segmented into separate groups. This insight could be very useful.
- In each of the dense clusters, there is no clear separation of churn vs non-churn customers. This means that if you are using machine learning to predict churners, you will require a complex algorithm to separate churners from non-churners.
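As a rough numeric companion to the visual check, you could compute the churn rate inside each cluster. This is a small sketch with hypothetical toy arrays, not the real dataset. A rate close to 0 or 1 would mean the cluster by itself separates churners. A rate in between means the cluster is mixed, although it does not tell you how the churners are arranged inside the cluster.

```python
import numpy as np

# Hypothetical toy inputs: DBSCAN labels (-1 = noise) and a 0/1 churn flag.
labels = np.array([0, 0, 0, 1, 1, -1, 1, 0])
churn = np.array([1, 0, 1, 0, 0, 1, 1, 0])

rates = {}
for c in np.unique(labels):
    if c == -1:
        continue  # skip points that DBSCAN treated as noise
    # Mean of a 0/1 flag is the share of churners in the cluster.
    rates[int(c)] = float(churn[labels == c].mean())

for c, rate in rates.items():
    print(f"cluster_{c}: churn rate {rate:.2f}")
```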
Step 3 – Machine learning to interpret the clusters (Decision Tree)
Let us go to the next level in stacked data exploration. We can try to interpret each of the clusters to see what differentiates churners from non-churners. So in step 3, we fit a decision tree for each of the clusters.

Five decision trees are fitted, one per cluster. However, for simplicity, only the decision trees for cluster_3 and cluster_4 are shown below.


We can observe that for cluster_3, the most important field that differentiates churners from non-churners is Total Charges. This means that the predicted churners in cluster_3 are sensitive to total charges.
For cluster_4, the most important fields that differentiate churners from non-churners are Contract and Tenure. The predicted churners in this cluster are those with low tenure and a monthly contract.
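The per-cluster interpretation can be sketched like this, assuming scikit-learn's DecisionTreeClassifier. Everything in the data is synthetic: the two feature names are hypothetical, the cluster labels are random stand-ins for the DBSCAN output, and churn is generated from a made-up rule so that a different field matters in each cluster.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
feature_names = ["tenure", "total_charges"]  # hypothetical field names
n = 400
X = rng.uniform(0, 100, size=(n, 2))
cluster = rng.integers(0, 2, size=n)  # stand-in for the DBSCAN labels

# Made-up rule: churn depends on total_charges in cluster 0
# and on tenure in cluster 1.
churn = np.where(cluster == 0, X[:, 1] > 60, X[:, 0] < 20).astype(int)

top_feature = {}
for c in np.unique(cluster):
    mask = cluster == c
    # A shallow tree keeps the rules short enough to read.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X[mask], churn[mask])
    best = int(np.argmax(tree.feature_importances_))
    top_feature[int(c)] = feature_names[best]
    print(f"cluster_{c}: most important field = {feature_names[best]}")
    print(export_text(tree, feature_names=feature_names))
```

The printed rules play the same role as the tree diagrams: they show which field each cluster's tree splits on first.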
Using the results from stacked data exploration
You can use the results obtained so far in the following ways:
Customer segmentation messaging: You can use the segments created above for customer segmentation and send targeted marketing messages to customers to reduce churn. The message can be fine-tuned for each segment.
For example, as the predicted churners in cluster_3 are sensitive to total charges, the message should focus on the value they are getting, thereby justifying the charges.
For cluster_4, as the predicted churners have low tenure and a monthly contract, the messaging should focus on the advantages of a long-term contract, with the objective of converting monthly contracts to yearly contracts.
Improving the machine learning model: If you are developing a machine learning model for predicting churn, it could be useful to train one model per cluster rather than a single model. As the underlying reasons for churn differ between clusters, you may get better overall results.
The figure below shows the confusion matrix for the single-model approach as well as the confusion matrix for the multi-model approach. In the multi-model approach, one machine learning classifier is trained for each of the clusters above.
With the multi-model approach, there is an increase in true positives as well as a reduction in false positives.
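To make the comparison concrete, here is a minimal sketch of the single-model and multi-model approaches, assuming scikit-learn and logistic regression as the classifier (the story does not say which classifier was used). The data is synthetic and deliberately built so that churn depends on a different feature in each cluster, so the per-cluster models do well here by construction. It shows the mechanics only and says nothing about the gain on real data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 2))
cluster = rng.integers(0, 2, size=n)  # stand-in for the DBSCAN labels
# Made-up rule: a different feature drives churn in each cluster.
y = np.where(cluster == 0, X[:, 0] > 0, X[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te, c_tr, c_te = train_test_split(
    X, y, cluster, test_size=0.3, random_state=0
)

# Single-model approach: one classifier for all customers.
single = LogisticRegression().fit(X_tr, y_tr)
cm_single = confusion_matrix(y_te, single.predict(X_te), labels=[0, 1])

# Multi-model approach: one classifier per cluster, each predicting
# only for the test customers that belong to its cluster.
pred_multi = np.empty_like(y_te)
for c in np.unique(c_tr):
    model = LogisticRegression().fit(X_tr[c_tr == c], y_tr[c_tr == c])
    pred_multi[c_te == c] = model.predict(X_te[c_te == c])
cm_multi = confusion_matrix(y_te, pred_multi, labels=[0, 1])

print("single model:\n", cm_single)
print("one model per cluster:\n", cm_multi)
```

In each matrix, rows are the actual classes and columns are the predicted classes, so the bottom-right cell counts the true positives and the top-right cell counts the false positives.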

Conclusion
Stacked data exploration is an advanced way to explore your data. The combined results are more powerful than those of the individual data exploration techniques. In this story, I gave you one example of stacked data exploration. However, there are limitless ways in which you can combine different data exploration techniques.
Now it is your turn to come up with your own way of stacking different data exploration techniques! You can comment on this story to share which techniques you have used.
Dataset citation
The telecommunication dataset is available here. Both commercial and non-commercial use of it is permitted.
Additional Resources
Website
You can visit my website to do analytics with zero coding. https://experiencedatascience.com
Youtube channel
Here is a link to my YouTube channel https://www.youtube.com/c/DataScienceDemonstrated





