
Data Insights

What Data Insights should you look for in your data?

This is part 1 of a 5-part series on Data Insights.

Last week, we covered how to explore your data, but what should you be looking for when you do? Knowing what to look for is often the difference between success and failure when it comes to data exploration.

An insight in your data can take many forms, but it typically relates to an unexpected change in one or more of your metrics. For example, if revenue dips, it is likely that sales are down, prices have dropped, or something is wrong in your payment systems. Because they are unexpected, insights like these can be hard to find, since you might not know to look for them.

Luckily, there are some common classes of interesting insights in your data. These common characteristics make it easier to know what kinds of insights you can find and what to look for when you find them. For example, I cannot tell you what might be causing revenue to dip for your new line of blue shoes, but the kinds of changes that lead to dips in revenue are common.

This week we’ll review some of these common insights, how to find them and how to provide the context to understand them.

Tomorrow we’ll get started with the simplest form of insight: anomalies.

“A moment’s insight is sometimes worth a life’s experience.” 

Data Insights: What are Anomalies and why do they matter?

This is part 2 of a 5-part series on Data Insights.

The simplest kind of insight is an anomaly, or a single point that does not fit the normal pattern of its own historic data. [1] Consider the following metric over time:

There is an obvious anomaly in May, where the value drops significantly. However, it might not always be clear from the data what is “normal”. To help with this, we can add a trendline, such as a linear regression or a simple moving average, which shows the general trend over time. Below is the same data with a moving average trendline (green line) and confidence interval (light green) added:

That makes the anomaly easy to see, but how can you detect it automatically? The easiest way is to calculate the residuals for each point. The residual is simply the difference between the actual value and the trendline value at the same point in time. Below are the residuals for each point:

Looking at the data this way, the anomaly in May is easy to detect, as its residual is significantly larger in magnitude than that of any other point (highlighted in orange).
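The residual approach can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the charts: the sample values, the three-point trailing window, and the cutoff of two standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def trailing_average(values, window=3):
    """Trendline: for each point, the mean of up to `window` points before it."""
    trend = [values[0]]  # the first point has no history, so reuse its own value
    for i in range(1, len(values)):
        trend.append(mean(values[max(0, i - window):i]))
    return trend

def find_anomalies(values, window=3, z_threshold=2.0):
    """Return the indexes whose residual lies more than `z_threshold`
    standard deviations from the mean residual."""
    trend = trailing_average(values, window)
    residuals = [actual - expected for actual, expected in zip(values, trend)]
    center, spread = mean(residuals), stdev(residuals)
    if spread == 0:
        return []  # every point sits exactly on the trendline
    return [i for i, r in enumerate(residuals)
            if abs(r - center) > z_threshold * spread]

# Hypothetical monthly metric with a sharp drop in May.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
metric = [100, 102, 99, 101, 60, 100, 103, 98, 101, 100, 102, 99]

anomalous_months = [months[i] for i in find_anomalies(metric)]
print(anomalous_months)
```

Note that the points right after an anomaly also pick up larger residuals, because the anomaly pulls the moving average toward itself for a few periods. That is one reason to compare each residual against the spread of all residuals rather than against a fixed number.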

Anomalies are obvious indications that something is changing, because there must be some cause for the metric to shift so significantly away from the previous pattern.

Tomorrow we’ll talk about a more interesting insight which involves more than one data point: developing trends.

[1] We covered anomaly detection before, in our series on Ad Campaign Optimization, so if this is too advanced you are welcome to use the simpler approach we covered there.

“Any fool can know. The point is to understand.” 

Data Insights: What are New Trends and why do they matter?

This is part 3 of a 5-part series on Data Insights.

New trends develop when the fundamentals of a given metric change. Unlike anomalies, these changes persist over a period of time. The most interesting form of new trend is called a break, and it happens when the average of the values changes abruptly at a specific point in time.

For example, take the following data:

Clearly something changed! But how do we detect it? We can look at the fundamentals of the data before and after the drop, either by using a linear regression or a moving average. A break is apparent when there is a significant change in that fundamental from one point to the next.

As you can see, the regression lines are both flat, but their levels differ by almost 1,200. In many cases breaks will not be as obvious as this example, and you will need to choose a change threshold that is as sensitive as you need.

Detecting breaks is somewhat more difficult than detecting anomalies because you need to test every point as a possible break point. One way is to slide a window across the data and, at each point, compute one regression over the points before it and another over the points after it. If those regressions differ by more than a threshold you pick, you have found a break in the data.
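Here is one way the sliding-window test could look in Python. Treat it as a sketch under stated assumptions: the sample data, the window of four points, and the threshold of 500 are invented for illustration, and "differ" is interpreted here as the gap between the two fitted lines evaluated at the candidate point.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def find_break(values, window=4, threshold=500.0):
    """Test every candidate point and return (index, gap) for the largest gap
    above `threshold`, or None if no candidate exceeds it."""
    best = None
    for i in range(window, len(values) - window + 1):
        before_x = list(range(i - window, i))
        after_x = list(range(i, i + window))
        slope_b, icpt_b = fit_line(before_x, values[i - window:i])
        slope_a, icpt_a = fit_line(after_x, values[i:i + window])
        # Evaluate both regressions at the candidate point and compare levels.
        gap = abs((slope_a * i + icpt_a) - (slope_b * i + icpt_b))
        if gap > threshold and (best is None or gap > best[1]):
            best = (i, gap)
    return best

# Hypothetical metric that drops by roughly 1,200 at index 6.
metric = [2000, 2010, 1990, 2005, 1995, 2000,
          800, 810, 790, 805, 795, 800]
print(find_break(metric))
```

Windows that straddle the break also produce sizeable gaps, so this sketch reports only the candidate with the largest gap instead of every point above the threshold.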

Tomorrow we’ll go even deeper by looking at insights that span more than one metric.

“The surest way of concealing from others the boundaries of one’s own knowledge is not to overstep them.”

Data Insights: What are Changing Relationships and why do they matter?

This is part 4 of a 5-part series on Data Insights.

Finding insights in single metrics can be helpful, but often the most valuable insights are those that deal with the relationships between metrics. Specifically, when the relationship between two metrics changes, it can indicate a serious shift in your business.

For example, here is a chart of two metrics:

As you can see, they look highly related over time. In statistics, this means they are highly correlated, which you can verify by calculating the correlation coefficient of the two metrics. For this example, the two metrics are highly correlated with a coefficient of 0.989 (1.0 would be perfect correlation).

However, at a specific point in time (September), their high correlation was broken in an obvious deviation. Such a break indicates a change to the business processes and performance that drive those metrics, and in this example something clearly significant happened. Such an insight is as close to a smoking gun as you are likely to find in any data!
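One possible way to spot such a deviation automatically is to compute the correlation coefficient over a rolling window and watch for the first window where it falls below a cutoff. The sketch below assumes made-up monthly data, a four-month window, and a cutoff of 0.8; none of these come from the chart above.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.
    Raises ZeroDivisionError if either series is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def rolling_correlation(xs, ys, window=4):
    """Correlation over each trailing window, as (end_index, coefficient) pairs."""
    return [(end, pearson(xs[end - window + 1:end + 1],
                          ys[end - window + 1:end + 1]))
            for end in range(window - 1, len(xs))]

def first_relationship_break(xs, ys, window=4, min_corr=0.8):
    """Index of the first window end where correlation drops below `min_corr`."""
    for end, r in rolling_correlation(xs, ys, window):
        if r < min_corr:
            return end
    return None

# Hypothetical metrics that move together until September, then diverge.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
metric_a = [10, 12, 11, 14, 13, 16, 15, 18, 17, 20, 19, 22]
metric_b = [20, 24, 22, 28, 26, 32, 30, 36, 40, 30, 42, 28]

break_at = first_relationship_break(metric_a, metric_b)
print(months[break_at] if break_at is not None else "no break found")
```

A shorter window reacts faster but is noisier, so the window length and cutoff are worth tuning against metrics whose history you already understand.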

It can be hard to find these relationship shifts because doing so requires you to examine every pair of metrics in your business, and the number of pairs is usually quite large. However, you can often reduce the complexity after starting your data exploration by only investigating relationships between metrics you know are important.
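To get a feel for how quickly the number of pairs grows, and how much a shortlist helps, here is a small Python illustration. The metric names are hypothetical placeholders.

```python
from itertools import combinations
from math import comb

# Hypothetical metric names; substitute your own.
all_metrics = ["revenue", "orders", "visits",
               "refunds", "signups", "support_tickets"]
important = ["revenue", "orders", "visits"]

all_pairs = list(combinations(all_metrics, 2))    # every unordered pair
focused_pairs = list(combinations(important, 2))  # only the shortlist

print(len(all_pairs), "pairs across all metrics")
print(len(focused_pairs), "pairs across the important metrics")
print(comb(200, 2), "pairs if you track 200 metrics")
```

With n metrics there are n(n-1)/2 pairs, so the count grows roughly with the square of the number of metrics, which is why narrowing the list first pays off.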

Tomorrow, we’ll wrap up our review of data insights with the most advanced insight yet, identifying composition.

“The only true wisdom is in knowing you know nothing.” 

Data Insights: Here are More Insights to look for…

This is part 5 of a 5-part series on Data Insights.

We’ve covered the three most common types of data insights so far this week, but there are many more insights you should keep an eye out for while exploring. Here are a few:

  • Emerging Trends. If an overall metric is not changing, but some of the key segments or dimensions of that metric are all trending up or down, there might be an emerging change. These changes start small but eventually grow large enough to impact the overall metric, so if you find one you can proactively handle the upcoming overall change. (Read more in our series on Metric Component Analysis).
  • Clustered Insights. A single anomaly or trend might not be interesting, but if there is a cluster of anomalies or trends that are all happening at the same time, it might be very interesting. (Read more in our series on Clustering).
  • Changes in Seasonality. Your data can change in ways that aren’t visible day-to-day, or even month-to-month. Changes in the seasonal shifts in your metrics (and business) can indicate larger market shifts and customer behavior changes. To see these changes, you’ll need to look at years of data and see if the cycles are consistent. (Read more in our series on Seasonality).
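As a rough illustration of the Emerging Trends item above, the Python sketch below fits a trend slope to the overall metric and to each segment, then reports segments that are moving while the total stays flat. The segment names, values, and both slope cutoffs are assumptions made up for the example.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of the values against their position (0, 1, 2, ...)."""
    xs = list(range(len(values)))
    mx, my = mean(xs), mean(values)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, values))
    return sxy / sxx

def emerging_trends(segments, flat_limit=0.5, trend_limit=1.0):
    """If the overall metric is flat, return {segment: slope} for each segment
    whose slope magnitude is at least `trend_limit`; otherwise return {}."""
    total = [sum(period) for period in zip(*segments.values())]
    if abs(slope(total)) > flat_limit:
        return {}  # the overall metric is already moving, so nothing is hidden
    return {name: slope(series) for name, series in segments.items()
            if abs(slope(series)) >= trend_limit}

# Hypothetical revenue by product line: the total is flat at 170 each period.
segments = {
    "blue shoes": [100, 100, 100, 100, 100, 100],
    "red shoes": [50, 48, 46, 44, 42, 40],
    "green shoes": [20, 22, 24, 26, 28, 30],
}
print(emerging_trends(segments))
```

In this made-up data the decline in one line is masked by growth in another, which is exactly the kind of change the overall metric would not reveal on its own.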

One of the most important types of insight is one I can’t describe here: the kind you find using your own judgement. As an expert in your business, if something doesn’t look right, you should trust your instincts and dig in deeper. It’s very likely that your expertise can detect insights that would stay hidden by any other means.

In Review: In the past two weeks we’ve reviewed a general framework for data exploration and some specific types of data insights you should look for during that exploration. Unfortunately, there isn’t much more I can tell you, because how your exploration proceeds depends on your data and your business. I hope that you have some good tools at your disposal, as they can make it even easier!

“No thief, however skillful, can rob one of knowledge, and that is why knowledge is the best and safest treasure to acquire.” 

