Metric Component Analysis: Dimension Reduction

This is part 7 of a 9-part series on Metric Component Analysis.

One of the biggest challenges of Component Analysis is that metrics can have so many components! For example, Revenue in the United States is a single metric, but if you break it down by State you now have 50 component metrics (Revenue per US State). If you have 10 products, you can again break Revenue into 10 component metrics (Revenue per Product). If you break Revenue down by both US State and Product, you end up with 500 component metrics! [1]
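To make the arithmetic concrete, here is a minimal sketch of how the number of component metrics grows as you combine dimensions. The dimension sizes come from the example above; the names `dimension_sizes` and `component_count` are made up for illustration.

```python
from math import prod

# Dimension sizes from the example above (50 US States, 10 Products).
dimension_sizes = {"state": 50, "product": 10}

def component_count(sizes, dims):
    """Number of component metrics when a metric is broken down by `dims`."""
    return prod(sizes[d] for d in dims)

by_state = component_count(dimension_sizes, ["state"])
by_product = component_count(dimension_sizes, ["product"])
by_both = component_count(dimension_sizes, ["state", "product"])

print(by_state, by_product, by_both)
```

Every extra dimension multiplies the count by its size, which is why the number of components gets out of hand so quickly.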

The process of reducing all of those options down to the few most important dimensions is called dimension reduction. Doing so can involve some complex mathematics, but let’s start with a general explanation of how those approaches work. Normally you would be working with dozens of dimensions, but I can’t chart those easily so we’ll use a simple example of only two dimensions.

Let’s take an example where we have Revenue for a large number of Products and a large number of Countries. While there are a lot of combinations of Product and Country, we can chart them in two dimensions such as the following:

It’s not obvious from this chart whether Product or Country is more interesting when finding patterns in Revenue. In fact, it’s hard to read anything from this chart at all! However, we can project this two-dimensional chart onto each of its one-dimensional sub-dimensions as follows:

This allows us to view each dimension independently. For the Country dimension, there are no clear patterns as everything is fairly evenly distributed:



But the Product dimension shows two clear clusters developing:


If we were doing an investigation, we would start looking at the Product dimension first, as these two clusters might indicate an interesting behavior.
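The same idea can be sketched in a few lines of code. This is only an illustration under some loud assumptions: the points are synthetic (made up so that the Product axis has two clusters and the Country axis is evenly spread), and the "largest gap" score is a crude stand-in I picked for spotting clusters, not the method a real dimension reduction tool uses.

```python
import random

random.seed(0)

# Synthetic 2D points as (country_coord, product_coord). This is invented data,
# not the data behind the charts above.
points = []
for _ in range(200):
    country = random.uniform(0, 10)        # evenly spread, no pattern
    center = random.choice([2.0, 8.0])     # two made-up cluster centers
    product = random.gauss(center, 0.5)
    points.append((country, product))

def largest_gap_ratio(values):
    """Largest gap between neighboring sorted values, as a fraction of the range."""
    ordered = sorted(values)
    span = ordered[-1] - ordered[0]
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return max(gaps) / span

# Project the 2D points onto each one-dimensional sub-dimension and score it.
country_score = largest_gap_ratio([p[0] for p in points])
product_score = largest_gap_ratio([p[1] for p in points])

print(f"country: {country_score:.2f}  product: {product_score:.2f}")
```

A dimension whose projection has one big empty gap (here, Product) is a hint that the data splits into groups along it, so that is where you would look first.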

Obviously, this is an extremely simplistic example! While we did this in two dimensions, it’s the same process you would use on data with dozens of dimensions: you map them onto small sets of dimensions and look for interesting characteristics or behaviors.

By breaking down the higher-dimensional data into sub-dimensions, we are doing dimension reduction (well-named, huh?)! With dozens of dimensions it’s not possible to do this manually, so you will use some complex mathematical tools to do it for you. All you need to understand is how they work and when to use them.

The first one we will cover is called Principal Component Analysis and we’ll cover it tomorrow!

[1] If you want to know how complex this can get, check out our series on Combinatorics!

Quote of the Day: “Our tendency to perceive—to impose—narrativity and causality are symptoms of the same disease—dimension reduction.” ― Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable