Wednesday, July 11, 2018

Diagnostic analytics

Effective prognosis is not possible without effective diagnosis.
Diagnostic analytics is a form of advanced analytics that examines data or content to answer the question “Why did it happen?”. It is characterized by techniques such as drill-down, data discovery, data mining and correlations.

Diagnostic analytics is an augmented form of descriptive analytics: it compares the relationships between variables and outcomes to discover ongoing trends. Descriptive analytics provides information about what happened in the past, whereas diagnostic analytics gives insight into the ‘why’ behind the outcome. For instance, diagnostic analytics compares the data sets of two different promotional campaigns to ascertain why one campaign succeeded while the other failed. By establishing correlations between multiple variables, a retailer can determine which factors can be changed to achieve the desired result.

Diagnostic Analytics

Diagnostic analytics takes a deeper look at data to attempt to understand the causes of events and behaviors.
What Are The Benefits of Diagnostic Analytics?
Diagnostic analytics lets you understand your data faster to answer critical workforce questions. Cornerstone View provides the fastest and simplest way for organizations to gain more meaningful insight into their employees and solve complex workforce issues. Interactive data visualization tools allow managers to easily search, filter and compare people by centralizing information from across the Cornerstone unified talent management suite. For example, users can find the right candidate to fill a position, select high potential employees for succession, and quickly compare succession metrics and performance reviews across select employees to reveal meaningful insights about talent pools. Filters also allow for a snapshot of employees across multiple categories such as location, division, performance and tenure.
So here's an example. We plot Twitter mentions against sales, and the trend line shows that the more Twitter mentions you have, the higher the sales go. The more tightly the dots cluster around that line, the more confident we are that the correlation is real. In contrast, when we plot time on market against sales, the points are all over the place and the line is flat. These two are clearly not correlated: time on market really doesn't impact revenue. And this is all really good to know, because we need to know what drives revenue in order to build a good predictive model, and to understand what's really driving our data in general.
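To put a number on how tightly the dots cluster around the line, one common choice is the Pearson correlation coefficient, which runs from -1 to 1. The sketch below is only an illustration: the function name `pearson_r` and all of the figures are made up for this example and are not the data behind the charts described above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical figures for six products (assumed purely for illustration).
sales = [105, 118, 133, 139, 155, 162]
twitter_mentions = [10, 20, 30, 40, 50, 60]
time_on_market = [4, 1, 6, 2, 5, 3]

r_mentions = pearson_r(twitter_mentions, sales)  # close to 1: strong relationship
r_time = pearson_r(time_on_market, sales)        # close to 0: weak relationship

print(f"mentions vs sales:       r = {r_mentions:.2f}")
print(f"time on market vs sales: r = {r_time:.2f}")
```

A value near 1 or -1 corresponds to dots hugging the line, and a value near 0 to the flat, scattered picture. Keep in mind that a strong correlation on its own does not prove that one variable drives the other.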
If you're starting to go into diagnostic analytics and data science, you probably have a pretty good descriptive analytics base. So you probably have a pretty good dashboard. So here we're gonna go into an example where you have revenue by different departments, like pantry, and dairy, and meat. And if you have a really good visualization, you click on a department and it shows you the revenue of all your products in that department. And then let's say you're interested in your top seller. You click on it and you see its sales over a given year, and there's this weird dip: sales are really high on either side, but really low in between. Which makes sense, because we've identified that the time of year is correlated to revenue, which means that at some times of the year you're gonna sell more than at others.
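The click-through described here is a drill-down: the same revenue total, regrouped at a finer level each time a filter is added. A minimal sketch of the idea follows, assuming a flat list of sales records. The `drill_down` helper, the field names and every number are hypothetical, and a real dashboard would do this in its query layer.

```python
from collections import defaultdict

# Hypothetical sales records (assumed for illustration only).
SALES = [
    {"department": "dairy", "product": "milk", "month": 1, "revenue": 400.0},
    {"department": "dairy", "product": "milk", "month": 6, "revenue": 150.0},
    {"department": "dairy", "product": "milk", "month": 12, "revenue": 450.0},
    {"department": "dairy", "product": "cheese", "month": 1, "revenue": 200.0},
    {"department": "meat", "product": "beef", "month": 6, "revenue": 300.0},
    {"department": "pantry", "product": "rice", "month": 12, "revenue": 250.0},
]

def drill_down(records, group_by, **filters):
    """Sum revenue per `group_by` value over the records matching all filters."""
    totals = defaultdict(float)
    for rec in records:
        if all(rec[field] == value for field, value in filters.items()):
            totals[rec[group_by]] += rec["revenue"]
    return dict(totals)

# Level 1: revenue by department.
by_department = drill_down(SALES, "department")

# Level 2: click into one department to see its products.
by_product = drill_down(SALES, "product", department="dairy")
top_product = max(by_product, key=by_product.get)

# Level 3: click on the top seller to see it over the year.
by_month = drill_down(SALES, "month", department="dairy", product=top_product)

print(by_department)
print(top_product, by_month)
```

Each level reuses the same function with one more filter, which is why a mid-year dip in the top seller only becomes visible at the third level.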
The next step on your analytics journey is to discover why something has happened, and for that you need diagnostic analytics. Here's an overview of the practice, including an example, a step-by-step guide and some best practices.

What is analytics?

In the previous post, we defined analytics in detail, but essentially analytics is a practice, a process, and a discipline whose purpose is to turn data into actionable insight.

Diagnostic analytics overview

Previously, we discussed how descriptive analytics will tell you what just happened. To understand why, however, you need to do some more work. You need to perform diagnostic analytics.
In many cases, when there is a single 'root cause' of the situation, diagnostic analytics can be quick and simple - you just need to find that root cause.
But if no root cause is apparent, then you need to use diagnostic techniques to discover a causal relationship between two or more data sets.
The analyst also needs to make it clear what data is relevant to the analysis so that the relationship between the two data sets is clear.

An example of diagnostic analytics

In a descriptive report, you note that website revenue is down 8% from the same quarter last year. In an attempt to get ahead of your boss's questions, you conduct diagnostic analytics to find out why.
First, you look for a root cause. Perhaps there was a change in ad spend, a rise in cart abandonments, or even a change in Google's algorithm which has affected your web traffic.
Finding nothing, you then look at the data sets which contribute to revenue: impressions, clicks, conversions, and new customer sign-ups.
You discover from the data that changes in revenue closely track changes in new customer sign-ups, and so you isolate these two data series in a graph showing the relationship. This then leaves you, or one of your colleagues, to conduct diagnostic analysis on new customer sign-ups to find out why they are down.
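One way to carry out the "which series tracks revenue" step is to compare period-over-period changes rather than raw levels, and rank each candidate by how strongly its changes correlate with the changes in revenue. The sketch below assumes six quarters of made-up figures. The helper names (`pct_changes`, `correlation`) and all of the numbers are hypothetical.

```python
from math import sqrt

def pct_changes(series):
    """Period-over-period fractional change, e.g. 100 -> 110 gives 0.10."""
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]

def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys))
    return cov / spread

# Hypothetical quarterly figures (assumed for illustration only).
revenue = [100, 110, 121, 115, 105, 97]
candidates = {
    "impressions": [1000, 1010, 1005, 1020, 1015, 1030],
    "clicks": [200, 198, 205, 207, 203, 210],
    "conversions": [20, 21, 20, 22, 21, 22],
    "new_signups": [50, 56, 62, 58, 52, 47],
}

revenue_changes = pct_changes(revenue)
ranked = sorted(
    ((name, correlation(pct_changes(values), revenue_changes))
     for name, values in candidates.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for name, r in ranked:
    print(f"{name:12s} r = {r:+.2f}")
```

With these invented numbers, new sign-ups comes out on top by a wide margin, which is the cue to isolate that series next to revenue in a graph.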

The distinguishing features of diagnostic analytics

Like descriptive analytics, diagnostics requires past 'owned' data but, unlike descriptive analytics, diagnostic analytics will often include outside information if it helps determine what happened.
From the example above, it's clear that domain knowledge is also more important with diagnostic analytics. External information from a wide range of sources should be considered in root cause analysis.
And, when comparing data sets to look for a relationship, statistical analysis may be required for a diagnosis, specifically regression analysis (see point 2 below).
Finally, with diagnostic analytics you are trying to tell a story which isn't apparent in the data and so the analyst needs to go 'out on a limb' and offer an opinion.

How to do diagnostic analytics:

1) Identify something worth investigating

The first step in doing diagnostic analytics is to find something that is worth investigating. Typically this is something bad, like a fall in revenue or clicks, but it could also be an unexpected performance boost.
Regardless, the change you're looking to diagnose should be rare as analysing volatile data is a pointless exercise.

2) Do the analysis

As shown in the example above, diagnostic analytics may be as straightforward as finding a single root cause - i.e. revenue dropped last month because new customer sign-ups were down.
More complex analyses, however, may require multiple data sets and the search for a correlation using regression analysis. How to carry out regression analysis is beyond the scope of this post but there are many excellent tutorials available to help you with it.
What you are trying to accomplish in this step is to find a statistically valid relationship between two data sets, where the rise (or fall) in one causes a rise (or fall) in another.
More advanced techniques in this area include data mining and principal component analysis, but straightforward regression analysis is a great place to get started.
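As a starting point, a straightforward regression between two data sets can be sketched in a few lines. The example below fits a straight line by ordinary least squares and reports R², the share of the variation in one series that the line accounts for. The function name `fit_line` and the monthly figures are assumptions made for illustration, and a statistics package would normally be used in practice.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared

# Hypothetical monthly data (assumed): sign-ups in hundreds, revenue in thousands.
new_signups = [1, 2, 3, 4, 5]
revenue = [12.1, 13.9, 16.2, 17.8, 20.0]

slope, intercept, r_squared = fit_line(new_signups, revenue)
print(f"revenue = {intercept:.2f} + {slope:.2f} * signups (R^2 = {r_squared:.3f})")
```

A high R² says the two series move together. It does not show on its own that one causes the other, so domain knowledge is still needed before calling the relationship causal.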

3) Selectively filter your diagnoses

While it may be interesting that a variety of factors contributed to a change in performance, it's not helpful to list every possible cause in a report.
Instead, an analyst should aim to identify the one or, at most, two most influential factors in the issue being diagnosed.

4) State your conclusion clearly

Finally, a diagnostic report must come to a conclusion and make a very clear case for it.
It does not have to include all of the background work, but you should:
  • identify the issue you're diagnosing,
  • state why you think it happened, and
  • provide your supporting evidence
Currently, analytics seems to be largely focused on describing data through reports. The potential for the practice, however, is far greater than displaying data and letting the audience make conclusions.
Analysts can do better, though. They can provide further insights into the data by using diagnostic analytics to try and explain why certain things happen.
Ideally, marketing reports should contain both: descriptive charts and graphs to keep people informed about the systems and results which concern them, and separate diagnostic reports which aim to explain a significant phenomenon such as a decline in new business or a change in web browsing behaviour.
Not only will this help the reader to understand why some decisions have been made, but it also provides evidence that the report writer understands the data and the point of collecting it. That is, we collect data so that we can make better-informed decisions through analytics.
In contrast to descriptive analytics, diagnostic analytics focuses less on what has occurred and more on why it happened. In general, these analytics look at the processes and causes instead of the result. Here is an example of a diagnostic finding: “Revenue is up on the East Coast, and the likely reasons are increased investment in targeted marketing and the closure of a major competitor in the area.”
Take note that descriptive analytics cannot answer important questions such as “How can we avoid this problem?” or “How can we duplicate this solution?” These are covered by diagnostic analytics. One application of diagnostic analytics is in the field of sports. For instance, in Major League Baseball, many teams have moved away from assessing pitchers on the number of runs they allow.
Based on gathered data, the rate at which pitchers allow runs (result) is less consistent from year to year than their rate of strikeouts and walks (process). As a matter of fact, the runs a pitcher allows in the current year are less closely related to the runs he allowed last year than to his strikeouts and walks last year. Successful baseball teams are now focusing on the process instead of the outcome, so they can better assess the talent of pitchers and determine which ones they should trade or acquire.
It may cost about $6 million to $8 million to sign a free-agent pitcher, so understanding whether his performance was really caused by his skill can be helpful for teams.
