Predictive Analytics: What Is It Good For?
By Evan Miller
May 4, 2013
“Predictive analytics” and “Big Data” are exciting concepts to geeks, investors, and businesspeople, but in many ways they are hammers still in search of nails. There have been a few high-profile successes, not to mention breathless books about our Big Data future, but it can be hard to cut through all the hyperbole. Since I develop statistics software, I spend a lot of time thinking about what this analytics stuff can actually be used for, and thought I'd share my perspective.
Before diving in too deeply, I should define what I mean by predictive analytics. Here I mean a model that predicts some outcome given information on multiple influencing factors. The outcome might be which team will win a basketball game, or a high school kid's chances of admission to various colleges. These models must be estimated from data. Although various kinds of predictive analytics have been around for a long time, there has been a recent surge of interest in the subject because “Big Data” is promising models of things we couldn't model before.
But as a poet once asked about war: what is it good for? When it comes down to it, I think predictive analytics has basically two classes of business applications that generalize to multiple industries:
- Individual recommendations in a sea of information
- Predicting failure rates
There are other niche applications, but those two are the big ones. Recommendation systems have been written about extensively elsewhere (see especially the history of recommendations at Amazon and the fable of the pregnant teenager and the unscented lotion). I'll talk about large product catalogs and recommendation systems another day; but in this post I want to talk about failure rates, and how predictive models can reduce them. Along the way I want to answer the question: which businesses should be paying more attention to predictive modeling?
And conversely, which industries should ignore the hype, plug their ears, and go about their business?
Fraud and Failure
“Failure” is a broad term; in my mind it just means a costly outcome to an uncertain event. It could be anything from failing a term paper to all-out global nuclear warfare. Even though avoidable failures are a source of untold costs throughout the world economy, failure doesn't get very much attention in the media, except perhaps when it involves plane crashes, botched surgeries, or celebrity outfits. Like the insurance business, failure rates are depressing to think about and boring to discuss.
Insurance is the business of predicting failure, which perhaps explains its reputation for dullness. Actuarial science, of course, is the original predictive analytics. For hundreds of years now, insurance companies have been estimating the probability of death or accident in order to set premiums. Historically, the accuracy of a few columns of digits on an actuarial table could spell the difference between a firm's profit and its ruin. So for the modern insurance firm, embracing “predictive analytics” just means business as usual.
But even in insurance, there's a non-obvious application of predictive analytics: deciding when to send an investigator to verify an insurance claim. Investigation is costly, and reducing the rate of frivolous investigations — and increasing the rate of detected frauds — increases the firm's profit. Although not as essential as the core actuarial models, being able to predict insurance fraud is a business function perfectly suited to predictive analytics.
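To make the investigator decision concrete, here is a minimal sketch in Python. It assumes a fraud model already exists and produces a probability for each claim; the function name, the claim records, and the dollar figures are all invented for illustration. It also assumes, unrealistically, that an investigation always catches a fraudulent claim and saves the full claim amount.

```python
def should_investigate(fraud_probability, claim_amount, investigation_cost):
    """Return True when the expected savings from investigating exceed its cost.

    Simplifying assumption: an investigation always detects fraud when it is
    present, and a detected fraud saves the full claim amount.
    """
    expected_savings = fraud_probability * claim_amount
    return expected_savings > investigation_cost


# Hypothetical claims; the probabilities would come from a predictive model.
claims = [
    {"id": "A", "fraud_probability": 0.02, "amount": 5_000},
    {"id": "B", "fraud_probability": 0.30, "amount": 5_000},
    {"id": "C", "fraud_probability": 0.05, "amount": 40_000},
]
INVESTIGATION_COST = 500  # assumed flat cost per investigation

flagged = [
    claim["id"]
    for claim in claims
    if should_investigate(
        claim["fraud_probability"], claim["amount"], INVESTIGATION_COST
    )
]
print(flagged)  # ['B', 'C']
```

Note that claim C gets flagged despite a low fraud probability, simply because the payout at stake is large. The model supplies the probability; the business still has to supply the costs.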
In fact, any business that deals regularly with fraud would benefit from predictive analytics. Credit card companies are the most obvious; anecdotally I have heard that their models are woefully unsophisticated. Part of the reason for this may be that they force merchants to bear the costs of fraud through “chargeback”. As a result, any company that lives in fear of next month's chargeback rate would be prudent to consider predictive analytics. (Consumer-facing internet companies are especially prone to this fear.)
Government agencies are susceptible to fraud, particularly when they disburse cash benefits or collect taxes. Any government agency that employs fraud investigators (there are many) would be irresponsible not to use predictive analytics in order to deploy them most effectively. A greater number of fraudsters would be prosecuted, and fewer innocents would be harassed.
Conversely, companies that operate in environments of high trust do not really need predictive analytics, at least when it comes to deciding whether to approve transactions. In these cases, predictive analytics would tend to undermine trust. When a friend asks to borrow your car, you don't first ask for his credit score.
Predictive analytics may create new business opportunities in traditional industries. I think bail bonding is an excellent example of an industry that could be radically transformed by predictive analytics. Of course, statistically savvy bondsmen would find themselves with little to do if judges first start using predictive analytics to set bail.
Sales and Advertising
Any good salesman is accustomed to having the door slammed in his face; as with getting dates with strangers, sales consists mostly of rejection. Spending time with a prospect without closing a sale constitutes failure in the sense I described above. So in industries where salesmen have more leads than they have time to pursue, predictive analytics could increase sales by picking the most promising leads out of the shoebox.
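As a rough sketch of what picking leads out of the shoebox might look like, suppose a model has already assigned each lead a probability of closing. The lead names, the probabilities, and the `pick_leads` helper below are all hypothetical; the point is only that a salesman with time for two calls should make the two most promising ones.

```python
def pick_leads(leads, capacity):
    """Return the `capacity` leads with the highest predicted close probability."""
    ranked = sorted(leads, key=lambda lead: lead["close_probability"], reverse=True)
    return ranked[:capacity]


# Hypothetical leads; the probabilities would come from a predictive model.
leads = [
    {"name": "Acme", "close_probability": 0.10},
    {"name": "Globex", "close_probability": 0.45},
    {"name": "Initech", "close_probability": 0.25},
    {"name": "Umbrella", "close_probability": 0.05},
]
CAPACITY = 2  # assumed number of leads the salesman has time to pursue

chosen = pick_leads(leads, CAPACITY)

# Expected number of sales when calling the model's picks...
expected_with_model = sum(lead["close_probability"] for lead in chosen)
# ...versus calling the same number of leads chosen at random.
average_probability = sum(lead["close_probability"] for lead in leads) / len(leads)
expected_at_random = CAPACITY * average_probability

print([lead["name"] for lead in chosen])  # ['Globex', 'Initech']
print(expected_with_model > expected_at_random)  # True
```

With these made-up numbers the model's picks are expected to yield about 0.70 sales against roughly 0.43 for a random pair. The gap disappears when capacity covers the whole list, since every lead gets called either way.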
In fact, Michael Dell used a primitive form of predictive analytics to sell newspapers as a teenager. He figured newlyweds and people who just bought houses were the most likely to start a newspaper subscription. Supposedly he used the technique to earn more money in a summer than his teachers earned all year.
You can think of political campaigns as sales with other ends. As such, predictive analytics was a boon for determining how to persuade voters in the 2012 presidential campaign. Who should we call? Who should we not call? What phrases should we use? In some ways, predictive analytics is just a codification of a great politician's instincts. (For that matter, future county politicians may hone their techniques by reading regression results.)
Advertising is sales in print. Online advertisers pride themselves on their statistical sophistication with A/B testing, but the technique has been used in direct-mail advertising for at least a hundred years. Predictive analytics can take A/B testing to the next level with personalized messaging in both direct-mail and online advertisements. (For some technical ideas, see my previous article, “Linear Regression for Fun and Profit”.)
Conversely, predictive analytics is useless when selling to a small number of customers; just call all the phone numbers on the list, predictions be damned. In advertising, predictive analytics is useless if advertisements cannot be targeted, or if responses cannot be measured. If you're thinking of buying a Super Bowl ad, then forget predictive analytics. Just do old-fashioned market research.
A number of industries face another kind of costly failure that might be averted with predictive analytics: failure to sell non-durable inventory.
The classic examples are hotels, airlines, and railroads. Their facilities have fixed capacities, and the companies lose money on every empty bed, seat, and boxcar. Unsold inventory is costly to any business, but the cost is especially acute when the product can't be put into storage. A hotel with 500 rooms can't rent out 200 rooms one night and 800 the next.
For years, these industries have been applying operations research and predictive analytics in order to set the right prices at the right times in order to maximize profit. The prices are carefully tuned to the time of year, the day of week, the weather, and anything else the model says is important.
More recently, baseball stadiums have been applying similar techniques to sell more seats at games. Franchise owners enjoy an added benefit of selling unused inventory, which is that games tend to be more exciting when the stadium is packed. (On the other hand, airplane flights tend to be less pleasant when they are completely full.)
It's possible that other industries could use similar techniques to manage their revenue, but they need to have a few key characteristics:
The first characteristic is the ability to change prices over time. Otherwise there's no lever to push.
The second characteristic is fluctuating demand. Otherwise there's no point in moving the prices.
The third characteristic is a fixed capacity. Otherwise inventory could just be stored and retrieved to meet demand.
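The three characteristics can be seen working together in a toy pricing sketch. Everything here is an assumption made for illustration: the linear demand curve, the price grid, and the 500-room capacity are invented, and real revenue management systems are far more elaborate. The sketch simply picks the revenue-maximizing price when sales are capped at capacity.

```python
def best_price(candidate_prices, demand_at_price, capacity):
    """Pick the candidate price that maximizes revenue when sales are capped."""
    def revenue(price):
        units_sold = min(demand_at_price(price), capacity)
        return price * units_sold
    return max(candidate_prices, key=revenue)


def make_linear_demand(base_demand, sensitivity):
    """Hypothetical demand curve: base_demand units at price 0, falling linearly."""
    return lambda price: max(0, base_demand - sensitivity * price)


CAPACITY = 500               # fixed capacity: rooms that cannot be stored
PRICES = range(50, 401, 50)  # the lever: candidate nightly prices from 50 to 400

# Fluctuating demand: a quiet weekday versus a busy holiday (made-up numbers).
weekday_demand = make_linear_demand(base_demand=600, sensitivity=2)
holiday_demand = make_linear_demand(base_demand=1200, sensitivity=2)

print(best_price(PRICES, weekday_demand, CAPACITY))  # 150
print(best_price(PRICES, holiday_demand, CAPACITY))  # 350
```

Take away any one characteristic and the exercise collapses: with a single allowed price there is nothing to choose, with identical demand curves the answer never changes, and without the capacity cap the holiday surplus could be served from storage. In practice the demand curve is the part a predictive model would have to estimate.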
Upon reading about airlines' sophisticated pricing and predictive models, it's tempting to think “Let's do that!” But unless an industry has characteristics similar to airlines, there's not much value sitting on the proverbial table. Still, it's not hard to think of industries that stand to benefit from these kinds of models: buses and shuttle services, live music and theater, and container shipping, to name a few.
Not For Everyone
Thus far I've argued that predictive analytics can assist any industry for which a failure rate (broadly defined) is a major source of costs. Insurance, sales, and airlines were the big ones that I could think of, but there are others: oil and gas companies have been building sophisticated predictive models for years, professional sports teams have all gone Moneyball, and members of the financial sector regularly obsess over the probability of credit and loan defaults. Personalized medicine seems to be the source of the next big wave of predictive successes.
So there are applications aplenty, and predictive analytics (I predict) is here to stay. But then which industries don't often deal with predictable failure, and can basically ignore all the hubbub?
Industries built on a small number of permanent relationships have little need for predictive analytics. Wholesalers, for instance, can do without predictive analytics unless they happen to deal with a large amount of theft, damage, or default. Law firms that have a handful of clients should probably just keep doing what they are paid to do: represent their clients' interests. Even though law firms often deal with failure in the form of lost lawsuits, there's not much predictive analytics can do except recommend better clients.
Retail operations might benefit from sales forecasting, but that is more the domain of traditional operations research. They would also benefit from reducing chargeback from credit card companies, but for an in-person purchase, they often don't have much information to go on. (Do credit card thieves buy fewer Bibles?) Experiments are an old technique in retail (putting bread and milk at opposite ends of the store, and so on), but predictive analytics does not have much extra insight to offer. As I alluded to in the introduction, the only substantive application of predictive analytics in retail is cross-merchandising a large catalog.
Manufacturers might use predictive analytics in order to study why parts failed out in the field, but for the most part, predictive analytics is irrelevant to the average factory. All manufacturers experience failures on the assembly line, but these are probably analyzed well enough with pivot tables and a good pair of boots. Statistical analysis is of course central to quality control and to demand forecasting, but the models typically associated with “predictive analytics” — models which presage an outcome by analyzing multiple influencing factors — are probably not of much use. (Predicting the success or failure of R&D projects is a different matter.)
Predictive analytics comprises a powerful set of statistical techniques, but outside of insurance, it won't make or break the average company. It may provide a competitive edge and enable new business opportunities, but it's not the only sword that cuts. In the grand scheme of business, competent leadership, good products, and strong customer loyalty are usually more important.
No one likes to think about failure, but I think preventing failures is the real “killer application” of predictive analytics — it's at least as important as the traditional flagship, “Customers also bought…”. A well-tested predictive model is like having an experienced business analyst that can size up a situation quickly and tell you if something doesn't smell right. So if failure rates are an important source of costs in your business — and if you think the failure rates could be reduced with better information and judgment — then predictive analytics just might help your organization drive down costs, pursue promising leads, and increase profitability.