What is Alternative Data, and Why Does it Matter?

Data Exhaust, Signals Intelligence, and Other Features of the Modern Investment Process

Investing well means finding profitable transactions, which means finding buyers and sellers willing to pay too much or ask too little for whatever it is that they're trading. So investing is inherently about outperforming others, making it a competitive game of relative performance. It’s easy to see how this competition unfolds: there was a time when alpha consisted of reading press releases faster than other people, or of reading the entire 10-K instead of focusing on a glossy annual report that omitted key metrics and juicy accounting-related footnotes. These things can still be valuable, but the easy upside from them is gone. Investors can try to get better at analysis, and some do make this a deliberate practice, but that's more of a wish than a plan. The alternative is to be just as good at analysis as you always were, and to apply that skill to information other people aren't looking at.

Alternative data is new as a defined field, but the practice has been around for a long time. The Bank of England used to have a weathervane showing which way the wind was blowing on the Thames, because when it was easy for more ships to get to London, London needed more liquidity. People have done channel checks for decades (Peter Lynch writes about paying close attention to what his wife and kids wanted to shop for at the mall; Warren Buffett's investment process for Amex included swinging by local restaurants to see if people were afraid to use Amex to pay). What's changed in the last decade-and-change is the proliferation of datasets (anonymized card transaction data, web-scraping outputs, marketers' databases, regulatory documents, point-of-sale data, coupon tracking, industry-specific price feeds, etc.) and of the tools needed to ingest and analyze them.

(I was, incidentally, a participant in this process, both at a hedge fund and at various data providers. It's a fun space, and it appears to have gotten even more fun since I left.)

One of the earliest alternative data projects for which there's a public record is Paulson & Co's process for testing their mortgage hypothesis, as described in The Greatest Trade Ever. They acquired granular data on housing prices and mortgages, and ran through simulations on what different macroeconomic scenarios would do to default rates.

This was a surprisingly advanced project. Not just in the sense that it involved assembling reams of data to refine an investment thesis well before that was a standard practice, but because it involved using data to articulate a thesis about path-dependence. The industry's later data applications were actually more simplistic, and started with the observation that some datasets—web traffic, consumer spending data, some kinds of web scraping—could be used to predict revenue growth.

The next era of alternative data in finance was all about finding the data source (or a combination of data sources) that best predicts topline performance. If revenue's rising, buy; if it's declining, sell. But this turned out to be trickier than it looked. Coverage isn't perfect—panels have demographic skew, tracking web traffic means prioritizing some sales channels over others, tracking foot traffic means prioritizing an entirely different channel. And once the companies that owned datasets realized that investors would buy them, they started selling. Soon enough, the credit card data became synonymous with the buy-side consensus, and that meant that the real action was now in finding cases where the data overshot fundamentals and then betting against the large cohort of levered, tightly risk-managed short-term investors who were all trading on the same signal.

But the real evolution of alternative data has been away from calling quarters and towards producing The Missing KPIs. Companies report some of their numbers in a standardized way driven by accounting rules, but they'll also drop other numbers—active users, retention, net promoter score, the mix of one-time purchases and subscription revenue, etc. When companies choose which numbers to release, one of the things they have to keep in mind is that if they regularly report a number, and then decide to stop, investors will worry that something has gone wrong. Sometimes there's a business justification, like when Apple decided to offer less detail on unit sales because their economics had shifted more to recurring revenue. But it always makes investors skeptical. And a result of this is that if there's a metric that's important to the company, and management worries that that number won't always look good, they won't report it.

But someone with enough data can tease out the number. For example, if a company offers a subscription product in addition to one-time purchases, but is reluctant to reveal the subscription numbers, anyone with access to a consumer spending panel can look at sequences of $9.99 purchases occurring a month apart and start to see, at least within their panel, what market penetration for the subscription product looks like. But they don't have to stop there; ironically, it's hard to get a good risk-adjusted return out of being the only person to discover a secret, because the price doesn't react if the secret stays secret. So the analyst has two options:

  1. They can just make a trade and then tell everyone else that the company's subscription revenue is far from the consensus view and that the stock ought to be repriced.

  2. They can go deeper, and try to figure out what those subscription economics look like exactly. What is the difference in profitability between a Walmart+ user and a regular shopper? How does customer loyalty compare across different price points for Netflix's various offers? Does an Uber One customer shift enough of their spending from Lyft to Uber that it would make sense to cut the price of a subscription?
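The recurring-purchase idea above (sequences of $9.99 charges about a month apart) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the panel rows, the `likely_subscribers` and `panel_penetration` helpers, and the 27-to-34-day window are all hypothetical, and a real panel would need far more cleaning, such as merchant matching, price changes, and panel churn.

```python
from collections import defaultdict
from datetime import date

# Hypothetical panel rows: (panelist_id, purchase_date, amount_usd) at one merchant.
TRANSACTIONS = [
    ("a", date(2024, 1, 5), 9.99),
    ("a", date(2024, 2, 5), 9.99),
    ("a", date(2024, 3, 6), 9.99),
    ("b", date(2024, 1, 10), 9.99),   # a one-off purchase at the same price
    ("b", date(2024, 2, 20), 34.50),
    ("c", date(2024, 1, 15), 9.99),
    ("c", date(2024, 2, 14), 9.99),
    ("d", date(2024, 3, 1), 120.00),
]


def likely_subscribers(rows, price=9.99, min_gap=27, max_gap=34, min_charges=2):
    """Panelists with at least `min_charges` (>= 2) charges at `price`,
    each roughly a month after the previous one."""
    dates_by_user = defaultdict(list)
    for user, day, amount in rows:
        if abs(amount - price) < 0.005:  # match the subscription price point
            dates_by_user[user].append(day)

    subscribers = set()
    for user, days in dates_by_user.items():
        days.sort()
        run = 1  # length of the current streak of monthly-spaced charges
        for prev, cur in zip(days, days[1:]):
            gap = (cur - prev).days
            run = run + 1 if min_gap <= gap <= max_gap else 1
            if run >= min_charges:
                subscribers.add(user)
                break
    return subscribers


def panel_penetration(rows, **kwargs):
    """Share of all panelists in `rows` who look like subscribers."""
    panelists = {user for user, _, _ in rows}
    return len(likely_subscribers(rows, **kwargs)) / len(panelists)


print(sorted(likely_subscribers(TRANSACTIONS)))
print(panel_penetration(TRANSACTIONS))
```

The penetration figure only describes the panel; extrapolating it to the whole customer base runs into the same demographic-skew problems as any other panel estimate.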

All of this means getting at the true unit economics of the business, and getting inside the head of people making strategic decisions. Outside data still has limitations: you can't easily control for the selection effect where heavy spenders are likely to subscribe, whereas the company could highlight its membership offering to one subset of its audience, not show it to another, and get a like-for-like comparison. But this analysis has an information advantage in benchmarking: a data-driven investor might get the customer acquisition cost and customer lifetime value numbers wrong by 10%, but if that investor sees that one company's LTV/CAC ratio is 50% higher than all of its peers, it doesn't matter if the real number is 30% or 70%. Either way, it's a big deal, and it's entirely possible that the company doesn't recognize how far ahead of the competition it really is. Even better, when the data doesn’t lead to an immediate long-term thesis, it provides more context to react to short-term news flow. Depending on the economics of a business, cutting the price of one of their products might drive a big spike in the purchases of a high-margin complement, or might mean that the company’s unable to beat its competition on quality and has to resort to price instead. The alternative data analyst is better-positioned to have a view on this in advance and to be able to validate or refute it through data.
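The benchmarking point can be made concrete with some quick arithmetic. The sketch below assumes hypothetical numbers (an estimated LTV of $450, an estimated CAC of $100, and a peer LTV/CAC ratio of 3.0), assumes the peer ratio is measured exactly, and lets each of the company's two estimates be off by up to 10% in either direction; `premium_range` is an illustrative helper, not a standard formula.

```python
from itertools import product


def ltv_cac(ltv, cac):
    """Customer lifetime value divided by customer acquisition cost."""
    return ltv / cac


def premium_range(est_ltv, est_cac, peer_ratio, max_error=0.10):
    """Lowest and highest premium over peers when the true LTV and CAC each
    differ from the estimate by up to `max_error` in either direction.

    The ratio moves monotonically with each error, so checking the four
    corner cases is enough to find the extremes.
    """
    premiums = []
    for ltv_err, cac_err in product((-max_error, max_error), repeat=2):
        true_ltv = est_ltv * (1 + ltv_err)
        true_cac = est_cac * (1 + cac_err)
        premiums.append(ltv_cac(true_ltv, true_cac) / peer_ratio - 1)
    return min(premiums), max(premiums)


# Hypothetical estimates: the company screens as 50% better than its peers.
low, high = premium_range(est_ltv=450.0, est_cac=100.0, peer_ratio=3.0)
print(f"premium over peers: {low:.0%} to {high:.0%}")
```

With these made-up inputs, the worst case still leaves the company roughly 23% ahead of its peers and the best case roughly 83% ahead, so the sign of the conclusion survives the measurement error even though the exact size of the gap does not.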

Alternative data’s also useful as a complement to “corporate access” ($, Diff), the meetings with company management typically arranged by investment banks’ research departments. Someone who is familiar with the data has better questions to ask management, and they also get a chance to demonstrate their bona fide expertise on the business, or even to tell the company something useful about how it stacks up against competitors. These meetings always involve some give-and-take; the investor learns more about the company, the company learns what investors want to see to make the stock go up. If the investor can give a bit more, they’ll usually get more in return.

So that's where alternative data ends up. It's actually a bit like the regular structure of the market, where short-term speculation and market-making create enough liquidity to make careful asset selection a worthwhile effort. And its proliferation for extremely short-term trading has had two effects: the ebbs and flows in a company's business are more accurately reflected in its stock price intra-quarter, and there are better tools available to understand what a company will look like in ten years rather than to predict what investors will think of it next quarter.
