The Punitive Cost of High-Skill Hiring
On O-rings and opportunity cost
Over a sufficiently long period, every organization is defined by who joins, who gets promoted, and who quits. Physical capital decays, brand names come in and out of fashion, assets get acquired and divested, but there's a sort of hiring lineage that connects every single person involved in the organization back to its roots: you're either a founder, someone hired by a founder, or someone hired by someone in that chain. There are some fun things to look at in the left tail of the skill distribution, like Amazon (disclosure: long) determining warehouse locations by looking at how many non-college-bound high school seniors will graduate each year within commuting distance, or amusement parks' implicit minimum wage hedge where newly-expensive workers are in the same demographic as newly-flush incremental customers.1
But hiring gets a lot more interesting at the far right tail of the skill distribution, because firms have two complex problems to solve:
1. They need to hire people who are not just talented, but probably more talented than the ones they already have. Any good business that efficiently converts smarts, conscientiousness, risk-tolerance, etc. into money will inevitably attract competition, and there's a natural synergy between older-and-wiser senior employees and younger ones who are quick on the uptake.
2. But it's murderously expensive to assess these people's skills, because doing so takes time that those same senior employees would otherwise use for directly revenue-generating activities. The stricter their cutoff, the more people they're evaluating.2
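To put rough numbers on the second problem, here is a minimal sketch. It assumes each candidate clears the bar independently with the same probability, so the expected number of evaluations per hire is one over the pass rate. The function name, the four hours per candidate, and the $1,000 per hour of senior time are all hypothetical figures chosen for illustration, not numbers from any real firm.

```python
def screening_cost_per_hire(pass_rate, hours_per_candidate, senior_hourly_value):
    """Expected senior-time cost of evaluating candidates until one clears the bar.

    Assumes each candidate passes independently with probability pass_rate,
    so the expected number of candidates evaluated per hire is 1 / pass_rate.
    """
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    expected_candidates = 1 / pass_rate
    return expected_candidates * hours_per_candidate * senior_hourly_value


# Hypothetical inputs: 4 hours of senior time per candidate, valued at $1,000/hour.
for pass_rate in (0.20, 0.05, 0.01):
    cost = screening_cost_per_hire(pass_rate, hours_per_candidate=4, senior_hourly_value=1_000)
    print(f"pass rate {pass_rate:.0%}: about ${cost:,.0f} of senior time per hire")
```

Under these assumptions the cost scales inversely with the pass rate: halving the share of candidates who clear the bar doubles the senior time spent per hire.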
One way companies handle this is, of course, to be ruthlessly selective about who makes it to an interview. There are two complementary ways to do this. One is to set out an exact path with a known metric, and select for people who got on the right track very early (if you're going to write for the Harvard Law Review or do well in a math Olympiad, you need to have decided that that was the plan pretty early in life).3 The other is to find unique signals of high achievement, which also turn out to be signals of agency. Someone who fails out of their statistics major because they spend all their time making positive-EV bets on sporting events and politics has, in effect, decided to roll their own undergrad capstone project in very-much-applied statistics instead.
The proof-of-work-and-agency model seems anecdotally more common at earlier stages, and it's rarer but not unheard of for it to work at big institutions. One of the drawbacks is that the flow of talent is unpredictable: how do you estimate how often someone will come out of nowhere and deliver an incredibly impressive app, incisive essay, killer trade writeup, etc.? It's also getting rarer as companies look for talent earlier: these companies have a vested interest in making the path towards working for them more visible, so they don't miss out on talent that just wasn't aware that they had the option. And since time is finite, all of these legible options crowd out the illegible ones.
There's also a point in a company's existence, quite variable between companies and over time, at which a certain amount of agency is a liability. A brand and a balance sheet both get more valuable over time, and at some point it's just not great to hire people who are excessively willing to risk both.
Key man risk is also a consideration. At early-stage companies, capital is scarce, so every employee is expected to generate a sufficiently large return on the capital invested in employing them; if they don't, the company won't hit the operational milestone it needs to raise more capital. While the machine is still being built, a large quantum of output per employee is table stakes, and that sort of outsized return per employee requires outsized risk-taking per employee. Large companies instead optimize for the bus problem: maximizing the number of employees who could get hit by a bus without affecting the broader organization. Institutional knowledge gets distributed and systematized so that everyone plays a small role in operating the machine, and the machine doesn't need idiosyncratic or irreplaceable knowledge to keep running.
One of the downsides to the legible-filter model is that it selects a pool of candidates who anyone can tell are probably desirable. This means that they're expensive, but also that employers have an incentive to make preemptive exploding offers early in the hiring process. In extreme cases, like investment bankers interviewing for private equity associate roles, the interview process for the PE job starts a few months into the banking job (or increasingly these days, during training, before banking analysts even hit the desk). It's worth thinking through the overall incentive structure here. These offers are necessarily less informed than the ones that happen after a slower interview process. If the employer collects relatively more of their information about talent after they've made the decision to hire them, the natural equilibrium is some combination of highly variable compensation and a policy of fast firing.
And that fast firing is more common than one might naively expect because of O-ring dynamics: complex multi-step processes (lawsuits, private equity deals, signal generation for quant funds, brain surgeries, chip fabrication) are most dependent on the worst-executed set of steps, so there's a point at which someone who is objectively very skilled, but near the bottom of the distribution in their company, is more of a liability than an asset. This shows up in all kinds of ways, but the easiest to think about is communication: sometimes the difference between success and failure comes down to whether someone immediately understands an email they received, or has to ask a few clarifying questions. When a 1590 on the SAT gets noted as a negative signal, this is why: the person in question is obviously bright, but they just might waste some generally-valuable time right when its value peaks.
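One way to see the weak-link effect is the multiplicative form used in O-ring models of production, where a project only pays off if every step goes right, so expected value is the payoff times the product of per-step success probabilities. The sketch below is illustrative only: the function name, the ten-step process, the 99% and 90% success rates, and the $100 million payoff are all hypothetical.

```python
from math import prod


def expected_value(step_success_probs, value_if_flawless):
    """Expected payoff when every step must succeed for the project to pay off."""
    return value_if_flawless * prod(step_success_probs)


DEAL_VALUE = 100_000_000  # hypothetical payoff of a flawlessly executed project

strong_team = [0.99] * 10            # every step executed at 99% reliability
one_weak_link = [0.99] * 9 + [0.90]  # same team, one step at 90% reliability

loss = expected_value(strong_team, DEAL_VALUE) - expected_value(one_weak_link, DEAL_VALUE)
print(f"expected value lost to the single weak link: ${loss:,.0f}")
```

The person responsible for the 90% step is still right nine times out of ten, which is the point: in a multiplicative process, their shortfall is scaled by the value of everyone else's work.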
And this helps explain why starting compensation in some fields goes so nonlinear over time. Increasing the acceptance rate of new hires by bumping their comp from low- to mid-six figures can save more than that in opportunity cost. So, once you've gone through the incredible expense of identifying the right person, the cost of paying them enough to make "yes" the likely answer to the offer is comparatively trivial.
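The arithmetic behind that tradeoff can be sketched as follows. The model assumes the search cost is paid again every time an offer is declined, so the expected search cost per filled seat is the cost per offer divided by the acceptance rate. Every figure here is hypothetical: the $1 million of senior time per offer-worthy candidate, the two comp levels, and the acceptance rates attached to them.

```python
def cost_per_filled_seat(search_cost_per_offer, acceptance_rate, first_year_comp):
    """Expected search cost per accepted offer, plus the comp paid to the hire.

    Assumes each declined offer means repeating the search from scratch.
    """
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return search_cost_per_offer / acceptance_rate + first_year_comp


SEARCH_COST = 1_000_000  # hypothetical senior time spent per candidate worth an offer

low_offer = cost_per_filled_seat(SEARCH_COST, acceptance_rate=0.40, first_year_comp=200_000)
high_offer = cost_per_filled_seat(SEARCH_COST, acceptance_rate=0.80, first_year_comp=500_000)

print(f"low offer:  ${low_offer:,.0f} per filled seat")
print(f"high offer: ${high_offer:,.0f} per filled seat")
```

With these made-up inputs, the higher offer costs $300,000 more in comp but is cheaper per filled seat, because it halves the number of searches that end in a declined offer.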
The Diff has written a few pieces on the economics of talent, including:
1 Incidentally, this is a case where theme parks have taken advantage of an opportunity for operational hedging of macro risks: a disproportionate share of their attendance growth has come from expanding the traditional season into Halloween and Christmas events, instead of closing the parks from Labor Day to Easter. For Halloween in particular, the teen and young-adult attendees are exactly the kinds of people whose labor costs affect margins during the more family-friendly peak summer period.
2 This is especially punitive because the value these people add is best understood as a return on incremental labor paired with the human capital they've accumulated over time. So you're taking a business that has a high upfront cost and high incremental margins, and deliberately cutting utilization. It's like grounding a plane or idling an oil refinery or leaving an autonomous car in the parking lot: not something you want to do gratuitously.
3 "You" is doing a bit of work here, and parents' assumptions about which skills count and willingness to push their kids make a big difference in outcomes at age 18, even if that's mostly a wash by 28 or so. My kids are going to participate in more math competitions than I did, because that wasn't really on my parents' radar when I was growing up, whereas I've been raring to go on The Art of Problem Solving ever since my oldest could count to five. I don't necessarily expect them to do very well, but it's a good idea to calibrate how "good at math" you are by finding out the range of talent and commitment that exists in the broader world. If the average classroom size is 25 kids, and kids are age-segregated, it's very valuable to know if being the best in class means being top 4% in one very narrow cohort or placing well on a more comprehensive leaderboard.