From Payrolls to Prompts: Firm-Level AI Labor Substitution

This firm-level study uses Ramp expense data to quantify whether cheaper AI tools substitute for human labor. It analyzes how firms allocate spend between online labor marketplaces and AI providers after ChatGPT’s 2022 surge, revealing heterogeneity across firm types.

Table of Contents
- Introduction
- Why This Matters
- Data & Methodology
- Substitution Findings
- Efficiency, Costs, and Real-World Implications
- Limitations and Robustness
- Key Takeaways
- Sources & Further Reading

Introduction
If you’ve been trying to read the tea leaves on how AI is reshaping work, you’re not alone. The big question, “are firms substituting labor with AI?”, has policymakers and business leaders asking for concrete, micro-level evidence rather than broad headlines. A new study based on firm-level data drawn from Ramp’s expense platform tackles exactly this question. The core idea is simple but powerful: when AI tools get cheaper or more capable, do firms scale back human labor and spend more on AI? And if they do, how big is the substitution, and does it look different across types of firms?

This analysis is built around a natural experiment, the sudden surge in AI awareness after the release of ChatGPT in November 2022, and a rich dataset that tracks firms’ spending on two things: online labor marketplaces (think Upwork, Fiverr, Toptal, and similar platforms) and AI model providers (OpenAI and Anthropic). The authors, led by Ryan Stevens, use a firm-level lens to see who shifts spending from people to prompts. For a deeper dive, you can read the original paper here: Payrolls to Prompts: Firm-Level Evidence on the Substitution of Labor for AI.

The study is not just about whether substitution happens; it’s about who gets hit first, how strong the substitution is, and what the price of AI-enabled labor looks like in practice. Spoiler: the evidence points to substitution, but not in a uniform way. The more a firm was already spending on online labor markets before ChatGPT, the more it scales AI spending after ChatGPT—and the substitution happens at a surprisingly low relative cost.

Why This Matters
This research lands in the middle of a heated policy and business debate: is AI a job destroyer or a driver of new kinds of work? The immediate takeaway is nuanced. The paper shows micro-level substitution patterns: some firms shift a noticeable share of labor budget into AI spending, while others lag behind. Crucially, it highlights heterogeneity: high-exposure firms (those that already relied heavily on outsourced labor) are the ones most likely to ramp up AI tools, and they do so at a cost that looks surprisingly favorable from a cost-savings perspective.

Here’s why it matters now:
- Real-world decision-making. CFOs and procurement teams are wrestling with AI budgets in a world where software licenses, API calls, and AI infrastructure can be paid for out of a different line item than labor. The paper provides a framework to think about substitution intensity and cost savings in a concrete way.
- Labor-market implications. The finding that substitution is not uniform suggests the labor market might experience compositional shifts rather than an outright collapse in demand for labor. In other words, AI could change what kinds of roles are in demand, rather than simply how many roles exist.
- Policy and inequality. If high-exposure firms reap outsized cost savings by substituting AI for labor, wage growth and job composition could diverge across sectors and firm sizes. That has implications for training, wage policies, and regional labor dynamics.
- Builds on and extends prior AI literature. This study complements earlier work that uses job postings, wage data, or vacancy data to gauge AI exposure. It adds a firm-level, expenditure-based perspective that directly maps AI spending to labor-market substitutes, offering a complementary view to task-based or occupation-based exposure indices.

Real-world scenario: Imagine a mid-sized software consultancy that was already leaning on freelance developers via online marketplaces to handle peak work. After ChatGPT, this firm starts experimenting with AI copilots and API-powered automation. The study’s findings imply that such a firm, the kind with “higher prior OLM (online labor marketplace) spend,” would be more likely to tilt its budget toward AI tools, while the relative need for human freelancers could fall, at least for certain tasks. This is exactly the kind of practical decision that business leaders need to make today: where to allocate scarce dollars, and how quickly to expect returns on AI investments.

Data & Methodology
Structure matters here. The researchers leverage Ramp, a business expense platform used by thousands of firms. Ramp tracks two main spending streams relevant to AI adoption:
- Online labor marketplaces (OLMs): Upwork, Fiverr, Toptal, PeoplePerHour, Arc, MarketerHire, Catalant.
- AI model providers: OpenAI and Anthropic.

The dataset spans Q3 2021 through Q3 2025, with a focus on the quarters before and after ChatGPT’s release in November 2022. The study makes two important preparatory moves:
- Pre-treatment baseline: They define pre-treatment as Q2 2022, a quarter that falls before ChatGPT’s launch. This lets them measure firms’ reliance on online labor marketplaces before the major shock.
- Exclusion criteria: Firms with minimal overall spending (less than $2,500 in a quarter) are excluded to avoid noise from entry or exit. They also exclude firms with no OLM spending in Q2 2022, since exposure can’t be measured reliably for those firms.

A key part of the design is the exposure variable E_i. The researchers measure exposure as the share of a firm’s total spending that went to online labor marketplaces in Q2 2022, written S^OLM_{i, Q2 2022}. They then bucket firms into quartiles of this OLM spend share, creating a simple and interpretable “dosage” variable: a higher prior OLM spend share implies a stronger motivation to adopt AI after the ChatGPT shock.
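The exposure construction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ code: the firm names and dollar figures are invented, the spend floor is applied only to the baseline quarter for simplicity, and the quartile cut points use the default method of the standard library’s `statistics.quantiles`, which may differ from whatever the paper uses.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical Q2 2022 data: firm -> (total spend, OLM spend), in dollars.
q2_2022_spend = {
    "firm_a": (50_000, 500),
    "firm_b": (80_000, 4_000),
    "firm_c": (20_000, 3_000),
    "firm_d": (120_000, 30_000),
    "firm_e": (10_000, 4_000),
    "firm_f": (2_000, 100),   # below the $2,500 floor, so excluded
    "firm_g": (60_000, 0),    # no OLM spend in Q2 2022, so excluded
    "firm_h": (40_000, 800),
}

MIN_QUARTERLY_SPEND = 2_500  # exclusion threshold described in the text


def exposure_shares(spend):
    """Return {firm: OLM share of total spend} for firms passing both filters."""
    return {
        firm: olm / total
        for firm, (total, olm) in spend.items()
        if total >= MIN_QUARTERLY_SPEND and olm > 0
    }


def exposure_quartiles(shares):
    """Map each firm to an exposure quartile, 1 (lowest) to 4 (highest)."""
    cuts = quantiles(list(shares.values()), n=4)  # three quartile cut points
    return {firm: bisect_right(cuts, share) + 1 for firm, share in shares.items()}


shares = exposure_shares(q2_2022_spend)
quartile_of = exposure_quartiles(shares)
for firm in sorted(shares, key=shares.get):
    print(f"{firm}: share={shares[firm]:.3f}, quartile={quartile_of[firm]}")
```

Bucketing into quartiles trades precision for interpretability: each coefficient then reads as a contrast with the least-exposed group rather than as a slope on a continuous share.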

The identification strategy uses a Two-Way Fixed Effects model with a difference-in-differences structure. The interaction between exposure and the post-ChatGPT period captures how spending shifted differentially for more-exposed firms after the shock:
- They run two parallel models: one for the share of spend on online labor marketplaces, and one for the share of spend on AI model providers.
- The key is the post-treatment period (post-ChatGPT), while controlling for firm fixed effects and quarter fixed effects. The coefficient on the interaction term δk tells us how much AI spending (or labor marketplace spending) changes per unit increase in exposure, after the ChatGPT shock.
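One way to write down the specification described above is sketched here. The notation is an assumption made for illustration and may not match the paper’s: y is the share of firm i’s spend in quarter t going to category k, α and γ are the firm and quarter fixed effects, and Post_t equals 1 for quarters after ChatGPT’s release.

```latex
\[
y^{k}_{it} \;=\; \alpha^{k}_{i} \;+\; \gamma^{k}_{t}
\;+\; \delta_{k}\,\bigl(E_i \times \mathrm{Post}_t\bigr)
\;+\; \varepsilon^{k}_{it},
\qquad k \in \{\mathrm{OLM},\ \mathrm{AI}\}
\]
```

Because exposure is bucketed into quartiles, E_i would in practice enter as a set of quartile indicators with the lowest quartile as the reference group, giving one δ per quartile and outcome. The main effects of E_i and Post_t do not appear separately because the firm and quarter fixed effects absorb them.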

The authors also use bootstrap methods to estimate the ratio δAI/δOLM, i.e., how much AI spending changes per dollar shift away from labor marketplaces. This ratio is crucial for translating the substitution into tangible cost-savings terms.
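The bootstrap idea can be illustrated with a simplified sketch. Everything here is an assumption made for illustration: the data are synthetic, the two-way fixed effects regression is replaced by a plain two-group difference-in-differences on per-firm changes, and firms are resampled with replacement within each exposure group. The paper’s actual procedure may differ.

```python
import random
from statistics import mean


def did(changes_high, changes_low):
    """Two-group difference-in-differences on post-minus-pre changes."""
    return mean(changes_high) - mean(changes_low)


def ratio(high, low):
    """delta_AI / delta_OLM, given (olm_change, ai_change) tuples per firm."""
    d_olm = did([f[0] for f in high], [f[0] for f in low])
    d_ai = did([f[1] for f in high], [f[1] for f in low])
    return d_ai / d_olm


def bootstrap_ratio_ci(high, low, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the ratio, resampling whole firms."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        resampled_high = rng.choices(high, k=len(high))
        resampled_low = rng.choices(low, k=len(low))
        draws.append(ratio(resampled_high, resampled_low))
    draws.sort()
    lower = draws[int(alpha / 2 * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic per-firm changes in spend shares (invented numbers):
# high-exposure firms cut OLM share more and raise AI share more.
data_rng = random.Random(42)
high = [(data_rng.gauss(-0.10, 0.02), data_rng.gauss(0.008, 0.002)) for _ in range(200)]
low = [(data_rng.gauss(-0.01, 0.02), data_rng.gauss(0.002, 0.002)) for _ in range(200)]

point = ratio(high, low)
ci_low, ci_high = bootstrap_ratio_ci(high, low)
print(f"ratio = {point:.4f}, 95% interval = ({ci_low:.4f}, {ci_high:.4f})")
```

The ratio comes out negative in this sketch because AI spend rises while marketplace spend falls; its absolute value is the quantity read as "AI dollars per labor dollar cut." Resampling firms rather than firm-quarters keeps each firm’s observations together, and a bootstrap suits this problem because the sampling distribution of a ratio of two estimates is awkward to derive analytically.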

For context and transparency, you can read the methodology and full results in the original paper here: Payrolls to Prompts: Firm-Level Evidence on the Substitution of Labor for AI.

Substitution Findings
Substitution direction is clear: post-ChatGPT, firms with higher exposure to AI shocks tend to shift more of their spending from labor marketplaces to AI model providers. The results are twofold:
- AI model provider spending increases with AI exposure. Among the highest exposure quartile (firms at or above the 75th percentile of OLM spend share in Q2 2022), AI model provider spending increased by 0.8 percentage points in Q3 2025 relative to the least-exposed group. This increase is measured as an absolute change in the AI spend share, on top of a baseline AI spend share that sits around 2.85% as of Q3 2025. In plain terms: a meaningful, statistically detectable tilt toward AI spending in the most AI-exposed firms, and not in the least-exposed ones.
- Labor marketplace spending declines at the same time. The highest-exposed firms cut their labor marketplace spending by about 15% relative to the least-exposed firms, indicating a real substitution away from external, human-enabled task marketplaces.

These results show both signs of substitution and the timing dynamics of adoption:
- Higher-exposed firms ramp AI spending earlier, suggesting a faster adoption path when a firm has more to gain from AI tooling.
- The substitution pattern holds after controlling for fixed firm characteristics and seasonal effects, reinforcing the view that the ChatGPT shock interacted with existing exposure to drive spending shifts.

A key nuance the authors acknowledge is that the evidence is not purely causal in the strictest sense. There are potential confounders and alternative explanations:
- Pre-trends: The highest-exposure group already spent more on OL marketplaces in late 2021 (before the AI shock), signaling potential prior momentum that could influence post-2022 dynamics.
- Compositional effects: Firms could be reorganizing their spend rather than making a straight line substitution. Some of the observed AI adoption could be coupled with other investments that amplify the effect.
- In-house costs: The analysis does not capture costs of building or maintaining AI capabilities in-house (data infrastructure, model deployment, engineering headcount), which means the true cost of AI adoption is understated and the apparent cost savings could be overstated.

Importantly, the paper’s core contribution is the firm-level substitution pattern, not a blanket claim about macro labor market outcomes. The authors emphasize that their evidence is consistent with a world in which AI is labor-augmenting in many contexts—so even as some tasks become automated, the demand for labor in other areas (like AI deployment, integration, and maintenance) could rise faster than the rate of direct substitution.

Efficiency, Costs, and Real-World Implications
One of the most striking takeaways is the cost-side interpretation of substitution. The researchers quantify the efficiency gain by looking at the ratio of the AI spending response to the labor-marketplace spending response, δAI/δOLM, for different exposure quartiles. The results are telling:
- Highest exposure quartile (at or above the 75th percentile of OLM spend share in Q2 2022): For every $1 decrease in labor marketplace spend, there is only about a $0.03 increase in AI model provider spend in Q3 2025 (relative to the 2022 baseline). In other words, the AI spend that replaces each labor dollar is small, and it is set against a much larger pre-existing labor marketplace spend for these firms.
- Middle exposure quartile (50th to 75th percentile of OLM spend share): For every $1 decrease in labor marketplace spend, there is a $0.30 increase in AI model provider spend. That is ten times the top quartile’s figure, meaning more AI spend accompanies each labor dollar cut in this band.

The authors summarize the cost implication in practical terms: even if the substitution ratio is not one-for-one, the net effect is a meaningful cost saving. They provide a concrete example: if a firm spends $100,000 on labor marketplaces and $10,000 on AI model providers, substituting labor with AI could save around $90,000. That’s a striking illustration of how even relatively small shifts in the AI spend share can translate into substantial absolute savings, especially for firms with substantial pre-existing reliance on online labor marketplaces.
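The arithmetic behind these figures is simple enough to check directly. The sketch below is illustrative only: `substitution_outcome` is a hypothetical helper, and the $100,000 labor cut is the article’s example amount applied to each ratio, not a result from the paper.

```python
def substitution_outcome(labor_cut, ai_per_labor_dollar):
    """Net effect when labor marketplace spend falls by `labor_cut` dollars
    and AI spend rises by `ai_per_labor_dollar` for each dollar cut."""
    ai_increase = labor_cut * ai_per_labor_dollar
    return {
        "ai_increase": ai_increase,
        "net_saving": labor_cut - ai_increase,
    }


# Ratios quoted in the text, plus the 0.10 implied by the $100k / $10k example.
scenarios = [
    ("top quartile ratio", 0.03),
    ("article example", 0.10),
    ("50th-75th percentile ratio", 0.30),
]
for label, r in scenarios:
    out = substitution_outcome(100_000, r)
    print(f"{label}: AI +${out['ai_increase']:,.0f}, net saving ${out['net_saving']:,.0f}")
```

Read this way, the $90,000 saving in the example corresponds to a ratio of 0.10, which sits between the two estimated ratios. None of these figures include the in-house costs that the expense data cannot observe.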

There’s also a broader point about efficiency and composition. The paper’s counterpoint to a naive “AI will replace all workers” view is that AI adoption can be labor-augmenting in aggregate demand terms. AI might automate certain tasks or reduce the need for contractors in some tasks while simultaneously creating demand for new roles in AI deployment, model maintenance, data preparation, and system integration. The net effect on employment depends on the pace of AI-enabled productivity gains in services and product delivery over time.

Limitations and Robustness
Every study with microdata and a natural experiment has caveats. The authors acknowledge several:
- Causality caveats: While the design uses a quasi-experimental shock (ChatGPT) and a dosage variable (pre-ChatGPT OL marketplace spending), there could still be unobserved factors driving both AI adoption and labor-market spending. Pre-trends also suggest that high-exposure firms had different spending patterns even before ChatGPT.
- Observability of all costs: The analysis focuses on observable expenditures through Ramp. It does not capture internal investments like data infrastructure, custom AI tooling, cloud compute, or engineering headcount that accompany AI adoption.
- External validity: Ramp’s customer base is diverse but not a perfectly representative cross-section of all firms. The exact magnitudes of substitution could vary by sector, geography, and firm size in ways not fully captured in the dataset.

Key Takeaways
- Substitution happens, but not uniformly. Firm-level data show that AI spending increased more for firms with higher pre-existing exposure to online labor marketplaces after ChatGPT, while labor marketplace spending declined more for those same firms.
- The cost picture is surprisingly favorable to AI adoption. In high-exposure firms, substituting AI for labor occurs at a relatively low incremental AI spend, though the exact ratio varies by exposure level. The study emphasizes a “20–25x” ballpark for cost savings in some interpretations, caveated by measurement limits.
- The dynamics are about micro-structure, not macro-structure. The evidence points to heterogeneous substitution at the firm level, which could translate into evolving job tasks, a shift in job quality, and changes in recruitment patterns, rather than an immediate, uniform decline in employment.
- Read beyond the headline: AI is likely to both automate certain tasks and create new ones. Firms with the right capabilities—data, integration, and AI deployment—may experience productivity gains that offset, or even exceed, reductions in contractor labor costs.

If you’re in the business of budgeting for AI or shaping a workforce strategy, what does this mean for you?
- Start with your exposure. If you’re already heavily relying on external labor marketplaces for execution, you’re more likely to see AI-enabled substitutions. This isn’t a universal rule, but it’s a meaningful signal.
- Price the AI correctly. The paper’s cost-saving takeaway is not about one-for-one swaps; it’s about the relative shift in spend. When evaluating AI tools, factor in not just the license costs but the potential savings from reduced contractor spend, plus any hidden costs for integration and maintenance.
- Think in steps, not leaps. Early adopters may show faster shifts in spend but still require human expertise for deployment and governance. The job of managing AI will lean toward “orchestrating” AI-enabled processes rather than simply “replacing” human labor.

Bottom line: This study adds a robust, firm-level piece to the evolving picture of AI and work. It shows genuine, measurable substitution from labor to AI spending, especially among firms with high prior investment in online labor marketplaces. But the path from micro-substitution to macro job outcomes is not a straight line. AI adoption might well push productivity and create new roles even as it reduces demand for some contractor-based tasks.

For readers who want to dive deeper, the original research paper offers a thorough walk-through of the data, identification strategy, and nuanced interpretation of the results. You can find it here: Payrolls to Prompts: Firm-Level Evidence on the Substitution of Labor for AI, a study led by Ryan Stevens.

Key Takeaways (condensed)
- Firms with higher pre-existing spend on online labor marketplaces are more likely to substitute AI for labor after the ChatGPT shock.
- AI spending grows in higher-exposure firms, while labor marketplace spending declines, with notable timing differences across exposure groups.
- The estimated substitution is not one-for-one; the ratio varies by exposure quartile, but even modest AI investment can generate substantial cost savings when labor costs are large.
- The results emphasize micro-level heterogeneity and suggest AI adoption may be labor-augmenting overall, depending on how firms scale their AI capabilities and how demand for AI-enabled services evolves.
- Real-world implication: boardrooms and policy discussions should consider not just fewer jobs, but more nuanced shifts in task allocation, skill demands, and the cost structure of labor versus AI tooling.

Sources & Further Reading
- Original Research Paper: Payrolls to Prompts: Firm-Level Evidence on the Substitution of Labor for AI
- Authors: Ryan Stevens
- Additional context (for interested readers): See related literature on automation exposure, AI in labor markets, and online freelancing dynamics cited in the paper, including works by Acemoglu, Brynjolfsson, Felten, and colleagues, which provide broader context on how AI interfaces with work, wages, and employment.
