Replicate updated their API terms with two changes that deserve attention if you're running any kind of automated workflow on their platform. Neither change was announced. Both are now live.

What Changed

You pay for failed runs. This isn't entirely new behavior, but it's now explicitly codified in the terms. For private models and deployments, failed and canceled runs are billed for the time the compute instance was active — regardless of whether your request produced any output. If a model call fails halfway through, you're paying for that half. If you're running multi-step workflows where one model calls downstream models, you're also billed for any downstream compute consumed before the point of failure.

Disputing those charges just got more restrictive. The updated terms require you to raise any billing dispute within 30 days of the payment due date. More importantly, continued use of the API after experiencing a billing issue is treated as acceptance of the charge. In practice, if you notice a questionable charge and keep running your workflows while you sort it out, then by the time you try to dispute it you have already waived your right to do so by staying active on the platform.

The terms also establish Replicate's internal billing process as the required first step before any external dispute or chargeback. They specify no timeline for how long that internal review takes, which is the detail worth watching.
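To make the 30-day window concrete, here is a minimal sketch of the date arithmetic. It assumes the window is counted in calendar days from the payment due date and that the final day is still inside the window. Neither detail is confirmed here, so check the terms themselves. The function names and the example dates are illustrative.

```python
from datetime import date, timedelta

# 30 days comes from the terms as described above; counting calendar days
# and treating the last day as inclusive are assumptions.
DISPUTE_WINDOW_DAYS = 30


def dispute_deadline(payment_due: date) -> date:
    """Last day to raise a dispute, counted from the payment due date."""
    return payment_due + timedelta(days=DISPUTE_WINDOW_DAYS)


def days_left(payment_due: date, today: date) -> int:
    """Days remaining in the window; negative once it has closed."""
    return (dispute_deadline(payment_due) - today).days
```

For a hypothetical invoice due on 2026-03-01, `dispute_deadline` returns 2026-03-31, and a charge first noticed on 2026-03-25 leaves six days to raise it.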

The Bigger Picture

Failed API calls and billing disputes have become a recurring friction point across AI infrastructure platforms in 2026. Google Gemini spent most of Q1 dealing with a billing bug that generated five-figure charges for some developers, with reimbursements issued as credits rather than actual refunds, and payment profiles locked for anyone who filed a dispute directly with their bank.

Replicate's terms don't go that far. But the pattern is consistent: as AI API usage scales, platforms are quietly tightening the language around who bears the cost when things go wrong. The answer is increasingly: you do, and your recourse is narrower than you might assume.

For developers running high-volume or automated pipelines on Replicate — image generation, fine-tuned model inference, multi-model chains — the failed-run billing clause is the one that will show up as an unexpected line item first. The dispute window clause is the one that limits what you can do about it afterward.

What to Do

Three things worth doing now if Replicate is in your stack:

First, check your error handling. If failed runs are billable compute, unhandled errors and retry loops become a direct cost. Make sure your pipeline is logging failures and not silently retrying on a dead request.
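One way to keep retries from becoming an open-ended cost is to cap the attempts, log every failure with its elapsed time, and refuse to retry errors that will never succeed. The sketch below is a generic wrapper, not Replicate's client API: `RetryableError`, `run_with_retries`, and the parameters are hypothetical names, and `call` stands in for whatever function submits your run.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("pipeline")


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (for example, a timeout)."""


def run_with_retries(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call` with a hard cap on attempts and exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        started = time.monotonic()
        try:
            return call()
        except RetryableError as exc:
            # Each failed attempt may be billable, so record how long it ran.
            elapsed = time.monotonic() - started
            log.warning("attempt %d/%d failed after %.1fs: %s",
                        attempt, max_attempts, elapsed, exc)
            if attempt == max_attempts:
                raise  # give up instead of looping on a dead request
            sleep(base_delay * 2 ** (attempt - 1))
        except Exception:
            # Anything else (bad input, auth failure) will not fix itself.
            log.exception("non-retryable failure on attempt %d", attempt)
            raise
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the backoff can be swapped out in tests. Which errors count as retryable is a judgment you have to make for your own pipeline.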

Second, set up usage monitoring. Replicate's billing dashboard shows current usage, so use it. Surprises discovered at month-end are harder to dispute than surprises caught in real time, and the 30-day window starts from the payment due date, not from when you noticed the charge.

Third, read the dispute clause before you need it. If you ever do have a billing issue, don't keep running production workloads while you wait for resolution — the continued usage waiver is the clause most likely to catch developers off guard.
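The monitoring and pause-on-dispute advice can be combined into a small gate that every job checks before submitting a run. This is a rough sketch under stated assumptions: `BillingGuard` is a hypothetical in-process helper, the spend figures come from your own records rather than any Replicate API, and the code says nothing about what the terms legally require.

```python
from dataclasses import dataclass


@dataclass
class BillingGuard:
    """Hypothetical gate; every value is supplied by your own tooling."""
    budget_usd: float            # spend ceiling you choose for the billing period
    spent_usd: float = 0.0       # running total recorded from your own usage data
    dispute_open: bool = False   # set True while a billing issue is unresolved

    def record(self, cost_usd: float) -> None:
        """Add the cost of a finished run, including failed or canceled ones."""
        self.spent_usd += cost_usd

    def allow_run(self) -> tuple[bool, str]:
        """Return (allowed, reason) for the next run."""
        if self.dispute_open:
            return False, "billing dispute open"
        if self.spent_usd >= self.budget_usd:
            return False, "budget reached"
        return True, "ok"
```

A pipeline would call `allow_run()` before each submission and skip or queue the job when it returns `False`. Flipping `dispute_open` then pauses production traffic in one place instead of relying on someone remembering to stop each workflow.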

Trish @ StackDrift

This change was detected by the StackDrift scanner. Drift Intel tracks vendor terms, pricing, and API policy changes so you don't have to find them the hard way.
