How to Improve Time to First Byte Without Code

Your website starts answering questions the moment a user's browser sends a request. Time to First Byte is how long that first answer takes to arrive. It's the single most consequential metric that doesn't show up in Core Web Vitals, which is why most founders treat it like background noise until they notice their pipeline has gotten quieter.

TTFB sits upstream of everything else. Before your JavaScript loads, before images render, before the page is even visible, TTFB is already running. It determines whether a user bounces or stays. It controls whether Google's crawler spends budget on your site or moves on. It affects whether LLMs can cite you or give up waiting. And it has a direct, measurable relationship to revenue.

This isn't about optimization theater. It's about treating your web presence like infrastructure instead of a marketing project.

What TTFB Actually Measures

Time to First Byte is the elapsed time from when a browser starts requesting a page until the first byte of the response arrives back. That's it. It covers everything that happens before that byte lands: any redirects, the DNS lookup, connection and TLS setup, network latency, and server processing. It does not include rendering, JavaScript execution, or image download. It's purely the round trip between your user and whatever server answers the request.
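
If you want to see this number for yourself, a rough sketch like the one below is enough. It uses only Python's standard library, and the function name `measure_ttfb` and its parameters are made up for illustration. It's an approximation: `http.client` hands back the response only once the status line and headers have been read, which is marginally later than the true first byte.

```python
import http.client
import time


def measure_ttfb(host, port=443, path="/", use_tls=True, timeout=10.0):
    """Return seconds from opening a connection until response headers arrive.

    Approximation of TTFB: includes DNS lookup, TCP connect, TLS setup (if
    enabled), the request itself, and server processing. It does not follow
    redirects, so a 301/302 is timed as-is.
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    start = time.perf_counter()
    try:
        conn.request("GET", path)  # connects on first use, then sends the request
        conn.getresponse()         # blocks until the status line and headers are read
        return time.perf_counter() - start
    finally:
        conn.close()
```

Calling `measure_ttfb("example.com")` from your laptop gives you one lab sample from one location on one network. Treat it as a smoke test, not as the number your users experience.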

Google rates a TTFB of 800 ms or less as "good" at the 75th percentile[1], with anything over 1,800 ms rated "poor." Most tech company websites sit between 1,000 and 2,500 ms, which puts them in "needs improvement" or "poor" territory, and they're failing silently. TTFB isn't one of the metrics behind the headline Lighthouse score, and visitors don't consciously feel it, but the damage compounds every single day.
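
To make the thresholds concrete, here is a minimal sketch that takes a list of TTFB samples in milliseconds, finds the 75th percentile, and rates it against the 800 ms and 1,800 ms cutoffs above. The function names and the nearest-rank percentile method are assumptions for illustration; real monitoring tools may compute percentiles slightly differently.

```python
import math

GOOD_MS = 800    # at or below this, p75 TTFB is rated "good"
POOR_MS = 1800   # above this, p75 TTFB is rated "poor"


def p75(samples_ms):
    """75th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def rate_ttfb(samples_ms):
    """Rate a set of TTFB samples by their 75th percentile."""
    value = p75(samples_ms)
    if value <= GOOD_MS:
        return "good"
    if value <= POOR_MS:
        return "needs improvement"
    return "poor"
```

The point of rating the 75th percentile rather than the average is that a fast median can hide a slow tail. A quarter of your visitors are having a worse experience than whatever this number says.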

TTFB matters more than the scores suggest because it's a floor. Nothing else on the page can start until the first byte arrives, so no amount of front-end optimization can make a page faster than its TTFB. The cost shows up in behavior: within the first five seconds, each additional second of load time lowers conversion rates by an average of 4.42%, and the probability of a bounce rises 32% as load time goes from one second to three[2]. That's not marketing noise. That's revenue leaking directly into network latency.

Why Your Current Setup Probably Has Slow TTFB

Most tech companies use one of two setups that guarantee slow TTFB: a traditional CMS running on shared hosting, or a monolithic application server sitting behind a CDN.

WordPress averages around 1.4 seconds of TTFB[4]. That's the baseline, before any of your customization adds more lag. The CMS is doing the same work every time someone visits the homepage: spinning up PHP, querying the database, running plugins, and rendering HTML on every single request.

An application server sitting behind a CDN looks fast when you test it, because your request is usually answered from a cached copy at an edge node close to you. But the origin, the actual server that answers cache misses, is often slow because it's doing real work: querying databases, rendering dynamic content, running your application logic. A CDN can't speed up a response it doesn't already have; it can only cache what the origin has already produced. If your origin takes 1.5 seconds to generate a response, every cache miss will take at least 1.5 seconds.
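
The arithmetic behind that is worth a quick sketch. The model below is deliberately simplified and its names are hypothetical: a cache hit costs only the edge latency, a miss costs the edge latency plus the full origin response time, and the hit ratio blends the two.

```python
def miss_ttfb_ms(edge_ms, origin_ms):
    """TTFB on a cache miss: the edge hop plus the full origin response time."""
    return edge_ms + origin_ms


def expected_ttfb_ms(hit_ratio, edge_ms, origin_ms):
    """Average TTFB across hits and misses for a given cache hit ratio (0 to 1)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit = edge_ms
    miss = miss_ttfb_ms(edge_ms, origin_ms)
    return hit_ratio * hit + (1.0 - hit_ratio) * miss
```

With a 90% hit ratio, a 50 ms edge, and a 1,500 ms origin, the average works out to about 200 ms, which looks healthy. The one visitor in ten who misses the cache still waits about 1,550 ms, and that visitor is who your slowest percentiles are made of.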

The third, more subtle problem is that you're probably measuring TTFB wrong. Most teams check their homepage from the office or run a Lighthouse audit, and both reflect near-optimal network conditions. Real TTFB, the number Google's thresholds are defined against, comes from field data: real users on real networks. That data is almost always slower than what you see locally, sometimes by 2-3x.

TTFB and Crawl Budget

Search visibility isn't just about content. It's about whether Google can afford to crawl your site.

Google allocates a crawl budget to each domain: a finite number of requests per day. If your TTFB is high, Google gets through fewer pages within that budget. If your TTFB drops, Google crawls more. This creates a vicious circle: slow sites get crawled less often, their index entries go stale, their rankings drift, and no one notices why, because the content wasn't wrong. The site was just invisible.

Google recommends keeping server response time under 200 ms to preserve crawl budget. In one documented case, cutting TTFB from 2.27 seconds to 483 ms produced measurable gains in crawl frequency[3]. That's not a small improvement. That's the difference between getting re-crawled every week and getting re-crawled every month.
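
A back-of-the-envelope sketch shows why the relationship runs in that direction. It assumes, purely for illustration, a crawler that fetches pages one at a time and is willing to spend a fixed amount of waiting time on your site; the 600-second budget is a made-up number, and real crawlers schedule requests in far more complicated ways.

```python
def pages_per_budget(budget_seconds, ttfb_seconds):
    """Pages a sequential crawler can start fetching within a fixed time budget.

    Toy model: assumes response time is dominated by TTFB and that the
    crawler never fetches in parallel.
    """
    if ttfb_seconds <= 0:
        raise ValueError("ttfb_seconds must be positive")
    return int(budget_seconds // ttfb_seconds)


# Hypothetical 10-minute budget, using the two TTFB values from the case above.
before = pages_per_budget(600, 2.27)
after = pages_per_budget(600, 0.483)
speedup = after / before
```

Under those assumptions the same budget covers roughly 4.7 times as many pages after the fix. The exact multiple matters less than the shape: in this model, crawl coverage scales inversely with response time.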

This matters even more now that LLMs are crawling sites. The crawlers and retrieval tools behind Claude, GPT, and other models fetch pages in order to cite them. If your TTFB is slow, the fetch may time out or your content may be deprioritized. You don't appear in summaries or citations. Visibility now depends not just on Google's algorithm, but on whether your site is fast enough for machines to bother reading it.

The Infrastructure Decisions That Actually Move TTFB

TTFB is almost entirely determined by your hosting architecture. You can't code your way to a fast TTFB if your infrastructure is built for something else.

Statically generated content served from the edge. If your marketing site doesn't require real-time personalization, static generation is the single biggest leverage point. A static HTML file served from a CDN edge node arrives in a fraction of the time an origin would take. No database queries, no backend processing, no "time for the server to think." This is why choosing the right framework for production sites matters: statically generated Next.js pages hit around 0.08 seconds TTFB compared to WordPress's 1.4 seconds—roughly 17x faster[4].

Static generation means you generate the HTML ahead of time, at build time, before any visitor asks for it. GitHub Pages, Vercel, Netlify, and similar platforms all support this model. Your pages live on servers distributed globally. When a user requests your homepage, the closest server responds immediately with the already-rendered HTML. Database queries don't happen on the critical path.
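
Stripped of framework details, the idea fits in a few lines. The sketch below is not how Next.js or any specific platform implements it; it's a toy build step, with a hypothetical `build_site` function and page dictionary, that renders every page to an HTML file once so that serving it later involves no rendering at all.

```python
import html
from pathlib import Path
from string import Template

PAGE = Template(
    "<!doctype html>\n"
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>\n"
)


def build_site(pages, out_dir):
    """Render each page to <out_dir>/<slug>/index.html at build time.

    `pages` maps a URL slug ("" for the homepage) to a dict with "title"
    and "body" strings. Returns the list of files written.
    """
    out = Path(out_dir)
    written = []
    for slug, fields in pages.items():
        target = out / slug / "index.html" if slug else out / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        rendered = PAGE.substitute(
            title=html.escape(fields["title"]),
            body=html.escape(fields["body"]),
        )
        target.write_text(rendered, encoding="utf-8")
        written.append(target)
    return written
```

Everything expensive happens inside `build_site`, once per deploy. What's left at request time is a file read, which is exactly the kind of work an edge node is good at.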

Origin architecture that respects server response time. If your site requires dynamic content—user authentication, personalization, real-time data—then your origin server is the constraint. The question is whether your origin is designed for that.

A monolithic application that queries multiple databases, renders templates, and processes logic on every request will always have high TTFB. Splitting that work into smaller, cacheable pieces—or using a database optimized for read speed (not write flexibility)—can cut server response time in half.

The practical question is whether your origin takes 500 ms to respond or 1.5 seconds. That difference almost always comes down to database design, the caching layer (Redis, Memcached), and architectural choices that were made years ago and never revisited.
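
What a caching layer buys you is easiest to see in miniature. The sketch below is an in-process stand-in, not Redis or Memcached, and the `TTLCache` class is hypothetical. It shows the read-through pattern those systems are typically used for: do the slow work once, then answer from memory until the entry expires.

```python
import time


class TTLCache:
    """Tiny read-through cache: compute on a miss, reuse until the TTL expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}      # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]   # fresh hit: no database, no rendering
        value = compute()     # miss or expired: pay the full origin cost once
        self._store[key] = (now + self._ttl, value)
        return value
```

In use, `compute` would be whatever currently makes your origin slow, for example the function that queries the database and renders the page. A shared cache like Redis does the same job across many servers; this version only helps within a single process.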

Geographic routing and regional failover. If you're using a CDN for content delivery but your origin is in one region, users in other regions hit that single origin. Network latency alone can be meaningful depending on distance. Using regional origins, or geographically distributed edge computing (like Vercel's Edge Functions or Cloudflare Workers), moves computation closer to users. This is smaller than the static generation win, but it compounds.
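
To put a rough number on "meaningful," the sketch below estimates the physical floor on latency over a given distance. It rests on two stated assumptions: light in optical fiber covers roughly 200,000 km per second (about two thirds of its speed in a vacuum), and a brand-new HTTPS connection needs about three round trips before the first byte (TCP handshake, TLS 1.3 handshake, then the request itself). Real routes are longer than straight lines, and connection reuse or QUIC reduces the round-trip count, so treat the output as an illustration only.

```python
FIBER_KM_PER_MS = 200.0  # assumed: about 200,000 km/s for light in optical fiber


def min_round_trip_ms(distance_km):
    """Lower bound on one network round trip over a straight fiber path."""
    return 2 * distance_km / FIBER_KM_PER_MS


def min_ttfb_floor_ms(distance_km, round_trips=3):
    """Lower bound on TTFB from distance alone, before any server work.

    round_trips=3 assumes a fresh connection: TCP, TLS 1.3, then the request.
    """
    return round_trips * min_round_trip_ms(distance_km)
```

For a user 10,000 km from your only origin, that's at least 100 ms per round trip and around 300 ms before the server has done anything. Moving the responding server within 1,000 km cuts the same floor to about 30 ms.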

Cache headers and stale-while-revalidate. Content that doesn't change on every request should be cached aggressively. Setting a 30-second cache on your homepage is not aggressive; it's baseline. Using stale-while-revalidate lets a user get the cached version immediately while the cache refreshes in the background. This hides origin latency from most visitors without rebuilding your entire architecture.
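
In practice this is one response header. The sketch below is a minimal WSGI app that sets it, using only the standard calling convention; the helper name `cache_control` and the 30-second and 300-second values are illustrative assumptions, and how a given CDN honors `stale-while-revalidate` varies, so check your provider's documentation before relying on it.

```python
def cache_control(max_age, stale_while_revalidate):
    """Build a Cache-Control value: fresh for max_age seconds, then servable
    stale for stale_while_revalidate more seconds while a refresh happens."""
    return (
        f"public, max-age={max_age}, "
        f"stale-while-revalidate={stale_while_revalidate}"
    )


def app(environ, start_response):
    """Minimal WSGI app that serves a cacheable homepage."""
    body = b"<!doctype html><title>Home</title><h1>Home</h1>"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
            ("Cache-Control", cache_control(30, 300)),
        ],
    )
    return [body]
```

With these values a cache may serve the page without contacting the origin for 30 seconds, and for the following five minutes it may keep serving the stale copy while it fetches a fresh one in the background. Only a request arriving after both windows has to wait on the origin.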

The Decision You Actually Need to Make

The reason most founders don't fix TTFB is that fixing it requires a decision about what your site is actually for.

If your marketing site is a brochure—relatively static content that changes infrequently—you should use static generation. The performance, reliability, and maintenance costs are all lower. Your engineering team shouldn't be involved in weekly homepage updates.

If your marketing site requires real-time data or heavy personalization, you have two choices: accept that TTFB will be slower because the work is real, or invest in infrastructure (caching, edge computing, database optimization) that makes it faster. Most teams choose to ignore TTFB instead, which is the worst option.

The hidden cost of slow TTFB isn't the metric itself. It's the revenue. Amazon reported that every 100 ms of added latency cost them roughly 1% in sales[5]. Apply that ratio to your own business. If you're doing several million in ARR and your TTFB is 1.5 seconds instead of 800 ms, you're losing money every day to a decision made years ago and never revisited.
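
Applying the ratio is simple arithmetic, sketched below. Two assumptions are doing a lot of work here: that the 1%-per-100-ms figure transfers from Amazon's store to your business, and that it scales linearly with the gap. Neither is established, so read the result as an order-of-magnitude prompt, not a forecast. The function name and the $3M example are hypothetical.

```python
def estimated_annual_loss(annual_revenue, current_ttfb_ms, target_ttfb_ms,
                          loss_per_100ms=0.01):
    """Estimate revenue lost to the gap between current and target TTFB.

    Assumes a linear loss of `loss_per_100ms` of revenue for every 100 ms
    above the target. Returns 0 when the site already meets the target.
    """
    extra_ms = max(0, current_ttfb_ms - target_ttfb_ms)
    return annual_revenue * loss_per_100ms * (extra_ms / 100)


# Hypothetical: $3M ARR, TTFB of 1,500 ms against an 800 ms target.
loss = estimated_annual_loss(3_000_000, 1500, 800)
```

A 700 ms gap is seven 100 ms steps, so under these assumptions the estimate comes to about 7% of revenue, or roughly $210,000 a year on $3M. Even if the real ratio is a fraction of Amazon's, the number is usually larger than the cost of fixing the hosting.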

This is why infrastructure decisions compound. A site built with the right architecture (static generation, edge serving, minimal origin processing) stays fast. A site built with the wrong architecture gets slower as you add features, plugins, and personalization, until you can't add anything more without a complete rewrite. And by then, your conversion rate has already suffered.

The people who say "our site is fine" are usually measuring load time from their office. The people watching their conversion rate drop are measuring real TTFB from real users. One of these groups is making decisions based on data.

If you've read about how infrastructure underpins Core Web Vitals, you know that performance is architectural. TTFB is the clearest proof. You can't fix it with JavaScript optimization or image compression. You fix it by choosing a hosting model that fits what your site actually does, and then defending that choice against feature creep that would force you back to slow infrastructure.