From 10.1s to 4.5s: A Frontend Performance Turnaround

March 2026 · 8 min read · 10.1s -> 4.5s speed index

A practical playbook for profiling bottlenecks, removing fetch waterfalls, and improving perceived speed without backend changes.

Frontend performance work can look deceptively simple from the outside. A page is slow, so you compress a few images, memoize a component, maybe lazy-load something, and hope the graph moves. In practice, the useful work starts earlier than that. You need to understand what the user is waiting on, which work blocks the first meaningful screen, and which changes can actually ship within the constraints of the product.

On this project, the core flow had a measured speed index of 10.1 seconds. More importantly, it felt slow. The page did not merely miss a benchmark in a dashboard; it created that familiar user experience where the interface appears to be thinking too hard before it lets you do anything useful. The goal was not to chase a perfect Lighthouse score. The goal was to make the experience feel materially faster without requiring a backend rewrite.

The first pass brought the speed index down to 4.5 seconds. That is a reduction of roughly 55% (5.6 of the original 10.1 seconds), and the difference was visible. Here is the process I used, and the parts I would repeat on another performance turnaround.

Start with the user-visible bottleneck

Before touching code, I wanted a crisp definition of what slow meant. Performance can degrade in several different ways: the document can take too long to respond, JavaScript can block rendering, data fetching can serialize work that could happen in parallel, images can dominate the critical path, or the page can render quickly but then shift around as late content arrives. Each problem asks for a different fix.

In this case, the issue was not one isolated bug. It was a chain of small delays that compounded. The page had important content waiting behind request waterfalls, image-heavy surfaces were doing more work than necessary up front, and some client-side rendering paths were firing before they were useful to the user. That matters because a collection of medium-sized delays can feel worse than one obvious slow endpoint. Users do not experience your architecture in layers; they experience the sum of everything you put between them and the screen they need.

So I treated the first phase as evidence gathering. I looked at timing, network behavior, render sequencing, and what appeared in the viewport first. I wanted to know which requests started immediately, which requests waited on other requests, which visual assets were essential, and which work could be deferred without changing the product experience.

Profile before optimizing

The most expensive performance mistake is optimizing whatever happens to be familiar. In React and Next.js applications, that often means reaching first for memoization, component splitting, or small rendering tweaks. Those tools are useful, but only when rendering is actually the bottleneck. If the page is mostly waiting on serialized network calls, memoizing a component is tidying the room while the front door is still locked.

I started with the network waterfall because it gives a brutally honest picture of sequencing. The key question was simple: what work is happening one step at a time that could safely happen at the same time? Several independent fetches were waiting behind earlier work even though they did not depend on those responses. That meant the user paid the sum of those latencies rather than only the longest one.
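As a rough illustration of that waterfall question, the sketch below scans timing entries for requests that began right after another request finished. It is not the tooling used on the project; the function name, the `TimingEntry` shape, and the 50ms tolerance are assumptions made for the example. It only surfaces candidates, since timing alone cannot prove whether a dependency is real.

```typescript
// Minimal shape, chosen to match fields that browser resource timing entries expose.
interface TimingEntry {
  name: string;
  startTime: number;   // ms since navigation start
  responseEnd: number; // ms since navigation start
}

interface WaterfallStep {
  blockedBy: string;
  request: string;
  gapMs: number;
}

// Flags requests that started within `toleranceMs` after another request ended.
// A match is a hint to read the code, not proof of a dependency.
function findSerializedCandidates(
  entries: TimingEntry[],
  toleranceMs = 50, // assumed threshold; tune for the page being profiled
): WaterfallStep[] {
  const steps: WaterfallStep[] = [];
  for (const later of entries) {
    for (const earlier of entries) {
      if (earlier === later) continue;
      const gapMs = later.startTime - earlier.responseEnd;
      if (gapMs >= 0 && gapMs <= toleranceMs) {
        steps.push({ blockedBy: earlier.name, request: later.name, gapMs });
      }
    }
  }
  return steps;
}
```

In a browser, the entries could come from `performance.getEntriesByType("resource")`, whose results carry `name`, `startTime`, and `responseEnd`. Each flagged pair still needs a manual check of whether the second request actually uses the first response.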

The next question was about first paint and perceived readiness. Not every piece of data has equal priority. Some content is needed for the first useful screen, while other content can load after the page is interactive. I separated what had to be ready immediately from what merely needed to appear soon. That distinction helped keep the fix focused. The point was not to hide loading or pretend work disappeared; it was to stop low-priority work from delaying high-priority perception.
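One way to express that split in code is to wait only for the data the first screen needs and hand back the rest as a promise. This is a minimal sketch under assumptions: `loadWithPriority` and the loader parameters are hypothetical names, and it starts the lower-priority request immediately, which may not suit pages where bandwidth contention is the problem.

```typescript
interface SplitLoad<C, D> {
  critical: C;          // ready before the first useful screen renders
  deferred: Promise<D>; // still in flight; resolve it after first render
}

// Starts both loaders, but only waits for the critical one.
async function loadWithPriority<C, D>(
  loadCritical: () => Promise<C>,
  loadDeferred: () => Promise<D>,
): Promise<SplitLoad<C, D>> {
  // Kick off the deferred work without awaiting it.
  const deferred = loadDeferred();
  // Avoid an unhandled rejection if the deferred work fails before anyone awaits it;
  // callers that await `deferred` still observe the failure.
  deferred.catch(() => undefined);
  const critical = await loadCritical();
  return { critical, deferred };
}
```

The caller renders with `critical` right away and fills in the rest when `deferred` settles. If the deferred request competes with critical assets for bandwidth, starting it after the critical load resolves is the safer variant.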

Remove waterfalls where the dependencies are fake

The biggest improvement came from removing unnecessary request waterfalls. Some fetches had been written in a way that implied dependency, but the product behavior did not actually require one response before starting the next request. Reworking those calls to start in parallel reduced waiting time without changing backend contracts.

This kind of change is especially valuable because it improves the experience without asking the backend to get faster. A 700ms endpoint is still a 700ms endpoint, but three independent 700ms waits cost about 2.1 seconds when they run back to back and roughly 700ms when they start together. The only variable is how the frontend schedules them. That difference is often larger than what you can get from local component cleanup.
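The scheduling difference can be sketched in a few lines. The loader functions here are placeholders for whatever independent requests a page makes; the names are assumptions for illustration, not the project's code.

```typescript
// Hypothetical loader type: any function that returns a promise.
type Loader<T> = () => Promise<T>;

// Sequential: each request starts only after the previous one resolves,
// so the total wait is the sum of all three.
async function loadSequentially<A, B, C>(
  a: Loader<A>,
  b: Loader<B>,
  c: Loader<C>,
): Promise<[A, B, C]> {
  const first = await a();
  const second = await b();
  const third = await c();
  return [first, second, third];
}

// Parallel: all three requests start immediately,
// so the total wait is roughly that of the slowest one.
async function loadInParallel<A, B, C>(
  a: Loader<A>,
  b: Loader<B>,
  c: Loader<C>,
): Promise<[A, B, C]> {
  return Promise.all([a(), b(), c()]);
}
```

Note that `Promise.all` rejects as soon as any one request fails. If partial results are acceptable for the page, `Promise.allSettled` is the alternative to consider.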

I also looked for duplicate or redundant fetching. It is easy for modern component trees to request the same conceptual data from multiple places, especially when features grow incrementally. Reducing redundant calls helped lower contention and made the page easier to reason about. Performance work is maintainability work when it reduces hidden duplication.
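A common way to collapse duplicate requests is to share one in-flight promise per key. The sketch below assumes callers can agree on a string key for the same conceptual data; `createDeduper` is a hypothetical helper, not a library API, and it deliberately does not cache results after the request settles.

```typescript
// Shares a single in-flight promise per key so concurrent callers
// reuse one request instead of each starting their own.
function createDeduper<T>() {
  const inFlight = new Map<string, Promise<T>>();

  return function dedupe(key: string, load: () => Promise<T>): Promise<T> {
    const existing = inFlight.get(key);
    if (existing) return existing;

    // Remove the entry once the request settles, whether it succeeds or fails,
    // so a later call can fetch fresh data or retry.
    const pending = load().finally(() => {
      inFlight.delete(key);
    });
    inFlight.set(key, pending);
    return pending;
  };
}
```

Two components that ask for the same key while a request is pending receive the same promise. Anything longer-lived than that, such as caching resolved responses, is a separate decision with its own invalidation rules.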

Make the first screen earn priority

After the request scheduling work, I focused on what the browser had to render before the user saw value. The page included image-heavy areas, and some of those assets were treated as equally important even when they were not equally visible. I adjusted loading behavior so the first viewport received priority and less important imagery could wait.

This is where I try to be careful with lazy loading. Lazy loading everything is not a strategy; applied to the hero or the first meaningful content, it delays exactly the assets the user is waiting to see. The right question is: what does the user need to believe the page is ready? Those assets should be prioritized. Everything else should justify its place on the critical path.
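That rule can be written down as a small decision function. This is a sketch with assumed inputs: the `ImageSlot` shape and the idea of knowing each image's vertical position ahead of time are illustrative, not a description of how the project made the call.

```typescript
// Hypothetical description of where an image sits on the page.
interface ImageSlot {
  src: string;
  topPx: number;   // distance from the top of the document
  isHero: boolean; // whether this image anchors the first screen
}

interface LoadingHints {
  loading: "eager" | "lazy";
  fetchPriority: "high" | "auto";
}

// Picks loading hints so first-viewport images are never lazy
// and only the hero asks for elevated priority.
function loadingHintsFor(slot: ImageSlot, viewportHeightPx: number): LoadingHints {
  if (slot.isHero) {
    return { loading: "eager", fetchPriority: "high" };
  }
  const inFirstViewport = slot.topPx < viewportHeightPx;
  if (inFirstViewport) {
    return { loading: "eager", fetchPriority: "auto" };
  }
  return { loading: "lazy", fetchPriority: "auto" };
}
```

The returned values correspond to the `loading` and `fetchpriority` attributes on an HTML `img` element. Reserving `high` for a single hero image keeps the hint meaningful; marking many images as high priority puts them back in competition with each other.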

I also reduced avoidable rerender work around the first load path. Not every rerender is a problem, but rerenders become visible when they combine with network waits, image decoding, or layout changes. Cleaning up unnecessary state changes made the page feel calmer during load. The user may not name that as performance, but they feel it as stability.
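One small pattern for avoiding no-op state changes is to return the previous state object when an update would not change anything. React skips the rerender when a state setter receives a value that is identical to the current state by `Object.is`, so preserving the reference is enough. The helper name below is hypothetical, and this is one illustration of the idea rather than the specific cleanup done on the project.

```typescript
// Returns the same object reference when the patch changes nothing,
// and a new merged object only when at least one field differs.
function mergeIfChanged<S extends object>(prev: S, patch: Partial<S>): S {
  const keys = Object.keys(patch) as Array<keyof S>;
  const changed = keys.some((key) => !Object.is(prev[key], patch[key]));
  return changed ? { ...prev, ...patch } : prev;
}
```

With a state setter from `useState`, this could be used as `setState((prev) => mergeIfChanged(prev, { status: "ready" }))`. The comparison is shallow, so nested objects that are rebuilt on every update will still count as changes.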

Optimize for perceived speed, not only raw numbers

The speed index improvement from 10.1 seconds to 4.5 seconds mattered because it matched what the user could feel. The page began showing useful content sooner, and the experience stopped feeling stuck behind invisible work. That is the difference between benchmark-driven optimization and product-centered performance work.

I like speed index for this kind of project because it summarizes how quickly the visible part of the page fills in during load, rather than marking a single moment. It is not the only metric that matters, but it is useful when the complaint is that a page feels slow. Core Web Vitals, network timing, and profiling traces all help explain the why, but visual progress is what users notice first.
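To make the metric concrete: speed index is the area above the visual completeness curve, so time spent looking at an incomplete viewport adds to the score. The sketch below is a simplified version that assumes a blank screen at time zero and treats completeness as constant between samples. Real tools derive the samples from captured video frames; the `VisualSample` shape here is an assumption for the example.

```typescript
interface VisualSample {
  timeMs: number;       // when the sample was captured
  completeness: number; // fraction of the final viewport that is painted, 0 to 1
}

// Sums (interval length) * (fraction still incomplete) across the load.
// Lower is better; the result is in milliseconds.
function speedIndex(samples: VisualSample[]): number {
  const sorted = [...samples].sort((a, b) => a.timeMs - b.timeMs);
  let total = 0;
  let previousTime = 0;
  let previousCompleteness = 0; // assume a blank viewport at time zero
  for (const sample of sorted) {
    total += (sample.timeMs - previousTime) * (1 - previousCompleteness);
    previousTime = sample.timeMs;
    previousCompleteness = sample.completeness;
  }
  return total;
}
```

A page that reaches half of its final appearance at 1 second and finishes at 3 seconds scores 1000 + 2000 × 0.5 = 2000 under this model. Showing useful content earlier lowers the score even when the final completion time does not move.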

The tradeoff is that perceived speed work can tempt teams into cosmetic loading states that do not solve the underlying problem. Skeletons and spinners have a place, but they should not become a substitute for reducing actual wait time. In this case, the better answer was to start independent work sooner, reduce critical-path weight, and make the first screen more deliberate.

Keep the changes small enough to ship

A performance turnaround can spiral into a rewrite if you let every discovery become part of the first fix. I kept the initial pass constrained to frontend changes that could ship safely: request scheduling, render-path cleanup, loading priority, and cache-aware behavior. That gave the team a real improvement while preserving space for deeper backend or infrastructure work later.

This sequencing matters. A 55% improvement bought trust and time. It gave product stakeholders a visible win, reduced user friction, and created a clearer baseline for future optimization. It also made later conversations more concrete. Once the obvious frontend bottlenecks were removed, deeper work could be evaluated against a better-understood system rather than a pile of mixed symptoms.

What I would repeat

The repeatable lesson is simple: do not start with the optimization technique. Start with the shape of the wait. Is the browser waiting on HTML, JavaScript, data, images, layout, or main-thread work? Are requests serialized because they must be, or because the code happened to grow that way? Is the first viewport getting priority, or is the page treating every asset and every data dependency as equally urgent?

Once you can answer those questions, the implementation choices become much less mysterious. In this case, the right fixes were not exotic. They were practical: remove fake dependencies, parallelize independent work, prioritize the first useful screen, reduce unnecessary client work, and verify that the metric improvement matched the felt experience.

That is the kind of frontend performance work I trust most. It is not flashy, but it is durable. It makes the product faster, makes the code easier to reason about, and gives the team a clearer foundation for whatever optimization comes next.

Topics: Performance, React, Next.js, Web Vitals