LEMON BLOG

Cloudflare Cuts Cold Starts by 10x With Worker Sharding

Monday, 23 February 2026

Web Development

If you've ever used a serverless platform and felt that occasional "why did that request suddenly feel slow?" moment, you've brushed up against cold starts. Cloudflare recently shared how it made cold starts on Workers roughly 10x less frequent, not by shaving milliseconds off compilation, but by making cold starts happen far less often in the first place. The trick is something called worker sharding.

This article walks through the idea, why the old approach stopped working, and how Cloudflare redesigned request routing to keep Workers warm more reliably.

What A Cold Start Actually Means For Workers

A cold start happens when a server doesn't already have your code running in memory, so it has to fully "spin up" the serverless workload before it can answer the request.

For Cloudflare Workers, that startup has four main steps:

• Fetch the JavaScript bundle from storage
• Compile it into something the CPU can execute
• Run any top-level initialization code (the stuff that runs on import)
• Finally, call the request handler for the actual incoming request

That sequence matters because only the last step produces a response. Everything before it is pure waiting from the user's point of view.
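To make the difference concrete, here is a minimal sketch of the four phases as a latency model. The per-phase numbers are invented for illustration and are not Cloudflare measurements; `StartupCost` and `request_latency_ms` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StartupCost:
    """Illustrative per-phase costs in milliseconds (made-up numbers)."""
    fetch_ms: float
    compile_ms: float
    init_ms: float
    handler_ms: float


def request_latency_ms(cost: StartupCost, warm: bool) -> float:
    """Time until the request handler has produced a response."""
    if warm:
        # Code is already fetched, compiled and initialized: only the handler runs.
        return cost.handler_ms
    # Cold: every phase runs in sequence before a response exists.
    return cost.fetch_ms + cost.compile_ms + cost.init_ms + cost.handler_ms


cost = StartupCost(fetch_ms=40, compile_ms=60, init_ms=100, handler_ms=5)
print(request_latency_ms(cost, warm=False))  # 205
print(request_latency_ms(cost, warm=True))   # 5
```

With these assumed numbers, the cold request spends 200 of its 205 milliseconds on work the user never sees.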

The headline result: Cloudflare says 99.99% of requests now land on already-running instances, so only about 1 in 10,000 requests pays the "startup tax."

The First Fix: Hiding Cold Starts Behind TLS Handshakes

Back in 2020, Cloudflare had a clever workaround: hide the startup time during the TLS handshake.

TLS is the encryption setup phase for HTTPS. Before any real request data is exchanged, the client and server perform a handshake that costs one or more network round trips. That handshake delay created a window where Cloudflare could quietly start a Worker "in the background."

And they had a key advantage: the very first TLS message (the ClientHello) includes the SNI field (Server Name Indication), which reveals the hostname the user is trying to reach. With that hostname, Cloudflare could guess which Worker would be needed and start warming it immediately.

This worked well when cold starts were short and TLS handshakes were relatively long. In the best case, the Worker finished starting before the handshake completed, so the user didn't feel the cold start at all.

Why That Trick Stopped Working

Over time, the timing relationship flipped.

Cold starts got longer
Cloudflare raised script size limits (allowing much bigger deployments) and increased the CPU time allowed during startup. Bigger code takes longer to fetch and compile. More startup CPU budget means initialization can do more work, which can also extend the cold start.

TLS got faster
TLS 1.3 completes a full handshake in one round trip, where TLS 1.2 needs two. That shrank the "free hiding time" Cloudflare used for prewarming.

Put those together and the illusion broke. The handshake no longer provided enough time to cover the full startup cost, so users started to feel delays again.
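The timing flip can be sketched as simple arithmetic. This is a simplified model that assumes warming starts when the first TLS message arrives and runs fully in parallel with the handshake; the round-trip time and cold start durations are hypothetical.

```python
def handshake_duration_ms(rtt_ms: float, round_trips: int) -> float:
    """Handshake time as a number of client-to-edge round trips."""
    return rtt_ms * round_trips


def visible_delay_ms(cold_start_ms: float, handshake_ms: float) -> float:
    """Cold start time the user still feels after the handshake has hidden
    as much of it as it can."""
    return max(0.0, cold_start_ms - handshake_ms)


RTT_MS = 30.0  # hypothetical client-to-edge round-trip time

# Earlier situation: short cold start, two-round-trip (TLS 1.2) handshake.
print(visible_delay_ms(50.0, handshake_duration_ms(RTT_MS, 2)))   # 0.0
# Later situation: longer cold start, one-round-trip (TLS 1.3) handshake.
print(visible_delay_ms(200.0, handshake_duration_ms(RTT_MS, 1)))  # 170.0
```

In the first case the handshake fully covers the startup. In the second, most of the cold start leaks through to the user.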

The Real Insight: Don't Fight Cold Starts, Reduce How Often They Happen

At some point, optimizing compilation and fetch times becomes a game of diminishing returns. Cloudflare's shift in thinking was: instead of trying to make every cold start faster, reduce the number of cold starts across the network.

The root cause wasn't just "Workers are slow to start." It was "Workers are getting started too often because requests are spread too thin."

Here's the classic example:

Imagine a data center with 300 servers. A low-traffic Worker gets one request per minute. With normal load balancing, those requests are spread across all the servers, so any one server sees that Worker only about once every 300 minutes, or five hours, on average. In a busy environment, that is plenty of time for the Worker to be evicted from memory to free resources, so the next request that lands on that server triggers a cold start again.

That's how a low-traffic app can end up with an almost constant cold-start feeling, even though it's not "down" and it's not "broken."
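The arithmetic behind that example, as a quick sketch. It assumes requests are spread evenly across servers, and the idle eviction threshold is a hypothetical value chosen only to make the comparison.

```python
def minutes_between_visits(servers: int, requests_per_minute: float) -> float:
    """Average gap between two requests for the same Worker on one server,
    assuming requests are spread evenly across all servers."""
    return servers / requests_per_minute


IDLE_EVICTION_MINUTES = 60.0  # hypothetical: evict after an hour of idleness

gap = minutes_between_visits(servers=300, requests_per_minute=1)
print(gap, "minutes =", gap / 60, "hours")  # 300.0 minutes = 5.0 hours
print(gap > IDLE_EVICTION_MINUTES)          # True
```

Whenever the gap is longer than the time a server keeps an idle Worker around, nearly every visit to that server is a cold start.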

Worker Sharding: Keep Each Worker "At Home" Inside A Data Center

Worker sharding changes the routing model inside a data center:

Instead of letting requests for the same Worker land on any server, Cloudflare routes requests for that Worker to a "home server." If a Worker's requests keep hitting the same server, it stays in memory, and future requests are warm.

This does two things at once:

• Performance improves for low-traffic Workers because the Worker stays alive between requests.
• Memory efficiency improves because you don't need 300 servers each holding a copy of a Worker that only runs once every few hours.

In other words, the system stops wasting memory on duplicates and uses that memory to keep more Workers warm overall.
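A small simulation shows the effect. It is a toy model, not Cloudflare's scheduler: one Worker, evenly spaced requests, and a simplified eviction rule where an instance is gone once it has been idle longer than a fixed (hypothetical) threshold.

```python
import random


def count_cold_starts(sharded: bool, servers: int = 300, requests: int = 1440,
                      gap_min: float = 1.0, idle_evict_min: float = 30.0,
                      seed: int = 0) -> int:
    """Count cold starts for one Worker receiving evenly spaced requests."""
    rng = random.Random(seed)
    last_seen = {}  # server index -> time of the last request it handled
    cold = 0
    for i in range(requests):
        now = i * gap_min
        # Sharded: always the same home server. Unsharded: any server.
        server = 0 if sharded else rng.randrange(servers)
        previous = last_seen.get(server)
        if previous is None or now - previous > idle_evict_min:
            cold += 1  # never loaded here, or evicted since the last visit
        last_seen[server] = now
    return cold


print("unsharded cold starts:", count_cold_starts(sharded=False))
print("sharded cold starts:  ", count_cold_starts(sharded=True))  # 1
```

In the sharded run the Worker starts once and stays warm for the rest of the day. In the unsharded run, most requests land on a server that has either never seen the Worker or has already evicted it.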

Why A Consistent Hash Ring Matters

If you're going to give each Worker a home server, you need a mapping strategy that doesn't fall apart every time servers come and go.

A naive mapping, such as a plain hash table that assigns each Worker to a bucket based on the current number of servers, breaks badly when the server pool changes. Add or remove a server and most Workers get remapped, causing a wave of cold starts because everyone "moves house" at the same time.
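To see how bad the naive case gets, here is a sketch that assigns each Worker to a server with hash-modulo-server-count and then removes one server. The Worker IDs and the `home_by_modulo` helper are hypothetical.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def home_by_modulo(worker_id: str, server_count: int) -> int:
    """Naive mapping: bucket index depends on the current server count."""
    return stable_hash(worker_id) % server_count


workers = [f"worker-{i}" for i in range(10_000)]  # hypothetical Worker IDs

# Shrink the pool from 300 servers to 299 and count who has to move.
moved = sum(home_by_modulo(w, 300) != home_by_modulo(w, 299) for w in workers)
print(f"{moved / len(workers):.1%} of Workers changed home server")
```

Losing a single server out of 300 reshuffles almost every Worker, which is exactly the wave of cold starts described above.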

A consistent hash ring avoids that.

The basic idea:

• Hash each server to a position on a ring
• Hash each Worker to a position on the same ring
• For a Worker, walk clockwise and pick the first server you hit

When a server disappears, only the Workers that mapped to that server need to move. When a server is added, only a slice of Workers shift over. Most Workers keep the same home server, which is exactly what you want if the goal is to stay warm.
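The three steps above can be sketched in a few lines. This is a bare-bones ring for illustration, not Cloudflare's implementation: the server and Worker names are made up, and it leaves out refinements such as placing each server at several points on the ring to even out load.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a server name or Worker ID to a position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers: list) -> None:
        # Sorted (position, server) pairs form the ring.
        self._ring = sorted((ring_position(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]

    def home_server(self, worker_id: str) -> str:
        # "Walk clockwise": first server at or after the Worker's position,
        # wrapping around to the start of the ring if we fall off the end.
        idx = bisect.bisect_left(self._positions, ring_position(worker_id))
        return self._ring[idx % len(self._ring)][1]


servers = [f"server-{i}" for i in range(300)]
workers = [f"worker-{i}" for i in range(10_000)]

before = HashRing(servers)
after = HashRing([s for s in servers if s != "server-42"])

moved = [w for w in workers if before.home_server(w) != after.home_server(w)]
# Every Worker that moved was previously homed on the removed server.
assert all(before.home_server(w) == "server-42" for w in moved)
print(f"{len(moved)} of {len(workers)} Workers moved")
```

Only the Workers that lived on the removed server get a new home. Everyone else stays put and stays warm.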

What Happens When A Request Hits The "Wrong" Server

With sharding, the server that first receives the request isn't always the home server for that Worker.

So Cloudflare treats servers in two roles:

• Shard client: the server that receives the request from the internet
• Shard server: the Worker's home server according to the hash ring

If the shard client is also the home server, great, it runs the Worker locally. If not, it forwards the request internally to the shard server.

Yes, forwarding adds latency (about a millisecond). But that's tiny compared to a cold start that can take hundreds of milliseconds. In practice, a warm Worker plus a short internal hop wins.
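Here is a rough sketch of that decision and the latency trade-off. The hop cost comes from the figure above; the cold start and handler times are hypothetical placeholders, and `route` is an illustrative name rather than anything from Cloudflare's code.

```python
FORWARD_HOP_MS = 1.0   # "about a millisecond", per the text above
COLD_START_MS = 300.0  # hypothetical; the text only says "hundreds of milliseconds"
HANDLER_MS = 5.0       # hypothetical time spent in the request handler


def route(receiving_server: str, home_server: str) -> str:
    """Decide what the shard client does with an incoming request."""
    if receiving_server == home_server:
        return "run locally"
    return f"forward to {home_server}"


def latency_ms(forwarded: bool, warm: bool) -> float:
    hop = FORWARD_HOP_MS if forwarded else 0.0
    startup = 0.0 if warm else COLD_START_MS
    return hop + startup + HANDLER_MS


print(route("server-7", "server-7"))            # run locally
print(route("server-7", "server-42"))           # forward to server-42
print(latency_ms(forwarded=True, warm=True))    # 6.0
print(latency_ms(forwarded=False, warm=False))  # 305.0
```

Under these assumptions, a forwarded warm request is about 50 times faster than a local cold one.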

Avoiding Overload Without Throwing Errors

Sharding concentrates traffic, so there's a risk: what if a Worker's home server gets overloaded?

Cloudflare considered a "permission first" approach where the shard client asks before sending the request, but that adds an extra network round trip on every sharded request.

Instead, it chose an optimistic approach:

• Send the request to the shard server normally.
• If the shard server is overloaded, it bounces the request back to the shard client, which then runs the Worker locally.

Because overload refusals are rare, it's better to optimize the common case rather than punish every request with extra chatter.
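A minimal sketch of the optimistic path, assuming a simple in-flight request limit as the overload signal. The `ShardServer` class, its `capacity` field, and the response strings are all hypothetical.

```python
from typing import Optional


class ShardServer:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # hypothetical limit on in-flight requests
        self.in_flight = 0

    def try_handle(self, request: str) -> Optional[str]:
        """Run the Worker, or return None to refuse when overloaded."""
        if self.in_flight >= self.capacity:
            return None
        return f"shard server handled {request}"


def handle_optimistically(request: str, shard_server: ShardServer) -> str:
    # Send first, without asking permission.
    response = shard_server.try_handle(request)
    if response is not None:
        return response
    # Refused: fall back to running the Worker on the shard client.
    return f"shard client handled {request} locally"


home = ShardServer(capacity=2)
print(handle_optimistically("req-1", home))  # shard server handled req-1
home.in_flight = 2                           # simulate an overloaded home server
print(handle_optimistically("req-2", home))  # shard client handled req-2 locally
```

The common case costs exactly one hop, and the caller never sees an error in the rare refused case.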

Why Cap'n Proto RPC Helps In The Messy Edge Cases

Cloudflare uses Cap'n Proto RPC to connect servers. This matters because it lets them pass around "capabilities," which are basically handles to services or objects that can be invoked later.

The clever part: the shard client can include a "lazy Worker capability" that represents a Worker instance that hasn't started yet on the shard client.

If the shard server refuses due to overload, it can return that lazy capability back. When the client then invokes it, the system realizes it's pointing to a local instance and short-circuits, avoiding pointless back-and-forth and preventing wasted bandwidth on large request bodies.
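The idea can be mimicked with plain Python objects. To be clear, this is a conceptual sketch and not the Cap'n Proto API: `LazyWorker`, `shard_server_respond`, and the location strings are invented for illustration.

```python
from typing import Callable, Optional

Handler = Callable[[bytes], str]


class LazyWorker:
    """A handle to a Worker that starts only when first invoked."""

    def __init__(self, location: str, start: Callable[[], Handler]) -> None:
        self.location = location  # server on which the Worker would run
        self._start = start
        self._handler: Optional[Handler] = None

    def invoke(self, body: bytes) -> str:
        if self._handler is None:
            self._handler = self._start()  # startup is deferred until here
        return self._handler(body)


def shard_server_respond(client_handle: LazyWorker, overloaded: bool) -> LazyWorker:
    """Return the handle the shard client should send the request to."""
    if overloaded:
        return client_handle  # refuse by handing the client's own handle back
    return LazyWorker("shard-server", lambda: lambda body: "ran on shard server")


def bytes_sent_over_network(target: LazyWorker, caller: str, body: bytes) -> int:
    # A handle that points back at the caller is invoked locally, so the
    # request body never crosses the network.
    return 0 if target.location == caller else len(body)


local = LazyWorker("shard-client", lambda: lambda body: "ran on shard client")
target = shard_server_respond(local, overloaded=True)
body = b"x" * 1_000_000
print(target.invoke(body))                                    # ran on shard client
print(bytes_sent_over_network(target, "shard-client", body))  # 0
```

The point of the sketch is the last line: once the returned handle turns out to be local, a large request body does not need to travel anywhere.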

Nested Worker Calls: Making Sharding Work For Real-World Products

Cloudflare's ecosystem isn't just "one Worker per request." Workers can call other Workers through service bindings, KV-related flows, and especially Workers for Platforms where multiple Workers may participate in a single request pipeline.

Sharding makes this harder because execution context now needs to travel across servers: permissions, limits, feature flags, logging, and tracing setup.

Cloudflare handles this by serializing the context stack and sending it along with sharded requests, so each server can continue execution with the correct configuration. For tracing, callback capabilities allow different servers to report back without each server having to know where "the collector" lives.
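As a rough picture of what "serializing the context stack" means, here is a sketch that round-trips a stack of per-Worker settings. The field names are invented, and JSON is used only to keep the example self-contained; the source does not say what wire format Cloudflare uses for this.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ContextFrame:
    """One level of a nested Worker call (all fields are illustrative)."""
    worker: str
    cpu_limit_ms: int
    feature_flags: List[str]
    trace_id: str


def serialize_stack(stack: List[ContextFrame]) -> bytes:
    """Encode the whole stack so it can travel with a sharded request."""
    return json.dumps([asdict(frame) for frame in stack]).encode()


def deserialize_stack(payload: bytes) -> List[ContextFrame]:
    """Rebuild the stack on the receiving server."""
    return [ContextFrame(**frame) for frame in json.loads(payload)]


stack = [
    ContextFrame("dispatcher", 50, ["flag-a"], "trace-123"),
    ContextFrame("tenant-worker", 10, [], "trace-123"),
]
restored = deserialize_stack(serialize_stack(stack))
print(restored == stack)  # True
```

The receiving server ends up with the same limits, flags, and trace identifiers as the sender, so a nested call behaves the same wherever it runs.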

What Cloudflare Got Out Of It

After rolling out worker sharding globally, Cloudflare reported outcomes along these lines:

• Only a small portion of enterprise requests is actually sharded to a different server (because high-traffic Workers already run in many places)
• Even with low sharding volume, eviction rates dropped significantly, because the long tail of low-traffic Workers stopped churning in and out of memory
• Warm request rate improved from 99.9% to 99.99%, meaning cold starts became 10x less frequent
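The 10x figure follows directly from the two warm rates. A quick check, using exact fractions to avoid floating-point rounding:

```python
from fractions import Fraction


def cold_start_rate(warm_rate: Fraction) -> Fraction:
    """Share of requests that are not served by a warm instance."""
    return 1 - warm_rate


before = cold_start_rate(Fraction("0.999"))   # 1 in 1,000 requests
after = cold_start_rate(Fraction("0.9999"))   # 1 in 10,000 requests
print(before / after)  # 10
```

One extra "9" in the warm rate is a tenfold drop in cold starts.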

The theme here is important: the biggest wins didn't come from making startup faster. They came from engineering the system so startup is rarely needed.

Final Thoughts

Cloudflare's worker sharding story is a classic distributed systems lesson dressed up as a performance fix. When a platform scales, the bottleneck often isn't a single slow step; it's how frequently you force that slow step to happen.

By routing each Worker toward a stable "home" server using consistent hashing, Cloudflare turned cold starts from a constant annoyance for low-traffic apps into something that mostly happens once, then disappears into the background.

About the author

Mr LemonGuy

Creator of Lemon Web Solutions, Mr. LemonGuy explores the front lines of technology—from cybersecurity to AI-driven development. Part developer, part digital architect, he focuses on delivering high-impact industry news and open-source projects that bridge the gap between emerging tech and real-world application.
