From Developer to Architect - Episode 1 by Shifa Salheen on March 10, 2026

Everything Worked, Until Traffic Arrived

Ever heard terms like horizontal scaling and vertical scaling?

They appear everywhere:

in architecture diagrams,

in interviews,

in system design conversations.

At first, they sound like infrastructure jargon.

Abstract. Distant. Something you’ll “learn later.”

But they aren’t.

They’re part of a transition every developer eventually goes through, often without realizing it.

From Writing Code to Designing Systems

You start as a developer.

You write code.

You focus on features.

You make sure things work.

Then one day, you deploy your application.

It runs.

Users arrive.

Traffic grows.

And slowly, your questions change.

Not “How do I implement this feature?”

but

“What happens when more people use this at the same time?”

That shift, quiet and easy to miss, is where architectural thinking begins.

The Bridge Between Developer and Architect

Becoming an architect isn’t about titles, tools, or drawing diagrams.

It’s about learning how systems behave under pressure.

And one of the most important stepping stones on that bridge is scaling.

Before traffic arrives, architecture feels optional. After traffic arrives, architecture becomes inevitable.

This is where scaling stops being a buzzword and starts becoming a decision.

When Everything Works

Every application begins in a comfortable phase.

It’s deployed on a single server.

Requests are fast.

Logs are quiet.

Dashboards stay reassuringly green.

Think of it like a small café.

One counter. One barista.

A steady flow of customers.

Orders are taken smoothly.

Coffee is served on time.

No queues. No stress.

At this stage, architecture feels invisible.

And that’s not a mistake.

Simplicity is the right choice when the system is small.

When Traffic Changes the Nature of the Problem

Over time, usage grows.

Not suddenly.

Not dramatically.

Just enough to change how the system behaves.

Back in the café, more people start walking in.

Not a crowd.

Just enough to create a queue during rush hours.

You begin to notice:

  • Slower responses during peak hours
  • Occasional request timeouts
  • CPU or memory spikes that settle on their own

The barista moves faster.

Customers are still being served.

Nothing is broken.

But something is no longer comfortable.

This is usually when developers feel the tension, not because the system failed,

but because it’s asking for a better decision.

The Two Questions That Matter

At this point, jumping straight to fixes rarely helps.

Instead, two questions guide the next step:

  1. Is the server running out of capacity?
  2. Or is all the load concentrated in one place?

In the café, the same questions appear:

Is the barista too slow?

Or is one person handling too much alone?

The answer determines how you scale.

This is the first real architectural fork in the road.
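One rough way to approach the first question is to check whether the server is saturated most of the time or only in short bursts. The sketch below is illustrative only: the function name `diagnose`, the 0.85 limit, and the 0.8 "sustained" fraction are assumptions chosen for the example, not recommended values.

```python
def diagnose(cpu_samples, limit=0.85, sustained_fraction=0.8):
    """Classify utilization samples (0.0 to 1.0) taken from one server.

    Returns "sustained" when most samples sit at or above the limit,
    "spiky" when only some do, and "healthy" when none do.
    The thresholds are hypothetical defaults for illustration.
    """
    if not cpu_samples:
        raise ValueError("need at least one sample")
    over = sum(1 for s in cpu_samples if s >= limit)
    if over / len(cpu_samples) >= sustained_fraction:
        return "sustained"
    if over > 0:
        return "spiky"
    return "healthy"


print(diagnose([0.90, 0.95, 0.88, 0.91, 0.90]))  # sustained
print(diagnose([0.30, 0.95, 0.40, 0.20]))        # spiky
print(diagnose([0.20, 0.30]))                    # healthy
```

A "sustained" result points toward a capacity problem. A "spiky" one suggests looking at how the load arrives and where it lands before buying a bigger machine.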

Vertical Scaling: When the Server Is the Bottleneck

If the server consistently hits its limits (CPU, memory, or disk), the problem is capacity.

In the café, the instinct is similar.

You don’t redesign the café.

You don’t open another branch.

You help the barista.

A faster coffee machine.

A better grinder.

Prepped ingredients.

The same person is now working faster.

This is vertical scaling.

Add more memory.

Increase CPU.

Upgrade the instance.

Vertical scaling works because it is:

  • Simple to apply
  • Fast to implement
  • Low in architectural complexity

But it comes with a defining limitation.

Every machine has a ceiling.

At some point:

  • The barista gets tired
  • The counter gets crowded
  • Speed alone stops helping

Vertical scaling buys time.

It doesn’t change the system’s structure.

Horizontal Scaling: When Strength Isn’t Enough

Sometimes the server isn’t weak.

It’s simply doing too much alone.

In the café, this is the moment you stop asking

“How fast can one barista work?”

And start asking

“Why is one person handling everything?”

So you add another barista.

Then another.

Each trained the same way.

Each capable of making coffee.

Customers are now distributed across counters.

Orders move in parallel.

No single person is overwhelmed.

This is horizontal scaling.

Instead of relying on one stronger server, you introduce multiple identical servers.

Each server handles part of the workload.

No single machine carries everything.

Horizontal scaling shifts the mindset from:

How strong can one server become?

to:

Why should one server handle everything?
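To make the effect of parallel work concrete, here is a small sketch under simplifying assumptions: all orders are waiting at time zero, every server is identical, and each order goes to whichever server frees up first. The function name `time_to_finish` and the sample numbers are made up for illustration.

```python
import heapq


def time_to_finish(order_times, servers):
    """Return when the last order completes.

    Each order is handed to whichever identical server becomes free
    first. Assumes every order is already waiting at time zero.
    """
    if servers < 1:
        raise ValueError("need at least one server")
    free_at = [0.0] * servers  # moment each server next becomes free
    heapq.heapify(free_at)
    for duration in order_times:
        start = heapq.heappop(free_at)  # earliest available server
        heapq.heappush(free_at, start + duration)
    return max(free_at)


orders = [2.0] * 12  # twelve orders, two minutes each (hypothetical)
print(time_to_finish(orders, 1))  # 24.0
print(time_to_finish(orders, 3))  # 8.0
```

No server got faster here. The same work simply finishes sooner because it is spread across three of them.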

The Quiet Role of the Load Balancer

Once multiple servers exist, a new question appears:

Who decides where each request goes?

In the café, customers pause and ask:

“Which counter do I go to?”

Someone needs to guide them.

Not loudly.

Not visibly.

Just efficiently.

That role doesn’t make coffee.

It doesn’t take payments.

It only decides where the customer should go.

In system design, this role is played by a load balancer.

It doesn’t process business logic.

It doesn’t store data.

It simply routes traffic, quietly and reliably.
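As a minimal sketch of that routing role, the example below uses round-robin, which is one common strategy among several and not necessarily the one a given system uses. The class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in rotation."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("need at least one server")
        self._rotation = cycle(servers)

    def route(self, request):
        # The request is not inspected: no business logic, no stored
        # data, only a decision about where it should go.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
targets = [balancer.route(f"req-{i}") for i in range(5)]
print(targets)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

Real load balancers also track server health and connection counts. This sketch leaves all of that out to show only the routing decision.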

When it works well, no one notices it.

When it’s missing, chaos is immediate.

Choosing the Right Approach

There is no universally “correct” scaling strategy.

  • Vertical scaling solves capacity problems
  • Horizontal scaling solves distribution problems

Most real-world systems eventually use both.

The key isn’t memorizing definitions.

It’s understanding when each approach makes sense.

That understanding is where architectural thinking begins.

From Developer to Architect

Writing code teaches you how things work.

Designing systems teaches you how things behave.

Scaling lives at that intersection.

When you begin to think about:

  • How your system responds under pressure
  • Where it bends before it breaks
  • What fails first
  • How it recovers without panic

You’ve stepped onto the bridge from developer to architect.

Because anyone can build what works. Architects build what keeps working.

Next episode:

From Developer to Architect — Episode 2

The Quiet Component That Keeps Systems Calm

Understanding load balancers and routing strategies in depth


About the Author

Shifa Salheen

Shifa is a coder by profession, a singer and a bibliophile by passion. She is a keen observer of nature and, when not coding, she scribbles poems and quotes, adoring the vivid beauty around her. She has published a collection of poems named "Tales from the Cafe" and plans to finish her debut fiction novel soon. Always surrounded by books, she calls coffee and books her eternal lovers.