8 System Design Principles I Learned After Doing It Wrong More Than 50 Times!



At Squad, we strive to build awesome products to solve customer (internal and external) needs. As a product engineer, a paramount part of your job is to design and build products: dig deep into the root cause of the problems, design solutions and implement them as the end product.

Over the course of my journey so far, here are the 8 system and product design principles that I’ve learned from other awesome people at Squad, from feedback, and from simply not getting it right enough multiple times.

1. What is the underlying problem that led to the feature request?

At Squad, you don’t just code the requirements into the software. As a product engineer, it’s your responsibility to remove the layers and expose the root problem that led to the feature requirement.

Get to know the root cause of the problem that you are trying to solve. Or even better, as the lean principles say, “genchi genbutsu”, i.e. go and see it for yourself.

2. How can you make the feature more robust, reliable and usable?

Once the essential feature requirements are finalized, we must press on how we can make the feature more robust, reliable and usable.

Things to ponder and take into consideration:

  1. The persona of the users who are going to use it.
  2. Scenarios in which the feature would be used. E.g., in the case of fires, show more data than needed for faster resolution.
  3. Building quality into the product itself, or “Jidoka” as it is called in lean.

3. What is the first iteration going to be?

Given the time and resources you have, what is the best possible first iteration of the product going to be? If it’s a large system or something you are building from scratch, there are always going to be iterations.

The main idea here should be to move fast and get things shipped. Good enough and shipped on time is always better than perfect and in-development forever.

4. How easy will it be to make iterations on the current feature?

The design should incorporate all the non-functional requirements to make future iterations easy.

Scale the feature? Change a component? Use a different 3rd party service? Your implementation should be flexible enough to incorporate and encourage these enhancements.

Design patterns are your best friend here.
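To make the point about swapping a 3rd party service a bit more concrete, here is a minimal Python sketch. The names SmsGateway, ProviderA, ProviderB and notify_user are hypothetical, made up for this example. The only idea it shows is that feature code depends on a small interface instead of a specific vendor.

```python
from typing import Protocol


class SmsGateway(Protocol):
    """Hypothetical interface that the feature code depends on."""

    def send(self, phone: str, text: str) -> str: ...


class ProviderA:
    """Stand-in for one 3rd party SMS vendor (no real API is called)."""

    def send(self, phone: str, text: str) -> str:
        return f"A:{phone}:{text}"


class ProviderB:
    """Stand-in for a different vendor exposing the same interface."""

    def send(self, phone: str, text: str) -> str:
        return f"B:{phone}:{text}"


def notify_user(gateway: SmsGateway, phone: str) -> str:
    # Feature code only knows about the interface, so the vendor
    # can be swapped without touching this function.
    return gateway.send(phone, "Your task is ready")
```

Switching vendors then means passing ProviderB() instead of ProviderA(), with no change to notify_user.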

5. What are the potential bottlenecks with scale?

Scale-land is where everyone wants to be, but it is scary. It breaks what was not supposed to break and has witnessed more horror stories than a haunted castle.

What are the potential bottlenecks that are not a problem now, but will break at 5X, 10X or 100X scale?

List them on the feature ticket or, better, document them in the code itself.

6. What’s the data that has to be captured and how will it be consumed?

Every feature in the product will need some data captured in order to track it. This can include, but is not limited to:

  1. Action logs.
  2. Event logs.
  3. Metrics.
  4. Failures.
  5. Anomalies.

What affects this most is how the data will be consumed. Store it in a structure that makes consuming it easy and efficient. After all, the only reason to store data is to use it.
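As a rough sketch of what “a structure that makes consumption easy” can look like, the snippet below writes every record as one JSON line and then consumes those lines with a simple count. The function names and the record fields (kind, name, at) are assumptions made for this example, not a prescribed schema.

```python
import json
from datetime import datetime, timezone


def make_event(kind: str, name: str, **fields) -> str:
    """Serialize one log record as a single JSON line."""
    record = {
        "kind": kind,  # e.g. "action", "event", "metric", "failure"
        "name": name,
        "at": datetime.now(timezone.utc).isoformat(),
        **fields,
    }
    return json.dumps(record, sort_keys=True)


def count_by_name(lines, kind: str) -> dict:
    """Consume the log: count records of one kind, grouped by name."""
    counts = {}
    for line in lines:
        record = json.loads(line)
        if record["kind"] == kind:
            counts[record["name"]] = counts.get(record["name"], 0) + 1
    return counts
```

Because every record has the same top-level keys, questions like “how many failures of each type did we see?” become a one-liner for whoever consumes the data later.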

7. How good will the developer experience be when interacting with the code base of that feature?

There can be many developers who’ll use or modify the code that you are going to write.

What will their experience be when doing that? E.g., will the test cases you wrote make them feel confident enough to make changes fast?

A few points to consider:

  1. Is the code well documented?
  2. Are the test cases strong enough?
  3. Is the code reusable where it makes sense?
  4. Are the functions small and the code simple to read?

8. What metrics will determine that the feature has been implemented successfully?

Finally, after all the fun-time you had creating the feature, what will determine that the feature has been implemented successfully?

The data you tracked will be of paramount importance here.

It may not be possible to track this quantitatively, but can you track it qualitatively in that case?

The idea here is that you can’t improve what you can’t measure.

Processing 100,000 requests? Fewer errors made by the users? 95% of the work done by the new system instead of the old one?

This can and will involve more stakeholders of the team and not just the developer.
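A success metric like “95% of the work done by the new system” can be turned into a tiny check. This is only a sketch: the function names are hypothetical, and the 0.95 default simply mirrors the example above.

```python
def share_handled_by_new_system(new_count: int, old_count: int) -> float:
    """Fraction of all work that went through the new system."""
    total = new_count + old_count
    if total == 0:
        raise ValueError("no work recorded yet, nothing to measure")
    return new_count / total


def is_successful(new_count: int, old_count: int, target: float = 0.95) -> bool:
    # The default target mirrors the "95% of the work" example; pick your own.
    return share_handled_by_new_system(new_count, old_count) >= target
```

The counts themselves would come from the data captured for the feature, which is one more reason to decide up front what gets tracked.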


Obviously, this is not an exhaustive list of things to consider while designing a system or a product as an engineer. It just covers what I have learned so far by doing things wrong, or not right enough, multiple times.

It’s fun to build stuff! Continuously improve (“Kaizen” in lean)! Keep iterating! Keep shipping!



That’s all, folks!



Introduction to Ingressing With Kubernetes


Single responsibility is a magical notion. Whatever it touches, it makes it more manageable and efficient.

With Kubernetes, we have the power to spawn many services, as many as we would like. But how are inbound requests routed among these services?

Ingressing is a powerful way to decouple routing rules from core application logic.

According to the Kubernetes documentation,

Ingress is a collection of rules that allow inbound connections to reach cluster services.


In this post, we’ll deploy a couple of services in the kubernetes cluster and then define an ingress to route the requests to one of them according to the rules.

By the end of this post, we’ll have a basic understanding of ingressing and a working demo to showcase its power.

More On Ingress

To allow inbound connections to reach cluster services, ingress configures a layer 7 load balancer and provides the following:

  1. TLS.
  2. Path-based routing.
  3. Name-based virtual routing.
  4. Custom Rules

With ingress, connections can’t reach our services directly. Instead, they reach the ingress endpoint and then are routed to a service based on rules.

With this in mind, let’s move forward to a working example.

Step 1: Spawn first service and deployment

We’ll be creating two services and deployments, named cats and dogs.

In this step, we’ll be spawning our first service.

Above is the .yaml file for our cats-deployment. Run the following command to create the cats-deployment.

kubectl create -f cats-deployment.yaml --validate=false

Now, we’ll create our cats-service.

Run the following command to create our cats-service.

kubectl create -f cats-service.yaml --validate=false

As you can see in the deployment file, we also specify a volume for the container at /home/docker/cat_volume.

Run the following commands after starting your minikube VM to host a file at that volume’s path.

minikube ssh
mkdir cat_volume
echo "cat service content" > cat_volume/index.html

Tada! We have our first service and deployment up and running.


Step 2: Create the second service and deployment

We are going to name this one dogs.

Following the steps given above, create the deployment and service for our faithful friends, the dogs.

Here are the YAML files.


Step 3: Hit the endpoints of our services to see the content we just hosted on them.

Run the following command to get port numbers for the services.

kubectl get services

This will list all the services running in the Kubernetes cluster along with their port numbers.

We should see something like this.


Note the port numbers and open the browser to reach the pages of the two services we just hosted.

Use the following command to get the IP of the minikube VM:

minikube ip

Here is how our two services, cats and dogs, look.



Step 4: Create the ingress for our services.

Following is the YAML file that we’ll use to create the ingress.

First, we need to start the ingress controller.

minikube addons enable ingress

With the following command, create the ingress.

kubectl create -f pets-ingress.yaml --validate=false

As we can see in the YAML file, we are doing name-based virtual routing between cats.myweb.com and dogs.myweb.com, routing them to our cats and dogs services respectively.
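Conceptually, the name-based rule boils down to a lookup from the request’s Host header to a backend service. Here is a toy Python sketch of that idea. It is not how the ingress controller is implemented, and the name dogs-service is an assumption made by analogy with cats-service.

```python
from typing import Optional

# Host -> backend service, mirroring the two rules described above.
# "dogs-service" is an assumed name, by analogy with cats-service.
RULES = {
    "cats.myweb.com": "cats-service",
    "dogs.myweb.com": "dogs-service",
}


def route(host_header: str) -> Optional[str]:
    """Return the backend service for a request, or None if no rule matches."""
    host = host_header.split(":")[0].lower()  # drop an optional port
    return RULES.get(host)
```

The application code behind cats-service never sees this table, which is the decoupling that ingress gives us.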

For our demo to work, we’ll have to add these hosts to our /etc/hosts file.

Add the following line to your /etc/hosts file, replacing <minikube-ip> with the output of minikube ip:

<minikube-ip> cats.myweb.com dogs.myweb.com


Step 5: Hit the paths to see the ingress controller in action!


Congrats! Our ingress is working as expected and routing the names to their services like a routing ninja!



In this post, we got to know the basics of ingressing and created a working demo to get a feel for its power.

There is a lot that ingress can do. Let’s all keep exploring until we fully learn how to harness its power.



That’s all, folks!

Deploying an nginx application using Kubernetes for Self-Healing and Scaling

Kubernetes is an open source system for automating deployment, scaling and management of containerized applications. A more technical term for it is container orchestrator: a tool used to manage large fleets of containers.

Minikube is an all-in-one, single-node installation for trying out Kubernetes on a local machine. This post covers deploying an nginx application container using Kubernetes in minikube.

If you don’t have them yet, this link covers installing minikube and kubectl (the command-line tool used to talk to the cluster) : Download and install minikube and kubectl

Step 1 : Getting minikube up and running

Ensure that minikube is running.


Step 2 : Open the minikube dashboard

Minikube comes with a GUI tool that opens in the web browser. Open the minikube dashboard with the following command :

minikube dashboard


It should open the dashboard in a browser window and it’ll look something like this:


Looks cool! No?

Step 3 : Deploy a webserver using the nginx:alpine image

Alpine Linux is preferred for containers because of its small size. We’ll be using the nginx:alpine Docker image to deploy an nginx-powered webserver.

Now, go to the deployments section and click the create button, which will open an interface like the one below.


Fill in the details as shown in the image.

We can either provide the application details here, or we can upload a YAML file with our Deployment details.

As shown, we are asking Kubernetes to create a deployment that uses the nginx:alpine image for its container, and we want 3 pods (or simply, instances) of it.

A pod in kubernetes is a scheduling unit, a logical collection of one or more containers that are always scheduled together.

Go on and click that awesome deploy button!

Step 4 : Analyzing the deployment

Once we click the deploy button, Kubernetes will trigger the deployment. The Deployment will create a ReplicaSet. A ReplicaSet is a controller that ensures that the specified number of replicas of a pod are running at any given point in time.

The flow is something like this:

Deployments create ReplicaSets, and ReplicaSets create Pods. Pods are where the real application resides.
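The “ensure N replicas are running” behaviour can be pictured as a reconcile step that compares the desired count with what is actually running. The Python below is a toy model of that idea only. The real controller watches the Kubernetes API and selects pods by labels, and the pod names generated here are made up.

```python
import itertools

_pod_ids = itertools.count(1)


def reconcile(desired: int, pods: list) -> list:
    """Return a pod list with exactly `desired` entries.

    Missing pods are created; surplus pods are dropped from the end.
    """
    pods = list(pods)
    while len(pods) < desired:
        pods.append(f"pod-{next(_pod_ids)}")
    return pods[:desired]


pods = reconcile(3, [])     # initial rollout: three pods
pods.remove(pods[-1])       # simulate one pod dying
pods = reconcile(3, pods)   # the count is restored to three
```

Calling reconcile with a larger desired count adds pods, and with a smaller one removes them, which is all that scaling a deployment up or down means in this toy model.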


As expected, we have our deployment, replica set and pods in place.

We can also check our deployment from the command line using kubectl.


Step 5 : Create a Service and expose it to the external world with NodePort

So far, we have our pods up and running. But how do we access them?

This is where a service comes into play. Kubernetes provides a higher-level abstraction called a Service, which logically groups pods and defines a policy to access them. This grouping is done via labels and selectors.

We then expose the service to the outside world by defining its service type. The service forwards each request to one of the pods, load balancing across them.
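Grouping via labels and selectors is easy to picture with a few dictionaries. This is a sketch of the matching rule only (every key/value in the selector must be present in the pod’s labels). The pod names and the extra labels are invented for the example.

```python
def matches(selector: dict, labels: dict) -> bool:
    """A pod matches when every selector key/value appears in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(selector: dict, pods: list) -> list:
    """Return the names of pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods; only the first two carry the label the service selects on.
pods = [
    {"name": "web-1", "labels": {"app": "my-nginx-webserver"}},
    {"name": "web-2", "labels": {"app": "my-nginx-webserver", "tier": "front"}},
    {"name": "db-1", "labels": {"app": "some-database"}},
]
backends = select_pods({"app": "my-nginx-webserver"}, pods)
```

A pod can carry extra labels and still match. The selector only lists what must be present.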

Create a my-nginx-webserver.yaml file with the following content:


apiVersion: v1
kind: Service
metadata:
  name: my-nginx-web-service
  labels:
    run: my-nginx-web-service
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: my-nginx-webserver

Run the following command to create a service named my-nginx-web-service:

kubectl create -f my-nginx-webserver.yaml


We can now verify that our service is running :


Step 6 : Accessing the application

Our application is running inside the minikube VM. To access the application from our workstation, let’s first get the IP address of the minikube VM:

minikube ip


Now open that IP address in the browser, along with the port number of the service we got in the step above.


And our app is running! Amazing, give yourself a pat on the back now!

A taste of the self-healing feature of Kubernetes :

One of the most powerful features of Kubernetes is its self-healing capability (just like Piccolo. DBZ, anyone?). While defining our app, we created a replica set with 3 pods. Let’s go ahead and kill one pod; Kubernetes will create another one to keep the running pod count at 3.


As we can see in the image, we deleted the bottom-most pod and K8S created a new one instantly.

Such kubernetes! Much HA (High Availability)!

A taste of scaling with Kubernetes:

Now, our app is receiving a crazy amount of traffic and three nginx pods are not enough to handle the load. Kubernetes allows us to scale our deployments with almost zero effort.

Let’s go ahead and spin up a new pod.



Click OK. Now let’s go and check our pods.


As we can see in the image, we now have 4 pods running to handle the increased traffic.

Isn’t it amazing? We just horizontally scaled our application with the power of Kubernetes.

This was just the tip of the iceberg of what Kubernetes can do. I am also exploring Kubernetes and containerized architecture, just like you. Hopefully I’ll be back soon with another post with more Kubernetes stuff!

That’s all, folks!


Estimation Peril: How To Estimate Software Projects Effectively (or How Not To Lie)


Imagine you are a rockstar engineer, and your favorite person, your project manager, gives you a task: show some new fields in the dashboard.

As usual, you are asked to estimate it as soon as possible. It seems like a quickie, and you are tempted to estimate it at a day. But having been burnt before, you decide to look carefully at the fields that are to be added. These fields are for analytics. You think, OK, let’s make it 2 days then. But being more cautious, you dig deeper and find that those analytics are not even being tracked in the app.

Now, to complete the story, you’ll have to track the analytics, send them to the server, make the backend accept and store them, show them on the dashboard, write tests, etc.

What seemed a simple task is now a 1-2 week thing. Very hard to estimate. And your manager was expecting a response like, “would be done by end of day”.

What is the problem with estimates?

The main problem with an estimate is that the “estimate” gets translated into commitment. And when you miss a commitment, you breed distrust.

Most estimates are poor because we don’t know what they are for. They are uncertain. A problem that seemed simple on the whiteboard turns out not to be so simple. There are non-functional requirements, codebase friction, some unfortunate bugs, etc. We deal with uncertainty.

There is a rule of thumb in software engineering that everything takes 3X as long as you think it should, and this holds true even when you know this and take it into account!

Estimates can go the other way too, that is when you overestimate. This is as dangerous as underestimating.

What should an estimate look like?

An estimate should have 3 characteristics :

  1. Honest (Hardest)
  2. Accurate
  3. Precise

1. Honest : 

You have to be able to communicate bad news when the news is bad. And when the continuous outrage of your managers and stakeholders is in your face, you need to be able to hold your ground and assert that the news is bad.

Honesty is important because it breeds trust. You are not eliminating disappointment, rage and people getting mad, but you will eliminate distrust.

2. Accurate :

You are given a task and you estimate that it will take somewhere between now and the end of the universe. That’s definitely accurate; it’ll be done within that time.

We won’t breed distrust, but we definitely will breed something else.

Which brings us to the 3rd characteristic.

3. Precise : 

An estimate should have just the right amount of precision.

What is the most honest estimation that you can make? I don’t know!

This is as honest as it can get. You really don’t know. But this estimate is neither accurate nor precise.

But when we try to make precise estimates, we must note that we are assuming that everything goes right. We get the right breakfast, traffic doesn’t suck, your co-worker is having a good day, no meetings, no hidden requirements, no non-functional complexities etc.

Estimating by work breakdown

The most common way to estimate a complex task is to break it down into smaller sub-tasks, then break those into sub-sub-tasks, and so on, until each task in hand is manageable and ideally not more than 4 hours of work.

Imagine this forming a tree, with executable tasks at the bottom as leaves. You just estimate the leaves and it all adds up.

This approach works, but there are 2 problems :

  1. We missed the integration cost
  2. We missed some tasks

There is a fundamental truth about work breakdown structure estimates:

The only way to estimate accurately using a work breakdown chart, that is, to know exactly what the sub-tasks are, is to implement the feature!
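The work breakdown tree described above can be sketched in a few lines of Python. The Task class, the task names and the hour figures are all hypothetical, loosely based on the dashboard story from the introduction. The sketch only shows how leaf estimates add up and how to spot leaves that are still bigger than the 4 hour guideline.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    hours: float = 0.0  # only meaningful for leaf tasks
    subtasks: list = field(default_factory=list)

    def estimate(self) -> float:
        """Sum of the leaf estimates below this node."""
        if not self.subtasks:
            return self.hours
        return sum(task.estimate() for task in self.subtasks)

    def oversized_leaves(self, limit: float = 4.0) -> list:
        """Names of leaves above the limit, which need further breakdown."""
        if not self.subtasks:
            return [self.name] if self.hours > limit else []
        return [n for task in self.subtasks for n in task.oversized_leaves(limit)]


# Made-up numbers, for illustration only.
story = Task("show analytics fields", subtasks=[
    Task("track analytics in app", subtasks=[
        Task("add tracking calls", 3),
        Task("send to server", 2),
    ]),
    Task("backend: accept and store", 6),
    Task("dashboard: show fields", 4),
    Task("write tests", 3),
])
```

Note what the sum does not contain: the cost of integrating the pieces, and any task nobody thought of. Those are exactly the two problems listed above, so the total is better read as a lower bound than as a commitment.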

What to expect from an estimate?

Estimates are uncertain. There is no guarantee that your estimate will work itself out. And that’s OK. It’s your manager’s job to manage that risk. We are not asking them to do something outside of their job.

The problem arises when you make a commitment. If you make a commitment, you must meet it. Be ready to move heaven and earth to meet it. But if you are not in a position to make a commitment, then don’t make one.

Because your manager is going to set up a whole bunch of dominoes based on that commitment, and if you fail to deliver, everything fails.

Some interesting links :


Uncle Bob on Estimates: https://www.youtube.com/watch?v=eisuQefYw_o

Happy Estimating!

That’s all, folks!


Clean Code Chapters 1 & 2: Clean Code & Meaningful Names

I have started reading the book Clean Code by Robert C. Martin, which is considered to be an industry standard for writing maintainable and elegant code.

Because this book is such a heavy read, and each chapter is full of content and a knowledge bank in itself, I’ve decided to summarise each chapter in a set of blog posts for personal reference.

Chapter 1 : Clean Code

This was more like chapter 0. The author describes what clean code is and the cost of maintaining it, how clean code is directly related to team productivity, and what makes clean code clean.

It contains views on clean code from many of the industry’s best-known people, like Bjarne Stroustrup, Michael Feathers etc.

One of my favourite definitions from the book covers it best :

I like my code to be elegant and efficient. The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance, error handling complete according to an articulated strategy, and performance close to optimal so as not to tempt people to make the code messy with unprincipled optimizations. Clean code does one thing well – Bjarne Stroustrup

Chapter 2 : Meaningful Names

Names are everywhere in software. We name our variables, our functions, our arguments, classes, and packages. Because we do it so much, we should do it well.

1. Use intention revealing names : 

The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used.

int d; // time elapsed

Here d reveals nothing. A better name would be

int timeElapsedSinceCreation;

2. Avoid Disinformation :
Programmers must avoid leaving false clues that obscure the meaning of code. We should avoid words whose entrenched meanings vary from our intended meaning.

int accountsList;

It should only be named so if it is actually a list data structure that’s used to store the accounts, not an array or a set.

3. Make meaningful distinctions
Entities that are named differently should be different and mean something different.

If we have classes whose names differ only by a suffix like Info or Data, we have made the names different without making them mean anything different. Info and Data are indistinct noise words that don’t differentiate what the classes actually mean.

4. Use pronounceable names

Pronounceable names make communicating about the code easy.

long genydhms;

is not a good name.

long generationTimestamp;

is a better choice.

5. Use searchable names
Avoid single-letter variable names and bare constants, as they are difficult to search for.
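A small before-and-after sketch of this rule, written in Python for brevity. The function and constant names are made up for the example.

```python
# Hard to search for: looking for "7" or "e" matches half the code base.
def weeks_unclear(e):
    return [d / 7 for d in e]


# Searchable: the constant and the names can be found with a plain text search.
DAYS_PER_WEEK = 7


def durations_in_weeks(durations_in_days):
    return [days / DAYS_PER_WEEK for days in durations_in_days]
```

Both functions compute the same thing, but only the second one can be located, and safely changed, by searching for DAYS_PER_WEEK.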

6. Avoid Encodings

Hungarian notation, member prefixes, and interface or implementation markers in names should be avoided.

They just add the burden of remembering the encoding format being used.

7. Avoid mental mappings
Readers shouldn’t have to mentally translate your names into other names they already know.

String r;

where r is the lower-cased URL with the host name removed, requires too much mental juggling and mapping when working with the code.

A better name would be,

String urlWithoutHostName;

8. Class Names
Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account, and AddressParser.
Avoid words like Manager, Processor, Data, or Info in the name of a class. A class name should not be a verb.

9. Method Names
Methods should have verb or verb phrase names like postPayment, deletePage, or save.

10. Don’t be cute

If names are too clever, they will be memorable only to people who share the
author’s sense of humor, and only as long as these people remember the joke.

Don’t tell little culture-dependent jokes like eatMyShorts() to mean abort().

11. Pick one word per concept
Pick one word for one abstract concept and stick with it.

It’s confusing to have a controller and a manager and a driver in the same
code base. What is the essential difference between a DeviceManager and a ProtocolController?

12. Use solution domain names
It’s OK, and preferable, to use names from the computer science and programming domains.
In transactionObserver, the word observer means a great deal to a person who knows the observer pattern.

13. Use problem domain names
Code that has more to do with problem domain concepts should have names drawn from the problem domain.

int mriRecord

in a healthcare app gives a great deal more context than just

int record


14. Add meaningful context
Enclose names in well-named functions, classes, namespaces, etc.

String state;

in a class called FiniteStateMachine means something different than in a class called Address.

15. Don’t Add Gratuitous Context
In an imaginary application called “Gas Station Deluxe,” it is a bad idea to prefix every class with GSD.
Frankly, you are working against your tools. You type G and press the completion
key and are rewarded with a mile-long list of every class in the system. Is that
wise? Why make it hard for the IDE to help you?


This was part one of a 16 part series on the book Clean Code by Robert C. Martin, where each post covers a gist of a single chapter.



The Philosophy Behind Offensive Programming


Recently I was listening to a podcast, and there was this really smart guy, Piwai, talking about something that instantly captivated my attention: the term Offensive Programming.

What is offensive programming?

Well, you can find the literature on Wikipedia, and I am not the best person to explain it, so please check that out. But fundamentally, offensive programming refers to a style of programming that is the exact opposite of its more famous counterpart, defensive programming.

Defensive programming refers to a coding style that deals gracefully with conditions that should not happen.

Offensive programming, on the other hand, just tells you to let the app crash. Don’t try to recover, don’t try to handle the exception, just log the stack trace and crash.

The reasoning is that the real problem may be much bigger and somewhere else in the code, with the error you are seeing being only a side effect of it. Crashing forces you to fix the problem at the source and will possibly result in a healthier code base.

When does it make sense to be offensive?

This was my exact concern while I was listening to this podcast. Thankfully, Piwai answered that himself. I also talked about it with a really smart guy at the office, and he made the same remarks.

So at Square (the payments company that also authors libraries), they stick to a defensive style of programming for the parts of the code that deal with external interfaces and/or user interactions. Basically, anything that is not in your control.

But for the internal interfaces, where the classes you wrote are going to interact with each other, you don’t have to be that paranoid. This is where he (Piwai) said you should switch to the offensive approach. You have full control over the classes you wrote, and the expected behaviour is in your control. If the code fails to behave as expected, it’s better to just crash and let the problem be fixed at the source.

That is exactly why, he said, they make very liberal use of assertions in the code at Square. Assertions are not forgiving at all.

Example Please!

I’ll attempt to give two examples here: one that Piwai himself talked about very briefly, and one that I’ve encountered myself where I thought it made sense.

In the first example, say we are handling credit card objects. There is no point in internally validating the credit card object every time you deal with it.

As soon as we get a credit card, we wrap it in a validated credit card. That’s all the defensiveness we have to offer.

Now internally, we go offensive and throw exceptions or fail assertions every time we encounter an invalid credit card object.

The code below is not perfect, but can give you an idea.

class ValidatedCreditCard extends CreditCard {

    private final CreditCard creditCard;

    ValidatedCreditCard(CreditCard creditCard) {
        // Handling external user interactions defensively.
        try {
            // validate() is a hypothetical check that throws on a bad card.
            creditCard.validate();
        } catch (CreditCardValidationError e) {
            // Handle and try to fix the error
        }
        this.creditCard = creditCard;
    }
}

public static void main(String[] args) {

    CreditCard c = getCreditCardFromUser();
    c = new ValidatedCreditCard(c);
    // Time to go offensive
    // ...
    if (c == null) {
        throw new CardInvalidException();
    }
}

Another example I can think of is a much simpler one and more relatable.
Suppose, we have a utility function that uploads a file to s3.
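It would make sense to follow the offensive programming style and just throw an exception if somehow the file or the key reaching the function is None.

def upload_file_to_s3(file, key):
    # Internal utility: a missing argument is a bug in the caller, so fail loudly.
    if file is None or key is None:
        raise TypeError("file and key must not be None")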


A few more tips from the podcast

1. How to start with offensive programming?

The best way is to start putting assertions in the code wherever you think they are suitable. Yeah, we’ll experience more crashes, and that’s awesome!

Because now we know that we have a problem.

2.  We feel more confident about the code base:

We know that a given method doesn’t try to handle nulls, so we can confidently say that the value was not null, or the method would’ve crashed.

3. Do incremental roll outs.

When you ship code, roll it out to something like 1% of users first. We’ll get a ton of crash reports, and that’s good! I mean, not for that 1% of users, but they are taking one for the team!
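One common way to pick “1% of users” is to hash each user ID into a fixed bucket, so the same user always gets the same answer. The sketch below is an assumption about how such a gate could look, not something described in the podcast, and in_rollout is a made-up name.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable across runs and machines
    return bucket < percent
```

Raising percent from 1 to 10 keeps everyone who was already in the rollout and adds more users, so the unlucky first 1% stay on the new code while the audience widens.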

4. Crash at preventable errors and recover from expectable errors :

Preventable errors are invalid arguments, NPEs etc. Go offensive on these.

Expectable errors are like resource depletion, invalid user inputs etc.

Try to recover from these.
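Here is how that split might look in Python. Both functions are hypothetical: parse_amount sits at the boundary and recovers from bad user input, while charge is internal code that treats a bad amount as a bug and crashes.

```python
def parse_amount(raw: str):
    """Boundary code: bad user input is an expectable error, so recover."""
    try:
        amount = float(raw)
    except ValueError:
        return None  # the caller can re-prompt the user
    return amount if amount > 0 else None


def charge(amount: float) -> str:
    """Internal code: a bad amount here is our own bug, so crash."""
    assert amount is not None and amount > 0, "charge() called with invalid amount"
    return f"charged {amount:.2f}"
```

One caveat: Python strips assert statements when run with the -O flag, so for checks that must always run, raising an exception explicitly is the safer choice.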


Overall, it was nice to listen to a guy who works at a company like Square talk about how they use offensive programming for a healthier code base. And if Square is doing something, we can all learn something from it!