Luis Silva
DevOps, Cloud Expert
Quick dive into Microservices: benefits, challenges, real-life examples, and best practices
2022-10-25
Microservices architecture is often mentioned as the architecture to follow, especially in cloud-based solutions. But why is that, and is it really the best option all the time? In this article, I'll guide you through some of the good and the bad, show some real-life use cases in action, and share best practices to follow if it is indeed the right choice for you.
Brief history
Software was traditionally built as monoliths. A product consisted of a multitude of features that lived inside the same codebase and were maintained and operated as one huge, tightly bound block. Eventually, problems such as single points of failure, scaling issues and coupling created the need to split software into smaller, more manageable pieces. Along this path came SOA (Service Oriented Architecture), which relied on the web to interface between smaller services. As SOA became finer grained and services became smaller, the ecosystem around it changed too. With the introduction of PaaS (Platform as a Service) and the cloud, it became easier to create very small, focused services. Thus microservices started gaining traction.
Advantages of microservices over monolithic architecture
So what do microservices bring to the table that makes them such an exciting alternative to monoliths? Let's take a look at some areas where there are clear advantages.
Ownership
When an application is split into smaller parts, you can have teams or individuals focused on specific parts of the application. This increases the sense of ownership: as an individual contributor, you own that microservice, it's yours. It's never "someone else's problem" when it comes to your microservice. This can bring a sense of pride when your microservice works flawlessly, and it also concentrates expertise in specific domain areas.
Scalability
Microservices make horizontal scaling easier, as opposed to vertical scaling. When the service to scale is smaller, you can fine-tune scaling better, achieving better service availability and cost reduction. Containers (think Docker) help a lot here, as scaling up or down means adding or removing copies of the containers that need scaling, with potentially no need to add more servers.
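To make the idea of fine-tuned horizontal scaling more concrete, here is a minimal sketch of how a number of container copies could be derived from load. The function name, the CPU targets and the replica bounds are all illustrative assumptions, not the API of any specific orchestrator; the proportional rule is only similar in spirit to what container autoscalers commonly apply.

```python
def desired_replicas(current_replicas: int, current_cpu_percent: int,
                     target_cpu_percent: int, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return how many container copies to run (hypothetical helper).

    Scales the replica count in proportion to how far the observed average
    CPU utilisation is from the target, then clamps it to the allowed range.
    """
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    # Integer ceiling division avoids floating point rounding surprises.
    wanted = -(-(current_replicas * current_cpu_percent) // target_cpu_percent)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Four copies running hot at 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

Because only the busy service is scaled, the other services keep their current number of copies, which is where the cost reduction comes from.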
Blast radius reduction
Having the application split into tiny pieces makes it easier to repair or replace certain parts. If a bug causes a microservice to crash, only a localised part of the whole service is impacted. Recovery or mitigation becomes simpler, thus reducing MTTR (Mean Time To Recovery).
API-first design
Microservices interact with each other through APIs. An API is essentially a contract where you tell other services what you can do, and what you need in order to do it. In microservices, APIs must be 'first class citizens'. Requiring every service to have an API from the very first design shapes that design from the start. The service must fulfil the API's needs, not the other way around.
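As a rough illustration of "contract first, implementation second", the sketch below defines what a service needs and what it promises before any logic exists. The shipping quote domain, the class names and the pricing rule are all made up for this example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class QuoteRequest:
    """What the service needs from the caller."""
    origin: str
    destination: str
    weight_grams: int


@dataclass(frozen=True)
class QuoteResponse:
    """What the service promises to return."""
    price_cents: int
    currency: str


class ShippingQuoteApi(Protocol):
    """The contract: written first, implemented afterwards."""

    def get_quote(self, request: QuoteRequest) -> QuoteResponse: ...


class FlatRateQuoteService:
    """One possible implementation; clients depend only on the contract."""

    def get_quote(self, request: QuoteRequest) -> QuoteResponse:
        # Illustrative rule: 5.00 base fee plus 1.20 per started kilogram.
        started_kg = -(-request.weight_grams // 1000)
        return QuoteResponse(price_cents=500 + 120 * started_kg, currency="EUR")


def cheapest_label(api: ShippingQuoteApi, request: QuoteRequest) -> str:
    """A client written purely against the contract."""
    quote = api.get_quote(request)
    return f"{quote.price_cents / 100:.2f} {quote.currency}"
```

Any other implementation that satisfies `ShippingQuoteApi` could replace `FlatRateQuoteService` without the client changing.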
Technological freedom
Since communication is done through APIs, the technology used inside each microservice can be tailored to the needs of the functionality. Need a highly performant daemon service? Maybe C is the right language. Need a data analysis service? Perhaps Python is a better fit, with its specialised data analysis tools.
Disadvantages of microservices architecture
The advantages sound great! So why shouldn't microservices be used everywhere?
Added operational effort
Having a microservices architecture means that each service will have dedicated infrastructure, be it cloud resources or physical servers. Each service will have an independent resource stack, and each stack will have an independent release cycle. If you follow a Continuous Integration/Continuous Delivery approach to releases, this means a dedicated release pipeline for each microservice. The added operational effort is not limited to the setup stages; it continues throughout the lifespan of the service. Development teams need to manage specific codebases and repos, code changes, approval workflows, automated tests, alarms, and deployment and rollback procedures. At scale it is common for a development team to own one or a few microservices, and this complexity is already accounted for in a standard team of 6-8 people, but smaller teams and companies face an added DevOps effort.
Increased communication complexity
This concern is twofold: first, there's a cost to managing communication between microservices. Distributed systems are inherently more complex to set up. They require an increased development time to define APIs, operations and clear access permissions between resources. Additionally, communication changes are more cumbersome to perform. They require new operations, methods or API versions to be released while keeping support for existing clients and offering clear deprecation paths and procedures. The second part of this topic is organisational communication. Once you have teams owning specific microservices, any project or feature request involving multiple microservices requires inter-team communication. In big projects or milestones these efforts might even need roadmap alignment. Although these orchestration constraints can also happen in monolithic systems, there's a less strict separation of concerns there, meaning the politics of who-owns-what are more flexible and potentially less impactful.
Service discoverability
With many services online, another component becomes vital to the architecture: a Service Registry that manages service availability, health status and connections. For instance-based microservices, this can take the form of a load balancer, whereas with containers, orchestration tools like Docker Swarm or Kubernetes have this functionality built in.
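To show what such a registry keeps track of, here is a toy in-memory version. It is only a sketch under simple assumptions: instances report themselves with heartbeats, and an instance counts as healthy if it was seen within a time-to-live window. The class and method names are hypothetical and do not mirror Docker Swarm or Kubernetes.

```python
import time
from typing import Dict, List, Optional, Tuple


class ServiceRegistry:
    """Toy registry: tracks which instances of a service are alive."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self._ttl = ttl_seconds
        # (service name, address) -> time of the last heartbeat
        self._last_seen: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, service: str, address: str,
                  now: Optional[float] = None) -> None:
        """Register an instance, or refresh it if already known."""
        self._last_seen[(service, address)] = (
            time.monotonic() if now is None else now
        )

    def healthy_instances(self, service: str,
                          now: Optional[float] = None) -> List[str]:
        """Return addresses whose last heartbeat is within the TTL."""
        current = time.monotonic() if now is None else now
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and current - seen <= self._ttl
        )
```

The optional `now` argument exists only to make the behaviour easy to test; callers would normally omit it and let the registry read the clock.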
Testing
This can be both an advantage and a disadvantage, and it is important to follow microservices architecture best practices to avoid some of the testing pitfalls. On the one hand, it is simpler to test a clear domain with a smaller test suite. On the other hand, to perform end-to-end testing or user acceptance testing (UAT) you need to either mock other services or deploy the full system. At scale, UAT is hard for individual teams to perform. It is usually done in environments that already contain the full system, for example a Beta stage where all teams deploy their individual services and the service wiring, in terms of communications and interoperability, is already in place.
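The "mock other services" option can be sketched with Python's standard `unittest.mock`. The checkout and inventory services here are invented for illustration; the point is only that the test runs without the real inventory service being deployed.

```python
from unittest.mock import Mock


class CheckoutService:
    """Depends on another microservice through a client object."""

    def __init__(self, inventory_client) -> None:
        self._inventory = inventory_client

    def can_order(self, sku: str, quantity: int) -> bool:
        # In production this call would go over the network.
        return self._inventory.get_stock(sku) >= quantity


def test_can_order_with_mocked_inventory() -> None:
    # Stand-in for the inventory service: always reports 3 items in stock.
    inventory = Mock()
    inventory.get_stock.return_value = 3

    checkout = CheckoutService(inventory)

    assert checkout.can_order("sku-1", 2) is True
    assert checkout.can_order("sku-1", 5) is False
    inventory.get_stock.assert_called_with("sku-1")


if __name__ == "__main__":
    test_can_order_with_mocked_inventory()
    print("ok")
```

A mock like this only verifies your side of the contract, which is why a full-system stage is still needed for end-to-end checks.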
Changes in application design
When approaching an application, thinking in terms of microservices architecture changes how to design and develop components. It might give added flexibility and modularity but it also introduces additional constraints and challenges.
Sometimes application size doesn't translate to complexity, and not all architectures are suited to a microservices approach. It's not a silver bullet! As with anything in software development, there's a balance to strike between development effort and maintainability, a give and take.
Use cases
Dropbox
Dropbox is a cloud storage service that allows saving files in shared folders and syncing them to multiple devices.
Dropbox started with a monolithic backend service, called Metaserver. Over the years, as the Dropbox development team increased in size, the pains of working in a monolith increased.
There were many efforts to break this monolith up into smaller pieces, not all of them successful. An initial effort failed due to reliability issues with the new solution.
After some reassessment, Dropbox decided to use a hybrid approach. Instead of moving everything to microservices and suffering all of their downsides, they decided to host all simple services inside a service called Atlas, while big components would stay as they were, owned by large teams. The reasoning behind Atlas is that most of the small services are not critical enough to justify maintaining separate supporting infrastructure for each one. Atlas would manage scaling, alarming and other concerns for all of these services together.
Atlas works as a managed service, similar to AWS Fargate, where developers should only care about the code. The target of Atlas is to "provide the majority of benefits of SOA, while minimising the operational costs associated with running a service."
Uber
Uber is a transportation company with an app that allows passengers to use a taxi-like service while also allowing users to register as drivers to charge fares and get paid.
Uber started off as two monolithic services. As the company grew from dozens to hundreds of engineers, the pains of sharing a monolithic codebase started to appear, with coupling and availability risks among those cited. Uber claims that they could not have reached the level of scale and quality offered today without the microservices architecture.
The interesting part about this use case comes when going from hundreds to thousands of engineers. In that part of their growth, system complexity became a critical problem. A simple feature could touch multiple services, meaning additional communication with multiple teams, plus design and code reviews.
To tackle the increased complexity, Uber took steps to reduce granularity with DOMA (Domain Oriented Microservices Architecture). How does DOMA work in a nutshell?
- Organize microservices in collections of services, named domains.
- Organize domains in collections, called layers. Domains inside layers have dependencies in common.
- Create gateways that serve as single points of interface with domains.
- Domains should be agnostic from other domains. To share context, Uber uses an extension architecture to create extension points between domains.
With the adoption of DOMA, Uber claims to have been "able to reduce onboarding time by 25-50%".
Best practices
Single responsibility principle
Each service should handle a single logical domain and you should establish clear boundaries between services.
Guarantee decoupling between services
Is your service API exposing internal concepts? Does the client need to know any inner property to make calls or handle responses? If you cannot replace a service's internal logic without impacting clients, that is a red flag that your architecture is coupled. It is one thing to deploy different services and another to have them decoupled.
Don't persist communication artefacts
It might be tempting to persist objects directly from the API controller, but if you persist communication objects, you cannot change the persistence layer without impacting clients.
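One way to keep the two apart is to have a separate type for what travels over the API and what gets stored, with an explicit mapping between them. The sketch below assumes a made-up user service; the field names and the in-memory store are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CreateUserRequest:
    """Communication object: the shape clients send over the API."""
    full_name: str
    email: str


@dataclass(frozen=True)
class UserRecord:
    """Persistence object: the shape this service stores internally."""
    user_id: int
    first_name: str
    last_name: str
    email_lowercase: str


def to_record(user_id: int, request: CreateUserRequest) -> UserRecord:
    """Explicit mapping; the only place that knows both shapes."""
    first, _, last = request.full_name.strip().partition(" ")
    return UserRecord(user_id, first, last.strip(), request.email.lower())


class InMemoryUserStore:
    """Stand-in for a real database, storing records only."""

    def __init__(self) -> None:
        self._rows: Dict[int, UserRecord] = {}

    def save(self, record: UserRecord) -> None:
        self._rows[record.user_id] = record

    def get(self, user_id: int) -> UserRecord:
        return self._rows[user_id]
```

With this split, the stored shape (for example, splitting the name into two columns) can change by touching only `to_record`, while `CreateUserRequest` stays stable for clients.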
Eventually consistent persistence layer
It is common for microservices to communicate via the Publish/Subscribe pattern and handle notifications asynchronously. In many scenarios, it is simpler and more accurate to follow an eventually consistent approach: an event notifies the service that something changed, the event details are ignored, and the service queries the publisher as the source of truth. This way the end result of processing a notification stream will be the same regardless of message order.
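Here is a small sketch of that "notify, then query" idea. The order domain, the event shape and the `fetch_status` callable (which stands in for a call to the publisher's API) are assumptions made for the example.

```python
from typing import Callable, Dict


class OrderStatusProjection:
    """Subscriber that keeps a local copy of order statuses."""

    def __init__(self, fetch_status: Callable[[str], str]) -> None:
        # fetch_status stands in for a call to the publisher's API.
        self._fetch_status = fetch_status
        self.statuses: Dict[str, str] = {}

    def on_notification(self, event: dict) -> None:
        order_id = event["order_id"]
        # Deliberately ignore any status carried by the event: it may be
        # stale if messages arrive out of order. Ask the source of truth.
        self.statuses[order_id] = self._fetch_status(order_id)


if __name__ == "__main__":
    publisher_state = {"o-1": "shipped"}
    projection = OrderStatusProjection(publisher_state.__getitem__)

    # The "created" event arrives after the "shipped" one.
    projection.on_notification({"order_id": "o-1", "status": "shipped"})
    projection.on_notification({"order_id": "o-1", "status": "created"})

    print(projection.statuses["o-1"])
```

The trade-off is an extra call to the publisher per notification, in exchange for not having to reason about message ordering.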
Extend the microservice to the data storage
Microservices expect separation of concerns even at the data storage level. If this doesn't happen, the coupling persists. If there's ever an issue with the database, all microservices depending on that database are impacted.
When possible, split the data persistence layer for each microservice. As with the code itself, this grants flexibility to choose a specific solution that works best for the target data, instead of having to work for all use cases in the monolith.
Use smart endpoints, dumb pipes
This statement reinforces the idea that the microservices should own all business logic. When it comes to communication, knowing how to communicate should live inside the microservice (the endpoints), not in the communication protocol (the pipe). This is why microservices tend to use "simple" communication protocols, such as REST and message queues, as opposed to BPEL or an Enterprise Service Bus.
When microservices own all the business logic, there's no need for another external entity that has to change whenever your microservices do, so there are fewer moving parts to worry about.
Monitoring is a must
A side effect of having a monolith is that, when something breaks, it's very likely you will notice it, given the blast radius. With microservices, not so much. To guarantee the best availability to your end users, a thorough monitoring and alarming process, based on SLAs, must be adopted. Key monitoring targets include latency, number of errors and resource usage (CPU, memory, disk space). Having the right level of monitoring for each service will allow the owning team to react to unusual situations (e.g. fatal bugs that require rollbacks, or traffic spikes that require scaling or rate-limiting) in a timely manner and leave your customers happy.
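As a rough sketch of what an alarm check on two of those targets might look like, the code below evaluates 95th percentile latency and error rate against thresholds. The threshold values, names and the nearest-rank percentile method are assumptions for illustration; a real setup would normally rely on a monitoring service rather than hand-rolled checks.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Thresholds:
    """Alarm limits; in practice these would be derived from your SLAs."""
    max_p95_latency_ms: float
    max_error_rate: float


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile, using integer arithmetic for the rank."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceil(pct * n / 100)
    return ordered[rank - 1]


def triggered_alarms(latencies_ms: Sequence[float], error_count: int,
                     request_count: int, limits: Thresholds) -> List[str]:
    """Return the names of the alarms that should fire for this window."""
    alarms: List[str] = []
    if latencies_ms and percentile(latencies_ms, 95) > limits.max_p95_latency_ms:
        alarms.append("latency")
    if request_count > 0 and error_count / request_count > limits.max_error_rate:
        alarms.append("errors")
    return alarms
```

Each alarm name could then be wired to an action, such as paging the owning team or triggering a rollback.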
Use an API Gateway
There are several benefits that arise when putting an API Gateway in front of a microservice architecture.
One is the reverse proxy nature of an API Gateway. It hides internal microservice endpoints under a single public endpoint. This allows flexibility when changing the underlying services. Consider migrations, where you're splitting the monolith up into smaller services, or you are building a brand new service to replace another one. The flip from old to new can be hidden behind an API Gateway, where the public endpoint for the service doesn't change, but the underlying service does.
Another benefit is centralising some of the cross-cutting concerns that API Gateway manages well. API Gateways already handle functionality such as authentication, authorization, response caching and throttling, so extracting this from each microservice into an API Gateway seems logical to keep microservices focused. Consider the API Gateway as its own microservice, responsible for tasks related to requests.
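The "flip from old to new" behind a stable public endpoint can be sketched as a routing table. This is a deliberately tiny model of one gateway responsibility, with made-up internal URLs; it does not represent any particular API Gateway product.

```python
from typing import Dict


class GatewayRoutes:
    """Maps stable public path prefixes to internal service URLs."""

    def __init__(self) -> None:
        self._routes: Dict[str, str] = {}

    def set_route(self, public_prefix: str, internal_base_url: str) -> None:
        """Add a route, or repoint an existing one to a new backend."""
        self._routes[public_prefix] = internal_base_url.rstrip("/")

    def resolve(self, public_path: str) -> str:
        """Translate a public path into the internal URL to call."""
        matches = [
            prefix for prefix in self._routes
            if public_path == prefix or public_path.startswith(prefix + "/")
        ]
        if not matches:
            raise LookupError(f"no route for {public_path}")
        prefix = max(matches, key=len)  # the most specific prefix wins
        return self._routes[prefix] + public_path[len(prefix):]


if __name__ == "__main__":
    routes = GatewayRoutes()
    routes.set_route("/users", "http://monolith.internal/legacy/users")
    print(routes.resolve("/users/42"))

    # Migration day: same public path, new underlying service.
    routes.set_route("/users", "http://user-service.internal/v1/users")
    print(routes.resolve("/users/42"))
```

Clients keep calling `/users/42` throughout; only the gateway's table changes.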
Logical architecture and physical architecture may differ
Microservices are a logical concept. The implementation of such a concept doesn't have to be a 1-to-1 copy of the design. Due to size or logical boundaries, you may have multiple services enclosed in the same microservice. To illustrate this, consider one microservice composed of two services. Say we have a User Search service and a User Details service that have isolated codebases but interact with the same database. It may make sense to logically call this the User Data microservice, while keeping the two services separate. This keeps them together within the same business logic boundary, but allows them to scale independently.
How do I start?
So you figured out that a microservices architecture is the right call for your project. You've designed the architecture, and are ready to implement. I would suggest not reinventing the wheel here. Microservices have been around for a while, and there are tools and services you can use to make the experience better. A lot of these were built with the best practices above in mind, so there's less to stress about. Let's take a look at some tools to get you started.
Docker
Docker is an open source platform used to build and run distributed applications, in the form of containers. Although containers are not synonymous with microservices, their usage shares a lot of the advantages of using microservices: isolation and scalability being two of them. Most of the tools below expect applications to be containerised, making Docker or other containers a must-use.
Amazon ECS
Amazon ECS is Amazon's managed container orchestration offering. It works a layer above Amazon EC2: you deploy containers rather than working with instances directly, but you still need to manage the underlying instances, which includes scaling, monitoring and patching. Because of this, ECS itself comes at no additional charge, and you pay only for the EC2 instances you use.
AWS Fargate
Fargate differs from ECS on EC2 in how the containers run. With Fargate, task execution is serverless, so there's no need to worry about EC2 instances; that's transparent to the user. This means there's far less operational effort with this solution, although you also have less flexibility. In terms of pricing, you pay for the resources your tasks use while they run instead of per instance, which may seem like an increase in cost at first glance, but it's possible for a badly implemented ECS solution to cost more than Fargate.
Kubernetes
Kubernetes is an open source container orchestration framework. Originating from Google's Borg, it's considered the standard in container orchestration. Many of the features Kubernetes offers out of the box cover requirements of microservices, such as service discovery and load balancing.
You can think of Kubernetes as sitting at a lower layer of abstraction than Amazon ECS. Nothing in Kubernetes is managed for you, so you will need to set up all the details. An advantage of this is that Kubernetes is vendor agnostic, so you can integrate with any cloud vendor, or even use it on-premises with your company infrastructure.
If this level of fine grained configuration is not what you need, but you still want to use Kubernetes, a managed Kubernetes service like Amazon EKS may be the right choice.
Amazon EKS
Amazon EKS is a managed Kubernetes service. Instead of using Amazon's proprietary orchestration like ECS, it uses a Kubernetes management layer for the same purpose. This makes it a more seamless transition if you're already using Kubernetes, but it can be more difficult to configure if you're not. Choose EKS if you plan to go multicloud in the future (since Kubernetes is vendor agnostic) and expect to scale to a large degree.
AWS Lambda
AWS Lambda is the one exception on this list to using containers. It is a Function as a Service offering, and arguably the easiest path to start using microservices in terms of ease of development and setup. AWS Lambda is particularly useful for handling short-running tasks, such as APIs, or very simple tasks. In those cases, the complexity of containers plus managed orchestration services may not be worth the effort.
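To give a feel for how little code a function needs, here is a minimal sketch of a Python Lambda handler. It assumes the function sits behind an API Gateway proxy-style integration, so the incoming `event` shape and the returned dictionary follow that convention; the greeting logic and the `name` parameter are invented for the example.

```python
import json


def handler(event, context):
    """Entry point invoked by AWS Lambda (proxy-style event assumed)."""
    # queryStringParameters may be missing or None when no query is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-written event; no AWS account needed.
    print(handler({"queryStringParameters": {"name": "Luis"}}, None))
```

There is no server, container image or orchestration configuration in sight, which is the appeal for small, short-running tasks.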
Conclusion
It's important not to pick a microservices architecture just because it's the new shiny thing in the industry, but to understand the benefits and drawbacks and choose accordingly. In this article we analysed the good and the bad of microservices, and offered some advice on how to begin and do it right from the start.