- Environment Problems
- Core Library Development
- Team Morale and Momentum
- Service Size and Granularity
- Resource Allocation
- Contract Versioning
- API Consistency and Design
- Data Modeling
- Organizational Structure
- Debugging and Logging
8 Months Ago…
The development team at Integrate is a textbook example of a team that hit a breaking point with a monolithic architecture. Our architecture and process made it increasingly difficult to keep up with the intense demands of a rapidly evolving market. It was evident that something needed to change.
We did our due diligence, researched and assessed our options, and ultimately landed on microservices, with a sprinkle of process changes on the side.
Fast-Forward 8 Months…
Well…the good news is that it feels like we are somehow right on schedule…(not really, but are you ever in software?). All jokes aside, we are where we need to be and have learned an incredible amount about microservices, processes and ourselves.
Anybody who has put even a few minutes into researching microservices has likely come across some of the following articles…
- Martin Fowler on Microservices
- Microservices: Four Essential Checklists When Getting Started
- Microservices in Production: The Good, The Bad and The It Works
- Microservices at Netflix: Lessons for Architectural Design
FYI…I just grabbed those in order from a Google search of “microservices”, but I actually did use these and many other resources along the way, and so should you.
We did a lot of research and felt we had a pretty good idea of the common pitfalls and complexities that would be introduced by migrating to microservices, and for the most part that research has helped us thus far. However, there have been some hiccups along the way that I am surprised did not make it into many (or, in some cases, any) of these lists.
Our team wanted to put together a brief list of what we felt were our top 10 ‘Gotchas’ thus far in hopes that they might be useful to others researching microservices as we did.
Top 10 ‘Gotchas’
1. Environment Problems
It doesn’t matter if it is your local desktop, staging or production: you will run into issues migrating from environments that are suitable for monolithic applications to ones that can handle distributed systems. It may be lost messages on your service bus, or a CI server that won’t build your services consistently, but something will happen.
Ramp up your DevOps skill set or hire somebody, because you are going to need it. Embrace the reality that the ‘server guys’ now play an integral role in your team’s successes and failures.
2. Core Library Development
This was a tough one for us to figure out, and nobody seemed to say much about it in other posts, so here it is.
Figure out your core library dependency chain and try to structure your team(s) in a way that reduces the chances of teams having to wait on each other. The time spent waiting for your core service to become stable adds up quickly and is frustrating to everybody.
Give some thought to how you will manage your packages and try to find a suitable package manager. Much like logging, we did not have to put much time into this, as NuGet is the de facto .NET package manager.
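For illustration, this is roughly what the package definition for a shared core library might look like as a .nuspec; the package id, version and dependency below are hypothetical, not our actual packages:

```xml
<?xml version="1.0"?>
<package>
  <metadata>
    <!-- Hypothetical id and version, for illustration only -->
    <id>Integrate.Core.Contracts</id>
    <version>1.3.0</version>
    <authors>Integrate</authors>
    <description>Shared service contracts and messaging abstractions.</description>
    <dependencies>
      <!-- Pinning a range keeps a breaking change in a dependency from sneaking in -->
      <dependency id="Newtonsoft.Json" version="[9.0,10.0)" />
    </dependencies>
  </metadata>
</package>
```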
3. Team Morale and Momentum
This was another one that crept up on us. Asking a team of developers to do things differently is never an easy task. Asking them to change the way they communicate and solve problems is an added burden.
In the beginning, the failures will far exceed the few wins that your team will have early on. This can get to a team, and it got to ours. Look for the short wins and don’t get discouraged when any one problem arises, regardless of its impact on momentum.
4. Service Size and Granularity
The time spent discussing whether a service is too big or too small, or whether it should be broken down into smaller services, adds up quickly. If you find a golden solution for determining this, then shoot me an email. Otherwise, go with your gut and be open to the idea that your gut is occasionally wrong.
But really…don’t spend too much time debating one way or the other. The true solution will reveal itself with time.
5. Resource Allocation
With any big decision impacting your business, you should consider whether or not you have the resources to see it through. Resources may be personnel, hardware or even training on new technologies. We had this in mind from the beginning, but it still managed to catch us off guard.
Whether you introduce microservices in small chunks or all at once will influence the amount of resources you need. To keep this from causing any catastrophic problems for your team, make sure you are on the same page as your organization in regards to the resources you expect to need.
6. Contract Versioning
I have to tread lightly so I don’t start a troll battle between versioning camps…
This is partially related to #2 on the list, because of the amount of time we spent introducing breaking changes during core service development before things stabilized.
Choose a versioning strategy for your contracts and services that will position your team to minimize the amount of downtime caused by frequent breaking changes early in microservice development. For us, the answer so far has been Semantic Versioning, but there are many methods that may make more sense for your team.
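To sketch how this plays out in practice, here is a hypothetical event contract from a shared package and how changes to it would map to version bumps under SemVer (the type and its properties are made up for illustration):

```csharp
using System;

// Hypothetical contract from a shared core package, for illustration only.
// Under Semantic Versioning, the package version tells consumers what shipped:
//   1.2.0 -> 1.2.1 (patch): bug fix, no contract change
//   1.2.1 -> 1.3.0 (minor): additive change, e.g. the optional property below
//   1.3.0 -> 2.0.0 (major): breaking change, e.g. renaming or removing OrderId
public class OrderCreatedEvent
{
    public Guid OrderId { get; set; }        // removing or renaming this forces a major bump
    public DateTime CreatedUtc { get; set; }

    // Added in 1.3.0; optional, so existing consumers keep working (minor bump).
    public string PromoCode { get; set; }
}
```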
Additionally, consider who will be consuming your contracts and services and set expectations up front…If you have no intention of third parties consuming your services, then don’t trouble yourselves.
7. API Consistency and Design
The consistency of your APIs can significantly impact the time it takes you to get microservices to production successfully. To avoid any single API Ambassador having to manage the consistency and standards of your API design, set aside time early on to discuss how you will keep your APIs consistent through frequently updated documentation and defined standards.
Take internal and external consumer flows into consideration. We have operated under the assumption that our API would include both internal and external API gateways. This is important to us because, had we designed our APIs around a single consumer flow, we would have hit a breaking point in the very near future.
There is an excellent article written by Netflix on accounting for multiple consumer flows and how they impact your API design: Embracing the Differences: Inside the Netflix API Redesign.
Also, decide whether you are using PUT or PATCH for partial updates, because we spent way too long on this.
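For what it’s worth, here is a minimal sketch of the distinction using ASP.NET Web API conventions; the resource types and fields are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Web.Http;

// Hypothetical resource; PUT expects all of it, PATCH expects only the changes.
public class Order
{
    public Guid Id { get; set; }
    public string Status { get; set; }
    public string ShippingAddress { get; set; }
}

// In a PATCH body, null means “leave this field as-is”.
public class OrderPatch
{
    public string Status { get; set; }
    public string ShippingAddress { get; set; }
}

public class OrdersController : ApiController
{
    private static readonly Dictionary<Guid, Order> Store = new Dictionary<Guid, Order>();

    // PUT replaces the whole resource: anything the client omits is overwritten too.
    [HttpPut]
    public IHttpActionResult Put(Guid id, Order order)
    {
        order.Id = id;
        Store[id] = order;
        return Ok(order);
    }

    // PATCH applies only the supplied fields and leaves the rest untouched.
    [HttpPatch]
    public IHttpActionResult Patch(Guid id, OrderPatch patch)
    {
        Order existing;
        if (!Store.TryGetValue(id, out existing)) return NotFound();
        if (patch.Status != null) existing.Status = patch.Status;
        if (patch.ShippingAddress != null) existing.ShippingAddress = patch.ShippingAddress;
        return Ok(existing);
    }
}
```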
8. Data Modeling
Eventually, you will need to determine how your data will be maintained between microservices. Hopefully, this happens sooner rather than later.
How will your consumers retrieve data that is assembled from multiple services? Will each of your services maintain a copy of the data to avoid expensive round trips?
Whether you plan on taking a Domain-Driven approach or you have other methods in mind, do your best to pick a strategy and stick with it. You will find out whether or not it is right soon enough.
We are still working to figure out what works best for us, but we haven’t run into any significant issues on either side of the spectrum that would convince us we are right or wrong thus far.
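To make the trade-off concrete, here is a minimal sketch of the composition option, where the consumer-facing side assembles a view from two services at read time rather than each service keeping a copy of the other’s data; the service URLs and response shapes are hypothetical:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical composer sitting in an API gateway or similar consumer-facing layer.
public class OrderDetailsComposer
{
    private static readonly HttpClient Http = new HttpClient();

    public async Task<string> GetOrderDetailsAsync(Guid orderId)
    {
        // Two network round trips per read; the alternative is replicating
        // customer data into the order service and accepting eventual consistency.
        Task<string> order    = Http.GetStringAsync("http://orders.internal/api/orders/" + orderId);
        Task<string> customer = Http.GetStringAsync("http://customers.internal/api/customers/for-order/" + orderId);
        await Task.WhenAll(order, customer);

        // Stitch the two payloads into the view the consumer actually asked for.
        return "{ \"order\": " + order.Result + ", \"customer\": " + customer.Result + " }";
    }
}
```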
9. Organizational Structure
If you haven’t read up on Conway’s Law, then I would suggest reading it. If you don’t believe it…then start believing it.
Chances are, migrating to microservices will require at least some modifications to your team structure, if not significant ones. With these changes come communication bottlenecks and breakdowns. Many of the communication channels that your team(s) utilized before may be obsolete in a distributed environment. More importantly, you may want to place constraints on certain channels to facilitate autonomous thinking within teams. Companies like Amazon have paved the way for these added constraints and have realized significant benefits in service design.
A popular rant by Steve Yegge, scattered all over the internet, reflects how seriously a company such as Amazon enforces constraints on communication between its service teams.
As a team, we have only recently begun to see the reasoning behind these methods, but we are learning quickly that the level of autonomy needed to scale our teams in a distributed environment is significant, and setting these constraints can help facilitate this evolution.
10. Debugging and Logging
Debugging distributed applications is much more complex than debugging non-distributed ones. IDE and tool support for debugging distributed systems is horrid at best, and you will find yourself in parts of the internet that you do not want to be, trying to fix issues. (Seriously, anything past the 3rd page on Google is just creepy…)
Logging can help bring order to the distributed chaos that is microservices. There are many centralized logging solutions out there, and they hardly differ from one another with the exception of support and pricing. Get used to the idea that logging will play a crucial role in your team’s ability to diagnose issues.
Very little time was spent by our team choosing how to implement logging. We defined our needs up front and went the open source route with Apache log4net and Graylog2.
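To give a feel for it, here is a minimal log4net usage sketch; the service class and correlation id are hypothetical, and shipping log events to Graylog2 is typically handled by a GELF appender configured separately:

```csharp
using System;
using log4net;

public class PaymentService
{
    // One logger per type makes it easy to filter by source in Graylog2.
    private static readonly ILog Log = LogManager.GetLogger(typeof(PaymentService));

    public void Charge(Guid correlationId, decimal amount)
    {
        // A correlation id in every message is what lets you trace a single
        // request across services once the logs are centralized.
        Log.InfoFormat("Charge started. CorrelationId={0} Amount={1}", correlationId, amount);
        try
        {
            // ...call the payment gateway...
        }
        catch (Exception ex)
        {
            Log.Error("Charge failed. CorrelationId=" + correlationId, ex);
            throw;
        }
    }
}

// At startup (e.g. in Global.asax or Program.Main), load the appender
// configuration, including the Graylog2/GELF appender, from app.config:
//   log4net.Config.XmlConfigurator.Configure();
```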