Service discovery is a difficult but essential part of modern application development and delivery. Below, I’ll take a look at what service discovery means, why it’s important and which tools can help your DevOps team master it.
What is service discovery, and why does it matter?
The concept of service discovery is simple enough. As its name implies, the process involves allowing your apps or servers to identify which services are available on the network, and which IP addresses and ports are associated with them.
Service discovery is therefore essential for efficient inter-app and inter-device communication. Yes, you could identify service information manually, but on modern scaled-out infrastructure that would be more work than human admins could realistically handle. So instead of trying to do things the manual way, you rely on service discovery tools to collect service information for you.
And to be clear, I’m not talking just about traditional networking here. Identifying hosts on the network is one form of service discovery. But if you’re a DevOps team building a microservices-based app (which you probably are if you’re using Codefresh to automate your development pipeline), then the main reason why service discovery matters to you is that it’s essential for allowing apps to exchange information via APIs. And since APIs are the glue that holds modern clustered infrastructure together, you can’t really live without a reliable way for services to find one another’s endpoints.
Service discovery challenges in the DevOps age
Once upon a time, service discovery was not particularly challenging. It was easy enough to automate by building databases of service information and updating them periodically.
But that approach no longer works in the DevOps age. Modern infrastructure scales and changes so quickly that a periodically updated database of service information is out of date almost as soon as it is written. Instead, you have to be able to discover services in real time, and update your service information continually. DevOps is all about continuous workflows, after all.
If you’re into metaphors, here’s another way to understand database-based service discovery: In the days of traditional infrastructure, service discovery worked like a phonebook. Application services within your infrastructure ran on fixed, known addresses and ports, which did not change. So even if your registry of service information was not regularly updated, you could still count on it to work — just as a phonebook that is ten years old would still contain mostly accurate information, because people do not change phone numbers very often.
But today, the phonebook approach fails because the infrastructure changes too often. Virtual hosts are added to and removed from the network at dizzying rates as containers or virtual servers spin up and down in response to shifts in infrastructure demand. Plus, because modern infrastructure makes heavy use of software-defined networking, and because Dockerized apps tend to pay little heed to traditional conventions for configuring port numbers for particular services, discovering IP addresses and ports based on expectations about common practices is not reliable.
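To make the contrast with the phonebook concrete, here is a minimal Python sketch of a registry whose entries expire unless each service instance keeps checking in. It is only an illustration under stated assumptions: the `ServiceRegistry` class, the ten-second TTL and the addresses are all hypothetical, and real tools layer health checks and replication on top of this idea.

```python
import time


class ServiceRegistry:
    """Toy in-memory registry: entries expire unless the instance keeps renewing them."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        self._entries = {}  # (service name, host, port) -> time of last heartbeat

    def heartbeat(self, name, host, port):
        """Register an instance, or renew it if it is already known."""
        self._entries[(name, host, port)] = self.clock()

    def deregister(self, name, host, port):
        """Remove an instance that is shutting down cleanly."""
        self._entries.pop((name, host, port), None)

    def lookup(self, name):
        """Return (host, port) pairs for instances that checked in within the TTL."""
        now = self.clock()
        # Forget instances that stopped sending heartbeats (crashed or scaled down).
        self._entries = {
            key: seen
            for key, seen in self._entries.items()
            if now - seen <= self.ttl_seconds
        }
        return sorted((host, port) for (n, host, port) in self._entries if n == name)
```

A container would call `heartbeat("api", host, port)` every few seconds while it runs. Once it disappears, its entry ages out on the next `lookup`, so callers never see a ten-year-old phone number.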
Modern service discovery
So, how do you solve this conundrum? You use a service discovery tool that is designed to handle the complexities of modern infrastructure. In this section I’ll discuss some of the leading tools available, and what you need to know about how they work.
Service discovery models
Before choosing a service discovery tool, you should understand that there are two basic approaches to service discovery in modern architectures. The first is client-side discovery, which means that each client requests service information from a registry itself and decides which instance to call. Service discovery in this model is decentralized.
The other approach is server-side service discovery. That means that a centralized component, such as a router or load balancer, keeps track of service information and resolves service locations on behalf of clients.
Technically speaking, there are few important differences between these two models. They both achieve the same effect. The main difference is the level of centralization and connectivity involved. If you have a large network of decentralized clients that don’t always have reliable connections to the main server (which could be the case if you develop a mobile app, for instance), then client-side service discovery will probably work better for you. On the other hand, if you’re running a private on-premises cloud and can count on near-100% availability of all hosts, server-side service discovery should work fine, and might be simpler to implement than the alternative.
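One common way to picture the two models is sketched below in Python. In the client-side version, each client reads the registry and picks an instance itself. In the server-side version, clients know only a central router, which does the lookup for them. Every name here (`REGISTRY`, `RoundRobinPicker`, `SelfDiscoveringClient`, `Router`, `ThinClient`) and every address is hypothetical, and round-robin selection is just one simple policy chosen for illustration.

```python
import itertools

# Hypothetical registry contents: service name -> list of (host, port) instances.
REGISTRY = {"orders": [("10.0.0.5", 32768), ("10.0.0.6", 32771)]}


class RoundRobinPicker:
    """Looks a service up in the registry and rotates through its instances."""

    def __init__(self, registry):
        self._registry = registry
        self._counters = {}  # service name -> itertools.count() cursor

    def pick(self, service):
        instances = self._registry.get(service, [])
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        counter = self._counters.setdefault(service, itertools.count())
        return instances[next(counter) % len(instances)]


class SelfDiscoveringClient:
    """Client-side discovery: each client consults the registry on its own."""

    def __init__(self, registry):
        self._picker = RoundRobinPicker(registry)

    def call(self, service, request):
        host, port = self._picker.pick(service)
        return f"{request} -> {host}:{port}"


class Router:
    """Server-side discovery: one central component resolves services for everyone."""

    def __init__(self, registry):
        self._picker = RoundRobinPicker(registry)

    def route(self, service, request):
        host, port = self._picker.pick(service)
        return f"{request} -> {host}:{port}"


class ThinClient:
    """Server-side discovery from the client's view: it only knows the router."""

    def __init__(self, router):
        self._router = router

    def call(self, service, request):
        return self._router.route(service, request)
```

The trade-off shows up in where the picker lives. With `SelfDiscoveringClient`, every client carries lookup logic and needs access to the registry. With `Router`, clients stay simple, but the router has to be reachable for every request.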
Service discovery tools
The more interesting issue to consider is which service discovery tool might work best for you. Here’s a rundown of the leading tools available for containerized infrastructure, and brief notes on how each works:
- Consul: This tool from HashiCorp offers a simple way for services to discover one another via HTTP and DNS. It’s designed for massively scalable infrastructure and high availability, which makes it a good solution if that’s what you are working with. It may be overkill if your infrastructure is relatively small, however.
- etcd: A well-vetted distributed key-value store, developed primarily by CoreOS, that is widely used for service discovery. It’s the most obvious service discovery solution for you if you already use CoreOS’s tools, or if you run Kubernetes, which uses etcd as its backing store.
- Apache ZooKeeper: Originally developed for use with Hadoop, the big data platform, this is a centralized coordination service that can act as a server-side service discovery tool. It’s a good choice if, for the reasons outlined above, you’re interested in a centralized service discovery solution.
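As a rough illustration of discovery over HTTP, here is a minimal Python sketch that asks a Consul agent’s catalog endpoint (`/v1/catalog/service/<name>`) for the instances of a service. The agent URL is an assumption (a local agent on Consul’s default HTTP port, 8500), and the function names are hypothetical. Treat it as a sketch and not as the recommended client; production code would more likely use a maintained client library and filter on health checks.

```python
import json
from urllib.request import urlopen


def parse_catalog_response(body):
    """Turn a /v1/catalog/service/<name> JSON body into (address, port) pairs."""
    endpoints = []
    for entry in json.loads(body):
        # ServiceAddress may be empty, in which case the node's Address applies.
        address = entry.get("ServiceAddress") or entry["Address"]
        endpoints.append((address, entry["ServicePort"]))
    return endpoints


def discover(service, agent="http://127.0.0.1:8500"):
    """Ask a Consul agent for instances of `service` (the agent URL is an assumption)."""
    with urlopen(f"{agent}/v1/catalog/service/{service}", timeout=5) as response:
        return parse_catalog_response(response.read())
```

With an agent running locally, `discover("web")` would return the address and port of each registered `web` instance. `parse_catalog_response` can be tried on a saved response without any network access.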
Docker Swarm
The fourth tool worth noting is Swarm, Docker’s container orchestrator. But instead of placing Swarm on the list above, I’m dedicating a special section to it. That’s because Swarm itself is not a dedicated service discovery tool. Standalone Swarm integrates with any of the three tools outlined above to do service discovery, while the swarm mode built into Docker Engine provides its own DNS-based service discovery for the services it runs.
So why mention Swarm at all, you ask, if it is not a service discovery tool in its own right? Because Swarm is now baked into Docker as of the Docker 1.12 release. That means that there’s a good chance you’re going to be using Swarm for the orchestration of your app containers, and you should understand how Swarm does service discovery. That’s all explained in detail in Docker’s documentation, so I won’t rehash the information, but I note it to help you understand how all of these pieces of your containerized infrastructure fit together.
Conclusion
Service discovery for modern infrastructure presents new challenges, which arise from the flexible and scalable nature of the infrastructure itself. But those challenges have solutions, and you can take your pick from among several tools when deciding how best to perform service discovery in your environment.