Unless you’ve been hiding in a container for the past few months, you’ve probably heard of Kubernetes (often called k8s), the best container orchestration tool around. K8s configuration is a bit more involved than a simple Docker run command or Compose definition. However, in return for this complexity, you get a cluster that is fault tolerant, self-healing, and auto-scalable. If you are looking to move from native Docker tooling to k8s, I’ve created a tool for you!Keep reading
A few weeks ago we were alerted that Docker Swarm was using over 8GB of RAM. Our investigation led us to discover an unexpected factor that determines its memory usage. After a bit of graphing and math, we were able to locate the code behind this unexpected behavior.Keep reading
In my previous post, I went over how our event-driven architecture allows us to rapidly ship new features. This post covers how we used this model to ship a new feature: Docker Compose support.
Docker Compose enables our customers to build environments on Runnable using the same configuration they use to deploy to production, staging, or wherever they currently use Docker Compose. The best part? They get this with no additional setup.Keep reading
It’s important that we provide users with the best experience. Part of that means that our service is available through hardware failures. And when things do go wrong, we need systems in place to monitor key metrics and send alerts to services and our team. Initially, we chose Datadog as our monitoring solution because it was easy to set up, and it provided integrations to services that we used. Then we started scaling our customers’ infrastructure to keep up with demand and saw our infrastructure go from 5 to 500+ servers. This didn’t jive with Datadog’s per-server cost model, as it increased our bill from $75 to $7,500+ per month. In order for us to move away we needed something that provided auto-discovery of new servers, collected host and container metrics, alerted us on abnormal conditions, and had an easy way to visualize data. We turned to the open-source world and discovered Prometheus, a monitoring solution built by SoundCloud.Keep reading
When we first started deploying containers across multiple servers, we managed scheduling ourselves. We had to maintain cluster state and determine the best place to schedule a container. We had a solution, but it was not elegant or pretty. When Swarm came out, it promised to solve our scheduling woes. Unfortunately, using it in production hasn’t been as straightforward as we’d hoped. In this post, I’ll cover the problems we encountered and how we worked around them.Keep reading
Using the right patterns to communicate between microservices can help scale your application and solve most distributed systems problems. We started with direct HTTP calls for all communication, but decided to move to an event-driven system. This system changed the way we thought about interactions between services, forced scalable patterns, and increased our resilience.
We moved to using events over traditional HTTP communication for a few reasons. First, it forced decoupling of services. From our experience with HTTP, one service would make calls to every service it needed to, and that meant the original service would need a client library for every service it communicated with. The client library would ensure errors would not stop or block functionality, and would be consistent with each service.
We run hundreds of thousands of containers across hundreds of servers a day. One of the biggest challenges we face is how to efficiently schedule containers. In this sense, scheduling is managing the allocation of containers to a set of servers in order to keep things running smoothly. Because the containers we schedule are components of our customers’ applications, we have to schedule them with no prior knowledge of their performance characteristics.Keep reading
Building and compiling code can take a huge hit on our time and resources. If you have dockerized your application, you may have noticed how much of a time-saver Docker cache is. Lengthy build commands can be cached and not have to be run at all! This works great when you’re building on a single host; however, once you start to scale up your Docker hosts, you start to lose that caching goodness.Keep reading
Integration testing and debugging many microservices can be painful. Often, I need to debug a service on a staging environment. This article shows how we use Docker for Mac with Weave (an overlay network) to connect our local machine to our remote staging environments.
In my workflow, I usually create a WIP git commit, push it to staging, and try to debug with the Ubuntu server’s limited tools. I could set up a bunch of SSH tunnels to connect to all the remote services, but our stack changes too frequently and finding IP addresses for each service is a pain. My dev machine is a Mac, so most of the tools we use locally don’t work on Linux.
I wanted something better that would speed up my dev and debugging flow. The first thing I tested with the Docker for Mac beta was Weave integration. Weave creates an overlay network to connect containers across multiple hosts together, which is very useful when distributing containers across a Swarm cluster.