bitly’s stack isn’t fancy. We use a combination of languages and tools that interact well with each other and promote philosophies such as explicitness, simplicity, and robustness. Most of our services are Python with many core components in C (of which some are open sourced under simplehttp). We believe whole-heartedly in a services oriented approach where components do one thing and do it well.
C is a fantastic language, one which I highly encourage all developers to learn and experiment with as it pulls away the veil and provides insight into all the things that are happening behind those seemingly innocent lines of a scripting language. With great power comes great responsibility and sometimes that power comes at a cost – primarily in terms of development and debugging time and thus the time it takes for you to ship your product.
Python has also treated us well. Its readability, standard library, and obviousness make it a great language to work with on a team. It performs well and is flexible to the point where it can be used successfully in most any situation – whether it be data science, network daemons, or one off shell scripts.
We identified early on that Go had all the makings of a language that could supersede some of the places we would have traditionally turned to C and some of the places where we wanted to move away from Python.
For example, we’ve built many network daemons in C that don’t require strict, absolute, control over memory. This is where Go shines. The “cost” of a garbage collected language in these types of situations is overshadowed by the huge benefit to development time and flexibility. Code becomes more readable as a natural side effect of being able to focus more on the “meat” of the problem instead of having to worry about low level details. Mix in the “batteries included” standard library and you’re cookin’.
Also, all of our C apps are built on libevent - i.e. a single-threaded event loop with callbacks that enables us to efficiently handle many thousands of simultaneous open connections. We use the same technique in Python via tornado, but the amount of work a single Python process can do is limited largely by runtime overhead. Additionally, this style of code can be hard to read and debug when you’re not used to it. Go’s lightweight concurrency model allows you to write in an imperative style while providing a built-in way to leverage the multi-core architecture of the computers we operate on.
Just as in C, Go promotes a style of very explicit and localized error handling. While sometimes verbose it forces a developer to really think about how their program might fail and what to do about it. Unlike C, Go has the built-in capability of returning multiple values from a function, resulting in clearer and more concise error-handling code.
The points above are mostly low-level. Go’s true power comes from the fact that the language is small and the standard library is large. We’ve had developers start from zero experience with Go to writing working, production, code in a day. That is why we’re so excited about its place in our stack.
Think about it like this… if you can write something in Go just as fast as you could in Python and:
- gain the speed and robustness of a compiled, statically typed language without all of the rope to hang yourself
- clearly express concurrent solutions to parallelizable problems
- sacrifice little to nothing in terms of functionality
- unambiguously produce consistently styled code (thank you
What would you choose?
And in an effort to encourage the team to experiment and learn about Go we created the “Go Gopher Award”! The most recent engineer to learn the language and get code into production gets to have the Go Gopher Squishable sit with them at their desk. This is an extremely prestigious award.
Go Isn’t Perfect
Let’s talk about garbage collection. It isn’t free. Go’s garbage collector is a conservative, stop-the-world, mark-and-sweep GC. Practically speaking this means that the more careless you are in what, how many, and how often you allocate the longer the world stops when the Go runtime cleans things up. Unsurprisingly, significant performance gains can be had by being more careful… not only in raw speed but in the consistency of it. Fortunately Go provides fantastic hooks (pprof) to introspect the runtime behavior of your application. It’s generally obvious when your bottleneck is the GC.
Interacting with JSON can be a bit clumsy. It’s not terrible… just isn’t as elegant and readable as the alternative in Python. Truthfully it’s quite impressive that it’s as good as it is considering the type safety of the language. We’ve made an effort to make this even easier and open sourced simplejson, a package that exposes a clean, chainable, API for interacting with JSON where the format may not be known ahead of time.
Packaging has also given us some trouble. This is an area where we’ve probably gone the most back-and-forth and are admittedly still in the early stages of finding what works for us. The biggest problem is somewhat related to its purported greatest strength, i.e. qualifying the full path when importing an open-source package. This has led to a whole crop of third-party utilities that work around this by re-writing fully qualified import paths to something the user specifies (and installing the package as the same, modified, name). We use go-install-as and our policy is to install non-standard packages under our own custom import paths to ensure that our apps are always compiled against the version and source we’ve strictly enforced to be installed on a machine.
One notable feature missing from the HTTP client in Go’s standard library is the ability to set timeouts. It is further complicated by the inability of the API to provide easy access to the connection (to do it yourself). We set out to bridge the gap and wrote go-httpclient (with the expectation that these features will eventually make it into the standard library).
Oh, and one night after way too much drinking they decided to use a radical syntax for
formatting time as strings (instead of your standard
strftime parameters). We open sourced
go-strftime to remedy that.
All in all these “issues” are minor and ultimately we felt that the benefits far outweigh the tradeoffs – Go meets our needs and philosophies as software engineers.
Go In Production
Where are we actually using Go in production? Let’s take a step back and talk about how data moves through our infrastructure first.
In the spirit of keeping things simple, fault tolerant, and asynchronous, most everything that gets
written to downstream systems is done so via message queues (formatted in JSON with all
communication over HTTP). Some of these message queues process volumes in the thousands per second.
We call the workers
queuereaders, all of which were originally written in Python. At the lower
level are two C apps – simplequeue, an HTTP interface to an in-memory data agnostic queue, and
pubsub, an HTTP interface for one-to-many streaming.
As volume increases, the overhead of processing each message becomes more significant. Therefore, these applications emerged as one key area where Go could replace Python. We started by re-writing some queuereaders in Go by building matching libraries to support the same API we expose in Python. Also, whereas in Python we would run multiple processes, in Go we leverage channels and goroutines to write straightforward code that parallelizes HTTP requests to downstream systems or aggregates writes to disk. These same features make it really easy to batch retrieve messages from the queue while still processing them independently, reducing round-trips.
By experimenting with non-critical systems at the edge of our infrastructure we were able to safely and methodically learn about Go’s behavior in production. Needless to say, we’ve been very happy. We’ve seen consistent, measurable performance gains without sacrificing stability.
We also wanted a bit more functionality out of the core combination of
so we decided to build a successor with the goal of improving message delivery guarantees,
maintaining the spirit and simplicity of what we already had, increasing performance, solving some
configuration and setup issues, and providing a straightforward upgrade path.
We realized Go would be perfect for this project. We’ll talk about it in depth in a future blog post (and don’t worry, it will be open source). Is it any good? Yes.
Finally, we’ve also open sourced a few other Go related items…
- go-notify - a package to standardize one-to-many communication over channels
- file2http - a utility to spray a line-oriented file at an HTTP endpoint
As always, bitly is hiring.