What I learned in 2017 Writing Go

2017-12-31

A little over a year ago, I joined Cloud Foundry to work on Loggregator, Cloud Foundry's application logging component. Its core concern is best-effort log delivery without pushing back on upstream writers. Loggregator is written entirely in Go.

After spending more than a thousand hours working with Go in a non-trivial code base, I still admire the language and enjoy using it. Nonetheless, our team struggled with a number of problems, many of which seem unique to Go. What follows is a list of the most salient problems.

Project Organization

Cloud Foundry was an early adopter of Go at a time when few people knew what idiomatic Go looked like or how to structure a large project. As a result, a year ago Loggregator suffered from a haphazard organization that made the code difficult to understand, let alone to search for dead code paths or candidates for refactoring. There seemed to be a tendency to extract tiny packages first instead of waiting for a shared concern to emerge from the code and only then extracting a package. There were many examples of stuttering between package names and types, where a type name repeats the name of its package (for example, http.HTTPServer instead of http.Server). Worst of all, there was little reusable code in the project.

Given the code's state of organization, Peter Bourgon's advice on how to organize Go code has been invaluable, as has the rest of his material on best practices. Likewise, the Go blog's post on package names provides many helpful guiding principles. For especially large projects, the distinction between cmd and pkg has become a best practice. See, for example, Moby, Kubernetes, and Delve. More recently, there is the excellent Style guideline for Go packages.

When just starting a project, a pkg directory seems unnecessary. I prefer to begin with a cmd directory and whatever Go files are needed at the top level, as if the project were a library. As the project grows, I like to identify packages, which start out as peers of the cmd directory. When the time seems right, it is easy to move those peers of cmd into a pkg directory.

Small main functions

Just as a poorly organized project results in a ball of mud, a careless approach to a main function can result in needless complexity. Compare two versions of the same main function: before and after. One version is over 400 lines. The other is about 40 lines. That's an order of magnitude. One will be easy to change. The other will not be. Delve is exemplary in its clean and focused main function.

A main function should be a particular invocation of library code. That means collecting any input necessary for the process and then passing that input to library code. This style of main function is more likely to result in testable and composable code.

Dependency Management

Dependency management has been a perennial topic in the Go community. Loggregator has used git submodules to vendor dependencies. The approach works, but it's also cumbersome. Spending some time with Rust has reminded me how sorely Go needs an officially supported dependency management tool as part of the Go toolchain. The work on dep is encouraging.

Keeping Go Meta Linter Happy

Without running Go Meta Linter regularly, all sorts of mistakes will creep into a code base. In particular, I have discovered the value of Package Driven Development, i.e., writing code that looks good when running godoc some-package, a practice which shares a history with conventions in the Python community. The documentation for a package should be easy to understand, intention-revealing, and meaningful.

Over the course of the year, I lamented on numerous occasions the lack of documentation for Loggregator internals, which made the code even slower to understand. Fortunately, our team has come to share the view that documentation is important and has been gradually working to ensure all files within the project pass the Go Meta Linter.

Writing Performant Code starts with measuring

Go is capable of fast performance. It is tempting to prematurely optimize code with the idea that a particular design is "faster." In fact, until you have measured current performance and determined that current performance is inadequate, "faster" is a totally meaningless word.

Such a statement is hardly controversial, and yet I have worked with numerous well-intentioned individuals who immediately reach for sophisticated designs on the dubious grounds of their being "faster." Fortunately, there is a strong interest in the discipline of writing high performance code. See, for example, Dave Cheney's High Performance Go Workshop or Damian Gryski's in-progress book on Go Performance.
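Measuring first does not need much machinery. The sketch below compares two hypothetical ways of joining strings (joinConcat and joinBuffer are invented for this example) using testing.Benchmark from the standard library, so that any claim about which one is "faster" rests on numbers from the machine at hand. The input size of 1000 is an arbitrary assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// sink keeps the compiler from optimizing away the benchmarked calls.
var sink string

// joinConcat builds the result with repeated string concatenation.
func joinConcat(parts []string) string {
	var s string
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuffer builds the same result with a bytes.Buffer.
func joinBuffer(parts []string) string {
	var b bytes.Buffer
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	// Arbitrary input for illustration: 1000 one-byte strings.
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "x"
	}

	candidates := []struct {
		name string
		fn   func([]string) string
	}{
		{"concat", joinConcat},
		{"buffer", joinBuffer},
	}

	for _, c := range candidates {
		fn := c.fn
		// testing.Benchmark picks b.N so that the loop runs long enough
		// to give a stable per-operation figure.
		r := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = fn(parts)
			}
		})
		fmt.Printf("%-8s %s %s\n", c.name, r.String(), r.MemString())
	}
}
```

In a real project the same loops would more commonly live in a _test.go file as BenchmarkXxx functions and be run with go test -bench . -benchmem. The standalone form is used here only to keep the example in a single file.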

Having a shared nomenclature for testing

There seems to be a consensus that writing well-tested code is important. What is lacking, though, is a clear understanding of the differences between test-doubles, mocks, spies, stubs, and fakes. Uncle Bob has explained what each of these terms means in his post The Little Mocker. On Loggregator, we had a mix of all these terms and they were rarely used correctly. It may seem pedantic to insist on using these terms correctly, but then again, software engineering leaves little room for ambiguity, so why would we not use the same standards for our choice of words? In my view, a shared nomenclature -- what has elsewhere been called a "ubiquitous language" -- is the first and most important thing for a team of engineers.

The Basics

Finally, both for people new to Go and people experienced with Go, I continue to find immense value in the following posts.

Edit

Thanks to Jason Keene for reading through this post and pointing out GoDoc's relationship to Python.