With tens of millions of Python invocations every day, what’s a packager to do? The answer must account for the insanity of a deeply heterogeneous production environment: different kernels, different OS distributions, even different versions of system Python. Each package also needs to carry enough context to provide a consistent reference point for when it was packaged relative to the work of thousands of engineers in a single, unified source control tree. Lastly, at Facebook scale, with hundreds of thousands of servers, every byte sent over the network and stored on disk counts, and every wasted CPU cycle can create myriad challenges for data center operations.
Sure, it’d be easy to show what a beautiful, easy packaging format we’ve developed at Facebook, and sing its praises, but that’s not what this talk is about. Instead, we’ll get into the nitty-gritty, and talk about hard tradeoffs that happen when developing a system in the real world. This is an in-depth look at how Facebook’s packaging has evolved, warts and speed bumps included. Some of the design goals we addressed (and/or issues we hit!) included:
* Synchronizing and pinning versions of underlying compiled libraries across related tools
* Running the package as a transparently Pythonic command-line utility, à la shebang, so we could do in-place replacements of packages written in other languages (e.g. C++)
* Optimizing the size of packages transferred across the network
* Optimizing package sizes on disk
* Minimizing package launch times
* Handling packages launched from a network-FS location that subsequently goes offline
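To make the shebang goal concrete, here is a minimal sketch using the standard library's `zipapp` module. It is a stand-in for illustration, not Facebook's packaging format: the helper names `build_executable_archive` and `run_archive`, the `hello_src` directory, and the `/usr/bin/env python3` interpreter line are all assumptions. It bundles a directory containing `__main__.py` into a single compressed archive that starts with a shebang line, then runs it.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

# Hypothetical interpreter line for the shebang; adjust for your environment.
SHEBANG_INTERPRETER = "/usr/bin/env python3"


def build_executable_archive(workdir: Path) -> Path:
    """Bundle a tiny source tree into one compressed, shebang-prefixed file."""
    src = workdir / "hello_src"
    src.mkdir()
    # __main__.py is the entry point Python runs when handed a zip archive.
    (src / "__main__.py").write_text('print("hello from inside the archive")\n')

    target = workdir / "hello.pyz"
    zipapp.create_archive(
        src,
        target=target,
        interpreter=SHEBANG_INTERPRETER,  # written as "#!<interpreter>" on line 1
        compressed=True,                  # deflate the contents to reduce size
    )
    return target


def run_archive(archive: Path) -> str:
    """Run the archive with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(archive)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = build_executable_archive(Path(tmp))
        print(zipapp.get_interpreter(archive))
        print(run_archive(archive))
```

On POSIX systems `zipapp` also marks the target file as executable when an interpreter is given, so the archive can be invoked directly like any other command-line tool. This sketch covers only pure-Python code; compiled libraries, launch-time costs, and network filesystems raise the harder tradeoffs listed above.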
Attendees will learn about the different tools and techniques we used to solve these challenges, as well as the reasoning behind the tradeoffs we made.
Dan Reif is a Production Engineer on the MySQL Infrastructure team at Facebook, which has automated the management of tens of thousands of MySQL servers in the largest deployment in the world, almost exclusively with Python. Previously, he was Director of Engineering at a managed hosting company.