Let’s Encrypt Infrastructure

We occasionally get questions about what Let’s Encrypt’s operations infrastructure is like. Here’s a quick overview.

Let’s Encrypt’s services are operated on dedicated infrastructure with stringent physical access controls. We currently have about 38 rack units of hardware, consisting primarily of Hardware Security Modules (HSMs), compute nodes, storage, switches, and firewalls. There is quite a bit of physical and logical redundancy to protect us from failures.

The hardware is split between two sites. These two sites are separated such that it’s very unlikely that a major event could bring down both sites. At each site, our hardware is located inside a special secure room inside a datacenter. These special rooms require extra authentication, and cannot be entered alone.

Our systems primarily run Linux. We make heavy use of configuration management to automate deployments; our goal is that nothing be deployed or configured manually in our environment. We are even working to bring systems that aren't typically manageable this way under the same paradigm. As a result, we can re-deploy identical environments in a matter of minutes, with no surprises.

Our API endpoints and OCSP services are proxied by Akamai. This gives us powerful traffic management capabilities, including DoS mitigation and caching, and greatly increases our confidence that we can keep our services up and running under extreme traffic conditions.

Our infrastructure is constantly under internal review, but we also rely on audits to help ensure safety and correctness. We go through WebTrust audits to ensure that we’re complying with the Baseline Requirements and meeting or exceeding the expectations of the Web PKI community. We also have security audits, including penetration tests, performed by a separate entity. Both audit types provide us with valuable feedback.

Our operations team has worked incredibly hard over the past year to get this infrastructure ready and we’re pleased with the results so far.

MozJPEG 3.0 Released

Today we’re releasing mozjpeg 3.0, featuring a return to ABI compatibility with libjpeg-turbo, compression improvements, deringing for black-on-white text, and a large number of other fixes and improvements.

Code written against earlier versions of the mozjpeg library will need to be updated; see the notes here. While this might be a bit of a pain, it’s a worthwhile change because applications can now easily switch between mozjpeg and libjpeg-turbo.
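To illustrate what that compatibility buys you, here’s a minimal sketch of JPEG compression written against the standard libjpeg API. The helper name and arguments are my own for illustration, not part of either library; the point is that nothing in it is mozjpeg-specific, so the same code builds and links against mozjpeg 3.0 or libjpeg-turbo unchanged.

    /* Hypothetical helper: compress a raw RGB buffer to a JPEG file using the
       standard libjpeg API. Nothing here is mozjpeg-specific; the same code
       links against mozjpeg 3.0 or libjpeg-turbo unchanged. */
    #include <stdio.h>
    #include <jpeglib.h>

    static void write_jpeg(const char *path, const unsigned char *rgb,
                           int width, int height, int quality)
    {
        struct jpeg_compress_struct cinfo;
        struct jpeg_error_mgr jerr;
        FILE *out = fopen(path, "wb");
        if (out == NULL)
            return;                          /* error handling elided for brevity */

        cinfo.err = jpeg_std_error(&jerr);
        jpeg_create_compress(&cinfo);
        jpeg_stdio_dest(&cinfo, out);

        cinfo.image_width = width;
        cinfo.image_height = height;
        cinfo.input_components = 3;          /* packed RGB input */
        cinfo.in_color_space = JCS_RGB;
        jpeg_set_defaults(&cinfo);
        jpeg_set_quality(&cinfo, quality, TRUE);

        jpeg_start_compress(&cinfo, TRUE);
        while (cinfo.next_scanline < cinfo.image_height) {
            JSAMPROW row = (JSAMPROW)&rgb[cinfo.next_scanline * width * 3];
            jpeg_write_scanlines(&cinfo, &row, 1);
        }
        jpeg_finish_compress(&cinfo);
        jpeg_destroy_compress(&cinfo);
        fclose(out);
    }

Switching a project from one library to the other is then largely a matter of which library you link against.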

Thanks to Frank Bossen, Kornel Lesiński, Derek Buitenhuis, and Darrell Commander for their help with this release.

Here’s hoping you enjoy smaller JPEG files in the new year!

Let’s Encrypt

Today we announced a project that I’ve been working on for a while now – Let’s Encrypt. This is a new Certificate Authority (CA) that is intended to be free, fully automated, and transparent. We want to help make the dream of TLS everywhere a reality. See the official announcement blog post I wrote for more information.

Eric Rescorla and I decided to try to make this happen during the summer of 2012. We were trying to figure out how to increase SSL/TLS deployment, and felt that an innovative new CA would likely be the best way to do so. Mozilla agreed to help us out as our first major sponsor, and by May of 2013 we had incorporated the Internet Security Research Group (ISRG). By September 2013 we had merged a similar project started by EFF and researchers from the University of Michigan into ISRG, and submitted our 501(c)(3) application. Since then we’ve put a lot of work into ISRG’s governance, found the right sponsors, and put together the plans for our CA, Let’s Encrypt.

I’ll be serving as ISRG’s Executive Director while we search for more permanent leadership. During this time I’ll remain with Mozilla.

There are too many people to thank for their help here, many of whom work for our sponsors, but I want to call out Eric Rescorla (Mozilla) and Kevin Dick (Right Side Capital Management) in particular. Eric was my original co-conspirator, and Kevin has spent innumerable hours with me helping to create partnerships and the necessary legal infrastructure for ISRG. Both are incredible at what they do, and I’ve learned a lot from working with them.

Now it’s time to finish building the CA – lots of software to write, hardware to install, and auditing to complete. If you have relevant skills, we hope you’ll join us.

Simple Code Review Checklists

What if, when giving a patch r+ on Mozilla’s bugzilla, you were presented with the following checklist:

You could not actually submit an r+ unless you had checked an HTML check box next to each item. For patches where any of this is irrelevant, just check the box(es) – you considered it.

Checklists like this are commonly used in industries that value safety, quality, and consistency (e.g. medicine, construction, aviation). I don’t see them as often as I’d expect in software development, despite our commitments to these values.

The idea here is to get people to think about the most common and/or serious classes of errors that can be introduced with nearly all patches. Reviewers tend to focus on whatever issue a patch addresses and pay less attention to the other myriad issues any patch might introduce. Example: a patch adds a null check, the reviewer focuses on pointer validity, and misses a leak being introduced.
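As a concrete (and entirely hypothetical) sketch of that scenario, consider a patch whose only visible change is the new null check; the function and variable names below are made up for illustration:

    #include <stdlib.h>

    int process_item(size_t len)
    {
        char *scratch = malloc(len);   /* allocated earlier in the function */
        char *buf = malloc(len);
        if (buf == NULL)               /* the patch under review adds this check... */
            return -1;                 /* ...but the new early return leaks scratch */

        /* ... work with scratch and buf ... */

        free(buf);
        free(scratch);
        return 0;
    }

A reviewer zeroed in on the pointer-validity question can easily sign off on the check while missing the leak on the new error path; a checklist item about resource cleanup is exactly the kind of prompt that catches it.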

Catching mistakes in code review is much, much more efficient than dealing with them after they make it into our code base. Once they’re in, fixing them requires a report, a regression range, debugging, a patch, another patch review, and another opportunity for further regressions. If a checklist like this spurred people to do some extra thinking and caught even one in twenty (5%) preventable regressions during code review, we’d become a significantly more efficient organization.

For this to work, the checklist must be kept short. In fact, there is an art to creating effective checklists, involving more than just brevity, but I won’t get into anything else here. My list here has only four items. Are there items you would add or remove?

Any general thoughts on this, or on variations of it, as a way to reduce regressions?

New Gecko OS X Widgets Module Owner: Steven Michaud

I’ve been Gecko’s OS X (Cocoa) Widgets module owner for a long time. The code in this module is Gecko’s OS X compatibility layer. Today I’m handing over the reins to Steven Michaud.

We’re fortunate to have such a capable and active new owner. Steven has been working on this code with me since before Firefox started using it by default in 2006, and in more recent years he’s been doing the bulk of the maintenance work. In addition to knowing the module well, he’s got great debugging skills. Those skills are joined by some serious perseverance, which means he can get to the bottom of just about any bug.

Thank you, Steven!

Informal Test: Building Firefox with VMWare vs. VirtualBox

I needed to set up a Linux virtual machine on a Windows host for working with Firefox OS. I don’t like working in slow VMs, so I did an informal test of VMWare vs. VirtualBox. I prefer to use open-source software when I can, so if VirtualBox is as fast as VMWare, or close, then I’ll just use VirtualBox.

Mozilla developers often work with VMs, and minimizing build times is a frequent topic of discussion, so I thought I’d post my results in case anyone finds them useful. If you have anything to add on the subject, please do so in the comments.

Host Software and Hardware:

  • Windows 7 Professional 64-bit, all updates applied as of Sept 12, 2013
  • Intel Core i7-3930K CPU @ 3.2 GHz, 6 cores
  • 16 GB RAM

Guest Software:

  • Ubuntu 13.04 64-bit, fully updated as of Sept 12, 2013
  • Approximately the minimum number of packages installed in order to build Firefox

VirtualBox Config:

  • VirtualBox 4.2.18
  • 4 CPUs (VirtualBox does not make the CPU vs. core distinction that VMWare does)
  • 6002 MB RAM assigned to the VM

VMWare Config:

  • VMWare Workstation 10, purchased
  • 2 CPUs with 2 cores each, for a total of 4 cores
  • 6004 MB RAM assigned to the VM

The test was to create a Firefox nightly debug build (latest code as of Sept 12, 2013) with the command “time make -f client.mk”, using the “-j4” flag for make.

VirtualBox Build Time

  • real: 28m53.005s
  • user: 88m32.932s
  • sys: 10m6.376s

VMWare Build Time

  • real: 29m31.595s
  • user: 89m22.548s
  • sys: 11m6.192s

Given these results, I’m just going to use VirtualBox. I’ll note, however, that graphics performance (and consequently, UI responsiveness) in VMWare seems noticeably better than in VirtualBox; I don’t care about that, but if you do, VMWare’s advantage is apparent even in simple interactions. VirtualBox is not bad, though. Also, I didn’t test device support, and I didn’t try many different VM configurations to tune performance, so take my results as a rough guide at best.

I did test VirtualBox with fewer CPUs, to check a rumor I’ve heard in the past that adding CPUs to a VM doesn’t improve performance much, or can even hurt it. That rumor seems not to be true, or is at least no longer true with VirtualBox and my setup: build times went from 101 minutes to 52 minutes to 28 minutes with 1, 2, and 4 CPUs, respectively.