Secure Central Hosting for the Digital Analytics Program

Aug 14, 2015

The U.S. government’s Digital Analytics Program (DAP) collects Web traffic and analytics data from across the federal government. That data flows into a very large central account, and some of that data is automatically made public in real time at analytics.usa.gov.

A screen capture of the analytics.usa.gov dashboard on August 14, 2015

To accomplish this feat, participating federal websites need to add a <script> reference to a standard bit of JavaScript code. Until now, the only option agencies have had is to host this JavaScript file themselves, like this:

<script src="/js/federated-analytics.js" id="_fed_an_ua_tag"></script>

While this approach gives agencies more control, it makes it hard for DAP to ensure that security improvements and bug fixes reach participating websites quickly.

To address this, DAP has set up a centrally hosted URL at dap.digitalgov.gov containing the most current DAP collection code, which agencies can reference like this:

<script src="https://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js" id="_fed_an_ua_tag"></script>

By adding this tag and following DAP’s guidance (PDF, 273 KB, 7 pages, February 2015) to add parameters identifying your agency, a federal website will begin reporting its Web analytics to DAP and will be guaranteed to always be using the latest, greatest, most secure DAP code.

Securing Visits to Federal Websites

The beginning of a secure https URL shown in a web browser's address bar; the s on https and padlock are red.

Hosting a widely referenced piece of JavaScript introduces its own security concerns, because any change to that JavaScript will immediately affect all federal websites that reference it. It’s extremely important that the JavaScript on dap.digitalgov.gov not be modified by an attacker.

This isn’t a theoretical concern: in March of 2015, a large Chinese network took advantage of a centrally hosted analytics JavaScript file that was served over an insecure connection to rewrite its contents and turn visitors’ browsers into attack bots. Any network, from a coffee shop to a global ISP, can easily attack insecure connections in this way.

On the web, the way to prevent this kind of attack is to use HTTPS, which encrypts and authenticates the connection between a visitor’s browser and the server delivering the JavaScript code, so that a network in the middle cannot alter the file.

dap.digitalgov.gov uses strong HTTPS as well as HTTP Strict Transport Security (HSTS), which tells browsers to use only HTTPS for future connections to the domain.
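To make HSTS a little more concrete: the policy is delivered as a Strict-Transport-Security response header made up of semicolon-separated directives. The sketch below parses such a header value in Python. The example value is hypothetical and is not taken from dap.digitalgov.gov, and the helper name parse_hsts is made up for illustration.

```python
def parse_hsts(header_value: str) -> dict:
    """Split a Strict-Transport-Security header value into its directives.

    Directives with a value (max-age=...) map to that value as a string;
    bare directives (includeSubDomains) map to True.
    """
    directives = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directive names are case-insensitive, so normalize to lowercase.
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


# Hypothetical header value, used only for illustration.
example_header = "max-age=31536000; includeSubDomains; preload"
policy = parse_hsts(example_header)
max_age_seconds = int(policy["max-age"])
print(policy)
print(max_age_seconds)
```

Here max-age is how long, in seconds, a browser should remember to use only HTTPS for the domain.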

The recent federal government policy on HTTPS requires HTTPS and HSTS for all new federal websites and services. (For agencies: DigitalGov University recently produced educational videos on an Introduction to HTTPS and Implementing HTTPS.)

Breaking the Protocol-Relative URL

Many people on the Web are accustomed to using protocol-relative URLs, which have long been promoted as a best practice. They look like this:

<img src="//domain.com/img/logo.png" alt="" />

This means that the URL will inherit the protocol of the containing page. If the embedding website uses HTTPS, then the image will be fetched over HTTPS, and likewise for HTTP. When HTTPS was considered optional for many sites, this made some sense. However, in 2015, protocol-relative URLs are considered an anti-pattern and are discouraged.
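The inheritance described above can be seen with Python's standard URL resolver, which follows the same relative-reference rules a browser does. This is only a small illustration; example.gov stands in for whatever site embeds the reference.

```python
from urllib.parse import urljoin

# The protocol-relative reference from the example above.
REFERENCE = "//domain.com/img/logo.png"


def resolve(page_url: str, reference: str = REFERENCE) -> str:
    """Resolve a reference against the URL of the page that embeds it."""
    return urljoin(page_url, reference)


# example.gov is a placeholder for the embedding website.
over_http = resolve("http://example.gov/index.html")
over_https = resolve("https://example.gov/index.html")
print(over_http)
print(over_https)
```

The same reference ends up as an http:// URL on a plain HTTP page and an https:// URL on a secure one, which is exactly why it offers no protection to visitors of insecure pages.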

Because dap.digitalgov.gov is a potential high-value target, the Digital Analytics Program did not want to support plain HTTP connections at all—even for a redirect. Though HTTP redirects are helpful, they are still an opportunity for attack. HSTS is designed to help with this, but an even more secure solution is to simply disable HTTP altogether.

This generally isn’t a viable solution for websites, because users type bare domains like “whitehouse.gov” into browser location bars, and browsers generally have to assume plain HTTP as a first try in these situations. But because dap.digitalgov.gov is a brand new subdomain used only as a third party service, DAP can set a higher standard by breaking the protocol-relative URL when used on a plain HTTP site.

The simplest solution would be to refuse http:// connections entirely by closing port 80 and not allowing browsers to connect at all, but this was not a viable option for DAP’s hosting provider. The next option, returning an error code instead of a redirect for plain HTTP connections, would not be in compliance with federal HTTPS policy.

We solved the issue by combining a redirect with an error code. Any plain HTTP request for a file on http://dap.digitalgov.gov is redirected to https://dap.digitalgov.gov/403, which then returns a 403 error code.

$ curl --head http://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js
HTTP/1.1 301 Moved Permanently
Location: https://dap.digitalgov.gov/403
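For readers who want to see the logic spelled out, here is a rough Python sketch of the behavior shown above. It is not DAP's actual server configuration, which this post does not describe; the names plan_response and PlainHTTPHandler are invented for illustration, and the 200 response for other HTTPS paths is an assumption about normal file serving.

```python
from http.server import BaseHTTPRequestHandler

ERROR_URL = "https://dap.digitalgov.gov/403"


def plan_response(scheme: str, path: str):
    """Return (status, headers) for a request under the redirect-to-403 scheme."""
    if scheme == "http":
        # Every plain HTTP request is redirected to the HTTPS error page,
        # regardless of which file was asked for.
        return 301, {"Location": ERROR_URL}
    if path == "/403":
        # The redirect target itself answers with an error code.
        return 403, {}
    # Assumption: any other HTTPS path is served normally.
    return 200, {}


class PlainHTTPHandler(BaseHTTPRequestHandler):
    """Handler for a plain-HTTP listener that never serves content."""

    def _respond(self):
        status, headers = plan_response("http", self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # GET and HEAD (what curl --head sends) get the same redirect.
    do_GET = _respond
    do_HEAD = _respond


print(plan_response("http", "/Universal-Federated-Analytics-Min.js"))
print(plan_response("https", "/403"))
```

A browser that follows the redirect lands on an error page rather than the script, so a protocol-relative or http:// reference fails visibly instead of quietly loading code over an insecure first hop.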

This ensures that data can only be collected over HTTPS, and breaks any HTTP or protocol-relative URLs participating agencies might accidentally use when integrating their websites into the Digital Analytics Program.
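An agency that wants to catch such a mistake before deploying could scan its pages for references that will break. The following is a minimal sketch using only Python's standard library; the class and function names are made up for illustration and are not part of any DAP tooling.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

DAP_HOST = "dap.digitalgov.gov"


class DapTagFinder(HTMLParser):
    """Collect script src values that point at DAP without an explicit https://."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src") or ""
        parts = urlsplit(src)
        # Protocol-relative URLs parse with an empty scheme but a hostname,
        # so they are flagged along with explicit http:// URLs.
        if parts.hostname == DAP_HOST and parts.scheme != "https":
            self.problems.append(src)


def find_insecure_dap_tags(html: str) -> list:
    """Return the src of every script tag that would hit the 403 redirect."""
    finder = DapTagFinder()
    finder.feed(html)
    return finder.problems


sample_page = (
    '<script src="//dap.digitalgov.gov/Universal-Federated-Analytics-Min.js"></script>'
    '<script src="https://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js"></script>'
    '<script src="/js/federated-analytics.js"></script>'
)
print(find_insecure_dap_tags(sample_page))
```

Only the protocol-relative reference is reported here; the explicit https:// tag and the self-hosted file are left alone.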

While DAP’s solution ended up being straightforward, this approach is surprisingly novel—most common third party services today don’t even require HTTPS.

But on today’s Internet, they should, and the Digital Analytics Program is leading by example.

Eric Mill is an 18F team member.

Originally posted by Eric Mill on Aug 14, 2015

GSA | Washington D.C.
