GitHub’s CSP journey

We shipped subresource integrity a few months back to reduce the risk of a compromised CDN serving malicious JavaScript. That is a big win, but does not address related content injection issues that may exist on GitHub.com itself. We have been tackling this side of the problem over the past few years and thought it would be fun, and hopefully useful, to share what we have been up to.

Just to get everyone on the same page, when talking about “content injection” we are talking about:

  • Cross Site Scripting (XSS) – Yup, the most common web vulnerability of the past, present, and foreseeable future. Given its prevalence, many developers are familiar with XSS and the obvious security consequences of allowing injected JavaScript to execute on your site.
  • Scriptless attacks – This is a more nuanced issue and is frequently not considered since people are too busy fending off XSS. But, as has been documented by Michal Zalewski in “Postcards from the post-XSS world”, Mario Heiderich (et al) in “Scriptless Attacks – Stealing the Pie Without Touching the Sill”, and other related work, preventing XSS does not solve all of your content injection problems.

GitHub uses auto-escaping templates, code review, and static analysis to try to prevent these kinds of bugs from getting introduced in the first place, but history shows they are unavoidable. Any strategy that relies on preventing any and all content injection bugs is bound for failure and will leave your engineers, and security team, constantly fighting fires. We decided that the only practical approach is to pair prevention and detection with additional defenses that make content injection bugs much more difficult for attackers to exploit. As with most problems, there is no single magical fix, and therefore we have employed multiple techniques to help with mitigation. In this post we will focus on our ever evolving use of Content Security Policy (CSP), as it is our single most effective mitigation. We can’t wait to follow up on this blog to additionally review some of the “non-traditional” approaches we have taken to further mitigate content injection.

What is CSP? Why & How to Add it to Your Website.

For example, a common way to steal logins using CSS is by sending a request for a background image or font to an evil URL such as  where a is the letter you typed into the password login field. When you would type the next letter of your password, the evil CSS script would send another request but with that letter instead of a. The evil site then logs these requests to determine your username & password. By allowing unsafe-inline for our style-src, someone could inject this evil code. Fortunately, their code wouldn’t work since our CSP doesn’t allow img-src & font-src from the evil example site.
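
To make that trade-off concrete, here is a minimal sketch of such a policy assembled in Python. The directive values and the `build_csp` helper are assumptions made up for illustration, not the actual policy of GitHub or any other site. Inline styles are allowed, but images and fonts may only load from the site's own origin, so injected CSS has nowhere external to send its requests.

```python
from typing import Dict, List

# Hypothetical policy for illustration only; not any real site's configuration.
POLICY: Dict[str, List[str]] = {
    "default-src": ["'none'"],
    "style-src": ["'self'", "'unsafe-inline'"],  # inline CSS is permitted
    "img-src": ["'self'"],   # images only from our own origin
    "font-src": ["'self'"],  # fonts only from our own origin
}


def build_csp(policy: Dict[str, List[str]]) -> str:
    """Serialize directives into a Content-Security-Policy header value."""
    return "; ".join(
        f"{directive} {' '.join(sources)}" for directive, sources in policy.items()
    )


header_value = build_csp(POLICY)
print("Content-Security-Policy:", header_value)
```

The resulting string would be sent as the value of the `Content-Security-Policy` response header. Because `img-src` and `font-src` list only `'self'`, a browser enforcing the policy should refuse to fetch a background image or font from any other host.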

You are also not in bad company by doing this. A lot of sites, including GitHub & security professional Troy Hunt’s blog, use unsafe-inline. Facebook uses unsafe-eval & even requires it for some of their SDKs. Anyone using Google Tag Manager for analytics will also have to reduce their CSP security. I must confess as well: I use GatsbyJS for my personal blog & there are issues that need to be fixed before I can remove unsafe-inline.
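
If you want to see where your own policy makes these compromises, a rough sketch like the following can list them. The `find_risky_sources` helper and the example header are hypothetical, and the parsing is deliberately simplified: it splits on semicolons and whitespace and does not model every rule of the CSP specification.

```python
from typing import Dict, List

RISKY_KEYWORDS = ("'unsafe-inline'", "'unsafe-eval'")


def find_risky_sources(header_value: str) -> Dict[str, List[str]]:
    """Map each directive to the risky keywords it contains (simplified parser)."""
    findings: Dict[str, List[str]] = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments such as a trailing ";"
        directive, sources = tokens[0].lower(), tokens[1:]
        risky = [kw for kw in RISKY_KEYWORDS if kw in sources]
        if risky:
            findings[directive] = risky
    return findings


# Made-up header value for demonstration purposes.
example = "default-src 'none'; script-src 'self' 'unsafe-eval'; style-src 'self' 'unsafe-inline'"
print(find_risky_sources(example))
```

Each key in the result is a directive that still carries one of the relaxed keywords, which gives you a short list of things to revisit once the blocking issues are fixed.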

Individuation (Richard Rohr)

Just what are those inner imperatives that rise to support us and challenge us in the journey of the second half of life? Perhaps Jung’s most compelling contribution is the idea of individuation, that is, the lifelong project of becoming more nearly the whole person we were meant to be—what [God] intended, not the parents, or the tribe, or, especially, the easily intimidated or inflated ego.

While revering the mystery of others, our individuation summons each of us to stand in the presence of our own mystery, and become more fully responsible for who we are in this journey we call our life. So often the idea of individuation has been confused with self-indulgence or mere individualism, but what individuation more often asks of us is the surrender of the ego’s agenda of security and emotional reinforcement, in favor of humbling service to the soul’s intent. . . .

The agenda of the first half of life is predominantly . . . framed as “How can I enter this world, separate from my parents, create relationships, career, social identity?” Or put another way: “What does the world ask of me, and what resources can I muster to meet its demands?” But in the second half of life . . . the agenda shifts to reframing our personal experience in the larger order of things, and the questions change. “What does the soul ask of me?” “What does it mean that I am here?” “Who am I apart from my roles, apart from my history?” . . . If the agenda of the first half of life is social, meeting the demands and expectations our milieu asks of us, then the questions of the second half of life are spiritual, addressing the larger issue of meaning.

The psychology of the first half of life is driven by the fantasy of acquisition: gaining ego strength to deal with separation, separating from the overt domination of parents, acquiring a standing in the world. . . . But then the second half of life asks of us, and ultimately demands, relinquishment—relinquishment of identification with property, roles, status, provisional identities—and the embrace of other, inwardly confirmed values.

Microsoft, Facebook, trust and privacy

I’ve been reminded of this ancient history a lot in the last year or two as I’ve looked at news around abuse and hostile state activity on Facebook, YouTube and other social platforms, because much like the Microsoft macro viruses, the ‘bad actors’ on Facebook did things that were in the manual. They didn’t prise open a locked window at the back of the building – they knocked on the front door and walked in. They did things that you were supposed to be able to do, but combined them in an order and with malign intent that hadn’t really been anticipated.

It’s also interesting to compare the public discussion of Microsoft and of Facebook before these events. In the 1990s, Microsoft was the ‘evil empire’, and a lot of the narrative within tech focused on how it should be more open, make it easier for people to develop software that worked with the Office monopoly, and make it easier to move information in and out of its products. Microsoft was ‘evil’ if it did anything to make life harder for developers. Unfortunately, whatever you thought of this narrative, it pointed in the wrong direction when it came to this use case. Here, Microsoft was too open, not too closed.

Equally, in the last 10 years .. – that it is too hard to get your information out and too hard for researchers to pull information from across the platform. People have argued that Facebook was too restrictive on how third party developers could use the platform. And people have objected to Facebook’s attempts to enforce the single real identities of accounts. As for Microsoft, there may well have been justice in all of these arguments, but also as for Microsoft, they pointed in the wrong direction when it came to this particular scenario. For the Internet Research Agency, it was too easy to develop for Facebook, too easy to get data out, and too easy to change your identity. The walled garden wasn’t walled enough.

.. Conceptually, this is almost exactly what Facebook has done: try to remove existing opportunities for abuse and avoid creating new ones, and scan for bad actors.

                          | Microsoft                                    | Facebook
Remove openings for abuse | Close down APIs and look for vulnerabilities | Close down APIs and look for vulnerabilities
Scan for bad behavior     | Virus and malware scanners                   | Human moderation

(It’s worth noting that these steps were precisely what people had previously insisted was evil – Microsoft deciding what code you can run on your own computer and what APIs developers can use, and Facebook deciding (people demanding that Facebook decide) who and what it distributes.)

  • .. If there is no data stored on your computer then compromising the computer doesn’t get an attacker much.
  • An application can’t steal your data if it’s sandboxed and can’t read other applications’ data.
  • An application can’t run in the background and steal your passwords if applications can’t run in the background.
  • And you can’t trick a user into installing a bad app if there are no apps.

Of course, human ingenuity is infinite, and this change just led to the creation of new attack models, most obviously phishing, but either way, none of this had much to do with Microsoft. We ‘solved’ viruses by moving to new architectures that removed the mechanics that viruses need, and where Microsoft wasn’t present.

.. In other words, where Microsoft put better locks and a motion sensor on the windows, the world is moving to a model where the windows are 200 feet off the ground and don’t open.

.. Much like moving from Windows to cloud and ChromeOS, you could see this as an attempt to remove the problem rather than patch it.

  • Russians can’t go viral in your newsfeed if there is no newsfeed.
  • ‘Researchers’ can’t scrape your data if Facebook doesn’t have your data. You solve the problem by making it irrelevant.

This is one way to solve the problem by changing the core mechanics, but there are others. For example, Instagram does have a one-to-many feed but does not suggest content from people you don’t yourself follow in the main feed and does not allow you to repost into your friends’ feeds. There might be anti-vax content in your feed, but one of your actual friends has to have decided to share it with you. Meanwhile, problems such as the spread of dangerous rumours in India rely on messaging rather than sharing – messaging isn’t a panacea. 

Indeed, as it stands Mr Zuckerberg’s memo raises as many questions as it answers – most obviously, how does advertising work? Is there advertising in messaging, and if so, how is it targeted? Encryption means Facebook doesn’t know what you’re talking about, but the Facebook apps on your phone necessarily would know (before they encrypt it), so does targeting happen locally? Meanwhile, encryption in particular poses problems for tackling other kinds of abuse: how do you help law enforcement deal with child exploitation if you can’t read the exploiters’ messages (the memo explicitly talks about this as a challenge)? Where does Facebook’s Blockchain project sit in all of this?

There are lots of big questions, though of course there would also have been lots of questions if in 2002 you’d said that all enterprise software would go to the cloud. But the difference here is that Facebook is trying (or talking about trying) to do the judo move itself, and to make a fundamental architectural change that Microsoft could not.