Bringing Security along on the CI/CD journey

This is the third article I’ve written for r2c, a security startup I consult for. (Read the others here.) They’re building Semgrep, a code search tool that understands the syntax of Python and many other languages.

This article was originally published on r2c’s blog, and they’ve given me permission to cross-post it here. Thanks to Grayson Hardaway and Pablo Estrada at r2c for their contributions to this piece, and to r2c in general for the opportunity to write it!

Introduction

I’ve spent a good deal of my career with my feet in two different worlds. I came up as a web developer and helped create a popular web framework (Django). And I’ve spent a sizable chunk of my career working in information security. Unfortunately, I’ve seen these two roles clash far too often. Engineering often sees Security as standing in the way of delivery, or as creating meaningless busywork. Security thinks Engineering is irresponsible, willing to ship broken or vulnerable code.

I’ve spent more than a decade trying to bridge this gap. There’s no silver bullet. But, over and over, I’ve seen one practice be quite effective: automated security tests — and particularly integrating security checks and tests into an existing CI pipeline.

Bringing Security along on the CI/CD journey

Engineers have largely embraced CI as a critical part of quality assurance, and they recognize that a robust test suite that runs on every commit is a huge enabler of velocity. We can be bold about making changes, knowing that the test suite will catch us if we’ve messed up. A mature CI/CD pipeline is a reliable litmus test for good software.

Historically, though, Security has been left out of this CI/CD journey. We’ve relied on manual security assessments; hands-on exercises like threat modelling and threat hunting; and bespoke penetration tests. These are important activities that will always have a place in a mature product security lifecycle, but they are increasingly difficult to integrate into an agile delivery model that relies on incremental changes and automated tests. Much of the friction between modern engineering and security teams can come down to this impedance mismatch.

So, to get Security and Engineering playing well together, one massively useful tool is getting security work integrated into continuous delivery. When done right, Security and Engineering work together to produce automated checks that cover security issues in the same test suite that’s already in CI. This maintains delivery cadence, gives confidence about the security of the product, and — most importantly — gives a place where Security and Engineering collaborate, rather than conflict, to produce secure code.

How does this look in practice? Each organization is different, but there’s a typical progression of maturity that Security and Engineering orgs go through as they build a continuous integration and automation pipeline:

Level 1: Security finds problems; Engineering fixes them
Level 2: Security and Engineering collaborate to produce test cases and remediations
Level 3: After the issue is fixed, Security and Engineering collaborate to find systemic fixes and develop checks
Level 4: Security and Engineering now also proactively look for new classes of issues and create systemic checks before an actual problem occurs

For the rest of this post, I’ll walk through each of these phases with a specific example about a team that systemically fixed an issue with logging sensitive tokens.

Level 1: Security finds problems; Engineering fixes them

This is (unfortunately) how many organizations operate. Nobody really works together: Security is off in one corner looking for vulnerabilities (or, worse, waiting for a breach and then responding!). When they find one, they tell Engineering, who (hopefully) fixes the problem.

Let’s begin the example and see how this could shake out in practice. A few months ago, Nathan Brahams wrote about systemically fixing an issue around accidentally logging sensitive tokens. His article illustrates the steps a quite mature security/engineering organization would take, but I’ll use this issue as a jumping-off-point to imagine how teams earlier in their journey might approach discovering a similar issue.

So, imagine a budding team has found this security issue: SQLAlchemy debug logging is turned on and sensitive tokens are being logged. The issue is fixed with a clever technique that uses a custom ObfuscatedString column that prevents SQLAlchemy from logging the token’s value. So, we just swap in ObfuscatedString for the token column. Problem solved, right?

Well, while this does fix the issue, it has many problems:

Verification: If Engineering just rolls out this fix, how do we verify that the issue is actually fixed? Usually at this level of maturity, the Security team will manually verify the fix, but that’s error-prone. It’s also slow; if the fix didn’t work or is incomplete, the cycle has to repeat itself.
Regression: If another engineer, some time later, doesn’t understand this ObfuscatedString, it could get reverted or modified in a way that re-introduces the issue. How would we know if this happens? (Spoiler alert: with an automated test, which we’ll discuss in the next section.)
Systemic analysis: Is this a one-off issue, or is it a systemic issue? Are there other sensitive values elsewhere that might be logged?
Conflict: Most importantly, this workflow sets up conditions ripe for conflict between Security and Engineering. It creates a dynamic where it’s easy for Engineering to feel like Security’s just creating work for them, and where Security can feel ignored or powerless to fix problems. I’ve never seen this model produce a really healthy relationship; just ones with varying levels of dysfunction. Even when Security helps write the fix, the lack of any sort of robust verification or systemic analysis means it’s very likely they’ll need to come back later with this issue or a similar one again, which both parties will resent.

Level 2: automated tests

The next rung on the maturity ladder is one that’s becoming increasingly common: instead of just fixing a security issue, Security and Engineering will collaborate on producing a test case (and often the fix). Following along with the example above, we might write a test case that sets up a test model with an ObfuscatedString column, captures some logs, and verifies that the value is correctly obfuscated.
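A minimal sketch of such a test, using only the standard library. The names ObfuscatedValue, log_query, and the "app.db" logger are hypothetical stand-ins: a real test would declare a model with the ObfuscatedString column and capture SQLAlchemy’s own debug logs. The sketch assumes, for illustration, that the logging layer renders bound parameters with repr().

```python
import logging
import unittest

logger = logging.getLogger("app.db")  # hypothetical logger name


class ObfuscatedValue(str):
    """A str whose repr() is masked; a stand-in for the value the column binds."""

    def __repr__(self) -> str:
        return "'********'"


def log_query(statement: str, params: tuple) -> None:
    # Stand-in for debug logging that formats bound parameters with repr().
    logger.debug("%s %r", statement, params)


class TokenLoggingTest(unittest.TestCase):
    def test_token_is_obfuscated_in_logs(self):
        token = ObfuscatedValue("s3cr3t-token")

        # Capture everything the logger emits while the "query" runs.
        with self.assertLogs("app.db", level="DEBUG") as captured:
            log_query("INSERT INTO sessions (token) VALUES (?)", (token,))

        output = "\n".join(captured.output)
        self.assertNotIn("s3cr3t-token", output)
        self.assertIn("********", output)
        # The underlying value must still be usable by the application.
        self.assertEqual(str(token), "s3cr3t-token")
```

Saved in a test module, this runs with python -m unittest alongside the rest of the suite, so it fails in CI if the obfuscation is ever removed.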

This relatively simple addition fixes a bunch of problems we had before:

Verification: If the test case passes, we can be confident the security issue is fixed.
Regression: Because this test case is part of our test suite, if the fix ever regresses the test will fail, and we won’t risk re-introducing the issue to production.
Collaboration: Security and Engineering are now working more closely together, increasing the chances both teams will see this as “our issue” and “our fix”, not “their problem”.

But: we still lack any sort of understanding of whether this issue occurs elsewhere, or any sort of holistic fix for the entire class of issues. In the case of logging sensitive tokens, it’s easy to imagine this issue occurring elsewhere. So when — inevitably — a similar issue occurs elsewhere, we’re likely to be bitten again. And this, in turn, will continue to produce the kind of resentment that can be so damaging to Security/Engineering working well together.

Level 3: systemic fixes and checks

The next step, then, is for Security and Engineering to work together to find systemic problems and fixes. Things start out as above — it’s important to fix the specific vulnerability first, before getting fancy! But after the specific fix is in, Security and Engineering come together to figure out if this is a systemic problem. If so, they work to develop a check or a fix.

Sometimes this can be a fairly simple holistic fix. For example, I wrote about ReDoS last time. Discovery of a ReDoS vulnerability might lead to discovering other similar potential problems. That in turn could lead to the decision to switch to re2, which isn’t vulnerable to ReDoS.

But much of the time, the systemic fix is more complex — there isn’t a simple drop-in replacement that eliminates a class of vulnerabilities. That’s true of this sensitive-logging issue: we don’t have any sort of logging module that can magically know when variables are sensitive, and obfuscate them.

This is where code scanning tools like Semgrep come in. They are a terrifically important part of a mature product security workflow. Traditional testing practices — unit tests, integration tests, etc — are great for reproducing specific security issues, and ensuring that they’re fixed and won’t regress. But they struggle to discover whole classes of security issues, and this is what code scanners enable. Traditional code linters (e.g., Flake8, RuboCop) help to ensure code consistency and find some common issues, but since they have to apply generally to all kinds of projects, they tend to only provide a one-size-fits-all best-practices check. Tools that understand code semantically, like Semgrep, can be used to write tests for your unique codebase and find whole classes of security issues — including, most importantly, new instances of a problem that might be added after the check has been written.

And indeed, that’s what Nathan and his team did: they wrote a Semgrep rule that finds columns with names suggesting they’re sensitive (e.g., containing “token”, “secret”, “key”, etc), and issues a warning.
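The actual check is written in Semgrep’s own rule syntax, which isn’t reproduced here. As a rough stand-in for the same idea, the sketch below uses Python’s ast module to flag column definitions whose names look sensitive but that don’t use ObfuscatedString. The function name, the hint list (taken from the examples above), and the sample model are all assumptions made for illustration.

```python
import ast

SENSITIVE_HINTS = ("token", "secret", "key")  # from the examples in the text
SAFE_TYPE = "ObfuscatedString"


def _identifiers(node: ast.AST) -> set[str]:
    """Collect every bare or dotted identifier used inside an expression."""
    names = set()
    for child in ast.walk(node):
        if isinstance(child, ast.Name):
            names.add(child.id)
        elif isinstance(child, ast.Attribute):
            names.add(child.attr)
    return names


def find_unprotected_columns(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for sensitive-looking columns
    declared with Column(...) but without SAFE_TYPE."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Call):
            continue
        call = node.value
        # Accept both Column(...) and something.Column(...).
        func_name = getattr(call.func, "id", getattr(call.func, "attr", ""))
        if func_name != "Column":
            continue
        used = set().union(*(_identifiers(arg) for arg in call.args))
        if SAFE_TYPE in used:
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and any(
                hint in target.id.lower() for hint in SENSITIVE_HINTS
            ):
                findings.append((node.lineno, target.id))
    return sorted(findings)


# Hypothetical model source used only to exercise the check.
EXAMPLE = '''
class Session(Base):
    id = Column(Integer, primary_key=True)
    api_token = Column(String)
    refresh_token = Column(ObfuscatedString)
    signing_secret = db.Column(db.String(64))
'''

for line, name in find_unprotected_columns(EXAMPLE):
    print(f"line {line}: column '{name}' looks sensitive; consider {SAFE_TYPE}")
```

On the sample model this flags api_token and signing_secret and leaves refresh_token alone. A name-based heuristic like this can produce false positives, so it is better suited to a warning than to a hard failure.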

At this point, we’re in a pretty good spot. New code is continually scanned for this issue, and when new instances are found, they are fixed robustly. We have a workflow and tooling that ensure the original, specific issue is fixed and stays fixed, and that similar issues, present and future, are discovered and fixed. Security and Engineering collaborate on this work, which is now well-automated. There’s still (of course) room for teams to not get along, but we’ve removed some of the most common pain points (verification, regression, conflict between teams).

If suddenly blocking builds with new checks will only increase conflict in your organization, you could softly roll out checks that do not block Engineering and only notify Security. For issues that arise, Security can begin conversations with Engineering to collaborate on both specific and systemic fixes. Once the check is satisfactory to both parties, it can be adjusted to block further issues.
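One way to picture this soft rollout is a small wrapper around the scanner’s results: every finding is reported to Security, but only rules that both teams have agreed to promote can fail the build. This is a sketch under assumed names (Finding, ci_result, and the rule ID are hypothetical), not a feature of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str
    path: str
    line: int


def ci_result(
    findings: list[Finding], blocking_rules: set[str]
) -> tuple[int, list[Finding]]:
    """Return (exit_code, findings_to_send_to_security).

    All findings are reported; the build fails only when a finding
    comes from a rule that has been promoted to blocking.
    """
    blocking = [f for f in findings if f.rule_id in blocking_rules]
    exit_code = 1 if blocking else 0
    return exit_code, findings


findings = [Finding("sensitive-column-logging", "models.py", 4)]

# Phase 1: notify only. The build passes, Security still sees the finding.
notify_code, reported = ci_result(findings, blocking_rules=set())

# Phase 2: the rule is promoted once both teams are happy with it.
block_code, _ = ci_result(findings, blocking_rules={"sensitive-column-logging"})
```

Keeping the set of blocking rules explicit makes promotion a small, reviewable change that both teams can sign off on.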

Level 4: collaborative proactive discovery

The final step — and really, it’s icing on the cake — is proactively working together on the discovery of issues. A Security organization conducting proactive discovery includes some or all of these processes:

A robust threat hunting practice
Keeping up with the latest in security research, including new exploitation techniques, vulnerability classes, and impactful CVEs by following conferences (e.g., Black Hat, DEF CON, OWASP), prominent researchers (e.g., James Kettle, Samy Kamkar), social media (e.g., r/netsec) or newsletters (e.g., tl;dr sec, Unsupervised Learning).
Engineering-driven risk penetration testing — engineers know the codebase well, and often have hypotheses about where issues could lurk. They can suggest places for security testers to probe, and use that to locate problems.
And, of course, turning the results of the above into automated checks and tests.

There is much to say here, and each of these topics could be its own write-up. The key is that systemic fixes are terrifically important. They prevent issues from repeating, which frees Security to focus on proactive discovery. This moves Security into an active (instead of reactive) role in progressing organizational security.

Next steps

Where does your team fit into this model? If you’re at Level 1, consider trying to step up a level by having the security team chip in on security-related fixes by adding test cases. I think you’ll be surprised how effective this small change can be. Getting Security and Engineering working more collaboratively can pay huge dividends in your security posture.

If that’s already a practice, but you haven’t tried moving from spot fixes to systemic fixes, I think you’ll be surprised at how easy it’ll be to get to Level 3. Security tooling has reached a point where it is easy to move from individual fixes to systemic fixes and checks. Tools like Semgrep are becoming increasingly easy to integrate into existing CI/CD pipelines. Combined with the ability to easily write new code scanning checks, systemically eliminating issues is within any Security team’s grasp.