Edition 34: A consensus is finally emerging on securing the Agentic SDLC

But we are a while away from solutions that are ready to use.

Jun 24, 2026

Diego Gutiérrez’s 16th-century map got the shape of the Americas roughly right, then filled the gaps with sea monsters, mermaids, and a deeply confused Amazon River. That’s about where we are with the Agentic SDLC: broad strokes clear, details mostly wrong.

As frequent readers of this newsletter know, I’ve been obsessed with the topic of today’s post for a while. About 15 months ago, I wrote and spoke about why AI will change the SDLC, and hence AppSec. Since then, I’ve spoken to hundreds of AppSec professionals and developers about the topic. Every few months, we’d have some clarity on how things were progressing, and then everything would change again. This happened with the Claude Code launch, the Opus 4.5 announcement, OpenClaw going viral, and so on. But something else is happening now. For the first time since ChatGPT launched, there seems to be some consensus emerging on what the future holds (at least for software development). While most companies will continue to have multiple SDLCs, it’s clear where the cutting edge lies. This is good because it finally allows us to take a deep breath and consider how to approach Security in this new landscape. In other words, we’ve moved from the world of unknown-unknowns to the land of known-unknowns. We know what we don’t know, and the next step is to figure out the answers.

SDLC trends

Before we get into the tech changes, a side note: A common theme among the companies I talk to is that larger changes are coming in how teams will be structured to better leverage AI. Companies are questioning every organizational “truth” that keeps AI-assisted work from moving faster. From span of control to pod sizes to stand-ups to sprint planning, all established norms are up for debate. In the long term, I think this will lead to a new & improved paradigm for structuring software engineering teams. In the short term, it will cause a lot of anxiety and uncertainty for these teams. As we all figure out how to reach the promised land of higher productivity and better outcomes, it is important to recognize that social change is underway and that there will be winners and losers as a result. And unfortunately, some of the losses will be permanent. Economists may call this creative destruction, but as members of the same industry, it is important for all of us (the winners, the losers, and the ones unaffected by it) to lead with empathy.

That said, here are specific things that have changed in most software development shops:

The number of PRs filed has ballooned to crazy levels. This has had a trickle-down effect on what gets pushed to production, too. Last year, we saw a lot of vibe-coded projects pushed by AI coding tools. That’s changed now. AI-coding agents (either through local harnesses like Claude Code or cloud-deployed coding agents in mature orgs) are shipping to prod in important applications. Velocity is truly up across the board
Code Reviews are still a nightmare. You are stuck choosing between YOLO-and-deal-with-it-later (which puts pressure on senior engineers and security teams) and spending a lot of time reviewing AI slop (which also puts pressure on senior engineers and security teams).
1. A corollary (and we will talk about this later, too) is that PRs are now a terrible place to “start” governance checks. It’s too late.
The AI labs have thrown their hats firmly in the ring. They’ve proposed various solutions to the problems created by their products and to long-standing security problems, too (1, 2, 3, 4). Finally, they’ve also mastered the art of FUD to a degree that would put even the worst Cybersecurity salesmen to shame (Having said that, I have to mention that Fable was awesome, and I cannot wait for it to be back).
Security teams are all in on AI, and AI labs + cloud providers deserve a pat on the back. Questions like “but where is the data stored?” or “will you use my data for training?” have been summarily answered. Security teams and Security companies (including ours) have started to reimagine every solution with AI firmly in the middle
PRDs have gotten the full monty treatment. Claims range from “we are completely replacing PRDs with prototypes” to “we are writing every decision down in .md files thanks to AI” (Here’s an excellent overview of Spec-Driven Development on martinfowler.com). All of them are kinda lying. Documents haven’t gone away, and the only ones reading all those AI-generated .md files are other agents. Ultimately, the jury is still out on how people document and discuss “intent”. My personal opinion is that “writing to inform” (product documentation, how-to guides) will be replaced entirely by AI-generated documents, and the primary consumers will be other agents. “Writing to persuade” (decision documents, vendor comparisons, opinion pieces, etc.) will still need to be read by humans, and AI is doing a sloppy job of generating these kinds of artifacts. So, if you consider PRDs as “writings to inform an agent what to build”, there is a chance that they will be made obsolete in the future. But if the purpose of the PRD is to persuade humans and make a collective decision on what to build or what approach to take while building, I’d say PRDs are more important than ever.

What does this mean for AppSec?

This has broad implications for AppSec. From practitioners to vendors, each of us has to respond to these trends.

More PRs mean more reviews, and every code and app scanning tool is under pressure to eliminate false positives. Two approaches are gaining traction:
1. Use AI to triage and reduce false positives: Run an AI reviewer after your deterministic tool generates results
2. Reimagine scanning with AI at the center: Use AI to generate results in the first place, thereby avoiding false positives (this is the claim). Not saying this actually happens :))
Both approaches are working well, but there isn’t a clear winner yet. As with most such trends, the answer will be “somewhere in the middle”. The popular pattern seems to be to have a smart router (say a custom MCP), which can route the traffic to relevant tools (deterministic or AI-native), depending on the use case
Security reviews for pull requests (PRs/MRs) are an absolute nightmare, and many AppSec teams are tempted to guard the perimeter instead. Many companies are creating separate deployment environments and deploying unreviewed code into hardened environments that have no access to critical resources. We all know this is a terrible idea and can hurt us in different ways (protecting software through infrastructure controls will always leave a hole; just ask your friendly neighborhood WAF administrator :)). The AppSec teams implementing this know it too, but they feel they have no choice: thoughtfully reviewing and approving every PR before deployment is incompatible with the organizational push toward tokenmaxxing. Teams have (correctly) tried to move the reviews into the coding agents, but that hasn’t worked well yet (low adoption).
A corollary of the previous point (the PR review problem) is that AppSec teams firmly believe the first point of influence must be within the coding harness (e.g., Claude Code, GH Copilot). There are 2 different ways to think about securing coding harnesses:
1. Think of AppSec as an endpoint problem. How can we ensure Claude Code does not do dumb things like use a malicious library or install a skill that exfiltrates data, and so on? In other words, don’t secure the output of the coding agents, but secure the developer using the coding agents (excellent blog on this topic here)
2. How do we introduce traditional AppSec activities in the coding harness? Think Security Design Reviews (SDR), SAST, SCA, Secret Scanning, and so on.
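To make the “smart router” and “AI triage” patterns from the first point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the tool functions are toy stand-ins (not real scanners or LLM calls), and the artifact types, rule names, and the `triage` predicate are hypothetical. It only shows the shape of the idea: a router picks a deterministic or AI-native tool based on the kind of artifact, and a separate triage step filters the results afterwards.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    rule: str
    location: str
    detail: str


def deterministic_sast(payload: str) -> list[Finding]:
    # Toy stand-in for a rule-based scanner (hypothetical rule name).
    if "eval(" in payload:
        return [Finding("no-eval", "app.py", "use of eval()")]
    return []


def ai_design_review(payload: str) -> list[Finding]:
    # Toy stand-in for an LLM-backed design review; a real version would
    # call a model instead of doing a keyword check.
    if "auth" not in payload.lower():
        return [Finding("missing-authn", "design.md", "no authentication described")]
    return []


# The router: maps an artifact type to the tool that should handle it.
ROUTES: dict[str, Callable[[str], list[Finding]]] = {
    "code": deterministic_sast,
    "design_doc": ai_design_review,
}


def route(artifact_type: str, payload: str) -> list[Finding]:
    tool = ROUTES.get(artifact_type)
    if tool is None:
        raise ValueError(f"no tool registered for {artifact_type!r}")
    return tool(payload)


def triage(
    findings: list[Finding],
    is_false_positive: Callable[[Finding], bool],
) -> list[Finding]:
    # Post-scan triage: the predicate is where an AI reviewer would plug in.
    return [f for f in findings if not is_false_positive(f)]


if __name__ == "__main__":
    raw = route("code", "result = eval(user_input)")
    kept = triage(raw, is_false_positive=lambda f: f.location.startswith("tests/"))
    print(kept)
```

In a real deployment the router might sit behind a custom MCP server, as the post suggests, but nothing in this sketch depends on that choice.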

3 pillars of a “good” solution

I think the right way to solve this is to thoughtfully introduce what we know works (secure-by-default, SAST, SDR, etc.) into the coding harness. But there are a few problems with this approach:

UX for assessing intent in coding agents: Developer adoption of governance tools in coding agents will be a challenge. It risks following the same pattern as VS Code plugins for Security. Security teams will introduce it as a shift-left mechanism, and developers will ignore it as a minor nuisance. The same will happen with Security Plugins in Claude Code (and we’ve spoken to devs who have already seen this). It’s gonna be even harder this time because we are trying to first analyze intent (inspecting plans through SDR) and then analyze implementation (assessing code for security bugs). So, the first challenge will be this: We need to build a UX for Security plugins that Developers (or whoever pushes code) actually enjoy using. Skills/Hooks/Plugins are easy to turn off/ignore, and devs will find a way to turn them off if the UX sucks (as 20 years of shoving AppSec tooling down devs’ throats has taught us).
1. A subtle (but thorny) problem when deciding on the UX for Security tooling within the SDLC is the Guardrails vs. Validation paradigm. When do you block a user from doing insecure things, vs. when do you “review” a developer's work and call out problems? How does this paradigm change when we are dealing with humans driving the workflow versus autonomous agents (think: somebody kicked off a dev job from within Slack/Jira)?
Visibility and control for Security Teams: Tooling within Coding Agents (Skills, Hooks, etc.) is designed to provide maximum flexibility for users. This is good for developers, but sucks for governance teams. Without the ability to influence how Security tooling is used within the Agents, it’s hard to validate if it’s working well, if adoption is high, and so on. Any solution we build needs to strike a balance between allowing Security teams to define how this tooling works (what to look for, when to fire reviews, how often they are invoked, etc.) and enabling developers to do the work. Some Security teams (depending on org culture) may also want to have “control” over how these tools are used, but that almost always ends badly. At a minimum, AppSec teams need solid telemetry of how the tooling is being used. There’s another UX point to consider: Should we allow Devs/Coding Agents to customize rule sets (or whatever the new paradigm for defining the scope of inquiry is)? If yes, this will diminish the control Security teams have. If not, we risk lower adoption.
The anxiety of a running meter: A core problem with security-review plugins (or any governance plugin) is that users have no control over the token and dollar cost of running a review. Scanning a design document with 1,000 words may consume a few cents of tokens, but scanning a design document with images and tables may consume a few dollars. The cost of reviewing PRs (depending on code size, language, and specific instructions) may also vary significantly. This is unsettling for enterprises. Imagine you have 3,000 developers, and the cost of a security review can vary 10x per PR/doc. How do you estimate the cost to the org? How do you measure ROI? When this happens, most orgs default to what already works (even if it isn’t great): Reviewing code on PR creation in GitHub.
1. A corollary to the cost problem is the latency problem: the same variability that affects cost also affects latency. However, given the way Agents are being used today (you hand off a task, move on to something else, and come back once it is done), developers seem more forgiving of somewhat higher latency, especially if there’s value at the end of it (which for devs would be: Security team won’t hold up the PR).
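The Guardrails vs. Validation question raised under the first pillar is still open, but it helps to see what one possible answer looks like once written down. The Python sketch below is an assumption, not a recommendation: the severity labels, the three actions, and the rule that autonomous runs get stricter treatment are all hypothetical choices made only to show how “human driving” versus “agent driving” can change the outcome for the same finding.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"   # guardrail: stop the action outright
    FLAG = "flag"     # validation: let it through, leave a review comment
    ALLOW = "allow"   # no intervention


def decide(severity: str, human_in_loop: bool) -> Action:
    """Toy policy (assumed, not prescriptive) mapping a finding to an action."""
    if severity == "critical":
        return Action.BLOCK
    if severity == "high":
        # With nobody watching (e.g. a job kicked off from Slack/Jira),
        # a review comment may never be read, so this policy blocks instead.
        return Action.FLAG if human_in_loop else Action.BLOCK
    if severity == "medium":
        return Action.FLAG
    return Action.ALLOW


if __name__ == "__main__":
    for severity in ("critical", "high", "medium", "low"):
        print(severity, decide(severity, True).value, decide(severity, False).value)
```

The interesting part is the single branch where the two modes differ. Where that line sits (and whether developers can move it) is exactly the UX and control trade-off described above.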

So, I think the task is now well defined for the AppSec industry (internal teams, tool creators, and vendors): A solution to the problem of security in the agentic SDLC needs to have a UX that works for developers, provide some visibility and control to AppSec teams, and convert variable cost into fixed (or at least mostly fixed) cost. When this happens, I think we can make serious progress in securely shipping AI-powered code.
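A quick back-of-the-envelope calculation shows why the variable-cost problem is hard to budget for. The 3,000 developers and the 10x spread come from the example above; the 20 reviews per developer per month, the 5-cent floor, and the 25-cent cap are made-up numbers chosen purely for illustration. Amounts are kept in integer cents to avoid floating-point noise.

```python
def monthly_cost_range(
    developers: int,
    reviews_per_dev: int,
    min_cost_cents: int,
    variability: int,
) -> tuple[int, int]:
    """Return (low, high) monthly spend in cents when per-review cost varies."""
    reviews = developers * reviews_per_dev
    low = reviews * min_cost_cents
    high = low * variability
    return low, high


def capped_monthly_cost(
    developers: int,
    reviews_per_dev: int,
    cap_cents: int,
) -> int:
    """Worst-case monthly spend in cents if every review is capped at cap_cents."""
    return developers * reviews_per_dev * cap_cents


if __name__ == "__main__":
    low, high = monthly_cost_range(3000, 20, 5, 10)
    print(f"uncapped: ${low / 100:,.0f} to ${high / 100:,.0f} per month")
    print(f"capped:   at most ${capped_monthly_cost(3000, 20, 25) / 100:,.0f} per month")
```

Under these assumed numbers the uncapped bill lands anywhere in a 10x band, while a per-review cap gives finance a single worst-case figure to plan around. How to enforce such a cap without degrading review quality is the actual engineering problem, and this sketch does not address it.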

At Seezo, we will soon ship what we think works, but I am sure there are dozens (if not hundreds) of other approaches to solve these problems. I am now reasonably confident that we will have some consensus on what the solution needs to look like in the next few months. And when that happens, we have a reasonable shot at AppSec keeping pace with the changing SDLC!

That’s it for today! Are you seeing consensus on how we need to solve the AppSec problem in the Agentic SDLC, too? Should we just YOLO and deal with things in prod? Hit me up! You can drop me a message on Twitter (or whatever it is called these days), LinkedIn, or email. I am also the co-founder of Seezo. We help companies automate security design reviews at scale. Check us out if that’s your thing :) If you find this newsletter useful, share it with a friend or colleague, or post it on social media.

The BoringAppSec Community
