Malice in the Mesh // 01: Introduction to Defensible Architecture for Agents

The purpose of the Malice in the Mesh series is to give us a toolbox to look at artificial intelligence security beyond hype. Our approach will be a series geared towards detection engineering for agentic AI, but with the foundations for a technologist. Our journey will have a few stops along the way. We will be building conceptual frameworks foundational to modern day artificial intelligence development and into AI security. The first number of installments will be a dissection of artificial intelligence across several planes. While we will conduct exercises of exploration in various components and structures existing within general agent architecture, it’s important to remember that at the end of the day, these parts cohesively form one holistic system. The same as human anatomy comprising separately classified systems of apparatus yet operating in tandem as one. Seamless enough that we don’t even consciously give thought to them as separate, until something goes wrong. In addition, we will be diving into governing non-malicious models and extending our knowledge and understanding into detection engineering of black hat LLM models as well. First, let us lay the foundation and dive into the fundamental architecture of every agent, a process akin to the human respiratory apparatus…

Large Language Model vs. Agent: What’s the difference?

It’s important to note the distinction between an agent and an LLM. While an LLM lives as a stateless function, an agent exists as a stateful control loop around that function. This means the ability to not only call an LLM, but also leverage other tools. It can also offer enhanced security and a layer for LLM auditing through a secure control loop design.

The code below is a minimal example of that model in isolation: a single prompt–a single response, and none of the memory, tooling, or control logic that turns a model into an agent.

from anthropic import Anthropic

client = Anthropic(api_key="ANTHROPIC_API_KEY")

def ask_claude(prompt: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        system="You are a helpful cybersecurity assistant.",
        messages=[
            {"role": "user", "content": prompt}
        ]
    )
    return response.content[0].text

if __name__ == "__main__":
    result = ask_claude(
        "Explain the impact and severity of CVE-2025-3248 in simple terms."
    )
    print(result)

What follows is the model by itself: not an agent, but the raw reasoning component an agent is later built around.

| **CVSS Score** | ~9.8 (Critical) |
| **Authentication Required** | ? None |
| **Network Access Required** | ? Just internet access |
| **Impact** | Full server compromise |

### What an attacker can do:
- ? **Steal data** (API keys, credentials, model configs)
- ? **Take over the server**
- ? **Install malware or ransomware**
- ? **Pivot deeper into internal networks**
- ? **Destroy or modify AI pipelines**

---

## Who Is Affected?
- Anyone running **Langflow versions prior to 1.3.0** that is **exposed to the internet**
- Particularly risky for organizations using Langflow in **production AI environments**

---

## Fix
- **Update to Langflow v1.3.0 or later** immediately
- **Do not expose Langflow publicly** without authentication/firewall rules
- Require authentication and restrict network access

--snip--

Even though a plain language model by itself can only accomplish next-token prediction or single-turn generation, that is not automatically “agentic.” It becomes agentic when you wrap it in things like goal persistence, tool use, planning, and more. Now, while an LLM is not inherently agentic, agents do not inherently require an LLM. Agents in the broader sense pursue goals through iterative decision and action, even if no language model is involved. So what’s the difference between an agent not leveraging a language model, and a mere… script? A script follows a predefined path. In contrast, an agent can choose among paths based on goals, context, and what it observes. A script is deterministic, while an agent is inherently probabilistic. So, while a language model is not required for an agent, the efficiency of most agents will be dramatically enhanced by the inclusion of a language model (especially for single-agent architectures) due to the package of weight training, pattern matching, flexible interpretation, and semantic compression that comes along with it.

Let’s now look at the practical implication of the agent framework through code. There is a lot going on here, but that is part of the point: agents are systems. And systems, even when they appear elegant from a distance, are made up of moving parts, state transitions, tool calls, permissions, and control logic.

This code spins up an MCP filesystem server, connects to it, and uses that tool layer to list an allowed directory and read a target CVE file rather than hardcoding the file content directly into the prompt. The file in which it is calling has additional information regarding CPE (Common Platform Enumeration) exposure information for public facing servers worldwide (achieved through scanning). It then sends that retrieved additional context to Claude for analysis, showing the shift from a plain LLM call to an agent-like workflow that can interact with external tools before generating an answer.

Do not worry if this feels dense on first pass. We will simplify the agentic process shortly. For now, use this example as a practical anchor as we will return to it again when we walk through the PPAO (Perceive, Plan, Act, Observe) loop step by step:

import os
import asyncio
from anthropic import AsyncAnthropic
from mcp import ClientSession
from mcp.client.stdio import stdio_client, StdioServerParameters

NODE_BIN = "NODE_PATH"
NPX = f"{NODE_BIN}/npx"
ROOT = "ROOT_PATH"
TARGET = "TARGET_PATH"

client = AsyncAnthropic(
    api_key="ANTHROPIC_API_KEY"
)

--snip--

    print("0) Launching MCP...", flush=True)

    async with stdio_client(server_params) as (read, write):
        print("1) MCP launched", flush=True)

        async with ClientSession(read, write) as mcp:
            print("2) Initializing...", flush=True)
            await asyncio.wait_for(mcp.initialize(), timeout=20)
            print("3) Initialized", flush=True)

            print("4) Listing allowed directory...", flush=True)
            listing = await asyncio.wait_for(
                mcp.call_tool("list_directory", {"path": ROOT}),
                timeout=20,
            )
            for item in listing.content:
                if hasattr(item, "text"):
                    print(item.text)

            print("5) Reading CVE file...", flush=True)
            result = await asyncio.wait_for(
                mcp.call_tool("read_file", {"path": TARGET}),
                timeout=20,
            )

--snip--

            print("6) Sending to Claude...", flush=True)
            response = await asyncio.wait_for(
                client.messages.create(
                    model="claude-sonnet-4-6",
                    max_tokens=1000,
                    system="You are a cybersecurity analyst.",
                    messages=[
                        {
                            "role": "user",
                            "content": (
                                "Analyze this CVE and summarize:\n\n"
                                f"{file_text}\n\n"
                                "Include: product, severity, risk, remediation."
                            ),
                        }
                    ],
                ),
                timeout=60,
            )

            summary = response.content[0].text

            print("\n===== FINAL SUMMARY =====\n")
            print(summary)
            print("\n=========================\n")

What follows is an output which is fusing our new custom data source with LLM generation:

--snip--

# CVE Analysis: Langflow Remote Code Execution Vulnerability

## CVE-2025-3248 (Critical)

---

## Product Information

| Field | Details |
|------|---------|
| **Product** | Langflow (AI workflow builder platform) |
| **Affected Versions** | < 1.3.0 (`cpe:2.3:a:langflow:langflow:*`) |
| **Deployment Scale** | ~279 exposed instances globally |
| **Primary Exposure** | United States (112), Germany (23), Brazil (15) |

---

## Vulnerability Summary

--snip--

Every AI agent, regardless of how it is marketed, what framework it runs on, or how sophisticated its architecture claims to be, operates on the same fundamental cycle. We call this the PPAO loop, and it is the conceptual backbone of everything that follows in future discussions. PPAO = Perceive, Plan, Act, Observe. If you’re familiar with current approaches, Think-Observe-Act tends to be dominant as the current paradigm. We dissect the Think stage into Perceive and Plan. This is advantageous to understand how attackers tend to insert themselves into the Think stage. We will be exploring this in much further depth going forward.

Perceive.

First, enter the Perceive stage. The Perceive stage starts at inception, with a user prompt and any accompanying resources. Possibly a PDF for analysis, or in a multi-agent system, a message from another agent. Agents operate on perception which is why we find principles packaged with context will quite often render better results than mechanical instructions. From personal experience, better construction happens by telling an agent to build a solution in the spirit of the Japanese philosophy of 5S, Seiri (sort), Seiton (Set in Order), Seiso (Shine), Seiketsu (Standardize), and Shitsuke (Sustain)… versus prompting an agent only to deliver results that are overly prescriptive, rigid, and step-bound, where the agent is forced to follow instructions without understanding the intent behind them. Both are necessary. This is highly important.

Let’s take a look at what happens when we inject some philosophical underpinnings into our original LLM-only prompt:

--snip--

if __name__ == "__main__":
    result = ask_claude(
        "Explain the impact and severity of CVE-2025-3248 in simple terms. Leverage the philosophies of Genchi genbutsu and Hoshin kanri in your analysis."
    )
    print(result)

And now we observe, not just new information, but a newly crafted structure to our put:

# CVE-2025-3248: Plain Language Analysis

## What Is This Vulnerability?

CVE-2025-3248 is a **critical unauthenticated Remote Code Execution (RCE) vulnerability** in **Langflow**, an open-source visual framework used to build AI/LLM-powered applications.

- **CVSS Score:** 9.8 (Critical)
- **Affected Component:** `/api/v1/validate/code` endpoint
- **Root Cause:** The endpoint executes arbitrary Python code **without requiring any authentication**

---

## What Actually Happens (The Simple Version)
Attacker ? sends crafted HTTP POST request
? to /api/v1/validate/code
? with malicious Python payload
? Langflow executes it AS THE SERVER
? Attacker now owns the machine

No login. No credentials. No special tools. Just an HTTP request.

---

## ? Genchi Genbutsu — "Go and See for Yourself"

*Genchi genbutsu* means going to the actual place of the problem — the "real place" — rather than relying on abstract reports or assumptions. In cybersecurity, this means understanding the **actual attack surface at the code level**, not just reading the advisory.

### Going to the "Gemba" (Actual Place) of This Vulnerability

**The real place here is the endpoint itself:**

```python
# Conceptual representation of the vulnerable pattern
@router.post("/validate/code")

--snip--

When an agent is given only explicit steps, it optimizes for instruction completion. It does not question noise, restructure inputs, or improve the system, it simply executes. This leads to brittle outputs, poor generalization, and failure when conditions change. This means leaning on principles grounded in long-standing, real-world context because they carry deeper, time-tested frames of reference. Don’t think of words as instructions when working with agents, but rather toolsets applied at time of inference. (And yes, that’s a true reframing, not generated AI “reframing” slop).

From our previous working agent example, we see the Perceive stage demonstrated by a call to the filesystem MCP server. There is an important distinction here, as unlike our LLM-only example, the Perceive stage in agent architecture does not have to be a prompt:

print("0) Launching MCP...", flush=True)

async with stdio_client(server_params) as (read, write):
    print("1) MCP launched", flush=True)

Plan.

Perception is only the first stage of the PPAO loop. The next phase, Plan, sets in motion the latter two phases, Act and Observe. Phronesis is an ancient Greek term meaning the ability to make the right decision in the moment by understanding what will actually lead to the best outcome. Just as perception framed by phronesis will take on starkly different patterns of action for the CEO of a nonprofit focused on clean water in sub Saharan Africa vs. a terrorist hellbent on terrorizing innocent civilians, agents begin to unveil their true purpos in this stage, letting their purpose be known. This stage of the loop is the one in which we see working, episodic, semantic, and procedural memory enter the loop.

Circling back to our working example, let’s take a look at the Plan stage. I italicized look because well… where’s the plan? As we will see throughout the series, the most dangerous gap in agent detection takes place between the agent’s Plan stage and Act stage. Often, agent frameworks will call tools/generate output in the Act stage, yet do not show us the reasoning that led to that call. We experience that here because the full planning happens on the Claude side:

response = await asyncio.wait_for(
    client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        system="You are a cybersecurity analyst.",
        messages=[
            {
                "role": "user",
                "content": (
                    "Analyze this CVE and summarize:\n\n"
                    f"{file_text}\n\n"
                    "Include: product, severity, risk, remediation."
                ),
            }
        ],
    ),
    timeout=60,
)

Act.

The Act stage is simple, yet the most impactful, because this is where the rubber meets the road. Perceive, Plan, and then do the thing. A common misconception in agentic processes is treating the Act stage as the LLM output itself, or more loosely as the final visible result of a code block. But that is not particularly correct. The LLM output belongs more properly to the Plan stage. It is the reasoning product, not the execution. The Act stage begins when the system takes that reasoning and actually does something with it in the world, whether that means calling a tool, writing a file, making a request, sending a message, or taking some other step with consequence. In this case, that effect is limited to rendering the text. So the Act stage here is modest in scope, but the logic still holds. Act is not the thought itself, it is the point where the thought becomes reality.

This becomes fundamentally important to the objective of our series in it’s totality. As the goal of our series is to pave a path to better detection, it is crucial to understand the bifurcation of the reasoning engine vs. the execution engine. We can categorize attacks by their targeting of the reasoning engine or the execution engine. For example, prompt injection targets the reasoning engine. Exploiting vulnerable tool implementation targets the execution engine. More to come.

Observe.

Finally, we enter the Observe phase. Just as life is not a series of decisions for humans, the same holds true for agents. Rather, life is a series of decisions… then memories… then reflections on those memories… then derived meaning influencing future perception of ourselves and the world. And then more decisions. Observe simultaneously closes the loop and gets us ready for the next.

In our working example, the observation is the output. As a bonus (I know, I’m so generous), let’s inject some philosophical underpinnings into our agent example and see the difference. We modify our prompt accodingly:

"role": "user",
"content": (
    "Analyze this CVE and summarize:\n\n"
    f"{file_text}\n\n"
    "Include: product, severity, risk, remediation. "
    "Use the information in CVE-2025-3248.txt in tandem with
    Chinese philosophical concepts of relationality for analysis. "
),

Our observation:

-- snip --

Primary Hosting Providers:
??? Amazon Technologies Inc.: 35
??? Amazon Data Services NoVa:22
??? Amazon.com, Inc.:        22
??? Google LLC:             16
??? Microsoft Corporation:  12

EOL Product Tag:            66 instances (23.7% of exposed)
Cloud-tagged:               198 instances (71.0% of exposed)

> **Critical Observation:** 279 internet-facing Langflow instances are directly exploitable. 66 are flagged as **End-of-Life**, meaning they will receive **no patches**. The majority run on hyperscaler infrastructure (AWS, GCP, Azure), which introduces profound **fourth-party risk chains**.

---

## ???? (Xi?nghù Y?cún) — FOURTH PARTY RISK ANALYSIS
### **"The Concept of Mutual Interdependence"**

Drawing from the Taoist principle of **????** and the I Ching's concept of **? (W?ng — The Net/Web)**, every node in a digital supply chain is bound to every other. A vulnerability in Langflow is not isolated — it propagates through invisible threads of dependency.

---

### The ?? (W? Xíng) Five-Element Risk Chain Model

????????????????????????????????????????????????????????????????
?            ?? FIVE-ELEMENT FOURTH PARTY RISK CHAIN         ?
?                                                              ?
?  ? WOOD         ? FIRE           ? EARTH                  ?
?  (Origin)       (Propagation)     (Foundation)              ?
?  Langflow       AI Pipelines      Cloud Infrastructure      ?
?  Instance   ??? LLM Workflows ??? AWS/GCP/Azure Tenants     ?
?      ?               ?                  ?                    ?
?      ?               ?                  ?                    ?
?  ? METAL        ? WATER                                     ?
?  (Harvest)      (Permeation)                                  ?
?  Data

-- snip --

As we can see we called our exposure additional data via MCP in the Perceive stage, we see how we influenced a more powerful use of that data by injecting additional prompting regarding Chinese philosophy. This is excellent context that would certainly bolster our cyber threat intelligence.

PPAO, our OODA.

Just as a party to kinetic warfare wins the engagement by either disrupting the OODA loop or outpacing it (and disrupting it by outpacing it), the same can be said for the agentic based Perceive, Plan, Act, Observe loop. As we progress, we will see how attacks operate within the PPAO loop. The Perceive phase is where indirect prompt injection happens. The Observe phase is the second major injection point, if an attacker can insert themselves into the loop, they can influence every subsequent planning decision the agent makes. The attack surface is not a single input; it is every input at every iteration, and the effects cascade forward through the entire execution. To amplify our conceptual model of risk within these systems even further, now multiply what we have been describing as a single agent-system to multi-agent, hierarchical, and swarm architectures.

Today’s defensive posture, amid the shift from node-based execution models to probabilistic agentic mesh, reflects a familiar pattern: many once ground-breaking, must-adopt security approaches have been set aside in the rush toward the future.

As we work through the PPAO loop, one principle will become clear: by monitoring and controlling boundaries, a system can remain defensible regardless of how an agentic process is compromised. This insight forms the foundation of Google DeepMind’s CaMeL defense architecture. CaMeL is inspired by security-first and security by design concepts, rather than security-after. By focusing on control flow integrity, access control, and information flow design, CaMeL is far superior than what many rely on now in training or prompting models to adhere to security policies. It’s like having a functioning brake system vs. learning how to swerve around traffic because you have no brakes.

Check out the 125-page CaMeL whitepaper by Google DeepMind here: https://arxiv.org/pdf/2503.18813

In this transition, concepts like Zero Trust have often escaped us and taken a backseat to speed of build. To compound matters, as we have made leaps and bounds from tool in hand to ecosystem-level fabrics, external interdependencies have introduced significant third-party risk. For example, as the MCP ecosystem has grown, a significant amount are susceptible to path traversal attacks. Earlier in 2026, we saw a flood of CVEs emerge, and Tool Poisoning become an attack type that should be on everyone’s radar. This trust boundary violation has become structural. An MCP server that an agent connects to is simultaneously a trusted tool provider and a potential attack vector, and there is no protocol-level mechanism to distinguish the two.

The goal here, however, is not to stifle innovation, but to make it resilient. Going forward, as we explore detection engineering and defensible architectures surrounding agentic processes (through the lens of the PPAO loop), it’s worth keeping one principle in mind, popularized by U.S. Special Operations: slow is smooth, and smooth is fast.

In this installment, we laid the foundation for what an AI agent is, how it differs from a pure LLM call, the essential loop any agentic process operates in, and why this understanding is essential to starting our conversation regarding agentic security. In our next installment we will go deeper into the implementation details of every layer in a modern agent system, from LLM sampling parameters through memory architectures to the orchestration layer, because effective security analysis requires understanding the machine, and we are about to take it apart piece by piece.

Have suggestions or want to collaborate on a future project? Shoot me an email at roccofiorecyber@gmail.com or find me on LinkedIn at the icon below.

The content published on this site reflects personal views and research only. It does not represent the views, positions, or policies of any current or former employer, client, or affiliated organization.

Any references to technologies, vulnerabilities, or security practices are for educational and informational purposes only. Nothing on this site should be interpreted as endorsement, disclosure of confidential information, or professional advice.

All examples are generalized or fictionalized unless explicitly stated otherwise.

Latest Posts

roccofiorecyber@gmail.com

Malice in the Mesh // 02: Temperature and Parameter TamperingApril 11, 2026
Welcome to the second installment of our series on defensible architecture of AI agents. In these next couple of installments we will dive into temperature, sampling, and prompt architecture. To begin this installment, we will focus in on a concept we mentioned briefly in our last piece: A script is deterministic, while an agent is… Read more: Malice in the Mesh // 02: Temperature and Parameter Tampering
Malice in the Mesh // 01: Introduction to Defensible Architecture for AgentsApril 8, 2026
The purpose of the Malice in the Mesh series is to give us a toolbox to look at artificial intelligence security beyond hype. Our approach will be a series geared towards detection engineering for agentic AI, but with the foundations for a technologist. Our journey will have a few stops along the way. We will… Read more: Malice in the Mesh // 01: Introduction to Defensible Architecture for Agents