Two Threats. One Blind Spot.
Traditional insider threat detection was built for human-speed attacks. Someone downloads a folder of sensitive files before their last day. Someone changes permissions during a dispute. That era is over. There are now two new threat vectors, and nobody is covering either one.
Threat 1: The AI-Accelerated Insider
A malicious actor with Claude Code or Cursor can exfiltrate an entire codebase in a single session. An insider can instruct an agent to systematically enumerate and copy every financial document in a shared drive — in minutes, not weeks. They can spin up OAuth-connected apps that quietly mirror data to external services, and the agent leaves the same audit log entries as a legitimate integration. The intent is human. The execution is machine-speed. The damage that used to take weeks now happens in an afternoon.
Threat 2: Agents Gone Wild
This one is scarier, because nobody asked for it. A developer gives an agent a legitimate task — refactor this module, clean up these files, set up this integration. The agent, self-directing and autonomous, goes further than intended. It deletes emails it decided were irrelevant. It exposes API keys while trying to be helpful. It gets prompt-injected by a malicious document in the shared drive and starts exfiltrating data on behalf of an attacker the user never knew existed. The agent becomes an insider threat to the very person who launched it — not through malice, but through the emergent behavior of autonomous systems operating without adequate oversight.
Projects like OpenClaw demonstrate how powerful self-directing agents can be. That same power, deployed without guardrails into a production Google Workspace full of sensitive data, is a loaded weapon with no safety.
The gap: Traditional insider threat vendors (DTEX, Varonis, Code42) are still detecting human-speed behavior. The AI security space is focused on prompt injection and model attacks. But the most dangerous threats sit between those categories — malicious humans amplified by AI, and helpful agents that drift into destruction. FrawdBot was built to close that gap.
What AI-Accelerated Behavior Looks Like in the Logs
When an insider uses AI tools to accelerate their actions, the audit logs tell the story — if you know how to read them at machine scale. The signatures are distinctive:
- Bulk download spikes at inhuman speed — 200+ files in minutes instead of the normal 5 per day. An agent pulling everything it can reach before someone notices.
- OAuth app grants for tools nobody approved — AI coding assistants, automation platforms, and data connectors appearing in the Workspace domain overnight.
- Extract-then-trash sequences compressed into single sessions — grab the data, delete the evidence, all within the same hour.
- Permission changes at machine tempo — rapid sharing/unsharing cycles within 60 minutes that would take a human all day.
- Communication pattern shifts — suddenly talking to new external contacts (attorneys, competitors, new employers) right as the AI-driven activity spikes.
A human doing these things leaves a trail over weeks. An AI-accelerated insider leaves the same trail in hours. FrawdBot catches both — because the math doesn’t care about speed. It cares about shape.
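The first signature above — hundreds of downloads in minutes instead of a handful per day — is simple to state as code. Here is a minimal sliding-window sketch of that idea; the function name, window size, and threshold are illustrative, not FrawdBot's actual implementation:

```python
from datetime import datetime, timedelta

def download_rate_spike(events, window_minutes=10, threshold=200):
    """Flag any sliding window containing an inhuman number of downloads.

    `events` is a time-sorted list of download timestamps for one user.
    The 200-in-minutes threshold echoes the signature described in the
    text; real tuning would come from per-user baselines.
    """
    window = timedelta(minutes=window_minutes)
    spikes = []
    start = 0
    for end in range(len(events)):
        # Shrink the window from the left until it spans <= window_minutes.
        while events[end] - events[start] > window:
            start += 1
        count = end - start + 1
        if count >= threshold:
            spikes.append((events[start], events[end], count))
    return spikes
```

A human pulling five files a day never trips this; an agent enumerating a shared drive trips it within seconds of starting.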
Six Layers of Detection
FrawdBot is a local, on-premises detection system launching on Google Workspace, with Microsoft 365 ecosystem integration in active development. It collects audit logs, stores everything locally — nothing goes to the cloud, nothing gets sent to a third party — and runs statistical anomaly detection against rolling behavioral baselines.
The system operates across six layers, each building on the one below.
Layer 1: Data Collection
FrawdBot pulls every file download, permission change, login, OAuth grant, calendar event, and admin action from Workspace audit logs. Email metadata — senders, recipients, subjects, timestamps — gets indexed locally with full-text search. Depending on the domain, this easily reaches hundreds of thousands of events.
Layer 2: Twelve Detection Rules
Each rule is a Python class that queries the database for a specific type of suspicious behavior. Among them: bulk download spikes against a 7-day baseline, mass deletions cross-referenced with calendar events, extract-then-trash sequences, rapid permission changes within 60 minutes, new OAuth app authorizations, token revocation clusters, account hardening sequences, calendar exclusions, communication anomalies, and semantic content analysis via AI embeddings.
Every rule scores findings on a 0–100 logarithmic scale. The first big anomaly matters most — going from 0 to 10 events is a bigger signal than 100 to 110. Scores map to severity levels: 85+ critical, 60–84 high, 30–59 medium, below 30 low.
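A logarithmic scale with those severity bands can be sketched as follows. The scoring constant and the +1 smoothing are my own illustrative choices; only the 0–100 range, the log shape, and the severity cutoffs come from the text:

```python
import math

# Severity bands from the text: 85+ critical, 60-84 high, 30-59 medium, <30 low.
SEVERITY_BANDS = [(85, "critical"), (60, "high"), (30, "medium"), (0, "low")]

def log_score(observed, baseline, scale=50.0):
    """Map an anomaly ratio onto a 0-100 logarithmic scale.

    The log makes the first big anomaly count most: going from 0 to 10
    events moves the score far more than going from 100 to 110.
    `scale` is an illustrative tuning constant, not FrawdBot's actual one.
    """
    ratio = (observed + 1) / (baseline + 1)  # +1 avoids division by zero
    return min(100.0, max(0.0, scale * math.log10(ratio)))

def severity(score):
    for floor, label in SEVERITY_BANDS:
        if score >= floor:
            return label
```

The key property is the diminishing-returns curve: `log_score(10, 0)` lands far above `log_score(110, 100)`, even though both represent "ten more events than expected."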
Layer 3: Baseline Analysis
Raw counts mean nothing without context. FrawdBot maintains a 30-day rolling baseline for each monitored user and computes z-scores to determine whether today’s activity is genuinely anomalous. This is where AI-accelerated behavior lights up — an agent-driven session produces statistical deviations that are orders of magnitude beyond normal human activity. It’s not a subtle signal. It’s a spike you can see from orbit.
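The z-score computation itself is standard; here is a minimal sketch of comparing today's count against a 30-day rolling baseline. The flat-baseline handling is my own assumption about a sensible edge case:

```python
import statistics

def zscore_today(today_count, baseline_counts):
    """Compare today's event count against a rolling baseline (e.g. 30 days).

    A human-speed anomaly might land at z = 3-5; an agent-driven session
    pushes z into the double or triple digits — the spike you can see
    from orbit. Thresholds here are illustrative.
    """
    mean = statistics.mean(baseline_counts)
    stdev = statistics.pstdev(baseline_counts)
    if stdev == 0:
        # Perfectly flat baseline: any change at all is maximally surprising.
        return float("inf") if today_count != mean else 0.0
    return (today_count - mean) / stdev
```

For a user who averages five downloads a day, an agent session pulling 200 files produces a z-score in the hundreds — not a judgment call, a statistical fact.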
Layer 4: Correlation Engine
Individual rule results are interesting but noisy. The correlation engine groups results into incidents — events where multiple rules fire for the same user within the same time window. It scores using peak severity (40%), rule diversity (24%), statistical deviation (15%), temporal density (15%), and campaign boost (6%). Not all rules are weighted equally — deletion spikes get 2.0x weight because destruction is unambiguous, while coordination probes get only 0.3x because they need corroboration.
AI-accelerated incidents score differently than human-speed ones. When an agent drives the activity, temporal density is extreme (everything compressed into minutes) and rule diversity is often high (the agent touches multiple systems in a single sweep). The correlation engine catches this naturally — it doesn’t need a special “AI mode.” The math surfaces the pattern.
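The weighted blend described above can be sketched as a single function. The component percentages (40/24/15/15/6) and the 2.0x/0.3x rule weights come from the text; the normalization of each component into a 0–100 range is my own simplification:

```python
# Rule weights from the text; rules not listed default to 1.0.
RULE_WEIGHTS = {"deletion_spike": 2.0, "coordination_probe": 0.3}

def incident_score(rule_severities, z_max, density, campaign=0.0):
    """Blend per-rule findings into one incident score (0-100).

    `rule_severities` maps rule name -> severity (0-100) for rules that
    fired in the same user/time window. `density` and `campaign` are
    pre-normalized 0-100 inputs — a simplification of the real engine.
    """
    weighted = {r: s * RULE_WEIGHTS.get(r, 1.0)
                for r, s in rule_severities.items()}
    peak = min(100.0, max(weighted.values()))       # peak severity (40%)
    diversity = 100.0 * len(rule_severities) / 12   # of 12 rules (24%)
    deviation = min(100.0, z_max * 10)              # z=10 saturates (15%)
    return (0.40 * peak + 0.24 * diversity + 0.15 * deviation
            + 0.15 * min(100.0, density) + 0.06 * min(100.0, campaign))
```

Note how the weights encode the philosophy: a deletion spike at severity 50 contributes as much peak signal as an unweighted rule at 100, while a coordination probe alone can never carry an incident on its own.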
Layer 5: LLM Classification
After the math, FrawdBot optionally runs a local LLM (Ollama running qwen2.5:32b) to classify each incident in plain language. The LLM doesn’t score — that’s already done mathematically. Its job is to narrate what the pattern means in forensically meaningful categories. No data leaves the machine.
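Because the LLM narrates rather than scores, its input is just the already-computed incident plus a fixed instruction. Here is a sketch of what building that prompt might look like; the category names and prompt wording are hypothetical, not FrawdBot's actual taxonomy:

```python
import json

# Illustrative forensic categories — not FrawdBot's actual taxonomy.
CATEGORIES = [
    "data_exfiltration", "evidence_destruction",
    "account_takeover_prep", "benign_automation",
]

def build_classification_prompt(incident):
    """Build the prompt a local model (e.g. qwen2.5:32b via Ollama) would see.

    The model only narrates: the score and severity in `incident` were
    already computed mathematically upstream.
    """
    return (
        "You are a forensic analyst. Classify this incident into one of: "
        + ", ".join(CATEGORIES) + ".\n"
        'Respond with JSON: {"category": ..., "narrative": ...}.\n\n'
        "Incident:\n" + json.dumps(incident, indent=2)
    )

# Sending this stays entirely local — e.g. a POST to a local Ollama
# instance's /api/generate endpoint with
# {"model": "qwen2.5:32b", "prompt": prompt, "stream": False}.
```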
Layer 6: Campaign Detection
The capstone layer looks across all historical incidents — not just today’s — to find multi-week coordinated patterns. It evaluates 90 days of data against named campaign patterns: Systematic Squeeze-Out (3+ rule families over 14+ days), Sustained Data Exfiltration, Scorched Earth (deletions + permission stripping), Professional Enablement (outside professionals brought in), and Escalating Lockout. FrawdBot has detected all five campaign types in real-world testing.
Campaign detection is where AI-accelerated threats get especially dangerous. A sophisticated actor doesn’t use AI tools in one explosive session — they use them in measured bursts across weeks, each burst looking individually explainable. The campaign layer connects the dots across time and surfaces the escalation pattern that no single-day review would catch.
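Taking the first named pattern as an example, "Systematic Squeeze-Out" (3+ rule families over 14+ days) reduces to a check over historical incidents. The pattern name and thresholds come from the text; this matching logic is my own sketch:

```python
from datetime import date

def matches_squeeze_out(incidents, min_families=3, min_span_days=14):
    """Check historical incidents for a 'Systematic Squeeze-Out' shape:
    3+ distinct rule families spread across 14+ days.

    `incidents` is a list of (date, rule_family) pairs for one user,
    drawn from the 90-day evaluation window described in the text.
    """
    if not incidents:
        return False
    families = {fam for _, fam in incidents}
    days = [d for d, _ in incidents]
    span = (max(days) - min(days)).days
    return len(families) >= min_families and span >= min_span_days
```

Each burst in such a campaign looks individually explainable; only the span-plus-diversity check across weeks surfaces the escalation.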
The Communication Graph
One of FrawdBot’s most powerful features maps who talks to whom, how often, and how those patterns change over time. It materializes email metadata into a queryable network with thousands of contacts and tens of thousands of weekly communication edges. This lets you ask: “Who did this person start communicating with right before the AI-driven activity started?” and “Which previously active relationships went dark?”
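The first of those questions — who did this person start talking to right before the spike — is a first-seen lookup over the edge list. A minimal sketch, with day numbers standing in for real timestamps and a lookback window I chose for illustration:

```python
def new_contacts_before(edges, anomaly_day, lookback_days=14):
    """Find contacts a user first emailed shortly before an anomaly.

    `edges` is a list of (day_number, contact) pairs from the
    communication graph; the 14-day lookback is illustrative.
    """
    first_seen = {}
    for day, contact in sorted(edges):
        first_seen.setdefault(contact, day)  # keep earliest sighting only
    return sorted(
        c for c, d in first_seen.items()
        if anomaly_day - lookback_days <= d < anomaly_day
    )
```

The inverse question — which relationships went dark — is the same query with last-seen dates and the window flipped.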
Results and Validation
FrawdBot independently identified the same patterns of misconduct that human investigation had already documented. It did so using only the activity logs and email metadata — the same evidence any forensic examiner would review. The system went from proof-of-concept to corroborating months of human investigation in approximately six weeks of development.
False positive management was critical. Dozens of domain exclusions were added to eliminate noise from newsletters and shipping notifications. Discrimination weights prevent high-volume, low-precision rules from overwhelming high-precision ones. Confirmed clean days were tested to verify the system doesn’t over-fire. When FrawdBot says something is wrong, something is actually wrong.
The Career Thread
The fraud detection instinct didn’t start with FrawdBot — it runs through my entire career:
- ID Analytics (acquired by LifeLock): Built the infrastructure processing millions of financial transactions daily for fraud detection under Dr. Steven Coggeshall, inventor of high-dimensional identity analytics
- Oracle: 38 audits per year across FedRAMP, SOC, PCI, and government compliance frameworks — I automated much of the audit response process
- Always Cool Brands: Supply chain fraud detection in FDA-regulated CPG operations, where fraud was endemic to the multi-vendor model
- FrawdBot: The synthesis — applying statistical analysis patterns from ID Analytics, audit automation discipline from Oracle, and operational fraud experience from ACB to a world where the insiders now have AI agents doing their dirty work
The core insight from ID Analytics still holds: “Think about Valentine’s Day — the shape of normal activity looks like a heart. If someone is stealing, it’s still a heart shape, but with a two-mile spike coming out of it.” AI-accelerated insiders don’t change the shape — they make the spike taller and faster. The math still catches it.
Technical Stack
| Component | Technology |
|---|---|
| Language | Python 3.10+ (21,000+ lines across 28 modules) |
| Data Layer | High-performance local database with vector and graph capabilities |
| Data Collection | GAMADV-XTD3 (Google Admin CLI) |
| LLM Classification | Ollama (local), qwen2.5:32b |
| Vector Embeddings | nomic-embed-text via Ollama |
| Testing | pytest with 1,257 test cases |
| Reports | Obsidian-compatible Markdown |
| Deployment | Cron or systemd (5-minute intervals for live monitoring) |
Everything runs on a single machine. No cloud services, no external APIs, no data leaving the network. Privacy and legal admissibility were design constraints from day one.
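For the cron deployment mode mentioned in the stack table, a single crontab line is enough. The install path, entry point, and flags below are hypothetical, not FrawdBot's actual layout:

```shell
# Hypothetical crontab entry: run one collection/analysis pass every 5 minutes.
# Path, binary name, and --once flag are illustrative assumptions.
*/5 * * * * /opt/frawdbot/bin/frawdbot run --once >> /var/log/frawdbot.log 2>&1
```

A systemd timer with `OnUnitActiveSec=5min` would serve the same role for shops that prefer unit files over crontabs.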
Who Watches the Agents?
FrawdBot is part of a larger project called Self-Improving Code — the thesis that the same automated systems that identify threats can also identify innovations. Defense and innovation are two sides of the same automated oversight system.
The question isn’t whether AI agents will proliferate — they already are. Every developer with Cursor, every business user with Claude, every team spinning up automated workflows is deploying autonomous agents into environments full of sensitive data. The two threat vectors are already in play:
Threat actors are accelerating. A disgruntled employee who would have taken weeks to exfiltrate a company’s data can now do it in a lunch break with the right tools. The intent is human. The velocity is machine.
Agents are drifting. A well-intentioned automation — given broad permissions and a vague goal — can delete records, expose credentials, follow prompt-injected instructions from a malicious document, or simply make destructive decisions that no one asked for. The agent isn’t malicious. It’s autonomous. And in a system full of sensitive data, that’s enough.
Traditional security tools weren’t built for either scenario. FrawdBot catches the behavioral signatures of both — because whether the destruction was driven by a bad actor with AI tools or by an agent that went sideways on its own, the audit logs tell the same story. Spikes. Anomalies. Patterns that break the shape of normal.
FrawdBot watches what happens inside the workspace. Self-Improving Code provides the governance layer for the agents themselves. Together, they answer the question every organization deploying AI needs to ask: who’s watching the agents?