LLMs review code one file at a time. Design flaws live between files.
LLMs see code through a keyhole: one file, one function, one context window at a time.
Bug finding is a solved problem. Design analysis is not. Design flaws require negative queries (what's missing?) and cross-file relationship analysis (what connects to what?).
LLM: understands intent and finds local bugs, but can't hold architecture. It sees one file at a time.
SAST: sees the entire codebase and has 3,000+ rules for injection and XSS, but no semantic reasoning. It can't ask "why is this wrong?"
Pattern matchers can't express "find all routes that do NOT have auth middleware." Design analysis needs something else.
A Code Property Graph overlays three views of the same code. Together they capture structure, control flow, and data movement. The graph for the target app has 443,000 nodes.
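A minimal sketch of the "three views, one graph" idea, assuming the three layers are the syntax tree (AST), control flow graph (CFG), and data dependence edges (PDG) described by Yamaguchi et al. (2014). The node kinds, sample code strings, and the `successors` helper are hypothetical; Joern's real schema is far richer.

```typescript
// Hypothetical, simplified CPG model: one shared node set, three edge layers.
type Layer = "AST" | "CFG" | "PDG";

interface CpgNode { id: number; kind: string; code: string; }
interface CpgEdge { from: number; to: number; layer: Layer; }

const nodes: CpgNode[] = [
  { id: 1, kind: "METHOD", code: "handler(req, res)" },
  { id: 2, kind: "ASSIGNMENT", code: "const id = req.body.UserId" },
  { id: 3, kind: "CALL", code: "Basket.findOne({ where: { UserId: id } })" },
];

const edges: CpgEdge[] = [
  { from: 1, to: 2, layer: "AST" }, // structure: the method contains the assignment
  { from: 1, to: 3, layer: "AST" }, // structure: the method contains the call
  { from: 2, to: 3, layer: "CFG" }, // control flow: the assignment runs before the call
  { from: 2, to: 3, layer: "PDG" }, // data dependence: `id` flows into the call
];

// Follow outgoing edges of a single layer from one node.
function successors(id: number, layer: Layer): CpgNode[] {
  const targets = edges
    .filter(e => e.from === id && e.layer === layer)
    .map(e => e.to);
  return nodes.filter(n => targets.indexOf(n.id) !== -1);
}

const astChildren = successors(1, "AST").map(n => n.id); // nodes inside the method
const dataUsers = successors(2, "PDG").map(n => n.id);   // nodes that consume `id`
```

Because all layers share the same nodes, one query can mix them: start from a syntactic match, then follow data or control edges from it.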
The graph carries the cross-partition knowledge. The LLM provides the reasoning. Neither works alone.
LLMxCPG (Lekssays et al., 2025) combines fine-tuned LLMs with Joern's CPG for vulnerability detection.
The human provides the CWE and the code. The bottleneck remains.
The LLM formulated CPGQL queries using only "this is an Express app with Sequelize."
All in server.ts, 3 route groups
security.* pattern identified
isAuthorized, denyAll, appendUserId
Sequelize + MongoDB, writes in 20 route files
No library, no ad-hoc validation
Leaks raw errors + Express version
Validators exist but always pass
Only reset-password + 2FA
socket.io, no JWT on connection
Negative sub-traversal on app.verb() calls in server.ts, filtering out routes with security.* arguments.
Cross-referenced with path-scoped app.use() auth. Notable: GET /rest/user/change-password,
admin endpoints, all file uploads left unprotected.
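The negative query above can be sketched outside the graph engine. This is a plain TypeScript illustration of the logic, not the CPGQL that was actually run. It assumes route registrations have already been extracted into records. The `RouteCall` and `UseCall` shapes and the `/example/...` paths are hypothetical; `/rest/user/change-password` and the `security.*` naming come from the findings.

```typescript
// Hypothetical records for app.verb(path, ...handlers) and app.use(prefix, ...handlers).
interface RouteCall { verb: string; path: string; args: string[]; }
interface UseCall { prefix: string; args: string[]; }

// A handler argument counts as auth if it references the security.* helpers.
const isAuthArg = (arg: string): boolean => arg.indexOf("security.") === 0;

// Negative query: keep routes with NO inline auth argument and NO covering
// path-scoped app.use() auth. Simplification: registration order is ignored,
// although Express only applies middleware to routes registered after it.
function unprotectedRoutes(routes: RouteCall[], uses: UseCall[]): RouteCall[] {
  const guardedPrefixes = uses
    .filter(u => u.args.some(isAuthArg))
    .map(u => u.prefix);
  return routes.filter(r =>
    !r.args.some(isAuthArg) &&
    !guardedPrefixes.some(p => r.path.indexOf(p) === 0));
}

const routes: RouteCall[] = [
  { verb: "get", path: "/rest/user/change-password", args: ["changePassword()"] },
  { verb: "get", path: "/example/basket/:id", args: ["security.isAuthorized()", "basket()"] },
  { verb: "post", path: "/example/private/notes", args: ["createNote()"] },
];
const uses: UseCall[] = [
  { prefix: "/example/private", args: ["security.isAuthorized()"] },
];

const findings = unprotectedRoutes(routes, uses);
```

The filter is defined by absence, which is exactly what a positive pattern rule cannot express.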
hash() uses crypto.createHash('md5') with no salt, no stretching.
hmac() key 'pa4qacea4VK9t9nGv7yZtwmj' hardcoded in source.
Card numbers stored as plaintext integers. TOTP secrets unencrypted in DB.
checkFileType checks extension for challenge solving but always calls next().
checkUploadSize always calls next(). No magic byte validation,
no content-type check, no multer fileFilter configured.
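For contrast, a hedged sketch of a validator that can actually reject. It assumes the upload is available as bytes, that the limit is 100 kB, and that only PDF and ZIP are allowed (both taken from the challenge descriptions). The names `validateUpload`, `MAGIC`, and `UploadVerdict` are hypothetical.

```typescript
const MAX_UPLOAD_BYTES = 100000; // assumption: "100 kB" read as 100,000 bytes

type UploadVerdict = { ok: true } | { ok: false; reason: string };

// Leading "magic bytes" of the allowed formats.
const MAGIC: { type: string; bytes: number[] }[] = [
  { type: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { type: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

function hasPrefix(data: Uint8Array, prefix: number[]): boolean {
  if (data.length < prefix.length) return false;
  return prefix.every((b, i) => data[i] === b);
}

// Unlike a middleware that always calls next(), every failed check
// returns a rejection the caller must act on.
function validateUpload(data: Uint8Array): UploadVerdict {
  if (data.length > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: "file too large" };
  }
  if (!MAGIC.some(m => hasPrefix(data, m.bytes))) {
    return { ok: false, reason: "unsupported file type" };
  }
  return { ok: true };
}

const pdfVerdict = validateUpload(new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d]));
const pngVerdict = validateUpload(new Uint8Array([0x89, 0x50, 0x4e, 0x47]));
```

In an Express middleware this would mean calling next() only on `ok: true` and answering with a 4xx status otherwise.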
Mass assignment in POST /api/Users: full req.body to Sequelize with no field allowlist.
Chatbot query text overwrites username. req.body.UserId used instead of middleware-injected value in 14+ files.
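A minimal sketch of the two missing controls: a field allowlist applied before the body reaches the ORM, and an owner id taken from the authenticated session instead of the request body. `pickAllowed`, `withOwner`, and the field list are hypothetical and not taken from the target app.

```typescript
// Hypothetical allowlist for user creation: anything else in the body is dropped.
const USER_CREATE_FIELDS = ["email", "password"] as const;

function pickAllowed(
  body: Record<string, unknown>,
  allowed: readonly string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(body, key)) out[key] = body[key];
  }
  return out;
}

// Overwrite any client-supplied UserId with the id the auth middleware established.
function withOwner(
  values: Record<string, unknown>,
  authenticatedUserId: number,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(values)) out[key] = values[key];
  out["UserId"] = authenticatedUserId;
  return out;
}

const attackBody: Record<string, unknown> = {
  email: "a@example.com",
  password: "pw",
  role: "admin", // mass-assignment attempt
};
const safeUser = pickAllowed(attackBody, USER_CREATE_FIELDS);
const ownedFeedback = withOwner({ comment: "hi", UserId: 999 }, 7);
```

Sequelize's `create` also accepts a `fields` option that restricts which attributes are written, which may be a simpler place to enforce the same allowlist.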
All 5 input checks found are password-emptiness checks. Zero type coercion, zero regex, zero schema validation. Model-layer sanitization exists but is output encoding, not input validation.
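What server-side schema validation could look like, as a sketch only. The `comment` and `rating` field names and the 1 to 5 range are assumptions for illustration. The point is that types and ranges are checked on the server, whatever the UI enforces.

```typescript
type Result<T> = { ok: true; value: T } | { ok: false; errors: string[] };

function isInt(x: unknown): x is number {
  return typeof x === "number" && isFinite(x) && Math.floor(x) === x;
}

// Hypothetical feedback schema: non-empty comment, integer rating from 1 to 5.
function validateFeedback(
  body: Record<string, unknown>,
): Result<{ comment: string; rating: number }> {
  const errors: string[] = [];
  const comment = body["comment"];
  const rating = body["rating"];

  if (typeof comment !== "string" || comment.trim().length === 0) {
    errors.push("comment must be a non-empty string");
  }
  // No type coercion: the string "5" is rejected, not converted.
  if (!isInt(rating) || rating < 1 || rating > 5) {
    errors.push("rating must be an integer from 1 to 5");
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { comment: comment as string, rating: rating as number } };
}

const zeroStars = validateFeedback({ comment: "meh", rating: 0 });
const stringRating = validateFeedback({ comment: "ok", rating: "5" });
const valid = validateFeedback({ comment: "great", rating: 5 });
```

The same shape extends to quantities (reject negative or non-integer values) and registration fields (reject empty strings).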
changePassword calls hash(), resetPassword calls hmac(),
chatbot calls authorize() and verify().
Prometheus metrics endpoint exposed without auth.
4 of 109 routes are rate-limited (reset-password + 2FA only). All use a 5-minute window with a 100-request limit.
Reset-password keyGenerator uses X-Forwarded-For, a client-supplied header, so the limit is trivially bypassable.
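Why the key matters, as a sketch. This is a hand-rolled fixed-window limiter, not the rate-limiting library the app uses. The `Req` shape, `makeLimiter`, and the IP addresses are hypothetical; the 5-minute window and 100-request limit mirror the configuration reported above.

```typescript
interface Req {
  socketAddress: string;
  headers: Record<string, string | undefined>;
}
type KeyFn = (req: Req) => string;

// Fixed-window counter: at most `max` requests per key per window.
function makeLimiter(windowMs: number, max: number, keyFn: KeyFn) {
  const buckets: Record<string, { windowStart: number; hits: number }> =
    Object.create(null);
  return function allow(req: Req, nowMs: number): boolean {
    const key = keyFn(req);
    const entry = buckets[key];
    if (entry === undefined || nowMs - entry.windowStart >= windowMs) {
      buckets[key] = { windowStart: nowMs, hits: 1 };
      return true;
    }
    entry.hits += 1;
    return entry.hits <= max;
  };
}

// Key taken from a header the client controls vs. the connection's address.
const byForwardedFor: KeyFn = req => req.headers["x-forwarded-for"] || req.socketAddress;
const bySocket: KeyFn = req => req.socketAddress;

// One attacker, one connection, a fresh spoofed header on every request.
function countAllowed(keyFn: KeyFn, attempts: number): number {
  const allow = makeLimiter(5 * 60 * 1000, 100, keyFn);
  let allowed = 0;
  for (let i = 0; i < attempts; i++) {
    const req: Req = {
      socketAddress: "203.0.113.7",
      headers: { "x-forwarded-for": "10.0.0." + i },
    };
    if (allow(req, 0)) allowed++;
  }
  return allowed;
}

const allowedWithHeaderKey = countAllowed(byForwardedFor, 300); // every request gets a new bucket
const allowedWithSocketKey = countAllowed(bySocket, 300);       // capped at the limit
```

Behind a reverse proxy the header can be legitimate, but only when it is set by a proxy the server is configured to trust (Express exposes this through its `trust proxy` setting).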
Checkout verifies basket and delivery but NOT address or payment card. No session-level step tracking. Angular frontend stores state in sessionStorage. Coupon applied to any basket without ownership check.
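Session-level step tracking could look like the following sketch. The step names and their order are assumptions, not the app's actual flow. What matters is that the server, not sessionStorage in the browser, records which steps have been completed.

```typescript
type Step = "basket" | "address" | "delivery" | "payment";

// Assumed order of the checkout workflow.
const CHECKOUT_STEPS: Step[] = ["basket", "address", "delivery", "payment"];

// Server-side session state (hypothetical shape).
interface CheckoutSession { completed: Step[]; }

// A step may only be completed once every earlier step is done.
function completeStep(session: CheckoutSession, step: Step): boolean {
  const index = CHECKOUT_STEPS.indexOf(step);
  const earlierDone = CHECKOUT_STEPS
    .slice(0, index)
    .every(s => session.completed.indexOf(s) !== -1);
  if (!earlierDone) return false;
  if (session.completed.indexOf(step) === -1) session.completed.push(step);
  return true;
}

// The order is placed only when the whole workflow has been walked.
function canPlaceOrder(session: CheckoutSession): boolean {
  return CHECKOUT_STEPS.every(s => session.completed.indexOf(s) !== -1);
}

const skipping: CheckoutSession = { completed: [] };
completeStep(skipping, "basket");
const jumpedToPayment = completeStep(skipping, "payment"); // address and delivery skipped

const honest: CheckoutSession = { completed: [] };
const walkedInOrder = CHECKOUT_STEPS.map(s => completeStep(honest, s));
```

A per-step ownership check (does this address, card, or coupon belong to this user?) would sit inside each step's handler and is not shown here.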
Every finding maps to at least one known exploitable challenge in vulnerable-app.
Admin Section: admin panel accessible without auth. Change User1's Password: password change via GET, no auth. View Basket: another user's basket viewable. Five-Star Feedback: delete feedback without auth.
Password Strength: admin password crackable (unsalted MD5 is fast to brute-force). Weird Crypto: "algorithm it should not use." Two Factor Auth: TOTP secret stored unsafely.
Upload Size: upload >100kB (checkUploadSize always passes). Upload Type: upload non-PDF/ZIP (checkFileType always passes). Arbitrary File Write: overwrite legal information file.
Admin Registration: mass assignment sets role:admin. Forged Feedback: post as another user. Forged Review: edit any review. Manipulate Basket: put product in another's basket. NoSQL Manipulation: update multiple reviews.
Zero Stars: UI prevents 0 stars, server accepts it. Empty User Registration: register with empty fields. Repetitive Registration: DRY violation. Payback Time: negative quantities not validated.
Exposed Metrics: Prometheus endpoint without auth. Change User1's Password: password change reaches hash function without auth.
CAPTCHA Bypass: submit 10+ feedbacks in 20 seconds. Reset User3's Password: brute force despite rate limiting (bypassable).
Deluxe Fraud: obtain membership without paying. Payback Time: order makes you rich (negative quantities). Expired Coupon: redeem expired coupon.
The CPG didn't eliminate code reading; it reduced it to targeted verification.
5 of 11 Pass 2 queries needed get_source, reading 10–20 lines each.
SAST has 3,000+ rules for injection and XSS. Zero rules for "does this workflow enforce step ordering?" or "what percentage of routes lack auth?" Design flaws are architectural.
The keyhole problem is structural, not a context window limitation. Even with infinite context, file-by-file reading loses the cross-file relationships that define design.
Not source code (too granular). Not pattern matching (no semantics, no absence detection). A structural graph the LLM can query, reason over, and verify against. Two passes: learn the architecture, then interrogate the design.
Validated on vulnerable-app (443K CPG nodes, 107 known challenges). 30 findings confirmed across 8 OWASP A06 categories. Zero false positives. All tools open source.

| Tool | Purpose | URL |
|---|---|---|
| Joern | Open-source CPG engine, CPGQL query language | joern.io |
| GitNexus | Code knowledge graph, community detection, blast radius | github.com/BlockSecCA/GitNexus |
| Claude | LLM for query generation, reasoning, interpretation | anthropic.com |
| joern-mcp | MCP server wrapping Joern's HTTP API | github.com/BlockSecCA/joern-mcp |
| vulnerable-app | Target application: intentionally vulnerable Express/Sequelize app (107 challenges) | github.com/BlockSecCA/vulnerable-app |

| Reference | URL |
|---|---|
| OWASP Top 10: A06:2025 Insecure Design | owasp.org/Top10/A06 |
| CWE (Common Weakness Enumeration) | cwe.mitre.org |
| CPGQL Documentation | docs.joern.io/cpgql |

| Paper | Authors | URL |
|---|---|---|
| LLMxCPG | Lekssays, Mouhcine, Tran, Yu, Khalil (2025) | arxiv.org/abs/2507.16585 |
| Code Property Graphs | Yamaguchi, Golde, Arp, Rieck (2014) | ieeexplore.ieee.org/… |