How Zod's .refine() Can Cause a Denial of Service — And How to Fix It (2026)
Zod's .refine() executes even when earlier validators like .min() and .max() fail — placing expensive logic inside .refine() can cause a DoS. Learn the root cause, see a live demo, and apply the safe pattern.

TL;DR
Zod's .refine() executes on every input — even when earlier validators like .min() and .max() have already failed. If you place an expensive operation such as a database query inside .refine(), an attacker can trigger that query with every request, including requests containing completely invalid inputs that would never pass validation. Flood enough of those requests concurrently and the server goes down. The fix is one line away — validate first, query after — but only if you know the behavior exists.
Introduction
Zod is one of the most widely adopted TypeScript validation libraries. If you are building a Next.js API route, a tRPC endpoint, or a server action, Zod is likely involved in your validation layer. It is trusted precisely because it makes validation straightforward — define a schema once, validate everywhere.
That trust makes this behavioral edge case more dangerous than it would be in a less-used library. Developers assume that if .min() or .max() rejects an input, Zod stops there. It does not. And if you have written production code that places a database query inside .refine() — which is a natural thing to do when implementing uniqueness checks, for example — your server has an application-layer Denial of Service vulnerability that requires no authentication and no special tooling to exploit.
What Is Zod?
Zod is a TypeScript-first schema validation library. Define a schema, validate data against it, get back either the validated data or a typed error object. It works across the full stack: form validation on the client, API request and response validation on the server, environment variable validation at startup, and runtime type narrowing anywhere TypeScript cannot provide compile-time guarantees.
```typescript
import { z } from "zod"

const UserSchema = z.object({
  username: z.string().min(2).max(20),
  email: z.string().email(),
  age: z.number().int().min(18),
})

// Parse returns typed data or throws ZodError
const user = UserSchema.parse(req.body)

// safeParse returns { success: true, data } or { success: false, error }
const result = UserSchema.safeParse(req.body)
```

The library is used across authentication systems, API validation layers, database input sanitization, and anywhere else that untrusted data enters a typed system. Its ubiquity is exactly what makes understanding its security edge cases important.
The .refine() Function
.refine() adds custom validation logic that Zod cannot express natively. The common use cases are uniqueness checks against a database, cross-field comparisons, and domain-specific business rule validation.
```typescript
const UsernameSchema = z.string()
  .min(2, "Username too short")
  .max(20, "Username too long")
  .refine(
    async (val) => {
      // Check if username already exists in database
      const existing = await db.users.findUnique({ where: { username: val } })
      return existing === null
    },
    { message: "Username already taken" }
  )
```

This looks correct. It looks like the database check only runs if the username passes .min() and .max(). It does not.
The Vulnerability: .refine() Runs Regardless of Earlier Failures
This is the behavior that most developers do not expect: Zod executes .refine() even when an earlier validator in the chain has already produced a validation error.
The following schema demonstrates the execution order:
```typescript
const schema = z.object({
  username: z.string()
    .min(2, "Username too short")
    .max(4, "Username too long")
    .refine((val) => {
      console.log("REFINE EXECUTED with value:", val)
      return true
    }, {
      message: "Validation failed"
    })
})
```

When this schema receives the input "a" (one character — fails .min(2)):
```
Input: "a"
Result: ZodError — "Username too short"
Console: REFINE EXECUTED with value: a   ← runs despite min() failure
```

When it receives "this is a long string" (fails .max(4)):

```
Input: "this is a long string"
Result: ZodError — "Username too long"
Console: REFINE EXECUTED with value: this is a long string   ← runs despite max() failure
```

Screenshot context: Input "a" fails .min(2) and returns "Username too short" — but the server console shows REFINE EXECUTED with value: a, proving .refine() ran despite the earlier validation failure. The error response is correct; the hidden execution is not.

Screenshot context: Input "this is test" fails .max(4) and returns "Username too long" — but the full string appears in the server console, confirming .refine() ran on an input that exceeded the maximum length and should have been rejected before reaching the custom validator.
Why This Causes a Denial of Service
The behavior above is a curiosity with a console.log(). Replace that console.log() with a database query and it becomes an application-layer DoS vector.
```typescript
// UNSAFE — database query inside .refine()
const unsafe_schema = z.object({
  username: z.string()
    .min(2, "Username too short")
    .max(4, "Username too long")
    .refine((val) => expensiveDBCall(val), {
      message: "DB validation failed"
    }),
});
```

The attack is straightforward: send a flood of requests with an input that trivially fails .min() — a single character, or an empty string. Each request:

- Fails .min() — validation error produced
- .refine() executes anyway — database query fires
- Response returns validation error to the attacker — but the query already ran
Screenshot context: The response correctly returns "Username too short" for input "a" — but the server log shows QUERY #99: SELECT * FROM users WHERE id = 99 executed. A database query fired on an input that failed the most basic length check, before any business logic should have run.

Screenshot context: The oversized input "this_is" fails .max(4) and returns a validation error — but the server log shows QUERY #699, an incrementing query number confirming that every request, regardless of validation outcome, triggers a real database operation. Each invalid request adds to the query load.
Screenshot context: After flooding the endpoint with large invalid inputs, the server begins returning errors and becomes unavailable to legitimate users. Database connections are exhausted by the volume of queries triggered by inputs that never passed basic validation — the application-layer DoS is achieved using nothing but an HTTP client and invalid strings.
The Amplification Dynamic
The input size compounds the damage. A single-character input triggers one database query per request. A 10,000-character input triggers the same one database query per request — but it also:
- Consumes more memory to hold the string during processing
- Takes more CPU cycles to evaluate the string in the .refine() callback
- May trigger more expensive database operations if the query scales with input complexity
An attacker does not need to guess the right input. Any input that fails basic validation — undersized or oversized — triggers the expensive operation. The attack surface is the entire valid range of the input field, and the invalid range outside it.
Image context: The flow comparison shows exactly where the execution diverges — in the unsafe pattern the database query fires regardless of validation outcome, while the safe pattern gates the expensive operation behind a successful validation check.
The Safe Pattern
The fix requires understanding one principle: validation and business logic are separate phases. Validate all structural requirements first. Only execute expensive operations after validation succeeds and you have clean, known-good data.
```typescript
// SAFE — validate first, query after
const safe_schema = z.object({
  username: z.string()
    .min(2, "Username too short")
    .max(4, "Username too long")
});
// No .refine() with expensive logic here

// validation middleware
const validate = (schema) => (req, res, next) => {
  const result = schema.safeParse(req.body);
  if (!result.success) {
    const errors = result.error.issues.map((issue) => issue.message);
    return res.status(400).json({
      error: errors[0] // or return the full array if you want
    });
  }
  req.validatedData = result.data;
  next();
};

// route handler
app.post('/safe-validation', validate(safe_schema), async (req, res) => {
  const { username } = req.validatedData;
  try {
    // fake db call
    const result = await safeExpensiveDBCall(username);
    return res.json({
      message: "Safe processing completed",
      result
    });
  } catch (err) {
    return res.status(500).json({
      error: "Internal processing failed"
    });
  }
});
```

This pattern gates the database query behind successful structural validation. An attacker sending single-character or oversized inputs hits the .min()/.max() check, gets a validation error, and the function returns — no database query, no CPU expenditure beyond the string length check.
Screenshot context: Input "a" fails .min(2) and returns "Username too short" — and the server console is silent. No database query appears. The validation middleware returned the error before the route handler ever executed, demonstrating that the two-phase pattern fully gates the expensive operation.

Screenshot context: A long input exceeding .max(4) returns "Username too long" with no database query in the server log — confirming that oversized inputs are stopped at the schema validation layer before reaching any business logic, even when the input would previously have triggered a query in the unsafe pattern.

Screenshot context: A valid input that passes both .min(2) and .max(4) flows through to the route handler and the database query executes — exactly as intended. The safe pattern does not block legitimate requests; it only gates the expensive operation behind successful structural validation.
When .refine() Is Safe
Not all .refine() usage is dangerous. The risk is specifically about expensive operations with external side effects. Pure computational checks are fine:
```typescript
// SAFE — pure computation, no I/O
const PasswordSchema = z.string()
  .min(8, "Password too short")
  .refine(
    (val) => /[A-Z]/.test(val) && /[0-9]/.test(val),
    { message: "Password must contain uppercase and a number" }
  )

// SAFE — lightweight synchronous check
const DateRangeSchema = z.object({
  start: z.date(),
  end: z.date(),
}).refine(
  (data) => data.end > data.start,
  { message: "End date must be after start date" }
)

// UNSAFE — database query, API call, file I/O
const UniqueEmailSchema = z.string()
  .email()
  .refine(
    async (val) => !(await db.users.findUnique({ where: { email: val } })),
    { message: "Email already registered" }
  )
```

The rule: if .refine() makes a network call, touches a database, reads a file, or calls an external API — move it outside. If it is pure synchronous computation on the value itself — it is safe where it is.
Security Testing Workflow: Finding This in the Wild
For penetration testers and security engineers auditing applications that use Zod, the following workflow identifies endpoints vulnerable to this pattern before they are exploited in production.
Step 1 — Identify Targets
Start with endpoints that are most likely to trigger uniqueness checks or expensive lookups inside a validator:
- Registration and signup forms (username, email fields)
- Login endpoints (credential lookup)
- Search and filter endpoints (database queries per input)
- Profile update endpoints (slug, handle, or email uniqueness)
Any string field on an authenticated or unauthenticated endpoint is a candidate.
Step 2 — Baseline Measurement
Send three requests to the same endpoint and record response times:
- A valid request that passes all validation
- A request with a single-character input (fails .min())
- A request with a 10,000-character string (fails .max())
If the server is performing expensive operations inside .refine(), the response time for the invalid requests will approach the response time for the valid request — because the same database query is running in all three cases.
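A baseline harness might look like the sketch below. `sendRequest` is a hypothetical stand-in that simulates a vulnerable endpoint in-process so the script runs on its own; in a real audit you would replace it with a fetch() call to the target.

```typescript
// Baseline timing harness. `sendRequest` simulates a vulnerable server whose
// "DB query" runs on every input; swap in a real HTTP call for a live audit.
async function sendRequest(_username: string): Promise<void> {
  const dbQueryMs = 30; // simulated query cost, paid regardless of validity
  await new Promise((resolve) => setTimeout(resolve, dbQueryMs));
}

async function timeRequest(username: string): Promise<number> {
  const start = Date.now();
  await sendRequest(username);
  return Date.now() - start;
}

async function baseline() {
  const valid = await timeRequest("abc");                // passes validation
  const tooShort = await timeRequest("a");               // fails .min(2)
  const tooLong = await timeRequest("x".repeat(10_000)); // fails .max(4)
  // Vulnerable endpoint: all three times are similar (the Step 3 signal).
  // Patched endpoint: the two invalid cases drop to near-zero.
  console.log({ valid, tooShort, tooLong });
  return { valid, tooShort, tooLong };
}

baseline();
```

Against the simulated endpoint all three timings land around the same 30 ms, which is exactly the suspicious pattern to look for in the wild.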
Step 3 — Detect the Signal
Two patterns indicate the vulnerability:
- Response time for invalid inputs ≈ response time for valid inputs → expensive operation executing on all inputs regardless of validation outcome
- Response time for a large invalid input significantly exceeds a small invalid input → processing scales with input size, confirming work is being done on unvalidated data
Step 4 — Confirm with Load Test
Using Burp Suite Intruder or a simple script, send 100 concurrent requests with a single-character invalid input. Monitor server response times and error rates. If response times degrade or 500 errors begin appearing, the DoS vector is confirmed — the server is exhausting resources processing queries triggered by inputs that should have been rejected at the door.
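The flood step can be sketched as below. Everything is simulated in-process (a hypothetical 10-connection pool stands in for the real database) so the script is self-contained; in a live test, `handleRequest` would be an HTTP request carrying a single-character body.

```typescript
// Flood-test sketch with an in-process stand-in for the target server.
const POOL_SIZE = 10; // simulated DB connection pool
let inUse = 0;

// Vulnerable handler: fires a "query" even for an input that fails .min(2).
async function handleRequest(): Promise<"ok" | "error"> {
  if (inUse >= POOL_SIZE) return "error"; // pool exhausted -> 500
  inUse++;
  await new Promise((resolve) => setTimeout(resolve, 50)); // simulated query
  inUse--;
  return "ok";
}

async function flood(concurrent: number) {
  const results = await Promise.all(
    Array.from({ length: concurrent }, () => handleRequest())
  );
  const errors = results.filter((r) => r === "error").length;
  console.log(`${concurrent} requests -> ${errors} errors`);
  return errors;
}
```

With `flood(100)`, the first 10 requests occupy every simulated connection and the remaining 90 fail immediately: the degradation signature this step looks for, produced entirely by inputs that never passed validation.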
Step 5 — Verify Remediation
After the fix is applied, repeat Step 2. Invalid input response times should drop to near-zero — just the cost of a string length check, with no database round-trip. There should be no correlation between input size and response time for invalid inputs.
Common Mistakes
Assuming Zod short-circuits on first failure. The most natural mental model — validation runs, first error stops processing — is incorrect for .refine(). Zod collects all errors by default to give users a complete picture of what is wrong with their input. This is great for UX. It is dangerous for security when .refine() contains expensive operations. The assumption is wrong and the consequence of acting on that assumption is a production vulnerability.
Moving the database check to .superRefine() as the "fix." .superRefine() is .refine() with access to the refinement context — it has identical execution behavior. Switching from .refine() to .superRefine() does not fix the underlying issue. The only fix is moving expensive operations outside the schema definition entirely and executing them only after .parse() or .safeParse() succeeds.
Only applying the fix to the specific field that caused the incident. If one endpoint has this pattern, audit all endpoints. The same developer who put a database query in .refine() for username uniqueness very likely did the same for email uniqueness on the registration form, slug uniqueness in a CMS, or invite code validation. A targeted fix that leaves the pattern in place elsewhere leaves multiple DoS vectors active.
Testing only with valid and obviously invalid inputs. Security testing of input validation should always include both undersized inputs (empty string, single character), oversized inputs (kilobytes of random data), and inputs with special characters, Unicode, and null bytes. The Zod DoS specifically requires inputs that fail structural validation — developers who only test with valid inputs and cleanly formatted invalid inputs never discover that their expensive logic is running on the bad inputs too.
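A hypothetical payload set covering those input classes might look like this (the names and sizes are illustrative choices, not from any standard):

```typescript
// Illustrative test payloads for input-validation DoS probing.
function dosTestPayloads(): Record<string, string> {
  return {
    empty: "",
    single: "a",
    oversized: "x".repeat(10_000),  // kilobytes of data
    special: `'";--<script>`,       // special characters
    unicode: "名前🙂".repeat(100),   // multi-byte code points
    nullByte: "user\u0000name",     // embedded null byte
  };
}

// Each payload should produce a fast validation error and, crucially,
// no database query in the server logs if the endpoint is patched.
for (const [name, payload] of Object.entries(dosTestPayloads())) {
  console.log(name, payload.length);
}
```

Running every payload against every string field, while watching query logs, is what surfaces the expensive-logic-on-invalid-input pattern that clean-input testing misses.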
Conclusion
Zod's .refine() is not broken — it behaves exactly as designed. The vulnerability is the assumption that it does not. Every developer who puts a database query inside .refine() to check uniqueness, every engineer who calls an external API inside a validator callback, has made the same natural but incorrect assumption: that .refine() only runs when earlier validators passed. The fix requires no new dependencies and no library changes — validate structure with Zod, then execute expensive operations on the validated result. The two-phase pattern eliminates the entire attack surface.

As a pentester, the signal to look for is response time correlation with input size on fields that are likely to trigger database lookups — registration forms, username fields, email inputs. If large invalid inputs are slower than small invalid inputs, expensive operations are running on data that should have been rejected at the door.
Sources
- Zod Official Documentation — .refine() — Official Zod API reference for .refine() and .superRefine() with usage examples
- YouTube — Zod DoS Demo — Original video demonstration of this server-down scenario using Zod .refine()
- GitHub — Zod Validation Bypass Lab — Hands-on lab with the three demo endpoints (/visualize-validation, /unsafe-validation, /safe-validation) to reproduce and test the vulnerability yourself
- OWASP — Denial of Service — Application-layer DoS attack patterns and prevention guidance
- OWASP — Input Validation — Comprehensive input validation cheat sheet covering the validate-first principle
- CWE-400 — Uncontrolled Resource Consumption — MITRE's classification for resource exhaustion vulnerabilities including this pattern