
Hardening Rust Deserialization with Serde: Validating Untrusted Input Safely
Why deserialization needs security controls
Deserialization is not just a data conversion step. It is where attacker-controlled bytes become structured values your code may trust. Common risks include:
- Mass assignment in business logic: accepting fields that should never be user-controlled, such as roles or internal flags
- Resource exhaustion: deeply nested or oversized payloads causing excessive CPU or memory use
- Unexpected defaults: missing fields silently falling back to unsafe values
- Overly permissive schemas: accepting extra fields that should be rejected
- Unsafe direct mapping: deserializing directly into internal types and bypassing the invariants those types are supposed to uphold
Serde is flexible, but that flexibility means you must define your trust model explicitly.
A safe pattern: deserialize into a boundary type, then validate
A strong pattern is:
- Deserialize into a boundary DTO that mirrors the external format.
- Validate and normalize the DTO.
- Convert it into an internal domain type with strict invariants.
This keeps untrusted input separate from trusted application state.
Example: user registration payload
Suppose an API accepts a registration request:
```rust
use serde::Deserialize;
use std::convert::TryFrom;

/// Trusted domain type: only built through `TryFrom<NewUserRequest>`.
#[derive(Debug)]
struct NewUser {
    username: String,
    age: u8,
    marketing_opt_in: bool,
}

/// Boundary DTO: mirrors the wire format and carries no guarantees.
#[derive(Debug, Deserialize)]
struct NewUserRequest {
    username: String,
    age: u8,
    // A missing opt-in falls back to `false`, which is the safe direction.
    #[serde(default)]
    marketing_opt_in: bool,
}

impl TryFrom<NewUserRequest> for NewUser {
    type Error = String;

    fn try_from(req: NewUserRequest) -> Result<Self, Self::Error> {
        let username = req.username.trim();
        // `len()` counts bytes. It matches the character count here only
        // because the check below rejects everything that is not ASCII.
        if username.len() < 3 || username.len() > 32 {
            return Err("username must be 3..=32 characters".into());
        }
        if !username.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
            return Err("username contains invalid characters".into());
        }
        if req.age < 13 {
            return Err("users must be at least 13 years old".into());
        }
        Ok(NewUser {
            username: username.to_owned(),
            age: req.age,
            marketing_opt_in: req.marketing_opt_in,
        })
    }
}
```

This design prevents the rest of the codebase from accidentally using an invalid username or underage account. The DTO is allowed to be incomplete or messy; the domain type is not.
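
One way to make that guarantee harder to bypass is to give the domain type private fields, so code outside its module cannot construct a value without going through validation. The following is a minimal sketch using only the standard library; the module name domain and the Username newtype are hypothetical and not part of the example above.

```rust
mod domain {
    use std::convert::TryFrom;

    /// A validated username. The inner field is private, so code outside
    /// this module can only obtain a value through `TryFrom<String>`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Username(String);

    impl Username {
        /// Read-only access to the validated value.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }

    impl TryFrom<String> for Username {
        type Error = String;

        fn try_from(raw: String) -> Result<Self, Self::Error> {
            let trimmed = raw.trim();
            if !trimmed.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
                return Err("username contains invalid characters".into());
            }
            // Counting bytes is safe here: only ASCII remains at this point.
            if trimmed.len() < 3 || trimmed.len() > 32 {
                return Err("username must be 3..=32 characters".into());
            }
            Ok(Username(trimmed.to_owned()))
        }
    }
}

use domain::Username;
use std::convert::TryFrom;
```

A domain struct that stores a Username instead of a String then carries the check in its type, and any function that receives a Username can skip re-validating it.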
Rejecting unknown fields
By default, structs that derive Deserialize ignore unknown fields in self-describing formats such as JSON. That can be dangerous when an attacker sends fields your code does not expect, especially if a later refactor adds a field with the same name and starts honoring it.
Use deny_unknown_fields for strict schemas:
```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct LoginRequest {
    username: String,
    password: String,
}
```

This is especially useful for security-sensitive inputs such as:
- authentication requests
- privilege changes
- payment instructions
- role assignments
- configuration files
If you need forward compatibility, consider a versioned schema instead of silently accepting unknown data.
Enforcing required fields and avoiding unsafe defaults
Defaults are convenient, but they can hide missing data. For security-sensitive fields, prefer explicit presence unless a default is truly safe.
Good defaults versus risky defaults
| Field type | Safer approach | Risky approach |
|---|---|---|
| Authentication flags | Require explicit input | Defaulting to true or privileged mode |
| Authorization roles | Validate against allowlist | Defaulting to admin-like access |
| Limits and quotas | Require explicit configuration | Defaulting to unlimited |
| Booleans affecting security | Make them explicit | Using #[serde(default)] casually |
Use #[serde(default)] only when the default is harmless and well understood. For example, a cosmetic preference may be fine; a permission flag usually is not.
Limiting input size before deserialization
Serde does not cap input size on its own. If you deserialize unbounded input, an attacker can force large allocations or expensive parsing.
Always enforce size limits before parsing:
```rust
use std::io::{self, Read};

/// Reads at most `max_bytes` from `reader`, failing if more data is available.
fn read_limited(reader: impl Read, max_bytes: usize) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Allow one extra byte so an oversized payload can be detected
    // without reading the rest of it.
    let mut limited = reader.take((max_bytes as u64).saturating_add(1));
    limited.read_to_end(&mut buf)?;
    if buf.len() > max_bytes {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "payload too large"));
    }
    Ok(buf)
}
```

Then deserialize from the bounded buffer:

```rust
use serde_json::from_slice;

// `LoginRequest` is the strict struct defined earlier.
fn parse_request(bytes: &[u8]) -> Result<LoginRequest, serde_json::Error> {
    from_slice(bytes)
}
```

In web services, enforce limits at multiple layers:
- reverse proxy request size limits
- framework body size limits
- application-level byte caps
- field-level length validation
Defense in depth matters because one layer may be misconfigured.
Preventing deeply nested or pathological inputs
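Some formats can be used to create deeply nested structures that consume stack or CPU. JSON, YAML, and similar formats may be vulnerable to this kind of abuse depending on the parser and configuration. serde_json, for example, enforces a recursion limit of 128 levels by default, and lifting it requires opting in to the unbounded_depth feature.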
Practical mitigations include:
- setting parser depth limits when available
- rejecting excessively nested structures at the application layer
- avoiding recursive domain models unless necessary
- preferring flat schemas for external input
If your input format supports recursion, consider whether the feature is actually needed. Many APIs can use arrays of bounded objects instead of arbitrary trees.
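
When the parser's own limit is looser than you want, an application-layer check can run before parsing. The following is a rough sketch that assumes JSON input; the function names and the idea of a caller-chosen limit are illustrative. It only counts brackets outside string literals and does not validate the document, so the real parser still has to accept the input afterwards.

```rust
/// Returns the maximum nesting depth of objects and arrays in a JSON
/// document, ignoring brackets that appear inside string literals.
fn max_json_depth(input: &[u8]) -> usize {
    let mut depth = 0usize;
    let mut max_depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for &byte in input {
        if in_string {
            if escaped {
                // The byte after a backslash never ends the string.
                escaped = false;
            } else if byte == b'\\' {
                escaped = true;
            } else if byte == b'"' {
                in_string = false;
            }
            continue;
        }
        match byte {
            b'"' => in_string = true,
            b'{' | b'[' => {
                depth += 1;
                max_depth = max_depth.max(depth);
            }
            // Saturate so unbalanced input cannot underflow the counter.
            b'}' | b']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max_depth
}

/// Rejects payloads nested more deeply than `limit`.
fn check_depth(input: &[u8], limit: usize) -> Result<(), &'static str> {
    if max_json_depth(input) > limit {
        Err("payload is nested too deeply")
    } else {
        Ok(())
    }
}
```

The scan is linear in the payload size and allocates nothing, so it is cheap to run on a buffer that has already been size-limited.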
Validating strings, numbers, and collections
Security issues often come from accepting values that are syntactically valid but semantically dangerous.
Strings
Validate:
- minimum and maximum length
- allowed character set
- normalization rules
- trimming behavior
- case sensitivity expectations
Example:
```rust
fn validate_tag(tag: &str) -> Result<(), &'static str> {
    if tag.is_empty() || tag.len() > 20 {
        return Err("tag length out of range");
    }
    if !tag.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-') {
        return Err("tag contains invalid characters");
    }
    Ok(())
}
```

Numbers
Validate:
- range boundaries
- sign constraints
- whether zero is allowed
- whether the value is a count, index, or identifier
Do not rely on type width alone. A u32 can still be an invalid business value.
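
As a sketch of that point, the newtype below encodes a made-up business rule (an order line holds 1 to 1,000 items) on top of a plain u32. The names Quantity, MAX_QUANTITY, and line_total_cents are hypothetical.

```rust
/// Hypothetical business rule: an order line must contain 1..=1_000 items.
const MAX_QUANTITY: u32 = 1_000;

/// A quantity that has passed the range check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Quantity(u32);

impl Quantity {
    fn new(raw: u32) -> Result<Self, &'static str> {
        if raw == 0 {
            return Err("quantity must be at least 1");
        }
        if raw > MAX_QUANTITY {
            return Err("quantity exceeds the per-line maximum");
        }
        Ok(Quantity(raw))
    }

    fn get(self) -> u32 {
        self.0
    }
}

/// Total price in cents. Checked multiplication returns `None` on
/// overflow instead of wrapping or panicking.
fn line_total_cents(quantity: Quantity, unit_price_cents: u64) -> Option<u64> {
    unit_price_cents.checked_mul(u64::from(quantity.get()))
}
```

Every u32 deserializes successfully, but only values inside the business range become a Quantity, and arithmetic on the result is checked rather than assumed to fit.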
Collections
Validate:
- maximum item count
- uniqueness
- per-item constraints
- aggregate constraints
For example, a list of allowed IPs or email addresses should be checked item by item and also as a whole.
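
A minimal sketch of the IP allowlist case, using only the standard library, might look like this. The cap of 64 entries and the function name are assumptions chosen for illustration.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Hypothetical cap on the number of allowlist entries.
const MAX_ALLOWED_IPS: usize = 64;

/// Validates the list as a whole (non-empty, bounded, unique) and each
/// entry on its own (must parse as an IP address).
fn validate_allowed_ips(raw: &[String]) -> Result<Vec<IpAddr>, String> {
    if raw.is_empty() {
        return Err("allowlist must not be empty".into());
    }
    if raw.len() > MAX_ALLOWED_IPS {
        return Err(format!("allowlist has more than {} entries", MAX_ALLOWED_IPS));
    }

    let mut seen = HashSet::with_capacity(raw.len());
    let mut parsed = Vec::with_capacity(raw.len());
    for entry in raw {
        let ip: IpAddr = entry
            .trim()
            .parse()
            .map_err(|_| format!("invalid IP address: {:?}", entry))?;
        // Uniqueness is checked on the parsed value, so different
        // spellings of the same address count as duplicates.
        if !seen.insert(ip) {
            return Err(format!("duplicate IP address: {}", ip));
        }
        parsed.push(ip);
    }
    Ok(parsed)
}
```

Checking the count before parsing any entry keeps the per-item work bounded, and comparing parsed addresses rather than raw strings catches duplicates such as ::1 and 0:0:0:0:0:0:0:1.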
Using custom deserializers for constrained fields
Serde lets you attach custom deserializers to fields that need strict parsing. This is useful for dates, identifiers, and enums where the external representation must be tightly controlled.
Example: bounded string length during deserialization
```rust
use serde::de::{self, Deserializer};
use serde::Deserialize;

fn deserialize_username<'de, D>(deserializer: D) -> Result<String, D::Error>
where
    D: Deserializer<'de>,
{
    let s = String::deserialize(deserializer)?;
    let s = s.trim();
    if s.len() < 3 || s.len() > 32 {
        return Err(de::Error::custom("username must be 3..=32 characters"));
    }
    if !s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err(de::Error::custom("invalid username characters"));
    }
    Ok(s.to_owned())
}

#[derive(Debug, Deserialize)]
struct AccountInput {
    #[serde(deserialize_with = "deserialize_username")]
    username: String,
}
```

This approach pushes validation closer to the input boundary and avoids forgetting to validate later.
Prefer explicit enums over free-form strings
Free-form strings are a common source of security bugs because they invite typos, unexpected casing, and unhandled variants. Use enums when the set of values is known.
```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(rename_all = "snake_case")]
enum AccessLevel {
    ReadOnly,
    Standard,
    Admin,
}
```

This makes invalid values fail fast during parsing rather than later in business logic.
If you need to accept legacy values, handle them intentionally with aliases rather than broad string matching.
Handling optional fields carefully
Optional fields are useful, but they can make security decisions ambiguous. A missing field should not mean “safe” unless that is explicitly documented and tested.
Consider this pattern:
```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct FeatureToggleRequest {
    // A missing `enabled` key deserializes to `None`.
    enabled: Option<bool>,
}
```

If enabled is None, the caller has not made an explicit choice. In security-sensitive contexts, that may be unacceptable. Prefer:
- required fields for critical decisions
- explicit tri-state enums when “unset” is meaningful
- separate endpoints for different actions
For example, Option<bool> is often less clear than:
```rust
#[derive(Debug, Deserialize)]
enum ToggleState {
    Enable,
    Disable,
}
```

Testing deserialization boundaries
Security hardening is incomplete without tests. Focus on the boundary behavior, not just the happy path.
Test cases to include
- valid minimal payloads
- missing required fields
- unknown fields
- oversized strings
- invalid character sets
- out-of-range numbers
- deeply nested structures
- duplicate or conflicting values
Example test:
```rust
#[cfg(test)]
mod tests {
    use super::*;
    use serde_json::from_str;

    #[test]
    fn rejects_short_username() {
        let json = r#"{"username":"ab","age":20}"#;
        let req: NewUserRequest = from_str(json).unwrap();
        assert!(NewUser::try_from(req).is_err());
    }

    #[test]
    fn rejects_unknown_fields() {
        let json = r#"{"username":"alice","password":"x","age":20}"#;
        let result: Result<LoginRequest, _> = from_str(json);
        assert!(result.is_err());
    }
}
```

Fuzz testing is also valuable for parsers and deserializers. Even if your application logic is correct, malformed inputs can still expose parser edge cases or resource issues.
Choosing the right validation strategy
Different inputs require different levels of strictness. The table below summarizes common options.
| Strategy | Best for | Benefit | Trade-off |
|---|---|---|---|
| deny_unknown_fields | Security-sensitive schemas | Rejects unexpected data | Less forward compatible |
| DTO + TryFrom | Business objects with invariants | Clear separation of trust | Requires extra conversion step |
| Custom field deserializers | Constrained fields | Validates at parse time | More code per field |
| Size limits before parsing | All untrusted input | Prevents resource abuse | Requires plumbing at I/O boundary |
| Enums instead of strings | Fixed value sets | Prevents invalid states | Less flexible for extensions |
In practice, combine several of these rather than choosing only one.
A practical checklist for secure deserialization
Before shipping a feature that accepts serialized input, verify the following:
- The input size is bounded before parsing.
- Unknown fields are rejected when appropriate.
- Required fields are truly required.
- Security-sensitive defaults are not implicit.
- External DTOs are converted into internal domain types.
- Strings, numbers, and collections are validated for business rules.
- Enums are used instead of free-form strings where possible.
- Tests cover malformed, oversized, and unexpected payloads.
Conclusion
Secure deserialization in Rust is less about a single library feature and more about disciplined boundary design. Serde gives you powerful tools, but the safest approach is to treat all external input as untrusted until it has been parsed, constrained, validated, and converted into a trusted internal type.
If you make the boundary strict, keep domain types invariant-safe, and test malformed inputs aggressively, you can stop a whole class of input-handling bugs before they reach production.
