Protect PII in JSON data structures. Schema-aware processing for API responses, configuration files, log data, and complex nested data with intelligent field detection.
Structured data intelligence
Understand JSON structure to intelligently identify fields likely to contain PII.
Process deeply nested objects and arrays with proper path handling.
Detect PII based on field names like "email", "phone", "ssn", "address".
Analyze field values for PII patterns regardless of field naming.
Process large JSON files and NDJSON streams efficiently.
Maintain JSON structure, data types, and formatting after redaction.
Simple integration, powerful results
Send your documents, text, or files through our secure API endpoint or web interface.
Our AI analyzes content to identify all sensitive information types with 99.7% accuracy.
Sensitive data is automatically redacted based on your configured compliance rules.
Receive your redacted content with full audit trail and compliance documentation.
Get started with just a few lines of code
import requests
api_key = "your_api_key"
url = "https://api.redactionapi.net/v1/redact"
data = {
"text": "John Smith's SSN is 123-45-6789",
"redaction_types": ["ssn", "person_name"],
"output_format": "redacted"
}
response = requests.post(url,
headers={"Authorization": f"Bearer {api_key}"},
json=data
)
print(response.json())
# Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
const axios = require('axios');
const apiKey = 'your_api_key';
const url = 'https://api.redactionapi.net/v1/redact';
const data = {
text: "John Smith's SSN is 123-45-6789",
redaction_types: ["ssn", "person_name"],
output_format: "redacted"
};
axios.post(url, data, {
headers: { 'Authorization': `Bearer ${apiKey}` }
})
.then(response => {
console.log(response.data);
// Output: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
});
curl -X POST https://api.redactionapi.net/v1/redact \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"text": "John Smith'\''s SSN is 123-45-6789",
"redaction_types": ["ssn", "person_name"],
"output_format": "redacted"
}'
# Response:
# {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
JSON has become the universal format for data exchange—APIs return JSON, applications store configuration in JSON, analytics platforms export JSON, and modern databases use JSON documents. This prevalence means PII inevitably flows through JSON data structures: user profiles in API responses, customer records in data exports, personal information in log files, and sensitive details in configuration. Protecting this data requires understanding both JSON's structural nature and the patterns of PII within it.
Our JSON processing combines structural awareness with content analysis. We parse JSON hierarchically, understanding the relationship between fields and their values. Field names provide context—a field named "email" likely contains an email address. Value patterns confirm PII presence—a string matching email format is redacted regardless of field name. This dual approach ensures comprehensive protection for JSON data regardless of schema design or naming conventions.
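The dual approach described above can be illustrated with a small sketch. It is not the service's detection logic: the field-name table, the two regular expressions, and the `redact` function are assumptions chosen to show how a key-based signal and a value-based signal can be combined in one recursive pass.

```python
import re

# Assumed, illustrative rules; not the service's actual detection logic.
PII_FIELD_NAMES = {"email": "[EMAIL]", "phone": "[PHONE]", "ssn": "[SSN]"}
VALUE_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]


def redact(node, key=None):
    """Recursively redact a parsed JSON value."""
    if isinstance(node, dict):
        return {k: redact(v, k) for k, v in node.items()}
    if isinstance(node, list):
        return [redact(item, key) for item in node]
    if isinstance(node, str):
        # Field-name signal: the key alone is enough to redact the whole value.
        if key is not None and key.lower() in PII_FIELD_NAMES:
            return PII_FIELD_NAMES[key.lower()]
        # Value signal: replace any matching substring, whatever the key is.
        for pattern, label in VALUE_PATTERNS:
            node = pattern.sub(label, node)
        return node
    return node  # numbers, booleans, and null pass through unchanged
```

In this sketch a string under a key named "email" is replaced wholesale, while a free-text field such as "notes" only has the matching substring replaced.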
Understanding JSON structure enables intelligent redaction:
// Input JSON
{
"user": {
"id": 12345,
"name": "John Smith",
"email": "[email protected]",
"phone": "+1-555-123-4567",
"address": {
"street": "123 Main St",
"city": "New York",
"state": "NY",
"zip": "10001"
},
"orders": [
{
"id": "ORD-001",
"items": ["Widget A", "Widget B"],
"notes": "Call customer at 555-987-6543"
}
]
}
}
// Redacted output
{
"user": {
"id": 12345,
"name": "[NAME]",
"email": "[EMAIL]",
"phone": "[PHONE]",
"address": {
"street": "[ADDRESS]",
"city": "[CITY]",
"state": "NY",
"zip": "[ZIP]"
},
"orders": [
{
"id": "ORD-001",
"items": ["Widget A", "Widget B"],
"notes": "Call customer at [PHONE]"
}
]
}
}
Field names indicate likely PII content:
// High-confidence PII field names
{
"email": "...",
"emailAddress": "...",
"email_address": "...",
"phone": "...",
"phoneNumber": "...",
"phone_number": "...",
"mobile": "...",
"ssn": "...",
"socialSecurityNumber": "...",
"social_security_number": "...",
"name": "...",
"firstName": "...",
"lastName": "...",
"fullName": "...",
"address": "...",
"streetAddress": "...",
"creditCard": "...",
"dob": "...",
"dateOfBirth": "...",
"birthDate": "..."
}
// Field name matching is case-insensitive
// Supports camelCase, snake_case, and variations
// Configured aliases for industry-specific terms
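One way to make field-name matching insensitive to case and naming style is to normalize each key before lookup. The sketch below assumes a small hypothetical key set (`PII_KEYS`) and a `normalize` helper; the names are illustrative and not part of the API.

```python
import re

# Hypothetical canonical names; a real deployment would carry a longer list.
PII_KEYS = {"email", "emailaddress", "phone", "phonenumber", "ssn",
            "socialsecuritynumber", "firstname", "lastname", "dateofbirth"}


def normalize(field_name: str) -> str:
    """Lowercase and drop separators so naming styles compare equal."""
    return re.sub(r"[^a-z0-9]", "", field_name.lower())


def is_pii_field(field_name: str) -> bool:
    """True when the normalized key is in the known PII key set."""
    return normalize(field_name) in PII_KEYS
```

With this normalization, "emailAddress", "email_address", and "EMAIL-ADDRESS" all resolve to the same lookup key.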
Configurable Field Lists:
{
"fieldDetection": {
"alwaysRedact": [
"$.user.ssn",
"$.customer.*.credit_card",
"$.employees[*].salary"
],
"neverRedact": [
"$.metadata.created_by",
"$.system.service_account"
],
"patterns": [
"*_email",
"*_phone",
"*.pii.*"
]
}
}
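The path lists above could be evaluated roughly as follows. This is a sketch under stated assumptions: `[*]` is taken to mean any array index, a bare `*` exactly one object key, and `neverRedact` is assumed to take precedence over `alwaysRedact`. The key-name globs under "patterns" are not covered, and `rule_to_regex` and `decide` are hypothetical helper names.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a wildcard path rule into a regex (assumed semantics)."""
    out = []
    for part in re.split(r"(\[\*\]|\*)", rule):
        if part == "[*]":
            out.append(r"\[\d+\]")    # any array index
        elif part == "*":
            out.append(r"[^.\[\]]+")  # exactly one object key
        else:
            out.append(re.escape(part))
    return re.compile("".join(out))


def decide(path: str, config: dict) -> str:
    """Return 'never', 'always', or 'detect' for a concrete path."""
    # Assumption: an explicit exclusion wins over an explicit inclusion.
    for verdict, key in (("never", "neverRedact"), ("always", "alwaysRedact")):
        if any(rule_to_regex(r).fullmatch(path) for r in config.get(key, [])):
            return verdict
    return "detect"  # fall back to name and value detection
```

Paths matched by neither list return "detect", leaving the decision to automatic detection.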
Detect PII by value patterns regardless of field name:
// PII detected in any string field
{
"description": "Contact [email protected] for details",
"notes": "Customer SSN: 123-45-6789",
"comments": "Call back at (555) 123-4567",
"reference": "Credit card ending in 4242",
"blob": "John Smith, 123 Main St, New York NY 10001"
}
// All PII detected and redacted
{
"description": "Contact [EMAIL] for details",
"notes": "Customer SSN: [SSN]",
"comments": "Call back at [PHONE]",
"reference": "Credit card ending in [CREDIT_CARD]",
"blob": "[NAME], [ADDRESS], [CITY] [STATE] [ZIP]"
}
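Value-based detection in free text can be approximated with regular expressions that report where each match occurs. The patterns below are deliberately simple assumptions (US-style phone numbers and SSNs, a loose email pattern) and will produce false positives on real data, for example on any bare 10-digit number. `scan` and `redact_text` are illustrative names.

```python
import re

# Simplified, assumed patterns; production detectors need validation and context.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}"),
}


def scan(text):
    """Return (type, start, end) for every pattern match, left to right."""
    hits = [(kind, m.start(), m.end())
            for kind, rx in PATTERNS.items() for m in rx.finditer(text)]
    return sorted(hits, key=lambda h: h[1])


def redact_text(text):
    """Replace each match with a [TYPE] label (assumes matches do not overlap)."""
    # Replace from the right so earlier offsets stay valid.
    for kind, start, end in reversed(scan(text)):
        text = text[:start] + f"[{kind.upper()}]" + text[end:]
    return text
```

Keeping the offsets from `scan` also makes it possible to report detections alongside the redacted text.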
Handle complex nested structures:
// Deeply nested data
{
"company": {
"departments": [
{
"name": "Engineering",
"teams": [
{
"name": "Backend",
"members": [
{
"employee": {
"personal": {
"name": "John Smith",
"contact": {
"email": "[email protected]",
"phone": "555-1234"
}
}
}
}
]
}
]
}
]
}
}
// Field paths tracked:
// $.company.departments[0].teams[0].members[0].employee.personal.name
// $.company.departments[0].teams[0].members[0].employee.personal.contact.email
// $.company.departments[0].teams[0].members[0].employee.personal.contact.phone
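Path tracking of this kind can be sketched as a recursive generator over the parsed document. The function name `iter_paths` is an assumption, and keys that themselves contain dots or brackets would need bracket-quoted notation, which this sketch does not handle.

```python
def iter_paths(node, path="$"):
    """Yield (jsonpath, value) for every scalar leaf in a parsed JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_paths(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_paths(value, f"{path}[{index}]")
    else:
        yield path, node  # string, number, boolean, or null
```

Because arrays and objects are handled by the same recursion, mixed arrays of strings, objects, numbers, and nulls need no special casing.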
Process arrays of values and objects:
// Array of strings
{
"emails": ["[email protected]", "[email protected]", "[email protected]"]
}
// → ["[EMAIL]", "[EMAIL]", "[EMAIL]"]
// Array of objects
{
"contacts": [
{"name": "John", "email": "[email protected]"},
{"name": "Jane", "email": "[email protected]"}
]
}
// Each object processed independently
// Mixed arrays
{
"data": [
"[email protected]",
{"type": "phone", "value": "555-1234"},
123,
null
]
}
// Each element handled by type
Efficiently process large JSON data:
Standard JSON Streaming:
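// For large single JSON objects/arrays
// Uses a streaming parser (no full DOM load)
// StreamingJsonParser, shouldRedact, redact, and outputStream are
// illustrative placeholders, not part of a published SDK
const fs = require('fs');
const stream = fs.createReadStream('large-data.json');
const parser = new StreamingJsonParser();
// Called once per parsed value together with its path
parser.on('value', async (path, value) => {
if (shouldRedact(path, value)) {
const redacted = await redact(value);
outputStream.write(path, redacted);
} else {
outputStream.write(path, value);
}
});
stream.pipe(parser);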
NDJSON Processing:
// Newline-delimited JSON (JSON Lines)
// One JSON object per line
// Input: logs.ndjson
{"timestamp": "2024-01-15", "user": "[email protected]", "action": "login"}
{"timestamp": "2024-01-15", "user": "[email protected]", "action": "logout"}
{"timestamp": "2024-01-15", "user": "[email protected]", "action": "purchase"}
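// Processing (redactJson and outputStream are illustrative placeholders)
const fs = require('fs');
const readline = require('readline');
async function redactNdjson() {
const rl = readline.createInterface({
input: fs.createReadStream('logs.ndjson'),
crlfDelay: Infinity // treat \r\n as a single line break
});
for await (const line of rl) {
if (!line.trim()) continue; // skip blank lines
const obj = JSON.parse(line);
const redacted = await redactJson(obj);
outputStream.write(JSON.stringify(redacted) + '\n');
}
}
redactNdjson().catch(console.error);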
Batch API for NDJSON:
POST /v1/redact/ndjson
Content-Type: application/x-ndjson
{"user": {"email": "[email protected]"}}
{"user": {"email": "[email protected]"}}
{"user": {"email": "[email protected]"}}
Response (streaming):
{"user": {"email": "[EMAIL]"}}
{"user": {"email": "[EMAIL]"}}
{"user": {"email": "[EMAIL]"}}
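The same line-by-line approach can be sketched in Python. `redact_ndjson` accepts any iterable of lines, so only one record is held in memory at a time; `mask_user` is a hypothetical stand-in for a real redaction call and only masks a top-level "user" field.

```python
import json


def redact_ndjson(lines, redact_record):
    """Yield one redacted JSON line per non-empty input line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.dumps(redact_record(json.loads(line)))


def mask_user(record):
    # Stand-in redactor (hypothetical): masks the top-level "user" field only.
    return {**record, "user": "[EMAIL]"} if "user" in record else record
```

Passing an open file object as `lines` streams a log file record by record, and each yielded string can be written to the output followed by a newline.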
Maintain JSON data types after redaction:
// Input with various types
{
"string_field": "[email protected]",
"number_field": 123456789,
"boolean_field": true,
"null_field": null,
"array_field": ["a", "b", "c"],
"object_field": {"key": "value"}
}
// Redaction options for type handling
{
"typePreservation": {
"strings": "redact_inline", // "[EMAIL]"
"numbers": "null", // null (can't redact inline)
"preserveStructure": true // keep object/array structure
}
}
// Output
{
"string_field": "[EMAIL]",
"number_field": null, // Number couldn't be preserved
"boolean_field": true, // Unchanged (not PII)
"null_field": null, // Unchanged
"array_field": ["a", "b", "c"], // Unchanged (no PII)
"object_field": {"key": "value"} // Unchanged (no PII)
}
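The type-handling rule above comes down to choosing a replacement that matches the original type. The sketch below assumes two number policies, "null" and "zero"; the function name, the "zero" policy name, and the decision to leave booleans untouched are illustrative assumptions.

```python
def replace_value(value, label, number_policy="null"):
    """Return a type-appropriate replacement for a value flagged as PII."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return value  # booleans are left unchanged in this sketch
    if isinstance(value, (int, float)):
        # A label such as "[SSN]" is a string, so a number cannot carry it.
        return None if number_policy == "null" else 0
    if isinstance(value, str):
        return f"[{label}]"
    return value  # containers are handled by the caller's traversal
```

Returning `None` serializes to JSON `null`, so the output remains valid JSON under either policy.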
Configure redaction based on known schemas:
// JSON Schema with PII annotations
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"id": {
"type": "integer",
"x-pii": false
},
"email": {
"type": "string",
"format": "email",
"x-pii": true,
"x-pii-type": "email"
},
"metadata": {
"type": "object",
"x-pii-scan": false // Skip scanning this subtree
}
}
}
}
}
// Use schema for precise field-level control
{
"schema": "https://example.com/schemas/user.json",
"useSchemaAnnotations": true
}
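Schema annotations like these can be turned into a list of paths to redact by walking the schema itself. The sketch below assumes the `x-pii`, `x-pii-type`, and `x-pii-scan` keywords behave as the example suggests; it follows only `properties` and `items` and ignores `$ref`, `allOf`, and similar keywords. `pii_paths` is a hypothetical helper name.

```python
def pii_paths(schema, path="$"):
    """Collect {path: pii_type} for properties annotated with x-pii: true."""
    found = {}
    if schema.get("x-pii-scan") is False:
        return found  # annotated subtree is skipped entirely
    if schema.get("x-pii") is True:
        found[path] = schema.get("x-pii-type", "unknown")
    for name, sub in schema.get("properties", {}).items():
        found.update(pii_paths(sub, f"{path}.{name}"))
    if isinstance(schema.get("items"), dict):
        found.update(pii_paths(schema["items"], f"{path}[*]"))
    return found
```

Applied to the schema above, this sketch would yield a single entry for `$.user.email`, with the metadata subtree skipped.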
Use JSONPath for precise field selection:
// JSONPath examples
{
"redactPaths": [
"$.user.email", // Specific field
"$.users[*].email", // All user emails
"$.orders[*].customer.*", // All customer fields in orders
"$..email", // All email fields anywhere
"$.data[?(@.type=='personal')]", // Conditional selection
"$.users[0:10].ssn" // Array slice
],
"excludePaths": [
"$.metadata.*", // Skip metadata
"$.system.service_account" // Skip service accounts
]
}
Replace PII with consistent tokens for analytics:
// Tokenization preserves referential integrity
{
"users": [
{"id": 1, "email": "[email protected]", "manager_email": "[email protected]"},
{"id": 2, "email": "[email protected]", "manager_email": "[email protected]"}
]
}
// With tokenization
{
"users": [
{"id": 1, "email": "tok_abc123", "manager_email": "tok_def456"},
{"id": 2, "email": "tok_def456", "manager_email": "tok_ghi789"}
]
}
// Note: the same address (user 1's manager_email and user 2's email) → tok_def456 consistently
// Enables: JOIN operations, relationship analysis
// Prevents: PII exposure
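Consistent tokens of this kind are commonly derived with a keyed hash, so the same input always maps to the same token without storing a lookup table. The sketch below uses HMAC-SHA256 from the Python standard library. The `tok_` prefix mirrors the example above, but the function name, the token length, and the choice to trim and lowercase values before hashing are assumptions, not the service's documented scheme.

```python
import hashlib
import hmac


def tokenize(value: str, secret: bytes, length: int = 12) -> str:
    """Map a value to a stable token; same value and key give the same token."""
    # Assumption: case and surrounding whitespace are not significant.
    normalized = value.strip().lower()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"tok_{digest[:length]}"
```

Because the token depends on the secret key, rotating the key changes every token, and anyone holding the key can test guesses against tokens. Shorter tokens also raise the chance of collisions, so the length is a trade-off.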
// Redact JSON data
POST /v1/redact
Content-Type: application/json
{
"json": {
"customer": {
"name": "John Smith",
"email": "[email protected]",
"orders": [
{"id": "123", "amount": 99.99}
]
}
},
"options": {
"format": "json",
"fieldDetection": true,
"valueDetection": true,
"preserveTypes": true,
"outputFormat": "json"
}
}
Response:
{
"redacted_json": {
"customer": {
"name": "[NAME]",
"email": "[EMAIL]",
"orders": [
{"id": "123", "amount": 99.99}
]
}
},
"detections": [
{"path": "$.customer.name", "type": "name", "confidence": 0.95},
{"path": "$.customer.email", "type": "email", "confidence": 0.99}
]
}
API Response Logging:
// Redact before logging API responses
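app.use((req, res, next) => {
const originalJson = res.json.bind(res);
res.json = (data) => {
// Log a redacted copy in the background so a redaction failure
// never delays or blocks the response
redactionClient.redactJson(data)
.then((redacted) => logger.info('API Response', { body: redacted }))
.catch((err) => logger.error('Redaction failed; response body not logged', { error: err.message }));
// Send original to client
return originalJson(data);
};
next();
});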
Data Export Sanitization:
// Process data exports before distribution
async function exportUserData(userId) {
const userData = await db.getFullUserRecord(userId);
// Redact for external report
const redactedData = await redactionClient.redactJson(userData, {
outputFormat: 'json',
includeReplacementValues: true
});
return redactedData;
}
Analytics Data Preparation:
// Prepare JSON for analytics pipeline
const pipeline = [
readJsonStream('events.ndjson'),
redactStream({ types: ['pii'] }),
writeToAnalytics('redacted_events')
];
await runPipeline(pipeline);
We use two approaches: field name analysis (detecting keys like "email", "phone_number", "ssn") and value pattern matching (identifying emails, phone numbers, etc. in any field). Combining both ensures comprehensive detection regardless of naming conventions.
Nested objects and arrays are processed recursively to any depth. Field paths are tracked (e.g., "$.user.contact.email") for accurate redaction reporting, and each object in a nested array is processed individually.
Large JSON files are processed using streaming parsers that don't require loading the entire file into memory. For NDJSON (newline-delimited JSON) files, each line is processed independently, enabling efficient handling of log files and data exports.
NDJSON files (one JSON object per line) are fully supported. Each line is parsed and processed independently, making this format ideal for log files, event streams, and large data exports.
JSON data types are maintained after redaction. If a numeric field is redacted, the replacement can be a number or null. String fields remain strings, and arrays and objects keep their structure. Output is always valid JSON.
You can provide explicit field paths to always redact, whitelist paths to never redact, or use patterns for flexible matching. This gives precise control over redaction behavior for known schemas.