RedactionAPI.net
Portuguese Text Redaction
99.7% Accuracy
70+ Data Types

Detect and redact PII from Portuguese text with support for Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT). Native language processing for names, addresses, and regional identifiers.

Enterprise Security
Real-Time Processing
Compliance Ready
260M+ Speakers
9 Countries
2 Variants
99% Accuracy

Portuguese Language Features

Native Portuguese NLP

Portuguese Names

Detect names with compound surnames common in Portuguese-speaking cultures.

Regional Variants

Support Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT) differences.

Regional IDs

Detect CPF, NIF, BI, and other country-specific identifiers.

Address Formats

Parse Brazilian and Portuguese address structures and postal codes.

Phone Numbers

Recognize phone formats across Portuguese-speaking countries.

Compliance

Support LGPD (Brazil) and GDPR (Portugal) requirements.

How It Works

Simple integration, powerful results

01

Upload Content

Send your documents, text, or files through our secure API endpoint or web interface.

02

AI Detection

Our AI analyzes content to identify all sensitive information types with 99.7% accuracy.

03

Smart Redaction

Sensitive data is automatically redacted based on your configured compliance rules.

04

Secure Delivery

Receive your redacted content with full audit trail and compliance documentation.

Easy API Integration

Get started with just a few lines of code

  • RESTful API with JSON responses
  • SDKs for Python, Node.js, Java, Go
  • Webhook support for async processing
  • Sandbox environment for testing
redaction_api.py
import requests

api_key = "your_api_key"
url = "https://api.redactionapi.net/v1/redact"

data = {
    "text": "John Smith's SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
}

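response = requests.post(
    url,
    headers={"Authorization": f"Bearer {api_key}"},
    json=data,
    timeout=30,  # avoid hanging indefinitely on network problems
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body

print(response.json())
# Example response: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}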
Node.js
const axios = require('axios');

const apiKey = 'your_api_key';
const url = 'https://api.redactionapi.net/v1/redact';

const data = {
    text: "John Smith's SSN is 123-45-6789",
    redaction_types: ["ssn", "person_name"],
    output_format: "redacted"
};

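axios.post(url, data, {
    headers: { 'Authorization': `Bearer ${apiKey}` }
})
.then(response => {
    console.log(response.data);
    // Example response: {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
})
.catch(error => {
    // Log HTTP and network failures instead of leaving the rejection unhandled
    console.error(error.message);
});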
cURL
# The apostrophe in "Smith's" is written as '\'' so it does not end the single-quoted JSON body
curl -X POST https://api.redactionapi.net/v1/redact \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "John Smith'\''s SSN is 123-45-6789",
    "redaction_types": ["ssn", "person_name"],
    "output_format": "redacted"
  }'

# Response:
# {"redacted_text": "[PERSON_NAME]'s SSN is [SSN_REDACTED]"}
SSL Encrypted
<500ms Response

Portuguese Language PII Detection

Portuguese is spoken by over 260 million people across four continents, with Brazil representing the largest Portuguese-speaking population by far, followed by Portugal, Mozambique, Angola, and other lusophone nations. While these countries share a common language, they differ significantly in vocabulary, spelling conventions, naming practices, identifier formats, and address structures. Effective Portuguese PII detection must account for these regional variations while maintaining consistent protection across all variants.

Our Portuguese language processing handles both major variants—Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT)—with dedicated models for each. We recognize regional naming conventions, validate country-specific identifiers with appropriate algorithms, and parse address formats used in each country. This enables comprehensive PII protection for Portuguese-language documents regardless of their origin.

Regional Variants

Key differences between Brazilian and European Portuguese:

Vocabulary Differences:

PII-related terms:
Identity document: carteira de identidade (BR) vs bilhete de identidade (PT)
Cell phone: celular (BR) vs telemóvel (PT)
Address: endereço (both, but formatting differs)
Personal data: dados pessoais (both)
Passport: passaporte (both)
Driver's license: carteira de motorista (BR) vs carta de condução (PT)

// Detection handles both vocabularies
"Telemóvel: 912 345 678" → [PHONE] (Portugal)
"Celular: (11) 99999-8888" → [PHONE] (Brazil)

Spelling Differences:

// Post-1990 agreement variations
ação (BR) vs acção (PT, older)
anônimo (BR) vs anónimo (PT)
fato (BR) vs facto (PT)

// Names may follow either convention
Both spellings recognized for matching

Portuguese Name Detection

Portuguese naming conventions across regions:

Brazilian Names:

Structure: Given name(s) + Maternal surname + Paternal surname
Example: Maria Eduarda Silva Santos

Components:
- Given names: Often of religious, Italian, or Portuguese origin
- Multiple given names common
- Surnames: Usually maternal then paternal
- Sometimes only one surname

Common patterns:
João Pedro Oliveira Costa
Ana Carolina Ferreira Lima
José Carlos da Silva
Maria das Graças Santos

Portuguese (European) Names:

Structure: Same order as Brazilian names (given names, then surnames), but with a different set of common names
Example: António Manuel Rodrigues Sousa

Components:
- Given names: Traditional Portuguese names
- Surnames: Often preceded by "de", "da", "dos"
- Nobility particles preserved

Common patterns:
António José Almeida
Maria João Carvalho
Pedro Miguel Teixeira da Costa
Ana Sofia Santos e Silva

Name Detection Approach:

// Comprehensive name databases
Brazilian first names: João, José, Maria, Ana, Pedro, Paulo...
Portuguese first names: António, Manuel, Joaquim, Rui...
Shared surnames: Silva, Santos, Oliveira, Souza/Sousa...

// Context indicators
Nome: [name]
Nome completo: [full name]
Cliente: [name]
Sr./Sra./Dr./Dra. [name]
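
The context indicators above can be illustrated with a small pattern-based pre-filter. This is a minimal sketch, not the service's actual detector: the helper name `find_names_by_context`, the label list, and the particle list are assumptions chosen for illustration, and a real system would combine such patterns with name databases and statistical models.

```python
import re

# Labels and honorifics that commonly precede a name (from the list above).
LABELS = r"(?:Nome completo|Nome|Cliente)\s*:\s*"
HONORIFICS = r"(?:Sra|Sr|Dra|Dr)\.\s*"

# A capitalized word, allowing Portuguese accented letters.
WORD = r"[A-ZÁÀÂÃÉÊÍÓÔÕÚÇ][a-záàâãéêíóôõúç]+"
# Lowercase particles that may link name parts: "da Silva", "Santos e Silva".
PARTICLE = r"(?:de|da|do|das|dos|e)"

# One or more words on the same line, optionally joined by a particle.
NAME = rf"{WORD}(?:[ \t]+(?:{PARTICLE}[ \t]+)?{WORD})*"
PATTERN = re.compile(rf"(?:{LABELS}|{HONORIFICS})({NAME})")


def find_names_by_context(text: str) -> list[str]:
    """Return name candidates that follow a known label or honorific."""
    return [m.group(1) for m in PATTERN.finditer(text)]
```

Patterns like this only find names that appear next to an explicit cue; names in free-running text need the database and model-based detection described above.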

Country-Specific Identifiers

Brazilian Identifiers:

CPF (Cadastro de Pessoas Físicas):
Format: XXX.XXX.XXX-XX (11 digits)
Example: 123.456.789-09
Validation: Modulo 11 checksum

CNPJ (Cadastro Nacional da Pessoa Jurídica):
Format: XX.XXX.XXX/XXXX-XX (14 digits)
Example: 11.222.333/0001-81
Validation: Modulo 11 checksum

RG (Registro Geral):
Format: Varies by state
Example: 12.345.678-9 (SSP-SP)
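
The modulo 11 checks for CPF and CNPJ can be sketched as follows. This is an illustrative implementation of the publicly documented check-digit rules, not the service's own code; the function names are hypothetical, and the sketch handles only the all-numeric formats shown above.

```python
import re


def _digits(value: str) -> list[int]:
    """Drop punctuation such as '.', '-' and '/' and return the digits."""
    return [int(ch) for ch in re.sub(r"[^0-9]", "", value)]


def _check_digit(digits: list[int], weights: list[int]) -> int:
    """Modulo 11 check digit: remainders 0 and 1 map to 0, else 11 - remainder."""
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder


def is_valid_cpf(cpf: str) -> bool:
    d = _digits(cpf)
    if len(d) != 11 or len(set(d)) == 1:  # reject 000.000.000-00 and similar
        return False
    first = _check_digit(d[:9], list(range(10, 1, -1)))    # weights 10..2
    second = _check_digit(d[:10], list(range(11, 1, -1)))  # weights 11..2
    return d[9] == first and d[10] == second


def is_valid_cnpj(cnpj: str) -> bool:
    d = _digits(cnpj)
    if len(d) != 14 or len(set(d)) == 1:
        return False
    w1 = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
    w2 = [6] + w1
    return d[12] == _check_digit(d[:12], w1) and d[13] == _check_digit(d[:13], w2)
```

Both example numbers above pass these checks. Validating the checksum before redacting helps separate real identifiers from arbitrary digit runs that happen to share the format.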

Portuguese Identifiers:

NIF (Número de Identificação Fiscal):
Format: XXXXXXXXX (9 digits)
Example: 123456789
First digit indicates entity type:
- 1, 2, 3: Individual
- 5: Corporate
- 6: Public administration

CC (Cartão de Cidadão):
Format: XXXXXXXX X XXX (12 characters)
Example: 12345678 9 ZZ4
Components: 8-digit civil ID number + check digit + 2-character version + final check digit

NISS (Número de Identificação de Segurança Social):
Format: XXXXXXXXXXX (11 digits)
Example: 12345678901
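
A NIF can be checked with a weighted modulo 11 sum over its first eight digits. The sketch below follows the publicly documented rule and the first-digit categories listed above; the function names and the "other" fallback label are assumptions for illustration.

```python
def is_valid_nif(nif: str) -> bool:
    """Validate a 9-digit Portuguese NIF using its modulo 11 check digit."""
    digits = [int(ch) for ch in nif if ch in "0123456789"]
    if len(digits) != 9:
        return False
    # Weights 9, 8, ..., 2 apply to the first eight digits.
    total = sum(d * w for d, w in zip(digits[:8], range(9, 1, -1)))
    remainder = total % 11
    expected = 0 if remainder < 2 else 11 - remainder
    return digits[8] == expected


def nif_entity_type(nif: str) -> str:
    """Map the first digit to the entity types listed above."""
    first = nif.strip()[:1]
    if first in ("1", "2", "3"):
        return "individual"
    if first == "5":
        return "corporate"
    if first == "6":
        return "public administration"
    return "other"  # prefixes not covered in the list above
```

The example NIF above, 123456789, passes this check: the weighted sum is 156, the remainder is 2, and the expected check digit is 9.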

Other Lusophone Countries:

Angola:
- NIF: Tax identification
- BI: Bilhete de Identidade

Mozambique:
- NUIT: Tax identification
- BI: Identity document

Cape Verde:
- NIF: Tax number
- BI: Identity card

Address Formats

Brazilian Addresses:

Format:
[Street type] [Name], [Number] - [Complement]
[Neighborhood]
[City] - [State]
CEP: [Postal code]

Example:
Rua das Flores, 123 - Apto 45
Jardim Paulista
São Paulo - SP
CEP: 01310-100

CEP format: XXXXX-XXX (8 digits)
Regions indicated by first digits

Portuguese Addresses:

Format:
[Street type] [Name], [Number], [Floor]
[Postal code] [City]

Example:
Rua Augusta, 25, 3º Esq.
1100-053 Lisboa

Código Postal format: XXXX-XXX (7 digits)
First 4: Region, Last 3: Specific location

Street types: Rua, Avenida, Praça, Largo, Travessa
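
The two postal code formats differ only in the length of the first group, so a pair of regular expressions is enough to tell them apart. This is a minimal sketch; `find_postal_codes` is a hypothetical helper, and it checks format only, not whether a code is actually assigned.

```python
import re

CEP_RE = re.compile(r"\b\d{5}-\d{3}\b")            # Brazil: XXXXX-XXX
CODIGO_POSTAL_RE = re.compile(r"\b\d{4}-\d{3}\b")  # Portugal: XXXX-XXX


def find_postal_codes(text: str) -> list[tuple[str, str]]:
    """Return (country, code) pairs in the order they appear in the text."""
    matches = [(m.start(), "BR", m.group()) for m in CEP_RE.finditer(text)]
    matches += [(m.start(), "PT", m.group()) for m in CODIGO_POSTAL_RE.finditer(text)]
    return [(country, code) for _, country, code in sorted(matches)]
```

The word boundaries keep the Portuguese pattern from matching the tail of a Brazilian CEP, and keep both patterns from matching inside longer digit runs such as phone numbers.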

Phone Number Formats

Brazilian Phone Numbers:

Mobile: (XX) 9XXXX-XXXX (11 digits with area code)
Landline: (XX) XXXX-XXXX (10 digits)
International: +55 XX XXXXX-XXXX

Examples:
(11) 99999-8888  // São Paulo mobile
(21) 2222-3333   // Rio de Janeiro landline
+55 11 99999-8888 // International format

Area codes: 11-99 across states

Portuguese Phone Numbers:

Mobile: 9XX XXX XXX (9 digits, starts with 9)
Landline: 2XX XXX XXX (9 digits, starts with 2)
International: +351 XXX XXX XXX

Examples:
912 345 678  // Mobile
213 456 789  // Lisbon landline
+351 912 345 678 // International

Mobile prefixes: 91, 92, 93, 96
Landline prefixes: 21 (Lisbon), 22 (Porto), etc.
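
The formats above translate into two regular expressions, one per country. The sketch below is an assumption-laden illustration rather than the service's detector: `find_phones` is a hypothetical helper, and the patterns cover only the layouts shown in this section.

```python
import re

# Brazil: optional +55, two-digit area code (with or without parentheses),
# then a 9-digit mobile or 8-digit landline number with a hyphen.
BR_PHONE = re.compile(
    r"(?<![\d+])(?:\+55\s?)?(?:\(\d{2}\)|\d{2})\s?"
    r"(?P<local>9\d{4}-\d{4}|\d{4}-\d{4})(?!\d)"
)

# Portugal: optional +351, then nine digits starting with 9 (mobile) or 2 (landline).
PT_PHONE = re.compile(
    r"(?<![\d+])(?:\+351\s?)?(?P<number>[29]\d{2}\s?\d{3}\s?\d{3})(?!\d)"
)


def find_phones(text: str) -> list[tuple[str, str, str]]:
    """Return (country, kind, matched_text) for each phone number found."""
    results = []
    for m in BR_PHONE.finditer(text):
        local_digits = m.group("local").replace("-", "")
        kind = "mobile" if len(local_digits) == 9 else "landline"
        results.append(("BR", kind, m.group()))
    for m in PT_PHONE.finditer(text):
        kind = "mobile" if m.group("number").startswith("9") else "landline"
        results.append(("PT", kind, m.group()))
    return results
```

A nine-digit Portuguese number is indistinguishable by shape from some other nine-digit identifiers, which is why surrounding context such as "Telemóvel:" matters for confident classification.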

Financial Identifiers

Brazilian Banking:

Bank account format:
Banco: [3-digit bank code]
Agência: [4 digits]-[check digit]
Conta: [variable digits]-[check digit]

Example:
Banco: 001 (Banco do Brasil)
Agência: 1234-5
Conta: 123456-7

PIX keys: CPF, phone, email, or random key

Portuguese Banking:

NIB (Número de Identificação Bancária):
Format: XXXX.XXXX.XXXXXXXXXXX.XX (21 digits)
Example: 0035.0123.00001234567.89

IBAN format:
PTXX XXXX XXXX XXXX XXXX XXXX X (25 characters)
Example: PT50 0035 0123 0000 1234 5678 9
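
Beyond matching the 25-character layout, an IBAN can be verified with the standard mod 97 check: move the first four characters to the end, replace letters with numbers (A=10 through Z=35), and confirm the result leaves a remainder of 1 when divided by 97. The following is a minimal sketch with a hypothetical function name; it assumes Portuguese IBANs contain only digits after the country code.

```python
def is_valid_pt_iban(iban: str) -> bool:
    """Check layout and mod 97 checksum of a Portuguese IBAN."""
    compact = "".join(iban.split()).upper()
    if (
        len(compact) != 25
        or not compact.isascii()
        or not compact.startswith("PT")
        or not compact[2:].isdigit()
    ):
        return False
    # Move country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # int(ch, 36) maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35 (P=25, T=29).
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

For example, `is_valid_pt_iban("PT50 0002 0123 1234 5678 9015 4")` returns `True` for this widely published sample IBAN, while changing its final digit makes the check fail.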

Regulatory Compliance

Brazil - LGPD:

  • Lei Geral de Proteção de Dados (effective 2020)
  • Similar structure to GDPR
  • Enforced by ANPD (National Data Protection Authority)
  • Significant fines for non-compliance

Portugal - GDPR + National Law:

  • GDPR directly applicable (EU member)
  • Lei 58/2019 implementing GDPR
  • CNPD (Comissão Nacional de Proteção de Dados)
  • Sector-specific regulations

Language Detection and Processing

// Automatic variant detection
Input: "O cliente João Silva forneceu seu telemóvel"
Detected: pt-PT (telemóvel indicates European Portuguese)

Input: "O cliente João Silva forneceu seu celular"
Detected: pt-BR (celular indicates Brazilian Portuguese)

// Processing adapts to variant
- Identifier formats appropriate to region
- Address parsing matching regional format
- Name patterns for detected region

// Can also specify explicitly
{
  "language": "pt",
  "region": "BR"  // or "PT"
}
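
Vocabulary-based variant detection can be approximated by counting marker words from the lists earlier on this page. This is a deliberately naive sketch: the marker sets and the `guess_region` helper are assumptions for illustration, and words such as "fato" also occur in European Portuguese with a different meaning, so a production detector would need far more evidence than this.

```python
import re
import unicodedata
from typing import Optional

# Marker words taken from the vocabulary and spelling examples on this page.
PT_PT_MARKERS = {"telemóvel", "anónimo", "facto"}
PT_BR_MARKERS = {"celular", "anônimo", "fato"}


def guess_region(text: str) -> Optional[str]:
    """Return "PT", "BR", or None when the evidence is missing or tied."""
    # NFC keeps accented letters as single characters so \w+ matches whole words.
    normalized = unicodedata.normalize("NFC", text).lower()
    words = set(re.findall(r"\w+", normalized))
    pt_score = len(words & PT_PT_MARKERS)
    br_score = len(words & PT_BR_MARKERS)
    if pt_score == br_score:
        return None
    return "PT" if pt_score > br_score else "BR"
```

When the guess is `None`, passing the region explicitly in the request, as shown above, avoids relying on detection at all.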

API Configuration

POST /v1/redact
{
  "text": "Cliente: Maria Silva, NIF: 123456789, Telemóvel: 912 345 678",
  "language": "pt",
  "auto_detect_region": true,
  "redaction_types": ["name", "tax_id", "phone", "address"]
}

Response:
{
  "redacted_text": "Cliente: [NAME], NIF: [TAX_ID], Telemóvel: [PHONE]",
  "detected_region": "PT",
  "detections": [
    {
      "type": "name",
      "value": "Maria Silva",
      "confidence": 0.94
    },
    {
      "type": "tax_id",
      "value": "123456789",
      "format": "portugal_nif",
      "confidence": 0.97
    },
    {
      "type": "phone",
      "value": "912 345 678",
      "format": "portugal_mobile",
      "confidence": 0.98
    }
  ]
}
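
On the client side, the request and response shown above can be wrapped in two small helpers. This is a sketch under stated assumptions: the field names come from the example on this page, while `build_payload`, `needs_review`, the 0.95 threshold, and the choice to send either `region` or `auto_detect_region` (never both) are illustrative decisions rather than documented API behavior.

```python
from typing import Any, Optional


def build_payload(text: str, region: Optional[str] = None) -> dict[str, Any]:
    """Build a request body for Portuguese text, with or without a fixed region."""
    payload: dict[str, Any] = {
        "text": text,
        "language": "pt",
        "redaction_types": ["name", "tax_id", "phone", "address"],
    }
    if region is None:
        payload["auto_detect_region"] = True
    else:
        payload["region"] = region  # "BR" or "PT"
    return payload


def needs_review(response: dict[str, Any], threshold: float = 0.95) -> list[dict[str, Any]]:
    """Return detections whose confidence falls below the threshold."""
    return [
        detection
        for detection in response.get("detections", [])
        if detection.get("confidence", 0.0) < threshold
    ]
```

The payload can be sent with `requests.post(url, headers=..., json=build_payload(text))` as in the integration example earlier on this page. Applied to the sample response above, `needs_review` would flag only the name detection, whose confidence is 0.94.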

Trusted by Industry Leaders

Trusted by 500+ enterprises worldwide

Frequently Asked Questions

Everything you need to know about our redaction services

01

What's the difference between Brazilian and European Portuguese?

Beyond spelling differences (the two variants follow the 1990 Orthographic Agreement to varying degrees), Brazilian and European Portuguese differ in vocabulary, syntax, and formal expressions. Brazilian Portuguese uses different terms in common PII contexts, such as celular rather than telemóvel for a cell phone. We handle both variants with region-specific processing.

02

How do you detect Portuguese names?

Portuguese names typically include multiple surnames (often maternal then paternal). We maintain databases of common Portuguese and Brazilian names and surnames, handling compound surnames and the various naming conventions across Portuguese-speaking regions.

03

What identifiers do you detect for each country?

For Brazil: CPF, CNPJ, RG, PIS. For Portugal: NIF, CC (Cartão de Cidadão), NISS. For other Portuguese-speaking countries: country-specific tax IDs and national identifiers. Each uses appropriate validation.

04

Do you handle addresses from all Portuguese-speaking countries?

Yes, we support address formats for Brazil (with CEP), Portugal (with Código Postal), Angola, Mozambique, and other lusophone countries. Each has distinct formatting conventions for street addresses and postal codes.

05

How do you handle mixed Portuguese-English documents?

Business documents often mix Portuguese and English. We detect PII in both languages within the same document, applying appropriate rules for each language section while maintaining document context.

06

What about Portuguese text with diacritics?

Portuguese uses various diacritical marks (á, à, â, ã, ç, é, ê, í, ó, ô, õ, ú). Our processing handles all of them correctly and matches names whether or not the accents in the source text are present or correct.
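
Accent-insensitive matching is commonly done by decomposing characters with Unicode normalization and dropping the combining marks. The sketch below uses Python's standard `unicodedata` module; the helper names are hypothetical and this is one possible approach, not necessarily the one the service uses.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove diacritics: 'João Gonçalves' becomes 'Joao Goncalves'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the accents, tilde, and cedilla.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def same_name(a: str, b: str) -> bool:
    """Compare two names while ignoring accents and letter case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()
```

With this approach "António" and "Antonio" compare as equal, while genuinely different spellings such as "Souza" and "Sousa" remain distinct and need a separate equivalence list.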

Enterprise-Grade Security

Process Portuguese Documents

Try Portuguese text redaction.

No credit card required
10,000 words free
Setup in 5 minutes