PII Redaction automatically detects and replaces sensitive entities (names, emails, addresses, etc.) in your transcript output. This feature is only available for pre-recorded transcription.

Usage

Add "pii_redaction": true to your request to redact all detected PII in the transcript. Sensitive entities will be replaced with markers in the output.
{
  "audio_url": "YOUR_AUDIO_URL",
  "pii_redaction": true
}

Optional configuration

You can customize the behavior with pii_redaction_config:
entity_types
string[]
A preset name or a list of PII entity types to redact (e.g. ["GDPR"]). See Named Entity Recognition for the supported entity types.
processed_text_type
enum
default:"MARKER"
How to replace detected PII:
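  • MARKER: Each entity is replaced by a placeholder label such as [NAME_1] or [EMAIL_1]. Repeated mentions of the same entity receive the same ID.
  • MASK: Each character of the entity is replaced by a mask character, with spaces preserved (e.g. “John Smith” → #### #####).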
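
To make the two modes concrete, the following Python sketch mimics the output shape of each mode for entities that are already known. It does not show how the service detects or redacts PII. The function names `mask_entities` and `mark_entities` and the hard-coded entity list are hypothetical, and the choice to leave whitespace unmasked is an assumption inferred from the “John Smith” example.

```python
import re


def mask_entities(text, entities):
    """Replace every non-whitespace character of each known entity with '#'."""
    for value, _label in entities:
        masked = re.sub(r"\S", "#", value)
        text = text.replace(value, masked)
    return text


def mark_entities(text, entities):
    """Replace each known entity with [LABEL_N]; repeated mentions share one ID."""
    counters = {}
    for value, label in entities:
        counters[label] = counters.get(label, 0) + 1
        text = text.replace(value, f"[{label}_{counters[label]}]")
    return text


# Hypothetical, hand-written entity list; the real service detects these itself.
entities = [("John Smith", "NAME"), ("john.smith@company.com", "EMAIL")]
sample = "John Smith wrote from john.smith@company.com. John Smith called later."

masked = mask_entities(sample, entities)
marked = mark_entities(sample, entities)
print(masked)
print(marked)
```

MARKER output keeps references traceable across the transcript, while MASK output preserves only the length and spacing of the original text.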

Example body

{
  "audio_url": "YOUR_AUDIO_URL",
  "pii_redaction": true,
  "pii_redaction_config": {
    "entity_types": ["GDPR"],
    "processed_text_type": "MARKER"
  }
}
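
A body like the one above can be assembled programmatically. The sketch below is one possible way to do it in Python; the helper `build_request_body` is hypothetical and not part of any SDK, and the endpoint and authentication are omitted because this section does not specify them. Only the field names and the two `processed_text_type` values come from this page.

```python
import json

# The two replacement modes documented for processed_text_type.
VALID_TEXT_TYPES = {"MARKER", "MASK"}


def build_request_body(audio_url, entity_types=None, processed_text_type=None):
    """Build a request body with PII redaction enabled (hypothetical helper)."""
    if processed_text_type is not None and processed_text_type not in VALID_TEXT_TYPES:
        raise ValueError(f"processed_text_type must be one of {sorted(VALID_TEXT_TYPES)}")

    body = {"audio_url": audio_url, "pii_redaction": True}

    # Only include pii_redaction_config when at least one option is set.
    config = {}
    if entity_types is not None:
        config["entity_types"] = list(entity_types)
    if processed_text_type is not None:
        config["processed_text_type"] = processed_text_type
    if config:
        body["pii_redaction_config"] = config
    return body


body = build_request_body("YOUR_AUDIO_URL", ["GDPR"], "MARKER")
print(json.dumps(body, indent=2))
```

Validating `processed_text_type` on the client side is a design choice of this sketch; it surfaces typos before the request is sent.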

Example output

Without PII redaction (raw transcript):
Hi, I’m calling about the order for John Smith. Can you confirm the delivery to john.smith@company.com? Yes, John Smith placed it yesterday.
With PII redaction (processed_text_type="MASK"):
Hi, I’m calling about the order for #### #####. Can you confirm the delivery to ######################? Yes, #### ##### placed it yesterday.
With PII redaction (processed_text_type="MARKER"):
Hi, I’m calling about the order for [NAME_1]. Can you confirm the delivery to [EMAIL_1]? Yes, [NAME_1] placed it yesterday.
The same entity mentioned multiple times receives the same marker ID (e.g. “John Smith” becomes [NAME_1] both times), so you can track references across the transcript while keeping sensitive data redacted.
This consistency is also useful for downstream LLM tasks: a model can reason about entities (e.g. “the person referred to as [NAME_1]”) without ever seeing the raw PII.
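
Because marker IDs are stable, references can be tallied directly from the redacted text. The Python sketch below counts how often each marker appears. The helper `count_markers` is hypothetical, and the regular expression assumes markers always look like `[LABEL_N]` with an uppercase label, which is inferred from the examples on this page rather than from a documented format.

```python
import re
from collections import Counter

# Assumed marker shape: [LABEL_N], e.g. [NAME_1] or [EMAIL_1].
MARKER_PATTERN = re.compile(r"\[([A-Z_]+)_(\d+)\]")


def count_markers(transcript):
    """Count occurrences of each marker in a MARKER-redacted transcript."""
    return Counter(match.group(0) for match in MARKER_PATTERN.finditer(transcript))


redacted = (
    "Hi, I'm calling about the order for [NAME_1]. "
    "Can you confirm the delivery to [EMAIL_1]? "
    "Yes, [NAME_1] placed it yesterday."
)
counts = count_markers(redacted)
print(counts)
```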