PhantoM
Voice Deepfake Detection Engine
Telephony-grade voice deepfake detection for call centers - streaming, low latency, due-diligence-grade (DD-grade) evidence. Acquisition-first packaging.
At a Glance
Detect AI-generated voice in real-time over telephony connections (8 kHz, G.711/Opus). PhantoM is packaged as an IP acquisition asset with reproducible benchmarks, raw evaluation scores, and evidence-first documentation - not a SaaS product.
The Problem
Voice deepfake attacks are now a practical fraud vector in call centers and voice-authentication flows. Traditional controls struggle with compressed telephony audio, short interaction windows, and the operational cost of false positives. Buyers need reproducible benchmark evidence for narrowband (8 kHz) telephony conditions.
The Solution
PhantoM provides a streaming detection core designed for narrowband telephony audio (8 kHz) with an enterprise deployment wrapper and an evidence-first diligence package. Buyers can verify latency, benchmark methodology, and operating points under NDA using raw-score artifacts and gate reports.
Capabilities
Capabilities designed for enterprise integration.
Telephony-Grade Streaming
Real-time scoring over 8 kHz narrowband audio with telephony-focused preprocessing.
FAR-Anchored Decisioning
Operating points expressed in FAR/TPR terms to match call-center risk tolerances (e.g., TPR at FAR=1-2%).
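As an illustration of what a FAR-anchored operating point means, the sketch below picks a decision threshold from raw scores so that a target FAR is not exceeded, then reports the TPR at that threshold. It is not PhantoM's implementation. It assumes FAR is read as the fraction of genuine human calls wrongly flagged as synthetic, that higher scores mean "more likely synthetic", and that a call is flagged when its score is strictly above the threshold; the function names are hypothetical.

```python
def threshold_at_far(human_scores, target_far):
    """Return a threshold whose empirical FAR on human_scores is <= target_far.

    Assumed convention: a call is flagged when score > threshold.
    """
    ranked = sorted(human_scores, reverse=True)
    # Number of human calls we are allowed to flag by mistake.
    # The small epsilon guards against float rounding just below an integer.
    allowed = int(target_far * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return float("-inf")  # every call may be flagged
    # At most `allowed` scores are strictly greater than this value.
    return ranked[allowed]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


if __name__ == "__main__":
    # Toy scores for illustration only; not benchmark data.
    human = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.90]
    synthetic = [0.20, 0.40, 0.60, 0.80]
    thr = threshold_at_far(human, target_far=0.25)
    print("threshold:", thr)
    print("FAR:", rate_above(human, thr))
    print("TPR:", rate_above(synthetic, thr))
```

With real evaluation data, the same calculation would be run on the raw-score artifacts at the target FAR values of interest, for example 1% and 2%.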
Telephony Robustness Suite
Benchmark harness across codec simulation (G.711/Opus), jitter, packet loss, and noise conditions.
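To make the idea of codec and packet-loss simulation concrete, here is a minimal sketch of two degradations such a harness might apply. It is not the PhantoM harness. The companding function uses the continuous mu-law formula with 8-bit quantization, which approximates G.711 u-law but is not bit-exact with the standard's segmented encoding. The 160-sample frame (20 ms at 8 kHz), the loss model (independent frame drops replaced by silence), and all function names are assumptions.

```python
import math
import random

MU = 255  # mu-law parameter associated with G.711 u-law


def mulaw_roundtrip(x):
    """Compand, quantize to 8 bits, and expand one sample in [-1.0, 1.0]."""
    sign = -1.0 if x < 0 else 1.0
    compressed = sign * math.log1p(MU * abs(x)) / math.log1p(MU)
    # Uniform 8-bit quantization in the companded domain.
    level = round((compressed + 1.0) / 2.0 * 255)
    quantized = level / 255 * 2.0 - 1.0
    sign_q = -1.0 if quantized < 0 else 1.0
    return sign_q * math.expm1(abs(quantized) * math.log1p(MU)) / MU


def drop_packets(samples, frame_len=160, loss_rate=0.05, seed=0):
    """Zero out whole frames at random to mimic packet loss without concealment."""
    rng = random.Random(seed)  # seeded so a benchmark run is repeatable
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if rng.random() < loss_rate:
            frame = [0.0] * len(frame)
        out.extend(frame)
    return out


if __name__ == "__main__":
    # One second of a 440 Hz tone at 8 kHz as a stand-in for speech.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
    degraded = drop_packets([mulaw_roundtrip(s) for s in tone], loss_rate=0.1)
    print(len(degraded), "samples after simulated codec and packet loss")
```

Seeding the random generator keeps each degraded condition reproducible, which matters when benchmark results need to be re-derived by a third party.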
Deploy Anywhere
On-prem / self-hosted deployment design. No requirement to send audio outside buyer infrastructure.
Enterprise Security Wrapper
Enterprise wrapper patterns (mTLS, RBAC, audit logging) documented under NDA; implementation scope varies by integration.
Acquisition-First Packaging
Stage-gated disclosure and DD-grade artifacts designed for low-friction M&A / IP acquisition.
Evidence & Proof Points
DD-friendly artifacts and verifiable outputs for technical evaluation.
Sample Outputs
All samples are synthetic (no customer data, no secrets). Full evaluation evidence packs are available under NDA during technical evaluation.
Integration
Clear inputs and outputs for seamless integration into your stack.
Inputs
- Streaming audio chunks (8 kHz narrowband or resampled)
- Encodings: PCM_S16LE, G.711 u-law, G.711 A-law, Opus (simulation-supported)
- Optional call metadata (session ID, timestamps)
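The inputs above can be sketched as a simple chunker over a raw PCM_S16LE byte stream at 8 kHz. This is an illustrative assumption about how a caller might frame audio, not the engine's actual interface: the 20 ms chunk duration, the dictionary field names, and the choice to drop a trailing partial chunk are all hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # narrowband telephony rate named in the text
BYTES_PER_SAMPLE = 2    # PCM_S16LE: signed 16-bit little-endian


def chunk_size_bytes(chunk_ms):
    """Bytes in one chunk of the given duration."""
    return SAMPLE_RATE_HZ * chunk_ms // 1000 * BYTES_PER_SAMPLE


def iter_chunks(pcm, chunk_ms=20, session_id=None):
    """Yield fixed-duration chunks with optional call metadata attached.

    A trailing partial chunk is dropped (an assumption; padding is another option).
    """
    size = chunk_size_bytes(chunk_ms)
    for index, offset in enumerate(range(0, len(pcm) - size + 1, size)):
        yield {
            "session_id": session_id,        # hypothetical metadata field
            "chunk_index": index,
            "start_ms": index * chunk_ms,    # timestamp relative to stream start
            "audio": pcm[offset:offset + size],
        }


if __name__ == "__main__":
    one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    chunks = list(iter_chunks(one_second_of_silence, session_id="demo-call"))
    print(len(chunks), "chunks of", len(chunks[0]["audio"]), "bytes")
```

G.711 u-law and A-law carry one byte per sample, so the same arithmetic applies with a byte width of 1 before decoding to linear PCM.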
Outputs
- Per-chunk confidence score (human <-> synthetic)
- Verdict: human / synthetic / uncertain
- Latency per chunk
- Optional audit events (JSON)
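One way to picture these outputs is as a small per-chunk record that can also be serialized as a JSON audit event. The sketch below is an assumption for illustration and does not describe PhantoM's actual schema: the field names, the 0.0 to 1.0 score scale, and the two verdict thresholds are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChunkResult:
    session_id: str
    chunk_index: int
    score: float       # assumed scale: 0.0 = human, 1.0 = synthetic
    verdict: str       # "human", "synthetic", or "uncertain"
    latency_ms: float  # processing time for this chunk


def to_verdict(score, low=0.3, high=0.7):
    """Map a confidence score to a three-way verdict (thresholds are placeholders)."""
    if score >= high:
        return "synthetic"
    if score <= low:
        return "human"
    return "uncertain"


def audit_event(result):
    """Serialize one result as a JSON audit event with stable key order."""
    return json.dumps({"event": "chunk_scored", **asdict(result)}, sort_keys=True)


if __name__ == "__main__":
    score = 0.82
    result = ChunkResult("demo-call", 0, score, to_verdict(score), latency_ms=4.1)
    print(audit_event(result))
```

In practice the two thresholds would be derived from the chosen operating point rather than fixed constants, with the "uncertain" band sized to the workflow's tolerance for manual review.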
Ideal For
Best-fit buyer profiles and use cases.
Embed voice deepfake detection as a native platform capability.
Add synthetic voice detection to your authentication stack.
Protect high-value voice transactions from impersonation attacks.
Deploy telephony-grade fraud-risk signals at the network edge.
Add synthetic-voice risk scoring to voice verification workflows.
Acquire DD-grade IP with reproducible benchmarks and evidence chain.
Known Limits & Scope
- Preferred model: full IP acquisition (transfer of ownership). Exclusive licensing discussed case-by-case under NDA. No hosted SaaS option.
- Evidence pack and sensitive artifacts shared under mutual NDA
- No ongoing support after acquisition; transition support is fixed-scope (20-40 hours)
- Cross-domain performance is disclosed; DF-domain retraining expected for production deployments
- No SOC 2 / ISO certification (solo-engineered asset; architecture artifacts provided)
- Performance degrades under severe bandpass/jitter/noise conditions (documented)
- Not a full identity verification or liveness product; provides synthetic-voice detection signal for integration into existing workflows
- Benchmarked on ASVspoof 2021 DF dataset (Open Data Commons Open Database License v1.0); dataset not redistributed; third-party license inventory available under NDA
Ready for a Deep Dive?
Schedule a 30-minute technical walkthrough to see PhantoM in action and discuss integration options.