A trusted contact's voice on the phone urgently asks you to authorise a wire transfer 'within the next ten minutes'. The voice is unmistakable and the caller ID matches. What is the safest response?

Tell the caller you'll call them back on the number you already have for them, then verify before doing anything. Voice cloning is good enough in 2026 that 'I recognised the voice' is no longer a usable signal, and caller ID can be spoofed. Verification must happen on a separate channel, using a number you obtained independently of the call.

How much audio does an attacker typically need to produce a credible voice clone in 2026?

Three to thirty seconds of clean source audio. Commercial and open-source voice-cloning tools converged on small-data fine-tuning by 2023–2024. Public podcast clips, conference talks, and earnings calls give attackers far more than enough.

What is a codeword protocol and why does it work?

A pre-agreed phrase known only to the real parties — the attacker cannot synthesise the right answer because the phrase is not in their training data, so a wrong or improvised answer reveals the call as synthetic. Codewords work because they are pre-shared private information not derivable from public audio. AI voice clones can imitate speech but cannot reliably fabricate accurate private context.

If a deepfake voice call has already convinced you and money has moved, what is the most time-critical action?

Call your bank's fraud line within the first hour to attempt a recall, and alert finance leadership and security at the same time. Recall odds drop sharply after the first hour. Banks can attempt to claw back funds, including across borders, but only when they are notified quickly. Finance and security need to investigate concurrently for additional indicators.

Which family member is most often targeted in deepfake-voice family-emergency scams?

Elderly parents and grandparents — attackers use a child or grandchild's cloned voice claiming an emergency and asking for an immediate money transfer. Family-emergency scams using cloned voices of children and grandchildren targeting elderly relatives are documented at scale in 2024–2026 by NCSC-NL and other national agencies. Defensive guidance: family codewords + slow-down rule + verification callback.

Deepfake Voice Phishing in 2026 — When the Voice on the Phone Is Synthetic

The scenario

An executive assistant at a 1,800-person Dutch B2B SaaS company gets a phone call on Tuesday at 11:47. The display says 'Private number'. The voice is unmistakably the CEO's: the same accent, the same cadence, the same use of 'right?' at the end of every other sentence. He sounds slightly hoarse 'because of the flight back from Singapore' and asks her to fast-track an authorisation for a €420,000 partner payment that was 'agreed last week but the contract is still with legal'. He says he'll forward the supplier's invoice from his personal Gmail because his corporate inbox is rate-limited from the hotel WiFi.

The assistant pushes back once: 'Should I check with the CFO?' The voice answers, 'No, the CFO knows, we discussed this in the board pre-read. Just push it through standard AP and I'll sign off when I'm back tomorrow morning.' The conversational responses are fluid, the pretext is plausible, and the urgency is calibrated. She authorises the payment.

The CEO had been in Singapore but never made the call. The attacker scraped 4 minutes of audio from his keynote at Web Summit six weeks earlier, trained a clone in under an hour, and ran the live call through commercial voice-cloning infrastructure that handles real-time prosody. The money moved at 11:53, and the CEO learned about it from the assistant's email on Wednesday morning. Less than €60,000 was recovered.

How the attack works

Voice cloning crossed the credibility threshold for phone-quality real-time impersonation around 2023. By 2026 the pipeline is commoditised: source audio from any public appearance (podcast, conference talk, earnings call, YouTube video, even a social-media livestream) is fed to a fine-tuned voice model that produces real-time conversational audio. Some kits handle live two-way dialogue; others use a short pre-recorded message plus interactive 'live' clips synthesised on the fly.

The attacker also handles the channel layer. Caller ID is spoofed to match the number the target has stored for the impersonated person, background ambience is added to match the claimed setting (airport, taxi, conference), and the attacker uses pretext detail (board meetings, recent travel, internal project names) scraped from LinkedIn, public press releases, leaked breach data, or prior reconnaissance.

The combination of voice + caller ID + pretext detail + emotional pressure (urgency, authority, secrecy) defeats most cognitive defences. The relevant MITRE ATT&CK techniques are T1566.004 (Spearphishing Voice), T1585.002 (Establish Accounts: Email Accounts, for the supporting paper trail), and T1656 (Impersonation).

The only category of defence that consistently works is process: out-of-band callback to a verified number, shared codeword authentication, a two-person rule on financial actions, and in-person or verified-video confirmation for high-value or off-process requests.

What to look out for

Inbound call from a private or unfamiliar number claiming to be a known contact in unusual circumstances ('flight delay', 'new mobile', 'borrowed phone')
Urgency framed around financial actions, credential resets, account changes, or sensitive document sharing
Caller resisting normal verification channels — 'no time to do callbacks', 'don't loop in finance, this is between us', 'I'll be on a flight in five minutes'
Voice prosody that is almost perfect but has subtle artefacts — oddly even breathing, flat affect on emotional words, unnatural pauses between sentences
Background ambience that is too clean (perfectly studio-quiet) or that doesn't match the claimed environment
Conversational mistakes a real person wouldn't make — incorrect names, wrong recent shared context, slightly off recall of an event you both attended
Calls outside normal hours combined with pressure to act before someone else (CFO, manager, partner) becomes aware
A coordinated multi-channel pretext: a voice call following an SMS, email, or push notification that softens you up beforehand
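The warning signs above can double as a checklist when logging a suspicious call. The following is a minimal Python sketch under stated assumptions: the field names, the `triage` rule, and its wording are illustrative and not part of any cited guidance. Note that an empty checklist proves nothing, since a good clone may show none of these signs.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CallIndicators:
    """One boolean per warning sign listed above (field names are illustrative)."""
    private_or_unfamiliar_number: bool = False
    urgent_sensitive_request: bool = False
    resists_verification: bool = False
    prosody_artefacts: bool = False
    ambience_mismatch: bool = False
    context_mistakes: bool = False
    off_hours_pressure: bool = False
    multi_channel_pretext: bool = False


def observed_flags(call: CallIndicators) -> list[str]:
    """Return the names of the indicators that were observed, in list order."""
    return [f.name for f in fields(call) if getattr(call, f.name)]


def triage(call: CallIndicators) -> str:
    """Assumed rule: any flag means stop and verify; the two process flags also mean report."""
    if call.urgent_sensitive_request or call.resists_verification:
        return "stop, verify out of band, report to security"
    if observed_flags(call):
        return "stop, verify out of band"
    return "no listed indicator; normal verification process still applies"


# The call from the scenario: private number, urgent payment, CFO check waved away.
scenario_call = CallIndicators(
    private_or_unfamiliar_number=True,
    urgent_sensitive_request=True,
    resists_verification=True,
)
print(observed_flags(scenario_call))
print(triage(scenario_call))
```

The list returned by `observed_flags` is also a convenient starting point for the report to security.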

What to do

Hang up and call back on a verified number, every time, regardless of how convincing the voice is. Use the number in your phone contacts, your CRM, your contracts file, or the company switchboard. Never use a number provided during the call.
Use a shared codeword for sensitive requests within your family and your executive team. A simple agreed phrase known only to the real parties. If the caller doesn't know it, or improvises around it, treat as adversarial. The codeword should not be guessable from public information.
Apply the financial two-person rule and verification workflow with no exceptions. No phone call from any voice, synthetic or real, bypasses the workflow. Process beats perception.
If you suspect a deepfake voice call, ask a question the real person would answer reflexively and the attacker would not know. Personal anecdotes, recent shared experiences, internal jokes. AI voice clones can synthesise speech but cannot reliably fabricate accurate context.
Report attempted deepfake calls to security with as much detail as you can recall. Time, claimed pretext, caller ID, requested action. Every attempt is reconnaissance evidence and may indicate active targeting of the executive.
If money or credentials moved, escalate within the first hour for recall and revocation. Bank fraud line + finance leadership + security simultaneously. Hours matter for clawback.
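The callback rule above comes down to one constraint: the number you dial must come from a record that existed before the call, never from the caller. A minimal Python sketch of that lookup follows. `VERIFIED_DIRECTORY`, its placeholder entry, and the `callback_plan` function are hypothetical names used only for illustration.

```python
import re
from typing import Optional

# Hypothetical directory maintained outside any phone call
# (contacts app, CRM, contracts file, switchboard list). Placeholder number.
VERIFIED_DIRECTORY = {"ceo": "+31 20 555 0100"}


def normalise(number: str) -> str:
    """Strip spaces, dashes and brackets so numbers can be compared."""
    return re.sub(r"[^\d+]", "", number)


def callback_plan(contact_id: str, number_given_on_call: Optional[str] = None) -> dict:
    """Decide which number to dial back.

    The number offered by the caller is only compared, never returned.
    """
    verified = VERIFIED_DIRECTORY.get(contact_id)
    if verified is None:
        # No pre-existing record: go through the switchboard instead.
        return {"dial": None, "action": "escalate via switchboard", "mismatch": False}
    dial = normalise(verified)
    mismatch = (
        number_given_on_call is not None
        and normalise(number_given_on_call) != dial
    )
    return {"dial": dial, "action": "call back", "mismatch": mismatch}


plan = callback_plan("ceo", number_given_on_call="+31 6 1234 5678")
print(plan)
```

A `mismatch` of `True` is worth including in the report to security, since a caller who supplies a different number is steering the callback.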

Defence: for IT and policy

Technical controls

Voice-spoofing detection on critical inbound numbers (offered by several carriers in 2026) — flags SS7-routed and SIP-injected calls
Privacy review of executive voice exposure — limit duration and quality of public audio appearances where practical; for many roles this is unavoidable, in which case defences must focus on process
Caller-ID display hygiene — apps that flag unknown numbers and label spoofing risk; corporate devices configured to suppress private/withheld numbers from reaching key roles without screening
AI-detection software in contact-centre environments — emerging in 2026, still maturing; useful but not yet reliable enough to be sole control
Pre-shared identity tokens (PIN codes, codewords) for executive-to-executive sensitive comms — explicitly required by policy and rehearsed
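Pre-shared tokens are normally exchanged person to person, but a helpdesk or approval tool may need to check one as well. In that case the token should be stored as a salted hash and compared in constant time. The sketch below uses only the Python standard library. The function names, the normalisation step, and the iteration count are assumptions for illustration, not a prescribed implementation.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 200_000  # assumed work factor; tune to your own policy


def _canonical(codeword: str) -> bytes:
    """Lower-case and collapse whitespace so spoken variants compare equal."""
    return " ".join(codeword.casefold().split()).encode("utf-8")


def enrol_codeword(codeword: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store; the codeword itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", _canonical(codeword), salt, PBKDF2_ITERATIONS)
    return salt, digest


def verify_codeword(spoken: str, salt: bytes, digest: bytes) -> bool:
    """Check a spoken codeword against the stored digest in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", _canonical(spoken), salt, PBKDF2_ITERATIONS)
    return hmac.compare_digest(candidate, digest)


salt, digest = enrol_codeword("blue heron seven")
print(verify_codeword("Blue  Heron seven", salt, digest))
print(verify_codeword("something improvised", salt, digest))
```

A failed check should be handled like any other failed verification: stop, call back on a verified number, and report.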

Policy controls

Written policy: payment instructions cannot be authorised by phone alone, regardless of who appears to be calling; they must go through the ERP / signed authorisation workflow
Documented executive codeword protocol — explicitly required for any high-value or off-process request from an executive over phone or text
Family-level guidance for executives and their families — a shared codeword for 'is this really you?' calls (especially for 'I'm in trouble, send money' family-emergency scams)
Out-of-band verification requirement for any voice-only request to: reset MFA, change banking details, share credentials, transfer money, share sensitive documents
Quarterly tabletop with finance + executive team that rehearses a deepfake-voice scenario
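The policy controls above can be expressed as a release gate that a payment must pass before it leaves the building. The following Python sketch is an illustration under stated assumptions: the `PaymentRequest` fields, the "erp" channel label, and the reading of the two-person rule as "two approvers other than the requester" are hypothetical choices, not a description of any particular ERP system.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentRequest:
    amount_eur: float
    requested_via: str            # e.g. "phone", "email", "erp"
    requester: str
    approvers: set[str] = field(default_factory=set)
    out_of_band_verified: bool = False


def may_release(req: PaymentRequest) -> tuple[bool, list[str]]:
    """Return (allowed, reasons_blocked). Every rule must pass; none has an override."""
    reasons = []
    if req.requested_via != "erp":
        reasons.append("request did not arrive through the ERP workflow")
    # Assumed reading of the two-person rule: the requester cannot approve their own request.
    independent = req.approvers - {req.requester}
    if len(independent) < 2:
        reasons.append("fewer than two independent approvers")
    if not req.out_of_band_verified:
        reasons.append("no out-of-band verification on record")
    return (not reasons, reasons)


# The scenario payment: requested by phone, one approver, no callback.
scenario = PaymentRequest(
    amount_eur=420_000,
    requested_via="phone",
    requester="ceo",
    approvers={"executive_assistant"},
)
allowed, reasons = may_release(scenario)
print(allowed, reasons)
```

The gate deliberately takes no input about how convincing the caller sounded, so a cloned voice has nothing to act on.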

Training frequency

Annual simulated deepfake voice exercise targeting executive assistants, finance leadership and family members of high-value targets. Include a 'what would you have done?' debrief — the goal is to make the verification reflex automatic so the conscious 'this sounded just like them' override cannot occur. Pair training with a one-line internal mnemonic: 'A real voice never minds being called back.'

Sources & further reading

NCSC-NL: Deepfake-fraude en synthetische media [primary]
FBI IC3: Deepfake-enabled fraud alerts [primary]
MITRE ATT&CK: T1566.004 Spearphishing Voice [primary]
ENISA: AI-enabled threats and deepfake landscape [primary]
Krebs on Security: Deepfake CEO fraud case studies [secondary]
Mandiant: Synthetic media threat-actor analyses [secondary]