AI Voice Agent

Local-first AI voice intake for live business calls.

AI Voice Agent is a Windows-first demo foundation for Asterisk-connected calls, bilingual transcription, short answer flows, structured lead capture, and human handoff.

Live-call demo surface
[Screenshot: AI Voice Agent local-first live-call reception demo. Caption: Local-first call intake, transfer readiness, and session review.]
Live-call demo foundation. MD3 internal AI voice work.

A receptionist foundation for calls where privacy, latency, and handoff all matter.

AI Voice Agent is built around live Asterisk call paths and local-first processing experiments. The goal is a practical intake layer: understand the caller, answer simple questions, capture the useful fields, and know when to transfer.

  • Whisper.net local transcription with adaptive Chinese, English, and mixed-language recognition
  • Live Asterisk 22 ARI, Stasis, externalMedia, and call-transcription path
  • Rule-based FAQ matching with company-profile replies and optional local LLM refinement
  • Structured capture for name, company, phone, email, and callback reason
  • Local TTS greeting and reply playback experiments for phone calls
  • Human-transfer intent detection when the caller should reach a person
  • Saved session packages with transcript, conversation turns, captured fields, and timing data

Live-call foundation

Asterisk-connected intake: externalMedia + local STT

Caller: "I need pricing and a callback." Detected mixed-language intake, then matched to a short response and capture path.

  • Capture: Name, phone, email, reason
  • Decision: Answer, callback, or transfer

FAQ reply, local TTS, human handoff

Call Flow

The useful AI voice workflow starts inside the call path, not after the call is over.

The current direction connects live call audio, local transcription, short response logic, structured capture, and human-transfer readiness into one reviewable session.

01 Receive the call

Asterisk 22, ARI, Stasis, and externalMedia provide the live-call foundation for intake experiments.
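As a rough illustration of the receive step, the sketch below builds the request URL for ARI's `POST /channels/externalMedia` endpoint, which asks Asterisk to send call audio to an external media receiver. The base URL, Stasis application name, media host, and audio format are hypothetical placeholders rather than values from this project, and authentication and the actual HTTP call are left out.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; none come from the project.
ARI_BASE = "http://127.0.0.1:8088/ari"   # assumed local Asterisk HTTP endpoint
STASIS_APP = "voice-intake"              # assumed Stasis application name
MEDIA_HOST = "127.0.0.1:40000"           # assumed host:port of the local audio receiver


def external_media_url(base: str, app: str, external_host: str,
                       audio_format: str = "ulaw") -> str:
    """Build the URL for an ARI POST /channels/externalMedia request.

    app, external_host, and format are the required query parameters.
    """
    query = urlencode({
        "app": app,
        "external_host": external_host,
        "format": audio_format,
    })
    return f"{base}/channels/externalMedia?{query}"


if __name__ == "__main__":
    # Only prints the request line; sending it would need ARI credentials.
    print("POST", external_media_url(ARI_BASE, STASIS_APP, MEDIA_HOST))
```

Keeping URL construction separate from the HTTP call makes the request easy to inspect before any live PBX is involved.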

02 Transcribe locally

Whisper.net supports adaptive Chinese, English, and mixed-language recognition without making cloud transcription the only path.
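To make the Chinese, English, and mixed-language distinction concrete, here is a crude post-transcription labeler that tags a transcript by its share of CJK characters. It is only an assumption-laden sketch: it does not reflect how Whisper.net detects language, and the function name and the 0.2 threshold are invented for illustration.

```python
def label_language(text: str, threshold: float = 0.2) -> str:
    """Label a transcript as 'zh', 'en', 'mixed', or 'unknown'.

    Counts CJK Unified Ideographs against ASCII letters. Character counts
    favor English (more letters per word), so the threshold is a rough knob.
    """
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = cjk + latin
    if total == 0:
        return "unknown"  # digits, punctuation, or empty input only
    cjk_share = cjk / total
    if cjk_share >= 1 - threshold:
        return "zh"
    if cjk_share <= threshold:
        return "en"
    return "mixed"


if __name__ == "__main__":
    for sample in ("I need pricing and a callback",
                   "我想了解价格",
                   "我想了解 pricing 和 callback"):
        print(sample, "->", label_language(sample))
```

A label like this could be stored next to each caller turn so that a reviewer can see at a glance which turns were mixed-language.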

03 Answer and capture

FAQ matching, short replies, and field capture can collect the caller's name, company, phone, email, and callback reason.
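A minimal sketch of what rule-based FAQ matching and field capture might look like, assuming keyword rules and simple regular expressions. The rules, reply texts, and patterns below are hypothetical placeholders, not the project's actual company profile or matching logic, and only email and phone capture are shown.

```python
import re

# Hypothetical FAQ rules: a keyword set mapped to a short placeholder reply.
FAQ_RULES = [
    ({"price", "pricing", "cost"}, "Placeholder reply about pricing."),
    ({"hours", "open"}, "Placeholder reply about opening hours."),
]

# Deliberately simple patterns; real phone and email formats vary widely.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def match_faq(utterance: str):
    """Return the first reply whose keywords appear in the utterance, else None."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    for keywords, reply in FAQ_RULES:
        if words & keywords:
            return reply
    return None


def capture_fields(utterance: str) -> dict:
    """Extract the email and phone fields that the patterns can find."""
    fields = {}
    email = EMAIL_RE.search(utterance)
    if email:
        fields["email"] = email.group(0)
    phone = PHONE_RE.search(utterance)
    if phone:
        fields["phone"] = phone.group(0).strip()
    return fields


if __name__ == "__main__":
    print(match_faq("I need pricing and a callback"))
    print(capture_fields("Call me at 555-010-2030 or mail jane@example.com"))
```

Name, company, and callback reason are harder to capture with patterns alone, which is one place the optional local LLM refinement could plausibly help.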

04 Transfer or review

Human-transfer intent and saved session packages keep the handoff traceable instead of hiding it in a transcript folder.
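Transfer-intent detection can be pictured as a small decision step over each caller turn. The sketch below assumes plain phrase matching in English and Chinese; the phrase list and function names are hypothetical, and the three outcomes mirror the answer, callback, or transfer decision described on this page.

```python
# Hypothetical trigger phrases; a real list would be tuned on actual calls.
TRANSFER_PHRASES = (
    "speak to a person",
    "talk to someone",
    "real person",
    "human",
    "operator",
    "转人工",
    "人工客服",
)


def wants_human(utterance: str) -> bool:
    """Return True when the utterance contains any transfer phrase."""
    text = utterance.lower()
    return any(phrase in text for phrase in TRANSFER_PHRASES)


def decide(utterance: str, faq_reply) -> str:
    """Pick one of 'transfer', 'answer', or 'callback' for a caller turn.

    A transfer request wins over an FAQ match; with neither, fall back
    to offering a callback.
    """
    if wants_human(utterance):
        return "transfer"
    if faq_reply is not None:
        return "answer"
    return "callback"


if __name__ == "__main__":
    print(decide("Can I talk to someone about my bill?", None))
    print(decide("What is your pricing?", "Placeholder reply about pricing."))
```

Checking for a transfer request before anything else keeps the caller from being stuck in an FAQ loop when they have already asked for a person.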

Structured Capture

The destination matters as much as the transcript.

A phone AI assistant becomes more useful when every answer has a place to land: a lead field, callback reason, task, call note, case record, or human handoff summary.
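One way to give each answer a place to land is a single session record that serializes to JSON. The sketch below is an assumption about shape only: the class names, field names, and default values are hypothetical and do not describe the project's actual saved format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Turn:
    """One conversation turn with its detected language label."""
    speaker: str
    text: str
    language: str


@dataclass
class SessionPackage:
    """Hypothetical call intake record: transcript, fields, decision, timing."""
    session_id: str
    turns: list[Turn] = field(default_factory=list)
    fields: dict[str, str] = field(default_factory=dict)
    decision: str = "callback"
    timings_ms: dict[str, int] = field(default_factory=dict)

    def to_json(self) -> str:
        # ensure_ascii=False keeps Chinese transcript text readable on disk.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    session = SessionPackage(session_id="demo-001")
    session.turns.append(Turn("caller", "I need pricing and a callback.", "en"))
    session.fields["reason"] = "pricing"
    session.timings_ms["transcription"] = 420  # made-up number for the demo
    print(session.to_json())
```

A flat, reviewable record like this is also a convenient shape to hand to a later CRM or callback-queue integration.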

Session package: call intake record

  • Transcript: Caller turns and detected language
  • Fields: Name, company, phone, email, reason
  • Decision: FAQ answer, callback, or transfer
  • Review: Timing data and human-readable summary

Current Boundary

Public wording stays grounded in the live-call demo foundation.

This page does not describe a finished autonomous call center. What has been demonstrated so far is a foundation for local-first intake, Asterisk-connected call handling, structured capture, TTS experiments, and transfer logic.

Built around PBX events

Live Asterisk call plumbing is part of the product story because call state and transfer readiness matter in real business telephony.

Local-first by design

The current direction uses local transcription and optional local model refinement first, rather than assuming a cloud-only pipeline.

Workflow-ready next

The natural next layer is writing captured details into a CRM, case system, callback queue, or custom DigiPBX workflow surface.

Next Step

Discuss an AI voice workflow that stays connected to the PBX.

The strongest starting point is not a generic chatbot. It is the call path, the fields worth capturing, and the moment a human should take over.