Imagine practicing your behavioral interview questions with an AI coach that sounds human, understands your answers in real-time, and provides instant feedback.
No typing. No delays. Just natural conversation.
This guide will show you how to build exactly that in about 5 minutes using open-source infrastructure and two API keys.
Why This Matters
Most interview prep tools force you to type, submit, then wait for feedback. That’s not how real interviews work.
Real interviews are conversational. They’re about thinking on your feet, handling follow-ups, and recovering from stumbles. A voice-based coach helps you practice all of that.
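This project uses OpenAI’s Realtime API, which does low-latency speech-to-speech in a single model. There is no separate speech-to-text, language model, and text-to-speech pipeline to stitch together. It’s direct. It’s fast. It feels real. (Actual response time depends on your network and region, so treat any specific latency figure as a best case.)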
What You’ll Build
A voice AI agent that:
- Asks behavioral interview questions (STAR method)
- Listens to your answer in real-time
- Provides follow-up questions
- Gives constructive feedback
- Maintains conversation context
- Sounds like a real person
All through natural voice conversation.
The Tech Stack (3 Components)
Your Voice
↓
[OpenAI Realtime API] ← Understands, thinks, and responds (all in one)
↓
[LiveKit Agents] ← Infrastructure to run the agent
↓
Speaker Output
Why this approach?
- Speed: Low-latency responses, so fewer awkward pauses
- Simplicity: One API handles everything (not 3 separate services)
- Quality: Expressive, human-sounding voices
Step 1: Get Your API Keys (2 Minutes)
LiveKit Cloud
Go to cloud.livekit.io and sign up (free tier available).
- Create a project → Get your API Key and API Secret
- You get 1 free agent on the cloud
OpenAI
Go to platform.openai.com/account/api-keys.
- Create a new API key → Copy and save it
That’s it. You now have the credentials to run a voice AI agent.
Step 2: Set Up Your Project (1 Minute)
Create a new Python project and install dependencies:
# Create project
uv init interview-coach --bare
cd interview-coach
# Install dependencies
uv add \
"livekit-agents[openai]~=1.2" \
"livekit-plugins-noise-cancellation~=0.2" \
"python-dotenv"
What just happened:
- Created a Python project
- Added LiveKit Agents framework
- Added noise cancellation (removes background noise)
- Added environment variable support
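If you want to confirm the install worked before writing any agent code, a quick import check can help. This is a minimal sketch using only the standard library. The helper name missing_modules and the file name check_install.py are made up for this example; the module names are the import names used by coach.py later in this guide.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "livekit") is not installed.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names used by coach.py later in this guide.
    required = ["livekit.agents", "livekit.plugins.openai", "dotenv"]
    gaps = missing_modules(required)
    print("All set" if not gaps else f"Missing: {', '.join(gaps)}")
```

Save it as check_install.py and run it with uv run check_install.py so it uses the project environment rather than your system Python.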
Add Your Credentials
Create a .env.local file:
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
LIVEKIT_URL=your_livekit_ws_url
OPENAI_API_KEY=your_openai_api_key
Or use the LiveKit CLI:
brew install livekit-cli
lk cloud auth
lk app env -w
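A missing or placeholder credential is the most likely reason the agent fails to start, so it can be worth checking the four variables up front. The sketch below is one way to do that; the function name missing_env_vars is hypothetical, and it assumes that any value still starting with "your_" is an unedited placeholder from the template above.

```python
import os

REQUIRED_VARS = (
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "LIVEKIT_URL",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return required variable names that are unset, blank, or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # Assumption: template values such as "your_openai_api_key" start with "your_".
        if not value or value.startswith("your_"):
            missing.append(name)
    return missing
```

This helper does not read .env.local itself. Call it after load_dotenv(".env.local") has run, and stop with a clear message if the returned list is not empty.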
Step 3: Write Your Coach (1 Minute)
Create a file called coach.py:
from dotenv import load_dotenv

from livekit import agents, rtc
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import openai, noise_cancellation

# Load the LiveKit and OpenAI credentials from .env.local
load_dotenv(".env.local")


class InterviewCoach(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="""You are an interview preparation coach.
Your role:
1. Start with a warm greeting
2. Ask behavioral interview questions one at a time:
   - Tell me about a time you faced a challenging deadline. How did you handle it?
   - Describe a situation where you had to work with someone difficult. How did you manage it?
   - Give an example of when you had to learn a new skill quickly. What was your approach?
   - Tell me about a time you failed. What did you learn?
   - Describe when you took initiative without being asked. What was the outcome?
3. As they answer:
   - Listen actively and acknowledge their response
   - Ask follow-up questions to understand the situation, action, and result (STAR)
   - Provide constructive feedback
   - Point out strengths and areas to improve
   - Give tips to structure their answer better
4. Be encouraging and keep responses conversational (it's a voice interview)
5. After their answer, ask if they'd like another question or to refine this one""")


server = AgentServer()


@server.rtc_session()
async def interview_agent(ctx: agents.JobContext):
    # A single realtime model handles listening, reasoning, and speaking
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="coral")
    )

    await session.start(
        room=ctx.room,
        agent=InterviewCoach(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                # Use the telephony noise filter for SIP (phone) callers, the standard one otherwise
                noise_cancellation=lambda params: (
                    noise_cancellation.BVCTelephony()
                    if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                    else noise_cancellation.BVC()
                ),
            ),
        ),
    )

    # Have the coach speak first instead of waiting for the user
    await session.generate_reply(
        instructions="Greet the user warmly and introduce yourself as their interview coach. Ask if they're ready to practice a behavioral interview."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
That’s your entire agent. ~60 lines of code.
Step 4: Test It Locally (1 Minute)
# Run in console mode (your terminal)
uv run coach.py console
You’ll see something like:
Agents Starting console mode 🚀
MacBook Pro Microphone ▁ ▁ ▅ ▁ ▁ ▅ █ ▅ ▃ ▂ ▁ ▃
Now you can type or speak to practice. Try:
- “Hi, I’m ready to practice”
- Then answer one of the interview questions
The agent will listen, understand your answer, ask follow-ups, and provide feedback.
Step 5: Customize Your Coach (Optional)
Change the Voice
OpenAI Realtime offers several voices, including "coral" (warm), "sage" (professional), "ballad" (expressive), and "echo" (clear). Check OpenAI’s documentation for the current list.
llm=openai.realtime.RealtimeModel(voice="sage")
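If you expect to switch voices often, you could read the choice from an environment variable instead of editing the code each time. The sketch below assumes a variable named COACH_VOICE, which is invented for this example and is not something LiveKit or OpenAI reads on its own. The allowed list only contains the voices named in this guide.

```python
import os

# Voice names mentioned in this guide; the API may offer others.
KNOWN_VOICES = ("coral", "sage", "ballad", "echo")


def pick_voice(env=None, default="coral"):
    """Read COACH_VOICE (a hypothetical variable) and fall back to `default`."""
    env = os.environ if env is None else env
    choice = env.get("COACH_VOICE", "").strip().lower()
    # Unknown or empty values fall back rather than failing at session start.
    return choice if choice in KNOWN_VOICES else default
```

You would then pass voice=pick_voice() to RealtimeModel in coach.py and add a line like COACH_VOICE=sage to .env.local.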
Change the Personality
Edit the instructions string to make the coach stricter, friendlier, or more specific:
instructions="""You are a strict but fair Google interviewer..."""
Add Different Questions
Replace the questions inside the instructions string with your own:
- Tell me about a time you led a team
- Describe a conflict with a colleague and how you resolved it
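Once you have more than a handful of questions or personas, editing one long string by hand gets error-prone. One option is to assemble the instructions from a Python list. The helper below, build_instructions, is a hypothetical name and its wording is a shortened variant of the prompt in coach.py, not a replacement the original code requires.

```python
def build_instructions(questions, persona="an interview preparation coach"):
    """Assemble a coach prompt from a persona and a list of questions."""
    if not questions:
        raise ValueError("Provide at least one interview question.")
    # One indented bullet per question, matching the layout used in coach.py.
    bullet_list = "\n".join(f"   - {q.strip()}" for q in questions)
    return (
        f"You are {persona}.\n"
        "Ask behavioral interview questions one at a time:\n"
        f"{bullet_list}\n"
        "After each answer, ask follow-ups covering situation, task, action, "
        "and result (STAR), then give brief, constructive feedback.\n"
        "Keep responses conversational, since this is a voice interview."
    )
```

In coach.py you would pass the result to super().__init__(instructions=build_instructions([...], persona="a strict but fair interviewer")).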
Step 6: Deploy to the Cloud (Optional)
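Once you’re happy with it in the console, connect it to your LiveKit Cloud project:
# Run in development mode (the agent runs on your machine and connects to LiveKit Cloud)
uv run coach.py dev
While this process is running, the agent is reachable through your LiveKit project, so you can talk to it from your phone, browser, or any device with internet access. Note that dev mode does not host the agent in the cloud: it stops when you close the terminal. To run it on LiveKit Cloud itself, the LiveKit CLI provides lk agent create; check the LiveKit deployment docs for the current steps.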
How It Works Behind the Scenes
- You speak → Audio captured
- Audio sent to OpenAI Realtime → Understands your speech
- OpenAI thinks → Generates response
- Response converted to speech → Played back to you
- Loop continues → Natural conversation flow
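The key: audio goes in and audio comes out of a single model, with no separate transcription and synthesis hops in between. That keeps the delay short enough that it feels like talking to a real person, not a chatbot.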
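Response time is worth measuring on your own setup rather than taking on faith, since it varies with network and region. The sketch below is a small timer for the gap between you finishing a sentence and the coach starting to speak. It is not part of LiveKit or OpenAI: TurnTimer and its method names are hypothetical, and where you call user_stopped() and agent_started() from depends on which session events your version of the framework exposes.

```python
import time
from statistics import median


class TurnTimer:
    """Measure the gap between the user finishing and the agent starting to speak."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timer can be tested without waiting
        self._user_stopped_at = None
        self.gaps_ms = []

    def user_stopped(self):
        """Call when the user's speech ends."""
        self._user_stopped_at = self._clock()

    def agent_started(self):
        """Call when agent audio begins; returns the gap in milliseconds, if any."""
        if self._user_stopped_at is None:
            return None  # the agent spoke first, e.g. the opening greeting
        gap_ms = (self._clock() - self._user_stopped_at) * 1000.0
        self.gaps_ms.append(gap_ms)
        self._user_stopped_at = None
        return gap_ms

    def median_ms(self):
        """Median gap across all measured turns, or None if nothing was measured."""
        return median(self.gaps_ms) if self.gaps_ms else None
```

The median is used instead of the mean so that one slow turn, such as a network hiccup, does not dominate the result.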
Next Steps
You just built a voice AI interview coach in 5 minutes.
But there are a few ways to take it further:
- Add user profiles → Track progress over time
- Record sessions → Review your answers later
- Add analytics → See which questions you struggle with
- Phone interviews → Use LiveKit’s SIP integration to practice actual phone calls
- Multiple agents → Practice with different interviewer personalities
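The progress-tracking and analytics ideas above can start very small. Here is a minimal sketch that appends each practice attempt to a JSON Lines file and reports the questions with the lowest average rating. Everything in it is an assumption made for illustration: the function names, the file layout, and the idea of rating yourself from 1 to 5 after each answer.

```python
import json
from collections import defaultdict
from pathlib import Path


def log_attempt(path, question, self_rating):
    """Append one practice attempt (rating 1-5) to a JSON Lines file."""
    if not 1 <= self_rating <= 5:
        raise ValueError("self_rating must be between 1 and 5")
    record = {"question": question, "self_rating": self_rating}
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def weakest_questions(path, limit=3):
    """Return up to `limit` (question, average rating) pairs, lowest average first."""
    ratings = defaultdict(list)
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                rec = json.loads(line)
                ratings[rec["question"]].append(rec["self_rating"])
    averages = [(q, sum(r) / len(r)) for q, r in ratings.items()]
    return sorted(averages, key=lambda pair: pair[1])[:limit]
```

A plain append-only file keeps this dependency-free. If you later add user profiles, the same records could move into a database with a user ID column.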
Want a Faster Way?
Building your own voice coach is great for learning. But if you want to start practicing interviews right now without any setup:
Try Talent Hub at casuro.ai.
We’ve built a full interview prep platform with:
- Real-time voice interviews
- AI feedback on your answers
- STAR method coaching
- Session recordings and analytics
- 200+ realistic tech and behavioral questions
Get a $10 credit to try it out. Perfect for practicing your first few interviews before you build your own coach.
Key Takeaways
- OpenAI Realtime = Direct speech-to-speech with low latency
- LiveKit Agents = Simple, open infrastructure for building voice AI
- Interview coaching = One of the best use cases for voice AI (practice → real interviews)
- 5 minutes = How long it takes to build a working prototype
- $2-5/month = Estimated cost to run this on OpenAI’s API
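The monthly cost figure depends almost entirely on how much you practice, so it helps to redo the arithmetic with your own numbers. The sketch below uses a deliberately simple model: total practice minutes multiplied by a per-minute rate. That per-minute rate is an assumption you have to supply yourself from OpenAI’s current pricing, which is billed by tokens rather than minutes, so treat the result as a rough estimate only. The function name is hypothetical.

```python
def estimate_monthly_cost(sessions_per_month, minutes_per_session, usd_per_minute):
    """Rough monthly cost: total practice minutes times a per-minute rate."""
    if min(sessions_per_month, minutes_per_session, usd_per_minute) < 0:
        raise ValueError("All inputs must be non-negative.")
    # usd_per_minute is a placeholder you derive from the provider's pricing page.
    return round(sessions_per_month * minutes_per_session * usd_per_minute, 2)
```

Doubling either the number of sessions or their length doubles the estimate, which is a useful sanity check before committing to daily practice.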
Happy building! 🚀