Imagine practicing your behavioral interview questions with an AI coach that sounds human, understands your answers in real time, and provides instant feedback.

No typing. No delays. Just natural conversation.

This guide will show you how to build exactly that in about 5 minutes using open-source infrastructure and two sets of API credentials.

Why This Matters

Most interview prep tools force you to type, submit, then wait for feedback. That’s not how real interviews work.

Real interviews are conversational. They’re about thinking on your feet, handling follow-ups, and recovering from stumbles. A voice-based coach helps you practice all of that.

This project uses OpenAI’s Realtime API, which handles speech-to-speech in a single low-latency model. There is no separate speech-to-text, language model, and text-to-speech pipeline to chain together. It’s direct. It’s fast. It feels real.

What You’ll Build

A voice AI agent that:

Asks behavioral interview questions (STAR method)
Listens to your answer in real time
Provides follow-up questions
Gives constructive feedback
Maintains conversation context
Sounds like a real person

All through natural voice conversation.

The Tech Stack (3 Components)

Your Voice
    ↓
[OpenAI Realtime API] ← Understands, thinks, and responds (all in one)
    ↓
[LiveKit Agents] ← Infrastructure to run the agent
    ↓
Speaker Output

Why this approach?

Speed: Low latency, so no awkward pauses
Simplicity: One API handles everything (not 3 separate services)
Quality: Expressive, human-sounding voices

Step 1: Get Your API Keys (2 Minutes)

LiveKit Cloud

Go to cloud.livekit.io and sign up (free tier available).

Create a project → Get your API Key and API Secret
You get 1 free agent on the cloud

OpenAI

Go to platform.openai.com/account/api-keys.

Create a new API key → Copy and save it

That’s it. You now have the credentials to run a voice AI agent.

Step 2: Set Up Your Project (1 Minute)

Create a new Python project and install dependencies:

# Create project
uv init interview-coach --bare
cd interview-coach

# Install dependencies
uv add \
  "livekit-agents[openai]~=1.2" \
  "livekit-plugins-noise-cancellation~=0.2" \
  "python-dotenv"

What just happened:

Created a Python project
Added LiveKit Agents framework
Added noise cancellation (removes background noise)
Added environment variable support

Add Your Credentials

Create a .env.local file:

LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
LIVEKIT_URL=your_livekit_ws_url
OPENAI_API_KEY=your_openai_api_key

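Or use the LiveKit CLI to write the LiveKit variables for you (macOS with Homebrew shown). You still need to add OPENAI_API_KEY yourself: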

brew install livekit-cli
lk cloud auth
lk app env -w
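
A missing key is the most common reason the agent fails on startup, so it can help to check the four variables before running it. Here is a minimal sketch using only the standard library. `missing_credentials` is a hypothetical helper, not part of LiveKit, and it assumes you call it after `load_dotenv(".env.local")` has run.

```python
import os

# The four variable names come from the .env.local file above.
REQUIRED_VARS = (
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "LIVEKIT_URL",
    "OPENAI_API_KEY",
)


def missing_credentials(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials: " + ", ".join(missing))
    else:
        print("All credentials found.")
```

If the list comes back non-empty, fix .env.local before moving on to the next step.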

Step 3: Write Your Coach (1 Minute)

Create a file called coach.py:

from dotenv import load_dotenv
from livekit import agents, rtc
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import openai, noise_cancellation

load_dotenv(".env.local")

class InterviewCoach(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="""You are an interview preparation coach.

Your role:
1. Start with a warm greeting
2. Ask behavioral interview questions one at a time:
   - Tell me about a time you faced a challenging deadline. How did you handle it?
   - Describe a situation where you had to work with someone difficult. How did you manage it?
   - Give an example of when you had to learn a new skill quickly. What was your approach?
   - Tell me about a time you failed. What did you learn?
   - Describe when you took initiative without being asked. What was the outcome?

3. As they answer:
   - Listen actively and acknowledge their response
   - Ask follow-up questions to understand the situation, action, and result (STAR)
   - Provide constructive feedback
   - Point out strengths and areas to improve
   - Give tips to structure their answer better

4. Be encouraging and keep responses conversational (it's a voice interview)
5. After their answer, ask if they'd like another question or to refine this one""")

# The server registers the session entrypoint and runs agent jobs.
server = AgentServer()

@server.rtc_session()
async def interview_agent(ctx: agents.JobContext):
    # One realtime model handles listening, reasoning, and speaking.
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="coral")
    )

    await session.start(
        room=ctx.room,
        agent=InterviewCoach(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                # Use the telephony-tuned model for SIP (phone) callers, the standard one otherwise.
                noise_cancellation=lambda params: (
                    noise_cancellation.BVCTelephony()
                    if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                    else noise_cancellation.BVC()
                ),
            ),
        ),
    )

    # Have the coach speak first so the user is not met with silence.
    await session.generate_reply(
        instructions="Greet the user warmly and introduce yourself as their interview coach. Ask if they're ready to practice a behavioral interview."
    )

if __name__ == "__main__":
    agents.cli.run_app(server)

That’s your entire agent. ~60 lines of code.

Step 4: Test It Locally (1 Minute)

# Run in console mode (your terminal)
uv run coach.py console

You’ll see something like:

Agents Starting console mode 🚀

MacBook Pro Microphone ▁ ▁ ▅ ▁ ▁ ▅ █ ▅ ▃ ▂ ▁ ▃

Now you can type or speak to practice. Try:

“Hi, I’m ready to practice”
Then answer one of the interview questions

The agent will listen, understand your answer, ask follow-ups, and provide feedback.

Step 5: Customize Your Coach (Optional)

Change the Voice

OpenAI Realtime offers several voices, including "coral" (warm), "sage" (professional), "ballad" (expressive), and "echo" (clear).

llm=openai.realtime.RealtimeModel(voice="sage")
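
If you switch voices often, you may prefer to read the voice from the environment instead of editing the code each time. This is a minimal sketch: `COACH_VOICE` and `pick_voice` are hypothetical names introduced here, and the allowed set holds only the voices named in this section, so extend it if you use others.

```python
import os

# Voice names mentioned in this guide; this set is not exhaustive.
KNOWN_VOICES = {"coral", "sage", "ballad", "echo"}


def pick_voice(env=None, default="coral"):
    """Read the voice from COACH_VOICE, falling back to the default if unset or unknown."""
    env = os.environ if env is None else env
    voice = env.get("COACH_VOICE", default).strip().lower()
    return voice if voice in KNOWN_VOICES else default
```

You would then pass the result to the model, for example `openai.realtime.RealtimeModel(voice=pick_voice())`.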

Change the Personality

Edit the instructions string to make the coach stricter, friendlier, or more specific:

instructions="""You are a strict but fair Google interviewer..."""

Add Different Questions

Replace the questions in the instructions string with your own:

- Tell me about a time you led a team
- Describe a conflict with a colleague and how you resolved it
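
Editing a long triple-quoted string by hand gets error-prone once you keep several question sets. One option is to build the string from a Python list. The sketch below is an assumption about how you might structure it: `build_instructions` is a hypothetical helper, and the prompt it produces is a shortened version of the one in coach.py.

```python
def build_instructions(questions, persona="an interview preparation coach"):
    """Assemble an instructions string from a persona and a list of questions."""
    if not questions:
        raise ValueError("Provide at least one question.")
    # Indent each question the same way the original prompt does.
    question_lines = "\n".join(f"   - {q}" for q in questions)
    return (
        f"You are {persona}.\n\n"
        "Your role:\n"
        "1. Start with a warm greeting\n"
        "2. Ask behavioral interview questions one at a time:\n"
        f"{question_lines}\n\n"
        "3. Ask follow-up questions that cover the situation, action, and result (STAR)\n"
        "4. Give constructive feedback and keep responses conversational\n"
    )


MY_QUESTIONS = [
    "Tell me about a time you led a team",
    "Describe a conflict with a colleague and how you resolved it",
]

instructions = build_instructions(MY_QUESTIONS, persona="a strict but fair interviewer")
```

Pass the result to `super().__init__(instructions=instructions)` in the `InterviewCoach` class.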

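Step 6: Connect to the Cloud (Optional)

Once you’re happy with it locally, connect it to LiveKit Cloud:

# Run in development mode (the agent runs on your machine and connects to LiveKit Cloud)
uv run coach.py dev

While this process is running, your agent is reachable through your LiveKit project. You can connect from your phone, browser, or any device with internet access. For an always-on deployment that doesn’t depend on your machine, see the LiveKit Cloud deployment docs.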

How It Works Behind the Scenes

You speak → Audio captured
Audio sent to OpenAI Realtime → Understands your speech
OpenAI thinks → Generates response
Response converted to speech → Played back to you
Loop continues → Natural conversation flow

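The key: All of this happens in a single low-latency round trip, with no hops between separate services. That’s why it feels like talking to a real person, not a chatbot. Actual response time depends on your network and the model’s load.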
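
Response time varies with network and model load, so it is worth measuring on your own setup. This is a minimal sketch that assumes you have logged two timestamps per turn: when you stopped speaking and when the coach started. `latency_summary` is a hypothetical helper, and the sample numbers are made up for illustration, not measurements.

```python
import statistics


def latency_summary(turns):
    """Summarize response delay from (user_stopped_s, agent_started_s) pairs."""
    if not turns:
        raise ValueError("Need at least one turn.")
    delays_ms = [(started - stopped) * 1000 for stopped, started in turns]
    return {
        "median_ms": statistics.median(delays_ms),
        "max_ms": max(delays_ms),
    }


# Illustrative timestamps in seconds (not real measurements).
sample_turns = [(0.0, 0.5), (10.0, 10.25), (20.0, 21.0)]
summary = latency_summary(sample_turns)
```

The median tells you what a typical turn feels like. The maximum shows the worst pause you sat through.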

Next Steps

You just built a voice AI interview coach in 5 minutes.

But there are a few ways to take it further:

Add user profiles → Track progress over time
Record sessions → Review your answers later
Add analytics → See which questions you struggle with
Phone interviews → Use LiveKit’s SIP integration to practice actual phone calls
Multiple agents → Practice with different interviewer personalities
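
As a starting point for the progress-tracking and analytics ideas above, here is a minimal sketch that logs a self-assessed rating per question to a local file and reports the weakest ones. The file format (JSON Lines), the 1 to 5 scale, and both function names are assumptions for illustration. Nothing here comes from LiveKit or OpenAI.

```python
import json
from collections import defaultdict
from pathlib import Path


def record_attempt(log_path, question, rating):
    """Append one self-assessed rating (1-5) for a question to a JSON Lines file."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps({"question": question, "rating": rating}) + "\n")


def weakest_questions(log_path, limit=3):
    """Return up to `limit` questions, lowest average rating first."""
    ratings = defaultdict(list)
    with Path(log_path).open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                entry = json.loads(line)
                ratings[entry["question"]].append(entry["rating"])
    averages = {q: sum(r) / len(r) for q, r in ratings.items()}
    return sorted(averages, key=averages.get)[:limit]
```

After each session, call `record_attempt("progress.jsonl", question, rating)`, then use `weakest_questions("progress.jsonl")` to decide what to practice next.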

Want a Faster Way?

Building your own voice coach is great for learning. But if you want to start practicing interviews right now without any setup:

Try Talent Hub at casuro.ai.

We’ve built a full interview prep platform with:

Real-time voice interviews
AI feedback on your answers
STAR method coaching
Session recordings and analytics
200+ realistic tech and behavioral questions

Get a $10 credit to try it out. Perfect for practicing your first few interviews before you build your own coach.

Key Takeaways

OpenAI Realtime = Direct speech-to-speech with low latency
LiveKit Agents = Simple, open infrastructure for building voice AI
Interview coaching = One of the best use cases for voice AI (practice → real interviews)
5 minutes = How long it takes to build a working prototype
$2-5/month = Estimated cost to run this on OpenAI’s API with light use (actual cost scales with minutes of conversation)
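
Your monthly cost depends mostly on how many minutes you spend talking, so a quick calculation with your own numbers is more useful than a single figure. In this sketch the per-minute rate is a parameter you must fill in from OpenAI’s current pricing page. The value used below is a placeholder, not a real price, and `estimate_monthly_cost` is a hypothetical helper.

```python
def estimate_monthly_cost(sessions_per_month, minutes_per_session, usd_per_minute):
    """Estimate monthly spend as total conversation minutes times a per-minute rate."""
    if min(sessions_per_month, minutes_per_session, usd_per_minute) < 0:
        raise ValueError("All inputs must be non-negative.")
    total_minutes = sessions_per_month * minutes_per_session
    return round(total_minutes * usd_per_minute, 2)


# Placeholder rate for illustration only; look up the current price before relying on this.
PLACEHOLDER_USD_PER_MINUTE = 0.10
estimate = estimate_monthly_cost(10, 10, PLACEHOLDER_USD_PER_MINUTE)
```

Doubling either the session count or the session length doubles the estimate, so short, focused practice sessions keep costs down.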

Happy building! 🚀
