Get Started with Indus.io
Follow these steps to go from sign-in to a working Voice Agent. Scroll through to see each step with visual guides.
If you’re new, follow the guide once end-to-end, then repeat it with one small change (prompt or voice) so you can clearly see cause → effect.
Step 1: Create New Agent
Start with a blank template and a simple name that matches your use case (e.g. “Appointment Booking”) so testing feels concrete.
Step 2: Define Behavior
Make your System prompt specific (role + goal + do/don’t rules). Keep it short so you can iterate quickly after each test call.
Step 3: Capture Structured Data
Start with 1–2 essential fields (name, reason for call). Add more only after you confirm the agent reliably captures the basics.
Step 4: Set Intelligence & Speech
Change one setting at a time (only voice, or only language, or only temperature) so you can tell exactly what improved or broke.
Optional: Build Complex Flows
Only add workflow/tools after your basic prompt works well in Test Call—otherwise it’s harder to debug what changed.
See a full call flow from your browser
Keep a simple “pass/fail” checklist (greeting, info gathering, closing). When it consistently passes, you’re ready for API integration.
This guide walks you through the Indus.io platform from the moment you log in to running your first voice agent. No prior technical experience is required—follow the sections in order for the smoothest path to a working Voice Agent.
Indus.io (IndusLabs Studio) is a voice AI platform that lets you build and run conversational voice agents. You define agents with a name, a voice (text-to-speech), and a brain (language model). Users then talk to these agents in real time: their speech is transcribed, the agent reasons and replies, and the reply is spoken back—all powered by Indus.io’s APIs and dashboard.
The platform handles speech-to-text (STT), language model (LLM) conversations, and text-to-speech (TTS) in one place. You can manage everything from the dashboard or integrate programmatically via the API and SDK.
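If you later want to script the same capabilities, a programmatic call might look roughly like the sketch below. The endpoint URL, payload fields, and response handling here are placeholders, not the documented Indus.io API; check the API reference for the real shapes.

```python
# Hypothetical sketch of a TTS request. The URL, JSON fields, and
# response format are placeholders, NOT the real Indus.io API; consult
# the API reference for the actual endpoints and schemas.
import requests

API_KEY = "your-api-key"  # issued with your Indus.io account

response = requests.post(
    "https://api.induslabs.example/v1/tts",  # placeholder endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"text": "Hello from my first agent!", "voice": "Indus-hi-maya"},
    timeout=30,
)
response.raise_for_status()

# Assuming the endpoint returns raw audio bytes (an assumption).
with open("reply.wav", "wb") as f:
    f.write(response.content)
```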
You'll need an Indus.io account and an API key. Sign up or log in at the Indus.io dashboard (e.g. playground.induslabs.io). Once you're in, you'll see the home screen with quick access to TTS, STT, Conversational Agents, and usage. The rest of this guide assumes you're logged in.

IndusLabs Studio home: feature cards, usage snapshot, and quick actions like “Create my first call agent.”
Agents are the core of voice interactions on Indus.io. Each agent has a name, a voice (TTS), and a language model (LLM) that generates replies. You can start from a blank agent or use a template (e.g. Receptionist, Healthcare, Lead Qualification).
From the dashboard, use the left sidebar: under Products, open Conversational Agents. You can also use the “Create my first call agent” suggestion on the home page. This is where you create, edit, and list all your voice agents.
Open an existing agent (if you have one) and just scan the tabs (Agent, Config, Workflow, Tools, Branches, Test Call) before you create anything. Knowing what you’ll configure later makes the next steps easier.
Click Create New Agent. In the modal you’ll choose an agent name (required), the agent type, and a template to get started.

Create New Agent: name, type, and template selection.
Start with “Single prompt agent” and a blank template for your first build. It helps you learn the core loop before you add workflow complexity.
After creating the agent, you’ll land on the Agent tab. Here you define how the agent behaves and what it says first.
Write your System prompt and First Message here; you can use {{ variable }} placeholders for dynamic content if supported. Use Save Draft to save your work, or Publish when the agent is ready to go live.
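For instance, assuming {{ }} templating is available on your plan, a First Message could look like this (the variable name is invented for illustration):

```text
Hello {{ caller_name }}, thanks for calling! How can I help you today?
```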

Agent tab: system prompt, first message, and call infields/outcomes.
For a Single prompt agent, aim for a System prompt that clearly covers the agent's role, its goal, and a few do/don't rules. Keep the prompt concise (5–8 short sentences), then iterate in small edits, using Test Call after each change so you can clearly hear the impact.
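As a concrete illustration, a first-pass System prompt for the "Appointment Booking" example from Step 1 might read like this (the clinic details and rules are invented; adapt them to your use case):

```text
Role: You are a phone receptionist for a dental clinic.
Goal: Book, reschedule, or cancel appointments for callers.
Always greet the caller, then ask for their name and the reason for their call.
Ask one question at a time and keep each reply under two sentences.
Do not give medical advice; offer to connect the caller to staff instead.
If the caller drifts off-topic, politely steer the conversation back to scheduling.
```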
Open the Config tab to tune the intelligence engine, speech-to-text, and text-to-speech.
Intelligence Engine: choose the language model (e.g. openai/gpt-oss-120b or groq). Adjust Temperature (focused vs creative), Max Tokens, and Context Turns (conversation memory).
Speech-to-Text: choose the transcription model (e.g. nova-2) and set the Language (e.g. Hindi).
Text-to-Speech: choose the Voice (e.g. Indus-hi-maya) that the agent uses to speak replies.
Config tab: model, voice, and detection parameters.
Make one change at a time here (only STT language, or only voice, or only temperature). That way you can attribute improvements or regressions to a single setting.
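For reference, here are those knobs side by side, sketched as a plain Python dict. The key names are our own shorthand, not the platform's schema; the dashboard's Config tab is the source of truth.

```python
# Illustrative snapshot of the Config tab settings described above.
# Key names are invented for readability; set the real values in the
# dashboard's Config tab.
agent_config = {
    "intelligence": {
        "model": "openai/gpt-oss-120b",  # or a groq-hosted model
        "temperature": 0.3,   # lower = more focused, higher = more creative
        "max_tokens": 256,    # upper bound on each generated reply
        "context_turns": 8,   # how many prior turns the agent remembers
    },
    "speech_to_text": {
        "model": "nova-2",
        "language": "hi",     # e.g. Hindi
    },
    "text_to_speech": {
        "voice": "Indus-hi-maya",
    },
}
```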
For more control over conversation flow and capabilities, use the Workflow and Tools tabs, along with First Message settings.

Workflow: Start and Subagent nodes with configuration panel.

First Message & Infields: Configure greeting and structured data capture.
Only add workflow complexity or custom tools after your basic prompt works well in a Test Call.
The Test Call tab is the fastest way to understand how voice sessions work on Indus.io. You can start a call in your browser, speak to your agent, and hear replies in real time without writing any code.
Open your agent, make sure you have saved your latest changes, and go to the Test Call tab. Under Web Call, select a Voice (for example, Indus-hi-maya) and click Start Web Call to begin a browser-based session.
Before starting, say out loud what you expect the agent to do (e.g. “Greet me and ask one question”). This makes it easier to notice when the behavior doesn’t match your intent.
Once the call starts, speak normally into your microphone. Indus.io captures your audio, runs Speech-to-Text (STT) to transcribe it, sends the text to the agent’s language model (LLM), and then uses Text-to-Speech (TTS) to play the reply back in the voice you configured.
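If it helps to see that chain in code, here is a conceptual sketch of a single conversational turn. The three stage functions are stubs standing in for the platform's STT, LLM, and TTS; none of them are real Indus.io SDK calls.

```python
# Conceptual sketch of one Test Call turn: audio in -> STT -> LLM -> TTS.
# All three stage functions are stubs, not real Indus.io SDK calls.

def speech_to_text(audio: bytes) -> str:
    return "I'd like to book an appointment."  # stub transcript

def language_model(history: list[dict]) -> str:
    return "Sure! May I have your name, please?"  # stub reply

def text_to_speech(text: str) -> bytes:
    return text.encode()  # stub: real TTS returns audio in your configured voice

def handle_turn(mic_audio: bytes, history: list[dict]) -> bytes:
    user_text = speech_to_text(mic_audio)                         # 1. transcribe
    history.append({"role": "user", "content": user_text})
    reply_text = language_model(history)                          # 2. reason
    history.append({"role": "assistant", "content": reply_text})
    return text_to_speech(reply_text)                             # 3. speak
```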
Try a few different phrasings of the same question (simple, detailed, and edge-case scenarios). This helps you understand how robust your current prompt and configuration are.
After a short conversation, end the call and review what happened: Did the agent greet correctly? Did it follow your instructions from the Agent and Workflow tabs? Did it capture the right information in Call Infields or Outcomes (if configured)?
For every issue you notice (tone, missing questions, wrong language, etc.), write down a one-line fix (e.g. “Ask for the customer’s name before anything else”) and immediately update the System prompt, Workflow node, or Config settings.
Run another Test Call after you change prompts, workflow nodes, or voice/STT settings. Each call replays the full chain: user speech → STT → LLM reasoning → TTS voice. Repeating this loop is the quickest way to build intuition for how small changes affect the live experience.
Create a short checklist (e.g. greeting, information gathering, closing) and confirm each item during every Test Call. When the call consistently passes your checklist, you are ready to move on to API or telephony integration.
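A minimal version of that checklist, built from the items suggested in this guide, might look like:

```text
[ ] Greeting: opens with the configured First Message
[ ] Info gathering: asks for the caller's name and reason for calling
[ ] Instructions: follows the do/don't rules in the System prompt
[ ] Closing: summarizes and ends the call politely
```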
After you are comfortable with Test Call, explore Voice Agents for API-level session control and TTS / STT docs to deepen your understanding of the underlying building blocks.