Generate intelligent chat completions using state-of-the-art language models. Our API is fully compatible with the OpenAI SDK, making integration seamless. Use /v1/chat/completions for both streaming and non-streaming responses.
Need an API Key? If you don't have an API key yet, you can create one here: https://playground.induslabs.io/register
Our LLM service is fully compatible with the OpenAI Python and JavaScript SDKs. Simply set the `base_url` parameter to `https://voice.induslabs.io/v1` and use your API key.

Set `finetune: true` in the `extra_body` parameter to use our fine-tuned model, specifically optimized for voice agent use cases. This model delivers superior performance in conversational AI applications, with more natural, context-aware responses tailored for voice interactions.
Available models: `gpt-oss-120b`, `llama-4-maverick`

All requests require authentication via the Authorization header with a Bearer token:
Authorization: Bearer YOUR_API_KEY
Your API key can be found in your dashboard at playground.induslabs.io
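If you prefer not to use the OpenAI SDK, the same Bearer header can be sent over plain HTTP. A minimal stdlib sketch (the `auth_headers` helper is illustrative, not part of the API):

```python
import json
import urllib.request

def auth_headers(api_key: str) -> dict:
    """Headers every request to the API must carry."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Build a raw (non-SDK) request to the chat completions endpoint.
body = json.dumps({
    "model": "gpt-oss-120b",
    "messages": [{"role": "user", "content": "Hello!"}],
}).encode()

req = urllib.request.Request(
    "https://voice.induslabs.io/v1/chat/completions",
    data=body,
    headers=auth_headers("YOUR_API_KEY"),
)

# Sending it requires a real key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```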
```python
from openai import OpenAI

# Initialize the client
client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# Simple completion
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Hello! How are you?"}
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)
```
```python
from openai import OpenAI

# Initialize the client
client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# Voice agent optimized completion
response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi, introduce yourself?"}
    ],
    temperature=0.7,
    max_tokens=1000,
    extra_body={
        "finetune": True,  # Enable voice-optimized model
        "language": "en",
        "gender": "female",
        "accent": "american"
    }
)

print(response.choices[0].message.content)
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# Hindi language with Indian accent
response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[
        {"role": "user", "content": "नमस्ते, आप कैसे हैं?"}
    ],
    temperature=0.7,
    max_tokens=1000,
    extra_body={
        "finetune": True,
        "language": "hi",
        "gender": "female",
        "accent": "indian"
    }
)

print(response.choices[0].message.content)
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# Spanish with Mexican accent
response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[
        {"role": "user", "content": "¿Puedes presentarte?"}
    ],
    temperature=0.7,
    max_tokens=1000,
    extra_body={
        "finetune": True,
        "language": "es",
        "gender": "female",
        "accent": "mexican"
    }
)

print(response.choices[0].message.content)
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# Male voice with British accent
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Tell me something interesting!"}
    ],
    temperature=0.7,
    max_tokens=1000,
    extra_body={
        "finetune": True,
        "language": "en",
        "gender": "male",
        "accent": "british"
    }
)

print(response.choices[0].message.content)
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://voice.induslabs.io/v1",
    api_key="YOUR_API_KEY"
)

# With stop sequences and max tokens
response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[
        {"role": "user", "content": "Write a short motivational quote!"}
    ],
    temperature=0.7,
    max_tokens=150,
    stop=["!", "\n"],
    extra_body={
        "finetune": True,
        "language": "en"
    }
)

print(response.choices[0].message.content)
```
**POST /v1/chat/completions**

Generate chat completions using large language models. Supports both streaming and non-streaming responses, with optional voice agent optimization.
**Request body**

| Name | Type | Default | Description |
|---|---|---|---|
| messages | array | required | Array of message objects with `role` and `content`. |
| model | string | required | Model ID (e.g., `"gpt-oss-120b"`, `"llama-4-maverick"`). |
| temperature | number | 1.0 | Sampling temperature (0–2). Higher values = more random. |
| max_tokens | integer | null | Maximum tokens to generate in the completion. |
| top_p | number | 1.0 | Nucleus sampling parameter (0–1). |
| stream | boolean | false | Whether to stream responses via SSE. |
| stop | array | null | Stop sequences to end generation early. |
| extra_body | object | null | Additional parameters for voice agent optimization (see below). |
**Responses**

| Status | Type | Description |
|---|---|---|
| 200 OK | application/json | Returns the complete chat completion with usage metrics. |
| 401 Unauthorized | application/json | Invalid or missing API key. |
| 422 Validation Error | application/json | Validation failure. Inspect the `detail` array. |
| 503 Service Unavailable | application/json | LLM service temporarily unavailable. |
**extra_body parameters**

| Name | Type | Default | Description |
|---|---|---|---|
| finetune | boolean | false | Enable the fine-tuned model optimized for voice agent use cases. |
| language | string | null | Target language code (e.g., `"en"`, `"hi"`, `"es"`, `"fr"`). |
| gender | string | null | Voice gender preference: `"male"` or `"female"`. |
| accent | string | null | Accent/dialect (e.g., `"american"`, `"british"`, `"indian"`, `"mexican"`). |
Example 200 response:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 20,
    "total_tokens": 32
  }
}
```
401 response:

```json
{"detail": "Invalid API key"}
```

422 response:

```json
{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}
```
**POST /v1/chat/completions (Streaming)**

Stream chat completions in real time using Server-Sent Events. Set `stream: true` in the request body.
**Request body**

| Name | Type | Default | Description |
|---|---|---|---|
| messages | array | required | Array of message objects with `role` and `content`. |
| model | string | required | Model ID (e.g., `"gpt-oss-120b"`, `"llama-4-maverick"`). |
| temperature | number | 1.0 | Sampling temperature (0–2). Higher values = more random. |
| max_tokens | integer | null | Maximum tokens to generate in the completion. |
| top_p | number | 1.0 | Nucleus sampling parameter (0–1). |
| stream | boolean | false | Whether to stream responses via SSE. |
| stop | array | null | Stop sequences to end generation early. |
| extra_body | object | null | Additional parameters for voice agent optimization (see below). |
**Responses**

| Status | Type | Description |
|---|---|---|
| 200 OK | text/event-stream | Returns chat completion chunks via SSE. |
| 401 Unauthorized | application/json | Invalid or missing API key. |
| 422 Validation Error | application/json | Validation failure. Inspect the `detail` array. |
Example stream:

```text
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-oss-120b","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-oss-120b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-oss-120b","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-oss-120b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
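Consuming the stream means concatenating each chunk's `choices[0].delta.content` until the `[DONE]` sentinel. With the OpenAI SDK you simply iterate over `client.chat.completions.create(..., stream=True)`; the raw SSE assembly can be sketched as follows (the `join_deltas` helper is illustrative, not part of the API):

```python
import json

def join_deltas(sse_lines):
    """Assemble the assistant's reply from SSE `data:` lines.

    Each chunk's choices[0].delta may carry a piece of content;
    the stream ends with the literal sentinel `[DONE]`.
    """
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

# The chunks from the example stream above reassemble to "Hello!".
stream = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
print(join_deltas(stream))  # Hello!
```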
401 response:

```json
{"detail": "Invalid API key"}
```

422 response:

```json
{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}
```
**GET /v1/chat/models**

Retrieve a list of available chat models with their metadata.
**Responses**

| Status | Type | Description |
|---|---|---|
| 200 OK | application/json | Returns list of available models. |
Example response:

```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-oss-120b",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    },
    {
      "id": "llama-4-maverick",
      "object": "model",
      "created": 1677610602,
      "owned_by": "meta"
    }
  ]
}
```
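The response follows the OpenAI list format, so pulling out the usable model IDs is a one-liner. A small sketch against the sample payload above (the `model_ids` helper is illustrative; fetching it live would be a GET with the same Bearer header as the other endpoints):

```python
import json

def model_ids(models_response: dict) -> list:
    """Extract the model IDs from a /v1/chat/models response."""
    return [m["id"] for m in models_response["data"]]

# Sample payload in the shape returned by the endpoint.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "gpt-oss-120b", "object": "model", "created": 1677610602, "owned_by": "openai"},
    {"id": "llama-4-maverick", "object": "model", "created": 1677610602, "owned_by": "meta"}
  ]
}
""")

print(model_ids(sample))  # ['gpt-oss-120b', 'llama-4-maverick']
```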