Text-to-Speech Service

Deliver natural-sounding speech with configurable voices, streaming playback, and file-based output. The synthesis endpoints share one JSON payload sent via POST, which keeps integration straightforward.

Need an API key? If you don't have one yet, create it here: https://playground.induslabs.io/register

Shared Request Payload

All text-to-speech endpoints accept the same JSON schema in the body of a POST request. Adjust parameters such as output_format or stream to suit your use case.

{
  "text": "Hello, this is a test request.",
  "voice": "Indus-hi-maya",
  "output_format": "wav",
  "model": "indus-tts-v1",
  "api_key": "YOUR_API_KEY",
  "normalize": true,
  "stream": true,
  "speed": 1,
  "pitch_shift": 0,
  "loudness_db": 0
}

Payload Fields

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | required | The text to be synthesized into speech. |
| voice | string | Indus-hi-maya | The voice model to be used (e.g., "Indus-hi-maya"). |
| output_format | string | wav | Audio format for output (e.g., "wav", "mp3", "pcm"). |
| model | string | indus-tts-v1 | The TTS model to use (e.g., "indus-tts-v1"). |
| api_key | string | required | Authentication API key. |
| normalize | boolean | true | Whether to normalize text before synthesis (default: true). |
| stream | boolean | true | Whether to stream the output (default: true). |
| speed | number | 1 | Speed of speech synthesis (default: 1). |
| pitch_shift | number | 0 | Pitch shift adjustment (default: 0). |
| loudness_db | number | 0 | Loudness adjustment in decibels (default: 0). |

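
Because only text and api_key are required, the documented defaults can be folded into a small helper so callers supply just what differs. The following is a minimal Python sketch: build_payload and DEFAULTS are hypothetical names, not part of any official SDK, and the default values are copied from the field table above.

```python
import json

# Defaults copied from the Payload Fields table; the names here are illustrative.
DEFAULTS = {
    "voice": "Indus-hi-maya",
    "output_format": "wav",
    "model": "indus-tts-v1",
    "normalize": True,
    "stream": True,
    "speed": 1,
    "pitch_shift": 0,
    "loudness_db": 0,
}


def build_payload(text, api_key, **overrides):
    """Return the shared TTS payload with documented defaults filled in."""
    if not text:
        raise ValueError("text is required")
    if not api_key:
        raise ValueError("api_key is required")
    # Reject field names that the payload table does not list.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown payload fields: {sorted(unknown)}")
    return {"text": text, "api_key": api_key, **DEFAULTS, **overrides}


# Example: switch the output format and leave everything else at its default.
payload = build_payload("Hello, this is a test request.", "YOUR_API_KEY", output_format="mp3")
body = json.dumps(payload)  # JSON string ready to send as the POST body
```

The same payload can then be sent to any of the POST endpoints described below.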
POST /v1/audio/speech

Synthesize Speech

This endpoint synthesizes speech from the input text and streams the audio data back in the response.

Functionality
  • Converts input text into speech audio.
  • Authenticates through the credit system using the api_key in the payload.
  • Returns audio data directly in the response body.
  • Supports streaming for real-time audio playback.
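
A streaming client might consume the response in chunks as they arrive rather than waiting for the full body. The sketch below uses only the Python standard library. The base URL is a placeholder, since this document does not state the API host, and build_speech_request and stream_speech are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the API host is not given in this document; replace this placeholder.
BASE_URL = "https://tts.example.invalid"


def build_speech_request(payload, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    return urllib.request.Request(
        url=base_url + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_speech(payload, on_chunk, chunk_size=4096, base_url=BASE_URL):
    """Send the request and pass each audio chunk to on_chunk as it arrives.

    Returns the total number of audio bytes received.
    """
    request = build_speech_request(payload, base_url)
    total = 0
    with urllib.request.urlopen(request) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # empty read means the stream has ended
                break
            on_chunk(chunk)
            total += len(chunk)
    return total
```

For example, passing the extend method of a bytearray as on_chunk collects the audio in memory, while a player's write callback would start playback before synthesis finishes.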

Inputs

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | required | The text to be synthesized into speech. |
| voice | string | Indus-hi-maya | The voice model to be used (e.g., "Indus-hi-maya"). |
| output_format | string | wav | Audio format for output (e.g., "wav", "mp3", "pcm"). |
| model | string | indus-tts-v1 | The TTS model to use (e.g., "indus-tts-v1"). |
| api_key | string | required | Authentication API key. |
| normalize | boolean | true | Whether to normalize text before synthesis (default: true). |
| stream | boolean | true | Whether to stream the output (default: true). |
| speed | number | 1 | Speed of speech synthesis (default: 1). |
| pitch_shift | number | 0 | Pitch shift adjustment (default: 0). |
| loudness_db | number | 0 | Loudness adjustment in decibels (default: 0). |

Outputs

| Status | Type | Default | Description |
| --- | --- | --- | --- |
| 200 OK | audio/wav | - | Returns synthesized speech audio as binary data. |
| 422 Validation Error | application/json | - | Validation failure. Inspect detail array. |

200 OK

Binary audio data (WAV format)

422 Validation Error

{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}

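
The detail array in a 422 response can be flattened into readable messages before being logged or shown to a user. The sketch below assumes only the schema shown above (loc, msg, type); format_validation_errors is a hypothetical helper name.

```python
import json


def format_validation_errors(body):
    """Turn a 422 response body (str or bytes) into one line per detail entry."""
    detail = json.loads(body).get("detail", [])
    lines = []
    for item in detail:
        # "loc" can mix strings and integers, so convert each part before joining.
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines


# Using the schema placeholder values from the sample response:
sample = '{"detail": [{"loc": ["string", 0], "msg": "string", "type": "string"}]}'
messages = format_validation_errors(sample)
```

With urllib.request, a 422 response is raised as urllib.error.HTTPError; calling read() on the exception returns the JSON body to pass to this function.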
POST /v1/audio/speech/file

Synthesize Speech File

This endpoint synthesizes speech from the input text and returns the complete audio as a downloadable file.

Functionality
  • Converts input text into speech audio.
  • Returns the synthesized audio as a complete file download.
  • Unlike /v1/audio/speech, this endpoint returns the full audio file at once.
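
Since this endpoint returns the whole file in one response, a client can read the body once and write it to disk. The following is a minimal sketch using the Python standard library. The base URL is a placeholder (the API host is not stated in this document), and build_file_request, output_path, and synthesize_to_file are hypothetical helper names.

```python
import json
import urllib.request
from pathlib import Path

# Assumption: the API host is not given in this document; replace this placeholder.
BASE_URL = "https://tts.example.invalid"


def build_file_request(payload, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/audio/speech/file."""
    return urllib.request.Request(
        url=base_url + "/v1/audio/speech/file",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def output_path(directory, stem, output_format):
    """Name the file after the requested output_format, e.g. greeting.wav."""
    return Path(directory) / f"{stem}.{output_format}"


def synthesize_to_file(payload, directory=".", stem="speech", base_url=BASE_URL):
    """Request the complete audio file and save it; returns the written path."""
    request = build_file_request(payload, base_url)
    with urllib.request.urlopen(request) as response:
        audio = response.read()  # the full file arrives in a single response body
    destination = output_path(directory, stem, payload.get("output_format", "wav"))
    destination.write_bytes(audio)
    return destination
```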

Inputs

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | required | The text to be synthesized into speech. |
| voice | string | Indus-hi-maya | The voice model to be used (e.g., "Indus-hi-maya"). |
| output_format | string | wav | Audio format for output (e.g., "wav", "mp3", "pcm"). |
| model | string | indus-tts-v1 | The TTS model to use (e.g., "indus-tts-v1"). |
| api_key | string | required | Authentication API key. |
| normalize | boolean | true | Whether to normalize text before synthesis (default: true). |
| stream | boolean | true | Whether to stream the output (default: true). |
| speed | number | 1 | Speed of speech synthesis (default: 1). |
| pitch_shift | number | 0 | Pitch shift adjustment (default: 0). |
| loudness_db | number | 0 | Loudness adjustment in decibels (default: 0). |

Outputs

| Status | Type | Default | Description |
| --- | --- | --- | --- |
| 200 OK | audio/wav | - | Returns the synthesized speech audio as a downloadable file. |
| 422 Validation Error | application/json | - | Validation failure. Inspect detail array. |

200 OK

Binary audio data (WAV format)

422 Validation Error

{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}

POST /v1/audio/speech/preview

Speech Preview

This endpoint previews how text will be processed for speech synthesis without generating any audio.

Functionality
  • Accepts input text and parameters, then shows how the text would be processed by the TTS system.
  • Does not generate audio; it returns only metadata and an analysis of the input.
  • Useful for estimating credits, duration, and validating parameters before synthesis.

Inputs

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| text | string | required | The text to be synthesized into speech. |
| voice | string | Indus-hi-maya | The voice model to be used (e.g., "Indus-hi-maya"). |
| output_format | string | wav | Audio format for output (e.g., "wav", "mp3", "pcm"). |
| model | string | indus-tts-v1 | The TTS model to use (e.g., "indus-tts-v1"). |
| api_key | string | required | Authentication API key. |
| normalize | boolean | true | Whether to normalize text before synthesis (default: true). |
| stream | boolean | true | Whether to stream the output (default: true). |
| speed | number | 1 | Speed of speech synthesis (default: 1). |
| pitch_shift | number | 0 | Pitch shift adjustment (default: 0). |
| loudness_db | number | 0 | Loudness adjustment in decibels (default: 0). |

Outputs

| Status | Type | Default | Description |
| --- | --- | --- | --- |
| 200 OK | application/json | - | Returns detailed analysis including character count, word count, estimated duration, credit cost, and configuration details. |
| 422 Validation Error | application/json | - | Validation failure. Inspect detail array. |

200 OK

{
  "analysis": {
    "total_characters": 30,
    "total_words": 6,
    "estimated_duration_seconds": 2.4,
    "estimated_credits": 0.04,
    "chunking_strategy": {
      "total_chunks": 1,
      "max_words_per_chunk": 15,
      "overlap_words": 0,
      "chunks": [
        {
          "index": 0,
          "word_count": 6,
          "text_preview": "Hello, this is a test request.",
          "is_final": true
        }
      ]
    }
  },
  "configuration": {
    "voice": "Indus-hi-maya",
    "model": "indus-tts-v1",
    "output_format": "wav",
    "stream": true,
    "temperature": 0.6,
    "max_tokens": 1800,
    "top_p": 0.8,
    "repetition_penalty": 1.1,
    "bitrate": null
  },
  "user_info": {
    "user_id": "USR_A3E785AF",
    "credits_remaining": 399.58,
    "tts_unit_cost": 1,
    "sufficient_credits": true
  },
  "output_settings": {
    "format": "wav",
    "voice": "Indus-hi-maya",
    "model": "indus-tts-v1",
    "streaming": true,
    "sample_rate": 24000,
    "channels": 1,
    "bit_depth": 16
  },
  "text_processing": {
    "original_text": "Hello, this is a test request.",
    "processed_text": null,
    "normalization_applied": false,
    "normalize_setting": true,
    "character_change": 0
  },
  "size_estimates": {
    "pcm_bytes": 115200,
    "wav_bytes": 115244,
    "mp3_bytes": 38400,
    "target_format_bytes": 115244
  }
}
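
The size_estimates in this sample are consistent with ordinary uncompressed-audio arithmetic: duration × sample_rate × channels × bit_depth / 8 gives 2.4 × 24000 × 1 × 2 = 115200 PCM bytes. The sketch below reproduces the sample figures. The 44-byte WAV header and the 128 kbps MP3 bitrate are inferences from these numbers, not documented behaviour of the service, and estimate_sizes is a hypothetical helper name.

```python
def estimate_sizes(duration_seconds, sample_rate, channels, bit_depth, mp3_kbps=128):
    """Estimate output sizes in bytes from the preview's output_settings values."""
    bytes_per_second = sample_rate * channels * bit_depth // 8
    pcm_bytes = round(duration_seconds * bytes_per_second)
    # Assumption: a canonical 44-byte PCM WAV header, matching the sample response.
    wav_bytes = pcm_bytes + 44
    # Assumption: constant 128 kbps encoding, which matches the sample mp3_bytes.
    mp3_bytes = round(duration_seconds * mp3_kbps * 1000 / 8)
    return {"pcm_bytes": pcm_bytes, "wav_bytes": wav_bytes, "mp3_bytes": mp3_bytes}


# Values taken from the sample response: 2.4 s, 24000 Hz, mono, 16-bit.
sizes = estimate_sizes(2.4, sample_rate=24000, channels=1, bit_depth=16)
```

A client could use the same arithmetic to sanity-check a preview or to size a playback buffer before requesting audio.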

422 Validation Error

{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}

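
One way to use the preview is as a gate before calling a synthesis endpoint: check the estimated cost against the remaining credits and only then request audio. The sketch below reads the analysis and user_info fields from the sample 200 response; preview_summary is a hypothetical helper name, and how the service actually deducts credits is not specified here.

```python
def preview_summary(preview):
    """Extract the cost-related fields from a preview response dict."""
    analysis = preview["analysis"]
    user = preview["user_info"]
    estimated = analysis["estimated_credits"]
    remaining = user["credits_remaining"]
    return {
        "estimated_credits": estimated,
        "credits_remaining": remaining,
        "estimated_duration_seconds": analysis["estimated_duration_seconds"],
        # Trust the server's flag, and also compare the numbers locally.
        "ok_to_synthesize": bool(user["sufficient_credits"]) and estimated <= remaining,
    }


# Fields copied from the sample preview response.
sample_preview = {
    "analysis": {"estimated_duration_seconds": 2.4, "estimated_credits": 0.04},
    "user_info": {"credits_remaining": 399.58, "sufficient_credits": True},
}
summary = preview_summary(sample_preview)
```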
GET /api/voice/get-voices

List Available Voices

Retrieves the catalog of voices available for speech synthesis across multiple languages.

Functionality
  • Returns a comprehensive list of available voices organized by language.
  • Each voice includes name, voice_id, and gender information.
  • Supports multiple languages including Hindi, English, Bengali, Kannada, Marathi, Telugu, Arabic, and regional languages.
  • No authentication required for this endpoint.

Outputs

| Status | Type | Default | Description |
| --- | --- | --- | --- |
| 200 OK | application/json | - | Returns voice catalog organized by language with name, voice_id, and gender for each voice. |
| 422 Validation Error | application/json | - | Validation failure. Inspect detail array. |

200 OK

{
  "status_code": 200,
  "message": "Voices fetched successfully",
  "error": null,
  "data": {
    "hindi": [
      { "name": "Maya", "voice_id": "Indus-hi-maya", "gender": "female" },
      { "name": "Urvashi", "voice_id": "Indus-hi-Urvashi", "gender": "female" },
      { "name": "Aditi", "voice_id": "Indus-hi-Aditi", "gender": "female" },
      { "name": "Arjun", "voice_id": "Indus-hi-Arjun", "gender": "male" }
    ],
    "english": [
      { "name": "Maya", "voice_id": "Indus-en-maya", "gender": "female" },
      { "name": "Urvashi", "voice_id": "Indus-en-Urvashi", "gender": "female" }
    ],
    "bengali": [
      { "name": "Alivia", "voice_id": "Indus-bn-Alivia", "gender": "female" },
      { "name": "Sayan", "voice_id": "Indus-bn-Sayan", "gender": "male" }
    ],
    "kannada": [
      { "name": "Aahna", "voice_id": "Indus-bn-Aahna", "gender": "female" },
      { "name": "Chinmay", "voice_id": "Indus-bn-Chinmay", "gender": "male" }
    ],
    "arabic": [
      { "name": "Fatima", "voice_id": "Indus-ar-Fatima", "gender": "female" },
      { "name": "Hamdan", "voice_id": "Indus-ar-Hamdan", "gender": "male" }
    ]
  }
}
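
Because the catalog is keyed by language, picking a voice_id for the voice field of the synthesis payload amounts to a lookup and filter. The sketch below assumes the response shape shown above. The base URL is a placeholder (the API host is not stated in this document, and it may differ from the synthesis host), and fetch_voices and find_voices are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the API host is not given in this document; replace this placeholder.
BASE_URL = "https://tts.example.invalid"


def fetch_voices(base_url=BASE_URL):
    """GET the voice catalog; the document states no authentication is needed."""
    with urllib.request.urlopen(base_url + "/api/voice/get-voices") as response:
        return json.load(response)


def find_voices(catalog, language=None, gender=None):
    """Return voice_id values from the catalog, optionally filtered."""
    matches = []
    for lang, voices in catalog.get("data", {}).items():
        if language is not None and lang != language:
            continue
        for voice in voices:
            if gender is not None and voice.get("gender") != gender:
                continue
            matches.append(voice["voice_id"])
    return matches


# A trimmed copy of the sample response, used here instead of a live request.
sample_catalog = {
    "data": {
        "hindi": [
            {"name": "Maya", "voice_id": "Indus-hi-maya", "gender": "female"},
            {"name": "Arjun", "voice_id": "Indus-hi-Arjun", "gender": "male"},
        ],
        "arabic": [
            {"name": "Hamdan", "voice_id": "Indus-ar-Hamdan", "gender": "male"},
        ],
    }
}
hindi_male = find_voices(sample_catalog, language="hindi", gender="male")
```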

422 Validation Error

{
  "detail": [
    {
      "loc": ["string", 0],
      "msg": "string",
      "type": "string"
    }
  ]
}