Get Voices

Retrieve a list of all available voices for text-to-speech synthesis.

Authentication

Requires API key authentication via the `x-api-key` request header.
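
As a minimal sketch of the header requirement above, the request can be built with Python's standard library. The base URL and the `/voices` path are placeholders assumed for illustration, since this page does not state them; only the `x-api-key` header name comes from the documentation.

```python
import urllib.request

# Hypothetical values: this page does not state the base URL or the endpoint path.
BASE_URL = "https://api.example.com"
VOICES_PATH = "/voices"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries the x-api-key header."""
    return urllib.request.Request(
        BASE_URL + VOICES_PATH,
        headers={"x-api-key": api_key},
        method="GET",
    )
```

Sending the request with `urllib.request.urlopen(...)` and decoding the body with `json.load(...)` should then yield the array of voice objects described below, assuming the placeholders are replaced with the real endpoint.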

Response

Returns an array of voice objects.

Response Schema

Each voice object contains:

| Field | Type | Description |
| --- | --- | --- |
| `voice_id` | string | Unique identifier for the voice (used in text-to-speech requests) |
| `name` | string | Human-readable name of the voice |
| `description` | string \| null | Detailed description of the voice characteristics |
| `gender` | string \| null | Gender of the voice (`male`, `female`, or `null`) |
| `age` | string \| null | Age category of the voice (`middle`, `elderly`, or `null`) |
| `languages` | array[string] | List of language codes supported by the voice (e.g., `["ar", "en"]`) |
| `dialect` | array[string] | List of dialects supported by the voice (e.g., `["fusha", "emirati", "najdi"]`) |
| `type` | string \| null | Voice type (`neural` or `null`) |
| `sample_url` | string | URL to an audio sample of the voice |
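
One way to work with this schema on the client side is to map each decoded JSON object onto a typed record. The sketch below is illustrative only: the `Voice` class and `parse_voice` helper are hypothetical names, not part of the API, and the nullable fields default to `None`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Voice:
    """Client-side record mirroring the documented voice fields."""
    voice_id: str
    name: str
    languages: list[str]
    dialect: list[str]
    sample_url: str
    description: Optional[str] = None
    gender: Optional[str] = None  # "male", "female", or None
    age: Optional[str] = None     # "middle", "elderly", or None
    type: Optional[str] = None    # "neural" or None


def parse_voice(raw: dict[str, Any]) -> Voice:
    """Convert one decoded JSON voice object into a Voice record."""
    return Voice(
        voice_id=raw["voice_id"],
        name=raw["name"],
        languages=list(raw["languages"]),
        dialect=list(raw["dialect"]),
        sample_url=raw["sample_url"],
        description=raw.get("description"),
        gender=raw.get("gender"),
        age=raw.get("age"),
        type=raw.get("type"),
    )
```

Indexing with `raw["voice_id"]` makes a missing non-nullable field fail loudly with a `KeyError`, while `raw.get(...)` tolerates absent or `null` optional fields.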

Usage

Pass the `voice_id` from the response to the text-to-speech generation endpoints:

POST /text-to-speech/:model_id - Include voice_id in the request body
WS /text-to-speech - Include voice_id in the WebSocket message
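
The HTTP variant above might be prepared as follows. This is a sketch under stated assumptions: the `text` body field and the helper name `build_tts_request` are hypothetical, because this page only documents that `voice_id` belongs in the request body and that the model ID is part of the path.

```python
import json
from urllib.parse import quote


def build_tts_request(model_id: str, voice_id: str, text: str) -> tuple[str, bytes]:
    """Return the request path and JSON body for POST /text-to-speech/:model_id."""
    # The model ID is URL-encoded so it is safe to place in the path.
    path = "/text-to-speech/" + quote(model_id, safe="")
    # "voice_id" is documented; "text" is an assumed field name for the input.
    body = json.dumps({"voice_id": voice_id, "text": text}).encode("utf-8")
    return path, body
```

The returned path and body can be sent with any HTTP client, together with the `x-api-key` header.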

Voice Types

Voices can be categorized by:

Dialect: fusha (Modern Standard Arabic), emirati, najdi, hijazi, kuwaiti, egyptian, british, etc.
Gender: male or female
Age: middle or elderly
Languages: Supported language codes (e.g., ar for Arabic, en for English)
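
The categories above lend themselves to client-side filtering of the returned list. The helper below is a hypothetical sketch (the name `filter_voices` is not part of the API); it treats each voice as a plain dictionary decoded from the JSON response.

```python
from typing import Any, Optional


def filter_voices(
    voices: list[dict[str, Any]],
    *,
    dialect: Optional[str] = None,
    gender: Optional[str] = None,
    age: Optional[str] = None,
    language: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Keep only the voices that match every criterion that was supplied."""

    def matches(voice: dict[str, Any]) -> bool:
        # List-valued fields match when they contain the requested value.
        if dialect is not None and dialect not in (voice.get("dialect") or []):
            return False
        if language is not None and language not in (voice.get("languages") or []):
            return False
        # Scalar fields must be equal; a null field never matches a criterion.
        if gender is not None and voice.get("gender") != gender:
            return False
        if age is not None and voice.get("age") != age:
            return False
        return True

    return [voice for voice in voices if matches(voice)]
```

For example, `filter_voices(voices, dialect="emirati", gender="female")` would return only the female voices that list `emirati` among their dialects.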

Custom Voices: Some voices may have null values for certain fields. These are typically custom user-created voices. The voice_id can still be used in text-to-speech requests regardless of these field values.
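
Client code that displays voices should therefore tolerate missing metadata. As a small illustration, the hypothetical helper below builds a display label and simply omits any field that is null.

```python
from typing import Any


def voice_label(voice: dict[str, Any]) -> str:
    """Build a display label, skipping gender, age, and type when they are null."""
    details = [voice.get(key) for key in ("gender", "age", "type")]
    details = [value for value in details if value]
    if not details:
        return voice["name"]
    return f"{voice['name']} ({', '.join(details)})"
```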

Caching: Voice information doesn’t change frequently. Consider caching the voice list to reduce API calls and improve application performance.
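
A simple time-based cache is one way to follow this advice. The sketch below assumes a caller-supplied `fetch` function that performs the actual API call; the class name and the one-hour default lifetime are arbitrary illustrative choices, not recommendations from the API provider.

```python
import time
from typing import Any, Callable, Optional


class VoiceListCache:
    """Cache the voice list in memory and refetch it after a fixed lifetime."""

    def __init__(
        self,
        fetch: Callable[[], list[dict[str, Any]]],
        ttl_seconds: float = 3600.0,  # assumed default; tune to your needs
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._voices: Optional[list[dict[str, Any]]] = None
        self._expires_at = 0.0

    def get(self) -> list[dict[str, Any]]:
        """Return the cached list, calling fetch only when it is missing or stale."""
        now = self._clock()
        if self._voices is None or now >= self._expires_at:
            self._voices = self._fetch()
            self._expires_at = now + self._ttl
        return self._voices
```

Injecting the clock keeps the expiry logic testable without waiting for real time to pass.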

Authorizations

x-api-key (string, header, required) - API key for authentication

Response

List of available voices

voice_id (string, required) - Unique identifier for the voice (used in text-to-speech requests)
name (string, required) - Human-readable name of the voice
languages (string[], required) - List of language codes supported by the voice (e.g., ["ar", "en"])
dialect (string[], required) - List of dialects supported by the voice (e.g., ["fusha", "emirati", "najdi"])
sample_url (string<uri>, required) - URL to an audio sample of the voice
description (string | null) - Detailed description of the voice characteristics
gender (enum<string> | null) - Gender of the voice. Available options: male, female
age (enum<string> | null) - Age category of the voice. Available options: middle, elderly
type (enum<string> | null) - Voice type. Available options: neural
