The Text-to-Speech WebSocket API is designed to generate audio from partial text input while ensuring consistency throughout the generated audio. Although highly flexible, the WebSocket API isn’t a one-size-fits-all solution. It’s well-suited for scenarios where:
- The input text is being streamed or generated in chunks.
- Real-time audio generation is required with low latency.
- You need to send text incrementally as it becomes available.
However, it may not be the best choice when:
- The entire input text is available upfront. Because audio is generated from partial text, the server buffers input before synthesizing, which can add slight latency compared with a standard HTTP request.
- You want to quickly experiment or prototype. WebSockets are more complex to work with than a standard HTTP API, which can slow down rapid development and testing.
Endpoint
WSS /websocket/text-to-speech
Connection URL
wss://api.faseeh.ai/api/v1/websocket/text-to-speech?x-api-key=YOUR_API_KEY
Authentication
Requires an API key, passed either as the x-api-key query parameter or as the x_api_key field of the initialization message.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| x-api-key | string | Yes, unless x_api_key is sent in the initialization message | Your Faseeh API key |
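As a minimal sketch, the connection URL can be assembled with the standard URL API so that the key is encoded correctly. The helper name buildTtsUrl is hypothetical and not part of the Faseeh API; only the host, path, and x-api-key parameter come from this page.

```javascript
// Hypothetical helper (not part of the Faseeh API): builds the connection URL
// shown above and attaches the API key as the x-api-key query parameter.
const TTS_WS_URL = "wss://api.faseeh.ai/api/v1/websocket/text-to-speech";

function buildTtsUrl(apiKey) {
  if (typeof apiKey !== "string" || apiKey.length === 0) {
    throw new Error("apiKey must be a non-empty string");
  }
  const url = new URL(TTS_WS_URL);
  // searchParams.set takes care of URL-encoding the key.
  url.searchParams.set("x-api-key", apiKey);
  return url.toString();
}
```

The result can be passed directly to the WebSocket constructor, for example `new WebSocket(buildTtsUrl("YOUR_API_KEY"))`.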
Message Types
Initialize Connection
After establishing the WebSocket connection, you must send an initialization message.
Request:
{
  "type": "initConnection",
  "model_id": "faseeh-v1-preview",
  "voice_id": "ar-najdi-male-2",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75,
    "speed": 1.0
  },
  "output_format": "pcm_24000",
  "x_api_key": "YOUR_API_KEY"
}
Request Fields:
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "initConnection" |
| model_id | string | No | Model ID to use (default: "faseeh-mini-v1-preview") |
| voice_id | string | Yes | The voice ID to use for synthesis |
| voice_settings | object | No | Voice configuration |
| voice_settings.stability | number | No | Stability setting (default: 0.5) |
| voice_settings.similarity_boost | number | No | Similarity boost (default: 0.75) |
| voice_settings.speed | number | No | Speed setting, range 0.7-1.2 (default: 1.0) |
| output_format | string | No | Audio output format. Options: "pcm_8000", "pcm_16000", "pcm_22050", "pcm_24000" (default: "pcm_24000") |
| x_api_key | string | No | API key (if not provided in the query parameter) |
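The constraints in the table above can be checked on the client before anything is sent. The following is a sketch only: buildInitMessage and its option names are hypothetical, and the validation mirrors just the limits stated on this page (required voice_id, the four output formats, and the 0.7-1.2 speed range).

```javascript
// Hypothetical helper: assembles an initConnection message and checks the
// documented constraints before it is sent over the socket.
const OUTPUT_FORMATS = ["pcm_8000", "pcm_16000", "pcm_22050", "pcm_24000"];

function buildInitMessage({
  voiceId,
  modelId,
  voiceSettings = {},
  outputFormat = "pcm_24000",
  apiKey,
} = {}) {
  if (!voiceId) {
    throw new Error("voice_id is required");
  }
  if (!OUTPUT_FORMATS.includes(outputFormat)) {
    throw new Error(`Unsupported output_format: ${outputFormat}`);
  }
  const { speed } = voiceSettings;
  if (speed !== undefined && (speed < 0.7 || speed > 1.2)) {
    throw new Error("voice_settings.speed must be between 0.7 and 1.2");
  }
  const message = {
    type: "initConnection",
    voice_id: voiceId,
    output_format: outputFormat,
  };
  // Optional fields are left out so the server-side defaults apply.
  if (modelId) message.model_id = modelId;
  if (Object.keys(voiceSettings).length > 0) message.voice_settings = voiceSettings;
  // Only needed when the key was not passed as a query parameter.
  if (apiKey) message.x_api_key = apiKey;
  return message;
}
```

With an open socket `ws`, the message would be sent as `ws.send(JSON.stringify(buildInitMessage({ voiceId: "ar-najdi-male-2" })))`.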
Response:
{
  "type": "connectionInitialized"
}
Send Text
Send text chunks for audio generation.
Request:
{
  "type": "text",
  "text": "مرحبا بك في فصيح ",
  "flush": false,
  "try_trigger_generation": false
}
Request Fields:
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "text" |
| text | string | Yes | Text to convert to speech |
| flush | boolean | No | Force generation of audio even if the buffer is small (default: false) |
| try_trigger_generation | boolean | No | Attempt to trigger generation immediately (default: false) |
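To illustrate how the flush field might be used with chunked input, the sketch below turns an ordered list of chunks into "text" messages and sets flush: true only on the final one. The helper toTextMessages is hypothetical, and it assumes all chunks are known in advance.

```javascript
// Hypothetical helper: turns an ordered list of text chunks into "text"
// messages, setting flush: true only on the final chunk.
function toTextMessages(chunks) {
  return chunks.map((text, index) => ({
    type: "text",
    text,
    flush: index === chunks.length - 1,
  }));
}

// Usage with an already-initialized socket `ws`:
//   for (const msg of toTextMessages(["مرحبا ", "بك في فصيح "])) {
//     ws.send(JSON.stringify(msg));
//   }
```

When chunks come from a live stream and the last one is not known ahead of time, the same message shape applies; the producer would set flush on whichever chunk it marks as final.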
Response:
{
  "audio": "base64_encoded_audio_data",
  "sampleRate": 24000
}
Response Fields:
| Field | Type | Description |
|---|---|---|
| audio | string | Base64-encoded PCM audio data |
| sampleRate | number | Sample rate of the audio (typically 24000 Hz) |
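Decoding the audio field takes two steps: base64 to bytes, then bytes to samples. The sketch below assumes 16-bit signed little-endian mono PCM, which this page does not state and which should be checked against real output. The function name decodePcm16 is hypothetical; atob is available in browsers and in Node.js 16 or later.

```javascript
// Assumption: the PCM payload is 16-bit signed little-endian mono. This page
// does not state the sample width, so verify it against real output.
function decodePcm16(base64Audio) {
  const binary = atob(base64Audio); // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const view = new DataView(bytes.buffer);
  const samples = new Float32Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < samples.length; i++) {
    // Read a little-endian int16 and scale it to the [-1, 1) float range
    // used by Web Audio style APIs.
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}
```

Reading through a DataView with an explicit little-endian flag keeps the result independent of the host machine's byte order.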
Clear Buffer
Clear the current text buffer.
Request:
Response: No response message.
Close Connection
Close the WebSocket connection gracefully.
Request:
{
  "type": "closeConnection"
}
Response: Connection closes.
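A graceful shutdown can be wrapped in a small helper, sketched here under the assumption that the client uses the standard WebSocket interface (readyState, send, close). The name closeGracefully is hypothetical.

```javascript
// Hypothetical helper: sends closeConnection if the socket is still open,
// then closes the socket from the client side.
const WS_OPEN = 1; // value of WebSocket.OPEN in the standard WebSocket API

function closeGracefully(ws) {
  if (ws.readyState === WS_OPEN) {
    ws.send(JSON.stringify({ type: "closeConnection" }));
  }
  // Calling close() on a socket that is already closing or closed is a no-op.
  ws.close();
}
```

Checking readyState first avoids the error that send() throws when the socket is still connecting.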
Error Responses
If an error occurs, you’ll receive:
{
  "type": "error",
  "errorCode": 40101,
  "errorMessage": "Invalid API key"
}
Error Response Fields:
| Field | Type | Description |
|---|---|---|
| type | string | Always "error" |
| errorCode | number | Numeric error code (e.g., 40101, 40001) |
| errorMessage | string | Human-readable error message |
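One way to keep message handling tidy is to classify each incoming message before acting on it. The sketch below uses only the fields documented on this page; classifyMessage and the kind labels it returns are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: classifies a raw server message using only the
// fields documented on this page.
function classifyMessage(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { kind: "invalid", detail: "Message is not valid JSON" };
  }
  if (data === null || typeof data !== "object") {
    return { kind: "invalid", detail: "Message is not a JSON object" };
  }
  // Check for errors first so an error is never mistaken for another type.
  if (data.type === "error" || data.errorCode !== undefined) {
    return { kind: "error", code: data.errorCode, detail: data.errorMessage };
  }
  if (data.type === "connectionInitialized") {
    return { kind: "ready" };
  }
  if (typeof data.audio === "string") {
    return { kind: "audio", audio: data.audio, sampleRate: data.sampleRate };
  }
  return { kind: "unknown", data };
}
```

Inside an onmessage handler this would be called as `classifyMessage(event.data)`, followed by a switch on the returned kind.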
Example Usage
const ws = new WebSocket('wss://api.faseeh.ai/api/v1/websocket/text-to-speech?x-api-key=YOUR_API_KEY');

ws.onopen = () => {
  // Initialize the connection before sending any text
  ws.send(JSON.stringify({
    type: "initConnection",
    model_id: "faseeh-v1-preview",
    voice_id: "ar-najdi-male-2",
    voice_settings: {
      stability: 0.5,
      similarity_boost: 0.75,
      speed: 1.0
    },
    output_format: "pcm_24000"
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === "error" || data.errorCode) {
    // Check for errors first so they are never treated as audio
    console.error(`Error ${data.errorCode}:`, data.errorMessage);
  } else if (data.type === "connectionInitialized") {
    // Connection ready: send text. This is the only (and therefore last)
    // chunk, so flush to make sure all audio is generated.
    ws.send(JSON.stringify({
      type: "text",
      text: "مرحبا بك في فصيح ", // "Welcome to Faseeh"
      flush: true
    }));
  } else if (data.audio) {
    // Decode the base64 payload into raw PCM bytes
    // (atob alone returns a binary string, not a byte array)
    const audioBytes = Uint8Array.from(atob(data.audio), (c) => c.charCodeAt(0));
    // Handle audio playback, e.g. queue audioBytes for sequential playback
  }
};

ws.onerror = (error) => {
  console.error("WebSocket error:", error);
};

ws.onclose = () => {
  console.log("WebSocket closed");
};
Best Practices
- Always initialize: Send initConnection immediately after opening the connection
- Handle errors: Check for error messages in responses
- Flush when done: Use flush: true when sending the last text chunk to ensure all audio is generated
- Close gracefully: Send closeConnection before closing the WebSocket
- Buffer audio: Collect audio chunks and play them sequentially for smooth playback
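The last point, buffering audio, can be sketched as a small queue that keeps decoded chunks in arrival order and joins them for playback. The class AudioChunkBuffer is hypothetical, and it assumes chunks have already been decoded to Float32Array samples at a single sample rate.

```javascript
// Hypothetical helper: collects decoded chunks in arrival order and joins
// them into one contiguous buffer for sequential playback.
class AudioChunkBuffer {
  constructor(sampleRate = 24000) {
    this.sampleRate = sampleRate;
    this.chunks = [];
    this.totalSamples = 0;
  }

  // Append one decoded chunk (a Float32Array of samples).
  push(samples) {
    this.chunks.push(samples);
    this.totalSamples += samples.length;
  }

  // Duration in seconds of everything buffered so far.
  get duration() {
    return this.totalSamples / this.sampleRate;
  }

  // Concatenate all chunks in order, then reset the buffer.
  drain() {
    const out = new Float32Array(this.totalSamples);
    let offset = 0;
    for (const chunk of this.chunks) {
      out.set(chunk, offset);
      offset += chunk.length;
    }
    this.chunks = [];
    this.totalSamples = 0;
    return out;
  }
}
```

A player could wait until duration passes a small threshold before calling drain(), trading a little startup latency for fewer gaps between chunks.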