FAQ

Does it work with the voice clients I already use?

Yes. swaram follows the same event protocol as the OpenAI Realtime API, so most clients work once you change three things: the address (wss://api.swaram.live/v1/realtime), the API key, and the model name (mal-realtime-simple or mal-realtime-premium).
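As a rough illustration of those three changes, here is a minimal TypeScript sketch. The address and model names come from the answer above. How the key and model are actually passed is an assumption: this sketch uses a `model` query parameter and an `Authorization: Bearer` header, which is how OpenAI Realtime WebSocket clients commonly connect. `buildConnection` and `RealtimeConnection` are hypothetical names used only for this example.

```typescript
// Address and model names as given in the FAQ answer above.
const SWARAM_URL = "wss://api.swaram.live/v1/realtime";

type SwaramModel = "mal-realtime-simple" | "mal-realtime-premium";

interface RealtimeConnection {
  url: string;
  headers: Record<string, string>;
}

// Assumption: the model travels as a `model` query parameter and the key as a
// Bearer token, mirroring typical OpenAI Realtime clients. Confirm both
// details against the swaram connection docs before relying on them.
function buildConnection(apiKey: string, model: SwaramModel): RealtimeConnection {
  return {
    url: `${SWARAM_URL}?model=${encodeURIComponent(model)}`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Switching modes is then a one-argument change.
const premium = buildConnection("YOUR_SWARAM_KEY", "mal-realtime-premium");
```

The resulting `url` and `headers` can be handed to whichever WebSocket client you already use.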

What's the difference between the two modes?

Both speak natural Malayalam and share the same events, tools, and voices. Simple is the low-cost option; Premium responds with lower latency and sounds more expressive. Switch by changing the model value (mal-realtime-simple or mal-realtime-premium).

What languages does it support?

Malayalam, first and foremost. It also handles common English words, and you set the tone and style through your instructions.

Can I use it for phone calls?

Yes. swaram is the voice layer — connect your own telephony setup and send the call audio to it as 24 kHz PCM16.
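Telephony audio rarely arrives as 24 kHz PCM16, so some conversion is usually needed first. The sketch below assumes your provider delivers 8 kHz G.711 mu-law bytes, which is common but not stated in these docs, and converts them with a standard mu-law decode followed by 3x linear-interpolation upsampling. The function names are hypothetical, and linear interpolation is a deliberately simple stand-in for a proper resampling filter.

```typescript
// Decode one G.711 mu-law byte to a signed 16-bit PCM sample.
function muLawToPcm16(byte: number): number {
  const u = ~byte & 0xff;            // mu-law bytes are stored inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

// Upsample 8 kHz to 24 kHz by inserting two linearly interpolated samples
// after each input sample. The final sample of a chunk is simply held.
function upsample3x(input: Int16Array): Int16Array {
  const out = new Int16Array(input.length * 3);
  for (let i = 0; i < input.length; i++) {
    const a = input[i];
    const b = i + 1 < input.length ? input[i + 1] : a;
    out[3 * i] = a;
    out[3 * i + 1] = Math.round(a + (b - a) / 3);
    out[3 * i + 2] = Math.round(a + (2 * (b - a)) / 3);
  }
  return out;
}

// Assumption: incoming call audio is 8 kHz mono mu-law.
function telephonyToPcm16At24k(muLaw: Uint8Array): Int16Array {
  const pcm8k = new Int16Array(muLaw.length);
  for (let i = 0; i < muLaw.length; i++) {
    pcm8k[i] = muLawToPcm16(muLaw[i]);
  }
  return upsample3x(pcm8k);
}
```

If your telephony setup already provides 16-bit linear PCM, only the resampling step applies, and a dedicated resampler will give cleaner audio than this interpolation.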

The model talks over itself or replies to its own voice — what do I do?

It's hearing its own playback through the microphone. Send it only the user's voice: enable echo cancellation when you capture (echoCancellation: true in the browser), use headphones, or switch to push-to-talk so the mic is open only while the user speaks. See Audio and Turn-taking.
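Two of these fixes can be sketched in TypeScript. The constraints object uses the standard `echoCancellation` setting for `navigator.mediaDevices.getUserMedia`. `PushToTalkGate` is a hypothetical helper, not part of swaram, that drops audio chunks unless the talk button is held. How chunks are then sent to the session is left out, since that depends on your client.

```typescript
// Capture constraints that ask the browser to cancel speaker echo.
const micConstraints = {
  audio: { echoCancellation: true },
  video: false,
};

// Hypothetical push-to-talk gate: audio passes through only while the
// talk button is held, so the model never hears its own playback.
class PushToTalkGate {
  private open = false;

  press(): void {
    this.open = true;
  }

  release(): void {
    this.open = false;
  }

  // Returns the chunk while the gate is open, otherwise null (drop it).
  filter(chunk: Int16Array): Int16Array | null {
    return this.open ? chunk : null;
  }
}

// Browser-only usage, shown as comments because it needs a page context:
//   const stream = await navigator.mediaDevices.getUserMedia(micConstraints);
//   talkButton.onpointerdown = () => gate.press();
//   talkButton.onpointerup = () => gate.release();
```

Echo cancellation quality varies by browser and device, so headphones or push-to-talk remain the more dependable options when overlap persists.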

How am I billed?

Per minute, in credits. Your balance and usage are shown on the dashboard. When your balance reaches zero, new sessions are refused and any active session ends.

Where's my data?

Conversations aren't stored by default. You bring your own context each session, and you own your data.

Can agents read these docs?

Yes — every page is available as plain Markdown at /docs/<page>.md, with an index at /llms.txt and the full set at /llms-full.txt.
