# Lesan AI Unified API - Public Doc — Full Documentation

> Complete, machine-readable documentation for the Lesan AI API: Speech-to-Text
> (ASR), Text-to-Speech (TTS), and Machine Translation (MT), with strong support
> for Amharic, Tigrinya, and English. API version 4.0.0.

Production-ready Speech-to-Text API supporting real-time streaming, batch processing, speaker diarization, and language auto-detection.

## Authentication
All endpoints except `/health` require a Bearer token.
Use `Authorization: Bearer <api_key>` header.
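
As a minimal sketch of the rule above, the helper below returns the header for a given path. The name `headers_for` and the `PUBLIC_PATHS` set are illustrative assumptions, not part of the API.

```python
from typing import Dict, Optional

# /health is the only path the text above documents as public.
PUBLIC_PATHS = frozenset({"/health"})


def headers_for(path: str, api_key: Optional[str]) -> Dict[str, str]:
    """Return the auth headers needed for a request to `path`."""
    if path in PUBLIC_PATHS:
        return {}  # no token required
    if not api_key:
        raise ValueError(f"{path} requires an API key")
    return {"Authorization": f"Bearer {api_key}"}
```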

## Pagination
List endpoints return paginated results with `data`, `has_more`, `next_cursor`, and `total` fields. Pass `cursor` from the response to fetch the next page.
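
The cursor loop can be sketched as follows. `fetch_page` is a hypothetical callable that you supply: it performs the HTTP request for a given `cursor` (or `None` for the first page) and returns the decoded JSON body. Only the `data`, `has_more`, and `next_cursor` fields come from the description above.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], Dict[str, Any]]) -> Iterator[Any]:
    """Yield every item from a cursor-paginated list endpoint."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        for item in page.get("data") or []:
            yield item
        cursor = page.get("next_cursor")
        # Stop when the server reports no more pages or gives no cursor.
        if not page.get("has_more") or not cursor:
            break
```

Because the transport is injected, the same loop works with any HTTP client and can be exercised offline with canned pages.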

## Errors
All errors follow the format: `{"error": {"type": "...", "code": "...", "message": "..."}}`
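
A client can turn this envelope into an exception. The sketch below is one possible shape: the class name `LesanAPIError` and the fallback values used for non-conforming bodies are assumptions made for illustration.

```python
import json
from typing import Optional


class LesanAPIError(Exception):
    """Hypothetical exception wrapping the documented error envelope."""

    def __init__(self, status: int, type_: str, code: str, message: str,
                 param: Optional[str] = None) -> None:
        super().__init__(f"HTTP {status} {type_}/{code}: {message}")
        self.status = status
        self.type = type_
        self.code = code
        self.message = message
        self.param = param


def parse_error(status: int, body: str) -> LesanAPIError:
    """Build an exception from an error response body."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}  # not the documented envelope; fall back to generic values
    return LesanAPIError(
        status,
        err.get("type", "unknown_error"),
        err.get("code", "unknown"),
        err.get("message", body),
        err.get("param"),
    )
```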

**Note**: Admin endpoints are hidden here.

**Machine-readable docs**: this spec is also served as JSON at `/openapi.json`. For LLM tooling, see `/llms.txt` (index) and `/llms-full.txt` (full text).

## Guides

Conceptual guides with worked code samples. Open a URL for the full page:

- **Getting Started** — Get started with Lesan AI services in minutes.
  https://docs.lesan.ai/guides/getting-started
- **Authentication** — How to authenticate with the Lesan AI API using Bearer API keys and scopes.
  https://docs.lesan.ai/guides/authentication
- **Media Probe** — Inspect a media URL for duration, title, and format before transcribing — useful for credit gating and ETA display.
  https://docs.lesan.ai/guides/media-probe
- **ASR (Speech Recognition)** — Transcribe audio to text: single files (sync and async), async polling, and batch processing of multiple files.
  https://docs.lesan.ai/guides/asr
- **MT (Machine Translation)** — Translate text between languages with the Machine Translation API.
  https://docs.lesan.ai/guides/mt
- **Streaming (WebSocket)** — Real-time speech recognition using the Lesan AI WebSocket streaming API.
  https://docs.lesan.ai/guides/streaming
- **Webhooks** — Receive real-time notifications when transcription jobs complete or fail.
  https://docs.lesan.ai/guides/webhooks
- **Best Practices** — Best practices and recommendations for using Lesan AI services.
  https://docs.lesan.ai/guides/best-practices
- **Error Codes** — Reference of error codes, the error response format, and handling strategies for the Lesan AI API.
  https://docs.lesan.ai/guides/error-codes
- **Rate Limits** — API rate limits, quotas, and usage tracking.
  https://docs.lesan.ai/guides/rate-limits
- **Audio Formats** — Supported audio formats for file upload and WebSocket streaming.
  https://docs.lesan.ai/guides/audio-formats

## API Endpoints

### GET /

**Root**

API root with available endpoints.

**Response 200** — Successful Response

### GET /health

**Health Check**

Health check with system status.

Returns information about:
- Server status
- Loaded ASR models
- Available languages
- Device (CPU/CUDA)
- VAD availability
- Storage configuration
- Redis connectivity

**Response 200** — Successful Response

### POST /translate/v1

**Translate text**

Translates input text from a source language to a target language. Supports translation between English, Amharic, and Tigrinya.

**Server:** This endpoint is hosted on the **Translation API** server. In the interactive docs, select that server from the server dropdown before trying the endpoint.

**Response 200** — Successful translation

```json
{
  "tgt_text": "ዛሬ እንዴት ነህ?"
}
```

**Response 400** — Bad request — missing or invalid parameters

```json
{
  "error": {
    "type": "invalid_request_error",
    "code": "missing_required_field",
    "message": "The 'text' field is required.",
    "param": "text"
  }
}
```

**Response 401** — Unauthorized — invalid or missing API key

```json
{
  "error": {
    "type": "authentication_error",
    "code": "invalid_api_key",
    "message": "The API key provided is invalid."
  }
}
```

**Response 429** — Too many requests — rate limit exceeded

```json
{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Retry after 2 seconds.",
    "retry_after": 2
  }
}
```

**Response 500** — Internal server error

```json
{
  "error": {
    "type": "server_error",
    "code": "internal_error",
    "message": "An internal error occurred. Please try again."
  }
}
```
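
The 429 body above carries a `retry_after` value in seconds. A minimal sketch of a retry delay that honors it is shown below; the exponential fallback and its `base` and `cap` defaults are assumptions, not documented behavior.

```python
from typing import Any, Dict


def retry_delay(error_body: Dict[str, Any], attempt: int,
                base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    retry_after = (error_body.get("error") or {}).get("retry_after")
    if (isinstance(retry_after, (int, float))
            and not isinstance(retry_after, bool)
            and retry_after >= 0):
        return float(retry_after)  # the server told us how long to wait
    # Assumed fallback: capped exponential backoff.
    return min(cap, base * (2 ** attempt))
```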

### GET /v1/languages

**List Languages**

List available languages for transcription.

Returns:
- Available language codes
- Currently loaded models
- Default language

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 429** — Rate limit exceeded

### POST /v1/media/probe

**Probe a media URL for metadata without downloading**

Inspect a remote media URL and return its duration, title, thumbnail,
uploader, and related metadata. Useful for pre-submission credit gating
and ETA display — clients typically call this before POST /v1/transcriptions.

Supports yt-dlp platforms (YouTube, Twitter/X, TikTok, Vimeo, ...) and
direct HTTP(S) audio/video URLs. Results are cached in Redis for
~5 minutes (when the Redis backend is active).
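
For credit gating, a client might inspect the probe response before submitting a job. The sketch below assumes a `MediaProbeResponse`-shaped dict; the gating policy itself (reject live streams, unknown durations, and anything longer than `max_seconds`) is an illustrative assumption.

```python
from typing import Any, Dict, Tuple


def gate_on_probe(probe: Dict[str, Any], max_seconds: float) -> Tuple[bool, str]:
    """Decide whether a probed URL should be submitted for transcription."""
    warnings = probe.get("warnings") or []
    if probe.get("is_live") or "live_stream" in warnings:
        return False, "live stream"
    duration = probe.get("duration_seconds")
    if duration is None:
        return False, "duration unknown"
    if duration > max_seconds:
        return False, f"duration {duration:.0f}s exceeds limit {max_seconds:.0f}s"
    return True, "ok"
```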

**Response 200** — Successful Response

**Response 400** — Invalid URL

**Response 401** — Authentication required

**Response 403** — Media unavailable

**Response 404** — Media not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 502** — Upstream probe error

**Response 504** — Probe timed out

### POST /v1/transcriptions

**Create transcription job**

Transcribe audio from URL or uploaded file.

**Input Methods:**
- **JSON** (`Content-Type: application/json`): Provide `audio_url`
- **File Upload** (`Content-Type: multipart/form-data`): Provide `file`

Both sync and async modes use the same processing pipeline.
Jobs are enqueued to Redis for batch workers to process.

**JSON Example:**
```json
{
    "audio_url": "https://example.com/audio.mp3",
    "language": "en",
    "mode": "async"
}
```

**File Upload Example (curl):**
```bash
curl -X POST "<base_url>/v1/transcriptions" \
  -H "Authorization: Bearer <key>" \
  -F "file=@audio.mp3" \
  -F "language=en" \
  -F "mode=async"
```

**Modes:**
- `async` (default): Returns immediately with job_id for polling
- `sync`: Waits for completion, returns full result

**Supported Formats:** MP3, WAV, M4A, FLAC, OGG, WEBM, AAC
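
The JSON input method could be sketched in Python with the standard library as below. The function name and the `base_url` argument are assumptions (this document does not state the API host); the request is only built here, and sending it with `urllib.request.urlopen` is left to the caller.

```python
import json
import urllib.request


def build_transcription_request(base_url: str, api_key: str, audio_url: str,
                                language: str = "en",
                                mode: str = "async") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/transcriptions request."""
    if mode not in ("async", "sync"):
        raise ValueError("mode must be 'async' or 'sync'")
    body = json.dumps(
        {"audio_url": audio_url, "language": language, "mode": mode}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```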

**Response 202** — Transcription job created

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### GET /v1/transcriptions

**List transcription jobs**

List transcription jobs for the current user.

Supports cursor-based pagination. Use `next_cursor` from the response
to fetch the next page. Use `source` to filter by job origin
('batch' for file transcriptions, 'websocket' for streaming sessions).

**Parameters:**

- `status_filter` (query): string | null
- `source` (query): string | null — Filter by source: 'batch' or 'websocket'
- `cursor` (query): string | null
- `limit` (query): integer

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### POST /v1/transcriptions/batch

**Batch transcribe multiple audio files**

Submit multiple audio files for batch transcription.

**Input Methods:**
- **JSON** (`Content-Type: application/json`): Provide `audio_urls` array
- **File Upload** (`Content-Type: multipart/form-data`): Provide multiple `files`

Each file/URL is dispatched as a separate job to the async pipeline.
The batch job tracks overall progress.

Processing is always async. Poll `GET /v1/transcriptions/{job_id}` for results.

**JSON Example:**
```json
{
    "audio_urls": ["https://example.com/a.mp3", "https://example.com/b.mp3"],
    "language": "en"
}
```

**File Upload Example (curl):**
```bash
curl -X POST "<base_url>/v1/transcriptions/batch" \
  -H "Authorization: Bearer <key>" \
  -F "files=@audio1.mp3" \
  -F "files=@audio2.mp3" \
  -F "language=en"
```

**Supported Formats:** MP3, WAV, M4A, FLAC, OGG, WEBM, AAC

**Polling batch results:** the 202 response represents the batch job; poll `GET /v1/transcriptions/{job_id}` with its `id`. Per-file results appear in `metadata.jobs[]`, each entry carrying `job_id`, `url` (the submitted audio URL), `status`, and `text`. Fetch a child `job_id` for that file’s full result.
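
A polled batch job can be summarized from `metadata.jobs[]` roughly as follows. The function name and the shape of the returned summary are illustrative assumptions; only the `job_id`, `url`, `status`, and `text` entry fields come from the description above.

```python
from typing import Any, Dict


def summarize_batch(batch: Dict[str, Any]) -> Dict[str, Any]:
    """Collect per-file status counts, finished texts, and unfinished job IDs."""
    # A freshly created batch may not list child jobs yet.
    jobs = (batch.get("metadata") or {}).get("jobs") or []
    counts: Dict[str, int] = {}
    texts: Dict[str, Any] = {}
    unfinished = []
    for job in jobs:
        status = job.get("status")
        counts[status] = counts.get(status, 0) + 1
        if status == "completed":
            texts[job.get("url")] = job.get("text")
        elif status in ("queued", "processing"):
            unfinished.append(job.get("job_id"))
    return {"counts": counts, "texts": texts, "unfinished": unfinished}
```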

**Request body example:**

```json
{
  "audio_urls": [
    "https://example.com/audio1.mp3",
    "https://example.com/audio2.mp3"
  ],
  "language": "am"
}
```

**Response 202** — Successful Response

```json
{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "queued",
  "language": "am",
  "text": null,
  "segments": null,
  "speakers": null,
  "progress": null,
  "error": null,
  "metadata": {
    "batch_mode": true,
    "total_count": 2
  },
  "created_at": "2026-03-05T12:14:53.849727Z",
  "completed_at": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}
```

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### GET /v1/transcriptions/{job_id}

**Get transcription status and results**

Get status and results of a transcription job.

**Statuses:**
- `queued`: Job waiting to be processed
- `processing`: Job currently running
- `completed`: Job finished successfully
- `failed`: Job failed with error
- `cancelled`: Job was cancelled
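
Of these statuses, `completed`, `failed`, and `cancelled` are terminal. A polling loop might look like the sketch below, where `fetch_job` is a hypothetical callable that performs `GET /v1/transcriptions/{job_id}` and returns the decoded JSON; the interval and timeout defaults are assumptions.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_job(fetch_job: Callable[[str], Dict[str, Any]], job_id: str,
                 interval: float = 2.0, timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r}")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays.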

**Parameters:**

- `job_id` (path, required): string

**Response 200** — Successful Response

```json
{
  "completed_at": "2024-01-01T12:00:02Z",
  "created_at": "2024-01-01T12:00:00Z",
  "duration_seconds": 2.54,
  "id": "txn_550e8400-e29b-41d4-a716-446655440000",
  "language": "en",
  "object": "transcription",
  "processing_time_seconds": 1.2,
  "segments": [
    {
      "duration_ms": 2420,
      "end_ms": 2540,
      "id": 0,
      "start_ms": 120,
      "text": "Hello, how are you today?"
    }
  ],
  "status": "completed",
  "text": "Hello, how are you today?",
  "url": "/v1/transcriptions/txn_550e8400-e29b-41d4-a716-446655440000"
}
```

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### DELETE /v1/transcriptions/{job_id}

**Delete a transcription job**

Delete a transcription job and its data.

**Parameters:**

- `job_id` (path, required): string

**Response 200** — Successful Response

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### GET /v1/transcriptions/{job_id}/audio

**Get audio download URL**

Get a fresh signed URL for the job's audio artifact.

Returns a 302 redirect to a time-limited signed URL (1 hour).
Old jobs without a stored audio key return 404.

**Parameters:**

- `job_id` (path, required): string

**Response 200** — Successful Response

**Response 302** — Redirect to signed audio URL

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 409** — Job not completed

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### POST /v1/transcriptions/{job_id}/cancel

**Cancel a transcription job**

Cancel a transcription job.

Only jobs in `queued` or `processing` status can be cancelled.

**Parameters:**

- `job_id` (path, required): string

**Response 200** — Successful Response

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 409** — Job cannot be cancelled (already completed/failed)

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### GET /v1/transcriptions/{job_id}/transcript

**Get transcript download URL**

Get a fresh signed URL for the job's transcript JSON artifact.

Returns a 302 redirect to a time-limited signed URL (1 hour).
Old jobs without a stored transcript key return 404.

**Parameters:**

- `job_id` (path, required): string

**Response 200** — Successful Response

**Response 302** — Redirect to signed transcript URL

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 404** — Resource not found

**Response 409** — Job not completed

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

**Response 500** — Internal server error

### POST /v1/uploads

**Upload audio file**

Upload an audio file directly via multipart/form-data.

**Supported formats:** mp3, wav, ogg, flac, m4a, webm

**Max size:** 500MB

After upload, use the returned `file_url` in your transcription request.

**Response 201** — File uploaded successfully

**Response 400** — Invalid file

**Response 401** — Authentication required

**Response 413** — File too large

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### GET /v1/uploads/signed-url

**Get signed upload URL**

Generate a pre-signed URL for direct-to-storage upload.

This allows clients to upload large files directly to cloud storage,
bypassing the API server. Ideal for:
- Large files (>100MB)
- Client-side uploads from browsers
- Reducing API server load

**Flow:**
1. Call this endpoint to get a signed URL
2. Upload file directly to the signed URL using PUT
3. Use the returned `file_url` in your transcription request

**Parameters:**

- `filename` (query, required): string — Filename for the upload
- `content_type` (query): string — MIME type
- `content_length` (query): integer | null — Expected file size
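
Steps 1 and 2 of the flow can be sketched with the standard library as below. The function names and `base_url` are assumptions; the `upload_url`, `method`, and `headers` keys follow the `SignedUrlResponse` schema, and falling back to PUT when `method` is absent follows step 2. Both functions only build requests; pass them to `urllib.request.urlopen` to send them.

```python
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional


def build_signed_url_request(base_url: str, api_key: str, filename: str,
                             content_type: Optional[str] = None,
                             content_length: Optional[int] = None) -> urllib.request.Request:
    """Step 1: request a pre-signed upload URL."""
    params: Dict[str, Any] = {"filename": filename}
    if content_type:
        params["content_type"] = content_type
    if content_length is not None:
        params["content_length"] = content_length
    url = (base_url.rstrip("/") + "/v1/uploads/signed-url?"
           + urllib.parse.urlencode(params))
    return urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {api_key}"})


def build_upload_request(signed: Dict[str, Any], data: bytes) -> urllib.request.Request:
    """Step 2: upload the bytes straight to storage using the signed URL."""
    return urllib.request.Request(
        signed["upload_url"],
        data=data,
        method=signed.get("method") or "PUT",
        headers=dict(signed.get("headers") or {}),
    )
```

For step 3, pass the `file_url` from the same response in the transcription request.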

**Response 200** — Signed URL generated

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### GET /v1/usage

**Get usage statistics**

Get current API usage and quota information.

Shows:
- Current rate limit usage (minute/hour/day)
- Concurrent job/connection limits
- Audio minutes processed (if applicable)
- Historical statistics

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 429** — Rate limit exceeded

### POST /v1/webhooks

**Register webhook**

Register a webhook URL to receive job completion notifications.

**Events:**
- `job.completed` - Job finished successfully
- `job.failed` - Job failed with error
- `job.progress` - Job progress updates (optional)
- `*` - All events

**Payload:** See `WebhookPayload` schema for the JSON structure sent to your URL.

**Signature Verification:** If you provide a `secret`, we'll sign payloads with HMAC-SHA256.
The signature is in the `X-Webhook-Signature` header: `t={timestamp},v1={signature}`
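
A receiver could verify the header along the lines of the sketch below. Only the `t=...,v1=...` header layout and HMAC-SHA256 come from this document. The signed string (`"{timestamp}." + raw body`), the hex encoding of the digest, the timestamp being Unix seconds, and the 300-second tolerance are all assumptions borrowed from common webhook schemes; confirm them against the Webhooks guide before relying on this.

```python
import hashlib
import hmac
import time
from typing import Optional, Tuple


def parse_signature_header(header: str) -> Tuple[int, str]:
    """Split 't=<timestamp>,v1=<signature>' into its two parts."""
    parts = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    return int(parts["t"]), parts["v1"]


def verify_signature(secret: str, header: str, body: bytes,
                     tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Return True if the header matches the body under the assumed scheme."""
    try:
        timestamp, received = parse_signature_header(header)
    except (KeyError, ValueError):
        return False  # malformed header
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        return False  # too old or too far in the future
    # ASSUMPTION: the signed message is "<timestamp>." followed by the raw body.
    signed = f"{timestamp}.".encode("utf-8") + body
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```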

**Response 201** — Webhook registered

**Response 400** — Invalid request

**Response 401** — Authentication required

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### GET /v1/webhooks

**List webhooks**

List all registered webhooks for the current API key.

**Parameters:**

- `status` (query): string | null — Filter by status: active, paused, failed
- `limit` (query): integer
- `offset` (query): integer

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### GET /v1/webhooks/{webhook_id}

**Get webhook**

Get details of a specific webhook.

**Parameters:**

- `webhook_id` (path, required): string

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 404** — Webhook not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### PATCH /v1/webhooks/{webhook_id}

**Update webhook**

Update webhook configuration.

**Parameters:**

- `webhook_id` (path, required): string

**Response 200** — Successful Response

**Response 401** — Authentication required

**Response 404** — Webhook not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### DELETE /v1/webhooks/{webhook_id}

**Delete webhook**

Delete a registered webhook.

**Parameters:**

- `webhook_id` (path, required): string

**Response 204** — Webhook deleted

**Response 401** — Authentication required

**Response 404** — Webhook not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### POST /v1/webhooks/{webhook_id}/test

**Test webhook**

Send a test payload to verify webhook is working.

**Parameters:**

- `webhook_id` (path, required): string

**Response 200** — Test delivery result

**Response 401** — Authentication required

**Response 404** — Webhook not found

**Response 422** — Validation Error

**Response 429** — Rate limit exceeded

### GET /v1/ws/transcribe

**Real-time streaming transcription**

WebSocket endpoint for real-time audio transcription.

**Protocol:** The client opens a WebSocket connection (HTTP 101 Upgrade). After the server sends a `ready` message, the client streams audio as binary frames and sends text commands to control the session.

#### Authentication
Send `Authorization: Bearer <api_key>` in the WebSocket upgrade headers. The token is validated **before** the connection is accepted.

#### Connection Lifecycle
```
Client                          Server
  |--- WS Upgrade + Bearer -------->|
  |                  (validate key) |
  |<--------- 101 Switching --------|
  |<-- {type: "ready"} -------------|
  |--- binary audio data ---------->|
  |<-- {type: "chunk_received"} ----|
  |--- "TRANSCRIBE" --------------->|
  |<-- {type: "transcription"} -----|
  |--- "END" ---------------------->|
  |<-- {type: "transcription",      |
  |         is_final: true} --------|
  |         [connection closed]     |
```

#### Client Commands (text frames)
| Command | Description |
|---------|-------------|
| `FORMAT:<name>` | Change audio format (e.g. `FORMAT:pcm_s16le_48k`) |
| `TRANSCRIBE` | Transcribe current buffer (keeps connection open) |
| `END` | Transcribe buffer and close connection |
| `CLEAR` | Discard audio buffer without transcribing |
| `PING` | Health check; server replies with `pong` |

#### Audio Formats
| Format | Sample Rate | Codec | Notes |
|--------|-------------|-------|-------|
| `pcm_s16le` | 16 kHz | PCM 16-bit LE | **Default** |
| `pcm_s16le_48k` | 48 kHz | PCM 16-bit LE | Higher fidelity |
| `wav` | 16 kHz | PCM in WAV | Standard WAV container |
| `webm_opus` | 48 kHz | Opus in WebM | Browser MediaRecorder |
| `opus_raw_16k` | 16 kHz | Raw Opus frames | WASM / low-level clients |

#### Server VAD Mode
When `turn_detection=server_vad`, the server uses Voice Activity Detection to automatically segment speech into turns. Each completed turn triggers a `turn_detected` message with the transcription. No manual `TRANSCRIBE` commands are needed.

**VAD algorithms:**
- `energy` (default) - RMS energy threshold, fast and lightweight
- `silero` - Neural network VAD, higher accuracy, requires PyTorch

#### Raw Opus Mode (`format=opus_raw_16k`)
Optimized for WASM clients streaming individual Opus frames. Each binary message should contain exactly one Opus frame (20 ms at 16 kHz). Turn detection is always active; transcriptions are emitted automatically.

#### Limits
- Max chunk size: 1 MB per message
- Max buffer: 50 MB (~26 min at 16 kHz mono 16-bit)
- Idle timeout: 300 seconds
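
These limits translate into simple client-side arithmetic. The sketch below splits audio into messages under the chunk limit and converts a byte count into seconds of 16-bit mono PCM. It assumes "MB" means 10^6 bytes, which is the conservative reading and matches the "~26 min" figure; the function names are illustrative.

```python
from typing import Iterator

MAX_CHUNK_BYTES = 1_000_000    # assumed decimal reading of "1 MB"
MAX_BUFFER_BYTES = 50_000_000  # assumed decimal reading of "50 MB"


def split_chunks(audio: bytes, max_bytes: int = MAX_CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices no larger than the per-message limit."""
    for start in range(0, len(audio), max_bytes):
        yield audio[start:start + max_bytes]


def pcm_seconds(num_bytes: int, sample_rate: int = 16_000,
                bytes_per_sample: int = 2, channels: int = 1) -> float:
    """Duration of raw PCM audio occupying `num_bytes`."""
    return num_bytes / (sample_rate * bytes_per_sample * channels)


# 50,000,000 bytes / (16,000 samples/s * 2 bytes) = 1562.5 s, about 26 minutes.
buffer_minutes = pcm_seconds(MAX_BUFFER_BYTES) / 60
```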

#### Close Codes
| Code | Meaning |
|------|---------|
| `1000` | Normal closure or idle timeout |
| `1009` | Buffer limit exceeded (50 MB) |
| `4001` | Missing Authorization header |
| `4003` | Invalid API key |
| `4029` | Rate limit or connection limit exceeded |

#### Server Messages
All server messages are JSON text frames with a `type` discriminator. See the `WebSocketServerMessage` schema for the full union type.

**Parameters:**

- `language` (query): string — Language code for transcription (e.g. `en`, `es`, `am`). Defaults to server default.
- `lang` (query): string — Deprecated. Use `language` instead.
- `turn_detection` (query): "server_vad" — Set to `server_vad` to enable automatic speech turn detection.
- `vad_type` (query): "energy" | "silero" — VAD algorithm. `energy` is fast; `silero` is more accurate but requires PyTorch.
- `format` (query): "pcm_s16le" | "pcm_s16le_48k" | "wav" | "webm_opus" | "opus_raw_16k" — Audio format. Set to `opus_raw_16k` to use the raw Opus handler.
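
The query parameters above can be assembled into a connection URL as in this sketch. The `wss://` scheme, the `host` argument, and the function name are assumptions; the parameter names and allowed values are taken from the list above. Opening the socket itself requires a WebSocket client library and is not shown.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

AUDIO_FORMATS = frozenset(
    {"pcm_s16le", "pcm_s16le_48k", "wav", "webm_opus", "opus_raw_16k"})
VAD_TYPES = frozenset({"energy", "silero"})


def build_ws_url(host: str, language: Optional[str] = None,
                 server_vad: bool = False, vad_type: Optional[str] = None,
                 audio_format: Optional[str] = None) -> str:
    """Build the streaming URL with validated query parameters."""
    params: Dict[str, str] = {}
    if language:
        params["language"] = language
    if server_vad:
        params["turn_detection"] = "server_vad"
    if vad_type is not None:
        if vad_type not in VAD_TYPES:
            raise ValueError(f"unknown vad_type: {vad_type}")
        params["vad_type"] = vad_type
    if audio_format is not None:
        if audio_format not in AUDIO_FORMATS:
            raise ValueError(f"unknown format: {audio_format}")
        params["format"] = audio_format
    query = urlencode(params)
    return f"wss://{host}/v1/ws/transcribe" + (f"?{query}" if query else "")
```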

**Response 101** — WebSocket connection established. Server immediately sends a `ready` JSON message.

```json
{
  "type": "ready",
  "session_id": "550e8400-e29b-41d4-a716-446655440000",
  "format": "pcm_s16le",
  "sample_rate": 16000,
  "job_id": "txn_550e8400-e29b-41d4-a716-446655440000"
}
```

## Schemas

### APIKeyCreatedResponse

Response when a new API key is created.

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string |  |
| `key` *(required)* | string | The raw API key - SHOW ONLY ONCE |
| `prefix` *(required)* | string |  |
| `key_type` *(required)* | string |  |
| `environment` *(required)* | string |  |
| `scopes` *(required)* | string[] |  |
| `name` *(required)* | string \| null |  |
| `created_at` *(required)* | string |  |
| `expires_at` *(required)* | string \| null |  |
| `rate_limits` *(required)* | object |  |
| `warning` | string |  |

### APIKeyListResponse

Response for listing API keys.

| Field | Type | Description |
| --- | --- | --- |
| `keys` *(required)* | APIKeyResponse[] |  |
| `total` *(required)* | integer |  |

### APIKeyResponse

API key metadata response (no raw key).

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string |  |
| `user_id` *(required)* | string |  |
| `email` | string \| null |  |
| `prefix` *(required)* | string |  |
| `key_fragment` *(required)* | string |  |
| `display_key` *(required)* | string |  |
| `key_type` *(required)* | string |  |
| `environment` *(required)* | string |  |
| `scopes` *(required)* | string[] |  |
| `name` *(required)* | string \| null |  |
| `description` *(required)* | string \| null |  |
| `status` *(required)* | string |  |
| `created_at` *(required)* | string |  |
| `updated_at` *(required)* | string |  |
| `expires_at` *(required)* | string \| null |  |
| `last_used_at` *(required)* | string \| null |  |
| `rate_limits` *(required)* | object |  |
| `allowed_ips` *(required)* | string[] |  |
| `allowed_origins` *(required)* | string[] |  |

### APIKeyRotatedResponse

Response for a key rotation: the new key (raw, once) + the retired one.

| Field | Type | Description |
| --- | --- | --- |
| `new_key` *(required)* | APIKeyCreatedResponse |  |
| `retired_key_id` *(required)* | string |  |
| `retired_status` *(required)* | string | 'revoked' (grace 0) or 'active' until retired_expires_at |
| `retired_expires_at` | string \| null | When the old key expires, if a grace window was given |

### AdminKeyStatsResponse

| Field | Type | Description |
| --- | --- | --- |
| `total_keys` *(required)* | integer |  |
| `active_keys` *(required)* | integer |  |
| `revoked_keys` *(required)* | integer |  |
| `expired_keys` | integer |  |
| `keys_by_type` | object |  |
| `keys_by_environment` | object |  |

### AdminModelActionResponse

| Field | Type | Description |
| --- | --- | --- |
| `message` *(required)* | string |  |
| `loaded_models` *(required)* | string[] |  |

### AdminModelsResponse

| Field | Type | Description |
| --- | --- | --- |
| `loaded` *(required)* | string[] | Currently loaded language codes |
| `available` *(required)* | string[] | All available language codes |
| `device` *(required)* | string | Compute device (cpu or cuda) |
| `cache_info` | object | Model cache information |

### Body_upload_file_v1_uploads_post

| Field | Type | Description |
| --- | --- | --- |
| `file` *(required)* | string | Audio file to upload |

### CreateAPIKeyRequest

Request to create a new API key.

| Field | Type | Description |
| --- | --- | --- |
| `name` | string \| null | Name/label for the key |
| `description` | string \| null | Description |
| `scopes` | string[] \| null | Permission scopes: read, write, admin |
| `key_type` | string \| null | Key type: secret, publishable, restricted, admin |
| `environment` | string \| null | Environment: live, test, dev |
| `expires_in_days` | integer \| null | Expiration in days (null for no expiration) |
| `allowed_ips` | string[] \| null | IP whitelist (null for no restriction) |
| `allowed_origins` | string[] \| null | Origin whitelist for CORS (null for no restriction) |
| `rate_limits` | object \| null | Custom rate limits |

### DeleteResponse

Standard delete response.

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string | ID of deleted resource |
| `object` *(required)* | string | Resource type |
| `deleted` | boolean | Whether deletion was successful |

### ErrorDetail

Structured error detail following Stripe-style error envelope.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string | Error type category |
| `code` *(required)* | string | Machine-readable error code |
| `message` *(required)* | string | Human-readable error message |
| `param` | string \| null | Parameter that caused the error |

### ErrorResponse

| Field | Type | Description |
| --- | --- | --- |
| `error` | object |  |

### FileUploadResponse

Response after file upload.

| Field | Type | Description |
| --- | --- | --- |
| `file_id` *(required)* | string | Unique file identifier |
| `file_url` *(required)* | string | Internal URL to access the file |
| `filename` *(required)* | string | Original filename |
| `size_bytes` *(required)* | integer | File size in bytes |
| `content_type` *(required)* | string | MIME type |
| `duration_seconds` | number \| null | Audio duration if detected |
| `expires_at` | string \| null | When the file will be deleted |

### HTTPValidationError

| Field | Type | Description |
| --- | --- | --- |
| `detail` | ValidationError[] |  |

### HealthResponse

Health check response.

| Field | Type | Description |
| --- | --- | --- |
| `status` *(required)* | "healthy" \| "unhealthy" \| "degraded" |  |
| `version` *(required)* | string | API version |
| `loaded_models` | string[] |  |
| `available_models` | string[] |  |
| `device` *(required)* | string | Compute device (cpu/cuda) |
| `silero_vad_available` *(required)* | boolean |  |
| `ina_vad_available` | boolean | INA Speech Segmenter availability |
| `streaming_available` | boolean | WebSocket streaming availability |
| `storage_provider` | string \| null | Storage provider (local/gcs/s3) |
| `storage_bucket` | string \| null | Storage bucket name (for gcs/s3) |
| `storage_configured` | boolean | Whether storage is configured |
| `redis_status` | "healthy" \| "unhealthy" \| "not_configured" | Redis connection status |
| `redis_version` | string \| null | Redis server version |
| `redis_connected_clients` | integer \| null | Number of Redis clients |
| `redis_memory` | string \| null | Redis memory usage |
| `auth_enabled` | boolean | Whether API key authentication is enforced |
| `timestamp` *(required)* | string | Current server time |

### JobStatusEnum

Job status values.

### LanguagesResponse

Available languages response.

| Field | Type | Description |
| --- | --- | --- |
| `languages` *(required)* | string[] | Available language codes |
| `loaded` *(required)* | string[] | Currently loaded languages |
| `default` *(required)* | string | Default language |

### MediaProbeRequest

Request body for POST /v1/media/probe.

| Field | Type | Description |
| --- | --- | --- |
| `url` *(required)* | string | Public media URL to probe (http/https only) |

### MediaProbeResponse

Metadata about a remote media URL, returned without downloading the media.

Intended for pre-submission checks (credit gating, ETA display, live-stream
detection). `duration_seconds` is the primary field; if null, inspect
`warnings` for the reason.

| Field | Type | Description |
| --- | --- | --- |
| `object` | "media_probe" | Object type |
| `url` *(required)* | string | Probed URL (canonical, as echoed back) |
| `source` *(required)* | string | Detected source platform (youtube, twitter, direct, etc.) |
| `extractor` | string \| null | yt-dlp extractor name, or 'generic' for direct URLs |
| `title` | string \| null | Media title |
| `uploader` | string \| null | Uploader/channel name |
| `duration_seconds` | number \| null | Duration in seconds (null if unknown) |
| `thumbnail_url` | string \| null | Thumbnail image URL |
| `upload_date` | string \| null | Upload date (ISO 8601 YYYY-MM-DD) |
| `is_live` | boolean | True if the URL points to a live stream |
| `language` | string \| null | Declared source language, if any |
| `file_size_bytes` | integer \| null | Content-Length (direct URLs) or yt-dlp filesize_approx |
| `content_type` | string \| null | MIME type (direct URLs) |
| `warnings` | ("live_stream" \| "duration_unknown" \| "very_long" \| "age_restricted")[] | Non-fatal issues the client should surface |
| `cached` | boolean | True if this response came from the probe cache |
| `probed_at` *(required)* | string | UTC timestamp when the probe ran (ISO 8601) |
| `metadata` | object | Extractor-specific extras (view_count, description, etc.) |

### ProgressResponse

Job progress information.

| Field | Type | Description |
| --- | --- | --- |
| `stage` *(required)* | string | Current processing stage |
| `percent` *(required)* | integer | Progress percentage |
| `total_segments` | integer \| null | Total segments to process |
| `done_segments` | integer \| null | Segments completed |

### QuotaInfo

Quota information for a specific limit.

| Field | Type | Description |
| --- | --- | --- |
| `limit` *(required)* | integer | Maximum allowed |
| `used` *(required)* | integer | Currently used |
| `remaining` *(required)* | integer | Remaining quota |
| `reset_at` | string \| null | When quota resets |

### RevokeAPIKeyRequest

Request to revoke an API key.

| Field | Type | Description |
| --- | --- | --- |
| `reason` | string \| null | Reason for revocation |

### RevokeKeyResponse

| Field | Type | Description |
| --- | --- | --- |
| `message` *(required)* | string |  |
| `key_id` *(required)* | string |  |
| `status` *(required)* | string |  |
| `revoked_at` | string |  |

### RootResponse

| Field | Type | Description |
| --- | --- | --- |
| `name` | string |  |
| `version` | string |  |
| `docs` | string |  |
| `endpoints` | object |  |

### RotateAPIKeyRequest

Request to rotate an API key.

| Field | Type | Description |
| --- | --- | --- |
| `grace_minutes` | integer | How long the old key stays valid after rotation (0 = revoke immediately). During the grace window both keys work, so clients can swap with no downtime; the old key then expires. |

### SegmentResponse

Transcribed segment in response.

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | integer | Segment index |
| `start_ms` *(required)* | integer | Start time in milliseconds |
| `end_ms` *(required)* | integer | End time in milliseconds |
| `duration_ms` *(required)* | integer | Duration in milliseconds |
| `text` *(required)* | string | Transcribed text |
| `speaker` | string \| null | Speaker label (if diarization enabled) |
| `confidence` | number \| null | Confidence score |

### SignedUrlResponse

Response with signed upload URL.

| Field | Type | Description |
| --- | --- | --- |
| `upload_url` *(required)* | string | Pre-signed URL for direct upload |
| `file_id` *(required)* | string | File identifier to use after upload |
| `file_url` *(required)* | string | URL to use in transcription request after upload completes |
| `method` | string | HTTP method to use for upload |
| `headers` | object | Required headers for the upload request |
| `expires_in` *(required)* | integer | Seconds until the signed URL expires |
| `max_size_bytes` *(required)* | integer | Maximum allowed file size |
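
A minimal sketch of preparing the direct upload from this response, using only the Python standard library. It assumes the response has been parsed into a dictionary and that `PUT` is a sensible fallback when `method` is absent; that default and the helper name are assumptions, not documented behaviour.

```python
import urllib.request


def build_upload_request(signed: dict, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) the upload request described by a SignedUrlResponse."""
    if len(payload) > signed["max_size_bytes"]:
        raise ValueError("file exceeds max_size_bytes")
    return urllib.request.Request(
        url=signed["upload_url"],
        data=payload,
        method=signed.get("method") or "PUT",  # assumption: PUT when omitted
        headers=signed.get("headers") or {},
    )


signed = {
    "upload_url": "https://example.invalid/upload",  # placeholder URL
    "file_id": "file_123",
    "file_url": "https://example.invalid/files/file_123",
    "headers": {"Content-Type": "audio/wav"},
    "expires_in": 900,
    "max_size_bytes": 10,
}
request = build_upload_request(signed, b"abc")
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should happen within `expires_in` seconds; afterwards `file_url` is the value to pass in the transcription request.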

### SpeakerInfo

Speaker information from diarization.

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string | Speaker identifier |
| `label` *(required)* | string | Speaker label (e.g., 'Speaker 1') |
| `segments_count` *(required)* | integer | Number of segments for this speaker |

### TranscriptionErrorInfo

Error information for failed transcriptions.

| Field | Type | Description |
| --- | --- | --- |
| `code` *(required)* | string | Error code |
| `message` *(required)* | string | Error message |

### TranscriptionListResponse

Paginated list of transcriptions (v4).

| Field | Type | Description |
| --- | --- | --- |
| `has_more` *(required)* | boolean | Whether more results are available |
| `next_cursor` | string \| null | Cursor for next page |
| `total` *(required)* | integer | Total matching items |
| `data` | TranscriptionResponse[] | List of transcriptions |
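
The cursor fields above lend themselves to a simple loop. The sketch below is transport-agnostic: `fetch_page` is a hypothetical callable that takes a cursor (or `None` for the first page) and returns one parsed list response.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages by following has_more / next_cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data") or []
        # Stop when the server reports no more pages or gives no cursor.
        if not page.get("has_more") or not page.get("next_cursor"):
            break
        cursor = page["next_cursor"]


# Fake pages standing in for real HTTP calls.
pages = {
    None: {"data": [{"id": "t1"}, {"id": "t2"}], "has_more": True, "next_cursor": "c1", "total": 3},
    "c1": {"data": [{"id": "t3"}], "has_more": False, "next_cursor": None, "total": 3},
}
print([item["id"] for item in iter_all(lambda cursor: pages[cursor])])
```

The same loop applies to any list response that carries `data`, `has_more`, and `next_cursor`.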

### TranscriptionResponse

Unified transcription response (v4).

Replaces JobStatusResponse + JobCreatedResponse + TranscriptionResultResponse.
Used for both creation (202) and status polling (200).

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string | Transcription job ID |
| `object` | "transcription" | Object type |
| `status` *(required)* | JobStatusEnum | Job status |
| `language` | string \| null | Language code (detected if 'auto') |
| `text` | string \| null | Full transcribed text |
| `segments` | SegmentResponse[] \| null | Transcribed segments |
| `speakers` | SpeakerInfo[] \| null | Speaker info (if diarization enabled) |
| `progress` | ProgressResponse \| null | Progress info |
| `duration_seconds` | number \| null | Audio duration in seconds |
| `processing_time_seconds` | number \| null | Processing time in seconds |
| `error` | TranscriptionErrorInfo \| null | Error details (failed jobs) |
| `metadata` | object \| null | Processing metadata |
| `created_at` | string \| null | Creation time (ISO 8601) |
| `completed_at` | string \| null | Completion time (ISO 8601) |
| `result_url` | string \| null | URL to transcript JSON artifact (completed jobs) |
| `audio_url` | string \| null | Canonical audio reference (lesan:// URI) |
| `audio_download_url` | string \| null | Stable URL to download audio (302 redirect to signed URL) |
| `url` *(required)* | string | Self link |
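
Because the same object is returned at creation and on every status poll, a client can loop on it until the job finishes. The sketch below is a hedged illustration: the values of `JobStatusEnum` are not listed in this section, so `"completed"` and `"failed"` are assumptions, and `fetch` is a hypothetical callable that returns the latest parsed response.

```python
import time
from typing import Callable

# Assumption: terminal JobStatusEnum values; check the enum definition before relying on these.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_transcription(
    fetch: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status; raise if it failed or timed out."""
    for _ in range(max_attempts):
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            error = job.get("error")
            if error:
                raise RuntimeError(f"{error['code']}: {error['message']}")
            return job
        sleep(interval_seconds)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")


# Simulated responses in place of real HTTP calls.
responses = iter([
    {"id": "job_1", "status": "processing", "url": "/x"},
    {"id": "job_1", "status": "completed", "text": "selam", "url": "/x"},
])
print(wait_for_transcription(lambda: next(responses), sleep=lambda s: None)["text"])
```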

### TranslationRequest

| Field | Type | Description |
| --- | --- | --- |
| `key` *(required)* | string | Your API key for authentication. |
| `text` *(required)* | string | The text to translate. |
| `src_lang` *(required)* | "en" \| "am" \| "ti" | ISO 639-1 language code of the source text. |
| `tgt_lang` *(required)* | "en" \| "am" \| "ti" | ISO 639-1 language code for the translation output. |

### TranslationResponse

| Field | Type | Description |
| --- | --- | --- |
| `tgt_text` | string | The translated text in the target language. |
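
To make the request and response shapes concrete, here is a small sketch that validates the language codes, serialises a `TranslationRequest` body, and reads `tgt_text` from a response. The helper names are hypothetical, and the endpoint path is deliberately left out because it is not stated in this section.

```python
import json
from typing import Optional

SUPPORTED_LANGS = {"en", "am", "ti"}


def build_translation_body(api_key: str, text: str, src_lang: str, tgt_lang: str) -> str:
    """Serialise a TranslationRequest, rejecting language codes outside the documented set."""
    for name, lang in (("src_lang", src_lang), ("tgt_lang", tgt_lang)):
        if lang not in SUPPORTED_LANGS:
            raise ValueError(f"{name} must be one of {sorted(SUPPORTED_LANGS)}, got {lang!r}")
    body = {"key": api_key, "text": text, "src_lang": src_lang, "tgt_lang": tgt_lang}
    # ensure_ascii=False keeps Ge'ez script readable in the JSON payload.
    return json.dumps(body, ensure_ascii=False)


def extract_translation(response_body: str) -> Optional[str]:
    """Return tgt_text from a TranslationResponse, or None if it is absent."""
    return json.loads(response_body).get("tgt_text")


print(build_translation_body("YOUR_API_KEY", "Hello", "en", "am"))
```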

### UpdateAPIKeyRequest

Request to update an API key.

| Field | Type | Description |
| --- | --- | --- |
| `name` | string \| null | New name/label |
| `description` | string \| null | New description |
| `scopes` | string[] \| null | New permission scopes |
| `allowed_ips` | string[] \| null | New IP whitelist |
| `allowed_origins` | string[] \| null | New origin whitelist |
| `rate_limits` | object \| null | New rate limits |

### UsageResponse

API usage and quota information.

| Field | Type | Description |
| --- | --- | --- |
| `tier` *(required)* | string | Current subscription tier |
| `requests_minute` *(required)* | QuotaInfo | Requests per minute quota |
| `requests_hour` *(required)* | QuotaInfo | Requests per hour quota |
| `requests_day` *(required)* | QuotaInfo | Requests per day quota |
| `concurrent_jobs` *(required)* | QuotaInfo | Concurrent job limit |
| `websocket_connections` *(required)* | QuotaInfo | WebSocket connection limit |
| `audio_minutes_month` | QuotaInfo \| null | Audio minutes this month |
| `total_requests` *(required)* | integer | Total requests all time |
| `total_jobs` | integer | Total jobs processed |
| `total_audio_minutes` | number | Total audio minutes processed |
| `account_created_at` *(required)* | string | Account creation date |
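
A client might inspect this response before submitting more work. The sketch below lists which quotas are exhausted, relying only on the `remaining` field of each `QuotaInfo`; the helper name is hypothetical and the response is assumed to be a parsed dictionary.

```python
QUOTA_FIELDS = (
    "requests_minute",
    "requests_hour",
    "requests_day",
    "concurrent_jobs",
    "websocket_connections",
    "audio_minutes_month",
)


def exhausted_quotas(usage: dict) -> list:
    """Return the names of quota fields whose remaining allowance is zero or less."""
    names = []
    for field in QUOTA_FIELDS:
        quota = usage.get(field)  # audio_minutes_month may be null or absent
        if quota and quota["remaining"] <= 0:
            names.append(field)
    return names


usage = {
    "tier": "free",
    "requests_minute": {"limit": 10, "used": 10, "remaining": 0, "reset_at": None},
    "requests_hour": {"limit": 100, "used": 40, "remaining": 60, "reset_at": None},
    "audio_minutes_month": None,
}
print(exhausted_quotas(usage))
```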

### ValidationError

| Field | Type | Description |
| --- | --- | --- |
| `loc` *(required)* | string \| integer[] |  |
| `msg` *(required)* | string |  |
| `type` *(required)* | string |  |

### WebSocketBufferClearedMessage

Sent after a CLEAR command.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `cleared_bytes` *(required)* | integer |  |
| `session_id` *(required)* | string |  |

### WebSocketChunkReceivedMessage

Acknowledgement of an audio chunk.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `chunk_number` *(required)* | integer |  |
| `buffer_size` *(required)* | integer | Current buffer size in bytes |
| `session_id` *(required)* | string |  |

### WebSocketErrorMessage

Sent when an error occurs during the session.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `error` *(required)* | string | Error description |
| `session_id` | string |  |
| `supported_formats` | string[] | Included when the error is about an unsupported format |

### WebSocketFormatAcceptedMessage

Sent after a `FORMAT:<name>` command is accepted.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `format` *(required)* | string |  |
| `sample_rate` *(required)* | integer |  |
| `channels` *(required)* | integer |  |
| `bit_depth` *(required)* | integer |  |
| `session_id` *(required)* | string |  |

### WebSocketPongMessage

Response to a PING command.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `session_id` *(required)* | string |  |
| `buffer_size` | integer |  |
| `session_duration_seconds` | number |  |

### WebSocketReadyMessage

Sent by the server immediately after connection is accepted.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `session_id` *(required)* | string | Unique session identifier |
| `format` *(required)* | string | Negotiated audio format |
| `sample_rate` *(required)* | integer | Sample rate in Hz |
| `job_id` | string | Job ID for post-session retrieval. Present when Redis backend is enabled. |

### WebSocketServerMessage

Union of all server-to-client WebSocket messages, discriminated by `type`.
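
A discriminated union like this is typically handled by routing each incoming JSON frame on its `type` field. The sketch below shows that pattern only; the literal `type` strings (`"ready"`, `"transcription"`) are placeholders, since the actual values are not listed in this section, and `make_dispatcher` is a hypothetical helper.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]


def make_dispatcher(handlers: Dict[str, Handler], default: Handler) -> Callable[[str], str]:
    """Return a function that parses a JSON frame and routes it by its `type` field."""
    def dispatch(raw: str) -> str:
        message = json.loads(raw)
        handler = handlers.get(message.get("type"), default)
        return handler(message)
    return dispatch


# Placeholder type names: replace with the values the server actually sends.
dispatch = make_dispatcher(
    {
        "ready": lambda m: f"session {m['session_id']} ready",
        "transcription": lambda m: m["transcription"],
    },
    default=lambda m: f"unhandled message type: {m.get('type')}",
)
print(dispatch('{"type": "ready", "session_id": "s1", "format": "pcm", "sample_rate": 16000}'))
```

Keeping a default handler means new message types added by the server are logged rather than crashing the client.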

### WebSocketTranscriptionMessage

Transcription result returned after a TRANSCRIBE or END command.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `transcription` *(required)* | string | Transcribed text |
| `language` *(required)* | string | Language code used |
| `is_final` *(required)* | boolean | True if triggered by END command |
| `is_turn` | boolean |  |
| `duration_seconds` *(required)* | number | Audio duration in seconds |
| `processing_time_seconds` *(required)* | number | ASR inference time in seconds |
| `audio_size_bytes` | integer |  |
| `session_id` *(required)* | string |  |
| `job_id` | string | Job ID for post-session retrieval. Present in the final message (is_final: true) when Redis backend is enabled. |

### WebSocketTurnDetectedMessage

Sent when server VAD detects a completed speech turn. Only emitted when `turn_detection=server_vad`.

| Field | Type | Description |
| --- | --- | --- |
| `type` *(required)* | string |  |
| `transcription` *(required)* | string |  |
| `language` *(required)* | string |  |
| `turn_start_ms` *(required)* | integer | Turn start offset in milliseconds |
| `turn_end_ms` *(required)* | integer | Turn end offset in milliseconds |
| `duration_seconds` | number |  |
| `processing_time_seconds` | number |  |
| `session_id` *(required)* | string |  |

### WebhookCreateRequest

Request to create a webhook.

| Field | Type | Description |
| --- | --- | --- |
| `url` *(required)* | string | URL to receive webhook notifications |
| `events` | WebhookEventEnum[] | Events to subscribe to |
| `secret` | string \| null | Secret for HMAC signature verification (optional) |
| `name` | string \| null | Friendly name for the webhook |
| `description` | string \| null | Description |
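
When a `secret` is set, the receiving server can check each delivery's HMAC signature. The sketch below shows the general technique under stated assumptions: it presumes HMAC-SHA256 over the raw request body with a hex-encoded digest, and it leaves out the signature header name, because none of these details are specified in this section.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Compare the received signature with an HMAC-SHA256 of the raw body (assumed scheme)."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


body = b'{"event": "example"}'  # placeholder payload
signature = hmac.new(b"my-secret", body, hashlib.sha256).hexdigest()
print(verify_signature("my-secret", body, signature))
```

The signature should be computed over the exact bytes received, before any JSON parsing or re-serialisation, since whitespace changes alter the digest.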

### WebhookEventEnum

Webhook event types.

### WebhookListResponse

Paginated list of webhooks (v4).

| Field | Type | Description |
| --- | --- | --- |
| `has_more` *(required)* | boolean | Whether more results are available |
| `next_cursor` | string \| null | Cursor for next page |
| `total` *(required)* | integer | Total matching items |
| `data` | WebhookResponse[] |  |

### WebhookResponse

Webhook configuration response.

| Field | Type | Description |
| --- | --- | --- |
| `id` *(required)* | string | Webhook ID |
| `object` | "webhook" | Object type |
| `url` *(required)* | string | Callback URL |
| `events` *(required)* | string[] | Subscribed events |
| `name` | string \| null |  |
| `description` | string \| null |  |
| `status` *(required)* | string | Status: active, paused, failed |
| `total_deliveries` | integer | Total delivery attempts |
| `successful_deliveries` | integer | Successful deliveries |
| `failed_deliveries` | integer | Failed deliveries |
| `last_delivery_at` | string \| null | Last delivery timestamp |
| `last_failure_reason` | string \| null | Last failure reason |
| `created_at` *(required)* | string |  |
| `updated_at` | string \| null |  |

### WebhookTestResponse

| Field | Type | Description |
| --- | --- | --- |
| `success` *(required)* | boolean |  |
| `status_code` *(required)* | integer |  |
| `response_body` | string |  |
| `error` | string |  |
| `duration_ms` *(required)* | integer |  |

### WebhookUpdateRequest

Request to update a webhook.

| Field | Type | Description |
| --- | --- | --- |
| `url` | string \| null | New URL |
| `events` | WebhookEventEnum[] \| null | New event subscriptions |
| `secret` | string \| null | New secret |
| `name` | string \| null | New name |
| `description` | string \| null | New description |
| `is_active` | boolean \| null | Enable/disable webhook |