ASR (Automatic Speech Recognition)

Convert audio to text with high accuracy for Amharic, Tigrinya, and English using our ASR API.

💡 Tip

Need to gate on duration (credits, max length) or show an ETA before transcribing? Call /v1/media/probe first — it returns duration_seconds, title, thumbnail, and a live-stream flag without downloading the media.
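
The duration gate described in this tip can be sketched as a small check on the probe result. This is a minimal sketch that assumes you already hold the parsed JSON body of /v1/media/probe as a dict. Only duration_seconds comes from the text above; the function name, the max_seconds limit, and the handling of a missing duration are illustrative assumptions.

```python
from typing import Optional


def check_duration(probe: dict, max_seconds: float) -> Optional[str]:
    """Return a rejection reason, or None if the media may be transcribed.

    `probe` is assumed to be the parsed JSON body of /v1/media/probe.
    """
    duration = probe.get("duration_seconds")
    if duration is None:
        # Assumption: an unknown duration is rejected rather than allowed through.
        return "duration unknown"
    if duration > max_seconds:
        return f"media is {duration:.0f}s, limit is {max_seconds:.0f}s"
    return None


# Hypothetical probe result checked against a 10-minute limit
print(check_duration({"duration_seconds": 754.2}, max_seconds=600))
```

Running the check before you submit a job lets you reject over-length media without spending credits on it.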

Overview

Our ASR service transcribes audio files into text. You can transcribe single files or batch process multiple files. The service supports both synchronous (immediate) and asynchronous (background) processing.

File Upload Methods

You can provide audio to the ASR API in three ways:

Audio URL

Provide a publicly accessible URL to your audio file. The API will download and process it:

{
  "audio_url": "https://example.com/audio.mp3",
  "language": "am",
  "mode": "sync"
}

Multipart Upload

Upload the audio file directly using multipart form data:

curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@recording.mp3" \
  -F "language=am" \
  -F "mode=sync"

Signed Upload URL

For large files, request a signed upload URL, upload directly to storage, then submit for transcription:

# Step 1: Get a signed upload URL
curl "https://asr.lesan.ai/v1/uploads/signed-url?filename=recording.mp3&content_type=audio/mpeg" \
  -H "Authorization: Bearer YOUR_API_KEY"


# Response: { "upload_url": "https://storage.lesan.ai/...", "file_url": "...", "file_id": "file_abc123" }


# Step 2: Upload the file to the signed upload_url
curl -X PUT "https://storage.lesan.ai/..." \
  -H "Content-Type: audio/mpeg" \
  --data-binary @recording.mp3


# Step 3: Submit for transcription using the file_url from step 1
curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"audio_url": "<file_url from step 1>", "language": "am", "mode": "async"}'
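
Step 1 passes the filename and content type as query parameters, so filenames with spaces or other special characters need encoding. Here is a minimal sketch of building that request URL in Python, assuming the endpoint accepts standard percent-encoded query strings. The helper name and the application/octet-stream fallback are illustrative and not part of the API.

```python
import mimetypes
from typing import Optional
from urllib.parse import urlencode

API_URL = "https://asr.lesan.ai"


def signed_url_request(filename: str, content_type: Optional[str] = None) -> str:
    """Build the step 1 request URL for a signed upload."""
    if content_type is None:
        # Guess from the file extension; the generic fallback is an assumption.
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    query = urlencode({"filename": filename, "content_type": content_type})
    return f"{API_URL}/v1/uploads/signed-url?{query}"


print(signed_url_request("interview 01.mp3", "audio/mpeg"))
```

Send the same content type in the Content-Type header of the step 2 upload, as the curl example does.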

Supported Audio Formats

The ASR API supports MP3, WAV, M4A, FLAC, OGG, WebM, and AAC files up to 500 MB. See the Audio Formats reference for details and recommendations.
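
A client-side pre-flight check can catch unsupported files before you upload them. The sketch below checks only the file extension and size. It assumes "500 MB" means 500 * 1024 * 1024 bytes, which the API may count differently, and the function name is illustrative.

```python
from pathlib import Path
from typing import List

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac"}
# Assumption: "500 MB" is read as 500 * 1024 * 1024 bytes.
MAX_BYTES = 500 * 1024 * 1024


def preflight(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


print(preflight("lecture.txt", 1024))
```

An extension check does not verify the actual encoding, so the API can still reject a file that passes this test.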

Basic Transcription

Submit a transcription job, then poll the status endpoint until the job completes. Even with mode=sync, the first response may come back as queued, so always fetch the final result using the job id.

# Step 1: submit job
curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/audio.mp3",
    "language": "am",
    "mode": "sync"
  }'


# Response: {"id": "...", "status": "queued", "url": "/v1/transcriptions/..."}


# Step 2: poll status until completed
curl -X GET "https://asr.lesan.ai/v1/transcriptions/<JOB_ID>" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response Schema

The first response may be queued with a job id. Fetch /v1/transcriptions/{job_id} to get the final completed result.

// Example: initial response
{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "queued",
  "language": "am",
  "text": null,
  "segments": null,
  "speakers": null,
  "progress": null,
  "duration_seconds": null,
  "processing_time_seconds": null,
  "error": null,
  "metadata": null,
  "created_at": "2026-03-05T12:14:53.849727Z",
  "completed_at": null,
  "result_url": null,
  "audio_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}


// Example: completed response
{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "completed",
  "language": "am",
  "text": "...",
  "segments": [
    {
      "id": 0,
      "start_ms": 174,
      "end_ms": 2354,
      "duration_ms": 2180,
      "text": "...",
      "speaker": null,
      "confidence": null
    }
  ],
  "speakers": null,
  "progress": null,
  "duration_seconds": null,
  "processing_time_seconds": 15.40681,
  "error": null,
  "metadata": {
    "language": "am"
  },
  "created_at": "2026-03-05T12:14:53.841894Z",
  "completed_at": "2026-03-05T12:15:09.257068Z",
  "result_url": null,
  "audio_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}
  • id — Unique identifier for the transcription job
  • object — Object type (always transcription)
  • status — Job status: queued, processing, completed, failed, cancelled
  • text — Full transcription text (null until completed)
  • segments — Segment list (null until completed). Segments use millisecond fields like start_ms / end_ms.
  • processing_time_seconds — Processing time (available when completed)
  • url — Relative URL for this job (combine with your base URL to fetch status)
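
As a sketch of how these fields might be consumed, the helpers below resolve the relative url against the base URL and print each segment with a readable timestamp. The helper names and the MM:SS.mmm format are illustrative choices. The field names are taken from the example responses above.

```python
from urllib.parse import urljoin

API_URL = "https://asr.lesan.ai"


def job_url(job: dict) -> str:
    """Resolve the relative `url` field against the base URL."""
    return urljoin(API_URL, job["url"])


def format_ms(ms: int) -> str:
    """Render milliseconds as MM:SS.mmm."""
    minutes, rest = divmod(ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def segment_lines(job: dict) -> list:
    """One line per segment; `segments` is null until the job completes."""
    return [
        f"[{format_ms(s['start_ms'])} - {format_ms(s['end_ms'])}] {s['text']}"
        for s in job.get("segments") or []
    ]


job = {
    "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191",
    "segments": [{"id": 0, "start_ms": 174, "end_ms": 2354, "text": "..."}],
}
print(job_url(job))
print("\n".join(segment_lines(job)))
```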

Supported Languages

  • am — Amharic
  • ti — Tigrinya
  • en — English
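
Validating the language code on the client avoids a round trip for a request that cannot succeed. This is a minimal sketch and the helper name is hypothetical. The accepted values are the language codes listed above and the two mode values used in the request examples.

```python
SUPPORTED_LANGUAGES = {"am": "Amharic", "ti": "Tigrinya", "en": "English"}
MODES = {"sync", "async"}


def build_payload(audio_url: str, language: str, mode: str) -> dict:
    """Build the JSON body for POST /v1/transcriptions, rejecting unknown values early."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"unsupported language {language!r}; expected one of {sorted(SUPPORTED_LANGUAGES)}"
        )
    if mode not in MODES:
        raise ValueError(f"unsupported mode {mode!r}; expected one of {sorted(MODES)}")
    return {"audio_url": audio_url, "language": language, "mode": mode}


print(build_payload("https://example.com/audio.mp3", "ti", "async"))
```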

Processing Modes

Synchronous (sync)

Best for short audio files. The API will attempt to process the request inline, but you should still handle the case where the initial response returns queued. Always use the returned job id to fetch the final result.

Asynchronous (async)

Queues the job and returns a job ID immediately. Use this for longer files. Poll the job status endpoint to check progress and retrieve results.

Async Polling Workflow

For async transcription, follow this workflow:

import requests
import time


API_URL = 'https://asr.lesan.ai'
headers = {"Authorization": "Bearer YOUR_API_KEY"}


# Step 1: Submit async job
response = requests.post(
    f"{API_URL}/v1/transcriptions",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "audio_url": "https://example.com/long-audio.mp3",
        "language": "am",
        "mode": "async"
    }
)
response.raise_for_status()  # Stop early on 4xx/5xx instead of parsing an error body
job = response.json()
job_id = job["id"]
print(f"Job submitted: {job_id}")


# Step 2: Poll for completion
while True:
    status_resp = requests.get(
        f"{API_URL}/v1/transcriptions/{job_id}",
        headers=headers
    )
    status = status_resp.json()


    progress = status.get("progress") or {}
    print(f"Status: {status['status']} ({progress.get('percent', 0)}%)")


    if status["status"] == "completed":
        print(f"Transcription: {status['text']}")
        break
    elif status["status"] in ("failed", "cancelled"):
        # "error" is present but null unless the job failed, so fall back with `or`
        print(f"Job {status['status']}: {status.get('error') or 'Unknown error'}")
        break


    time.sleep(5)  # Poll every 5 seconds

Job Lifecycle

Async transcription jobs go through these statuses:

  • queued — Job is waiting to be processed
  • processing — Transcription is in progress. The progress field shows completion percentage
  • completed — Transcription is done. Results are in the response
  • failed — An error occurred. Check the error field for details
  • cancelled — The job was cancelled via the cancel endpoint
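
Polling can be wrapped in a reusable helper that stops on any terminal status from the list above and gives up after a deadline. This is a sketch, not an official client: wait_for_job, its fetch callback, and the default interval and timeout are assumptions for illustration. The demo uses canned responses in place of real HTTP calls.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` until the job reaches a terminal status or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout}s")
        sleep(interval)


# Demo with canned responses standing in for GET /v1/transcriptions/{job_id}
canned = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
final = wait_for_job(lambda: next(canned), interval=0)
print(final["status"])
```

In real use, fetch would wrap the status request for a single job and return its parsed JSON body.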

Managing Jobs

List Jobs

Retrieve a list of your transcription jobs:

# List recent jobs
curl "https://asr.lesan.ai/v1/transcriptions?limit=10&status_filter=completed" \
  -H "Authorization: Bearer YOUR_API_KEY"

Cancel a Job

Cancel a queued or processing job:

curl -X POST "https://asr.lesan.ai/v1/transcriptions/job_abc123/cancel" \
  -H "Authorization: Bearer YOUR_API_KEY"

Delete a Job

Delete a job and its results permanently:

curl -X DELETE "https://asr.lesan.ai/v1/transcriptions/job_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"

Batch Processing

Submit several audio files in one request. A batch creates a single parent batch job, and each file is transcribed as its own child job. Batch processing is always asynchronous — you poll the parent job for overall progress and for per-file results.

Submit a batch

Send a JSON body with an audio_urls array. Each entry may be a public https:// URL or a lesan:// URI returned by the uploads endpoint. A batch accepts up to 100 files.

curl "https://asr.lesan.ai/v1/transcriptions/batch" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_urls": [
      "https://example.com/audio1.mp3",
      "https://example.com/audio2.mp3"
    ],
    "language": "am"
  }'

To upload local files directly, send multipart/form-data with one files field per file instead of audio_urls:

curl "https://asr.lesan.ai/v1/transcriptions/batch" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "files=@meeting-part1.mp3" \
  -F "files=@meeting-part2.mp3" \
  -F "language=am"
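
Because a batch accepts at most 100 files, a longer list of URLs has to be split across several batch requests. The sketch below builds one JSON body per chunk. The helper names are illustrative, and each resulting body would be sent as a separate POST to /v1/transcriptions/batch.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_FILES = 100  # Per-batch limit stated above


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_FILES) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each at most `size` items long."""
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def batch_payloads(urls: Sequence[str], language: str) -> List[dict]:
    """Build one batch request body per chunk of URLs."""
    return [{"audio_urls": chunk, "language": language} for chunk in chunk_urls(urls)]


urls = [f"https://example.com/audio{i}.mp3" for i in range(250)]
for payload in batch_payloads(urls, "am"):
    print(len(payload["audio_urls"]))
```

Each request creates its own parent batch job, so keep every returned id and poll them separately.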

Batch creation response

The API responds with 202 Accepted and a single transcription object that represents the batch job. Its top-level id is the ID you poll:

{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "queued",
  "language": "am",
  "text": null,
  "segments": null,
  "speakers": null,
  "progress": null,
  "duration_seconds": null,
  "processing_time_seconds": null,
  "error": null,
  "metadata": {
    "batch_mode": true,
    "total_count": 2
  },
  "created_at": "2026-03-05T12:14:53.849727Z",
  "completed_at": null,
  "result_url": null,
  "audio_url": null,
  "audio_download_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}

Warning

The batch job ID is the top-level id field — there is no batch_id field, and the creation response has no top-level jobs array. Poll GET /v1/transcriptions/{id} using that id.

Poll the batch job

Poll GET /v1/transcriptions/{batch_job_id} until the batch status is completed or failed. Per-file results are nested in the metadata.jobs array:

{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "completed",
  "language": "am",
  "text": null,
  "segments": null,
  "speakers": null,
  "progress": {
    "stage": "completed",
    "percent": 100,
    "total_segments": 2,
    "done_segments": 2
  },
  "duration_seconds": null,
  "processing_time_seconds": null,
  "error": null,
  "metadata": {
    "batch_mode": true,
    "total_count": 2,
    "completed_count": 2,
    "failed_count": 0,
    "processing_count": 0,
    "queued_count": 0,
    "jobs": [
      {
        "job_id": "15b8e1de-b0ab-43d0-825e-a09e1e973ce7",
        "url": "https://example.com/audio1.mp3",
        "status": "completed",
        "text": "..."
      },
      {
        "job_id": "dd403af8-40f9-4cde-8c92-ba5c35e36c38",
        "url": "https://example.com/audio2.mp3",
        "status": "completed",
        "text": "..."
      }
    ]
  },
  "created_at": "2026-03-05T12:14:53.841894Z",
  "completed_at": null,
  "result_url": null,
  "audio_url": null,
  "audio_download_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}

The batch poll response contains:

| Field | Description |
| --- | --- |
| status | Overall batch status: queued, processing, completed, or failed. |
| progress | Object with stage, percent, total_segments, and done_segments. For a batch, total_segments / done_segments count files, not audio segments. |
| metadata.batch_mode | Always true for batch jobs. |
| metadata.total_count | Number of files in the batch. |
| metadata.completed_count / failed_count / processing_count / queued_count | Live per-status file counts. |
| metadata.jobs | Array with one entry per submitted file (see below). |

Each metadata.jobs[] entry describes one file:

| Field | Description |
| --- | --- |
| job_id | The child transcription job ID. Fetch it for the full per-file result. |
| url | The exact audio URL you submitted; use this to map each result back to its input. |
| status | That file's status: queued, processing, completed, or failed. |
| text | The transcript text. Present once that file reaches completed. |
| progress | A child progress object. Present while the file is processing. |

💡 Tip

Each metadata.jobs[] entry maps to its input by the url field and carries plain text only. For the full per-file result — timed segments, speakers, processing_time_seconds — fetch the child job by its job_id: GET /v1/transcriptions/{job_id}. Each child is an ordinary transcription job with the same shape as the Response Schema above.
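
Mapping results back to inputs can be sketched as a dictionary keyed by the url field. The helper name is hypothetical. If you submit the same URL twice in one batch, later entries overwrite earlier ones in this sketch, so use job_id as the key in that case.

```python
def transcripts_by_url(batch: dict) -> dict:
    """Map each submitted URL to its transcript text (None until that file completes)."""
    jobs = (batch.get("metadata") or {}).get("jobs") or []
    return {job["url"]: job.get("text") for job in jobs}


# Trimmed-down batch poll response with one finished and one in-flight file
batch = {
    "status": "processing",
    "metadata": {
        "jobs": [
            {"job_id": "15b8e1de-b0ab-43d0-825e-a09e1e973ce7",
             "url": "https://example.com/audio1.mp3",
             "status": "completed", "text": "..."},
            {"job_id": "dd403af8-40f9-4cde-8c92-ba5c35e36c38",
             "url": "https://example.com/audio2.mp3",
             "status": "processing"},
        ]
    },
}
for url, text in transcripts_by_url(batch).items():
    print(url, "->", text)
```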

Note

On the batch parent job, text and segments stay null, and completed_at / processing_time_seconds may stay null even after completion — those values belong to the individual child jobs. Rely on the batch status and the metadata counts.

Batch polling example

import requests
import time


API_URL = 'https://asr.lesan.ai'
headers = {"Authorization": "Bearer YOUR_API_KEY"}


# batch_id is the "id" returned when the batch was created.
batch_id = "YOUR_BATCH_JOB_ID"  # Placeholder: replace with your batch job's id
while True:
    resp = requests.get(f"{API_URL}/v1/transcriptions/{batch_id}", headers=headers)
    resp.raise_for_status()
    batch = resp.json()


    meta = batch.get("metadata") or {}  # Guard against a null metadata field
    print(f"{batch['status']}: "
          f"{meta.get('completed_count', 0)} done, "
          f"{meta.get('failed_count', 0)} failed "
          f"of {meta.get('total_count', 0)}")


    if batch["status"] in ("completed", "failed"):
        break
    time.sleep(5)  # Poll every 5 seconds


# Per-file results are under metadata.jobs — correlate by "url".
for job in batch["metadata"]["jobs"]:
    print(f"  {job['url']} -> {job['status']}")
    if job["status"] == "completed":
        # For segments, speakers, and timings, fetch the child job by job_id.
        full = requests.get(
            f"{API_URL}/v1/transcriptions/{job['job_id']}", headers=headers
        ).json()
        print(f"    text: {full['text']}")

Partial failures

A batch finishes as completed even when some files fail — one bad file does not fail the whole batch. Check metadata.failed_count, then inspect each metadata.jobs[] entry whose status is failed. To read the failure reason, fetch that child job by its job_id; the child job's error object holds the code and message (see the Error Codes reference).
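
A minimal sketch of collecting the failed files from a finished batch so you can fetch each child job's error or resubmit the inputs. The helper name is illustrative. It relies only on the job_id, url, and status fields of metadata.jobs[] described above.

```python
def failed_entries(batch: dict) -> list:
    """Return (job_id, url) pairs for every file in the batch that failed."""
    jobs = (batch.get("metadata") or {}).get("jobs") or []
    return [(job["job_id"], job["url"]) for job in jobs if job.get("status") == "failed"]


# Hypothetical finished batch in which one of two files failed
batch = {
    "status": "completed",
    "metadata": {
        "failed_count": 1,
        "jobs": [
            {"job_id": "15b8e1de-b0ab-43d0-825e-a09e1e973ce7",
             "url": "https://example.com/audio1.mp3",
             "status": "completed", "text": "..."},
            {"job_id": "dd403af8-40f9-4cde-8c92-ba5c35e36c38",
             "url": "https://example.com/audio2.mp3",
             "status": "failed"},
        ],
    },
}
for job_id, url in failed_entries(batch):
    print(f"{url} failed; fetch /v1/transcriptions/{job_id} for the error")
```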

For real-time transcription, see the Streaming guide. For detailed API documentation, see the API Reference. For error handling patterns, see the Error Codes reference.