ASR (Automatic Speech Recognition)

Convert audio to text with high accuracy for Amharic, Tigrinya, and English using our ASR API.

Overview

Our ASR service transcribes audio files into text. You can transcribe single files or batch process multiple files. The service supports both synchronous (immediate) and asynchronous (background) processing.

File Upload Methods

You can provide audio to the ASR API in three ways:

Audio URL

Provide a publicly accessible URL to your audio file. The API will download and process it:

{
  "audio_url": "https://example.com/audio.mp3",
  "language": "am",
  "mode": "sync"
}

Multipart Upload

Upload the audio file directly using multipart form data:

curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@recording.mp3" \
  -F "language=am" \
  -F "mode=sync"

Signed Upload URL

For large files, request a signed upload URL, upload directly to storage, then submit for transcription:

# Step 1: Get a signed upload URL
curl "https://asr.lesan.ai/v1/uploads/signed-url?file_name=recording.mp3&content_type=audio/mpeg" \
  -H "Authorization: Bearer YOUR_API_KEY"


# Response: { "upload_url": "https://storage.lesan.ai/...", "file_id": "file_abc123" }


# Step 2: Upload the file
curl -X PUT "https://storage.lesan.ai/..." \
  -H "Content-Type: audio/mpeg" \
  --data-binary @recording.mp3


# Step 3: Submit for transcription
curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_id": "file_abc123", "language": "am", "mode": "async"}'
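
The three steps above can also be sketched in Python. This is a minimal sketch using only the standard library, not an official client: the helper names (signed_url_request, upload_request, transcription_request, send_json) are hypothetical, and the endpoints and field names are copied from the curl commands above. Each helper only builds a request; send_json sends it.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://asr.lesan.ai"


def signed_url_request(api_key, file_name, content_type):
    """Step 1: build the request that asks for a signed upload URL."""
    query = urllib.parse.urlencode(
        {"file_name": file_name, "content_type": content_type}
    )
    return urllib.request.Request(
        f"{API_URL}/v1/uploads/signed-url?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def upload_request(upload_url, data, content_type):
    """Step 2: PUT the raw bytes to the signed URL.

    The curl example sends no Authorization header here, so none is added.
    """
    return urllib.request.Request(
        upload_url,
        data=data,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def transcription_request(api_key, file_id, language="am", mode="async"):
    """Step 3: submit the uploaded file for transcription by its file_id."""
    body = json.dumps({"file_id": file_id, "language": language, "mode": mode})
    return urllib.request.Request(
        f"{API_URL}/v1/transcriptions",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_json(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

In use, you would pass the Step 1 request to send_json, read upload_url and file_id from the result, send the Step 2 request with urllib.request.urlopen, and finally pass the Step 3 request to send_json to obtain the job.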

Supported Audio Formats

The ASR API supports MP3, WAV, M4A, FLAC, OGG, WebM, and AAC files up to 500 MB. See the Audio Formats reference for details and recommendations.
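
Checking a file locally before uploading avoids a wasted transfer. The sketch below is illustrative only: check_audio_file is a hypothetical helper, the extension spellings are assumed from the format names above, and "500 MB" is assumed to mean 500 * 1024 * 1024 bytes because the limit's exact unit is not stated here.

```python
from pathlib import PurePath

# Assumption: one conventional file extension per format listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac"}
# Assumption: "500 MB" is read as binary megabytes.
MAX_SIZE_BYTES = 500 * 1024 * 1024


def check_audio_file(file_name, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = PurePath(file_name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_SIZE_BYTES}")
    return problems
```

The check looks only at the file name and size; it does not inspect the audio container, so the API may still reject a mislabeled file.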

Basic Transcription

Submit a transcription job, then poll the status endpoint until the job completes. Even with mode=sync, the first response may come back with status queued, so always fetch the final result using the job id.

# Step 1: submit job
curl "https://asr.lesan.ai/v1/transcriptions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/audio.mp3",
    "language": "am",
    "mode": "sync"
  }'


# Response: {"id": "...", "status": "queued", "url": "/v1/transcriptions/..."}


# Step 2: poll status until completed
curl -X GET "https://asr.lesan.ai/v1/transcriptions/<JOB_ID>" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response Schema

The initial response may have status queued and carry only the job id, with the result fields still null. Fetch /v1/transcriptions/{job_id} to get the completed result.

// Example: initial response
{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "queued",
  "language": "am",
  "text": null,
  "segments": null,
  "speakers": null,
  "progress": null,
  "duration_seconds": null,
  "processing_time_seconds": null,
  "error": null,
  "metadata": null,
  "created_at": "2026-03-05T12:14:53.849727Z",
  "completed_at": null,
  "result_url": null,
  "audio_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}


// Example: completed response
{
  "id": "4e42d202-bb19-42f5-a0d6-f7abb9791191",
  "object": "transcription",
  "status": "completed",
  "language": "am",
  "text": "...",
  "segments": [
    {
      "id": 0,
      "start_ms": 174,
      "end_ms": 2354,
      "duration_ms": 2180,
      "text": "...",
      "speaker": null,
      "confidence": null
    }
  ],
  "speakers": null,
  "progress": null,
  "duration_seconds": null,
  "processing_time_seconds": 15.40681,
  "error": null,
  "metadata": {
    "language": "am"
  },
  "created_at": "2026-03-05T12:14:53.841894Z",
  "completed_at": "2026-03-05T12:15:09.257068Z",
  "result_url": null,
  "audio_url": null,
  "url": "/v1/transcriptions/4e42d202-bb19-42f5-a0d6-f7abb9791191"
}
  • id — Unique identifier for the transcription job
  • object — Object type (always transcription)
  • status — Job status: queued, processing, completed, failed, cancelled
  • text — Full transcription text (null until completed)
  • segments — Segment list (null until completed). Segments use millisecond fields like start_ms / end_ms.
  • processing_time_seconds — Processing time (available when completed)
  • url — Relative URL for this job (combine with your base URL to fetch status)
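
As an illustration of working with these fields, the sketch below resolves the relative url against the base URL and renders each segment with readable timestamps. The helper names (job_status_url, format_timestamp, segment_lines) are hypothetical; the field names come from the example responses above.

```python
from urllib.parse import urljoin

API_URL = "https://asr.lesan.ai"


def job_status_url(job, base_url=API_URL):
    """Combine the base URL with the job's relative url field."""
    return urljoin(base_url, job["url"])


def format_timestamp(ms):
    """Render a millisecond offset as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def segment_lines(job):
    """Return one line per segment; segments is null (None) until completion."""
    lines = []
    for segment in job.get("segments") or []:
        start = format_timestamp(segment["start_ms"])
        end = format_timestamp(segment["end_ms"])
        lines.append(f"[{start} - {end}] {segment['text']}")
    return lines
```

Because segments is null while a job is queued or processing, segment_lines returns an empty list for those jobs rather than raising an error.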

Supported Languages

  • am — Amharic
  • ti — Tigrinya
  • en — English
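
Validating the language code and mode on the client catches typos before a request is sent. The following is a sketch, not part of the API: build_transcription_payload is a hypothetical helper, and the rule that a request carries exactly one of audio_url or file_id is an assumption based on the request examples in this guide.

```python
SUPPORTED_LANGUAGES = {"am": "Amharic", "ti": "Tigrinya", "en": "English"}
PROCESSING_MODES = {"sync", "async"}


def build_transcription_payload(language, mode="async", audio_url=None, file_id=None):
    """Build the JSON body for POST /v1/transcriptions, rejecting bad input early."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    if mode not in PROCESSING_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: exactly one audio source per request.
    if (audio_url is None) == (file_id is None):
        raise ValueError("provide exactly one of audio_url or file_id")
    payload = {"language": language, "mode": mode}
    if audio_url is not None:
        payload["audio_url"] = audio_url
    else:
        payload["file_id"] = file_id
    return payload
```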

Processing Modes

Synchronous (sync)

Best for short audio files. The API will attempt to process the request inline, but you should still handle the case where the initial response returns queued. Always use the returned job id to fetch the final result.

Asynchronous (async)

Queues the job and returns a job ID immediately. Use this mode for longer files, then poll the job status endpoint to check progress and retrieve results.

Async Polling Workflow

For async transcription, follow this workflow:

import time

import requests

API_URL = "https://asr.lesan.ai"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# Step 1: Submit async job
response = requests.post(
    f"{API_URL}/v1/transcriptions",
    headers=headers,  # json= sets Content-Type: application/json
    json={
        "audio_url": "https://example.com/long-audio.mp3",
        "language": "am",
        "mode": "async",
    },
    timeout=30,
)
response.raise_for_status()  # Stop early on HTTP errors (bad key, bad request)
job_id = response.json()["id"]
print(f"Job submitted: {job_id}")

# Step 2: Poll for completion
while True:
    status_resp = requests.get(
        f"{API_URL}/v1/transcriptions/{job_id}",
        headers=headers,
        timeout=30,
    )
    status_resp.raise_for_status()
    status = status_resp.json()

    # progress and error are present but null until set,
    # so a .get() default alone would never apply
    progress = status.get("progress") or 0
    print(f"Status: {status['status']} ({progress}%)")

    if status["status"] == "completed":
        print(f"Transcription: {status['text']}")
        break
    elif status["status"] in ("failed", "cancelled"):
        print(f"Job {status['status']}: {status.get('error') or 'Unknown error'}")
        break

    time.sleep(5)  # Poll every 5 seconds

Job Lifecycle

An async transcription job is always in one of these statuses:

  • queued — Job is waiting to be processed
  • processing — Transcription is in progress. The progress field shows completion percentage
  • completed — Transcription is done. Results are in the response
  • failed — An error occurred. Check the error field for details
  • cancelled — The job was cancelled via the cancel endpoint
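
The lifecycle above implies that completed, failed, and cancelled are terminal, while queued and processing mean "keep waiting". A polling helper can encode that directly and add an overall deadline so a stuck job cannot block forever. This is a minimal sketch: wait_for_job and its parameters are hypothetical, and fetch_job stands for any callable that returns the job as a dict, such as a wrapper around GET /v1/transcriptions/{job_id}.

```python
import time

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job, job_id, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    Returns the final job dict, or raises TimeoutError if the deadline
    passes while the job is still queued or processing.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job['status']} after {timeout}s"
            )
        sleep(interval)
```

Passing sleep and clock as parameters keeps the helper easy to test without real delays; in normal use the defaults are enough. The caller still has to inspect the returned status, since a failed or cancelled job is returned rather than raised.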

Managing Jobs

List Jobs

Retrieve a list of your transcription jobs:

# List recent jobs
curl "https://asr.lesan.ai/v1/transcriptions?limit=10&status_filter=completed" \
  -H "Authorization: Bearer YOUR_API_KEY"

Cancel a Job

Cancel a queued or processing job:

curl -X POST "https://asr.lesan.ai/v1/transcriptions/job_abc123/cancel" \
  -H "Authorization: Bearer YOUR_API_KEY"

Delete a Job

Delete a job and its results permanently:

curl -X DELETE "https://asr.lesan.ai/v1/transcriptions/job_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"
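
The list, cancel, and delete calls differ only in HTTP method and path, so they can share one request builder. The sketch below uses the standard library and mirrors the curl commands above; the function names are hypothetical, and no parameters beyond limit and status_filter are assumed.

```python
import urllib.parse
import urllib.request

API_URL = "https://asr.lesan.ai"


def _request(api_key, method, path, query=None):
    """Build an authenticated request for the given method and path."""
    url = f"{API_URL}{path}"
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}, method=method
    )


def list_jobs_request(api_key, limit=10, status_filter=None):
    """GET /v1/transcriptions, optionally filtered by status."""
    query = {"limit": limit}
    if status_filter is not None:
        query["status_filter"] = status_filter
    return _request(api_key, "GET", "/v1/transcriptions", query)


def cancel_job_request(api_key, job_id):
    """POST /v1/transcriptions/{job_id}/cancel."""
    job = urllib.parse.quote(job_id, safe="")
    return _request(api_key, "POST", f"/v1/transcriptions/{job}/cancel")


def delete_job_request(api_key, job_id):
    """DELETE /v1/transcriptions/{job_id}."""
    job = urllib.parse.quote(job_id, safe="")
    return _request(api_key, "DELETE", f"/v1/transcriptions/{job}")
```

Each function returns an unsent urllib.request.Request; pass it to urllib.request.urlopen to execute it. Deletion is permanent, so it is worth keeping that call separate from any retry logic.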

Batch Processing

Process multiple audio files at once. Batch processing is always asynchronous:

curl "https://asr.lesan.ai/v1/transcriptions/batch" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_urls": [
      "https://example.com/audio1.mp3",
      "https://example.com/audio2.mp3"
    ],
    "language": "am"
  }'
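
When the list of files is long, it can help to deduplicate it and split it into several batch requests. The sketch below only builds request bodies in the shape shown above. build_batch_payloads is a hypothetical helper, and batch_size is an illustrative value: this guide does not state how many URLs a single batch request accepts, so check the API Reference for the real limit.

```python
def build_batch_payloads(audio_urls, language, batch_size=50):
    """Split audio_urls into request bodies of at most batch_size URLs each.

    batch_size is a hypothetical client-side setting, not a documented limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # dict.fromkeys drops duplicate URLs while keeping first-seen order.
    unique_urls = list(dict.fromkeys(audio_urls))
    return [
        {"audio_urls": unique_urls[i:i + batch_size], "language": language}
        for i in range(0, len(unique_urls), batch_size)
    ]
```

Each returned dict can be sent as the JSON body of a POST to /v1/transcriptions/batch.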

For real-time transcription, see the Streaming guide. For detailed API documentation, see the API Reference. For error handling patterns, see the Error Codes reference.