Media Probe
Inspect a media URL and retrieve its duration, title, thumbnail, and format — without downloading — before submitting a transcription.
POST /v1/media/probe returns metadata about a remote media URL in a single round-trip. Use it to check audio duration before charging credits, show an ETA on your upload screen, display a preview card, or detect live streams early — without having to start a transcription job first.
💡 Tip
When to use it
- Credit gating: compare `duration_seconds` against the user's remaining credits before starting a potentially expensive transcription.
- ETA display: show "This 2h 13m video will finish in about 4 minutes" on your upload UI.
- Live-stream detection: reject or warn when `is_live` is `true`, since live streams transcribe unreliably.
- Preview cards: fetch the title, uploader, and thumbnail to render a confirmation card before the user submits.
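As an illustration of the ETA use case, the sketch below turns `duration_seconds` into a label such as "2h 13m" and builds the message quoted above. The `speed_factor` of 30x realtime is a hypothetical figure chosen only so the numbers match that example; it is not a documented throughput, so measure your own before showing an ETA to users.

```python
def format_duration(seconds: float) -> str:
    """Render a duration in seconds as a short label, e.g. 7980 -> '2h 13m'."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


def eta_message(duration_seconds: float, speed_factor: float = 30.0) -> str:
    """Build an upload-screen message from a probed duration.

    speed_factor is an assumed "times realtime" processing speed,
    not a value published by the API.
    """
    eta_minutes = max(1, round(duration_seconds / speed_factor / 60))
    unit = "minute" if eta_minutes == 1 else "minutes"
    return (
        f"This {format_duration(duration_seconds)} video "
        f"will finish in about {eta_minutes} {unit}"
    )
```

Remember that `duration_seconds` can be `null`, so call these helpers only after checking for a value.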
Quickstart
Send a JSON body with the URL you want to inspect:
```bash
curl "https://asr.lesan.ai/v1/media/probe" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}'
```
Example response:
```json
{
  "object": "media_probe",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "source": "youtube",
  "extractor": "youtube",
  "title": "Rick Astley - Never Gonna Give You Up (Official Music Video)",
  "uploader": "Rick Astley",
  "duration_seconds": 213.0,
  "thumbnail_url": "https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg",
  "upload_date": "2009-10-25",
  "is_live": false,
  "language": "en",
  "file_size_bytes": null,
  "content_type": null,
  "warnings": [],
  "cached": false,
  "probed_at": "2026-04-23T12:34:56Z",
  "metadata": {
    "view_count": 1500000000,
    "like_count": 16000000,
    "description": "The official video..."
  }
}
```
Request
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | yes | Public http(s) URL to a media resource. Supports yt-dlp platforms (YouTube, Twitter/X, TikTok, Vimeo, etc.) and direct audio/video URLs. |
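Because the endpoint accepts only public http(s) URLs, a client can reject obviously unsuitable input before spending a request. The helper below is a minimal sketch of such a pre-check; `is_probeable_url` is a hypothetical name, and the server remains the authority on what counts as a valid URL.

```python
from urllib.parse import urlsplit


def is_probeable_url(url: str) -> bool:
    """Cheap client-side check: http or https scheme plus a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit raises on some malformed inputs, e.g. a broken IPv6 literal.
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)
```

A `lesan://` upload reference or a bare string such as `not a url` fails this check, so the UI can show an error without a round-trip.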
Response
| Field | Type | Description |
|---|---|---|
| object | string | Always "media_probe". |
| url | string | The URL that was probed (normalized: lowercase scheme and host). |
| source | string | Detected source platform: youtube, twitter, tiktok, vimeo, soundcloud, facebook, instagram, twitch, reddit, direct, or other. |
| extractor | string | null | yt-dlp extractor name (e.g. "youtube"), or "generic" for direct URLs. |
| title | string | null | Media title, or filename for direct URLs. |
| uploader | string | null | Channel or uploader name (yt-dlp sources only). |
| duration_seconds | number | null | Duration in seconds. null when the container does not expose duration metadata — check warnings for "duration_unknown". |
| thumbnail_url | string | null | Thumbnail image URL (yt-dlp sources). |
| upload_date | string | null | Upload date as ISO 8601 (YYYY-MM-DD). |
| is_live | boolean | true if the URL points to a live stream. false by default for direct URLs. |
| language | string | null | Declared source language, if the extractor exposes one. Hint only. |
| file_size_bytes | number | null | Content-Length for direct URLs, or filesize_approx from yt-dlp when available. |
| content_type | string | null | MIME type from the HEAD response (direct URLs only). |
| warnings | string[] | Non-fatal flags the client should surface. See "Warnings" below. |
| cached | boolean | true if this response was served from the 5-minute cache. Values like view_count may be stale. |
| probed_at | string | UTC timestamp (ISO 8601) of when this response was produced. Re-stamped to the current server time on cache hits. |
| metadata | object | Extractor-specific extras (view_count, like_count, description, channel, webpage_url). |
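Many of these fields are nullable, so it can help to parse the response into a typed object rather than index the raw dictionary everywhere. The dataclass below is one possible sketch covering a subset of the fields in the table; `MediaProbe` and `from_json` are hypothetical names, not part of any official SDK.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional


@dataclass
class MediaProbe:
    """Subset of the probe response, with nullable fields made explicit."""

    url: str
    source: str
    is_live: bool
    title: Optional[str] = None
    uploader: Optional[str] = None
    duration_seconds: Optional[float] = None
    thumbnail_url: Optional[str] = None
    file_size_bytes: Optional[int] = None
    warnings: List[str] = field(default_factory=list)
    cached: bool = False

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "MediaProbe":
        if payload.get("object") != "media_probe":
            raise ValueError("not a media_probe object")
        # Ignore fields this sketch does not model (e.g. metadata).
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in payload.items() if k in known})
```

Dropping unknown keys keeps the parser tolerant if the response gains fields later.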
Warnings
Probes return a 200 response even when something is non-ideal. Inspect the warnings array and surface matches to the end-user:
- `live_stream`: `is_live` is true. Transcription will likely run but the boundary is unstable; warn the user or block if you don't support it.
- `duration_unknown`: the container did not expose duration. You still receive `file_size_bytes` and `title`, so you can gate by size or prompt the user to confirm.
- `very_long`: duration is unusually large (far above the per-job audio cap). Consider splitting or rejecting.
- `age_restricted`: the source reports the media requires authentication (sign-in, subscriber-only, premium-only). Our crawler will likely fail to download it when you transcribe; warn the user up front.
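One way to act on these flags is to map each warning to a user-facing message and a coarse decision. The sketch below assumes a client-side policy in which `live_stream` and `very_long` block submission and any other warning asks for confirmation. That policy, the message wording, and the `triage` function name are all illustrative choices; the API only reports the flags.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical wording; adapt to your product's tone.
USER_MESSAGES = {
    "live_stream": "This looks like a live stream, which may transcribe unreliably.",
    "duration_unknown": "We could not determine the length of this media.",
    "very_long": "This media is far longer than a single job usually accepts.",
    "age_restricted": "This media appears to require sign-in, so the download may fail.",
}

# Assumed policy: these flags stop submission outright.
BLOCKING = {"live_stream", "very_long"}


def triage(info: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Return ("proceed" | "confirm" | "block", messages) for a probe response."""
    flags = list(info.get("warnings") or [])
    # Treat is_live as authoritative even if the warning is absent.
    if info.get("is_live") and "live_stream" not in flags:
        flags.append("live_stream")
    messages = [USER_MESSAGES.get(f, f"Unrecognized warning: {f}") for f in flags]
    if any(f in BLOCKING for f in flags):
        return "block", messages
    if flags:
        return "confirm", messages
    return "proceed", messages
```

Unrecognized flags fall through to "confirm", so a warning added to the API later is surfaced rather than silently ignored.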
ℹ Note
duration_unknown is not an error. The endpoint intentionally returns a 200 so the client can still render a preview and let the user confirm. Use file_size_bytes as a coarse fallback when gating.
Supported sources
yt-dlp platforms — duration, title, uploader, thumbnail, live flag, language, and extractor-specific extras:
- YouTube, Twitter/X, TikTok, Vimeo, SoundCloud
- Facebook, Instagram, Twitch, Reddit
- Anything else yt-dlp's "generic" extractor can resolve
Direct audio/video URLs — probed via HEAD (for size and content-type) plus ffprobe (for duration). Supports the same container formats as file uploads — see the Audio Formats reference.
⚠ Warning
lesan:// URLs (from /v1/uploads) cannot be probed. Query the upload or job directly instead.
Caching
Successful probes are cached for 5 minutes (configurable server-side), keyed by normalized URL. A cached response carries cached: true so your client can tell whether values like view_count or is_live may be slightly stale. The probed_at timestamp is re-stamped to the current server time even on cache hits, so it always reflects when the client observed the data.
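To see why two spellings of the same URL can share a cache entry, the sketch below applies the normalization described for the `url` field (lowercase scheme and host) using the standard library. It is an approximation for illustration only: the server's actual normalization may differ, and `normalize_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host; leave path, query, and fragment untouched."""
    parts = urlsplit(url)
    # Keep any userinfo ("user:pass@") as-is and lowercase only host[:port].
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )
```

Under this sketch `HTTPS://WWW.YouTube.com/watch?v=dQw4w9WgXcQ` and `https://www.youtube.com/watch?v=dQw4w9WgXcQ` produce the same key, while the case-sensitive video ID in the query string is preserved.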
Errors
The probe endpoint emits the standard Stripe-style error envelope used across the API. See the Error Codes guide for full handling guidance.
| HTTP | Code | When it fires |
|---|---|---|
| 400 | invalid_url | URL is malformed or uses a scheme other than http or https. |
| 403 | media_unavailable | Media is private, region-blocked, or requires sign-in. |
| 404 | media_not_found | The upstream returned 404 / "video unavailable". |
| 429 | rate_limit_exceeded | Your API key has exceeded its per-minute probe budget. |
| 502 | upstream_error | yt-dlp or ffprobe failed with an unexpected error. |
| 504 | probe_timeout | The probe exceeded the configured timeout (default 20s). Safe to retry. |
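The table marks `probe_timeout` as safe to retry, and a rate-limited request can reasonably be retried after a pause. The sketch below wraps any callable that performs the probe and returns `(status_code, body)`. Treating 429 as retryable, the exponential backoff schedule, and the names `probe_with_retry` and `backoff_seconds` are assumptions of this example rather than documented behavior.

```python
import time
from typing import Any, Callable, Dict, Tuple

# 504 is documented as safe to retry; retrying 429 is an assumption.
RETRYABLE_STATUSES = {429, 504}

Response = Tuple[int, Dict[str, Any]]


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... limited to cap."""
    return min(cap, base * (2 ** attempt))


def probe_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        is_last = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last:
            return status, body
        sleep(backoff_seconds(attempt))
    raise AssertionError("unreachable")
```

Errors such as `invalid_url` or `media_unavailable` return immediately, since repeating the same request is unlikely to change the outcome.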
Credit-gating example
The canonical pattern: probe → decide → transcribe. The snippet below checks duration against the user's remaining credits before enqueuing a job.
```python
import requests

API = "https://asr.lesan.ai"
HEADERS = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}
TIMEOUT = 30  # client-side timeout in seconds, above the 20s probe default


def submit_if_affordable(url: str, credit_seconds_remaining: float):
    # 1. Probe
    probe = requests.post(
        f"{API}/v1/media/probe", headers=HEADERS, json={"url": url}, timeout=TIMEOUT
    )
    probe.raise_for_status()
    info = probe.json()

    if "live_stream" in info["warnings"]:
        raise ValueError("Live streams are not supported")

    duration = info["duration_seconds"]
    if duration is None:
        # duration_unknown: fall back to a size-based heuristic or ask the user
        return {"status": "needs_confirmation", "probe": info}

    if duration > credit_seconds_remaining:
        return {
            "status": "insufficient_credits",
            "needed_seconds": duration,
            "available_seconds": credit_seconds_remaining,
            "probe": info,
        }

    # 2. Transcribe
    job = requests.post(
        f"{API}/v1/transcriptions",
        headers=HEADERS,
        json={"audio_url": url, "language": "auto", "mode": "async"},
        timeout=TIMEOUT,
    )
    job.raise_for_status()  # surface API errors instead of a KeyError on "id"
    return {"status": "queued", "job_id": job.json()["id"], "probe": info}
```
See also
- ASR Guide — submit transcription jobs once you've probed.
- Audio Formats — supported containers for direct URL probing.
- Error Codes — full list of error types and retry strategies.
- Rate Limits — probe requests count toward your read-scope quota.