A dual-fallback transcript extraction service that handles every URL format, caches intelligently, and never gives up on getting the transcript.
4 URL formats supported: standard, shortened, Shorts, and embedded
99%+ retrieval success rate: dual-fallback system with yt-dlp backup
<10ms cached response time: in-memory caching for repeat requests

Developers building content analysis tools, accessibility services, AI training pipelines, and educational platforms all need the same thing: reliable programmatic access to YouTube transcripts.
The problem? It's harder than it sounds.
URL format chaos: YouTube has at least four URL formats (standard, shortened, Shorts, embedded), and most libraries handle only one or two
Missing transcripts: Not every video has a transcript, and the failure modes are inconsistent and poorly documented
Rate limiting: YouTube aggressively rate-limits transcript requests, causing cascading failures in production applications
Fragile dependencies: Single-library solutions break whenever YouTube changes its internal APIs, which happens frequently
Developers needed a service they could call to get a transcript (or a clear "not available" response) and move on, without worrying about the underlying complexity.
ProxyBoi is a production-ready FastAPI service that extracts YouTube transcripts reliably using a dual-fallback architecture designed to maximize success rates.
The primary extraction path uses the YouTube Transcript API for speed. When that fails — missing captions, geo-restrictions, format issues — the system automatically falls back to yt-dlp, which takes a different extraction approach. This dual-path architecture means ProxyBoi succeeds in cases where single-library solutions give up.
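As a rough illustration of the fallback flow described above, the sketch below tries a list of extractors in order and returns the first non-empty result. It does not call the real libraries: `fetch_with_fallback`, `TranscriptUnavailable`, and the extractor callables are hypothetical names standing in for whatever wrappers ProxyBoi places around the YouTube Transcript API and yt-dlp.

```python
from typing import Callable, List, Sequence, Tuple

# An extractor takes a video ID and returns transcript text (assumed shape).
Extractor = Callable[[str], str]


class TranscriptUnavailable(Exception):
    """Raised when every extraction path has failed (hypothetical name)."""


def fetch_with_fallback(
    video_id: str, extractors: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, str]:
    """Try each (name, extractor) pair in order.

    Returns (name_of_path_that_succeeded, transcript_text).
    Raises TranscriptUnavailable with the collected errors if all paths fail.
    """
    errors: List[str] = []
    for name, extract in extractors:
        try:
            text = extract(video_id)
        except Exception as exc:  # any failure moves on to the next path
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty transcript")
    raise TranscriptUnavailable("; ".join(errors))
```

In a real service the list would hold two entries, the fast primary path first and the yt-dlp wrapper second. Collecting the error from each path is one way to produce a clear "not available" response when both fail.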
A unified URL parser normalizes all four YouTube URL formats into a canonical video ID before extraction. Standard watch URLs, youtu.be shortlinks, Shorts URLs, and embedded URLs all work identically.
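A minimal sketch of this kind of normalization, using only the Python standard library, might look like the following. The function name `extract_video_id` is hypothetical, and the sketch assumes that URLs include a scheme and that video IDs are 11 characters drawn from letters, digits, `-`, and `_` (a common convention, not something the source states).

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed ID shape: 11 URL-safe characters.
_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for the four URL formats, or None if unrecognized."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    parts = [p for p in parsed.path.split("/") if p]

    candidate = None
    if host == "youtu.be" and parts:
        # Shortened: https://youtu.be/<id>
        candidate = parts[0]
    elif host in ("youtube.com", "youtube-nocookie.com"):
        if parts[:1] == ["watch"]:
            # Standard: https://www.youtube.com/watch?v=<id>
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) >= 2 and parts[0] in ("shorts", "embed"):
            # Shorts and embedded: /shorts/<id> and /embed/<id>
            candidate = parts[1]

    if candidate and _ID_RE.match(candidate):
        return candidate
    return None
```

Returning `None` for anything unrecognized, rather than guessing, lets the caller reject bad input before any extraction request is made.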
In-memory caching: Repeat requests resolve in under 10ms, reducing load and cost
Configurable rate limiting: Built-in throttling (default 10 req/min) reduces the risk of YouTube blocking the service
Rich metadata: Returns transcript text plus video title, channel, categories, and duration
API key authentication: Secure access control for multi-tenant usage
Docker-ready: Single container deployment with environment-based configuration
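The caching and throttling features in the list above could be sketched as below. This is an illustration under stated assumptions, not ProxyBoi's implementation: the class names `TTLCache` and `SlidingWindowLimiter` are hypothetical, the expiry time is an assumption (the source does not say whether cached entries expire), and the sliding-window policy is one of several ways to enforce a default of 10 requests per minute.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (assumed policy)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds span."""

    def __init__(self, max_calls: int = 10, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._calls: Deque[float] = deque()

    def allow(self) -> bool:
        now = self._clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) >= self._max_calls:
            return False
        self._calls.append(now)
        return True
```

A request handler would typically check the cache first and consult the limiter only on a miss, so that cached responses do not count against the upstream budget. The injectable `clock` parameter is there to make both classes easy to test without sleeping.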
We built ProxyBoi as a focused, opinionated service — do one thing and do it exceptionally well.
FastAPI was chosen for its async-first design, which matters when extraction requests can take several seconds. The async architecture lets the service handle concurrent requests without one slow extraction blocking the others, which is critical for API consumers processing batches of videos.
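To make the concurrency point concrete, here is a small sketch of running several slow extractions side by side with `asyncio`. It assumes the extraction call itself is a blocking function, so it is pushed onto a worker thread with `asyncio.to_thread` (Python 3.9+) to keep the event loop free. The names `fetch_many`, `fetch_one`, and `max_concurrent` are hypothetical.

```python
import asyncio
from typing import Callable, List, Sequence


async def fetch_many(
    video_ids: Sequence[str],
    fetch_one: Callable[[str], str],
    max_concurrent: int = 4,
) -> List[str]:
    """Run a blocking fetch_one for each ID, at most max_concurrent at a time.

    Results come back in the same order as video_ids.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(video_id: str) -> str:
        async with semaphore:
            # Run the blocking call in a thread so the event loop stays responsive.
            return await asyncio.to_thread(fetch_one, video_id)

    return list(await asyncio.gather(*(worker(v) for v in video_ids)))
```

The semaphore caps how many extractions run at once, which pairs naturally with the rate limiting described elsewhere on this page.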
The dual-fallback pattern was the key architectural decision. Rather than trying to build one perfect extraction method, we embraced redundancy. The YouTube Transcript API handles the common cases fast; yt-dlp catches everything else. The fallback is transparent to the caller — they get the same response format regardless of which path succeeded.
We tested against hundreds of real YouTube videos across every edge case we could find — videos with auto-generated captions, multiple language tracks, age-restricted content, Shorts, live stream archives, and videos with captions disabled. Each failure case informed a new test and a refinement to the extraction logic.
ProxyBoi runs in production, powering content analysis tools, AI training data pipelines, accessibility services, and educational platforms.
The dual-fallback architecture achieves 99%+ transcript retrieval success rates — a significant improvement over single-library approaches that typically fail on 10-15% of videos due to format edge cases and missing caption tracks.
Integration takes minutes, not days. A single REST endpoint accepts any YouTube URL format and returns structured transcript data with metadata. The Docker deployment model means teams can self-host in a single container without depending on any third-party service other than YouTube itself.
Cached responses resolve in under 10ms. First-request latency depends on video length, but the async architecture keeps the service responsive under concurrent load. Rate limiting is designed to head off upstream throttling before it happens.
Content analysis and summarization tools
AI/ML training data extraction
Accessibility services providing searchable video transcripts
Educational platforms converting lectures to study materials
Video SEO tools analyzing competitor content
We tried three other transcript APIs before ProxyBoi. They all choked on edge cases — Shorts URLs, missing captions, rate limits. ProxyBoi just works. We integrated it in an afternoon and haven't thought about it since.
Team
Focused solo-developer sprint