Free YouTube Transcript API: Options and Limitations [2026]
Looking for a free YouTube transcript API? Learn about the available options, including the youtube-transcript-api Python library, unofficial APIs, and the rate limits to consider.
Key Takeaways
- youtube-transcript-api is the most popular free Python library
- Official YouTube Data API requires OAuth for caption downloads and charges 200 quota units per download
- Add rate limiting (1-2 second delays) to avoid blocks
- Free APIs are suitable for low-volume and non-critical use
- Consider paid services for production applications
- Always implement error handling for API reliability
Developers often need programmatic access to YouTube transcripts. While YouTube's official API has limitations for captions, several free options exist. Here's what's available and how to use them.
Official vs Unofficial Options
| Option | Cost | Rate Limits | Ease of Use | Reliability |
|---|---|---|---|---|
| YouTube Data API v3 | Free tier | 10K units/day | Medium | High |
| youtube-transcript-api | Free | Unofficial limits | Easy | Medium |
| yt-dlp | Free | None official | Easy | High |
| Web scraping | Free | Risk of blocks | Hard | Low |
Option 1: youtube-transcript-api (Python)
The most popular free option for transcript extraction.
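The library calls in this section take a bare video ID rather than a full URL. A minimal sketch of pulling the ID out of common URL shapes follows; the helper name `extract_video_id` and the list of accepted hosts and path prefixes are assumptions for illustration, not part of youtube-transcript-api.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in ("youtube.com", "m.youtube.com", "music.youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.split("/")
        # Path-based links: /embed/<id>, /shorts/<id>, /live/<id>
        if len(parts) >= 3 and parts[1] in ("embed", "shorts", "live"):
            return parts[2] or None
    return None
```

Returning `None` for unrecognized URLs lets the caller decide whether to skip the input or report it, instead of passing a malformed ID to the transcript request.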
Installation

    pip install youtube-transcript-api

Basic Usage

    from youtube_transcript_api import YouTubeTranscriptApi

    # Get the transcript for a video
    # get_transcript is the classic static API; 1.x releases of the library
    # replace it with YouTubeTranscriptApi().fetch(video_id)
    video_id = "dQw4w9WgXcQ"  # The v= parameter of the watch URL
    transcript = YouTubeTranscriptApi.get_transcript(video_id)

    # transcript is a list of dictionaries
    for segment in transcript:
        print(f"[{segment['start']:.2f}] {segment['text']}")

Output Format

    [
        {'text': 'Hello and welcome', 'start': 0.0, 'duration': 2.5},
        {'text': 'to this video', 'start': 2.5, 'duration': 1.8},
        {'text': 'about YouTube transcripts', 'start': 4.3, 'duration': 2.2}
    ]

Advanced Features
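Because every segment carries a `start` time, the list shown under Output Format can be regrouped into larger time windows, which is often easier to read or summarize than two-second fragments. The sketch below assumes the list-of-dictionaries shape from that example; `group_segments` and its `window` parameter are hypothetical names, not library functions.

```python
def group_segments(transcript, window=30.0):
    """Join segment texts into consecutive time windows of `window` seconds."""
    chunks = {}
    for segment in transcript:
        # Integer index of the window this segment starts in
        index = int(segment["start"] // window)
        chunks.setdefault(index, []).append(segment["text"])
    return [
        {"start": index * window, "text": " ".join(texts)}
        for index, texts in sorted(chunks.items())
    ]
```

A segment is assigned to the window in which it starts, so text that runs across a boundary stays with the earlier window.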
Get specific language:

    transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'])

List available languages:

    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    for transcript in transcript_list:
        print(f"{transcript.language} - {transcript.language_code}")

Get auto-generated vs manual:

    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    # Get manual transcript
    manual = transcript_list.find_manually_created_transcript(['en'])
    # Get auto-generated
    auto = transcript_list.find_generated_transcript(['en'])

Translate transcript:

    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    transcript = transcript_list.find_transcript(['de'])
    translated = transcript.translate('en')
    print(translated.fetch())

Error Handling

    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api._errors import (
        TranscriptsDisabled,
        NoTranscriptFound,
        VideoUnavailable
    )

    video_id = "dQw4w9WgXcQ"
    try:
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
    except TranscriptsDisabled:
        print("Transcripts are disabled for this video")
    except NoTranscriptFound:
        print("No transcript found for requested language")
    except VideoUnavailable:
        print("Video is unavailable")

Option 2: YouTube Data API v3
Official Google API with caption downloading capabilities.
Setup
1. Create a Google Cloud project
2. Enable YouTube Data API v3
3. Create API credentials
4. Get an API key
Quota Limits
| Operation | Cost (units) | Daily Free Quota |
|---|---|---|
| captions.list | 50 | 10,000 units |
| captions.download | 200 | 10,000 units |
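The unit costs above translate directly into a daily ceiling on how many videos the free quota covers. The arithmetic can be sketched as follows, using only the figures from the table; the function name `videos_per_day` is an illustrative assumption.

```python
DAILY_QUOTA = 10_000  # Free units per day, from the table above
COSTS = {"captions.list": 50, "captions.download": 200}  # Units per call


def videos_per_day(calls_per_video, daily_quota=DAILY_QUOTA):
    """Return (videos per day, units per video) for a given mix of calls."""
    cost = sum(COSTS[name] * count for name, count in calls_per_video.items())
    return daily_quota // cost, cost


# One list call plus one download per video: 250 units each
print(videos_per_day({"captions.list": 1, "captions.download": 1}))  # (40, 250)
# Download only, when the caption ID is already known: 200 units each
print(videos_per_day({"captions.download": 1}))  # (50, 200)
```

At 40 to 50 videos per day, the free quota only suits small jobs, and any other Data API calls the project makes draw from the same 10,000 units.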
Usage

    from googleapiclient.discovery import build

    api_key = "YOUR_API_KEY"
    video_id = "dQw4w9WgXcQ"
    youtube = build('youtube', 'v3', developerKey=api_key)

    # List available captions
    request = youtube.captions().list(
        part="snippet",
        videoId=video_id
    )
    response = request.execute()

    # Download requires OAuth (not just an API key)
    # and a more complex setup

Limitations
- OAuth required for caption download
- High unit cost per download
- Only works on your own videos (or with permission)
- Complex setup
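The `captions.list` response from the Usage snippet can still be used to check which caption tracks a video has before committing to an OAuth setup. The sketch below assumes the response shape described in the Data API reference (`items`, each with an `id` and a `snippet` holding `language` and `trackKind`); `pick_caption_track` is a hypothetical helper, and the comparison is case-insensitive because the exact casing of the auto-generated track kind is not assumed here.

```python
def pick_caption_track(response, language="en"):
    """Return the ID of a caption track in `language`, preferring manual tracks."""
    candidates = [
        item
        for item in response.get("items", [])
        if item["snippet"].get("language", "").lower().startswith(language.lower())
    ]
    # False sorts before True, so tracks that are not auto-generated come first
    candidates.sort(
        key=lambda item: item["snippet"].get("trackKind", "").lower() == "asr"
    )
    return candidates[0]["id"] if candidates else None
```

The returned ID is what a later `captions.download` call would need, subject to the OAuth and ownership limits listed above.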
Option 3: yt-dlp as API
Use yt-dlp programmatically for reliable extraction.
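The `json3` subtitle files that the examples in this section request are not shaped like the youtube-transcript-api output. The sketch below converts one into the same `text`/`start`/`duration` dictionaries, assuming the commonly observed but undocumented `json3` layout: an `events` list whose entries carry `tStartMs`, `dDurationMs`, and a `segs` list of `utf8` strings. `json3_to_segments` is a hypothetical helper name.

```python
def json3_to_segments(data):
    """Convert a parsed json3 subtitle document into transcript-style segments."""
    segments = []
    for event in data.get("events", []):
        # Each event's text is split across small pieces in "segs"
        text = "".join(seg.get("utf8", "") for seg in event.get("segs", [])).strip()
        if not text:
            continue  # Skip events that only carry layout or line breaks
        segments.append({
            "text": text,
            "start": event.get("tStartMs", 0) / 1000,       # Milliseconds to seconds
            "duration": event.get("dDurationMs", 0) / 1000,
        })
    return segments
```

Because the format is undocumented, every field is read with `.get()` so that a missing key yields an empty or zero value rather than an exception.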
Python Integration
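
    import json
    import subprocess

    def get_transcript_ytdlp(video_url):
        cmd = [
            'yt-dlp',
            '--write-auto-sub',
            '--sub-langs', 'en',  # Request English so the file name below is predictable
            '--sub-format', 'json3',
            '--skip-download',
            '-o', '%(id)s',
            video_url
        ]
        # check=True raises CalledProcessError if yt-dlp fails
        subprocess.run(cmd, capture_output=True, check=True)
        # Read the generated file (assumes a watch?v=... URL)
        video_id = video_url.split('v=')[1].split('&')[0]
        with open(f'{video_id}.en.json3', 'r', encoding='utf-8') as f:
            return json.load(f)

yt-dlp Python Library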
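
    import yt_dlp

    def get_subtitles(video_url):
        ydl_opts = {
            'writeautomaticsub': True,
            'subtitlesformat': 'json3',
            'skip_download': True,
        }
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            info = ydl.extract_info(video_url, download=False)
        # Auto-generated tracks are listed under 'automatic_captions';
        # 'subtitles' holds only manually uploaded tracks.
        # The result maps language codes to available formats and URLs,
        # not to the caption text itself.
        return info.get('automatic_captions', {})

Rate Limits and Best Practices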
Unofficial API Limits
| Library | Observed Limits | Risk |
|---|---|---|
| youtube-transcript-api | ~100-500/hour | IP blocks |
| yt-dlp | ~50-100/hour | Throttling |
| Direct scraping | Very low | Account bans |
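The observed limits above imply a minimum spacing between requests: 100 requests per hour works out to one request every 36 seconds, while a fixed 2-second delay allows up to 1,800 per hour, well above the ranges in the table. A minimal sketch of a throttle derived from an hourly budget follows; the `Throttle` class is a hypothetical helper, and the hourly figure passed in is whatever limit you choose to assume.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_hour` happen in any hour."""

    def __init__(self, max_per_hour, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 3600.0 / max_per_hour  # Seconds between calls
        self._clock = clock
        self._sleep = sleep
        self._last = None  # Time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` immediately before each transcript request. Time already spent processing the previous result counts toward the interval, so the loop does not sleep longer than necessary.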
Best Practices
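1. Add delays:

    import time
    from youtube_transcript_api import YouTubeTranscriptApi

    video_ids = ["dQw4w9WgXcQ"]  # IDs to process
    for video_id in video_ids:
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        time.sleep(2)  # Wait 2 seconds between requests

2. Handle rate limits: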

    import time
    from youtube_transcript_api import YouTubeTranscriptApi

    def get_transcript_with_retry(video_id, max_retries=3):
        for attempt in range(max_retries):
            try:
                return YouTubeTranscriptApi.get_transcript(video_id)
            except Exception as e:
                if "429" in str(e) or "rate" in str(e).lower():
                    # Exponential backoff: 60s, 120s, 240s, ...
                    wait_time = 60 * 2 ** attempt
                    time.sleep(wait_time)
                else:
                    raise
        raise RuntimeError("Max retries exceeded")

3. Use proxies for scale:

    # Note: Check terms of service before using proxies
    import os

    # Replace proxy:port with your proxy's host and port
    os.environ['HTTP_PROXY'] = 'http://proxy:port'
    os.environ['HTTPS_PROXY'] = 'http://proxy:port'

Building a Simple API Wrapper
Flask Example

    from flask import Flask, jsonify
    from youtube_transcript_api import YouTubeTranscriptApi

    app = Flask(__name__)

    @app.route('/transcript/<video_id>')
    def get_transcript(video_id):
        try:
            transcript = YouTubeTranscriptApi.get_transcript(video_id)
            return jsonify({
                'success': True,
                'transcript': transcript
            })
        except Exception as e:
            return jsonify({
                'success': False,
                'error': str(e)
            }), 400

    if __name__ == '__main__':
        app.run(debug=True)  # Development server only

FastAPI Example

    from fastapi import FastAPI, HTTPException
    from youtube_transcript_api import YouTubeTranscriptApi

    app = FastAPI()

    # Plain def (not async def): the transcript call blocks, so FastAPI
    # runs it in a worker thread instead of stalling the event loop
    @app.get("/transcript/{video_id}")
    def get_transcript(video_id: str, lang: str = "en"):
        try:
            transcript = YouTubeTranscriptApi.get_transcript(
                video_id,
                languages=[lang]
            )
            return {"transcript": transcript}
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))

When Free APIs Aren't Enough
Consider Paid Options When:
| Scenario | Free Limit | Paid Solution |
|---|---|---|
| High volume (1000+/day) | Too slow/risky | Paid API service |
| Production app | Unreliable | Paid service |
| SLA required | None | Enterprise API |
| Support needed | Community only | Paid service |
Paid API Alternatives
| Service | Cost | Features |
|---|---|---|
| RapidAPI providers | $10-50/mo | Various limits |
| AssemblyAI | Pay per minute | Full transcription |
| Rev.com API | Pay per minute | High accuracy |
Code Examples
Batch Processing

    from youtube_transcript_api import YouTubeTranscriptApi
    import time
    import json

    def batch_get_transcripts(video_ids, output_file):
        results = {}
        for i, video_id in enumerate(video_ids):
            print(f"Processing {i+1}/{len(video_ids)}: {video_id}")
            try:
                transcript = YouTubeTranscriptApi.get_transcript(video_id)
                results[video_id] = {
                    'success': True,
                    'transcript': transcript
                }
            except Exception as e:
                results[video_id] = {
                    'success': False,
                    'error': str(e)
                }
            # Rate limiting
            time.sleep(1.5)
        with open(output_file, 'w') as f:
            json.dump(results, f, indent=2)
        return results

    # Usage
    video_ids = ['abc123', 'def456', 'ghi789']
    batch_get_transcripts(video_ids, 'transcripts.json')

Extract Plain Text

    from youtube_transcript_api import YouTubeTranscriptApi

    def get_plain_text(video_id):
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        return ' '.join(segment['text'] for segment in transcript)

    # Usage
    text = get_plain_text('dQw4w9WgXcQ')
    print(text)

Save as SRT

    from youtube_transcript_api import YouTubeTranscriptApi

    def transcript_to_srt(video_id, output_file):
        transcript = YouTubeTranscriptApi.get_transcript(video_id)

        def format_time(seconds):
            # Work in whole milliseconds to avoid float rounding (e.g. 4.3 -> ,299)
            total_ms = int(round(seconds * 1000))
            hours, rem = divmod(total_ms, 3_600_000)
            minutes, rem = divmod(rem, 60_000)
            secs, millis = divmod(rem, 1000)
            return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

        srt_content = ""
        for i, segment in enumerate(transcript, 1):
            start = segment['start']
            end = start + segment['duration']
            text = segment['text']
            srt_content += f"{i}\n"
            srt_content += f"{format_time(start)} --> {format_time(end)}\n"
            srt_content += f"{text}\n\n"
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write(srt_content)

    # Usage
    transcript_to_srt('dQw4w9WgXcQ', 'transcript.srt')

Conclusion
Quick start:

    pip install youtube-transcript-api

    from youtube_transcript_api import YouTubeTranscriptApi
    transcript = YouTubeTranscriptApi.get_transcript("VIDEO_ID")

Start building your transcript integration today!
Written By
The NoteLM team specializes in AI-powered video summarization and learning tools. We are passionate about making video content more accessible and efficient for learners worldwide.