YouTube Transcript API: Developer Guide and Free Alternatives

Learn how to access YouTube transcripts programmatically. This developer guide covers the official YouTube Data API limitations, free third-party libraries like youtube-transcript-api (Python), and REST API alternatives for extracting video captions.

By NoteLM Team | Published 2026-01-04

Key Takeaways

  • youtube-transcript-api (Python) is the easiest solution—free, no API key needed
  • Official YouTube Data API requires OAuth and only works for videos you own
  • yt-dlp provides command-line batch processing capabilities
  • Add delays between requests to avoid rate limiting
  • About 85% of YouTube videos have auto-generated captions available
  • You can build your own REST API wrapper using FastAPI or Flask

The easiest way to get YouTube transcripts programmatically is using the youtube-transcript-api Python library—it's free, requires no API key, and works with any video that has captions. The official YouTube Data API can access captions but requires OAuth authentication and only works for videos you own. This guide covers both approaches plus alternative methods.

YouTube Transcript API Options Overview

Method | Auth Required | Rate Limits | Languages | Best For
youtube-transcript-api (Python) | None | Reasonable | All | Most developers
YouTube Data API v3 | OAuth | 10,000 units/day | All | Video owners only
yt-dlp (CLI) | None | None | All | Batch processing
Web scraping | None | Risky | English | Last resort

Method 1: youtube-transcript-api (Python)

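This is the most popular way to extract YouTube transcripts programmatically: open-source, free, and no API keys required. The examples below use the static get_transcript and list_transcripts methods found in pre-1.0 releases of the library. Releases from 1.0 onward moved to an instance-based interface (YouTubeTranscriptApi().fetch(video_id)), so check your installed version and adapt the calls if needed.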

Installation

pip install youtube-transcript-api

Basic Usage

from youtube_transcript_api import YouTubeTranscriptApi

# Get transcript for a video
video_id = "dQw4w9WgXcQ"
transcript = YouTubeTranscriptApi.get_transcript(video_id)

# Print each segment
for segment in transcript:
    print(f"[{segment['start']:.2f}] {segment['text']}")
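
The call above takes a bare video ID rather than a full URL. Here is a minimal sketch of pulling the ID out of common URL shapes, assuming the usual watch?v=, youtu.be/, /embed/ and /shorts/ forms. extract_video_id is a hypothetical helper, not part of the library, and other URL variants would need extra cases.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in ("youtube.com", "m.youtube.com"):
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]

    return None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

Returning None instead of raising lets a batch job skip malformed URLs without a try/except around every call.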

Output Format

[
    {'text': 'Hey there', 'start': 0.0, 'duration': 1.5},
    {'text': 'welcome to the video', 'start': 1.5, 'duration': 2.0},
    {'text': 'today we are going to', 'start': 3.5, 'duration': 2.5},
    # ... more segments
]
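
Because each segment is a plain dict, post-processing needs nothing beyond the standard library. The sketch below assumes the list-of-dicts shape shown above. format_timestamp, to_timestamped_lines and to_plain_text are illustrative names, not library functions.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once the video passes one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def to_timestamped_lines(segments):
    """One '[M:SS] text' line per segment."""
    return [f"[{format_timestamp(s['start'])}] {s['text']}" for s in segments]


def to_plain_text(segments):
    """Join all segment texts into a single string."""
    return " ".join(s["text"].strip() for s in segments)


segments = [
    {"text": "Hey there", "start": 0.0, "duration": 1.5},
    {"text": "welcome to the video", "start": 1.5, "duration": 2.0},
    {"text": "today we are going to", "start": 3.5, "duration": 2.5},
]

for line in to_timestamped_lines(segments):
    print(line)
print(to_plain_text(segments))
```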

Getting Transcripts in Different Languages

from youtube_transcript_api import YouTubeTranscriptApi

video_id = "dQw4w9WgXcQ"

# Get the Spanish transcript
transcript_es = YouTubeTranscriptApi.get_transcript(
    video_id,
    languages=['es']
)

# Pass several codes in priority order; the first one available is returned
transcript = YouTubeTranscriptApi.get_transcript(
    video_id,
    languages=['en', 'en-US', 'en-GB']
)

List Available Transcripts

from youtube_transcript_api import YouTubeTranscriptApi

transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)

for transcript in transcript_list:
    print(f"Language: {transcript.language}")
    print(f"Language code: {transcript.language_code}")
    print(f"Is generated: {transcript.is_generated}")
    print(f"Is translatable: {transcript.is_translatable}")
    print("---")
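
The attributes printed above are enough to choose a track yourself, for example preferring a manually created transcript over an auto-generated one. The sketch below illustrates that selection rule on stand-in objects so it runs without network access. pick_track is a hypothetical helper and the SimpleNamespace tracks are made-up data, not real library objects.

```python
from types import SimpleNamespace


def pick_track(tracks, languages):
    """Return the best track for the first matching language code.

    Languages are tried in priority order; within one language a manually
    created track wins over an auto-generated one.
    """
    for code in languages:
        matches = [t for t in tracks if t.language_code == code]
        manual = [t for t in matches if not t.is_generated]
        if manual:
            return manual[0]
        if matches:
            return matches[0]
    return None


# Stand-ins that expose the same attribute names as the loop above
tracks = [
    SimpleNamespace(language_code="en", is_generated=True),
    SimpleNamespace(language_code="en", is_generated=False),
    SimpleNamespace(language_code="es", is_generated=True),
]

best = pick_track(tracks, ["en", "es"])
print(best.language_code, "generated" if best.is_generated else "manual")
```

With real data, the same function can be applied to the objects yielded by iterating over transcript_list.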

Translate Transcripts

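from youtube_transcript_api import YouTubeTranscriptApi

# Get the English transcript and translate it to Spanish
transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
transcript = transcript_list.find_transcript(['en'])
translated = transcript.translate('es')  # requires transcript.is_translatable
translated_segments = translated.fetch()  # same segment format as get_transcript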

Formatting Options

from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import (
    TextFormatter,
    SRTFormatter,
    JSONFormatter
)

transcript = YouTubeTranscriptApi.get_transcript(video_id)

# Plain text
text_formatter = TextFormatter()
text_output = text_formatter.format_transcript(transcript)

# SRT format (for subtitles)
srt_formatter = SRTFormatter()
srt_output = srt_formatter.format_transcript(transcript)

# JSON format
json_formatter = JSONFormatter()
json_output = json_formatter.format_transcript(transcript)
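
To show what the SRT output contains, here is a hand-rolled sketch that builds SRT text from the same segment dicts. It is an illustration of the format, not the library's SRTFormatter implementation, and it assumes each cue ends at start + duration. srt_timestamp and segments_to_srt are hypothetical names.

```python
def srt_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS,mmm, the timestamp style SRT uses."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["start"] + seg["duration"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"


print(segments_to_srt([
    {"text": "Hey there", "start": 0.0, "duration": 1.5},
    {"text": "welcome to the video", "start": 1.5, "duration": 2.0},
]))
```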

Error Handling

# The exception classes are exported from the package root,
# so there is no need to import from the private _errors module
from youtube_transcript_api import (
    YouTubeTranscriptApi,
    TranscriptsDisabled,
    NoTranscriptFound,
    VideoUnavailable,
)

try:
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
except TranscriptsDisabled:
    print("Transcripts are disabled for this video")
except NoTranscriptFound:
    print("No transcript found for requested language")
except VideoUnavailable:
    print("Video is unavailable (private, deleted, etc.)")
except Exception as e:
    print(f"An error occurred: {e}")

Batch Processing Multiple Videos

from youtube_transcript_api import YouTubeTranscriptApi

video_ids = ["video1_id", "video2_id", "video3_id"]

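# get_transcripts returns a tuple:
#   ({video_id: transcript_data}, [video IDs that could not be retrieved])
transcripts, unretrievable = YouTubeTranscriptApi.get_transcripts(
    video_ids,
    languages=['en'],
    continue_after_error=True,  # keep going when one video fails
)

for video_id, transcript in transcripts.items():
    print(f"Video: {video_id}")
    print(f"Segments: {len(transcript)}")

print(f"Could not retrieve: {unretrievable}")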

Method 2: YouTube Data API v3

The official API provides caption access but with significant limitations.

Requirements

  • Google Cloud project
  • YouTube Data API enabled
  • OAuth 2.0 credentials
  • Video must be owned by authenticated user OR have downloadable captions

Setup Steps

Step 1: Create a project in Google Cloud Console
Step 2: Enable YouTube Data API v3
Step 3: Create OAuth 2.0 credentials
Step 4: Install the client library:

pip install google-api-python-client google-auth-oauthlib

List Captions for a Video

from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ['https://www.googleapis.com/auth/youtube.force-ssl']

# Authenticate
flow = InstalledAppFlow.from_client_secrets_file(
    'client_secret.json', SCOPES
)
credentials = flow.run_local_server()

# Build API client
youtube = build('youtube', 'v3', credentials=credentials)

# List captions
request = youtube.captions().list(
    part="snippet",
    videoId="VIDEO_ID"
)
response = request.execute()

for caption in response['items']:
    print(f"Track: {caption['snippet']['name']}")
    print(f"Language: {caption['snippet']['language']}")
    print(f"ID: {caption['id']}")

Download Caption Track

# Download specific caption track
request = youtube.captions().download(
    id="CAPTION_TRACK_ID",
    tfmt="srt"  # or "sbv", "vtt"
)
caption_content = request.execute()

Limitations of Official API

Limitation | Details
OAuth required | Can't use simple API key
Video ownership | Only download captions for videos you own
Quota usage | 50 units per caption list, 200 per download
Daily quota | 10,000 units/day (free tier)
No auto-captions | Can't download auto-generated captions via API

When to Use Official API

✅ Managing captions on your own videos

✅ Building YouTube management tools

✅ Enterprise applications with quota needs

✅ Need official support/documentation

❌ Extracting transcripts from any video

❌ Simple transcript extraction projects

❌ Rate-limit sensitive applications

Method 3: yt-dlp (Command Line)

Use yt-dlp for batch processing or when you prefer command-line tools.

Installation

# pip
pip install yt-dlp

# Homebrew (Mac)
brew install yt-dlp

# Chocolatey (Windows)
choco install yt-dlp

Download Subtitles

# Download auto-generated English subtitles
yt-dlp --write-auto-sub --sub-lang en --skip-download "VIDEO_URL"

# Download manual subtitles
yt-dlp --write-sub --sub-lang en --skip-download "VIDEO_URL"

# Convert to SRT format
yt-dlp --write-auto-sub --convert-subs srt --skip-download "VIDEO_URL"

# Download all available subtitles
yt-dlp --all-subs --skip-download "VIDEO_URL"
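
Once a subtitle file has been converted to SRT, turning it into plain text is a small parsing job. The sketch below assumes standard SRT layout (cue number, timing line, text lines, blank line). srt_to_text is a hypothetical helper, and it skips consecutive duplicate lines because auto-generated tracks can repeat the previous line.

```python
import re

# Matches timing lines such as "00:00:01,500 --> 00:00:03,500"
TIMING = re.compile(r"^\d{2,}:\d{2}:\d{2}[,.]\d{3} --> ")


def srt_to_text(srt: str) -> str:
    """Strip cue numbers and timing lines from SRT text, keeping the captions."""
    lines = []
    for raw in srt.splitlines():
        line = raw.strip()
        # Skip blanks, cue numbers, and timing lines.
        # Caveat: a caption consisting only of digits is dropped as well.
        if not line or line.isdigit() or TIMING.match(line):
            continue
        # Skip a line that merely repeats the previous one
        if lines and lines[-1] == line:
            continue
        lines.append(line)
    return " ".join(lines)


sample = (
    "1\n00:00:00,000 --> 00:00:01,500\nHey there\n\n"
    "2\n00:00:01,500 --> 00:00:03,500\nHey there\n\n"
    "3\n00:00:03,500 --> 00:00:06,000\nwelcome to the video\n"
)
print(srt_to_text(sample))
```

To process a downloaded file, read it with open(path, encoding="utf-8").read() and pass the contents to srt_to_text.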

Python Integration

import yt_dlp

def get_subtitles(video_url):
    ydl_opts = {
        'writeautomaticsub': True,
        'writesubtitles': True,
        'subtitleslangs': ['en'],
        'skip_download': True,
        'outtmpl': '%(id)s',
    }
    
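    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        # download=False returns metadata only; no subtitle files are written
        info = ydl.extract_info(video_url, download=False)

        # Each dict maps a language code to a list of available formats;
        # every entry carries an 'ext' and a download 'url'
        subtitles = info.get('subtitles', {})
        auto_captions = info.get('automatic_captions', {})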
        
        return {
            'manual': subtitles,
            'auto': auto_captions,
            'title': info.get('title'),
            'duration': info.get('duration')
        }

# Usage
result = get_subtitles("https://youtube.com/watch?v=VIDEO_ID")
print(result)
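
The dictionaries returned by get_subtitles map language codes to lists of format entries, so one more step is needed to reach an actual caption file. The sketch below picks a download URL, preferring manual subtitles over automatic captions. pick_subtitle_url is a hypothetical helper and the sample data is made up for illustration.

```python
def pick_subtitle_url(result, lang="en", ext="vtt"):
    """Return the URL of the first track matching lang and ext, or None.

    Manual subtitles are checked before automatic captions.
    """
    for source in ("manual", "auto"):
        for entry in result.get(source, {}).get(lang, []):
            if entry.get("ext") == ext:
                return entry.get("url")
    return None


# Made-up data in the same shape as the get_subtitles() return value
sample_result = {
    "manual": {},
    "auto": {
        "en": [
            {"ext": "json3", "url": "https://example.com/captions.json3"},
            {"ext": "vtt", "url": "https://example.com/captions.vtt"},
        ]
    },
}

print(pick_subtitle_url(sample_result))
```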

Method 4: REST API Alternatives

Some services offer REST APIs for transcript extraction.

Building Your Own API

You can wrap the youtube-transcript-api in a Flask/FastAPI server:

from fastapi import FastAPI, HTTPException
# Exception classes are exported from the package root
from youtube_transcript_api import (
    YouTubeTranscriptApi,
    TranscriptsDisabled,
    NoTranscriptFound,
)

app = FastAPI()

@app.get("/transcript/{video_id}")
# Plain def: the blocking library call then runs in FastAPI's threadpool
def get_transcript(video_id: str, lang: str = "en"):
    try:
        transcript = YouTubeTranscriptApi.get_transcript(
            video_id,
            languages=[lang, 'en']
        )
        return {
            "video_id": video_id,
            "language": lang,
            "segments": transcript,
            "text": " ".join([s['text'] for s in transcript])
        }
    except TranscriptsDisabled:
        raise HTTPException(404, "Transcripts disabled for this video")
    except NoTranscriptFound:
        raise HTTPException(404, "No transcript found")
    except Exception as e:
        raise HTTPException(500, str(e))

@app.get("/transcript/{video_id}/languages")
# Plain def: the blocking library call then runs in FastAPI's threadpool
def list_languages(video_id: str):
    try:
        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
        languages = []
        for t in transcript_list:
            languages.append({
                "code": t.language_code,
                "name": t.language,
                "is_generated": t.is_generated,
                "is_translatable": t.is_translatable
            })
        return {"video_id": video_id, "languages": languages}
    except Exception as e:
        raise HTTPException(500, str(e))

Run the API

# Save the server code as main.py, then:
pip install fastapi uvicorn youtube-transcript-api
uvicorn main:app --reload

API Endpoints

Endpoint | Method | Description
/transcript/{video_id} | GET | Get transcript text
/transcript/{video_id}?lang=es | GET | Get transcript in Spanish
/transcript/{video_id}/languages | GET | List available languages
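
A client for these endpoints can be written with the standard library alone. The sketch below assumes the server is running locally on uvicorn's default address (http://127.0.0.1:8000). transcript_url and fetch_transcript are hypothetical helper names.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # assumption: uvicorn's default host and port


def transcript_url(video_id, lang=None, base_url=BASE_URL):
    """Build the endpoint URL, adding ?lang=... only when a language is given."""
    url = f"{base_url}/transcript/{quote(video_id, safe='')}"
    if lang:
        url += "?" + urlencode({"lang": lang})
    return url


def fetch_transcript(video_id, lang=None):
    """Call the endpoint and decode the JSON body (the server must be running)."""
    with urlopen(transcript_url(video_id, lang), timeout=10) as response:
        return json.load(response)


print(transcript_url("dQw4w9WgXcQ", "es"))

# With the server running:
#   data = fetch_transcript("dQw4w9WgXcQ", lang="es")
#   print(data["text"][:200])
```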

Rate Limits and Best Practices

youtube-transcript-api Rate Limits

The library doesn't have official rate limits, but YouTube may throttle:

  • Reasonable use: 100-500 requests/hour
  • Heavy use: May trigger CAPTCHAs
  • Best practice: Add delays between requests

import time
from youtube_transcript_api import YouTubeTranscriptApi

video_ids = ["id1", "id2", "id3"]  # ... more IDs

transcripts = []
for video_id in video_ids:
    try:
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        transcripts.append((video_id, transcript))
    except Exception as e:
        print(f"Error for {video_id}: {e}")
    
    time.sleep(1)  # 1 second delay between requests
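
A fixed one-second pause works, but adding a little random jitter spreads requests out less predictably. The sketch below wraps any iterable in a generator that pauses between items. throttled is a hypothetical helper, and the delay values are assumptions rather than limits published by YouTube.

```python
import random
import time


def throttled(items, min_delay=1.0, jitter=0.5, sleep=time.sleep):
    """Yield items one by one, pausing before every item except the first.

    Each pause lasts min_delay plus a random extra of up to `jitter` seconds.
    The sleep function is injectable so the behaviour can be tested quickly.
    """
    for index, item in enumerate(items):
        if index:
            sleep(min_delay + random.uniform(0, jitter))
        yield item


# Short delays here only so the demo finishes quickly
for video_id in throttled(["id1", "id2", "id3"], min_delay=0.01, jitter=0.01):
    print(f"Would fetch transcript for {video_id}")
```

Placing the pause inside the generator keeps the fetching loop free of timing logic, so the same wrapper works for any of the methods in this guide.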

YouTube Data API Quotas

Operation | Cost (units)
captions.list | 50
captions.download | 200
captions.insert | 400
captions.update | 450
captions.delete | 50

Daily quota: 10,000 units (free tier)
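
To see what these costs mean in practice, here is a small budgeting sketch that uses only the figures from the table above. max_videos_per_day is a hypothetical helper, and it assumes no other API calls consume quota that day.

```python
DAILY_QUOTA = 10_000  # units per day, from the table above

# Cost per operation in quota units, from the table above
COSTS = {"list": 50, "download": 200, "insert": 400, "update": 450, "delete": 50}


def max_videos_per_day(operations, quota=DAILY_QUOTA):
    """How many videos fit in the quota if each one needs these operations."""
    per_video = sum(COSTS[op] for op in operations)
    return quota // per_video


# Listing the tracks and downloading one costs 50 + 200 = 250 units per video
print(max_videos_per_day(["list", "download"]))  # 10,000 // 250 = 40
```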

Error Handling Best Practices

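import logging
from tenacity import retry, stop_after_attempt, wait_exponential  # pip install tenacity
from youtube_transcript_api import YouTubeTranscriptApi

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    reraise=True,  # surface the original exception instead of tenacity's RetryError
)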
def get_transcript_with_retry(video_id):
    """Get transcript with automatic retry on failure."""
    try:
        return YouTubeTranscriptApi.get_transcript(video_id)
    except Exception as e:
        logger.warning(f"Attempt failed for {video_id}: {e}")
        raise

# Usage
try:
    transcript = get_transcript_with_retry("VIDEO_ID")
except Exception as e:
    logger.error(f"All attempts failed: {e}")
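
For intuition about how exponential backoff with lower and upper bounds behaves, here is a generic sketch of a clamped doubling schedule. It illustrates the technique only: backoff_delays is a hypothetical function, and tenacity's exact formula may differ between versions.

```python
def backoff_delays(retries, base=1.0, lo=2.0, hi=10.0):
    """Delays that double on each retry, clamped to the range [lo, hi]."""
    return [min(max(base * 2 ** n, lo), hi) for n in range(retries)]


# The raw sequence 1, 2, 4, 8, 16 is clamped at both ends
print(backoff_delays(5))  # [2.0, 2.0, 4.0, 8.0, 10.0]
```

The lower bound keeps early retries from firing too quickly, and the upper bound keeps later ones from waiting too long.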

Frequently Asked Questions

Q1: Is there an official YouTube transcript API?
YouTube Data API v3 provides caption access, but it requires OAuth authentication and only allows downloading captions for videos you own. For extracting transcripts from any video, use third-party libraries like youtube-transcript-api.
Q2: Do I need an API key for YouTube transcripts?
No. The youtube-transcript-api Python library doesn't require any API keys. It extracts transcripts directly from YouTube's public caption data. The official YouTube Data API requires OAuth 2.0 credentials.
Q3: Can I get auto-generated captions via API?
Yes, with youtube-transcript-api. The library can access both manual and auto-generated captions. The official YouTube Data API cannot download auto-generated captions, only manual ones.
Q4: What's the rate limit for transcript extraction?
youtube-transcript-api has no official rate limit, but YouTube may throttle heavy use. Best practice is adding 1-second delays between requests. The official API has a 10,000 unit daily quota.
Q5: Can I extract transcripts in multiple languages?
Yes. youtube-transcript-api supports all languages available on the video. You can also use its translation feature to translate transcripts to other languages.
Q6: How do I handle videos without transcripts?
Check for the NoTranscriptFound or TranscriptsDisabled exceptions. About 15% of YouTube videos have no captions at all. For these, you'd need a separate speech-to-text service.
Q7: Is transcript extraction against YouTube's Terms of Service?
Extracting publicly available caption data for personal use is generally acceptable. Commercial use at scale may require legal review. Always respect rate limits and don't overload YouTube's servers.
Q8: Which method is fastest for bulk extraction?
yt-dlp with batch mode is fastest for downloading many videos. youtube-transcript-api is better for selective extraction with error handling. Add delays to avoid rate limiting.

Conclusion

For most developers, the youtube-transcript-api Python library is the best solution—it's free, requires no authentication, and works with any video that has captions. Use the official YouTube Data API only when managing captions on your own videos. For batch processing, yt-dlp provides command-line efficiency.

Quick start:

pip install youtube-transcript-api

from youtube_transcript_api import YouTubeTranscriptApi
transcript = YouTubeTranscriptApi.get_transcript("VIDEO_ID")

Need a no-code solution? Try NoteLM.ai's YouTube Transcript Generator for instant transcript extraction without any programming.


Written By

NoteLM Team

The NoteLM team specializes in AI-powered video summarization and learning tools. We are passionate about making video content more accessible and efficient for learners worldwide.

AI/ML Development | Video Processing | Educational Technology
Last verified: January 4, 2026
API behavior and rate limits may change. Always refer to official documentation for current specifications.
