
What Is youtube-transcripts-mcp? The MCP Server That Gives AI Agents YouTube Access

A complete technical explainer of youtube-transcripts-mcp — the Model Context Protocol server that lets Claude, Cursor, and Windsurf fetch YouTube transcripts natively, with zero scraping.

June 23, 2026
7 min read
Alex Rivera

youtube-transcripts-mcp is a Model Context Protocol (MCP) server that gives AI agents the ability to fetch YouTube video transcripts on demand — no scraping, no browser automation, no manual copy-paste required.

If you've ever wanted Claude, Cursor, or Windsurf to read the content of a YouTube video as part of a workflow, this is the package that makes it possible. Below is a complete breakdown of what it is, why it matters, and exactly how it works.


What Is the Model Context Protocol (MCP)?

The Model Context Protocol is an open standard introduced by Anthropic that defines how AI assistants connect to external tools and data sources. Think of it as a universal plug format: instead of every AI tool inventing its own integration layer, MCP gives any compliant AI host — Claude Desktop, Cursor, Windsurf, and others — a standardised way to discover and call external capabilities.

"Tool use is what separates a language model from an AI agent. When a model can reach out and pull real data from the world, it stops being a text predictor and starts being an autonomous worker." — Anthropic MCP Documentation, 2024

An MCP server is a small process that exposes one or more tools over a defined protocol. The AI host launches the server, queries it for available tools, and then calls those tools as part of its reasoning loop — all transparently, without the user writing any code.
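
To make the launch, discover, and call sequence more concrete, here is a minimal sketch of the first JSON-RPC 2.0 messages a host might send. The method names (initialize, tools/list) come from the MCP specification; the helper name jsonrpc_request, the client name and version, and the protocol version string are placeholder assumptions for illustration.

```python
import json

def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message

# Step 1: the host introduces itself. The clientInfo values are placeholders.
initialize = jsonrpc_request(1, "initialize", {
    "protocolVersion": "2024-11-05",  # assumed value; host and server negotiate the real one
    "capabilities": {},
    "clientInfo": {"name": "example-host", "version": "0.0.1"},
})

# After the server replies, the host sends a "notifications/initialized"
# notification (omitted here), then asks which tools the server offers.
list_tools = jsonrpc_request(2, "tools/list")

print(json.dumps(initialize))
print(json.dumps(list_tools))
```

The server's reply to tools/list is where the host learns each tool's name, description, and input schema, which is why the user never has to describe the tools by hand.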

According to a 2025 survey by AI research firm Latent Space, over 68% of developers using Claude Desktop had installed at least one MCP server within the first three months of the feature's release. YouTube transcript access was consistently cited as a top requested capability.


What Is youtube-transcripts-mcp?

youtube-transcripts-mcp is a ready-to-use MCP server built and maintained by the team at TranscribeYT. It wraps the TranscribeYT transcript API and exposes it as two MCP-native tools that any compliant AI host can call.

Package details:

  • npm package: youtube-transcripts-mcp
  • Transport: stdio (standard input/output; no port or network config needed)
  • Auth: TRANSCRIBEYT_API_KEY environment variable
  • Runtime: Node.js via npx — no global install required

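The server is launched automatically by the AI host using npx -y youtube-transcripts-mcp. The -y flag tells npx to skip the interactive install confirmation, which matters because the host runs the command without a terminal. It does not by itself force the newest release: npx can reuse a copy it has already cached. To ask npx for the latest published version, write the package as youtube-transcripts-mcp@latest; to guard against unexpected changes, pin an exact version instead.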


How Does youtube-transcripts-mcp Work?

The stdio Transport Layer

This server communicates over stdio, the standard input and output streams (MCP also defines HTTP-based transports, which this package does not need). When Claude Desktop (or another host) starts a session, it spawns the npx process as a child process and exchanges JSON-RPC messages with it over stdin/stdout. This means:

  • No local ports are opened
  • No inbound firewall rules are needed (the server still needs outbound access to reach the TranscribeYT API)
  • The server starts and stops with the AI host session
  • Multiple AI clients can each run their own isolated instance
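
MCP's stdio transport frames each JSON-RPC message as a single line of JSON terminated by a newline. The sketch below shows that framing in isolation. The helper names write_message and read_message are hypothetical, and an in-memory StringIO stands in for the real pipe; an actual host would attach to the stdin and stdout of the spawned npx process instead.

```python
import io
import json

def write_message(stream, message):
    """Serialise one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside string values, so the encoded
    # message never spans more than one line.
    stream.write(json.dumps(message) + "\n")
    stream.flush()

def read_message(stream):
    """Read one line and parse it; return None at end of stream."""
    line = stream.readline()
    if not line:
        return None
    return json.loads(line)

# In-memory stand-in for the child process pipe.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
pipe.seek(0)
print(read_message(pipe))
```

Because stdout carries protocol messages, a stdio server has to keep its own logging off stdout (stderr is the usual choice), otherwise stray output would corrupt the message stream.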

The Two Tools Exposed

youtube-transcripts-mcp exposes exactly two tools:

| Tool | Input | Output | Best Used When |
|------|-------|--------|----------------|
| get_transcript | YouTube URL or video ID | Full transcript as plain text | You want to summarise, search, or analyse the full content of a video |
| get_transcript_with_timestamps | YouTube URL or video ID | Transcript with per-segment timestamps (HH:MM:SS) | You need to find the exact moment something is said, navigate to a timestamp, or cite a specific segment |
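
Both tools accept either a URL or a bare video ID. If you want to normalise input on the client side before calling a tool, a helper along these lines may be useful. This is an illustrative sketch, not the server's own parsing logic: the function name extract_video_id, the list of URL shapes, and the 11-character ID pattern are assumptions based on commonly seen YouTube links.

```python
import re
from urllib.parse import urlparse, parse_qs

# YouTube video IDs are commonly 11 characters from this alphabet
# (an observed convention, assumed here rather than guaranteed).
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")

def extract_video_id(value):
    """Return the video ID from a URL or bare ID, or None if nothing matches."""
    value = value.strip()
    if VIDEO_ID.match(value):
        return value
    # Add a scheme so urlparse puts the host in netloc for inputs
    # such as "youtube.com/watch?v=...".
    parsed = urlparse(value if "://" in value else "https://" + value)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith(("/embed/", "/shorts/")):
            candidate = parsed.path.split("/")[2]
    return candidate if candidate and VIDEO_ID.match(candidate) else None

print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10s"))
print(extract_video_id("youtu.be/dQw4w9WgXcQ"))
```

Short placeholder IDs such as abc123 will not pass the length check in this sketch, so loosen the pattern if you need to accept them.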

get_transcript is the lighter tool. It returns the entire spoken content of a video as a clean text string — ideal for feeding into a summarisation prompt, running a keyword search, or extracting information for documentation.

get_transcript_with_timestamps returns the same content but with each segment annotated with its start time. This is essential for developer workflows where you need to jump to the exact second a particular API is explained, or for podcast research where you want to surface timestamp-linked citations.
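
As an illustration of the "jump to the exact second" workflow, the sketch below finds the first segment that mentions a keyword and builds a watch link that starts there. The line layout "[HH:MM:SS] text" is an assumption made for this example, since the exact output format of get_transcript_with_timestamps is not specified here; parse_segments and first_mention are hypothetical helper names.

```python
import re

# Assumed line format: "[HH:MM:SS] spoken text". The server's real layout may differ.
SEGMENT = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")

def parse_segments(transcript):
    """Yield (start_seconds, text) pairs from a timestamped transcript."""
    for line in transcript.splitlines():
        match = SEGMENT.match(line.strip())
        if match:
            hours, minutes, seconds, text = match.groups()
            yield int(hours) * 3600 + int(minutes) * 60 + int(seconds), text

def first_mention(transcript, keyword, video_id):
    """Return a watch URL starting at the first segment that contains keyword."""
    for start, text in parse_segments(transcript):
        if keyword.lower() in text.lower():
            return f"https://www.youtube.com/watch?v={video_id}&t={start}s"
    return None

sample = "[00:00:05] Welcome back\n[00:01:23] Now let's call the API"
print(first_mention(sample, "api", "dQw4w9WgXcQ"))
```

If the real output uses a different layout, only the SEGMENT pattern should need to change; the seconds arithmetic and the t= link parameter stay the same.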

The Request Flow

  1. User asks the AI a question that requires YouTube content (e.g. "Summarise this video: youtube.com/watch?v=abc123")
  2. The model decides the request calls for the get_transcript tool and emits a tool call
  3. The host sends a JSON-RPC call to the youtube-transcripts-mcp process over stdio
  4. The server calls the TranscribeYT API using your TRANSCRIBEYT_API_KEY
  5. TranscribeYT fetches and returns the transcript
  6. The server relays the transcript back to the AI host
  7. The AI generates its answer using the transcript content as context

The entire round-trip typically takes under two seconds.
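
Steps 3 and 6 of the flow above can be sketched as a pair of small helpers: one builds the tools/call message, the other pulls the transcript text out of the reply. The tools/call method and the content array of text blocks follow the MCP specification. The argument name "url" is an assumption (the server's tools/list output is the authority on its input schema), and the response shown is hand-written for illustration, not captured output.

```python
import json

def call_tool_request(request_id, tool_name, arguments):
    """Build the JSON-RPC message the host sends in step 3."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def text_from_result(response):
    """Join the text blocks of a tools/call response (step 6)."""
    blocks = response["result"]["content"]
    return "\n".join(b["text"] for b in blocks if b.get("type") == "text")

# "url" is an assumed argument name, used here only for illustration.
request = call_tool_request(
    3, "get_transcript", {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
)

# Hand-written stand-in for the server's reply.
response = {
    "jsonrpc": "2.0",
    "id": 3,
    "result": {"content": [{"type": "text", "text": "Hello and welcome..."}]},
}

print(json.dumps(request))
print(text_from_result(response))
```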


Who Is youtube-transcripts-mcp For?

Claude Desktop Users

If you use Claude Desktop for research, writing, or analysis, adding youtube-transcripts-mcp means you can paste a YouTube URL into any conversation and ask Claude to summarise, quote, or analyse the video — all without leaving Claude.

Cursor Users

Developers using Cursor as their AI code editor can configure youtube-transcripts-mcp to pull in tutorial content, conference talk transcripts, or documentation videos directly into their coding workflow.

Windsurf Users

Windsurf users benefit from the same capability. Because both Cursor and Windsurf are MCP-compliant, a single API key and the same JSON config structure work across both editors.

Anyone Building AI Agents

If you're building autonomous agents that need to process video content at scale — research agents, content pipelines, study tools — youtube-transcripts-mcp is the fastest path to YouTube transcript access for any MCP-compatible framework.


JSON Configuration for Claude Desktop

Add the following block to your claude_desktop_config.json file (typically found at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS or %APPDATA%\Claude\claude_desktop_config.json on Windows):

{
  "mcpServers": {
    "youtube-transcripts": {
      "command": "npx",
      "args": ["-y", "youtube-transcripts-mcp"],
      "env": {
        "TRANSCRIBEYT_API_KEY": "your_api_key_here"
      }
    }
  }
}

Replace your_api_key_here with the API key you receive after signing up at transcribeyt.com/mcp. Save the file and restart Claude Desktop — the tools will appear automatically in your next session.
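
If the config file already lists other MCP servers, pasting the block above over it would remove them. A small script can merge the entry instead. The sketch below assumes the file is either absent or contains valid JSON; add_server is a hypothetical helper name, and the path in the final comment is the macOS location mentioned above.

```python
import json
from pathlib import Path

def add_server(config_path, api_key):
    """Add the youtube-transcripts entry while keeping any existing servers."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["youtube-transcripts"] = {
        "command": "npx",
        "args": ["-y", "youtube-transcripts-mcp"],
        "env": {"TRANSCRIBEYT_API_KEY": api_key},
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config

# Example call (commented out so the sketch does not touch a real config):
# add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#            "your_api_key_here")
```

Back up the file before running anything like this, and restart Claude Desktop afterwards so the new entry is picked up.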


get_transcript vs get_transcript_with_timestamps: Which Should You Use?

| Scenario | Recommended Tool |
|---|---|
| Summarise an entire video | get_transcript |
| Extract key points from a lecture | get_transcript |
| Find when a specific topic is first mentioned | get_transcript_with_timestamps |
| Generate a timestamped study guide | get_transcript_with_timestamps |
| Run a keyword search across the full content | get_transcript |
| Create a navigable chapter breakdown | get_transcript_with_timestamps |
| Feed content to a RAG (retrieval-augmented generation) pipeline | get_transcript |
| Cite a specific moment in a podcast episode | get_transcript_with_timestamps |

As a rule of thumb: use get_transcript when you care about what was said, and get_transcript_with_timestamps when you also care about when it was said.


Why Not Just Copy-Paste the Transcript Manually?

YouTube's transcript panel is slow to open, lives only in the browser (so a desktop AI client cannot read it), and is completely unavailable to autonomous agents running without a browser. youtube-transcripts-mcp solves all three problems:

  1. Speed — transcript is fetched programmatically in under two seconds
  2. Automation — agents can fetch transcripts without any human in the loop
  3. Integration — the transcript lands directly in the AI's context window, ready for reasoning

Frequently Asked Questions

Does it work with videos that have auto-generated captions?

Yes. The TranscribeYT API supports both manual and auto-generated captions. If a video has no captions at all, the tool returns an appropriate error message rather than failing silently.
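
For agent builders, it helps to distinguish the two places a failure can show up in an MCP reply: a protocol-level JSON-RPC error object, or a tool-level result flagged with isError. The sketch below handles both. Whether this particular server sets isError for a captionless video is an assumption, the helper name transcript_or_error is hypothetical, and the error text is invented for the example.

```python
def transcript_or_error(response):
    """Return (ok, text) from a tools/call response."""
    # Protocol-level failure: JSON-RPC reports these in a top-level "error" object.
    if "error" in response:
        return False, response["error"].get("message", "unknown protocol error")
    result = response.get("result", {})
    text = "\n".join(
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    # Tool-level failure: MCP marks these with isError on the result.
    return not result.get("isError", False), text

# Hand-written example; the real server's error wording may differ.
no_captions = {
    "jsonrpc": "2.0",
    "id": 4,
    "result": {
        "isError": True,
        "content": [{"type": "text", "text": "No captions available for this video"}],
    },
}
print(transcript_or_error(no_captions))
```

Returning the error text alongside the flag lets the calling agent decide whether to retry, try another video, or report the problem to the user.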

Does it support non-English videos?

Yes — transcripts are returned in the language of the video's primary caption track. If multiple caption languages are available, the API selects the most appropriate one automatically.

Is there a rate limit?

Rate limits depend on your TranscribeYT plan. Refer to transcribeyt.com/mcp for current plan details.

Can I use it outside of MCP hosts?

youtube-transcripts-mcp is specifically designed as an MCP server. For direct API usage without MCP, you can call the TranscribeYT REST API directly using your API key.


Start Using youtube-transcripts-mcp Today

youtube-transcripts-mcp is the simplest way to give any MCP-compatible AI agent the ability to read YouTube content. With two powerful tools, a zero-config stdio transport, and a single environment variable for auth, it's production-ready in under five minutes.

Get your API key at TranscribeYT.com and start fetching transcripts in minutes.
