youtube-transcripts-mcp works in both Cursor and Windsurf with a single JSON config block and your TranscribeYT API key — giving your AI coding assistant the ability to read any YouTube video on demand, without leaving your editor.
This guide covers the exact configuration for each editor and walks through four real developer workflows that become possible once the tool is active.
What youtube-transcripts-mcp Adds to Your Coding Workflow
When you're deep in a coding session, video content is almost completely inaccessible. Watching a 40-minute conference talk to find a five-minute code explanation is a workflow-killer. YouTube tutorial series, conference recordings, library author interviews — all of that knowledge sits locked in a format your AI coding assistant can't read.
youtube-transcripts-mcp solves this by exposing two tools to your AI editor:
- get_transcript: retrieves the full spoken content of any YouTube video as plain text, ideal for full-context analysis and summarisation
- get_transcript_with_timestamps: retrieves the same content annotated with per-segment timestamps (HH:MM:SS), ideal for pinpointing exact moments in long recordings
According to the State of AI Developer Tools Report 2025, 71% of developers say they consult YouTube tutorials at least once per day during active development — yet fewer than 4% had any way to bring that content directly into their AI coding assistant before MCP-based tools became available.
Prerequisites
Before configuring either editor, make sure you have:
- Node.js 18+ installed (verify with node --version in your terminal)
- A TranscribeYT API key (sign up at transcribeyt.com/mcp and copy your key from the dashboard)
- Cursor or Windsurf installed and up to date
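If you would rather script the Node.js check than eyeball the version string, a minimal sketch follows. It assumes `node --version` prints something like `v18.19.0`; the helper names (`parse_node_major`, `node_is_new_enough`, `installed_node_version`) are illustrative and not part of youtube-transcripts-mcp.

```python
import re
import subprocess

MIN_NODE_MAJOR = 18  # minimum version stated in the prerequisites


def parse_node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v18.19.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognised Node.js version string: {version_output!r}")
    return int(match.group(1))


def node_is_new_enough(version_output: str) -> bool:
    """True when the reported major version meets the guide's minimum."""
    return parse_node_major(version_output) >= MIN_NODE_MAJOR


def installed_node_version() -> str:
    """Run `node --version`; raises FileNotFoundError if Node.js is not on PATH."""
    result = subprocess.run(
        ["node", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling `node_is_new_enough(installed_node_version())` on your machine tells you whether the first prerequisite is met.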
Configuring youtube-transcripts-mcp in Cursor
Cursor reads MCP server configurations from a file located at .cursor/mcp.json in your project root (project-scoped) or ~/.cursor/mcp.json in your home directory (global, applies to all projects).
Step 1: Open or Create the Config File
For a global config that works across all your projects:
mkdir -p ~/.cursor
touch ~/.cursor/mcp.json
Open ~/.cursor/mcp.json in any text editor.
Step 2: Add the youtube-transcripts Configuration
{
  "mcpServers": {
    "youtube-transcripts": {
      "command": "npx",
      "args": ["-y", "youtube-transcripts-mcp"],
      "env": {
        "TRANSCRIBEYT_API_KEY": "your_api_key_here"
      }
    }
  }
}
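If mcp.json already lists other MCP servers, pasting the block above over the file would drop them. The following is a minimal sketch of merging the entry instead; the function names are illustrative, and it assumes the file is either empty (as `touch` leaves it) or valid JSON.

```python
import json
from pathlib import Path

SERVER_NAME = "youtube-transcripts"


def with_transcripts_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the youtube-transcripts entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))  # copy so the input is not mutated
    servers[SERVER_NAME] = {
        "command": "npx",
        "args": ["-y", "youtube-transcripts-mcp"],
        "env": {"TRANSCRIBEYT_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, api_key: str) -> None:
    """Merge the entry into the JSON file at `path`, creating the file if needed."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}  # an empty file is not valid JSON
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = with_transcripts_server(config, api_key)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

For the global Cursor config, this would be called as `update_config_file(Path.home() / ".cursor" / "mcp.json", "your_api_key_here")`.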
Step 3: Restart Cursor
Fully quit and reopen Cursor. The MCP server will be launched automatically on startup. You can verify it's active by opening the Cursor chat panel and asking: "What tools do you have available?"
Configuring youtube-transcripts-mcp in Windsurf
Windsurf reads its MCP configuration from ~/.codeium/windsurf/mcp_config.json — a global file that applies across all Windsurf sessions.
Step 1: Open or Create the Config File
mkdir -p ~/.codeium/windsurf
touch ~/.codeium/windsurf/mcp_config.json
Open the file in any text editor.
Step 2: Add the youtube-transcripts Configuration
{
  "mcpServers": {
    "youtube-transcripts": {
      "command": "npx",
      "args": ["-y", "youtube-transcripts-mcp"],
      "env": {
        "TRANSCRIBEYT_API_KEY": "your_api_key_here"
      }
    }
  }
}
Step 3: Restart Windsurf
Quit and reopen Windsurf. The youtube-transcripts tools will be available in the next session.
Side-by-Side Config Comparison
| Setting | Cursor | Windsurf |
|---|---|---|
| Config file path | ~/.cursor/mcp.json | ~/.codeium/windsurf/mcp_config.json |
| Config key | mcpServers | mcpServers |
| command value | "npx" | "npx" |
| args value | ["-y", "youtube-transcripts-mcp"] | ["-y", "youtube-transcripts-mcp"] |
| Auth env var | TRANSCRIBEYT_API_KEY | TRANSCRIBEYT_API_KEY |
| Restart needed? | Yes | Yes |
The JSON structure is identical between both editors — only the file path differs. This means a single API key from transcribeyt.com/mcp works across both tools without any additional configuration.
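Because the structure is the same in both files, one check can cover both editors when the tools fail to appear after a restart. The sketch below is a hypothetical sanity check, not a feature of either editor: it compares a parsed config against the values in the table and flags a key that is missing or still set to the placeholder.

```python
import json
from pathlib import Path

# Global config paths from the comparison table above.
CONFIG_PATHS = {
    "Cursor": "~/.cursor/mcp.json",
    "Windsurf": "~/.codeium/windsurf/mcp_config.json",
}
EXPECTED_ARGS = ["-y", "youtube-transcripts-mcp"]


def config_problems(config: dict) -> list[str]:
    """List the ways `config` differs from the block shown in this guide."""
    entry = config.get("mcpServers", {}).get("youtube-transcripts")
    if entry is None:
        return ["no mcpServers['youtube-transcripts'] entry"]
    problems = []
    if entry.get("command") != "npx":
        problems.append("command should be 'npx'")
    if entry.get("args") != EXPECTED_ARGS:
        problems.append(f"args should be {EXPECTED_ARGS}")
    key = entry.get("env", {}).get("TRANSCRIBEYT_API_KEY", "")
    if not key or key == "your_api_key_here":
        problems.append("TRANSCRIBEYT_API_KEY is missing or still the placeholder")
    return problems


def check_all() -> dict[str, list[str]]:
    """Check each editor's global config file and collect any problems found."""
    results = {}
    for editor, raw_path in CONFIG_PATHS.items():
        path = Path(raw_path).expanduser()
        if not path.exists():
            results[editor] = [f"{path} not found"]
            continue
        try:
            config = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            results[editor] = [f"invalid JSON: {exc}"]
            continue
        results[editor] = config_problems(config)
    return results
```

An empty list for an editor means its file matches the guide; anything else names the setting to fix before restarting.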
4 Real Developer Use Cases
Use Case 1: Summarise a Tutorial Before You Start Coding
The situation: You're about to implement a new library you've never used. There's a 35-minute "getting started" video from the library author, but you want the key points without watching the whole thing.
The prompt:
"Summarise this tutorial video and list the main steps for getting started with the library: https://www.youtube.com/watch?v=VIDEO_ID"
What happens: Your AI editor calls get_transcript, reads the full tutorial, and returns a structured summary of the implementation steps — in under 10 seconds. You enter the coding session with full context, not a blank slate.
Best tool: get_transcript
Use Case 2: Find the Exact Timestamp of a Code Explanation
The situation: You watched a conference talk last week and remember the presenter explaining a clever approach to database connection pooling. You want to find exactly where in the video that explanation starts.
The prompt:
"In this conference talk, at what timestamp does the presenter explain their database connection pooling approach? https://www.youtube.com/watch?v=VIDEO_ID"
What happens: Your AI editor calls get_transcript_with_timestamps, scans all segments, and returns the matching timestamp (e.g., 00:14:32) along with a quote from the surrounding context. You can then jump straight to that moment in the video.
Best tool: get_transcript_with_timestamps
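Once you have a timestamp, you can turn it into a link that opens the video at that moment. The sketch below relies on YouTube's `t` query parameter (for example `t=872s`); the helper names are illustrative, and it accepts either HH:MM:SS or MM:SS input.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'HH:MM:SS', 'MM:SS', or 'SS' to a number of seconds."""
    parts = [int(part) for part in timestamp.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected timestamp format: {timestamp!r}")
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + part
    return seconds


def deep_link(video_url: str, timestamp: str) -> str:
    """Return `video_url` with a `t=<seconds>s` parameter, replacing any existing one."""
    scheme, netloc, path, query, fragment = urlsplit(video_url)
    params = [(key, value) for key, value in parse_qsl(query) if key != "t"]
    params.append(("t", f"{timestamp_to_seconds(timestamp)}s"))
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))
```

For example, `deep_link("https://www.youtube.com/watch?v=VIDEO_ID", "14:32")` produces `https://www.youtube.com/watch?v=VIDEO_ID&t=872s`.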
Use Case 3: Search for When a Package Is Mentioned in a Conference Talk
The situation: You're evaluating whether to adopt a particular package. You know it was mentioned in a long conference keynote, but you don't want to watch 90 minutes to find the three mentions.
The prompt:
"In this keynote recording, find every moment where 'Prisma' is mentioned and give me the timestamp and surrounding context for each: https://www.youtube.com/watch?v=VIDEO_ID"
What happens: Your AI editor calls get_transcript_with_timestamps and scans the full transcript for every occurrence of "Prisma," returning a list of timestamps and surrounding sentences. You get a complete picture of how the package was discussed — without watching anything.
Best tool: get_transcript_with_timestamps
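The scan the assistant performs here is conceptually a keyword search over timestamped segments. The sketch below illustrates the idea under a loud assumption: the `Segment` shape (a timestamp string plus text) is hypothetical, since this guide does not specify the tool's actual output format, and in practice the model does this reasoning itself rather than running code.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical transcript segment; the real tool output may be shaped differently."""
    timestamp: str  # e.g. "00:12:40"
    text: str


def find_mentions(
    segments: list[Segment], term: str, context: int = 1
) -> list[tuple[str, str]]:
    """Return (timestamp, surrounding text) for each segment that contains `term`.

    Matching is a case-insensitive substring test, so 'Prisma' also matches
    'prismatic'; `context` is the number of neighbouring segments to include
    on each side of the hit.
    """
    needle = term.casefold()
    hits = []
    for index, segment in enumerate(segments):
        if needle in segment.text.casefold():
            window = segments[max(0, index - context): index + context + 1]
            hits.append((segment.timestamp, " ".join(s.text for s in window)))
    return hits
```

Each hit pairs the timestamp of the matching segment with its neighbours' text, which mirrors the "timestamp and surrounding context" the prompt asks for.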
Use Case 4: Build a Research Agent That Pulls Knowledge from Multiple Videos
The situation: You're architecting a new system and want to cross-reference three different expert talks on the subject before making decisions.
The prompt:
"Read these three conference talks and summarise the key architectural patterns each speaker recommends. Compare their approaches and highlight where they agree or disagree:
- https://www.youtube.com/watch?v=VIDEO_ID_1
- https://www.youtube.com/watch?v=VIDEO_ID_2
- https://www.youtube.com/watch?v=VIDEO_ID_3"
What happens: Your AI editor calls get_transcript three times in sequence, reads all three transcripts, synthesises the content, and produces a comparative analysis — turning 3+ hours of video into an actionable architectural brief.
Best tool: get_transcript
Use Case Summary Table
| Use Case | Tool Used | Time Saved |
|---|---|---|
| Summarise tutorial before coding | get_transcript | ~30 min per video |
| Find timestamp of specific explanation | get_transcript_with_timestamps | Scrubbing through entire video |
| Search for package mentions in long talk | get_transcript_with_timestamps | Watching full recording |
| Multi-video research synthesis | get_transcript | 3+ hours of viewing |
Example Agent Prompts for Your Workflow
Copy and adapt these prompts for your own use:
Daily learning:
"Give me the top 5 actionable tips from this video and format them as a numbered list: [URL]"
Code review prep:
"Does the presenter in this video mention any gotchas or common mistakes with [technology]? Give me the relevant quotes and timestamps: [URL]"
Documentation extraction:
"Extract every code example the presenter discusses in this video, with the timestamp where each appears: [URL]"
Competitive research:
"How does the presenter in this video describe the tradeoffs between [Option A] and [Option B]? Summarise their position: [URL]"
Why Use youtube-transcripts-mcp Instead of a Generic Web Browser Tool?
Some AI editors support a generic web-browsing tool that can visit any URL. YouTube video pages, however, don't render transcript content in the HTML that a browser tool sees — the transcript is loaded dynamically and gated behind the caption interface. A generic browser tool returns the page shell, not the spoken content.
youtube-transcripts-mcp bypasses this entirely by calling the TranscribeYT API directly — a purpose-built endpoint that retrieves clean transcript data, regardless of how YouTube renders its UI.
Get Started in Five Minutes
Both configurations take under five minutes to set up, and the API key you get from transcribeyt.com/mcp works across Cursor, Windsurf, and Claude Desktop simultaneously.
Get your API key at TranscribeYT.com and start fetching transcripts in minutes.