A Model Context Protocol (MCP) server that provides comprehensive video tools: transcript retrieval, video downloading, automatic subtitle generation, and direct audio transcription. Works with YouTube, Bilibili, Vimeo, and any platform supported by yt-dlp.
audio_url fetch (allowlisted), small audio_base64, chunked uploads, optional async jobs, server-side Opus compression, structured JSON results

| Tool | Description |
|---|---|
| get-transcript | Retrieve existing transcripts from video platforms |
| list-transcript-languages | List available transcript languages for a video |
| download-video | Download videos to local storage |
| list-downloads | List downloaded video files |
| generate-subtitles | Generate subtitles using AI speech-to-text |
| transcribe-audio | Transcribe client-provided audio (URL / base64 / path / resource URI) |
| transcribe_upload_start | Start chunked upload for large audio payloads |
| transcribe_upload_append | Append one base64 chunk to an upload session |
| transcribe_upload_finalize | Finish upload and run transcription |
| transcribe_get_job | Poll async transcription jobs |
| transcribe_cancel_job | Cancel an async transcription job |
libopus)
yt-dlp (required):
# Using Homebrew (macOS)
brew install yt-dlp
# Using pip
pip install yt-dlp
ffmpeg (required for subtitle generation):
# Using Homebrew (macOS)
brew install ffmpeg
# Using apt (Ubuntu/Debian)
sudo apt install ffmpeg
Local Whisper (optional, for local subtitle generation):
pip install openai-whisper
# From source
git clone <repository-url>
cd transcript-mcp
npm install
npm run build
# Or install globally from npm
npm install -g transcript-mcp
Add the MCP server to your configuration file:
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
  "mcpServers": {
    "transcript-mcp": {
      "command": "node",
      "args": ["/path/to/transcript-mcp/dist/index.js"],
      "env": {
        "TRANSCRIPT_MCP_STORAGE_DIR": "/path/to/downloads",
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}
Cursor (~/.cursor/mcp.json):
{
  "mcpServers": {
    "transcript-mcp": {
      "command": "node",
      "args": ["/path/to/transcript-mcp/dist/index.js"],
      "env": {
        "TRANSCRIPT_MCP_STORAGE_DIR": "/path/to/downloads",
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}
| Variable | Description | Default |
|---|---|---|
| TRANSCRIPT_MCP_STORAGE_DIR | Default directory for downloaded videos | ~/.transcript-mcp/downloads |
| OPENAI_API_KEY | OpenAI API key for Whisper-based subtitle generation | None |
| TRANSCRIPT_MCP_WHISPER_ENGINE | Preferred whisper engine: openai, local, or auto | auto |
| VIDEO_TOOLKIT_STORAGE_DIR | Legacy alias for TRANSCRIPT_MCP_STORAGE_DIR | - |
| VIDEO_TOOLKIT_WHISPER_ENGINE | Legacy alias for TRANSCRIPT_MCP_WHISPER_ENGINE | - |
| WHISPER_BINARY_PATH | Path to local whisper binary | whisper |
| WHISPER_MODEL_PATH | Path to whisper model (for local whisper) | Auto-download |
| YT_DLP_PATH | Path to yt-dlp binary | yt-dlp |
| FFMPEG_PATH | Path to ffmpeg binary | ffmpeg |
| FFPROBE_PATH | Path to ffprobe binary | Derived from FFMPEG_PATH |
| TRANSCRIPT_MCP_URL_ALLOWLIST | Comma-separated host patterns allowed for audio_url (e.g. *.amazonaws.com,localhost). Empty disables all audio_url fetches | empty |
| DEBUG | Enable debug logging | 0 |
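To make the defaults and legacy aliases in the table concrete, here is a minimal sketch of how a few of these variables could be resolved. It is not the project's config.ts. The names `resolveConfig`, `firstSet`, and `ToolkitConfig` are hypothetical, and two behaviours are assumptions: that the new variable name wins over its legacy alias, and that "Derived from FFMPEG_PATH" means swapping a trailing `ffmpeg` for `ffprobe`.

```typescript
// Sketch only: resolves a subset of the documented variables from an env map.
type Env = Record<string, string | undefined>;

interface ToolkitConfig {
  storageDir: string;
  whisperEngine: "openai" | "local" | "auto";
  ytDlpPath: string;
  ffmpegPath: string;
  ffprobePath: string;
  urlAllowlist: string[];
}

// Returns the first non-empty value among the given variable names.
function firstSet(env: Env, ...names: string[]): string | undefined {
  for (let i = 0; i < names.length; i++) {
    const value = env[names[i]];
    if (value !== undefined && value.trim() !== "") return value.trim();
  }
  return undefined;
}

function resolveConfig(env: Env, homeDir: string): ToolkitConfig {
  // Assumption: the TRANSCRIPT_MCP_* name takes precedence over the legacy alias.
  const engine = firstSet(env, "TRANSCRIPT_MCP_WHISPER_ENGINE", "VIDEO_TOOLKIT_WHISPER_ENGINE");
  const ffmpegPath = firstSet(env, "FFMPEG_PATH") ?? "ffmpeg";
  return {
    storageDir:
      firstSet(env, "TRANSCRIPT_MCP_STORAGE_DIR", "VIDEO_TOOLKIT_STORAGE_DIR") ??
      `${homeDir}/.transcript-mcp/downloads`,
    // Unknown or missing values fall back to the documented default, auto.
    whisperEngine: engine === "openai" || engine === "local" ? engine : "auto",
    ytDlpPath: firstSet(env, "YT_DLP_PATH") ?? "yt-dlp",
    ffmpegPath,
    // Assumption: "derived" means replacing a trailing "ffmpeg" with "ffprobe".
    ffprobePath:
      firstSet(env, "FFPROBE_PATH") ?? ffmpegPath.replace(/ffmpeg(\.exe)?$/, "ffprobe$1"),
    // An empty allowlist yields [], which disables audio_url fetches.
    urlAllowlist: (env["TRANSCRIPT_MCP_URL_ALLOWLIST"] ?? "")
      .split(",")
      .map((p) => p.trim())
      .filter((p) => p.length > 0),
  };
}
```

In a Node process this could be called as `resolveConfig(process.env, os.homedir())`.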
get-transcript
Retrieve existing transcripts from video platforms.
Parameters:
- url (required): Video URL
- lang (optional): Language code (e.g., 'en', 'es', 'zh')
- include_timestamps (optional): Include timestamps (default: true)
Example:
Get the transcript from https://www.youtube.com/watch?v=VIDEO_ID
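Behind a prompt like the one above, an MCP client sends a JSON-RPC `tools/call` request naming the tool and its arguments. The sketch below builds such a request for get-transcript from the documented parameters. The helper name `buildGetTranscriptCall` and the `ToolCallRequest` type are hypothetical, and most clients construct this message for you.

```typescript
// Sketch only: shape of a JSON-RPC 2.0 tools/call request for get-transcript.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildGetTranscriptCall(
  id: number,
  url: string,
  lang?: string,
  includeTimestamps: boolean = true
): ToolCallRequest {
  const args: Record<string, unknown> = { url, include_timestamps: includeTimestamps };
  // lang is optional, so it is only sent when the caller provides it.
  if (lang !== undefined) args["lang"] = lang;
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "get-transcript", arguments: args },
  };
}
```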
list-transcript-languages
List available transcript languages for a video.
Parameters:
- url (required): Video URL
Example:
What transcript languages are available for https://www.youtube.com/watch?v=VIDEO_ID?
download-video
Download a video to local storage.
Parameters:
- url (required): Video URL to download
- output_dir (optional): Custom output directory
- filename (optional): Custom filename
- format (optional): Video format - mp4, webm, mkv (default: mp4)
- quality (optional): Quality - best, 1080p, 720p, 480p, 360p, audio (default: best)
Example:
Download this video: https://www.youtube.com/watch?v=VIDEO_ID
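The format and quality options form two small closed sets with defaults, which a client could check before sending a request. The following is a sketch under that assumption. `normalizeDownloadArgs` and `isOneOf` are hypothetical helpers, and the server's own validation may differ.

```typescript
// Sketch only: applies the documented defaults and rejects unknown values.
const FORMATS = ["mp4", "webm", "mkv"] as const;
const QUALITIES = ["best", "1080p", "720p", "480p", "360p", "audio"] as const;
type VideoFormat = (typeof FORMATS)[number];
type VideoQuality = (typeof QUALITIES)[number];

interface DownloadInput {
  url: string;
  output_dir?: string;
  filename?: string;
  format?: string;
  quality?: string;
}

interface DownloadArgs {
  url: string;
  output_dir?: string;
  filename?: string;
  format: VideoFormat;
  quality: VideoQuality;
}

// Type guard: true when value is one of the allowed literals.
function isOneOf<T extends string>(allowed: readonly T[], value: string): value is T {
  return (allowed as readonly string[]).indexOf(value) !== -1;
}

function normalizeDownloadArgs(input: DownloadInput): DownloadArgs {
  const format: string = input.format ?? "mp4"; // documented default
  const quality: string = input.quality ?? "best"; // documented default
  if (!isOneOf(FORMATS, format)) throw new Error(`Unsupported format: ${format}`);
  if (!isOneOf(QUALITIES, quality)) throw new Error(`Unsupported quality: ${quality}`);
  return { ...input, format, quality };
}
```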
list-downloads
List all downloaded video files.
Parameters:
- directory (optional): Directory to list (default: storage directory)
Example:
List my downloaded videos
generate-subtitles
Generate subtitles for a local video file using AI speech-to-text.
Parameters:
- video_path (required): Absolute path to the video file
- engine (optional): openai or local (default: auto-detect)
- language (optional): Language code for transcription
- output_format (optional): srt or vtt (default: srt)
Example:
Generate subtitles for /path/to/video.mp4
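The two output_format values differ mainly in timestamp punctuation and framing: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period. The sketch below renders the same cues both ways. The `Cue` type and the function names are hypothetical and are not taken from subtitle-generator.ts.

```typescript
// Sketch only: renders timed cues as SRT or WebVTT text.
type SubtitleFormat = "srt" | "vtt";

interface Cue {
  startMs: number;
  endMs: number;
  text: string;
}

// Left-pads a non-negative integer with zeros.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// HH:MM:SS,mmm for SRT and HH:MM:SS.mmm for WebVTT.
function formatTimestamp(ms: number, format: SubtitleFormat): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  const sep = format === "srt" ? "," : ".";
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}${sep}${pad(millis, 3)}`;
}

function renderCues(cues: Cue[], format: SubtitleFormat): string {
  const blocks = cues.map((cue, i) => {
    const timing = `${formatTimestamp(cue.startMs, format)} --> ${formatTimestamp(cue.endMs, format)}`;
    // SRT cues carry a 1-based sequence number; WebVTT cues do not need one.
    return format === "srt" ? `${i + 1}\n${timing}\n${cue.text}` : `${timing}\n${cue.text}`;
  });
  const body = blocks.join("\n\n") + "\n";
  return format === "vtt" ? `WEBVTT\n\n${body}` : body;
}
```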
transcribe-audio
Transcribes audio via Whisper. Prefer audio_url (the server fetches the bytes; configure TRANSCRIPT_MCP_URL_ALLOWLIST). Use audio_base64 only for small clips (about 60 KB raw per call; larger payloads should use chunked upload or a URL). audio_path and file:// only work when the MCP host shares a filesystem with the caller (often false in sandboxed clients).
By default the server re-encodes to Opus 16 kHz mono 16 kbps before Whisper. Set skip_compression: true if you already optimized the file.
Audio longer than 5 minutes (or when async: true) returns { job_id, status: "processing" }; poll transcribe_get_job.
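For readers who want to pre-compress audio themselves and pass skip_compression: true, the default re-encode described above (Opus, 16 kHz, mono, 16 kbps) corresponds roughly to the ffmpeg arguments below. This is a sketch rather than the server's actual command line. `buildOpusArgs` is a hypothetical helper, and it assumes an ffmpeg build that includes the libopus encoder.

```typescript
// Sketch only: ffmpeg arguments for a speech-oriented Opus re-encode.
function buildOpusArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y", // overwrite the output file if it exists
    "-i", inputPath,
    "-vn", // drop any video stream
    "-ac", "1", // mono
    "-ar", "16000", // 16 kHz sample rate
    "-c:a", "libopus", // Opus encoder
    "-b:a", "16k", // 16 kbps audio bitrate
    outputPath,
  ];
}
```

The resulting array can be passed to a process spawner together with the ffmpeg binary path, for example `spawn("ffmpeg", buildOpusArgs("in.m4a", "out.ogg"))` from `node:child_process`.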
Parameters (one required input):
- audio_url, audio_path, audio_base64, or audio_resource_uri (file:// / data:...;base64,...)
- filename (optional): Hint when magic-byte detection is inconclusive
- skip_compression (optional): Skip Opus recompression (default: false)
- engine (optional): openai, local, or auto (default: auto)
- language (optional): Language hint for transcription
- include_timestamps (optional): When as_text is true, include [MM:SS] lines (default: true)
- as_text (optional): If true, return plain transcript text; if false, return structured JSON (default: false)
- async (optional): Force async job (default: false)
Examples:
Transcribe this presigned URL (after allowlisting the host): audio_url=...
Transcribe this audio file on the MCP host: /path/to/interview.m4a
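Since audio_url only works for allowlisted hosts, it can help to see how patterns such as `*.amazonaws.com,localhost` might be matched. The sketch below is one plausible interpretation and not the server's implementation. `parseAllowlist` and `isHostAllowed` are hypothetical names, and it assumes that `*.` matches any subdomain but not the bare domain itself.

```typescript
// Sketch only: host allowlist matching with a leading "*." wildcard.
function parseAllowlist(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((p) => p.trim().toLowerCase())
    .filter((p) => p.length > 0);
}

function isHostAllowed(hostname: string, patterns: string[]): boolean {
  const host = hostname.toLowerCase();
  return patterns.some((pattern) => {
    if (pattern.indexOf("*.") === 0) {
      // "*.example.com" becomes the suffix ".example.com".
      const suffix = pattern.slice(1);
      return host.length > suffix.length && host.slice(-suffix.length) === suffix;
    }
    return host === pattern; // exact match for plain hosts
  });
}
```

With an empty allowlist, `isHostAllowed` returns false for every host, which matches the documented behaviour that an empty value disables all audio_url fetches.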
For large files, split the raw bytes into base64 chunks of at most max_chunk_bytes (~60KB) from transcribe_upload_start, call transcribe_upload_append for each index, then transcribe_upload_finalize. Abandoned uploads are garbage-collected after about an hour.
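The chunking step above can be sketched as follows: slice the raw bytes first, then base64-encode each slice, so that every chunk decodes on its own. The `UploadChunk` field names (`index`, `data_base64`) are hypothetical, so check the tool schema for the real argument names. `maxChunkBytes` stands for the max_chunk_bytes value returned by transcribe_upload_start.

```typescript
// Sketch only: split raw audio bytes into independently decodable base64 chunks.
interface UploadChunk {
  index: number; // hypothetical field name
  data_base64: string; // hypothetical field name
}

// Portable base64 encoding; in Node, Buffer.from(bytes).toString("base64") is equivalent.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function splitIntoChunks(bytes: Uint8Array, maxChunkBytes: number): UploadChunk[] {
  if (maxChunkBytes <= 0) throw new Error("maxChunkBytes must be positive");
  const chunks: UploadChunk[] = [];
  // The limit applies to raw bytes, so slicing happens before encoding.
  for (let offset = 0, index = 0; offset < bytes.length; offset += maxChunkBytes, index++) {
    const slice = bytes.subarray(offset, offset + maxChunkBytes);
    chunks.push({ index, data_base64: toBase64(slice) });
  }
  return chunks;
}
```

Each element would then be sent with one transcribe_upload_append call, in index order, before calling transcribe_upload_finalize.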
Use transcribe_get_job to poll and transcribe_cancel_job to cancel async jobs created by transcribe-audio (long audio or async: true).
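A client-side polling loop for these jobs might look like the sketch below. `callTool` is a hypothetical stand-in for whatever your MCP client exposes for invoking tools, and the `job_id` argument name is assumed from the response shape shown earlier. The only status this README documents is "processing", so the loop treats any other status as final. Cancelling on timeout is a design choice of this sketch.

```typescript
// Sketch only: poll an async transcription job until it leaves "processing".
interface JobStatus {
  job_id: string;
  status: string;
  [key: string]: unknown;
}

// Hypothetical client hook: invokes an MCP tool and returns its parsed result.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<JobStatus>;

async function waitForJob(
  callTool: CallTool,
  jobId: string,
  intervalMs: number = 2000,
  maxPolls: number = 150
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const job = await callTool("transcribe_get_job", { job_id: jobId });
    // Any status other than "processing" is treated as terminal.
    if (job.status !== "processing") return job;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  // Give up: cancel the job so it does not keep running server-side.
  await callTool("transcribe_cancel_job", { job_id: jobId });
  throw new Error(`Job ${jobId} still processing after ${maxPolls} polls`);
}
```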
Whisper engines:
- OpenAI Whisper: set the OPENAI_API_KEY environment variable
- Local Whisper: pip install openai-whisper

The tool auto-detects which engine to use:
- If OPENAI_API_KEY is set, it uses OpenAI Whisper

For transcribe-audio, auto uses OpenAI first and falls back to local whisper when local whisper is available.
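The engine selection described above can be expressed as an ordered list of engines to try. This is an illustrative sketch, not the server's code. `enginesToTry` is a hypothetical name, and how the server detects a local whisper install is not covered here, so that check is passed in as a boolean.

```typescript
// Sketch only: turn an engine preference into an ordered list of candidates.
type EnginePreference = "openai" | "local" | "auto";
type Engine = "openai" | "local";

function enginesToTry(
  preference: EnginePreference,
  hasOpenAiKey: boolean,
  hasLocalWhisper: boolean
): Engine[] {
  // An explicit preference is honoured only if that engine is usable.
  if (preference === "openai") return hasOpenAiKey ? ["openai"] : [];
  if (preference === "local") return hasLocalWhisper ? ["local"] : [];
  // auto: OpenAI first when a key is set, then local whisper as the fallback.
  const order: Engine[] = [];
  if (hasOpenAiKey) order.push("openai");
  if (hasLocalWhisper) order.push("local");
  return order;
}
```

An empty result means neither engine is available, which is the case the setup instructions for OPENAI_API_KEY and local Whisper address.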
1. Download this video: https://www.youtube.com/watch?v=VIDEO_ID
2. Generate subtitles for the downloaded file
Get the transcript from https://www.youtube.com/watch?v=VIDEO_ID and summarize the key points
1. Download the video: https://vimeo.com/123456789
2. Generate English subtitles for it
Any platform supported by yt-dlp, including YouTube, Bilibili, and Vimeo.
Full list: https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md
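Because the supported set is whatever yt-dlp handles, a client rarely needs to know the platform. If a label is useful for logging or UI, a hostname check like the sketch below is one way to get it. It is not the project's url-detector.ts. `detectPlatform` and `hostMatches` are hypothetical, and only a few well-known domains are covered.

```typescript
// Sketch only: label a video URL by hostname; anything unknown is "other".
type Platform = "youtube" | "bilibili" | "vimeo" | "other";

// True when host equals the domain or is one of its subdomains.
function hostMatches(host: string, domain: string): boolean {
  return host === domain || host.slice(-(domain.length + 1)) === "." + domain;
}

function detectPlatform(videoUrl: string): Platform {
  let host = "";
  try {
    host = new URL(videoUrl).hostname.toLowerCase();
  } catch {
    return "other"; // not a parseable absolute URL
  }
  if (hostMatches(host, "youtube.com") || hostMatches(host, "youtu.be")) return "youtube";
  if (hostMatches(host, "bilibili.com")) return "bilibili";
  if (hostMatches(host, "vimeo.com")) return "vimeo";
  return "other";
}
```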
transcript-mcp/
├── src/
│ ├── index.ts # Main MCP server entry point
│ ├── transcript-fetcher.ts # Transcript fetching using yt-dlp
│ ├── video-downloader.ts # Video download functionality
│ ├── subtitle-generator.ts # AI-powered subtitle generation
│ ├── config.ts # Configuration management
│ ├── url-detector.ts # Platform detection from URLs
│ ├── parser.ts # Transcript parsing (SRT, VTT, JSON)
│ └── errors.ts # Custom error classes
├── test/
│ └── transcript.test.ts # Unit tests
├── dist/ # Compiled JavaScript (after build)
└── package.json
# Build
npm run build
# Test
npm test
# Development mode
npm run dev
brew install yt-dlp
# or
pip install yt-dlp
brew install ffmpeg
Either:
- Set the OPENAI_API_KEY environment variable, or
- Install local Whisper: pip install openai-whisper

License: MIT