Tag: speech-to-text
Local Speech-to-Text Transcription with Whisper
This skill uses the Whisper CLI to transcribe audio in various formats locally. Audio files are processed without external API keys or network connectivity.
Local Audio Transcription via Whisper CLI
Performs local speech-to-text transcription using the Whisper command-line interface. The skill supports various audio formats and a choice of model sizes to trade speed against accuracy, and it requires no API key.
OpenAI Whisper Audio Transcription API
Transcribes audio files via the OpenAI Whisper API endpoint. The implementation supports custom base URLs for compatible proxies and accepts optional language hints and prompts.
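To make the custom base URL and the optional language hint and prompt concrete, here is a minimal sketch that builds, but does not send, a multipart request for OpenAI's `/audio/transcriptions` endpoint using only the standard library. The helper name `build_transcription_request`, the proxy URL, the file name, and the placeholder key are all hypothetical, and this is not necessarily how the skill itself is implemented.

```python
import urllib.request
import uuid


def build_transcription_request(audio_bytes, filename, api_key,
                                base_url="https://api.openai.com/v1",
                                model="whisper-1", language=None, prompt=None):
    """Build (but do not send) a multipart POST for the transcriptions endpoint."""
    # Optional fields are only included when the caller supplies them.
    fields = {"model": model}
    if language:
        fields["language"] = language
    if prompt:
        fields["prompt"] = prompt

    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The audio payload goes in the "file" form field.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
    )
    parts.append(audio_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        # A compatible proxy only changes the base URL; the path stays the same.
        url=base_url.rstrip("/") + "/audio/transcriptions",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Hypothetical proxy URL and placeholder key, for illustration only.
req = build_transcription_request(b"fake-audio", "clip.wav", api_key="sk-placeholder",
                                  base_url="https://proxy.example.com/v1/", language="en")
print(req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the URL and form fields easy to inspect when pointing at a proxy.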
Voice-enabled conversational agent interface
This skill facilitates natural voice interactions by integrating Speech-to-Text and Text-to-Speech capabilities via the `voicemode:converse` MCP tool. It supports advanced conversational patterns, including parallel execution of speech and …
AI Audio Transcription using Whisper AI
This tool transcribes audio in various formats (mp3, wav, etc.) into text using Whisper AI through a paid API endpoint. It supports automatic language detection and returns per-segment breakdowns with timestamps.
Local Speech-to-Text Transcription using Whisper
This tool provides local, offline speech-to-text transcription using the Whisper CLI. It supports various audio formats, translation, and a choice of output format.
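As a rough illustration of the kind of invocation a local Whisper tool wraps, the sketch below assembles a command line for the open-source `openai-whisper` CLI, selecting a model size, the translation task, and an output format. It assumes that package is installed and puts a `whisper` executable on the PATH; the helper name `build_whisper_command` and the file name `meeting.mp3` are hypothetical.

```python
import shlex


def build_whisper_command(audio_path, model="base", task="transcribe",
                          output_format="txt", output_dir=".", language=None):
    """Assemble an argument list for the openai-whisper CLI (assumed installed)."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task}")
    cmd = [
        "whisper", str(audio_path),
        "--model", model,                  # smaller models are faster, larger ones more accurate
        "--task", task,                    # "translate" produces English text
        "--output_format", output_format,  # e.g. txt, srt, vtt, json
        "--output_dir", str(output_dir),
    ]
    # Without --language, the CLI attempts to detect the spoken language itself.
    if language is not None:
        cmd += ["--language", language]
    return cmd


cmd = build_whisper_command("meeting.mp3", model="small",
                            task="translate", output_format="srt")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it. Building the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.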
OpenAI Audio Transcription API Wrapper
This skill provides a wrapper for transcribing audio files using various OpenAI models, including gpt-4o and whisper-1. It supports advanced features such as speaker diarization, language specification, and custom prompts.