Transcribe WhatsApp Audio Messages with Whisper AI via Groq
Last edited 58 days ago
WhatsApp Audio Transcriber Bot
Overview
Automatically transcribe WhatsApp audio messages to text using AI-powered speech recognition. This workflow receives audio messages via webhook, processes them through Groq's Whisper API, and replies with the transcribed text in the same conversation.
Use Cases
- Accessibility: Help users with hearing impairments access audio content
- Workplace Communication: Quickly scan audio messages in professional settings
- Language Learning: Get text versions of audio for better comprehension
- Meeting Notes: Convert voice messages to searchable text format
- Multilingual Support: Transcribe audio in Portuguese (configurable for other languages)
How it Works
- Message Reception: Webhook receives WhatsApp messages in real-time
- Audio Detection: Filters only audio messages using Switch node
- Format Conversion: Converts base64 audio to MP3 file format
- AI Transcription: Processes audio through Groq API with Whisper Large V3 model
- Response Delivery: Sends transcribed text back to the original conversation
Key Features
- ✅ Real-time Processing: Transcribes incoming audio messages as they arrive
- ✅ High Accuracy: Uses Whisper Large V3 model for reliable transcription
- ✅ Auto-Reply: Automatically responds in the same WhatsApp conversation
- ✅ Message Quoting: References the original audio message in the reply
- ✅ Portuguese Optimized: Configured for Brazilian Portuguese transcription
- ✅ Self-Message Filtering: Ignores messages sent by the bot itself
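The audio-detection and self-message checks can be sketched as a single predicate. This is only an illustration, not the workflow's actual Switch node configuration. The payload paths used here (data.messageType and data.key.fromMe) are assumptions about the shape of an Evolution API webhook event and should be checked against the payload your instance really sends.

```python
def should_transcribe(event: dict) -> bool:
    """Return True only for audio messages that the bot did not send itself."""
    data = event.get("data", {})
    # Assumed field: the message type label, "audioMessage" for voice notes.
    is_audio = data.get("messageType") == "audioMessage"
    # Assumed field: True when the message came from the bot's own number.
    from_me = bool(data.get("key", {}).get("fromMe", False))
    return is_audio and not from_me
```

Combining both checks in one place means a reply sent by the bot can never trigger another transcription.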
Prerequisites
Required Services
- Evolution API: WhatsApp integration service
- Groq API: AI transcription service (Whisper model)
- n8n Instance: Workflow automation platform
API Keys & Configuration
- Groq API key (set as environment variable: GROQ_API_KEY)
- Evolution API instance properly configured
- Webhook URL configured in Evolution API
Setup Instructions
- Import Workflow: Import the JSON workflow into your n8n instance
- Configure Environment: Set GROQ_API_KEY environment variable
- Setup Webhook: Configure Evolution API to send messages to the webhook endpoint
- Test Connection: Send a test audio message to verify the workflow
Workflow Nodes
- Webhook: Receives WhatsApp messages from Evolution API
- Edit Fields: Extracts relevant data (number, name, message, audio)
- Switch: Filters only audio messages (audioMessage type)
- Convert to File: Transforms base64 audio to MP3 format
- HTTP Request: Sends audio to Groq API for transcription
- Evolution API: Sends transcribed text back to WhatsApp
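Outside n8n, the Edit Fields and Convert to File steps amount to picking a few values out of the webhook payload and decoding the base64 audio into bytes. The sketch below shows that idea only. Every key path (data.key.remoteJid, data.key.id, data.pushName, data.message.base64) is an assumption about the payload shape, and the AudioMessage class is a hypothetical container, not part of the workflow.

```python
import base64
from dataclasses import dataclass


@dataclass
class AudioMessage:
    number: str      # sender's WhatsApp identifier
    name: str        # sender's display name
    message_id: str  # lets the reply quote the original audio message
    audio: bytes     # decoded audio payload


def extract_audio_message(event: dict) -> AudioMessage:
    """Pull out the fields later steps need and decode the base64 audio."""
    data = event["data"]
    # All key paths below are assumed, not taken from the workflow JSON.
    encoded = data["message"]["base64"]
    return AudioMessage(
        number=data["key"]["remoteJid"],
        name=data.get("pushName", ""),
        message_id=data["key"]["id"],
        # validate=True raises on non-base64 characters instead of skipping them.
        audio=base64.b64decode(encoded, validate=True),
    )
```

Decoding with validate=True makes a malformed payload fail early, before any request is sent to the transcription service.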
Configuration Options
Groq API Settings
- Model: whisper-large-v3
- Language: pt (Portuguese)
- Temperature: 0 (maximum accuracy)
- Response Format: json
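For reference, the settings above map onto a multipart form request to Groq's OpenAI-compatible transcription endpoint. The following is a sketch using only the Python standard library, not the HTTP Request node's actual configuration. The function names, the audio.mp3 filename, and the 30-second timeout are illustrative assumptions, as is reading the result from the "text" field of the JSON response.

```python
import json
import os
import urllib.request
import uuid

GROQ_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_transcription_request(
    audio: bytes, api_key: str, language: str = "pt"
) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the audio and the settings above."""
    boundary = uuid.uuid4().hex
    fields = {
        "model": "whisper-large-v3",
        "language": language,
        "temperature": "0",
        "response_format": "json",
    }
    body = bytearray()
    for name, value in fields.items():
        body += f"--{boundary}\r\n".encode()
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        body += f"{value}\r\n".encode()
    # File part; the filename is a placeholder chosen for this sketch.
    body += f"--{boundary}\r\n".encode()
    body += b'Content-Disposition: form-data; name="file"; filename="audio.mp3"\r\n'
    body += b"Content-Type: audio/mpeg\r\n\r\n"
    body += audio + b"\r\n"
    body += f"--{boundary}--\r\n".encode()
    return urllib.request.Request(
        GROQ_URL,
        data=bytes(body),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio: bytes) -> str:
    """Send the request; assumes GROQ_API_KEY is set and the reply has a 'text' field."""
    request = build_transcription_request(audio, os.environ["GROQ_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["text"]
```

Keeping request construction separate from sending makes it possible to inspect the form fields without calling the API.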
Customization Options
- Change language by modifying the language parameter
- Adjust temperature to control how deterministic the transcription is (0 is the most deterministic; higher values add randomness)
- Modify response format for different output styles
Response Format
*Mensagem transcrita automaticamente.*
[Transcribed text content]
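The reply is a fixed notice line (Portuguese for "Message transcribed automatically.", wrapped in asterisks so WhatsApp renders it bold) followed by the transcript. A minimal sketch of assembling that text is shown below. The function name is hypothetical, and the single newline between the two lines is an assumption, since the template above does not show whether a blank line separates them.

```python
REPLY_HEADER = "*Mensagem transcrita automaticamente.*"


def format_reply(transcript: str) -> str:
    """Prefix the transcript with the notice line shown above."""
    # Strip stray whitespace that transcription output may carry at the edges.
    return f"{REPLY_HEADER}\n{transcript.strip()}"
```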
Technical Specifications
- Input: Base64 encoded audio from WhatsApp
- Output: Plain text transcription
- Processing Time: Typically 2-5 seconds per audio message
- Supported Audio: MP3 format (converted from WhatsApp audio)
- Language: Portuguese (configurable)
Troubleshooting
- No Response: Check Groq API key and webhook configuration
- Poor Transcription: Ensure audio quality and check language settings
- Error Messages: Monitor n8n execution logs for detailed error information
Version History
- v0.0.1: Initial release with basic transcription functionality