Narrating over a Video using Multimodal AI

Created by

Jimleuk


This n8n template takes a video, extracts frames from it, and uses those frames with a multimodal LLM to generate a narration script. The script is then sent to OpenAI's text-to-speech API to generate a voiceover clip.

This template was inspired by the OpenAI Cookbook example "Processing and narrating a video with GPT's visual capabilities and the TTS API".

How it works

  • The video is downloaded using the HTTP Request node.
  • A Python Code node extracts frames from the video using OpenCV (see the first sketch after this list).
  • A Loop node batches the frames so the multimodal LLM can generate a partial script for each batch (second sketch).
  • The partial scripts are combined into the full script, which is sent to OpenAI's text-to-speech API to generate the audio (third sketch).
  • The finished voiceover clip is uploaded to Google Drive.
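
A minimal sketch of the frame-extraction step, assuming the video has already been saved to disk and that opencv-python is available to the Code node. The sampling interval and function name are illustrative, not the template's exact code:

```python
import base64

import cv2  # opencv-python


def extract_frames(video_path: str, every_n_frames: int = 30) -> list[str]:
    """Sample one frame every `every_n_frames` frames and return them as
    base64-encoded JPEGs, ready to pass to a vision-capable LLM."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of video
        if index % every_n_frames == 0:
            encoded, buffer = cv2.imencode(".jpg", frame)
            if encoded:
                frames.append(base64.b64encode(buffer.tobytes()).decode("utf-8"))
        index += 1
    capture.release()
    return frames
```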
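
Each batch of frames can then be sent to a vision-capable model for a partial script. A hedged sketch using the openai Python SDK; the model name, prompt wording, and helper name are assumptions rather than the template's exact configuration:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def narrate_batch(frames_b64: list[str], script_so_far: str = "") -> str:
    """Ask a multimodal model to continue the voiceover script for one
    batch of frames, staying consistent with earlier batches."""
    content = [{
        "type": "text",
        "text": ("These are frames from a video. Continue the voiceover "
                 "script in the same tone. Script so far:\n" + script_so_far),
    }]
    for b64 in frames_b64:
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model works here
        messages=[{"role": "user", "content": content}],
    )
    return response.choices[0].message.content
```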
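
After the loop finishes, the partial scripts are concatenated and sent to the text-to-speech endpoint. A sketch of that final step, assuming the tts-1 model and alloy voice (both placeholders for whatever the workflow is configured with):

```python
from openai import OpenAI

client = OpenAI()


def synthesize_voiceover(script: str, out_path: str = "voiceover.mp3") -> str:
    """Turn the combined script into a single voiceover audio file."""
    response = client.audio.speech.create(
        model="tts-1",  # placeholder model name
        voice="alloy",  # placeholder voice
        input=script,
    )
    with open(out_path, "wb") as f:
        f.write(response.content)  # raw audio bytes
    return out_path
```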

Sample the finished product here: https://drive.google.com/file/d/1-XCoii0leGB2MffBMPpCZoxboVyeyeIX/view?usp=sharing

Requirements

  • OpenAI account for the LLM and text-to-speech steps
  • Ideally, a mid-range machine (16GB RAM) for acceptable frame-extraction performance!

Customising this workflow

  • For larger videos, consider splitting them into smaller clips for better performance (see the ffmpeg sketch after this list).
  • Use a multimodal LLM that fully supports video input, such as Google's Gemini.
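
One way to do the splitting outside n8n, assuming ffmpeg is installed on the host; the segment length and output filename pattern are arbitrary choices:

```python
import subprocess


def split_video(video_path: str, segment_seconds: int = 60) -> None:
    """Split a long video into fixed-length clips that can be fed
    through the workflow one at a time."""
    subprocess.run(
        [
            "ffmpeg", "-i", video_path,
            "-c", "copy",  # stream copy: fast, no re-encoding
            "-f", "segment",
            "-segment_time", str(segment_seconds),
            "-reset_timestamps", "1",
            "clip_%03d.mp4",
        ],
        check=True,
    )
```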

New to n8n?

Need help building n8n workflows? Process automation for you or your company will save you time and money, and getting started with n8n is completely free!