📥 Transform Google Drive Documents into Vector Embeddings
Last edited 10 days ago
Automatically convert documents from Google Drive into vector embeddings using OpenAI, LangChain, and PGVector — fully automated through n8n.
⚙️ What It Does
This workflow monitors a Google Drive folder for new files, supports multiple file types (PDF, TXT, JSON), and processes them into vector embeddings using OpenAI’s text-embedding-3-small model. These embeddings are stored in a Postgres database using the PGVector extension, making them query-ready for semantic search or RAG-based AI agents.
After successful processing, files are moved to a separate “vectorized” folder to avoid duplication.
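To make the splitting step more concrete, here is a minimal Python sketch of overlapping chunking, which is roughly what happens to a document before each piece is embedded. It is a simplified stand-in, not the LangChain Text Splitter itself, and the chunk_size and chunk_overlap values are assumptions, since the workflow's actual splitter settings are not stated here.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: chunk_size and chunk_overlap are assumed values,
    and real splitters usually prefer to break on separators such as
    paragraph or sentence boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(chunk)
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still appears intact in at least one chunk.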
💡 Use Cases
- Powering Retrieval-Augmented Generation (RAG) AI agents
- Semantic search across private documents
- AI assistant knowledge ingestion
- Automated document pipelines for indexing or classification
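For the semantic search and RAG use cases, retrieval comes down to ranking stored chunks by how close their embeddings are to the embedding of the query. The sketch below shows that idea in plain Python with toy two-dimensional vectors. In a real setup the ranking would typically run inside Postgres (pgvector provides a cosine distance operator, `<=>`), and the function names and sample rows here are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k: int = 3):
    """Return the k (score, text) pairs most similar to query_vec.

    rows is an iterable of (text, embedding) pairs, a stand-in for
    what would be read from the vector table.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy data: real embeddings have far more dimensions.
    rows = [
        ("invoice policy", [1.0, 0.0]),
        ("holiday schedule", [0.0, 1.0]),
        ("billing FAQ", [1.0, 1.0]),
    ]
    for score, text in top_k([1.0, 0.0], rows, k=2):
        print(round(score, 3), text)
```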
🧠 Workflow Highlights
- Trigger Options: Manual or Scheduled (3 AM daily by default)
- Supported File Types: PDF, TXT, JSON
- Embedding Stack: LangChain Text Splitter, OpenAI Embeddings, PGVector
- Deduplication: Files are moved after processing
- License: CC BY-SA 4.0
- Author: AlexK1919
🛠 What You’ll Need
- Google Drive OAuth2 credentials (connected to the Search Folder, Download File, and Move File nodes)
- OpenAI API Key (used in the Embeddings OpenAI node)
- Postgres + PGVector database (connected in the Postgres PGVector Store node)
🔧 Step-by-Step Setup Instructions
- Create Google OAuth2 credentials in n8n and connect them to all Google Drive nodes.
- Set your source folder ID in the Search Folder node. This is where incoming files are placed.
- Set your processed folder ID in the Move File node. Files will be moved here after vectorization.
- Ensure you have a PGVector-enabled Postgres instance and input the table name and collection in the Postgres PGVector Store node.
- Add your OpenAI credentials to the Embeddings OpenAI node and select text-embedding-3-small.
- Optional: Activate the Schedule Trigger node to run daily or configure your own schedule.
- Run manually by triggering When clicking ‘Test workflow’ for on-demand ingestion.
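If you are unsure what "PGVector-enabled" means in practice, the sketch below prints the kind of SQL you might run against your Postgres instance before the first ingestion. The CREATE EXTENSION statement is the standard way to enable pgvector. The table layout, the column names, and the documents_demo table name are assumptions for illustration only: the Postgres PGVector Store node may create and manage its own table, so check its settings before creating one by hand. The 1536 dimension is the default output size of text-embedding-3-small.

```python
EMBEDDING_DIM = 1536  # default vector size of text-embedding-3-small


def pgvector_setup_sql(table: str = "documents_demo") -> list:
    """Build illustrative SQL for a pgvector-backed table.

    The table name and columns are hypothetical; they are not taken
    from the workflow itself.
    """
    # Conservative guard so the name can be placed into SQL text safely.
    if not table.isidentifier():
        raise ValueError("table name must be a simple identifier")
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        (
            f"CREATE TABLE IF NOT EXISTS {table} ("
            "id bigserial PRIMARY KEY, "
            "text text, "
            "metadata jsonb, "
            f"embedding vector({EMBEDDING_DIM}));"
        ),
    ]


if __name__ == "__main__":
    for statement in pgvector_setup_sql():
        print(statement)
```

If you later switch to an embedding model with a different output size, the vector dimension in the table has to match it.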
🧩 Customization Tips
Want to support more file types or enhance the pipeline?
- Add new extractors: Use Extract from File with other formats like DOCX, Markdown, or HTML.
- Refine logic by file type: The Switch node routes files to the correct extraction method based on MIME type (application/pdf, text/plain, application/json).
- Pre-process with OCR: Add an OCR step before extraction to handle scanned PDFs or images.
- Add filters: Enhance the Search Folder or Switch node logic to skip specific files or folders.
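The routing idea behind these tips can be sketched as a simple lookup from MIME type to extraction route. This is a Python illustration of the logic, not the Switch node's actual configuration. The first three entries mirror the MIME types listed above, while the Markdown, HTML, and DOCX entries and all route names are hypothetical additions you would pair with matching extractors.

```python
from typing import Optional

# MIME type -> extraction route. Route names are made up for this sketch.
ROUTES = {
    "application/pdf": "pdf",
    "text/plain": "text",
    "application/json": "json",
    # Hypothetical additions, not part of the original workflow:
    "text/markdown": "text",
    "text/html": "html",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "docx",
}


def route_for(mime_type: str) -> Optional[str]:
    """Return the extraction route for a MIME type, or None to skip the file."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    normalized = mime_type.split(";", 1)[0].strip().lower()
    return ROUTES.get(normalized)


if __name__ == "__main__":
    for mime in ("application/pdf", "Text/Plain; charset=utf-8", "image/png"):
        print(mime, "->", route_for(mime))
```

Returning None for unknown types acts as the filter from the last tip: unsupported files are skipped instead of failing the run.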
📄 License
Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Use, adapt, and share - even commercially - as long as you give proper credit and share alike.