Automate loading files (PDF, JSON, text) from Google Drive into a PGVector database using n8n. This powerful n8n workflow creates a RAG knowledge base using OpenAI embeddings.
Download this n8n workflow template and start using it instantly.
This is the ideal n8n template for:
AI/ML Engineers: Who need reliable, scheduled data ingestion into a vector database.
Data Architects: Looking to build or maintain a Retrieval-Augmented Generation (RAG) system knowledge base.
Automation Specialists: Seeking to integrate cloud storage file operations with advanced vector processing using an n8n workflow.
Content Managers: Who frequently upload new documentation (PDFs, text files) that must be available for AI queries after the next scheduled or manual run.
Building a robust knowledge base for AI applications requires reliable ingestion pipelines. This specialized n8n workflow solves the problem of manually handling and converting diverse document types (PDF, text, JSON) into queryable vector embeddings.
This n8n template combines LangChain n8n nodes with standard n8n file operation nodes. Because it runs on a schedule trigger, your Postgres PGVector store is refreshed daily with the latest organizational knowledge from Google Drive. After a file is vectorized, a cleanup step moves it out of the input folder so that the next run does not process it again. The result is an end-to-end pipeline that turns raw documents into embedded, queryable data within the n8n platform.
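The cleanup step is what keeps repeated runs from ingesting the same file twice. The idea can be sketched in plain Python, outside n8n. Everything here is illustrative: Folder is an in-memory stand-in for a Google Drive folder, run_once and the ingest callback are hypothetical names, and leaving a failed file in the input folder for the next run is an assumption about the desired behavior rather than something the template states.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Folder:
    """In-memory stand-in for a Google Drive folder (hypothetical)."""
    files: dict[str, bytes] = field(default_factory=dict)


def run_once(inbox: Folder, archive: Folder,
             ingest: Callable[[str, bytes], None]) -> list[str]:
    """Ingest every file in the inbox; archive a file only after it succeeds."""
    processed = []
    for name in list(inbox.files):  # copy the keys, the dict is mutated below
        try:
            ingest(name, inbox.files[name])  # extract, chunk, embed, store
        except Exception:
            continue  # assumption: a failed file stays put for the next run
        archive.files[name] = inbox.files.pop(name)  # the "Move File" step
        processed.append(name)
    return processed


inbox = Folder({"guide.pdf": b"%PDF", "notes.txt": b"hello"})
archive = Folder()
stored: list[str] = []

first = run_once(inbox, archive, lambda name, data: stored.append(name))
second = run_once(inbox, archive, lambda name, data: stored.append(name))
print(first, second)
```

Because the second run finds an empty input folder, nothing is embedded twice, which is the same effect the Move File step has in the workflow.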
This comprehensive n8n workflow operates primarily on a scheduled basis, ensuring continuous data loading:
The workflow starts from the Schedule Trigger n8n trigger (set for 3 AM daily) or manually via the When clicking ‘Test workflow’ n8n trigger.
The Search Folder Google Drive n8n node scans a designated input folder for files that need processing.
The Loop Over Items n8n node processes each file individually. The Download File Google Drive n8n node fetches the binary content.
The Switch n8n node inspects the file's MIME type (PDF, text, or JSON) and routes the binary data to the appropriate extraction pipeline.
The Extract from File n8n nodes handle the parsing, turning PDF documents, plain text, or JSON structures into standardized text documents.
The extracted text is chunked by the Recursive Character Text Splitter n8n node (using 50 characters of overlap) and then passed to the Embeddings OpenAI n8n node, which uses the text-embedding-3-small model to generate vectors.
The vectors are written by the Postgres PGVector Store n8n node, updating the n8nvectorswfs table and n8n_wfs collection.
The Move File Google Drive n8n node relocates the successfully vectorized file to a 'vectorized' archive folder, completing the n8n workflow cycle.
To deploy this n8n workflow template, follow these steps:
Connect Google Drive credentials to the Search Folder, Download File, and Move File n8n nodes. Ensure the account has access to both the source (input) and destination (vectorized) folders.
Connect OpenAI credentials to the Embeddings OpenAI n8n node. These are required to generate the vector embeddings.
Connect Postgres credentials to the Postgres PGVector Store n8n node. Verify the database name, table name (n8nvectorswfs), and collection name (n8n_wfs) match your required vector store schema.
Update the folder IDs in the Search Folder (input) and Move File (archive) n8n nodes to match your specific Google Drive directory structure.
This n8n workflow template utilizes several core and specialized n8n nodes:
Schedule Trigger / Manual Trigger: The initial n8n trigger points that activate the flow, allowing for scheduled or immediate execution.
Google Drive Nodes (Search Folder, Download File, Move File): These are essential n8n nodes for handling cloud storage operations—locating source files, downloading content, and archiving processed files. The Search Folder uses a specific folder ID to filter content.
Switch n8n Node: This critical flow control n8n node inspects the MIME type of the downloaded file (application/pdf, text/plain, application/json) and determines the appropriate extraction path.
Extract from File n8n Nodes (PDF, Text, JSON): These nodes preprocess binary data into usable text based on the detected file type.
Embeddings OpenAI n8n Node: Connects to OpenAI to generate vectors. Key configuration uses the text-embedding-3-small model for efficiency and quality.
Recursive Character Text Splitter n8n Node: A LangChain n8n node that manages chunking of the documents, configured with a chunkOverlap of 50 to maintain context during vector generation.
Postgres PGVector Store n8n Node: Writes the generated vectors to Postgres. It runs in insert mode and targets the table n8nvectorswfs with the collection name n8n_wfs.
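To make the Switch and text splitter behavior above more concrete, here is a minimal Python sketch of the same two ideas outside n8n: routing by MIME type, then chunking with a 50-character overlap. It is a simplification under stated assumptions: extract_text and split_with_overlap are hypothetical helper names, the chunk size of 200 is an illustrative value (the template description only gives the overlap), PDF parsing is left out, and the splitter is a plain sliding window rather than the recursive, separator-aware algorithm the LangChain node uses.

```python
from __future__ import annotations

import json


def extract_text(mime_type: str, data: bytes) -> str:
    """Route by MIME type, as the Switch node does, and return plain text."""
    if mime_type == "text/plain":
        return data.decode("utf-8")
    if mime_type == "application/json":
        # Re-serialize so the text is normalized regardless of input formatting.
        return json.dumps(json.loads(data), ensure_ascii=False)
    if mime_type == "application/pdf":
        raise NotImplementedError("PDF parsing needs a dedicated library")
    raise ValueError(f"unsupported MIME type: {mime_type}")


def split_with_overlap(text: str, chunk_size: int = 200,
                       overlap: int = 50) -> list[str]:
    """Sliding-window chunking; consecutive chunks share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and below chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(450))
chunks = split_with_overlap(sample)
print(len(chunks), [len(c) for c in chunks])
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a chunk boundary still appears intact in at least one embedding.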









































