Google Drive to PGVector Knowledge Base Builder - n8n Workflow

Automate loading files (PDF, JSON, text) from Google Drive into a PGVector database using n8n. The workflow builds a RAG knowledge base using OpenAI embeddings.

Who is this best for?

This n8n template is aimed at:

AI/ML Engineers who need reliable, scheduled data ingestion into a vector database.
Data Architects who build or maintain the knowledge base for a Retrieval-Augmented Generation (RAG) system.
Automation Specialists who want to combine cloud storage file operations with vector processing in an n8n workflow.
Content Managers who frequently upload new documentation (PDFs, text files) that must be available for AI queries after the next scheduled run.

Overview

AI applications need a reliable ingestion pipeline behind their knowledge base. This n8n workflow removes the manual work of handling and converting different document types (PDF, text, JSON) into queryable vector embeddings.

The template combines LangChain n8n nodes with standard n8n file operation nodes. Because it runs from a Schedule Trigger, the Postgres PGVector store is refreshed on each run with any new files found in the Google Drive input folder. After a file is vectorized, a cleanup step moves the source file out of the input folder, which prevents it from being processed again on the next run.
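
The "move after success" idea can be illustrated outside n8n. The Python sketch below is only an illustration under stated assumptions: two in-memory dictionaries stand in for the Google Drive input and 'vectorized' folders, and run_once and ingest are hypothetical names that do not appear in the workflow. It shows why a second run finds nothing left to process.

```python
from typing import Callable


def run_once(
    inbox: dict[str, bytes],
    archive: dict[str, bytes],
    ingest: Callable[[str, bytes], None],
) -> list[str]:
    """Ingest every file in `inbox`, then move it to `archive`.

    `inbox` and `archive` are stand-ins for the two Drive folders;
    `ingest` is a hypothetical callback for extract + chunk + embed + store.
    """
    processed = []
    # Copy the keys first because the loop removes entries from `inbox`.
    for name in list(inbox):
        ingest(name, inbox[name])
        # Reached only if `ingest` returned without raising, so a file
        # is archived only after it was handled.
        archive[name] = inbox.pop(name)
        processed.append(name)
    return processed


if __name__ == "__main__":
    inbox = {"handbook.pdf": b"%PDF-", "notes.txt": b"hello"}
    archive: dict[str, bytes] = {}
    seen: list[str] = []

    def record(name: str, data: bytes) -> None:
        seen.append(name)

    print(run_once(inbox, archive, record))  # first run handles both files
    print(run_once(inbox, archive, record))  # second run has nothing to do
```

Because the move happens after ingestion, a file that fails partway stays in the input folder in this sketch. How the actual workflow behaves on errors is not stated in the description.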

How it Works

The workflow normally runs on a schedule, so new files are loaded without manual intervention:

  1. Scheduled Start: The workflow starts from the Schedule Trigger node (set for 3 AM daily) or manually from the When clicking ‘Test workflow’ trigger.

  2. Identify Documents: The Search Folder Google Drive node scans a designated input folder for files that need processing.

  3. Iterate and Download: The Loop Over Items node processes each file individually. The Download File Google Drive node fetches the binary content.

  4. Type Routing: A Switch node inspects the file's MIME type (PDF, text, or JSON) and routes the binary data to the matching extraction branch.

  5. Content Extraction: Extract from File nodes, one per type, parse PDF documents, plain text, or JSON structures into plain text documents.

  6. Vectorization Pipeline: The extracted documents are chunked by the Recursive Character Text Splitter node (with 50 characters of overlap) and passed to the Embeddings OpenAI node, which uses the text-embedding-3-small model to generate vectors.

  7. Database Storage: The Postgres PGVector Store node inserts the resulting vectors and text chunks into the n8nvectorswfs table under the n8n_wfs collection.

  8. Cleanup: Finally, a Move File Google Drive node relocates the vectorized file to a 'vectorized' archive folder, which completes the cycle.
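
To make the 50-character overlap in step 6 concrete, here is a minimal Python sketch. It is a simplified fixed-window splitter, not the recursive algorithm the Recursive Character Text Splitter node implements, and the function name chunk_text and the default chunk_size of 1000 are assumptions, since the description gives only the overlap.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 50) -> list[str]:
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters, so a sentence
    cut at a boundary still appears with some context in the next chunk.
    `chunk_size` is an assumed value; only the overlap comes from the text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(250))
    parts = chunk_text(sample, chunk_size=100, chunk_overlap=50)
    # The tail of one chunk equals the head of the next.
    print(len(parts), parts[0][-50:] == parts[1][:50])
```

A larger overlap repeats more text across chunks, which increases the number of embeddings stored but lowers the chance that a relevant passage is split across two chunks.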

Installation Guide

To deploy this n8n workflow template, follow these steps:

  1. Import: Copy the provided JSON data and paste it directly into your n8n instance using the 'New' -> 'Import from JSON' function.

  2. Google Drive Setup: Configure the Google Drive credentials for the Search Folder, Download File, and Move File nodes. Ensure the account has access to both the source (input) and destination (vectorized) folders.

  3. OpenAI Credentials: Set up the OpenAI credential in the Embeddings OpenAI node. The node cannot generate embeddings without it.

  4. Postgres PGVector Setup: Configure the connection to your PostgreSQL database in the Postgres PGVector Store node. Verify that the database name, table name (n8nvectorswfs), and collection name (n8n_wfs) match your vector store schema.

  5. Folder IDs: Update the Google Drive folder IDs in the Search Folder (input) and Move File (archive) nodes to match your own Google Drive folders.

  6. Activation: Enable the workflow by toggling the 'Active' switch. The Schedule Trigger then runs it on its schedule.
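
For step 5, the folder ID is usually visible in the folder's browser URL. The Python sketch below assumes the common URL layout in which the ID is the path segment following "folders"; that layout and the helper name folder_id_from_url are assumptions for illustration, so check the result against what the n8n node picker shows.

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL.

    Assumes URLs shaped like .../drive/folders/<ID>?usp=sharing; any other
    layout raises ValueError instead of guessing.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" in segments:
        index = segments.index("folders")
        if index + 1 < len(segments):
            return segments[index + 1]
    raise ValueError(f"no folder ID found in URL: {url!r}")


if __name__ == "__main__":
    # The ID below is a made-up placeholder, not a real folder.
    print(folder_id_from_url("https://drive.google.com/drive/folders/abc123XYZ?usp=sharing"))
```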

Node Details

This n8n workflow template uses the following core and LangChain nodes:

Schedule Trigger / Manual Trigger: The entry points of the flow, allowing scheduled or immediate execution.
Google Drive Nodes (Search Folder, Download File, Move File): Handle the cloud storage operations: locating source files, downloading their content, and archiving processed files. Search Folder filters by a specific folder ID.
Switch Node: A flow control node that inspects the MIME type of the downloaded file (application/pdf, text/plain, application/json) and selects the matching extraction path.
Extract from File Nodes (PDF, Text, JSON): Convert the binary data into usable text according to the detected file type.
Embeddings OpenAI Node: Connects to OpenAI to generate vectors. It is configured with the text-embedding-3-small model.
Recursive Character Text Splitter Node: A LangChain node that chunks the documents, configured with a chunkOverlap of 50 so that adjacent chunks share some context.
Postgres PGVector Store Node: The destination database handler. It runs in insert mode and targets the table n8nvectorswfs with the collection name n8n_wfs.
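
The Switch and Extract from File nodes together amount to a dispatch on MIME type. The Python sketch below mirrors that routing with a dictionary. It is an illustration rather than the nodes' implementation: the function names are hypothetical, the inputs are assumed to be UTF-8, and the PDF branch is left as a stub because parsing PDFs needs a third-party library.

```python
import json
from typing import Callable


def extract_text(data: bytes) -> str:
    """Plain text: decode the bytes (UTF-8 is an assumption)."""
    return data.decode("utf-8")


def extract_json(data: bytes) -> str:
    """JSON: parse to validate, then serialize back to text."""
    return json.dumps(json.loads(data.decode("utf-8")), ensure_ascii=False)


def extract_pdf(data: bytes) -> str:
    """PDF: stub only; plug in a PDF parser of your choice here."""
    raise NotImplementedError("PDF extraction needs a third-party parser")


# One entry per MIME type named in the Switch node description.
ROUTES: dict[str, Callable[[bytes], str]] = {
    "application/pdf": extract_pdf,
    "text/plain": extract_text,
    "application/json": extract_json,
}


def route(mime_type: str, data: bytes) -> str:
    """Pick the extractor for `mime_type` and run it on `data`."""
    try:
        extractor = ROUTES[mime_type]
    except KeyError:
        raise ValueError(f"unsupported MIME type: {mime_type}") from None
    return extractor(data)


if __name__ == "__main__":
    print(route("text/plain", b"hello"))
    print(route("application/json", b'{"a": 1}'))
```

Rejecting unknown MIME types explicitly is a choice made for this sketch. What the workflow does with a file type outside the three listed is not covered in the description.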

Free

Nodes: 11
Updated: December 26, 2025
Created by

n8n Ambassador & Verified Partner
