

N8N node for self-hosted open-source GPT models: local orchestration of LLM agents.
An npm package that adds a dedicated N8N node for self-hosted open-source GPT models served by LM Studio, Ollama, vLLM, or Text Generation Inference (TGI). The goal is to orchestrate LLM agents in N8N workflows without depending on a cloud API, so that data stays 100% on-prem. The node exposes chat, completion, and embeddings operations behind an abstraction layer that supports several inference servers through a single configuration.
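To make the "single configuration" idea concrete, here is a minimal sketch of how one URL + model pair could be turned into a request for each of the three operations. It is not the package's actual code: the names `NodeConfig`, `Operation`, `ROUTES`, and `buildRequest` are hypothetical, and the sketch assumes every server is reached through OpenAI-style `/v1/...` routes. A real abstraction layer would need per-server handling wherever routes or payloads differ.

```typescript
// Hypothetical types for illustration; none of these names come from the package.
type Operation = "chat" | "completion" | "embeddings";

interface NodeConfig {
  baseUrl: string; // URL of the local inference server
  model: string;   // model identifier as the server knows it
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// Assumption: the server exposes OpenAI-style routes for all three operations.
const ROUTES: Record<Operation, string> = {
  chat: "/v1/chat/completions",
  completion: "/v1/completions",
  embeddings: "/v1/embeddings",
};

function buildRequest(config: NodeConfig, op: Operation, input: string): HttpRequest {
  // Drop trailing slashes so the base URL and route join cleanly.
  const base = config.baseUrl.replace(/\/+$/, "");
  const body: Record<string, unknown> = { model: config.model };

  if (op === "chat") {
    body.messages = [{ role: "user", content: input }];
  } else if (op === "completion") {
    body.prompt = input;
  } else {
    body.input = input;
  }

  return {
    url: base + ROUTES[op],
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
  };
}
```

With this shape, pointing a workflow at a different server only means changing `baseUrl` and `model`; the operation and the prompt stay the same.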
Until now, LLM agent orchestration in N8N has been monopolized by OpenAI. Organizations with GDPR or data-sovereignty constraints had no way to use self-hosted open-source models in their existing workflows.
An N8N node that abstracts over four inference servers (LM Studio, Ollama, vLLM, TGI): configuration by URL and model, with chat, completion, and embeddings operations. Streaming is supported for long responses. The node is built to drop into existing workflows without changing the graph structure: prompt as input, text or embedding as output, and standard N8N error handling.
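As an illustration of the streaming path, the sketch below collects the text fragments from a streamed chat response. It assumes the server emits OpenAI-style server-sent events (one `data: {json}` line per chunk, ending with `data: [DONE]`). The source does not say which wire format the node consumes, and `StreamChunk` and `extractDeltas` are hypothetical names.

```typescript
// Assumption: OpenAI-style streaming chunks; only the fields used here are typed.
interface StreamChunk {
  choices?: { delta?: { content?: string } }[];
}

function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];

  for (const rawLine of sseText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank separator lines and anything that is not a data field.
    if (!line.startsWith("data:")) continue;

    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel

    const chunk = JSON.parse(payload) as StreamChunk;
    const text = chunk.choices?.[0]?.delta?.content;
    if (text) deltas.push(text);
  }

  return deltas;
}
```

The sketch expects complete lines. A live stream can split a line across network reads, so a real implementation would buffer incoming bytes until a newline arrives before parsing.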
The metrics give a quick read of the case study's results.
A simple view of the functional blocks and how they interact.