If you have followed the recent articles in this series, you now have Ollama running locally with one or more models pulled, and probably a chat frontend like Open WebUI or AnythingLLM on top of it. Those tools are great for having conversations. But what if you want to build something with your local LLM — a customer support bot, a document Q&A endpoint, or an automated pipeline that fetches data, runs it through a model, and returns a structured result?
That is where Flowise comes in.
Flowise is an open-source, self-hosted tool for building LLM-powered applications using a visual drag-and-drop canvas. You connect nodes — an LLM node, a document loader, a memory buffer, a prompt template — and wire them together into a “chatflow.” The result is a working AI application with a REST API endpoint that you can embed in any product or call from any script. No Python, no LangChain boilerplate, no LLM framework knowledge required.
This tutorial walks you through installing Flowise on Ubuntu, connecting it to a local Ollama backend, building a basic chatbot flow, building a document RAG flow, running Flowise as a systemd service, and putting it behind an Nginx reverse proxy.
How Flowise Differs from the Other Tools
Open WebUI and LibreChat are chat interfaces — they present a conversation window and let a human interact with a model. AnythingLLM adds workspace-organized RAG on top of that.
Flowise is an application builder. The output of your work in Flowise is not a conversation you had — it is an API endpoint that someone else (or another system) can call. You are not the end user of the LLM; you are building the tool that an end user or an automation pipeline will call.
Think of it this way:
- Open WebUI / LibreChat — you type a message, the model responds
- AnythingLLM — your team queries documents through a chat UI
- Flowise — you build the AI feature that gets embedded in your internal tool, your website, or your Slack bot
Flowise is built on top of LangChain and LlamaIndex concepts, but exposes them visually. If you have heard of LangChain but find the Python SDK intimidating, Flowise is the approachable entry point.
Prerequisites
Before starting, make sure you have:
- Ubuntu 20.04, 22.04, or 24.04
- Ollama installed and running with at least one model pulled (check with systemctl status ollama)
- Node.js 18 or 20 installed (covered in Step 1)
- At least 2 GB of free RAM beyond what your models use
- At least 5 GB of free disk space
- A user with sudo privileges
Confirm Ollama is responding:
curl http://localhost:11434/api/tags
You should get a JSON list of your downloaded models. If not, start Ollama first:
sudo systemctl start ollama
Step 1: Install Node.js 20
Flowise requires Node.js 18 or newer. Depending on the release, Ubuntu’s default apt repository may ship an older version, so use the NodeSource repository instead.
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
Confirm the installation:
node --version
npm --version
You should see v20.x.x for Node.js, followed by the version of the npm release bundled with it. If you already have Node.js 18 or 20 installed, skip this step.
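If you are scripting the setup, you can check the version requirement in code instead of by eye. The following is a minimal sketch in POSIX shell. The function name node_version_ok is made up for this example, and the default minimum of 18 comes from the requirement stated above.

```shell
#!/bin/sh
# node_version_ok VERSION [MIN_MAJOR]
# Succeeds when VERSION (as printed by `node --version`, e.g. "v20.11.0")
# has a major version of at least MIN_MAJOR (default 18).
# Hypothetical helper: not part of Node.js or Flowise.
node_version_ok() {
    version=${1#v}            # strip the leading "v", if any
    major=${version%%.*}      # keep everything before the first dot
    min=${2:-18}
    case $major in
        ''|*[!0-9]*) return 2 ;;   # empty or non-numeric: report an error
    esac
    [ "$major" -ge "$min" ]
}
```

A provisioning script could then call node_version_ok "$(node --version)" and stop with a message when it fails.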
Step 2: Install Flowise
Flowise is published as an npm package. Install it globally so it is available system-wide:
sudo npm install -g flowise
This pulls down Flowise and all its dependencies. It takes a minute or two depending on your connection. Once it finishes, verify the binary is available:
flowise --version
Step 3: Start Flowise (First Run)
Before setting up a service, start Flowise manually to confirm it works:
flowise start --PORT=3001
Port 3001 is used here to avoid conflicts with other tools that might be running on 3000. You should see output like:
info: Start Express server on Port: 3001
info: Flowise App is running at http://localhost:3001
Open a browser and navigate to http://your-server-ip:3001. You should see the Flowise dashboard. If you are running locally, use http://localhost:3001.
Press Ctrl+C to stop it — you will run it properly as a service in a later step.
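If you want to confirm from a second terminal (or from a script) that Flowise is answering before you stop it, a small polling loop around curl is enough. This is a sketch under the assumption that the dashboard URL returns a successful HTTP status once the server is up. The function name wait_for_http and its defaults are invented for this example.

```shell
#!/bin/sh
# wait_for_http URL [TRIES] [DELAY_SECONDS]
# Polls URL with curl until it answers with a successful HTTP status,
# or gives up after TRIES attempts (default 30, one second apart).
# Hypothetical helper: name and defaults are assumptions of this sketch.
wait_for_http() {
    url=$1
    tries=${2:-30}
    delay=${3:-1}
    i=1
    while [ "$i" -le "$tries" ]; do
        # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, wait_for_http http://localhost:3001 10 returns success as soon as the dashboard responds and fails after ten unanswered attempts.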
Step 4: Create a Systemd Service
Running Flowise manually is fine for testing, but you want it to start automatically and survive reboots. Create a dedicated user and a systemd unit for it.
First, create a system user that will own the process:
sudo useradd -r -s /bin/false flowise
Create a directory for Flowise’s data (it stores flows, credentials, and uploaded files here):
sudo mkdir -p /var/lib/flowise
sudo chown flowise:flowise /var/lib/flowise
Now create the service file:
sudo nano /etc/systemd/system/flowise.service
Paste the following:
[Unit]
Description=Flowise LLM Workflow Builder
After=network.target
[Service]
Type=simple
User=flowise
WorkingDirectory=/var/lib/flowise
ExecStart=/usr/bin/flowise start
Environment=PORT=3001
Environment=STORAGE_TYPE=local
Environment=DATABASE_PATH=/var/lib/flowise
Environment=SECRETKEY_PATH=/var/lib/flowise
Environment=LOG_PATH=/var/lib/flowise/logs
Environment=BLOB_STORAGE_PATH=/var/lib/flowise/storage
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
The environment variables tell Flowise to store all its data under /var/lib/flowise. Without these, it falls back to a .flowise directory in the home directory of the running user, which does not exist for a system user created with useradd -r.
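If you would rather generate the unit from a script than paste it into an editor, a heredoc works well. The sketch below mirrors the unit above and derives every storage path from one data directory. The function name render_flowise_unit is hypothetical, and setting the port through the PORT environment variable is an assumption of this example.

```shell
#!/bin/sh
# render_flowise_unit DATA_DIR PORT
# Prints a systemd unit that mirrors the one above, with all storage
# paths derived from DATA_DIR. Hypothetical helper for illustration;
# it assumes Flowise reads its port from the PORT environment variable.
render_flowise_unit() {
    data_dir=$1
    port=$2
    cat <<EOF
[Unit]
Description=Flowise LLM Workflow Builder
After=network.target

[Service]
Type=simple
User=flowise
WorkingDirectory=${data_dir}
ExecStart=/usr/bin/flowise start
Environment=PORT=${port}
Environment=DATABASE_PATH=${data_dir}
Environment=SECRETKEY_PATH=${data_dir}
Environment=LOG_PATH=${data_dir}/logs
Environment=BLOB_STORAGE_PATH=${data_dir}/storage
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}
```

You could then write the file with render_flowise_unit /var/lib/flowise 3001 | sudo tee /etc/systemd/system/flowise.service and continue with the reload step.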
Reload systemd, enable, and start the service:
sudo systemctl daemon-reload
sudo systemctl enable flowise
sudo systemctl start flowise
Check it is running:
sudo systemctl status flowise
You should see active (running). Check the log if something is wrong:
sudo journalctl -u flowise -f
Step 5: Set Up Nginx as a Reverse Proxy (Optional but Recommended)
Exposing Flowise directly on port 3001 works, but putting Nginx in front gives you a clean domain name, SSL termination, and the ability to add basic auth.
Install Nginx if you do not have it:
sudo apt install -y nginx
Create a server block for Flowise:
sudo nano /etc/nginx/sites-available/flowise
server {
listen 80;
server_name flowise.example.com;
location / {
proxy_pass http://127.0.0.1:3001;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300s;
}
}
The proxy_read_timeout 300s line is important. LLM responses can take a long time to generate, especially with larger models on CPU. The default Nginx timeout of 60 seconds will cut connections before the model finishes. 300 seconds gives most models enough time even on slow hardware.
Enable the site and reload Nginx:
sudo ln -s /etc/nginx/sites-available/flowise /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
You can add SSL with Certbot if you have a domain pointing at this server. See the previous article on securing Nginx with Let’s Encrypt for that step.
Step 6: Enable Authentication
By default, Flowise runs without any authentication — anyone who can reach the port can access it. For anything exposed beyond localhost, enable the built-in username and password protection.
Edit the service file:
sudo nano /etc/systemd/system/flowise.service
Add two lines under the [Service] section:
Environment=FLOWISE_USERNAME=admin
Environment=FLOWISE_PASSWORD=your_secure_password_here
Reload and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart flowise
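Flowise will now prompt for these credentials when you open the dashboard in a browser. Recent Flowise releases have moved toward account-based login, so if the prompt does not appear, check the authentication section of the documentation for your installed version.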
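Unit files under /etc/systemd/system are normally world-readable, so a password written directly into the unit can be read by any local user. One common alternative is a root-only environment file that the unit loads through systemd's EnvironmentFile= directive. The sketch below generates a random password and writes such a file. The function name write_auth_env and the path /etc/flowise/auth.env used in the note afterwards are assumptions of this example.

```shell
#!/bin/sh
# write_auth_env FILE [USERNAME]
# Writes FLOWISE_USERNAME and a random FLOWISE_PASSWORD to FILE, readable
# by the owner only. Hypothetical helper; the variable names are the ones
# used in this step.
write_auth_env() {
    file=$1
    user=${2:-admin}
    # 24 random bytes, base64-encoded, give a 32-character password.
    password=$(head -c 24 /dev/urandom | base64)
    (
        umask 077    # a newly created file gets mode 600
        printf 'FLOWISE_USERNAME=%s\nFLOWISE_PASSWORD=%s\n' \
            "$user" "$password" > "$file"
    )
    chmod 600 "$file"    # also tighten a file that already existed
}
```

Run as root, write_auth_env /etc/flowise/auth.env creates the file (the directory must exist first). In the unit, a single EnvironmentFile=/etc/flowise/auth.env line under [Service] would then take the place of the two Environment= lines.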
Step 7: Build a Basic Chatbot Flow with Ollama
Now that Flowise is running, open the dashboard. You will see a sidebar with Chatflows and Agentflows. A Chatflow is a conversational pipeline. An Agentflow is a multi-step autonomous agent. Start with a simple Chatflow.
Click Add New to open a blank canvas.
You build flows by dragging nodes from the left panel onto the canvas and connecting them. For a basic Ollama chatbot, you need two nodes:
1. Add an Ollama Chat Model node
In the node search panel, search for “Ollama”. Drag the ChatOllama node onto the canvas. In the node’s settings panel:
- Base URL: http://localhost:11434 (or http://host.docker.internal:11434 if Flowise runs in Docker and Ollama on the host)
- Model Name: type the name of a model you have pulled, for example llama3.2:3b
2. Add a Conversation Chain node
Search for “Conversation Chain” and drag it onto the canvas. This node handles the back-and-forth conversation logic and maintains chat history between turns.
Connect the output of ChatOllama to the “Chat Model” input of Conversation Chain.
Click Save (top right), give the flow a name like “Basic Ollama Chat”, and then click Save again.
3. Test the flow
Click the chat icon (speech bubble) at the top right to open the built-in test window. Type a message and send it. The Conversation Chain node will pass your message to Ollama, stream back a response, and remember the conversation context for the next message.
You have just built your first LLM chatflow. Flowise also generates an API endpoint for it automatically — click the </> icon to see the curl command you can use to call it programmatically.
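The command behind the </> icon is a POST request to the flow's prediction endpoint. As a rough sketch, a shell wrapper could look like the following. The /api/v1/prediction/<id> path and the question field follow Flowise's prediction API as I understand it, so copy the exact URL from your own dashboard. The function names and the deliberately simple escaping are this example's own.

```shell
#!/bin/sh
# prediction_payload QUESTION
# Builds the JSON body for a prediction request. Only backslashes and
# double quotes are escaped, so multi-line questions need a real JSON
# tool such as jq instead.
prediction_payload() {
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"question": "%s"}\n' "$escaped"
}

# ask_flow BASE_URL CHATFLOW_ID QUESTION
# Sends QUESTION to the chatflow and prints the raw JSON response.
# Hypothetical helper; the endpoint path is an assumption (see above).
ask_flow() {
    curl -fsS -X POST "$1/api/v1/prediction/$2" \
        -H 'Content-Type: application/json' \
        -d "$(prediction_payload "$3")"
}
```

With the flow ID copied from the dashboard, a call would look like ask_flow http://localhost:3001 your-chatflow-id "What is Ollama?".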
Step 8: Build a Document RAG Flow
A more powerful use case is querying your own documents. The classic RAG flow in Flowise is built from four types of nodes (a document loader, a text splitter, an embedding model, and a vector store), plus a retrieval chain and a chat model that answer from the retrieved chunks. Here is a working setup using local storage.
Create a new Chatflow and add the following nodes:
Document Loader
Search for “PDF File” and add the PDF File node. This lets you upload a PDF document to the flow.
Text Splitter
Search for “Recursive Character Text Splitter” and add it. Set Chunk Size to 1000 and Chunk Overlap to 100. This breaks your document into overlapping chunks that fit inside the model’s context window.
Embeddings
Search for “Ollama Embeddings” and add it. Set the Base URL to http://localhost:11434 and choose an embedding-capable model. The Ollama model library offers nomic-embed-text for this purpose. If you have not pulled it yet, run ollama pull nomic-embed-text in your terminal first.
Vector Store
Search for “In-Memory Vector Store” and add it. This stores the document embeddings in memory for the session. For a persistent store, use Qdrant or Chroma instead, but In-Memory is fine for getting started.
Conversational Retrieval QA Chain
Add this node. It wires together the retriever (vector store), the LLM, and the conversation history.
ChatOllama
Add the same ChatOllama node you used earlier, pointing to your chat model (llama3.2:3b or similar).
Wire everything together:
- Recursive Character Text Splitter → PDF File (Text Splitter input)
- PDF File → In-Memory Vector Store (Document input)
- Ollama Embeddings → In-Memory Vector Store (Embeddings input)
- In-Memory Vector Store → Conversational Retrieval QA Chain (Vector Store input)
- ChatOllama → Conversational Retrieval QA Chain (Chat Model input)
Save the flow, then open the test chat window. Flowise will prompt you to upload a PDF file before the first message. After uploading, ask a question about the document’s content. The vector store retrieves the most relevant chunks and the model answers based on them.
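To get a feel for what the Chunk Size and Chunk Overlap settings mean in practice, you can estimate how many chunks a document produces. The sketch below models a plain sliding window, where each chunk after the first advances by size minus overlap characters. The Recursive Character Text Splitter prefers to break on separators such as paragraphs, so treat the result as an approximation. The function name estimate_chunks is made up for this example.

```shell
#!/bin/sh
# estimate_chunks TEXT_LENGTH [CHUNK_SIZE] [CHUNK_OVERLAP]
# Approximate chunk count for a fixed-size sliding window. Defaults match
# the settings used above (1000 and 100). CHUNK_OVERLAP must be smaller
# than CHUNK_SIZE. Hypothetical helper for illustration.
estimate_chunks() {
    length=$1
    size=${2:-1000}
    overlap=${3:-100}
    stride=$((size - overlap))
    if [ "$length" -le 0 ]; then
        echo 0
    elif [ "$length" -le "$size" ]; then
        echo 1
    else
        # 1 for the first chunk, plus ceil((length - size) / stride)
        echo $((1 + (length - size + stride - 1) / stride))
    fi
}
```

Under this model, a 10,000-character document with the settings above gives estimate_chunks 10000 an answer of 11, and each of those chunks becomes one embedding in the vector store.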
Common Mistakes and Troubleshooting
Ollama connection refused
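If Flowise cannot reach Ollama, first confirm that Ollama is running and that the Base URL in the node matches the address it listens on. When both run directly on the same host, as in this tutorial, http://localhost:11434 works with Ollama's default binding to 127.0.0.1. If Flowise runs in Docker or on another machine, Ollama must listen on a reachable interface: edit /etc/systemd/system/ollama.service and add Environment=OLLAMA_HOST=0.0.0.0 under [Service], then run sudo systemctl daemon-reload and sudo systemctl restart ollama. Because this exposes the Ollama API on every interface, restrict access to port 11434 with a firewall.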
Chat response times out through Nginx
Increase proxy_read_timeout in your Nginx config. 300 seconds is a safe starting value for local models; bump it to 600 if you run large models on CPU.
Model name not found
Flowise passes the model name string directly to Ollama. Make sure it matches exactly what ollama list shows — including the tag (e.g., llama3.2:3b, not llama3.2).
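A quick way to rule out a naming mismatch is to compare the configured name against the first column of ollama list. The following sketch does an exact match, tag included. It assumes the output starts with a header row followed by one model per line, and the function name model_in_list is invented for this example.

```shell
#!/bin/sh
# model_in_list NAME
# Reads `ollama list` output on stdin and succeeds only if NAME matches
# a value in the first column exactly, tag included. The first line is
# skipped as a header. Hypothetical helper for illustration.
model_in_list() {
    awk -v wanted="$1" \
        'NR > 1 && $1 == wanted { found = 1 } END { exit !found }'
}
```

For example, ollama list | model_in_list llama3.2:3b succeeds only when that exact tag has been pulled, while the same check with llama3.2 fails.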
Flowise service fails to start with permission errors
Check that /var/lib/flowise is owned by the flowise user: ls -ld /var/lib/flowise. If not, fix it with sudo chown -R flowise:flowise /var/lib/flowise.
“Cannot find module” errors in logs
This sometimes happens after an npm global install if the PATH for the service user does not include the npm bin directory. Edit the service and add Environment=PATH=/usr/local/bin:/usr/bin:/bin under [Service].
Best Practices
Use a persistent vector store for production RAG
In-Memory Vector Store loses its data when Flowise restarts. For anything beyond a demo, integrate Qdrant or Chroma as the vector store. Both can be run as Docker containers on the same host.
Pin your model versions
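Ollama model tags like llama3.2:3b can be re-pulled and updated. If you depend on consistent behavior, record the model ID that ollama list prints next to the tag and avoid re-pulling on production hosts. One way to keep a stable reference is to copy the model under a name of your own, for example ollama cp llama3.2:3b llama3.2-pinned, and use that name in your Flowise node config.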
Export your flows regularly
Flowise stores flows in a SQLite database under /var/lib/flowise. Back up this directory before any Flowise upgrade. You can also export individual flows as JSON files from the dashboard (three-dot menu → Export) and commit them to a git repo.
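The directory backup can be reduced to one tar command wrapped in a small function. This is a sketch under the assumption that everything worth keeping lives in the data directory. The function name backup_flowise and the archive naming scheme are this example's own.

```shell
#!/bin/sh
# backup_flowise SRC_DIR DEST_DIR
# Creates DEST_DIR/flowise-YYYYmmdd-HHMMSS.tar.gz from SRC_DIR and prints
# the archive path. Hypothetical helper; stop Flowise first so the SQLite
# file is not being written to while it is copied.
backup_flowise() {
    src=$1
    dest=$2
    archive="$dest/flowise-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive holds a relative path
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$archive"
}
```

Called as root between sudo systemctl stop flowise and sudo systemctl start flowise, backup_flowise /var/lib/flowise /var/backups leaves a timestamped archive you can restore with tar -xzf.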
Do not expose Flowise publicly without authentication
The built-in username/password guard is enough for a private server. If you need per-user access control or API key management, Flowise Pro adds those features, but for a self-hosted single-user or small-team setup, the basic auth plus Nginx is sufficient.
Separate embedding models from chat models
Use a small, fast embedding model like nomic-embed-text for vector search. Reserve your larger chat model for generation. Mixing them (using a 13B chat model for embeddings) wastes memory and adds latency to every document search.
Conclusion
You now have Flowise installed as a system service on Ubuntu, connected to your local Ollama backend, and configured behind Nginx. More importantly, you have built two real workflows: a basic conversational chatbot and a document RAG pipeline.
What makes Flowise valuable is the shift in mindset it enables. Instead of using an LLM as a chat tool, you are using it as a component in a pipeline. That pipeline has an API. That API can be called by your website, your Slack bot, your CI system, or a cron job. The LLM becomes infrastructure.
From here, explore the following areas in Flowise:
- Agentflows — multi-step agents that can use tools like web search, calculators, or custom API calls
- External vector stores — connecting Qdrant or Chroma for persistent embeddings that survive restarts
- Credentials management — securely storing API keys for external providers (OpenAI, Anthropic, Groq) if you want to mix cloud models with local Ollama models
- Flowise API — embedding a chatflow in an external application using the REST API or the JavaScript/Python SDK that Flowise generates per flow
The visual canvas makes LLM application patterns — RAG, tool use, memory, multi-agent routing — accessible without writing framework code. Once you understand what the nodes do, the canvas becomes a fast way to prototype, test, and iterate on AI features before committing to a code implementation.