ip·Solis -- Production Deployment Guide
This guide walks you through setting up the ip·Solis platform on a fresh on-premises server. No prior knowledge of the codebase is required.
Table of Contents
- Prerequisites
- Get the Software
- Configure Environment Variables
- SSL / TLS Certificate Setup
- Create the Production Compose Overlay
- Start the Stack
- Initial Admin Setup
- Entra ID SSO (Portal Authentication)
- Verify the Deployment
- Backup & Maintenance
- Updating to a New Version
- High-Availability Deployments
- Troubleshooting
- Clean Reset (Test Environments)
1. Prerequisites
Server Requirements
| Component | Minimum | Recommended |
|---|---|---|
| OS | Linux (Debian/Ubuntu recommended) | Ubuntu 22.04 LTS or newer |
| CPU | 2 cores | 4 cores |
| RAM | 4 GB | 8 GB |
| Disk | 20 GB | 50 GB (depends on number of managed assets) |
Software
Install the following before proceeding:
- Docker Engine >= 24.0 -- Install Docker
- Docker Compose >= 2.20 (included with Docker Engine)
- Git -- to clone the repository
After installing Docker, add the deployment user to the docker group so
docker compose commands work without sudo:
sudo usermod -aG docker $USER
# Then log out and back in (or: newgrp docker)
Verify your installation:
docker --version # Docker version 24.x or higher
docker compose version # Docker Compose version v2.20 or higher
git --version
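If you provision hosts from a script, the version check can be automated so the job fails early on an outdated machine. The following is a minimal sketch, not part of ip·Solis: `version_ge` and `check_tool` are illustrative helper names, and the comparison assumes GNU `sort -V` is available (it is on Debian/Ubuntu).

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool NAME FOUND MINIMUM: prints a one-line verdict, fails if too old.
# A leading "v" (as in "v2.24.5") is stripped before comparing.
check_tool() {
  found=${2#v}
  if version_ge "$found" "$3"; then
    printf 'OK   %s %s (>= %s)\n' "$1" "$found" "$3"
  else
    printf 'FAIL %s %s (< %s)\n' "$1" "$found" "$3"
    return 1
  fi
}

# Usage (requires Docker to be installed):
#   check_tool docker  "$(docker version --format '{{.Client.Version}}')" 24.0
#   check_tool compose "$(docker compose version --short)" 2.20
```

The minimum versions passed in the usage lines are the ones listed under Software above.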
Network Requirements
The server needs outbound access to:
| Destination | Purpose |
|---|---|
| Your Active Directory / LDAP server (port 389 or 636) | User validation, manager lookup, group membership |
| Your SMTP relay | Email notifications |
| vSphere / XenServer (if applicable) | VM lifecycle automation |
| SCCM server (if applicable) | Task sequence triggers |
Inbound: ports 80 and 443 must be reachable from your users' browsers.
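Before installing anything, it can save time to confirm that the outbound destinations in the table are reachable from the server. The sketch below is an assumption-laden helper, not shipped with ip·Solis: `probe` is an illustrative name, it relies on bash's `/dev/tcp` pseudo-device and coreutils `timeout`, and the hostnames in the usage lines are the placeholder examples used elsewhere in this guide.

```shell
# probe HOST PORT: try to open a TCP connection within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
probe() {
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null; then
    echo "open    $1:$2"
  else
    echo "blocked $1:$2"
    return 1
  fi
}

# Usage (replace with your own hosts):
#   probe dc01.yourcompany.com 636
#   probe smtp.yourcompany.com 587
```

A `blocked` result only says the TCP handshake did not complete in time; it does not distinguish a firewall drop from a service that is down.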
2. Get the Software
Fresh environment recommended: Docker volumes (database data) survive `rm -rf /opt/ipsolis`. They live under `/var/lib/docker/volumes/` and persist until explicitly removed. For a clean first install, ensure no old volumes exist. See Clean Reset (Test Environments).
Clone the repository (no authentication required):
cd /opt
sudo git clone https://github.com/XenPool/ipsolis.git ipsolis
cd ipsolis
The Docker images (ghcr.io/xenpool/ipsolis-api and
ghcr.io/xenpool/ipsolis-worker) are public and pulled automatically when
you start the stack.
Licensing: ip·Solis is free for non-commercial and evaluation use. Commercial use requires a license — see LICENSE and contact sales@xenpool.de to purchase.
3. Configure Environment Variables
Copy the example file and edit it:
sudo cp .env.example .env
sudo nano .env
Required settings to change
# Secure database credentials
POSTGRES_PASSWORD=<generate-a-strong-password>
# Secure API secrets -- use random strings of 32+ characters
API_SECRET_KEY=<random-string-min-32-chars>
WEBHOOK_SECRET_TOKEN=<random-string>
ADMIN_API_KEY=<random-string-min-32-chars>
# CORS -- set to your production domain ← replace YOUR_HOSTNAME.YOUR_COMPANY.COM
CORS_ORIGINS=https://YOUR_HOSTNAME.YOUR_COMPANY.COM
FLOWER_PASSWORD=<strong-password>
Tip: Generate secure passwords with:
openssl rand -base64 32
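To fill all five secrets in one go, the generation step can be wrapped in a small helper. This is a sketch under stated assumptions: `gen_secret` and `print_env_secrets` are illustrative names, and hex encoding is chosen here instead of base64 because hex output contains no `/`, `+` or `=`, which can need escaping if a password ends up inside a connection URL.

```shell
# gen_secret: print 32 random bytes as 64 hex characters.
# Falls back to /dev/urandom when openssl is not installed.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# print_env_secrets: emit one NAME=value line per secret listed above.
print_env_secrets() {
  for name in POSTGRES_PASSWORD API_SECRET_KEY WEBHOOK_SECRET_TOKEN \
              ADMIN_API_KEY FLOWER_PASSWORD; do
    printf '%s=%s\n' "$name" "$(gen_secret)"
  done
}

# Usage: review the output, then paste the lines into .env
#   print_env_secrets
```

The helper only prints the values; it deliberately does not edit `.env`, so existing settings are never overwritten by accident.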
4. SSL / TLS Certificate Setup
The platform runs behind an nginx reverse proxy that terminates SSL. You need a TLS certificate and private key.
Option A: Internal / Self-Signed Certificate (Intranet)
If your server is only accessible within your corporate network, use mkcert to generate a trusted certificate:
# Install mkcert (one-time)
# Ubuntu/Debian:
sudo apt install -y libnss3-tools
sudo curl -JLO "https://dl.filippo.io/mkcert/latest?for=linux/amd64"
sudo chmod +x mkcert-v*-linux-amd64   # the file was downloaded as root, so chmod needs sudo too
sudo mv mkcert-v*-linux-amd64 /usr/local/bin/mkcert
# Install the local CA into your system trust store
sudo mkcert -install
# Generate the certificate for your hostname ← replace YOUR_HOSTNAME.YOUR_COMPANY.COM
sudo mkdir -p nginx/ssl
sudo mkcert -cert-file nginx/ssl/cert.pem -key-file nginx/ssl/key.pem YOUR_HOSTNAME.YOUR_COMPANY.COM
Important: For browsers on other machines to trust this certificate, you must distribute the root CA (`mkcert -CAROOT` shows the path) to client machines via Group Policy or your enterprise CA trust store.
Installing the root CA on a Windows client:
# On the server — make the root CA available for download
sudo cp $(sudo mkcert -CAROOT)/rootCA.pem /tmp/ipsolis-rootCA.pem
sudo chmod 644 /tmp/ipsolis-rootCA.pem
Copy the file to your Windows laptop (SCP, USB, etc.), then:
Option 1 — via double-click:
- Rename the file to `ipsolis-rootCA.crt`
- Double-click → Install Certificate
- Local Machine → Trusted Root Certification Authorities
- Restart your browser
Option 2 — via PowerShell (as Administrator):
certutil -addstore -f "ROOT" ipsolis-rootCA.crt
After installation Chrome, Edge and Firefox (using the Windows trust store) will trust the certificate without warnings.
Option B: Certificate from your Enterprise CA (Recommended for production)
If your organization runs an internal Certificate Authority (e.g., Active Directory Certificate Services):
- Generate a CSR on the server (replace YOUR_HOSTNAME.YOUR_COMPANY.COM):
sudo mkdir -p nginx/ssl
sudo openssl req -new -newkey rsa:2048 -nodes \
  -keyout nginx/ssl/key.pem \
  -out nginx/ssl/server.csr \
  -subj "/CN=YOUR_HOSTNAME.YOUR_COMPANY.COM"
- Submit `nginx/ssl/server.csr` to your CA and obtain the signed certificate.
- Save the signed certificate as `nginx/ssl/cert.pem`.
- If your CA provides an intermediate/chain certificate, append it to `cert.pem`:
cat signed-cert.pem intermediate-ca.pem | sudo tee nginx/ssl/cert.pem > /dev/null
Option C: Let's Encrypt (Public-facing servers)
If your server is publicly accessible, you can use free certificates from Let's Encrypt:
sudo apt install -y certbot
sudo certbot certonly --standalone -d YOUR_HOSTNAME.YOUR_COMPANY.COM # ← replace
# Symlink into the ssl directory
sudo mkdir -p nginx/ssl
sudo ln -sf /etc/letsencrypt/live/YOUR_HOSTNAME.YOUR_COMPANY.COM/fullchain.pem nginx/ssl/cert.pem
sudo ln -sf /etc/letsencrypt/live/YOUR_HOSTNAME.YOUR_COMPANY.COM/privkey.pem nginx/ssl/key.pem
Set up auto-renewal (Option C only)
# Test renewal
sudo certbot renew --dry-run
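# Add a root cron job that renews and then reloads nginx.
# The existing crontab is read first and the new line appended, so other root cron entries are not overwritten.
(sudo crontab -l 2>/dev/null; echo "0 3 * * * certbot renew --quiet --post-hook 'docker exec ipsolis-nginx nginx -s reload'") | sudo crontab -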
Configure nginx
The repository already ships a ready-to-use nginx/nginx.conf with the placeholder YOUR_HOSTNAME.YOUR_COMPANY.COM. Replace it with your actual FQDN (sed handles both occurrences in one pass):
sudo sed -i 's/YOUR_HOSTNAME.YOUR_COMPANY.COM/ipsolis.acme.com/g' nginx/nginx.conf
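After the substitution it is worth confirming that no placeholder survived, since a leftover `YOUR_HOSTNAME` in `server_name` or `CORS_ORIGINS` tends to surface only later as a confusing browser error. A minimal sketch, with `find_placeholders` as an illustrative helper name:

```shell
# find_placeholders FILE...: print every line that still contains the
# YOUR_HOSTNAME placeholder. Succeeds only when none is left.
find_placeholders() {
  rc=0
  grep -n 'YOUR_HOSTNAME' "$@" || rc=$?
  case $rc in
    0) return 1 ;;   # placeholder still present (matching lines were printed)
    1) return 0 ;;   # no match: every file is clean
    *) return 2 ;;   # grep itself failed, e.g. a file is missing
  esac
}

# Usage, from /opt/ipsolis:
#   find_placeholders nginx/nginx.conf .env && echo "no placeholders left"
```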
For reference, the shipped file looks like this (shown with the placeholder; after the sed command your FQDN appears in its place):
server {
    listen 80;
    server_name YOUR_HOSTNAME.YOUR_COMPANY.COM;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name YOUR_HOSTNAME.YOUR_COMPANY.COM;
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    client_max_body_size 2g;
    # WebSocket / HTMX support
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    location / {
        proxy_pass http://api:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
Use the same hostname in the certificate generation step (Option A/B/C above).
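Whichever option produced the files, two quick checks catch the most common mistakes before nginx starts: a certificate that does not belong to the private key, and a certificate that is about to expire. The helpers below are a sketch using standard `openssl` subcommands; the function names are illustrative, and reading `key.pem` may require `sudo` depending on file permissions.

```shell
# cert_matches_key CERT KEY: succeeds when both files carry the same public key.
# With a full-chain cert.pem, openssl reads the first (leaf) certificate.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# cert_valid_for_days CERT DAYS: succeeds when CERT is still valid DAYS from now.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Usage, from /opt/ipsolis:
#   cert_matches_key nginx/ssl/cert.pem nginx/ssl/key.pem || echo "cert/key mismatch"
#   cert_valid_for_days nginx/ssl/cert.pem 30 || echo "expires within 30 days"
```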
5. Production Compose Overlay
docker-compose.prelive.yml is already included in the repository — no action needed.
The overlay adds nginx for SSL termination and removes the dev bind-mounts from
api and worker.
6. Start the Stack
cd /opt/ipsolis
# Build and start all services
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
up --build -d
# Run database migrations
docker compose exec -T api alembic upgrade head
# Verify all containers are running
docker compose ps
Expected output -- all services should show Up (healthy):
NAME STATUS
ipsolis-postgres Up (healthy)
ipsolis-redis Up (healthy)
ipsolis-api Up (healthy)
ipsolis-worker Up (healthy)
ipsolis-beat-1 Up
ipsolis-nginx Up
Verify the application:
# Direct API health check
curl -f http://localhost:8000/health | python3 -m json.tool
# Through nginx (HTTPS)
curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health | python3 -m json.tool
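Right after `up -d` the containers may need some time before the health endpoint answers, so a single `curl` can fail even on a working install. A small retry wrapper avoids that false alarm. This is a generic sketch; `wait_until` is an illustrative name and the 120-second budget in the usage line is an arbitrary choice.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds or the timeout elapses.
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Usage:
#   wait_until 120 curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health \
#     && echo "stack is up" || echo "still unhealthy after 120s"
```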
7. Initial Admin Setup
First-run admin account (RBAC)
Open https://YOUR_HOSTNAME.YOUR_COMPANY.COM/ui/ in your browser. On
the very first visit (when admin_users is empty), the login page
renders a "Create first administrator" form instead of the
normal sign-in form. Fill in:
| Field | Notes |
|---|---|
| Username | 3–128 chars, allowed: [a-zA-Z0-9._@-]+. Lower-cased at write time. |
| Password | ≥ 12 chars. PBKDF2-SHA256 / 600k iterations (OWASP-2023). |
| Confirm password | Must match. |
Submitting creates the first superadmin and auto-logs you in. The form is race-safe: if two operators submit it at the same time, only one succeeds and the other gets a "use the sign-in form" message.
After the first superadmin exists, the form switches to the regular username + password sign-in.
Add additional admin users
Once signed in, navigate to Admin Users in the left nav (superadmin-only). Create per-user accounts in the role appropriate to each operator:
superadmin > admin > approver > auditor > helpdesk
The full role ladder, per-asset-type ACL grants, separation-of-duties enforcement, and password-policy options are configurable in the Admin UI under Settings → Access Control.
Legacy `ADMIN_API_KEY` fallback
The ADMIN_API_KEY from .env continues to authenticate as a
virtual superadmin even after first-run setup, so existing
scripts / X-Admin-Key headers don't break on upgrade. To use it
on the login page: leave Username blank, paste the key into
Password. Audit attribution shows up as admin:legacy_key so
auditors can tell when the fallback path was used.
For new integrations prefer Per-integration API tokens (Admin UI → API Tokens) — named, expiring, revocable bearer tokens with optional role binding and scoped permissions. The legacy single shared key is kept for back-compat only.
Install Your License
Evaluation and non-commercial use require no license file. For commercial
deployments, XenPool delivers a signed .lic file after purchase.
Install it through the Admin UI:
- Navigate to Admin → License (or open `https://YOUR_HOSTNAME.YOUR_COMPANY.COM/ui/license`).
- Click Upload license and select your `ipsolis.lic` file.
- The page reloads showing licensee name and expiry. No restart is required.
Grace period: when a license expires, a 30-day grace period applies before the license status reverts to unlicensed. The Admin UI shows an amber warning banner and the daily health alert email fires each day throughout the window.
Overwriting: upload a new .lic at any time to renew. The old file
is replaced in-place; the license cache refreshes on the next request
(mtime-keyed, zero downtime).
Env-var override (air-gapped / automated deployments): mount the
.lic file into the container at an alternate path and set:
IPSOLIS_LICENSE_PATH=/run/secrets/ipsolis.lic
The default path is /app/license/ipsolis.lic (inside the ipsolis-api
container). Docker secrets or a bind-mount both work.
Configuration Checklist
The in-app Setup checklist on the dashboard guides you through all required steps. The order below matches the checklist:
1. Set application title and logo *(Essential)*
Navigate to Admin > Settings → General:
| Setting | Description |
|---|---|
| `app.title` | Application name shown in the portal and emails (default: ip·Solis) |
| `app.logo` | Logo upload (PNG/SVG recommended) |
2. Configure SMTP *(Essential)*
Navigate to Admin > Settings → Email:
| Setting | Description | Example |
|---|---|---|
| `smtp.host` | SMTP relay hostname | smtp.yourcompany.com |
| `smtp.port` | SMTP port | 587 |
| `smtp.user` | SMTP username (if auth required) | selfservice@yourcompany.com |
| `smtp.password` | SMTP password | (marked as secret) |
| `smtp.tls` | Use STARTTLS | true |
| `smtp.from` | Sender email address | noreply@yourcompany.com |
| `smtp.from_name` | Sender display name | ip·Solis |
Navigate to Admin > Email Templates to customize notification email text.
3. Connect to Active Directory *(Essential)*
Navigate to Admin > Settings → Active Directory:
| Setting | Description | Example |
|---|---|---|
| `ad.server` | AD domain controller hostname or IP | dc01.yourcompany.com |
| `ad.port` | LDAP port | 389 (or 636 for LDAPS) |
| `ad.base_dn` | Search base DN | DC=yourcompany,DC=com |
| `ad.domain` | NetBIOS domain name | YOURCOMPANY |
| `ad.username` | Service account (sAMAccountName) | svc-selfservice |
| `ad.password` | Service account password | (marked as secret) |
| `ad.use_ssl` | Use LDAPS | true or false |
Required AD permissions depend on the modules and runbook steps in use. As a baseline:
- Read on user objects (attributes: `displayName`, `sAMAccountName`, `userPrincipalName`, `manager`, `memberOf`, `distinguishedName`)
- Write `member` on group objects (required for AD group-based access assignment)
Additional permissions (e.g. on computer objects, OUs, or other attributes) may be needed depending on the runbooks and modules deployed.
4. Enable portal SSO via Entra ID *(Essential)*
See Section 8 for the full Entra ID setup.
5. Create your first asset type *(Essential)*
- Go to Admin > Asset Types > New
- Fill in the name, description, and category
- Configure the automation strategy (Group Access, Runbook, or Composite)
- Set approval requirements if needed
- Optionally restrict access with an Eligible Requestors group DN
- Save
6. Add at least one asset to the pool *(Essential)*
Go to Admin > Asset Pool > New and add at least one asset.
For pure `capacity_pooled` asset types (quota without dedicated instances) this step can be skipped.
Set up Runbooks *(if applicable)*
ip·Solis ships with a fully configured example runbook: "Virtual Machine Recycler" — a standalone runbook that includes all required script modules (XenServer/XCP-ng, SCCM, Active Directory) and can serve as a template for your own automation.
Find it under Admin > Runbooks to inspect, copy, or adapt it.
To create asset-type runbooks:
- Go to Admin > Runbooks > New
- Define the steps (PowerShell modules or built-in modules)
- Link the runbook to an asset type
Any number of custom runbooks with any combination of steps can be created.
Recommended next steps
- Microsoft Teams approval cards: Go to Admin > Settings → Email and add a Teams webhook URL — approvers receive an Adaptive Card with a one-click review link in addition to email.
- Stream audit log to SIEM: Configure a Splunk HEC or webhook endpoint under Admin > Settings → Compliance.
- Issue per-integration API tokens: Go to Admin > API Tokens to create named, revocable bearer tokens for ServiceNow, scripts, or Prometheus. These replace the shared `X-Admin-Key`.
After a DB restore: The `api_tokens` table is restored along with the database. Review all tokens under Admin > API Tokens: revoke any old or unused tokens and issue new, dedicated tokens for active integrations only.
8. Entra ID SSO (Portal Authentication)
The self-service portal supports Microsoft Entra ID (Azure AD) for single sign-on.
Register an App in Entra ID
- Go to the Azure Portal > App registrations > New registration
- Name: `ip·Solis`
- Redirect URI: `https://YOUR_HOSTNAME.YOUR_COMPANY.COM/portal/auth/callback` (Web)
- Note down the Application (client) ID and Directory (tenant) ID
- Under Certificates & secrets, create a new client secret
Configure in Admin UI
Navigate to Admin > Settings and set:
| Setting | Description |
|---|---|
| `entra.mode` | `entra_only` (Entra ID login required) or `entra_with_onprem` (Entra ID + on-prem LDAP check) |
| `entra.client_id` | Application (client) ID |
| `entra.client_secret` | Client secret value (marked as secret) |
| `entra.tenant_id` | Directory (tenant) ID |
| `entra.redirect_uri` | `https://YOUR_HOSTNAME.YOUR_COMPANY.COM/portal/auth/callback` (replace) |
| `entra.allowed_domains` | Comma-separated list of allowed email domains, e.g. yourcompany.com |
Use the Test Entra Connection button to verify the configuration.
When `entra.mode` is set to `disabled`, the portal is open to anyone on the network with a shared anonymous identity: every visitor sees and can act on the same set of orders. Only use this for demo / air-gapped lab deployments. For multi-user production, set `entra.mode = entra_only`.
9. Verify the Deployment
Run through this checklist to confirm everything works:
- HTTPS: `https://YOUR_HOSTNAME.YOUR_COMPANY.COM` loads with a valid certificate
- Admin UI: `https://YOUR_HOSTNAME.YOUR_COMPANY.COM/ui/` is accessible
- First-run setup: visiting the admin login renders the "Create first administrator" form (or, if already done, the regular sign-in form with no error)
- Setup checklist: the dashboard shows the in-app setup checklist; tick off Essential items as you configure them
- Portal login: Users can sign in via Entra ID SSO
- AD lookup: On the order form, user validation (deputy, RDP, admin fields) resolves names
- Email: Submit a test order and confirm notification email arrives
- Health check: `curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health` returns `{"status": "ok"}`
- (optional) API tokens: issue a per-integration token for any automation that previously used `X-Admin-Key`
- (optional) SIEM streaming: configure under Settings → Compliance if you have Splunk / Sentinel / a generic webhook receiver
- (optional) Prometheus: scrape `/metrics` from your monitoring; the dashboard ships in docs/grafana/
10. Backup & Maintenance
Database Backup
The PostgreSQL data is stored in a Docker volume (postgres_data). Back it up regularly:
# Dump the database
docker compose exec -T postgres pg_dump -U xpuser ipsolis > backup_$(date +%Y%m%d).sql
# Restore from backup
cat backup_20260414.sql | docker compose exec -T postgres psql -U xpuser ipsolis
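For unattended backups, the dump command above can be combined with a retention step so the backup directory does not grow without bound. The following is a sketch, not an ip·Solis feature: `dump_db` and `prune_backups` are illustrative names, the 14-day retention in the usage line is an arbitrary example, and `dump_db` assumes it is run from `/opt/ipsolis` with the stack up.

```shell
# dump_db DIR: write a timestamped dump into DIR and print its path.
# Fails when pg_dump fails or produces an empty file.
dump_db() {
  out="$1/backup_$(date +%Y%m%d_%H%M%S).sql"
  docker compose exec -T postgres pg_dump -U xpuser ipsolis > "$out" \
    && [ -s "$out" ] \
    && echo "$out"
}

# prune_backups DIR DAYS: delete backup_*.sql files in DIR that were last
# modified more than DAYS days ago, printing each deleted path.
prune_backups() {
  find "$1" -maxdepth 1 -type f -name 'backup_*.sql' -mtime +"$2" -print -delete
}

# Usage (e.g. from a nightly cron job):
#   dump_db /var/backups/ipsolis && prune_backups /var/backups/ipsolis 14
```

Pruning runs only after a successful dump, so a failing backup never leads to older good copies being deleted.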
Logs
View container logs:
# All services
docker compose logs --tail=50
# Specific service
docker compose logs api --tail=100 -f # follow mode
docker compose logs worker --tail=100
Disk Cleanup
Periodically remove old Docker images:
docker image prune -f
11. Updating to a New Version
Pre-flight SSL check — run this before pulling. If either file is missing, the nginx container will start but serve no HTTPS traffic.
ls -la nginx/ssl/cert.pem nginx/ssl/key.pem
If missing, regenerate the certificate (see section 4) before proceeding.
cd /opt/ipsolis
# Pull the latest code
git pull origin main
# Rebuild and restart
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
up --build -d
# Run any new database migrations
docker compose exec -T api alembic upgrade head
# Restart nginx to pick up new container IPs and any config changes
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
restart nginx
# Verify health
curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health | python3 -m json.tool
Migrations are safe to run multiple times -- Alembic tracks which have already been applied and skips them. Each feature slice typically ships its own migration; review `api/alembic/versions/` between upgrades for the changeset, and run `docker compose exec api alembic history` to see the chain.
Backing up before upgrade
Always snapshot the database first — pg_dump from the Postgres
container, or use the in-app Maintenance → Backups page (Admin UI)
which writes a timestamped SQL dump to the bind-mounted ./backups/
directory. Configure a daily backup schedule in the same UI so the
snapshot is fresh when an unexpected regression appears.
Beat HA failover during the restart
If you run multiple Beat replicas (--scale beat=N), docker compose up --build -d rolls the containers one at a time and the leader lock
hands over to the surviving replica within ~13 s.
For single-Beat installs there's a brief gap during the restart
where periodic tasks aren't running — usually invisible since cadences
are minutes / hours.
12. High-Availability Deployments
ip·Solis scales horizontally at the API and worker layers. The Beat scheduler supports multi-replica HA via celery-redbeat. This section covers the two tested scaling scenarios: API replicas and worker replicas.
12.1 Multi-replica API
The API is stateless by design — every replica handles every request equally and there's no need for sticky-session affinity at the load balancer.
What makes it stateless:
- Sessions use Starlette's `SessionMiddleware` in cookie-signed mode (`api/app/main.py`): the entire session payload (admin user id, role, csrf token) lives in the `xp_session` cookie itself, signed with `API_SECRET_KEY`. No server-side session table.
- Tokenized URLs (`/approve/<token>`, `/portal/certifications/review/<token>`, etc.) are HMAC-signed with the same `API_SECRET_KEY` and verify-only. No replay table.
- All request state lives in Postgres or Redis, both shared across replicas.
What every replica MUST share:
| What | Why | How |
|---|---|---|
| `API_SECRET_KEY` | Signs session cookies + approval tokens. Different keys per replica = clients see "session invalid" / "approval link expired" half the time. | Pin in `.env`; load via `env_file:` in compose so every replica reads the same file. |
| `DATABASE_URL` / `CELERY_BROKER_URL` / `CELERY_RESULT_BACKEND` | Shared Postgres + Redis backplane. | Same as above. |
| Shared filesystem mounts | `licenses/`, `scripts/`, `backups/` are bind-mounted; replicas reading the same paths must see the same content. On a single host, that's automatic. On multiple hosts, use NFS / GlusterFS / a shared volume driver, or migrate the relevant content to S3-compatible object storage (a deferred slice). | Single-host deployments don't need any extra plumbing. |
Scaling commands:
# Single-host: bump the api replica count via compose
docker compose -f docker-compose.yml -f docker-compose.prelive.yml \
up -d --scale api=3
# Verify each replica is reachable through the load balancer
for i in 1 2 3; do
curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health \
-H 'X-Replica-Probe: '$i
done
Load-balancer config notes:
- No sticky sessions required. Round-robin or least-connections is fine.
- Health check: `GET /health` (unauthenticated). Returns `{status: ok | degraded}` aggregating database, redis, and beat liveness. The endpoint is fast (one Redis ping + one DB SELECT 1) so a 5–10s LB check interval is safe.
- TLS termination: keep on the load balancer (or the existing nginx sidecar from section 5). Replicas serve plain HTTP internally; the `https_only=True` flag on `SessionMiddleware` guards the cookie's `Secure` bit irrespective of where TLS terminates.
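Because `/health` distinguishes `ok` from `degraded`, a monitoring script may want to look at the body rather than only at the HTTP status. Whether a degraded response also carries a non-2xx status is not stated in this guide, so the sketch below assumes it might not. `health_status` and `is_healthy` are illustrative names; `python3` is used for JSON parsing, as in the verification commands elsewhere in this guide.

```shell
# health_status: read the /health JSON body on stdin, print the "status" value.
# Prints nothing (and fails) when the input is not valid JSON.
health_status() {
  python3 -c 'import json, sys; print(json.load(sys.stdin).get("status", "unknown"))' 2>/dev/null
}

# is_healthy: succeed only when the reported status is exactly "ok".
is_healthy() {
  [ "$(health_status)" = "ok" ]
}

# Usage:
#   curl -fsk https://YOUR_HOSTNAME.YOUR_COMPANY.COM/health | is_healthy \
#     || echo "health endpoint is degraded or unreachable"
```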
Rolling restart during upgrades: the upgrade flow in section 11
stops and restarts every replica together, which is fine for small
fleets where ~30s of API downtime is acceptable. For zero-downtime
rolls, fold the up --build -d step into a per-replica loop:
for i in 1 2 3; do
docker compose stop api-$i
docker compose -f docker-compose.yml -f docker-compose.prelive.yml \
up --build -d --no-deps api-$i
# Wait for the health endpoint to answer again.
# Probe HTTPS: port 80 only returns a 301 redirect, which curl -f counts as success.
until curl -fsk https://localhost/health > /dev/null 2>&1; do
sleep 2
done
done
This requires an LB that can drain one backend at a time; with the default round-robin nginx upstream, in-flight requests on the restarting replica drop. The drain logic is your LB's responsibility.
12.2 Multi-replica worker
Celery workers are stateless consumers — they pull from the named Redis queues and process tasks. Adding more workers is a one-line scale-up; the worker code itself doesn't change.
Queue topology (defined in worker/tasks/__init__.py):
| Queue | Tasks | Why a separate queue |
|---|---|---|
| `provision` | Order workflows (`dynamic_runner`, `standalone_runner`, `ps_module_installer`, `sccm_probe`), i.e. anything that touches AD / SCCM / vSphere / XenServer. | Provisioning steps shell out to PowerShell (~5–60s/step) and hold connections to external systems. Isolating them keeps a slow vSphere call from blocking quick housekeeping tasks. |
| `notifications` | Email senders, Teams card delivery, approval reminders, certification reminders, cost alerts. | I/O-bound, latency-sensitive (a stuck SMTP server shouldn't queue up behind a 30s SCCM probe). |
| `default` | Audit retention prune, SIEM streaming, license check, update checker, cost-report snapshot, DB backup, api token purge. | Background housekeeping. Mostly cron-driven, low-frequency. |
| `reclaim` | Asset-expiry checks (`check_expiring_assets`). | Hourly Beat task; small but isolated so the hourly tick doesn't compete with order workflows for a worker slot. |
Sizing recommendations (per-queue concurrency × replica count):
| Pool size | Recommended config | Reasoning |
|---|---|---|
| Lab / single-team (≤50 users) | 1 worker replica, `--concurrency=4 -Q provision,notifications,default,reclaim` | All queues on one process; concurrency 4 is plenty for the typical 1–2 orders/hour. |
| Mid (≤500 users, ≤20 orders/hour) | 2 worker replicas split by queue: replica A `-Q provision --concurrency=4`, replica B `-Q notifications,default,reclaim --concurrency=2` | Provisioning latency stays bounded by replica A; replica B handles housekeeping + reminders without queue-head-of-line blocking. |
| Large (≥500 users, ≥50 orders/hour, regulated SLAs) | 3+ worker replicas: dedicated `provision` workers (`--concurrency=8` × 2 replicas), one `notifications` replica (`--concurrency=4`), one `default,reclaim` replica (`--concurrency=2`) | Per-queue scaling matches actual load shape. |
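To compare the three tiers, it helps to add up the worker slots each one provides (concurrency multiplied by replica count, summed over all worker groups). The helper below is plain arithmetic for illustration; `total_slots` and its `CONCURRENCYxREPLICAS` argument format are made up for this sketch.

```shell
# total_slots CONCxREPS...: sum concurrency * replicas over all worker groups.
total_slots() {
  total=0
  for pair in "$@"; do
    conc=${pair%x*}   # part before the "x": --concurrency value
    reps=${pair#*x}   # part after the "x": number of replicas
    total=$((total + conc * reps))
  done
  echo "$total"
}

total_slots 4x1            # Lab:   one replica, all queues        -> 4
total_slots 4x1 2x1        # Mid:   provision + housekeeping       -> 6
total_slots 8x2 4x1 2x1    # Large: 2x provision, notif., default  -> 22
```

The Large tier therefore runs up to 22 tasks at once, 16 of them reserved for provisioning.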
Scaling command (single-host, all queues on each replica):
docker compose -f docker-compose.yml -f docker-compose.prelive.yml \
up -d --scale worker=3
Per-queue dedicated replicas require either separate compose
service definitions (e.g. worker-provision, worker-notifications)
each with its own command: overriding the default queue list, or
a runtime command: override:
# docker-compose.prelive.yml: per-queue split
services:
  worker-provision:
    image: ipsolis-worker
    command: celery -A tasks worker -Q provision --concurrency=8 -l info
    deploy: { replicas: 2 }
    env_file: .env
  worker-notifications:
    image: ipsolis-worker
    command: celery -A tasks worker -Q notifications --concurrency=4 -l info
    deploy: { replicas: 1 }
    env_file: .env
  worker-housekeeping:
    image: ipsolis-worker
    command: celery -A tasks worker -Q default,reclaim --concurrency=2 -l info
    deploy: { replicas: 1 }
    env_file: .env
Beat scaling: the beat container has no fixed container_name so it can be
replicated for HA:
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
up -d --scale beat=2
Note: Celery Beat is a singleton scheduler. Multiple beat replicas only make sense with a distributed lock backend. `celery-redbeat` (already configured) uses Redis locks to prevent duplicate task firing.
Liveness: each worker registers with Celery's mingle-on-startup,
which means a fresh worker is visible to Beat / other workers within
a couple of seconds. There's no separate health check to wire — if
the worker container is Up, it's consuming.
Visibility: Flower (the existing flower service in the dev
compose; see docker-compose.yml) shows live worker registration,
queue depth, and task-by-task duration breakdown. For production,
front it with the same nginx auth as the admin UI; Flower has no
built-in authn beyond HTTP basic.
12.3 Postgres high-availability
Postgres HA (streaming replication, pgBackRest, Patroni) is architecturally
possible — ip·Solis is single-primary and any connection-string switch requires
only a .env change and restart. A validated step-by-step guide is not included
in this version.
13. Troubleshooting
Container won't start
# Check container status and exit codes
docker compose ps -a
# Check logs for the failing service
docker compose logs <service-name> --tail=50
Health check fails through nginx but API is healthy
Nginx may have cached the old container IP. Restart the container
(not just nginx -s reload — Docker bind-mounts retain the old inode otherwise):
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
restart nginx
Database connection errors
# Check if postgres is running
docker compose exec postgres pg_isready -U xpuser
# Verify the connection from the API container
docker compose exec api python -c "
from sqlalchemy import create_engine, text
e = create_engine('postgresql://xpuser:<password>@postgres:5432/ipsolis')
with e.connect() as c: print(c.execute(text('SELECT 1')).scalar())
"
AD / LDAP connection issues
- Verify network connectivity from the container: `docker compose exec api curl -v telnet://dc01.yourcompany.com:389`
- Check the AD settings in Admin > Settings
- Review API logs for LDAP errors: `docker compose logs api 2>&1 | grep -i "ldap\|ad_lookup"`
Emails not sending
- Verify SMTP settings in Admin > Settings
- Check worker logs for SMTP errors: `docker compose logs worker 2>&1 | grep -i "smtp\|mail\|notification"`
- Ensure the server can reach the SMTP relay: `docker compose exec api curl -v telnet://smtp.yourcompany.com:587`
Permission denied on ssl directory
sudo chmod 644 nginx/ssl/cert.pem
sudo chmod 600 nginx/ssl/key.pem
14. Clean Reset (Test Environments)
Test and staging environments only. This section permanently destroys all data. Never run on a production instance.
Docker volumes (database data, Redis data) survive rm -rf /opt/ipsolis because
they are stored under /var/lib/docker/volumes/ — independent of the repository
directory. For a fully clean reinstall:
# 1. Stop the stack and delete volumes
cd /opt/ipsolis
docker compose \
-f docker-compose.yml \
-f docker-compose.prelive.yml \
down -v
# 2. Remove the repository directory
cd /opt
sudo rm -rf ipsolis
# 3. Reinstall (continue from section 2)
sudo git clone https://github.com/XenPool/ipsolis.git ipsolis
cd ipsolis
After this reset the database contains no users, no configuration and no assets — the initial setup (section 7) must be completed again.
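Since the reset is irreversible, you may want a guard in front of the `down -v` step that forces the operator to type the host's name first. This is an optional sketch, not part of ip·Solis; `confirm_destroy` is an illustrative name.

```shell
# confirm_destroy EXPECTED: prompt on stderr and succeed only when the line
# read from stdin matches EXPECTED exactly.
confirm_destroy() {
  printf 'Type "%s" to delete all ip·Solis volumes on this host: ' "$1" >&2
  IFS= read -r answer || return 1
  [ "$answer" = "$1" ]
}

# Usage, from /opt/ipsolis:
#   confirm_destroy "$(hostname)" && docker compose \
#     -f docker-compose.yml -f docker-compose.prelive.yml down -v
```

Typing the hostname makes it harder to run the reset in the wrong terminal, for example one that is still connected to production.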