I've been quietly building AI systems for the past month. Most of my friends and colleagues don't know this. This is me telling you about it.
Since early January I've been doing agent development with Temporal and Go, generating images with Stable Diffusion XL via ComfyUI, and serving large language models with Ollama on rented GPUs from Vast.ai. The stack is Go, TypeScript, Kubernetes, Postgres with pgvector, and a lot of YAML.
My most recent project is called chargenai. I want to tell you about it.
What chargenai does
You type in a character description. Something like "a cyberpunk hacker with neon pink hair" or "a retired cartographer who talks to birds." The system takes that description and generates three things: a name and tagline, a full backstory written in markdown, and a portrait image. All of them match.
The backstory is generated by an 80-billion-parameter language model (Qwen3-Next running through Ollama). The portrait is generated by Stable Diffusion XL running through ComfyUI. The whole pipeline is orchestrated by Temporal workflows.
Here's roughly how it works:
- You submit a prompt through a web UI (built with Preact)
- The Go API creates a database record and kicks off a Temporal workflow
- The orchestrator dispatches a "generate profile" task to an LLM worker running on a Vast.ai GPU instance
- The LLM worker generates the character's name, tagline, and full markdown profile
- The orchestrator then dispatches a "generate image" task to an SD worker on a different GPU instance
- The SD worker generates a 1024x1024 portrait with ComfyUI and uploads it to S3
- The frontend polls until everything is ready and displays the result
The control plane (API, orchestrator, Temporal, Postgres) runs on managed Kubernetes. The GPU workers run on Vast.ai and connect back to Temporal over the internet. This separation means I can scale GPU resources independently and only pay for them when I need them.
   Control Plane (Kubernetes)                GPU Workers (Vast.ai)

┌──────────┐   ┌──────────────┐            ┌───────────────┐
│ Browser  │ → │    Go API    │            │  LLM Worker   │
│ (Preact) │   │              │            │   (Ollama)    │
└──────────┘   └──────┬───────┘            └───────┬───────┘
                      │                            │
               ┌──────▼───────┐                    │
               │   Temporal   │◄───────────────────┘
               │  (Workflows) │◄───────────────────┐
               └──────┬───────┘                    │
                      │                    ┌───────┴───────┐
               ┌──────▼───────┐            │   SD Worker   │
               │   Postgres   │            │   (ComfyUI)   │
               │    + S3      │            └───────────────┘
               └──────────────┘
How I got here
The project didn't start as a character generator. It started as an LLM-powered todo list.
A todo list (January 8)
I wanted to learn how to wire up a language model to a web app. So I built a todo list that used Ollama for natural language input, PocketBase for persistence, and server-sent events for live updates. React frontend, TypeScript backend, pnpm workspaces.
I learned: how to call an LLM from a backend, how to stream responses, and that stripping out error handling early in a project surfaces problems faster than silently catching them.
A matchmaker game (January 8-9)
The todo list worked. So I immediately made it harder. I pivoted to a "Matchmaker GenAI game" and added Temporal workflows for orchestration and image generation with an IP-Adapter pipeline.
I learned: Temporal's programming model (workflows, activities, task queues). I also learned what deployment pain feels like. Rsync permission errors. CORS issues. Nginx proxy misconfiguration. PocketBase connection failures through RunPod's proxy. Each one of these took hours.
Docker hell (January 9-11)
I spent two days fighting Docker. Optimizing Dockerfiles. Reducing image sizes. Fixing CI disk space issues. Iterating on PocketBase setup scripts. Switching from building ComfyUI at Docker build time to pulling it at runtime. Switching to official PyTorch base images. Switching from pip to uv for faster installs.
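The "pull ComfyUI at runtime" change can be sketched. This is an illustrative Dockerfile, not the real one; the base image tag and script name are assumptions:

```dockerfile
# Illustrative sketch, not the actual chargenai Dockerfile.
# An official PyTorch runtime base keeps CUDA/cuDNN setup out of our hands.
FROM pytorch/pytorch:2.4.0-cuda12.1-cudnn9-runtime

# uv installs Python dependencies much faster than pip.
RUN pip install --no-cache-dir uv

# ComfyUI is cloned by the entrypoint at container start rather than
# baked in at build time, so the image stays small and ComfyUI can be
# updated without rebuilding or re-pushing gigabytes of layers.
COPY start.sh /start.sh
ENTRYPOINT ["/start.sh"]
```

The trade-off is a slower cold start on each fresh GPU instance in exchange for much cheaper builds and pushes in CI.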
I was also iterating on the UI theme. I added visual effects. Then I removed them. One commit message from this period reads: "Remove disgusting, hateful, revolting, vile, piece of shit visual effects." Real frustration, real learning.
Splitting compute from control (January 11-13)
I realized the AI workloads needed to run on different machines than the web services. So I split the infrastructure: a control plane for the API and database, and separate GPU nodes for inference. I set up self-hosted CI runners on Vultr. I learned how to deploy to RunPod with Docker build matrices. I fixed ComfyUI polling and got realtime updates working.
I learned: the difference between control plane and data plane, and why you separate them.
Character generation (January 13-15)
This is when chargenai became chargenai. I added LLM-driven character generation with S3 storage for images. The theme changed several times: luxury matchmaking, cyberpunk, millionaire dating club, then I stripped all the theming and simplified to unstructured prose output.
I learned: that themes are a distraction when you're building infrastructure. Let the LLM be creative and get out of its way.
Rewriting the backend in Go (January 15-17)
I rewrote the backend from TypeScript to Go. This was a big pivot. The TypeScript backend worked, but I wanted something that compiled to a single binary and had first-class Temporal SDK support.
At the same time I moved the container registry from GitHub to Vultr, upgraded CUDA and PyTorch versions, and simplified the content classification system.
I learned: Go's simplicity is a feature when you're managing a distributed system. Less magic means fewer surprises.
Kubernetes migration (January 17-20)
Docker Compose was fine for development but I needed something more resilient for production. I migrated to Vultr's managed Kubernetes (VKE). Removed PocketBase entirely, moved everything to Postgres. Multiple build failures along the way.
I learned: Kubernetes is complex but the abstraction is worth it once you have more than a couple of services. Also, PVCs and StatefulSets will test your patience.
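For a sense of what "a lot of YAML" means in practice, here's a hedged sketch of a Deployment for the Go API; every name, image, and value is an assumption, not chargenai's actual manifest:

```yaml
# Illustrative sketch; names, image, and values are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chargen-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: chargen-api
  template:
    metadata:
      labels:
        app: chargen-api
    spec:
      containers:
        - name: api
          image: registry.example.com/chargen/api:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: chargen-db
                  key: url
```

Postgres, as the stateful piece, needs a StatefulSet and a PVC instead; that is where the patience-testing happens.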
Vast.ai and GPU wrangling (January 20-24)
RunPod worked but I wanted more control over GPU selection. I moved the workers to Vast.ai. I learned how to cache 84GB models on persistent volumes, automate deployment, and deal with the realities of renting GPU hardware from strangers on the internet. I restricted deployments to USA-only. I replaced my deployment scripts with agent runbooks.
One commit from this period: "fix the fucking worker deploy instructions."
I learned: GPU rental is powerful but fragile. Model loading is the bottleneck. Cache everything.
UI and smart buttons (January 24-27)
With the backend stable, I focused on the frontend. Switched to Preact with nanostores for state management. Added dark/light mode with system preference detection. Built a "smart buttons" feature where the LLM suggests prompt modifications in real time as you type.
I learned: nanostores is excellent for simple state management. Debounced API calls with LLM suggestions feel like magic when they work.
Planning (January 27 - now)
After three weeks of building, I slowed down to plan. Wrote roadmaps for face consistency, multi-image generation, templates, and a GitOps migration. Organized everything into design documents.
I learned: planning after building is more effective than planning before building. I now know what's hard and what isn't, because I've hit the walls myself.
What I took away
Every phase involved trying something, hitting a wall, learning from it, and trying the next thing. The project changed identity three times. The backend language changed. The database changed. The deployment strategy changed twice. The GPU provider changed.
Nothing was wasted. The todo list taught me LLM integration. The matchmaker taught me Temporal. Docker hell taught me infrastructure. Each failure was the prerequisite for the next attempt.
If you're thinking about building with AI, my advice is: start building something small, expect to throw it away, and pay attention to what you learn when things break.
What's next
This is my first blog post on oatlab. I plan to write more about specific topics: agent architecture, Temporal workflow patterns, ComfyUI pipelines, model serving, and the experience of renting GPUs.
If any of this is interesting to you, I'd like to hear from you. Thanks for reading.