Marketing Agent System
An AI agent system that runs the marketing for Upscale Print. It plans content, generates creative, publishes to Instagram, and feeds performance data back into future decisions.
Introduction
This system runs the marketing for Upscale Print. It plans weekly content, generates images and copy, evaluates creative quality, publishes to Instagram, and feeds performance data back into future planning. The @upscaleprint Instagram account is managed almost entirely by this system.
I built it because I needed real marketing for a real product, and I wanted to find out what it actually takes to make an agent system do useful, ongoing work. Not a demo, not a one-shot prompt chain, but something that runs on its own, handles failures, and improves over time.
The architecture separates reasoning from execution. A TypeScript agent layer handles planning and decisions. A Python worker layer handles scheduled jobs, API calls, and side effects. Both share a SQLite database that serves as the control plane, audit trail, and source of truth.
Problem
Most "AI agent" demos are impressive for about five minutes. Then you ask: can this run on its own? What happens when an API call fails? Who approved that budget change? Where can I see what it did last Tuesday? The answers are usually bad.
These aren’t theoretical concerns: hidden state that nobody can inspect, side effects that fire without oversight, no separation between the model deciding something and the system actually doing it. These problems show up fast when you try to run an agent against real APIs with real consequences.
So I built a system that treats these problems as design requirements. Models handle reasoning and evaluation. Explicit jobs handle execution. Everything is logged, retryable, and inspectable.
What I owned
I owned the system end to end:
- Architecture and data model
- Agent and workflow design
- Tools, handlers, and integrations
- Scheduling and execution patterns
- Operational safeguards
- Deployment, documentation, and testing
Outcome
The system has been running since early 2026:
- The @upscaleprint Instagram account is actively managed by the system
- Content planning, image generation, copy, and publishing run on a weekly cycle
- Generated creative goes through vision-based quality evaluation before publishing
- Google Ads campaigns are analyzed and optimized, with approval gates on budget changes
- Performance metrics feed back into planning so the system improves over time
- Every action is logged with full audit trails
System overview
Two processes, one database. The agent layer (TypeScript/Mastra) handles conversation, planning, and specialist reasoning. The worker layer (Python) handles scheduled jobs, API integrations, and execution. Both read and write to a shared SQLite database that serves as the control plane and audit trail. A full content cycle works like this:
- The system pulls recent performance data and account context
- A strategist agent builds a weekly content plan
- Posts are stored in the database with status tracking
- Copy and image prompts are generated for each post
- Creative assets are generated and evaluated by a vision model
- If quality doesn’t pass, the system revises and retries
- Approved posts are queued and published to Instagram on schedule
- Engagement metrics are collected after publishing
- Performance data feeds back into the next planning cycle
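The steps above can be sketched as a status-driven pipeline, where each post moves through explicit states tracked in the database. The state names and `advance` function here are hypothetical, not the production schema:

```python
# Hypothetical sketch of the weekly content cycle as a status machine.
# Each transition corresponds to a separate worker job, so any step can
# fail and be retried without losing the post's place in the pipeline.

PLANNED, DRAFTED, GENERATED, APPROVED, PUBLISHED = (
    "planned", "drafted", "generated", "approved", "published"
)

def advance(post: dict) -> dict:
    """Move a post one step forward through the cycle."""
    transitions = {
        PLANNED: DRAFTED,     # copy and image prompt generated
        DRAFTED: GENERATED,   # creative asset rendered
        GENERATED: APPROVED,  # passed vision-model quality review
        APPROVED: PUBLISHED,  # pushed to Instagram on schedule
    }
    post["status"] = transitions[post["status"]]
    return post

post = {"id": 1, "status": PLANNED}
while post["status"] != PUBLISHED:
    post = advance(post)
print(post["status"])  # -> published
```

Because every state change is a database write, a crash mid-cycle leaves the post in a known state that the next scheduled run can pick up.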
Key decisions
Separate reasoning from execution
The agent layer proposes, analyzes, and plans. The worker layer does things. When a model suggests something, that suggestion goes through an explicit job before anything happens in the real world. This makes the system retryable, inspectable, and safe to leave running.
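A minimal sketch of the propose/execute split, using an in-memory SQLite jobs table. Table and column names are illustrative, not the production schema:

```python
import sqlite3

# Hypothetical sketch: the agent layer records intent as a job row; the
# worker layer later picks it up and performs the actual side effect.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT, status TEXT)"
)

def propose(kind: str, payload: str) -> None:
    """Agent side: record intent only; nothing touches the real world yet."""
    db.execute(
        "INSERT INTO jobs (kind, payload, status) VALUES (?, ?, 'pending')",
        (kind, payload),
    )

def run_pending() -> list[str]:
    """Worker side: execute pending jobs and mark them done."""
    done = []
    for job_id, kind, payload in db.execute(
        "SELECT id, kind, payload FROM jobs WHERE status = 'pending'"
    ).fetchall():
        # ... call the real API here; on exception, leave status 'pending'
        # so the next scheduled run retries it ...
        db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
        done.append(kind)
    return done

propose("publish_post", '{"post_id": 1}')
print(run_pending())  # -> ['publish_post']
```

The key property is that a model suggestion is just a row until a worker acts on it, which is what makes every action retryable and inspectable.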
Shared SQLite state as the control plane
Instead of spreading state across agent memory, process variables, and external services, everything lives in one SQLite database. Both processes read and write to it. Debugging is straightforward, recovery is possible, and the audit trail is automatic.
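The audit-trail idea can be shown in a few lines: every action, from either process, writes a durable row, so "what did it do last Tuesday?" becomes a SQL query. The schema here is a hypothetical stand-in:

```python
import sqlite3

# Illustrative audit-trail sketch; schema and names are not the real ones.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    ts TEXT DEFAULT CURRENT_TIMESTAMP,
    actor TEXT,   -- 'agent' or 'worker'
    action TEXT,
    detail TEXT
)""")

def log(actor: str, action: str, detail: str = "") -> None:
    db.execute(
        "INSERT INTO audit_log (actor, action, detail) VALUES (?, ?, ?)",
        (actor, action, detail),
    )

log("agent", "plan_created", "week 12, 5 posts")
log("worker", "post_published", "post 42")

rows = db.execute("SELECT actor, action FROM audit_log ORDER BY id").fetchall()
print(rows)  # -> [('agent', 'plan_created'), ('worker', 'post_published')]
```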
Approval gates on risky actions
Not everything the model recommends should happen automatically. Budget changes in Google Ads are a good example. The system flags these for human review instead of executing them directly.
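One way to express such a gate is a denylist of risky action kinds that get routed to a review queue instead of the normal execution path. Action names and the queue shape below are hypothetical:

```python
# Hypothetical approval-gate sketch: risky actions (like budget changes)
# are queued for human review rather than executed directly.
RISKY_ACTIONS = {"set_budget", "pause_campaign"}

def handle(action: str, params: dict, approvals: list) -> str:
    if action in RISKY_ACTIONS:
        approvals.append(
            {"action": action, "params": params, "status": "needs_review"}
        )
        return "queued_for_approval"
    return "executed"  # safe actions go through the normal job path

pending = []
print(handle("set_budget", {"campaign": "spring", "daily_eur": 40}, pending))
# -> queued_for_approval
print(handle("fetch_metrics", {}, pending))
# -> executed
```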
Evaluation loops, not blind trust
Generated images go through a vision-model review before publishing. If the creative doesn’t meet quality criteria, the system revises and retries. This matters especially for brand consistency, where first-try output from image models is often not good enough.
Observability as a feature, not an afterthought
The system was built to be operated, not just launched: status visibility, health checks, structured logging, and durable records of every action. If something breaks at 3am, I can see exactly what happened.
What mattered most
- Agents need structure around them, not just freedom
- Separating reasoning from execution makes agent systems safe to run unattended
- Observability is not optional when real APIs and real money are involved
- Evaluation loops matter more than prompt quality for consistent output
- The hard part of AI systems is everything around the model call
What I'd improve next
- Richer dashboards for monitoring system health and content performance
- Better operator UX for approvals and manual intervention
- Stronger measurement of how agent decisions affect business outcomes
- Expanding to more channels beyond Instagram and Google Ads
- Automated evaluation of agent quality over longer time horizons
Running this system taught me that the hard part of agents is not the model. It’s everything you build around it to make it safe, observable, and actually useful day after day.