The conventional wisdom says that if you want analytics, you need a data warehouse. Copy your data to Snowflake, BigQuery, or Redshift. Build a pipeline. Set up dbt. Then connect a BI tool. Six months later, you have dashboards.
For a company past a certain scale, that's the right architecture. But for most startups, early-stage companies, and small product teams, this is massive overkill and it actively slows you down.
Your production database already has all the data you need. This article explains how to run real-time analytics directly against your production database (or a read replica) without standing up a data warehouse.
The Data Warehouse Assumption Is Often Wrong
Data warehouses exist to solve specific problems:

- Joining data from many sources (your app database, Stripe, a CRM, ad platforms) in one place
- Isolating heavy analytical workloads from the systems serving your application
- Keeping historical snapshots beyond what your production schema retains
- Querying at a scale (billions of rows) that an OLTP database handles poorly
If none of those apply to you, a data warehouse is a solution to a problem you don't have. The cost isn't just money (though Snowflake and Redshift costs add up fast). It's latency, complexity, and the maintenance burden of keeping your pipeline in sync.
Most teams under 50 employees, and many up to a few hundred, don't need a warehouse. Their production database, whether it's PostgreSQL or MySQL, hosted on Supabase, PlanetScale, or anywhere else, has everything they need.
What "Real-Time" Actually Means Here
"Real-time" in analytics is often oversold. For most business use cases, you don't need sub-second latency. You need data that is:
Querying your production database directly, especially via a read replica, satisfies all three. You get data that's current as of a few seconds ago, with no pipeline to fail, accessible to anyone with the right tool.
The Read Replica Pattern
Before running analytics directly against production, the key concern is load. A heavy analytical query (one that scans millions of rows, does multiple JOINs, and computes aggregates) can slow down your application.
The solution is a read replica: a copy of your database that stays in sync with the primary through replication and serves read traffic separately. Most managed database providers offer this:

- AWS RDS and Aurora support read replicas with a few clicks
- Google Cloud SQL and Azure Database offer the same for PostgreSQL and MySQL
- Supabase and PlanetScale both support read replicas on their managed plans
Point your analytics tool at the read replica. Your application traffic hits the primary. Your analytical queries run on the replica without competing for the same resources.
If you're on a small database (under a few million rows) and your queries complete in under a second, you may not even need a replica yet. Just query the primary with care: avoid full table scans on hot tables during peak traffic.
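One way to check whether a query is cheap enough to run against the primary is to inspect its plan first. A minimal sketch, assuming a hypothetical `orders` table with a `created_at` column:

```sql
-- EXPLAIN shows the plan without fully executing the query.
-- A "Seq Scan" over a large table is the warning sign; an
-- "Index Scan" means the query is cheap enough for the primary.
EXPLAIN
SELECT COUNT(*)
FROM orders
WHERE created_at >= CURRENT_DATE - INTERVAL '7 days';

-- EXPLAIN ANALYZE actually runs the query and reports real timings,
-- so reserve it for a replica or off-peak hours.
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM orders
WHERE created_at >= CURRENT_DATE - INTERVAL '7 days';
```

If the plan shows a sequential scan on a hot table, an index on the filtered column usually fixes it.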
What You Can Build Without a Warehouse
Here's what's possible with direct database analytics using a natural language tool like AI for Database:
Revenue reporting:
```sql
SELECT
    DATE_TRUNC('month', created_at) AS month,
    SUM(amount) AS mrr,
    COUNT(DISTINCT customer_id) AS paying_customers
FROM subscriptions
WHERE status = 'active'
GROUP BY month
ORDER BY month DESC;
```

Ask this as: "Show me monthly recurring revenue and active customer count for the last 6 months."
Retention cohorts:
```sql
SELECT
    DATE_TRUNC('week', first_login) AS cohort_week,
    COUNT(DISTINCT user_id) AS cohort_size,
    COUNT(DISTINCT CASE WHEN last_login >= first_login + INTERVAL '7 days' THEN user_id END) AS retained_week1
FROM user_activity
GROUP BY cohort_week
ORDER BY cohort_week DESC;
```

Ask this as: "Show me weekly user cohorts and how many came back after 7 days."
Funnel analysis:
```sql
SELECT
    COUNT(*) FILTER (WHERE signed_up = true) AS signups,
    COUNT(*) FILTER (WHERE connected_db = true) AS connected,
    COUNT(*) FILTER (WHERE ran_first_query = true) AS activated
FROM onboarding_events
WHERE created_at >= CURRENT_DATE - INTERVAL '30 days';
```

Ask this as: "How many users in the last 30 days signed up, connected a database, and ran their first query?"
These are exactly the kinds of queries that non-technical business stakeholders need, and they run in milliseconds on a well-indexed production database.
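"Well-indexed" is doing real work in that sentence: the queries above stay fast only if the columns they filter and group on are indexed. A sketch of indexes that would support them (table and column names match the examples above):

```sql
-- Supports the MRR query: filter on status, group on created_at.
CREATE INDEX IF NOT EXISTS idx_subscriptions_status_created
    ON subscriptions (status, created_at);

-- Supports the funnel query: range filter on created_at.
CREATE INDEX IF NOT EXISTS idx_onboarding_events_created
    ON onboarding_events (created_at);

-- On a busy primary, prefer CREATE INDEX CONCURRENTLY so the
-- index build doesn't block writes while it runs.
```

One compound index that matches a query's WHERE and GROUP BY columns is usually worth more than several single-column indexes.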
The Performance Concern and How to Address It
The objection you'll often hear from engineers: "We can't run analytics against production, it'll be too slow."
Sometimes that's true. Often it's not. Here's when it matters:
When direct DB analytics works fine:

- Tables in the thousands to low millions of rows
- Queries that hit indexes and complete in under a second
- Modest concurrent traffic, or analytical queries scheduled off-peak
When you should consider a replica or a lightweight warehouse:

- Tables in the tens of millions of rows or more, where aggregates mean full scans
- A latency-sensitive application sharing the same primary
- Analytics users running frequent, heavy, unpredictable queries throughout the day
The practical path for most teams: start direct. Add a read replica if you notice performance issues. Graduate to a warehouse only if the replica approach isn't sufficient. Most teams reach that point later than they expect.
Setting Up Automated Dashboards From Your Production Database
Once you've decided to query your production database (or replica) directly, the next step is making those insights available without running queries manually.
AI for Database lets you build dashboards entirely from plain-English questions. Set a refresh schedule (hourly, daily, or custom) and the queries run automatically. Your team wakes up to current data without anyone having to remember to run anything.
A typical startup ops dashboard built this way:
| Metric | Refresh | Natural Language Question |
| --- | --- | --- |
| Daily signups | Hourly | "How many users signed up in the last 24 hours?" |
| MRR | Daily 6am | "What is total active subscription revenue this month?" |
| Churn this week | Daily | "How many users cancelled in the last 7 days?" |
| Active users today | Hourly | "How many unique users logged in today?" |
| Open support tickets | Every 4h | "How many open support tickets are there right now?" |
This is a complete business dashboard, built from five natural language questions, with no SQL written and no data warehouse.
Replacing the "Can You Pull a Report?" Request
One of the highest-value use cases for direct database analytics is eliminating the ad hoc "can you pull a quick report?" request that interrupts engineers throughout the day.
With natural language database access, the ops manager who wants "revenue by country for last quarter" can get it themselves in 30 seconds. The product manager asking "which onboarding step has the most drop-off this week?" can answer their own question. The CEO asking "how many paying customers do we have right now?" doesn't need to wait for a data export.
Each of these queries takes an engineer 10–30 minutes to handle when the request comes in ad hoc: context switching, finding the right tables, writing and debugging SQL, formatting the output. At 10 requests per week across a team, that's roughly two to five hours of engineering time redirected away from product work.
Setting Up Alerts Without a Pipeline
Beyond dashboards, your production database is also the right place to watch for conditions that need immediate attention.
Instead of building a complex event pipeline, you can define alert conditions in plain English through AI for Database's action workflows, for example:

- "Notify the team on Slack if daily signups drop below 20"
- "Email me when any subscription payment fails"
- "Alert the support channel when open tickets exceed 50"
The system checks these conditions on a schedule against your live database and fires the action automatically when the condition is met. No stored procedures. No database triggers. No separate monitoring stack to maintain.
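Conceptually, each scheduled check reduces to an ordinary query against your schema. An alert like "tell me when open tickets pile up" corresponds to something like the following (the `support_tickets` table and the threshold of 50 are illustrative assumptions):

```sql
-- The scheduled check: count the rows that match the condition.
SELECT COUNT(*) AS open_tickets
FROM support_tickets
WHERE status = 'open';

-- The scheduler compares open_tickets against the configured
-- threshold (here, 50) and fires the notification when exceeded.
```

Because the check is just a SELECT, it runs happily on a read replica and adds no write load anywhere.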
Creating a Read-Only Analytics User
Before connecting any analytics tool to your production database, create a dedicated read-only user. This prevents any accidental writes and makes it easy to audit what the analytics connection is doing.
In PostgreSQL:
```sql
CREATE USER analytics_user WITH PASSWORD 'choose_a_strong_password';
GRANT CONNECT ON DATABASE your_database TO analytics_user;
GRANT USAGE ON SCHEMA public TO analytics_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analytics_user;
-- Also grant access to future tables:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO analytics_user;
```

In MySQL:
```sql
CREATE USER 'analytics_user'@'%' IDENTIFIED BY 'choose_a_strong_password';
GRANT SELECT ON your_database.* TO 'analytics_user'@'%';
-- Not strictly required after GRANT on modern MySQL, but harmless:
FLUSH PRIVILEGES;
```

Use these credentials when connecting AI for Database (or any analytics tool) to your database. If the credentials are ever compromised, the attacker can only read data, not modify it.
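It's worth verifying the grants before wiring up the connection. A quick check in each database (user name matches the examples above):

```sql
-- PostgreSQL: list the table privileges held by analytics_user.
-- Every row should show SELECT and nothing else.
SELECT table_name, privilege_type
FROM information_schema.role_table_grants
WHERE grantee = 'analytics_user';

-- MySQL: show the grants for the user.
SHOW GRANTS FOR 'analytics_user'@'%';
```

If anything beyond SELECT (and the CONNECT/USAGE plumbing) shows up, revoke it before going further.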
When to Graduate to a Data Warehouse
Direct database analytics works until it doesn't. The signs you're outgrowing it:

- You need to join production data with external sources (Stripe, ad platforms, a CRM)
- Analytical queries are slow even on a dedicated replica
- You need historical snapshots your production schema doesn't keep
- Row counts have grown to the point where every aggregate is a multi-minute scan
At that point, a lightweight warehouse (BigQuery, MotherDuck, or DuckDB) becomes the right call. But most teams reach that point later than they expect, and having spent the interim doing direct database analytics means your team already knows exactly what data it needs from a warehouse and why.