
Real-Time Database Analytics Without a Data Warehouse

James Okonkwo · Developer Advocate · March 24, 2026 · 10 min read

The conventional wisdom says that if you want analytics, you need a data warehouse. Copy your data to Snowflake, BigQuery, or Redshift. Build a pipeline. Set up dbt. Then connect a BI tool. Six months later, you have dashboards.

For a company past a certain scale, that's the right architecture. But for most startups, early-stage companies, and small product teams, it's massive overkill, and it actively slows you down.

Your production database already has all the data you need. This article explains how to run real-time analytics directly against your production database (or a read replica) without standing up a data warehouse.

The Data Warehouse Assumption Is Often Wrong

Data warehouses exist to solve specific problems:

  • You're running heavy analytical queries that would overwhelm your production database
  • You need to join data from dozens of different sources (Stripe, Salesforce, your app DB, ad platforms)
  • You have regulatory requirements to keep analytical data separate
  • Your data team is large enough to maintain ETL pipelines

If none of those apply to you, a data warehouse is a solution to a problem you don't have. The cost isn't just money (though Snowflake and Redshift costs add up fast). It's latency, complexity, and the maintenance burden of keeping your pipeline in sync.

    Most teams under 50 employees and many teams up to a few hundred don't need a warehouse. Their production database, whether it's PostgreSQL, MySQL, Supabase, or PlanetScale, has everything they need.

    What "Real-Time" Actually Means Here

    "Real-time" in analytics is often oversold. For most business use cases, you don't need sub-second latency. You need data that is:

  • Fresh: current within minutes to hours, not milliseconds
  • Accurate: not lagging days behind because a pipeline failed silently
  • Accessible: not locked behind a DBA who has to run the query for you

    Querying your production database directly, especially via a read replica, satisfies all three. You get data that's current as of a few seconds ago, with no pipeline to fail, accessible to anyone with the right tool.

    The Read Replica Pattern

    Before running analytics directly against production, the key concern is load. A heavy analytical query, one that scans millions of rows, does multiple JOINs, and computes aggregates, can slow down your application.

    The solution is a read replica: a copy of your database that continuously replicates changes from the primary and serves read-only traffic. Most managed database providers offer this:

  • PostgreSQL (AWS RDS, Supabase, Neon): Create a read replica in a few clicks from the console
  • MySQL / PlanetScale: Read replicas are supported and recommended for analytical workloads
  • Supabase: Analytics queries can be routed to a separate connection string

    Point your analytics tool at the read replica. Your application traffic hits the primary. Your analytical queries run on the replica without competing for the same resources.
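The primary/replica split reduces to a small connection-routing decision in whatever layer opens database connections. A minimal sketch, assuming hypothetical `PRIMARY_DSN` and `REPLICA_DSN` environment variables (the endpoint names here are invented placeholders, not real infrastructure):

```python
import os

# Hypothetical endpoints; in practice your provider's console gives you a
# separate connection string for the read replica.
PRIMARY_DSN = os.environ.get("PRIMARY_DSN", "postgres://app@primary.internal/appdb")
REPLICA_DSN = os.environ.get("REPLICA_DSN", "postgres://analytics@replica.internal/appdb")

def dsn_for(workload: str) -> str:
    """Route analytics connections to the replica, everything else to the primary."""
    return REPLICA_DSN if workload == "analytics" else PRIMARY_DSN

print(dsn_for("analytics"))  # replica endpoint
print(dsn_for("app"))        # primary endpoint
```

The point of centralizing this in one function is that adding a replica later becomes a configuration change, not a code hunt.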

    If you're on a small database (under a few million rows) and your queries complete in under a second, you may not even need a replica yet. Just query the primary with care: avoid full table scans on hot tables during peak traffic.

    What You Can Build Without a Warehouse

    Here's what's possible with direct database analytics using a natural language tool like AI for Database:

    Revenue reporting:

    SELECT
      DATE_TRUNC('month', created_at) AS month,
      SUM(amount) AS mrr,
      COUNT(DISTINCT customer_id) AS paying_customers
    FROM subscriptions
    WHERE status = 'active'
      AND created_at >= CURRENT_DATE - INTERVAL '6 months'
    GROUP BY month
    ORDER BY month DESC;

    Ask this as: "Show me monthly recurring revenue and active customer count for the last 6 months."
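To make the shape of this query concrete, here is a self-contained sketch against an in-memory SQLite database. SQLite has no DATE_TRUNC, so strftime('%Y-%m', ...) stands in for it; the table and rows are invented toy data, and the date filter is omitted for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE subscriptions (
        customer_id INTEGER, amount REAL, status TEXT, created_at TEXT
    )
""")
# Invented toy rows: two active subscriptions in February, one in March,
# plus one canceled subscription that the WHERE clause filters out.
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?, ?)",
    [
        (1, 50.0, "active", "2026-02-10"),
        (2, 30.0, "active", "2026-02-20"),
        (3, 30.0, "canceled", "2026-02-25"),
        (1, 50.0, "active", "2026-03-05"),
    ],
)
rows = conn.execute("""
    SELECT strftime('%Y-%m', created_at) AS month,
           SUM(amount) AS mrr,
           COUNT(DISTINCT customer_id) AS paying_customers
    FROM subscriptions
    WHERE status = 'active'
    GROUP BY month
    ORDER BY month DESC
""").fetchall()
print(rows)  # [('2026-03', 50.0, 1), ('2026-02', 80.0, 2)]
```

The same GROUP BY month / SUM / COUNT DISTINCT skeleton underlies the retention and funnel queries below; only the grouping key and aggregates change.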

    Retention cohorts:

    SELECT
      DATE_TRUNC('week', first_login) AS cohort_week,
      COUNT(DISTINCT user_id) AS cohort_size,
      COUNT(DISTINCT CASE WHEN last_login >= first_login + INTERVAL '7 days' THEN user_id END) AS retained_week1
    FROM user_activity
    GROUP BY cohort_week
    ORDER BY cohort_week DESC;

    Ask this as: "Show me weekly user cohorts and how many came back after 7 days."

    Funnel analysis:

    SELECT
      COUNT(*) FILTER (WHERE signed_up = true) AS signups,
      COUNT(*) FILTER (WHERE connected_db = true) AS connected,
      COUNT(*) FILTER (WHERE ran_first_query = true) AS activated
    FROM onboarding_events
    WHERE created_at >= CURRENT_DATE - INTERVAL '30 days';

    Ask this as: "How many users in the last 30 days signed up, connected a database, and ran their first query?"
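The funnel runs almost unchanged on SQLite; here the FILTER clause is swapped for the more portable SUM(CASE ...), the date filter is dropped, and the rows are invented toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE onboarding_events (
        user_id INTEGER, signed_up INTEGER,
        connected_db INTEGER, ran_first_query INTEGER
    )
""")
# Invented toy rows: 3 signups, of whom 2 connected a database and 1 ran a query.
conn.executemany(
    "INSERT INTO onboarding_events VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 1, 0), (3, 1, 0, 0)],
)
signups, connected, activated = conn.execute("""
    SELECT SUM(CASE WHEN signed_up THEN 1 ELSE 0 END),
           SUM(CASE WHEN connected_db THEN 1 ELSE 0 END),
           SUM(CASE WHEN ran_first_query THEN 1 ELSE 0 END)
    FROM onboarding_events
""").fetchone()
print(signups, connected, activated)  # 3 2 1
```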

    These are exactly the kinds of queries that non-technical business stakeholders need, and they run in milliseconds on a well-indexed production database.

    The Performance Concern and How to Address It

    The objection you'll often hear from engineers: "We can't run analytics against production, it'll be too slow."

    Sometimes that's true. Often it's not. Here's when it matters:

    When direct DB analytics works fine:

  • Tables under roughly 10 million rows
  • Queries that filter on indexed columns
  • Dashboards that refresh every hour (not every second)
  • Applications with fewer than 100K daily active users

    When you should consider a replica or a lightweight warehouse:

  • Queries that scan entire large tables with no usable index
  • Sub-minute refresh requirements on heavy aggregation queries
  • Joining five or more large tables with complex transformations
  • When analytical query load is visibly affecting application response time

    The practical path for most teams: start direct. Add a read replica if you notice performance issues. Graduate to a warehouse only if the replica approach isn't sufficient. Most teams reach that point later than they expect.

    Setting Up Automated Dashboards From Your Production Database

    Once you've decided to query your production database (or replica) directly, the next step is making those insights available without running queries manually.

    AI for Database lets you build dashboards entirely from plain-English questions. Set a refresh schedule (hourly, daily, or custom) and the queries run automatically. Your team wakes up to current data without anyone having to remember to run anything.

    A typical startup ops dashboard built this way:

    Metric | Refresh | Natural Language Question
    ------ | ------- | -------------------------
    Daily signups | Hourly | "How many users signed up in the last 24 hours?"
    MRR | Daily 6am | "What is total active subscription revenue this month?"
    Churn this week | Daily | "How many users cancelled in the last 7 days?"
    Active users today | Hourly | "How many unique users logged in today?"
    Open support tickets | Every 4h | "How many open support tickets are there right now?"

    This is a complete business dashboard, built from five natural language questions, with no SQL written and no data warehouse.

    Replacing the "Can You Pull a Report?" Request

    One of the highest-value use cases for direct database analytics is eliminating the ad hoc "can you pull a quick report?" request that interrupts engineers throughout the day.

    With natural language database access, the ops manager who wants "revenue by country for last quarter" can get it themselves in 30 seconds. The product manager asking "which onboarding step has the most drop-off this week?" can answer their own question. The CEO asking "how many paying customers do we have right now?" doesn't need to wait for a data export.

    Each of these queries takes an engineer 10–30 minutes to handle when the request comes in ad hoc: context switching, finding the right tables, writing and debugging SQL, formatting the output. At 10 requests per week across a team, that's hours of engineering time redirected away from product work.

    Setting Up Alerts Without a Pipeline

    Beyond dashboards, your production database is also the right place to watch for conditions that need immediate attention.

    Instead of building a complex event pipeline, you can define alert conditions in plain English through AI for Database's action workflows:

  • "If daily new signups fall below 20, send a Slack message to #growth"
  • "If any transaction amount exceeds $10,000, email the finance team"
  • "If failed logins in the last hour exceed 100, call the security webhook"

    The system checks these conditions on a schedule against your live database and fires the action automatically when the condition is met. No stored procedures. No database triggers. No separate monitoring stack to maintain.
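Under the hood, a scheduled check of this kind reduces to an ordinary query plus a threshold comparison. A minimal sketch of the first rule, with an invented signups table and a returned message standing in for the Slack webhook call:

```python
import sqlite3

def check_signup_alert(conn, threshold=20):
    """Fire an alert when daily signups fall below the threshold.

    Sketch of the scheduled-check pattern: one COUNT query against the
    live database, one comparison, one action when the condition holds.
    """
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM signups WHERE created_at >= datetime('now', '-1 day')"
    ).fetchone()
    if count < threshold:
        # In a real workflow this would POST to a Slack webhook.
        return f"Daily signups low: {count} (threshold {threshold})"
    return None

# Invented toy data: only 5 signups in the last 24 hours.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (user_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, datetime('now'))", [(i,) for i in range(5)]
)
print(check_signup_alert(conn))  # Daily signups low: 5 (threshold 20)
```

Run this function from any scheduler (cron, a worker queue, or a managed workflow) and the "no pipeline" property follows: there is nothing to keep in sync, because the check reads the live data every time.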

    Creating a Read-Only Analytics User

    Before connecting any analytics tool to your production database, create a dedicated read-only user. This prevents any accidental writes and makes it easy to audit what the analytics connection is doing.

    In PostgreSQL:

    CREATE USER analytics_user WITH PASSWORD 'choose_a_strong_password';
    GRANT CONNECT ON DATABASE your_database TO analytics_user;
    GRANT USAGE ON SCHEMA public TO analytics_user;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO analytics_user;
    -- Also grant access to future tables:
    ALTER DEFAULT PRIVILEGES IN SCHEMA public
      GRANT SELECT ON TABLES TO analytics_user;

    In MySQL:

    CREATE USER 'analytics_user'@'%' IDENTIFIED BY 'choose_a_strong_password';
    GRANT SELECT ON your_database.* TO 'analytics_user'@'%';
    FLUSH PRIVILEGES; -- only needed when grant tables are edited directly

    Use these credentials when connecting AI for Database (or any analytics tool) to your database. If your credentials are ever compromised, the attacker can only read data, not modify it.
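The effect of a read-only connection can be demonstrated in miniature with SQLite's read-only open mode: SELECT succeeds, any write is rejected. SQLite has no user accounts, so this is an analogy for the Postgres/MySQL grants above, not a substitute for them:

```python
import os
import sqlite3
import tempfile

# Create a small database file, then reopen it read-only.
path = os.path.join(tempfile.mkdtemp(), "app.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE users (id INTEGER)")
rw.execute("INSERT INTO users VALUES (1)")
rw.commit()
rw.close()

ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
print(ro.execute("SELECT COUNT(*) FROM users").fetchone())  # (1,)
try:
    ro.execute("DELETE FROM users")  # rejected: connection is read-only
except sqlite3.OperationalError as exc:
    print("write rejected:", exc)
```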

    When to Graduate to a Data Warehouse

    Direct database analytics works until it doesn't. The signs you're outgrowing it:

  • Analytical queries are noticeably slowing down your application even with a replica
  • Your data team spends significant time maintaining joins between ten or more tables
  • You need to combine database data with external sources (Stripe webhooks, Salesforce exports, ad platform APIs)
  • You have compliance requirements mandating separation of production and analytical data
  • Your heaviest queries take more than 30 seconds on a replica

    At that point, a lightweight warehouse (BigQuery, MotherDuck, or DuckDB) becomes the right call. But most teams reach that point later than they expect, and the time spent doing direct database analytics in the interim means your team already knows exactly what data they need from a warehouse and why.

    Ready to try AI for Database?

    Query your database in plain English. No SQL required. Start free today.