Startups and mid-sized enterprises usually begin their infrastructure journey with a single, monolithic relational database—most commonly PostgreSQL. Initially, this is a brilliant architectural choice. PostgreSQL provides full ACID compliance, foreign keys, constraints, and joins with rock-solid reliability.
It handles users, passwords, sessions, financial logs, and configuration state perfectly. This is the realm of OLTP (Online Transaction Processing).
However, as the platform begins to offer heavy dashboards, historical trend tracking, and multi-dimensional filtering, response times begin to crack. The usual “solutions” are applied: adding read replicas, throwing RAM at the RDS instance, building materialized views, and applying aggressive indexing.
But at billions of rows, a fundamental truth becomes apparent: You cannot optimize an OLTP engine for exhaustive OLAP (Online Analytical Processing) workloads without eventually crashing the system or burning cash.
The Problem: Row-Based Engines vs Column-Based Queries
In PostgreSQL, data is stored in rows. If you want to compute the SUM(total_revenue) across 50 million sales records, the database engine must load every field of each row (customer descriptions, metadata blobs, boolean flags) from the disk block just to reach the total_revenue value.
This causes massive I/O saturation: memory caches get flushed to make room for huge sequential scans, which in turn slows down the core transactional duties of the app (like a user simply logging in).
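To make the I/O gap concrete, here is a minimal pure-Python sketch—not PostgreSQL or ClickHouse internals; the record layout and field sizes are invented for illustration—contrasting how many bytes each storage layout must read to compute the same sum:

```python
# Illustrative sketch: a row-store must read whole records to aggregate one
# field, while a column-store reads one contiguous array.
import struct

# Fake "table": 1,000 sales records, each row packed as
# (total_revenue: float64, flag: uint8, 64-byte metadata blob) = 73 bytes/row.
ROW_FMT = "<dB64s"
ROW_SIZE = struct.calcsize(ROW_FMT)  # 73 bytes

rows = b"".join(
    struct.pack(ROW_FMT, float(i), i % 2, b"x" * 64) for i in range(1000)
)

# Row-store aggregation: every byte of every row crosses the I/O path,
# even though only the first 8 bytes of each record are needed.
row_bytes_read = len(rows)  # 73,000 bytes
total = sum(
    struct.unpack_from("<d", rows, off)[0]
    for off in range(0, len(rows), ROW_SIZE)
)

# Column-store aggregation: the revenue column lives contiguously,
# so only 8 bytes per row are read.
revenue_column = b"".join(struct.pack("<d", float(i)) for i in range(1000))
col_bytes_read = len(revenue_column)  # 8,000 bytes

print(total, row_bytes_read // col_bytes_read)  # same sum, ~9x less I/O
```

Real engines complicate this with pages, caches, and TOAST storage, but the asymmetry is the same: the wider the row, the more wasted reads per aggregated value.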
The Montinegro Analytics Approach: The Structural Split
To guarantee sub-second reporting for our Business Intelligence clients regardless of incoming data scale, we strictly enforce a bifurcated data architecture in our SaaS solutions:
1. PostgreSQL (The Source of Truth)
We utilize PostgreSQL exclusively for the transactional heartbeat. User sessions, core configurations, permission matrices, and application logic. Its memory is tightly bound to maintaining sub-10ms response times for the backend REST APIs.
2. ClickHouse (The Analytical Anchor)
For all payload events, heavy logs, metric aggregations, and dimensional telemetry, the data stream is funneled into ClickHouse—a lightning-fast columnar database management system.
Because ClickHouse stores data by columns, the same SUM(total_revenue) query across 50 million rows completes in milliseconds. It reads only the physical bytes of that one column from disk—stored contiguously and compressed natively via specialized encodings—while ignoring every other field entirely.
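The compression advantage is easy to demonstrate. In the sketch below, zlib stands in for ClickHouse's actual codecs (LZ4/ZSTD plus specialized encodings like Delta), and the field sizes and 32-byte metadata blob are invented for illustration—homogeneous column data compresses far better than the same values interleaved with unrelated row fields:

```python
# Illustrative sketch of why columnar layout compresses so well.
import random
import struct
import zlib

n = 10_000
random.seed(7)
revenue_cents = [100 + i for i in range(n)]  # monotonically growing values

# Columnar layout: one homogeneous, contiguous array of int64 values.
column = b"".join(struct.pack("<q", v) for v in revenue_cents)

# Row layout: each value interleaved with an incompressible metadata blob.
rows = b"".join(
    struct.pack("<q", v) + random.randbytes(32) for v in revenue_cents
)

col_ratio = len(column) / len(zlib.compress(column))
row_ratio = len(rows) / len(zlib.compress(rows))
print(f"column: {col_ratio:.1f}x, rows: {row_ratio:.1f}x")
```

Less data on disk means fewer bytes to scan, which compounds the columnar I/O win on every query.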
3. Celery & Redis (The Ingestion Buffer)
To prevent the Django backend from blocking while writing raw data to ClickHouse, we offload the heavy lifting. Asynchronous tasks are routed via Redis to Celery workers, which parse, clean, and batch-insert incoming data payloads (such as processed .pbip metadata from Power BI integrations) into ClickHouse in efficient bulk writes.
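The core of that worker logic is a batch buffer. Below is a minimal, dependency-free sketch of the idea; the class name, batch size, and `flush` callback are illustrative assumptions, with the callback standing in for an actual ClickHouse client insert:

```python
# Sketch of the batching logic a Celery worker might apply before a
# bulk INSERT into ClickHouse.
from typing import Callable

class IngestBuffer:
    def __init__(self, flush: Callable[[list[dict]], None], batch_size: int = 500):
        self.flush = flush  # stand-in for the ClickHouse client call
        self.batch_size = batch_size
        self._pending: list[dict] = []

    def add(self, event: dict) -> None:
        # Light validation/cleaning before buffering.
        if "event_type" not in event:
            return  # drop malformed payloads
        self._pending.append(event)
        if len(self._pending) >= self.batch_size:
            self.drain()

    def drain(self) -> None:
        # Flush whatever is pending as one bulk insert.
        if self._pending:
            self.flush(self._pending)
            self._pending = []

batches: list[int] = []
buf = IngestBuffer(lambda rows: batches.append(len(rows)), batch_size=100)
for i in range(250):
    buf.add({"event_type": "page_view", "id": i})
buf.drain()  # flush the final partial batch
print(batches)  # [100, 100, 50]
```

Batching matters because ClickHouse strongly favors large, infrequent inserts over many small ones; buffering in the worker keeps both the web tier and the analytical store happy.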
The Dashboard Consumption (Apache Superset)
Finally, our visual layers—currently driven by Apache Superset (slated for headless transition to Cube.dev)—are directly hooked into ClickHouse, bypassing the Django ORM and the PostgreSQL transactional core entirely.
The result? The SaaS interface (rendered over Dart/Flutter) remains incredibly snappy. Client dashboards can drill down through billions of rows aggregated in real time, while application users continue logging in and navigating settings on the PostgreSQL side with no latency impact.
Conclusion
Data architecture shouldn’t be a monolithic constraint. True scale requires an honest admission of limitations: let PostgreSQL do what it was built to do—manage transactions reliably. Let ClickHouse do what it was built to do—tear through massive analytical queries in milliseconds.