Rethinking Data Architecture in the Age of AI
In the late 1980s, relational databases made it possible to manage data independently of any specific application, enabling a “central source of truth” for the first time. Procedural code, such as PL/SQL, ensured consistency by applying business rules directly to the data.
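As a concrete illustration, here is a hypothetical business rule enforced by a trigger. The table and rule are invented for this sketch, and it is written in PostgreSQL’s PL/pgSQL, which closely mirrors the PL/SQL of that era:

```sql
-- Hypothetical rule: no order may be discounted by more than 20 percent.
CREATE FUNCTION check_discount() RETURNS trigger AS $$
BEGIN
    IF NEW.discount > 0.20 THEN
        RAISE EXCEPTION 'discount above 20%% is not allowed';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- The rule fires on every write, no matter which application performs it.
CREATE TRIGGER orders_check_discount
    BEFORE INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION check_discount();
```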
The volume of data managed by database servers grew rapidly. Initially, most database applications had two tiers: the database and a client that contained both the business logic and the user interface. In the early 90s, the high license prices of relational databases became a limiting factor for client-server computing. One way to keep license costs at bay was to move business logic onto separate, cheaper servers. Business logic left the database and got a tier of its own: the application server.
Database servers initially provided no built-in, fine-grained, per-row or per-column permission management. By the time it arrived (Oracle added it in 1999, Microsoft only with the release of SQL Server 2016), the ship had mostly sailed. Permissions and business logic had already been separated from the database and were now managed in the application tier.
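For comparison, this is how the idea looks in modern PostgreSQL, which gained row-level security in version 9.5; the table and column names here are illustrative:

```sql
-- Illustrative policy: every user sees and modifies only their own rows.
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

CREATE POLICY orders_owner ON orders
    USING (owner = current_user);
```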
Development shifted from “database first” to “code first”: every change to the data model was now defined in code. Because applications were written in different languages, such as Java or C#, knowledge about data structures became buried across multiple codebases, causing friction and silos.
PostgreSQL was initially released in 1997. Unlike MySQL, it has a permissive license, so it imposes no real license constraints. With the introduction of logical replication in version 10 in 2017, its march to dominance in the enterprise space became unstoppable.
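Logical replication is configured declaratively: a publication on the source server and a subscription on the replica. The names and connection string below are placeholders:

```sql
-- On the publishing (source) server:
CREATE PUBLICATION sales_pub FOR TABLE orders, customers;

-- On the subscribing (replica) server:
CREATE SUBSCRIPTION sales_sub
    CONNECTION 'host=source-db dbname=sales user=replicator'
    PUBLICATION sales_pub;
```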
In 2025, your data is either isolated in a multitude of SaaS solutions and accessible only through non-standard APIs, or trapped in internal data silos hidden behind abstraction layers such as object-relational mappers (ORMs).
Large multinational organizations and internet companies must manage staggering volumes of data that require technologies like data warehouses or data lakes. Yet most organizations have data volumes that a single PostgreSQL instance could easily handle.
I believe it’s time to revive and perfect the lost art of database first:
- Keep all data in a central database.
- Enforce consistency with business rules in the database.
- Manage access control centrally (see the sketch after this list).
- Provide one API to access all data.
- Empower LLM agents with secure data access.
- Use permissive licensing that doesn’t restrict use cases.
- Run it in the cloud or on your own infrastructure.
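A minimal sketch of the first few points, with hypothetical tables and roles:

```sql
-- The schema itself, not application code, is the source of truth.
CREATE TABLE invoices (
    id       bigint  GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer text    NOT NULL,
    amount   numeric NOT NULL CHECK (amount > 0)  -- business rule lives with the data
);

-- Access control is defined once, centrally, via roles and grants.
CREATE ROLE analyst NOLOGIN;
GRANT SELECT ON invoices TO analyst;

CREATE ROLE billing NOLOGIN;
GRANT SELECT, INSERT, UPDATE ON invoices TO billing;
```

With schema and permissions in one place, a tool like PostgREST can expose the same tables as a single HTTP API, and the same grants then govern what any client, including an LLM agent, is allowed to read or write.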