April 10, 2026
The Battle Between ETL and ELT in Data Management
ETL and ELT share the same three letters in a different order. That difference drove a fundamental shift in how modern data teams build their stacks. This post explains what changed, why ELT won, and where ETL still holds ground.
Every discussion about modern data architecture eventually lands on the same question: ETL or ELT? The short answer is that the debate is mostly settled. The longer answer explains why, and why it matters.
What the letters actually mean
ETL: Extract, Transform, Load.
ELT: Extract, Load, Transform.
The same three letters. A different order.
In ETL, data is extracted from its sources, transformed on an external machine, and only then loaded into a data warehouse. The transformation happens upstream, before the data reaches its destination. Only clean, structured data lands in the warehouse.
In ELT, data is extracted from sources and loaded raw into the data warehouse first. The transformation happens inside the warehouse itself, using its own compute power.
The order of those last two steps is what changes everything.
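The difference is easiest to see side by side. A minimal sketch in Python, with an in-memory SQLite database standing in for the warehouse (table and column names are illustrative, not from any particular stack):

```python
import sqlite3

# Source rows as extracted: raw and messy (values are illustrative).
raw_rows = [("  Alice ", "2026-04-01"), ("BOB", "2026-04-02")]

def transform(rows):
    # Clean up names: trim whitespace, normalise casing.
    return [(name.strip().title(), day) for name, day in rows]

# --- ETL: transform outside the warehouse, load only the clean result.
etl_wh = sqlite3.connect(":memory:")
etl_wh.execute("CREATE TABLE users (name TEXT, signup_day TEXT)")
etl_wh.executemany("INSERT INTO users VALUES (?, ?)", transform(raw_rows))
# The raw rows never reach this warehouse.

# --- ELT: load the raw rows first, transform inside the warehouse with SQL.
elt_wh = sqlite3.connect(":memory:")
elt_wh.execute("CREATE TABLE raw_users (name TEXT, signup_day TEXT)")
elt_wh.executemany("INSERT INTO raw_users VALUES (?, ?)", raw_rows)
elt_wh.execute("""
    CREATE TABLE users AS
    SELECT upper(substr(trim(name), 1, 1)) || lower(substr(trim(name), 2)) AS name,
           signup_day
    FROM raw_users
""")

print(etl_wh.execute("SELECT name FROM users").fetchall())  # clean rows only
print(elt_wh.execute("SELECT name FROM users").fetchall())  # clean rows, raw kept
```

Both warehouses end up with the same clean table. The ELT warehouse also still holds `raw_users`, which is the whole point of the next section.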
Why ETL made sense
ETL was built for a specific era. Data warehouses were expensive. Storage cost real money. Computing power lived in on-premises hardware that teams owned and managed.
In that world, loading raw, messy data into your warehouse was wasteful. You preprocessed everything upstream so only the useful version took up space.
The weakness: the raw data was never preserved in the warehouse. When transformation logic had a bug, or business requirements changed, re-running meant going back to every source system and starting the extract again.
Why ELT replaced it
ELT did not win because it was a better idea in the abstract. It won because infrastructure changed.
Cloud-native data platforms such as BigQuery, Snowflake, and Redshift made storage cheap and compute elastic. The warehouse became an active processing engine, not a passive container. The economic case for running a separate transformation layer upstream quietly collapsed.
When raw data is virtually free to store and the platform can transform at scale on demand, preserving raw data becomes an advantage, not a liability.
The five arguments for ELT
Cloud infrastructure scales with the data. Cloud data warehouses handle large-scale transformations on demand, with no dedicated transformation machines and no on-premises hardware to maintain.
Data is available immediately. Raw data lands in the warehouse as soon as it is extracted. Analysts can start exploring before transformation is even complete. In ETL, they wait.
Lower cost. ETL requires separate infrastructure: transformation servers, pipeline tools, licences. ELT eliminates a dedicated transformation layer, using the warehouse's native compute instead. One infrastructure footprint instead of two.
Transformation becomes iterative. With raw data preserved in the warehouse, transformation logic lives separately from the data itself. Fix a bug or apply a new business rule, re-run the SQL. No re-extraction. No waiting for the pipeline to refetch from every source. This is the one that changes how teams actually work day to day.
Teams gain direct access. When transformation is SQL running inside a warehouse, more people can read, review, and build it. Analysts stop waiting for engineers to unlock data they already need. Collaboration moves faster.
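The iteration argument is the easiest to see in code. A rough sketch, again with SQLite standing in for the warehouse: because the raw table stays put, fixing a transformation bug means dropping and rebuilding the derived table with corrected SQL, with no source system touched.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER)")
wh.executemany("INSERT INTO raw_orders VALUES (?, ?)",
               [(1, 1999), (2, 500), (3, 0)])

def build_orders(sql):
    # Rebuild the derived table from raw data already in the warehouse.
    wh.execute("DROP TABLE IF EXISTS orders")
    wh.execute("CREATE TABLE orders AS " + sql)

# First attempt contains a bug: integer division truncates the cents.
build_orders("SELECT id, amount_cents / 100 AS amount FROM raw_orders")

# The bug is found (or requirements change): fix the SQL and re-run.
# No re-extraction -- the raw rows never left the warehouse.
build_orders("SELECT id, amount_cents / 100.0 AS amount FROM raw_orders")

print(wh.execute("SELECT amount FROM orders ORDER BY id").fetchall())
```

Under ETL, the first (buggy) pipeline would have loaded only the truncated amounts, and the fix would have required re-extracting from the order system.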
Head-to-head
| Dimension | ETL | ELT |
|---|---|---|
| Transformation location | External, upstream | Inside the warehouse |
| Raw data in warehouse | No (only transformed) | Yes |
| Speed to first query | Slow (wait for transform) | Fast (data lands immediately) |
| Infrastructure overhead | High | Low |
| Flexibility to iterate | Low | High |
| Team access | Gated by upstream pipeline | SQL, broader access |
| Best fit | Regulated data, legacy systems | Modern data stacks |
The verdict
ETL is not obsolete. For sensitive personal information or other regulated data with strict privacy requirements, transformation must happen upstream, before the data reaches any storage layer. That control is sometimes legally required, not optional.
ETL also persists in organisations with years of tooling and institutional knowledge built around it. Replacing it is a real cost, not just a technical upgrade decision.
But for teams building or rethinking their data stack today, ELT is the starting assumption. The modern data stack is designed around it.
Where dbt enters
The T in ELT used to be ungoverned: transformations written ad hoc, with no version control, no tests, and no documentation. SQL that worked until it didn't, and nobody knew why.
dbt solves that problem. It brings version control, automated testing, and documentation to SQL transformations running inside the data warehouse. It makes the T in ELT something a team can maintain, extend, and build confidence in over time.
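dbt itself is the subject of the next post, but the shape of its testing model is simple enough to sketch here. A dbt test is a query that returns the rows violating a rule; zero rows means the test passes. A rough Python analogy against SQLite, mimicking dbt's built-in not_null and unique tests (table and column names are illustrative, and this is not dbt's actual API):

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE users (id INTEGER, email TEXT)")
wh.executemany("INSERT INTO users VALUES (?, ?)",
               [(1, "a@example.com"), (2, "b@example.com")])

def run_test(name, failing_rows_sql):
    # dbt's convention: a test query selects the rows that violate the rule,
    # and the test passes when it returns nothing.
    failures = wh.execute(failing_rows_sql).fetchall()
    status = "PASS" if not failures else f"FAIL ({len(failures)} rows)"
    print(f"{name}: {status}")
    return not failures

# Analogues of dbt's built-in not_null and unique tests.
ok_not_null = run_test("not_null_users_email",
                       "SELECT * FROM users WHERE email IS NULL")
ok_unique = run_test("unique_users_id",
                     "SELECT id FROM users GROUP BY id HAVING COUNT(*) > 1")
```

In dbt these checks are declared in YAML next to the model rather than written by hand, and they run against the transformed tables inside the warehouse, which is exactly where ELT put them.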
That is the subject of the next post.