By Shammy Narayanan
“Serverless” and “No-Code/Low-Code” are two marketing spiels that seduce many CxOs into a fantasy of supersonic delivery at a fraction of the planned budget. “Zero ETL” is the latest incarnation to join this honeytrap. While executives shrouded in spreadsheets can be blindsided by such flowery rhetoric, the teams toiling under the hood know well that such fairy tales belong in Marvel comics.
ETL is the palaeolithic process of Extracting data from a source, Transforming it and Loading it into a destination. While the extract and load portions are straightforward, unbridled complexity and unnerving sophistication set in during the Transform phase. No doubt ETL has outlived its expiry date and is overdue for an overhaul, but past attempts in this direction have not met with spectacular success; the one idea that came closest to the goalpost is “Virtualization”.
The beauty of virtualization is that it doesn’t involve real data movement from the source; instead, it builds a unified view by collating and curating data from disparate sources and applying the transformations at query time. Unfortunately, while this approach looks on paper like a panacea for the gnawing transformation pain, it’s more of an anaesthetic than a cure. Although the technique works exceptionally well for small volumes of data, it stings and stinks as the volume grows. Leveraging virtualization for larger volumes is akin to embarking on a Trans-Atlantic tour with a leaking parachute.
The immediate fix advocated in such a scenario is to performance-engineer your application or to upgrade the SaaS version; let’s face it, it’s an excruciating pain to re-engineer your time-tested application to cater to the annoying ad-hoc limitations of a SaaS vendor. The collective cost of the re-engineering effort, the accompanying infra bills, and an infuriated end user far outweighs the benefits of this technique. I am not against virtualization, but it is not a “one size fits all” solution. Before stepping in, know the striking difference between the scenic “Cape of Miami” and the fleeting Bermuda Triangle.
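To make the virtualization idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two source tables, their columns, and the view name are hypothetical stand-ins for separate systems, and a real virtualization product would federate across databases rather than within one. The point is only that the transformation lives in a view and runs at read time, with no data copied.

```python
import sqlite3

# Hypothetical source tables standing in for two separate systems.
SETUP_SQL = """
CREATE TABLE crm_customers (id INTEGER, full_name TEXT, country TEXT);
CREATE TABLE billing_accounts (acct_id INTEGER, name TEXT, country_code TEXT);
INSERT INTO crm_customers VALUES (1, 'Asha Rao', 'India');
INSERT INTO billing_accounts VALUES (7, 'ben cole', 'US');

-- The "virtual" layer: a view that transforms at query time, copying nothing.
CREATE VIEW unified_customers AS
    SELECT 'crm' AS source, id AS customer_id,
           UPPER(full_name) AS name, country
    FROM crm_customers
    UNION ALL
    SELECT 'billing', acct_id, UPPER(name),
           CASE country_code WHEN 'US' THEN 'United States'
                             ELSE country_code END
    FROM billing_accounts;
"""


def open_virtualized_db():
    """Create an in-memory database with the sources and the unified view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP_SQL)
    return conn


def read_unified(conn):
    """Read through the view; the transformation runs on every call."""
    cur = conn.execute(
        "SELECT source, customer_id, name, country "
        "FROM unified_customers ORDER BY customer_id"
    )
    return cur.fetchall()
```

A row inserted into either source table shows up in the next read of the view without any load step. The flip side is that every read repeats the transformation work, which is where the pain at larger volumes comes from.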
The most crucial argument against the zero ETL concept is the absence of a common language among data elements. Enterprises get data from various tributaries, including internal systems, partners, vendors, historical archives, government and free markets. With varying taxonomies, formats and data types, unavailable or missing data, inconsistencies and whatnot, problems come in all conceivable sizes and shapes. So it’s natural that significant effort goes towards transforming the data. Transformation is the sole saviour that delivers meaningful insights and actionable intelligence from this chaotic trove of binary dumps. So how do we even replace it?
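A small sketch shows how quickly this adds up even for two fields. The field aliases, date formats, and record shapes below are assumptions made up for illustration; real feeds will have their own, and usually many more.

```python
from datetime import datetime

# Hypothetical mapping from each source's field names to one canonical name.
FIELD_ALIASES = {
    "cust_name": "name",
    "customerName": "name",
    "dob": "birth_date",
    "DateOfBirth": "birth_date",
}

# Assumed date formats seen across sources (ISO, then day/month/year).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(value):
    """Return an ISO date string, or None if missing or unparseable."""
    if value is None or not str(value).strip():
        return None
    text = str(value).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize(record):
    """Map one source record onto the canonical {name, birth_date} shape."""
    out = {"name": None, "birth_date": None}
    for key, value in record.items():
        canonical = FIELD_ALIASES.get(key, key)
        if canonical == "birth_date":
            out["birth_date"] = parse_date(value)
        elif canonical == "name" and value:
            # Collapse stray whitespace and unify capitalization.
            out["name"] = " ".join(str(value).split()).title()
    return out
```

For example, normalize({"cust_name": "  asha   rao ", "dob": "1990-04-12"}) and normalize({"customerName": "ASHA RAO", "DateOfBirth": "12/04/1990"}) both yield the same canonical record. Every new source adds aliases, formats and edge cases to this layer, which is exactly the work that “Zero ETL” cannot wish away.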
The challenge of simplifying Data Transformation shouldn’t be viewed in a silo but alongside the daunting threat of legacy monoliths. A fair proportion of transformation logic is seared deeply into the legacy code and is highly intertwined with core business rules. As a result, it is difficult to isolate the transformation segment into one layer and translate it into “Plug and Play” modules. This state of affairs is understandable, as most of our systems were built in an era when data literacy was abysmal. Any shortcut that replumbs the architecture without modernizing the application will lead to a situation seven times worse than the initial state. Instead, I’d like you to start viewing Data Transformation as a comprehensive program embodying the digital tenets of Legacy Modernization and DevOps. Plans that defy this approach will be like attempting a world record in solving a Rubik’s cube with boxing gloves on. Good luck with it!
So with an untamed monolithic monster, non-standard data sources, ambushing SaaS vendors, and appalling data literacy, ETL is here to stay and reign. Zero ETL remains a distant, elusive dream. What can be accomplished, and is within striking range, is Zero Toil ETL, wherein the pain of transformation is minimized and the journey made pleasant. Till then, let’s debunk such fallacies and dwell in the truth.