GigaOm

George Gilbert Jan 29, 2015 (Oct 13, 2020)

Sector Roadmap: Hadoop/Data Warehouse Interoperability

1. Summary

SQL-on-Hadoop capabilities played a key role in the big data market in 2013. In 2014, their importance and ubiquity only grew, making new use cases for big data possible. Now that virtually every Hadoop distribution vendor and incumbent database vendor offers a SQL-on-Hadoop solution, the key factor in the market is no longer mere SQL query capability; it is the quality and economics of the resulting integration between Hadoop and data warehouse technology.

This Sector Roadmap™ examines that integration, reviewing the SQL-on-Hadoop solutions on offer from the three major Hadoop vendors (Cloudera, Hortonworks, and MapR), incumbent data warehouse vendor Teradata, relational-database juggernaut Oracle, and Hadoop/data warehouse hybrid vendor Pivotal. The analysis identifies the key usage scenarios these solutions make possible, as well as the architectural distinctions between them.

Vendor solutions are evaluated over six Disruption Vectors: schema flexibility, data engine interoperability, pricing model, enterprise manageability, workload role optimization, and query engine maturity. These vectors collectively measure not just how well a SQL-on-Hadoop solution can facilitate Hadoop-data warehouse integration, but how successfully it does so with respect to the emerging usage patterns discussed in this report.

Key findings in our analysis include:

In addition to the widely discussed data lake, the adjunct data warehouse is a key concept, one with greater near-term relevance to pragmatist customers.
The adjunct data warehouse provides for production ETL, reporting, and BI on the data sources first explored in the data lake. It also offloads production ETL from the core data warehouse in order to avoid costly capacity additions on proprietary platforms at a 10- to 30-times cost premium.
MapR fared best in our comparison due to the integration powers of Apache Drill’s technology. It would have fared better still were Drill not in such a relatively early phase of development.
Hortonworks, given its enhancements to Apache Hive, and Cloudera, with its dominant Impala SQL-on-Hadoop engine, follow closely behind MapR.
Despite their conventional data warehouse pedigrees, Teradata, Pivotal, and Oracle are very much in the game as they make their comprehensive SQL languages available as a query interface over data in Hadoop.

[Figure key: the number indicates a company's relative strength across all vectors; the size of the ball indicates the company's relative strength along an individual vector. Source: Gigaom Research]

Image courtesy of 3dmentat/iStock.
