SparkSQL 1.6 Faster than Impala

Datanami recently reported on a new SQL-on-Hadoop benchmark from AtScale in “SQL-on-Hadoop Test: Each Engine Has ‘Sweet Spots’”. The benchmark uses Tableau to generate the SQL and is described as a BI for Hadoop benchmark. For 5 of 13 queries, Spark SQL 1.6 was faster than the latest version of Impala, V2.3.…

Latest Cloudera Impala Benchmark

Cloudera’s Feb 11 blog posting about new Impala benchmarks was quite interesting (http://blog.cloudera.com/blog/2016/02/new-sql-benchmarks-apache-impala-incubating-2-3-uniquely-delivers-analytic-database-performance/). It gave the impression that Cloudera published Apache Impala (incubating) performance results for 99 queries “derived” from TPC-DS. But if you read carefully, you will find that Cloudera only published performance results for 47 of the 99 queries. For a single user.…

SQL Hadoop Comparison

Not so long ago, Jeff Kelly wrote on Wikibon about customers shifting workloads from “traditional” DW to Hadoop, the role that SQL on Hadoop has, and that “some SQL on Hadoop offering have more SQL functionality than others”. I think architecture also plays a role. I see three ways to look at vendor architectures – just…

SQL on Hadoop – Meet Big SQL 3

IBM is extending the value of Big SQL with Big SQL 3, coming soon in the next release of BigInsights, as discussed this week at the IBM Impact conference in Las Vegas. I had previously mentioned work being done on Big SQL, in the context of Cloudera Impala. By extending the value,…

Hadoop distros are starting to diverge

A recent blog posting by Merv Adrian of Gartner, “Hadoop is in the mind of the beholder”, hit the nail on the head. While most commercial Hadoop distributions now include the same basic components (e.g. Pig, Hive, HBase), each vendor is starting to add things that make their distribution different. Merv Adrian stated “The…

The Intel on Cloudera Investment by Intel

Update on April 8: An interview with Cloudera CEO Tom Reilly, reported on Gigaom, clarified that Cloudera is expecting to get approximately 60 percent of the overall $900m financing. So some money from Intel will go to Cloudera to grow the business, but not all of it as originally implied. The financing is expected to…

IBM BigInsights for Hadoop moves to Hadoop 2.2 GA

This past Friday, March 28, IBM released BigInsights 2.1.2, the latest version of IBM’s Hadoop offering. This release bumps up the level of the Apache Hadoop open source components, and adds a few other goodies. Key enhancements include: BigR – end-to-end integration of R into BigInsights. It enables the use of R as…

If I had $100 Million…

Big Money for Cloudera and Hortonworks – What Does it Mean? Holy cow, there sure seems to be a lot of venture capital money being thrown at what are, today, unprofitable Hadoop distribution companies. In case you missed it, Cloudera recently announced $160m in new funding, and Hortonworks announced $100m in new funding.…