Big Data
How to hot swap Apache Kudu tables with Apache Impala
Sometimes, there is a need to re-process production data (a process known as a historical data reload, or a backfill). The source table schema might change, or a data discrepancy might

NiFi
How to connect Apache NiFi to Apache Impala
I spent 4 interesting hours trying to connect Apache NiFi to Apache Impala. It turned out to be very easy and not really any different from connecting to a JDBC-compliant database, but at the same time frustrating enough to make me post about it, hoping

Alteryx
Quick evaluation of Alteryx In-Database tools
Back in 2015, Alteryx announced a brand new set of In-Database tools, available to all customers with no additional license required. Alteryx keeps bringing amazing value to its customers without

Big Data
Benchmarking Impala on Kudu vs Parquet
Why Apache Kudu
Apache Kudu is a recent addition to Cloudera's CDH distribution, open-sourced and fully supported by Cloudera with an enterprise subscription. Created by Cloudera and HBase veterans

Hadoop
Watch out for timezones with Sqoop, Hive, Impala and Spark
My head was spinning as I tried to accomplish what at first seemed like a simple thing. I load data from 3 Oracle databases, located in different time zones, using