How to hot swap Apache Kudu tables with Apache Impala

Sometimes there is a need to re-process production data (a process known as a historical data reload, or a backfill). The source table schema might change, a data discrepancy might be discovered, or a source system might switch to a different time zone for its date/time fields.

One of the old techniques for reloading production data with minimum downtime is renaming. You process and load the new data into a temporary (staging) table, something that might take hours or days, validate the new data, and after that simply rename the production table to something else and rename the staging table to production.

The last operation normally takes a fraction of a second. Another advantage is that you still have the old version of the production table around in case you need it. If an unfortunate event happens or a bug is discovered in the new processing logic, you can easily restore the peace by renaming it back.

A variation of this technique employs database views: you alter a view definition to point it at another table. Personally, I never liked views, though (my biased opinion aside) this has the same effect as the first method with fewer moving pieces. It is important to realize that views do not always support all the features of regular tables (e.g. special types of indexes, or efficient execution plans).
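For illustration, a minimal sketch of the view-based swap in Impala SQL (the view and table names are hypothetical):

-- consumers always query the view, never the tables directly:
CREATE VIEW mydb.person_v AS SELECT * FROM mydb.person_live;

-- once new data is loaded and validated in mydb.person_stage,
-- repoint the view in a single quick metadata operation:
ALTER VIEW mydb.person_v AS SELECT * FROM mydb.person_stage;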

Whatever method you pick, I think it is important to keep the following considerations in mind:

  • How to validate data in the new table before you swap it?
  • Do you want to retain security permissions, statistics (and indexes) or recreate them every time?
  • Will there be any activity on your system at that time? Are you going to be fine with your process killing running queries, or will they continue running with unpredictable results? Will you need a downtime window on your system?
  • Do you have a backup plan if things go wrong? Do you drop the old table data or keep it for a while? If you keep it, for how long, and do you need to implement a clean-up process?

Apache Hive and Apache Impala

First, let's see how we can swap Apache Hive or Apache Impala (on HDFS) tables.

Move HDFS files

This is one of my favorite options. You can use the LOAD DATA INPATH command to move the staging table's HDFS files to the production table's HDFS location. This operation is very quick (seconds) since it is just an HDFS metadata change: HDFS blocks are not physically moved around the cluster.
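A minimal sketch of that, with hypothetical database, table, and path names:

-- moves the files under the staging table's directory into the production
-- table's directory; OVERWRITE replaces the files already there:
LOAD DATA INPATH '/user/hive/warehouse/mydb.db/person_stage'
OVERWRITE INTO TABLE mydb.person_live;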

You can also use HDFS file and directory manipulation commands directly, and even lock Hive tables to make sure no one can access them during the process (locks are not supported by Impala). Refer to a good post here. You do need to consider the implications of using locks: I got excited about this at first, but quickly realized it would not work in my case.
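For reference, a sketch of explicit Hive locking around the file moves (it assumes Hive's concurrency/locking support is enabled on your cluster, and again, Impala will not honor these locks):

-- EXCLUSIVE blocks all other Hive access to the table:
LOCK TABLE mydb.person_live EXCLUSIVE;
-- ... move the HDFS files here ...
UNLOCK TABLE mydb.person_live;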

Exchange HDFS partitions

Hive has a very handy ALTER TABLE EXCHANGE PARTITION command (refer to the documentation). It comes down to:

The EXCHANGE PARTITION command will move a partition from a source table to the target table and alter each table's metadata. When the command is executed, the source table's partition folder in HDFS will be renamed to move it to the destination table's partition folder. The Hive metastore will be updated to change the metadata of the source and destination tables accordingly.

It is a bit tricky to use but does work quite nicely.
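A minimal example of the syntax (the partition column and table names are hypothetical; note that the partition moves from the table named after WITH TABLE into the table named after ALTER TABLE):

-- moves the 2018-01-01 partition out of person_stage and into person_live:
ALTER TABLE person_live EXCHANGE PARTITION (load_date='2018-01-01')
WITH TABLE person_stage;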

Rename table

So what's wrong with a plain rename command? Both Hive and Impala can easily rename a table like so:

ALTER TABLE table_stage RENAME TO table_live;

The problem is that normally you will want to retain the security permissions and statistics (or indexes) on your table. Unless you re-assign permissions for your staging table and rebuild stats, you will lose them. If none of that is relevant in your case, renaming will work.
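A typical swap-by-rename sequence might look like this (keeping the old data around under a hypothetical table_old name):

-- both renames are quick metadata-only operations:
ALTER TABLE table_live RENAME TO table_old;
ALTER TABLE table_stage RENAME TO table_live;
-- once the new data is confirmed good:
-- DROP TABLE table_old;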

Apache Kudu

Unfortunately, Apache Kudu does not yet support the LOAD DATA INPATH command, and you cannot exchange partitions between Kudu tables using ALTER TABLE EXCHANGE PARTITION. I posted a question on Kudu's user mailing list and the creators themselves suggested a few ideas.

One suggestion was using views (which might work well with Impala and Kudu), but I really liked the idea (thanks, Todd Lipcon!) of using ALTER TABLE SET TBLPROPERTIES to rename the underlying Kudu table used by Impala. Follow the steps below!

Setup

Let's assume there is a production Impala table kudutest.person_live. This table is managed by Impala (internal). You have already set permissions for your users and built stats, and we will want to retain both. Now the time comes when you need to reload the entire table with new data.
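For example, the state we want to survive the swap might have been set up along these lines (the analysts role is hypothetical, and GRANT assumes Sentry authorization is enabled):

-- permissions and statistics we do not want to rebuild on every reload:
GRANT SELECT ON TABLE kudutest.person_live TO ROLE analysts;
COMPUTE STATS kudutest.person_live;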

Step 1: Populate staging table

Instead of messing with the production table, we create a staging table. Here is a quick example, creating the staging table from the production table (for simplicity) using the CTAS (Create-Table-As-Select) technique:

DROP TABLE IF EXISTS kudutest.person_stage;

CREATE TABLE kudutest.person_stage
PRIMARY KEY (person_id)
PARTITION BY HASH(person_id) PARTITIONS 3
STORED AS KUDU
AS
SELECT * FROM kudutest.person_live;

Before you proceed with the next steps, it is crucial to make sure your staging table is good: you are about to swap the production table with it.
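What "good" means is case-specific, but as a bare minimum something like the following row-count comparison (a sketch, not a complete validation):

-- the counts should match when the reload is a straight copy:
SELECT COUNT(*) FROM kudutest.person_stage;
SELECT COUNT(*) FROM kudutest.person_live;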

Step 2: Point production table to a staging Kudu table

First, we have to make the production table external; only after that can we change the underlying Kudu table using ALTER TABLE SET TBLPROPERTIES.

The reasons for that are outlined in Impala documentation:

When you create a Kudu table through Impala, it is assigned an internal Kudu table name of the form impala::db_name.table_name. You can see the Kudu-assigned name in the output of DESCRIBE FORMATTED, in the kudu.table_name field of the table properties. The Kudu-assigned name remains the same even if you use ALTER TABLE to rename the Impala table or move it to a different Impala database.

If you issue the statement ALTER TABLE impala_name SET TBLPROPERTIES('kudu.table_name' = 'different_kudu_table_name'), the effect is different depending on whether the Impala table was created with a regular CREATE TABLE statement (that is, if it is an internal or managed table), or if it was created with a CREATE EXTERNAL TABLE statement (and therefore is an external table).

Changing the kudu.table_name property of an internal table physically renames the underlying Kudu table to match the new name.

Changing the kudu.table_name property of an external table switches which underlying Kudu table the Impala table refers to; the underlying Kudu table must already exist.

Refer to the Kudu documentation here and here to better understand how Kudu works with internal and external tables in Impala.

The code below will point the production Impala table at the Kudu staging table:

-- make the production table external so it can be repointed:
ALTER TABLE kudutest.person_live SET TBLPROPERTIES('EXTERNAL' = 'TRUE');
-- point it at the staging Kudu table:
ALTER TABLE kudutest.person_live
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_stage'
);

You can easily check whether an Impala table is internal or external, and see which Kudu table it uses, by running the DESCRIBE FORMATTED table_name command in Impala.
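For example:

DESCRIBE FORMATTED kudutest.person_live;
-- look at Table Type (MANAGED_TABLE vs EXTERNAL_TABLE) and the
-- kudu.table_name entry under Table Parameters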

Step 3: Rename staging Kudu table

At this point, this is how things look on the Impala and Kudu side:

Impala person_live uses Kudu person_stage.
Impala person_stage uses Kudu person_stage.
Kudu person_live is not used by Impala (but still stored in Kudu).

You have a choice now:

  1. Drop the Kudu person_live table along with the Impala person_stage table (by first repointing Impala person_stage at Kudu person_live), then rename the Kudu person_stage table to person_live and repoint the Impala person_live table at it.
  2. Or, if you wish to retain a copy of the old production table, repoint the staging table at the Kudu person_live table and then rename that Kudu table so the names line up again (see the sketch after the first option's code below).

Remember, we cannot repoint an external table at a different underlying Kudu table unless that other Kudu table already exists.

The end result is that tables in Impala and Kudu are now named the same way:

Impala person_live --> Kudu person_live.
Impala person_stage --> Kudu person_stage.

The next time we need to re-process the entire table, we won't be confused about why the Impala production table uses the Kudu staging table. It will also be easier to script and automate.

Here is the code for the first option:

-- make it external so we can point Impala to another Kudu table:
ALTER TABLE kudutest.person_stage SET TBLPROPERTIES('EXTERNAL' = 'TRUE');
ALTER TABLE kudutest.person_stage
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_live'
);
-- make it internal again so DROP TABLE will drop both the Impala and Kudu tables:
ALTER TABLE kudutest.person_stage SET TBLPROPERTIES('EXTERNAL' = 'FALSE');
DROP TABLE kudutest.person_stage;

-- make the live Impala table internal so we can rename the underlying Kudu table to the proper name:
ALTER TABLE kudutest.person_live SET TBLPROPERTIES('EXTERNAL' = 'FALSE');
ALTER TABLE kudutest.person_live
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_live'
);
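And for the second option, a sketch along the same lines. Note that, as far as I can tell, you need a third, temporary name here (person_old below, a name of my choosing), because at the moment of the swap both the person_live and person_stage names are taken on the Kudu side:

-- repoint the staging Impala table at the old Kudu production table:
ALTER TABLE kudutest.person_stage SET TBLPROPERTIES('EXTERNAL' = 'TRUE');
ALTER TABLE kudutest.person_stage
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_live'
);
-- make it internal and rename the old Kudu table out of the way:
ALTER TABLE kudutest.person_stage SET TBLPROPERTIES('EXTERNAL' = 'FALSE');
ALTER TABLE kudutest.person_stage
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_old'
);
ALTER TABLE kudutest.person_stage RENAME TO kudutest.person_old;

-- now rename the new Kudu table to person_live, exactly as in the first option:
ALTER TABLE kudutest.person_live SET TBLPROPERTIES('EXTERNAL' = 'FALSE');
ALTER TABLE kudutest.person_live
SET TBLPROPERTIES (
  'kudu.table_name' = 'impala::kudutest.person_live'
);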

Confused?

I hope it was not too confusing (it was to me!) and I am sure things will get easier as Kudu evolves.

All the steps above took 6-8 seconds in total.

Note that I did not have to refresh or invalidate Impala metadata. Once the tables were swapped, I ran a SELECT COUNT(1) on the new table to make sure Impala was using the proper underlying Kudu table.

Based on a user group discussion, Mike Percy opened the following improvement JIRAs:

  • KUDU-2326: Support atomic bulk load operation
  • KUDU-2327: Support atomic swap of tables or partitions

One thing I am still puzzled about is how Impala was able to finish a long-running SELECT statement that I had kicked off right before the swap. I did not get any error messages, and I could clearly see that the Kudu tables were getting renamed and dropped while the query was still running in a different session; it completed 10 seconds after the swap. This is still a mystery to me. The only explanation I have is that the data was already in the Impala daemons' memory, so the query did not need the Kudu tables at that point.