<p>It seems like you are looking for dynamic partitioning. Hive supports dynamic partition inserts, as detailed <a href="https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-DynamicPartitionInsert">in this article</a>.</p>
<p>First, create a temporary table to hold your flat data, with no partitions at all. In your case this would be:</p>
<pre><code>CREATE TABLE flatTable (type string, id int, ts bigint, user string, key string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
</code></pre>
<p>Then load your flat data file into this table:</p>
<pre><code>LOAD DATA LOCAL INPATH '/home/spaeth/tmp/hadoop-billing-data/extracted/testData.csv' INTO TABLE flatTable;
</code></pre>
<p>At that point you can use the dynamic partition insert. You'll need to set the following properties:</p>
<ul>
<li><code>hive.exec.dynamic.partition</code> should be set to <code>true</code>, because dynamic partitioning may be disabled by default, depending on your Hive version.</li>
<li><code>hive.exec.dynamic.partition.mode</code> should be set to <code>nonstrict</code>, because your only partition column is dynamic and strict mode requires at least one static partition.</li>
</ul>
<p>So you can run the following query. Note that the dynamic partition column must come last in the <code>SELECT</code> list:</p>
<pre><code>SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

FROM flatTable
INSERT OVERWRITE TABLE partitionedTable PARTITION(time)
SELECT user, from_unixtime(ts, 'yyyy-MM-dd') AS time;
</code></pre>
<p>This should spawn 2 MapReduce jobs, and at the end you should see something along the lines of:</p>
<pre><code>Loading data to table default.partitionedtable partition (time=null)
Loading partition {time=2013-02-10}
Loading partition {time=2013-02-11}
Loading partition {time=2013-02-13}
Loading partition {time=2013-06-09}
</code></pre>
<p>To verify that your partitions are indeed there:</p>
<pre><code>$ hadoop fs -ls /user/hive/warehouse/partitionedTable/
Found 4 items
drwxr-xr-x - username supergroup 0 2013-11-25 18:35 /user/hive/warehouse/partitionedTable/time=2013-02-10
drwxr-xr-x - username supergroup 0 2013-11-25 18:35 /user/hive/warehouse/partitionedTable/time=2013-02-11
drwxr-xr-x - username supergroup 0 2013-11-25 18:35 /user/hive/warehouse/partitionedTable/time=2013-02-13
drwxr-xr-x - username supergroup 0 2013-11-25 18:35 /user/hive/warehouse/partitionedTable/time=2013-06-09
</code></pre>
<p>Please note that dynamic partitions have only been supported since Hive 0.6, so this will probably not work on an older version.</p>
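<p>As a small complement to the HDFS listing, you can probably also check the result from inside Hive itself. This is only a sketch and assumes the table is still named <code>partitionedTable</code> as in the query; it should list one entry per partition created by the insert:</p>
<pre><code>-- List the partitions Hive has registered for the table
-- (assumes the table name used in the insert query).
SHOW PARTITIONS partitionedTable;
</code></pre>
<p>If a partition directory exists in HDFS but is missing from this list, the insert most likely did not complete as expected.</p>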