Note that there are some explanatory texts on larger screens.

plurals
  1. POHive performance on Amazon DynamoDB
    primarykey
    data
    text
    <p>I'm using Amazon DynamoDB to collect statistics and ElasticMapReduce with Hive to process statistics and upload results to S3.</p> <p>On DynamoDB I have table prod_product_views: - id (Hash key) - product_id (Range key) - company_id - creted - price - viewed_by_company_id - viewed_by_user_id</p> <p>For now in this table there are about 7000 records.</p> <p>Problem is that hiveql running to slow. </p> <p>For example I have first step to create external table stored on DynamoDB:</p> <pre><code>CREATE EXTERNAL TABLE prod_product_views (id string, product_id bigint, company_id bigint, created bigint, price string, viewed_by_company_id bigint, viewed_by_user_id bigint) STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler' TBLPROPERTIES ("dynamodb.table.name" = "prod_product_views", "dynamodb.column.mapping" = "id:id,product_id:product_id,company_id:company_id,created:created,price:price,viewed_by_company_id:viewed_by_company_id,viewed_by_user_id:viewed_by_user_id"); </code></pre> <p>This step is ok (Time taken: 12.908 seconds)</p> <p>Secund step is to get views for last day:</p> <pre><code>SELECT * from prod_product_views WHERE created &gt; UNIX_TIMESTAMP(CONCAT(DATE_SUB(FROM_UNIXTIME(UNIX_TIMESTAMP()), 1)," ","00:00:00")) and created &lt; UNIX_TIMESTAMP(CONCAT(DATE_SUB(FROM_UNIXTIME(UNIX_TIMESTAMP()), 1)," ","23:59:59")); </code></pre> <p>This step takes a long time (about 60min) maybe more.</p> <p>This is part of output:</p> <pre><code>2013-05-23 08:23:06,097 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:07,103 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:08,109 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:09,115 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:10,121 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:11,147 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:12,153 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:13,160 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:14,169 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:15,177 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:16,183 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:17,193 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:18,219 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:19,225 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:20,234 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:21,240 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:22,247 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:23,253 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:24,259 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:25,265 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:26,273 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:27,279 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:28,290 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:29,312 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:30,318 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:31,324 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:32,333 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:33,358 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:34,364 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:35,394 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:36,400 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:37,408 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:38,418 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:39,478 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:40,538 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:41,544 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:42,550 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:43,557 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:44,563 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:45,569 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:46,579 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.39 sec 2013-05-23 08:23:47,607 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:48,613 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:49,623 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:50,633 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:51,638 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:52,650 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:53,657 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:54,665 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:55,691 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec 2013-05-23 08:23:56,697 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 3.76 sec </code></pre> <p>I'm new with this kind of services, am I doing something wrong, or is there some tricks in configuration or something to speed up this? Because this seems like a simple query and 7000 records is not a large amount of data.</p> <p>Thanks in advance!</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload