  1. Out of memory due to hash maps used in map-side aggregation
<p>My Hive query is throwing this exception.</p>
<pre><code>Hadoop job information for Stage-1: number of mappers: 6; number of reducers: 1
2013-05-22 12:08:32,634 Stage-1 map = 0%, reduce = 0%
2013-05-22 12:09:19,984 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201305221200_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201305221200_0001_m_000007 (and more) from job job_201305221200_0001
Examining task ID: task_201305221200_0001_m_000003 (and more) from job job_201305221200_0001
Examining task ID: task_201305221200_0001_m_000001 (and more) from job job_201305221200_0001

Task with the most failures(4):
-----
Task ID:
  task_201305221200_0001_m_000001

URL:
  http://ip-10-134-7-119.ap-southeast-1.compute.internal:9100/taskdetails.jsp?jobid=job_201305221200_0001&amp;tipid=task_201305221200_0001_m_000001

Possible error:
  Out of memory due to hash maps used in map-side aggregation.

Solution:
  Currently hive.map.aggr.hash.percentmemory is set to 0.5. Try setting it to a lower value. i.e 'set hive.map.aggr.hash.percentmemory = 0.25;'
-----
Counters:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
</code></pre>
<p>The query is:</p>
<pre><code>select uri, count(*) as hits
from iislog
where substr(cs_cookie,instr(cs_Cookie,'cwc'),30) like '%CWC%'
  and uri like '%.aspx%'
  and logdate = '2013-02-07'
group by uri
order by hits Desc;
</code></pre>
<p>I ran this on 8 EMR core instances with 1 large master instance, on 8 GB of data. First I tried an external table (the data location is an S3 path). After that I downloaded the data from S3 to EMR and used native Hive tables. Both attempts gave the same error.</p>
<p>FYI, I am using the regex SerDe to parse the IIS logs.</p>
<pre><code>'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" ="([0-9-]+) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\".*\"|[^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\".*\"|[^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\".*\"|[^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([0-9-]+ [0-9:.]+) ([^ ]*) ([^ ]*) (\".*\"|[^ ]*) ([0-9-]+ [0-9:.]+)",
  "output.format.string"="%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s %10$s %11$s %12$s %13$s %14$s %15$s %16$s %17$s %18$s %19$s %20$s %21$s %22$s %23$s %24$s %25$s %26$s %27$s %28$s %29$s %30$s %31$s %32$s"
)
location 's3://*******';
</code></pre>
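<p>For reference, the suggestion printed in the error output could be applied as a session-level setting before rerunning the same query. This is only a sketch of that suggestion, not a confirmed fix for this job. The commented-out <code>hive.map.aggr</code> line is an assumed alternative (turning map-side aggregation off entirely) that the log itself does not mention.</p>
<pre><code>-- From the error output: give the map-side aggregation hash table a smaller
-- share of mapper memory (the log reports the current value as 0.5).
set hive.map.aggr.hash.percentmemory = 0.25;

-- Assumption, not from the log: disable map-side aggregation altogether
-- so the GROUP BY is aggregated on the reduce side only.
-- set hive.map.aggr = false;

select uri, count(*) as hits
from iislog
where substr(cs_cookie,instr(cs_Cookie,'cwc'),30) like '%CWC%'
  and uri like '%.aspx%'
  and logdate = '2013-02-07'
group by uri
order by hits Desc;
</code></pre>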
 
