
Is Hadoop Suitable For This?
<p>We have some Postgres queries that take 6 to 12 hours to complete, and we are wondering whether Hadoop is suited to running them faster. We have two 64-core servers with 256 GB of RAM that Hadoop could use.</p>
<p>We're running PostgreSQL 9.2.4. Postgres uses only one core on one server for the query, so I'm wondering whether Hadoop could do it roughly 128 times faster, minus overhead.</p>
<p>We have two sets of data, each with millions of rows.</p>
<p>Set One:</p>
<pre>
id character varying(20),
a_lat double precision,
a_long double precision,
b_lat double precision,
b_long double precision,
line_id character varying(20),
type character varying(4),
freq numeric(10,5)
</pre>
<p>Set Two:</p>
<pre>
a_lat double precision,
a_long double precision,
b_lat double precision,
b_long double precision,
type character varying(4),
freq numeric(10,5)
</pre>
<p>We have btree indexes on all lat, long, type, and freq fields. "VACUUM ANALYZE" is run on both tables right before the query.</p>
<p>The Postgres query is:</p>
<pre><code>SELECT id
FROM setone one
WHERE not exists (
    SELECT 'x'
    FROM settwo two
    WHERE two.a_lat &gt;= one.a_lat - 0.000278
      and two.a_lat &lt;= one.a_lat + 0.000278
      and two.a_long &gt;= one.a_long - 0.000278
      and two.a_long &lt;= one.a_long + 0.000278
      and two.b_lat &gt;= one.b_lat - 0.000278
      and two.b_lat &lt;= one.b_lat + 0.000278
      and two.b_long &gt;= one.b_long - 0.000278
      and two.b_long &lt;= one.b_long + 0.000278
      and ( two.type = one.type or two.type = 'S' )
      and two.freq &gt;= one.freq - 1.0
      and two.freq &lt;= one.freq + 1.0
)
ORDER BY line_id
</code></pre>
<p>Is that the type of thing Hadoop can do? If so, can you point me in the right direction?</p>
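To make the matching rule in the query concrete, here is a minimal Python sketch of the same NOT EXISTS logic over in-memory rows. The names `Row`, `matches`, and `unmatched_ids` are hypothetical and not part of the original setup. The brute-force nested loop only illustrates the semantics (it ignores SQL NULL handling and is not a tuned or Hadoop-based implementation).

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional

COORD_TOL = 0.000278          # coordinate tolerance taken from the query
FREQ_TOL = Decimal("1.0")     # frequency tolerance taken from the query


@dataclass
class Row:
    """One row of either set; id and line_id exist only in set one."""
    a_lat: float
    a_long: float
    b_lat: float
    b_long: float
    type: str
    freq: Decimal
    id: Optional[str] = None
    line_id: Optional[str] = None


def matches(one: Row, two: Row) -> bool:
    """True if `two` satisfies the inner WHERE clause for `one`."""
    coords_ok = all(
        getattr(one, f) - COORD_TOL <= getattr(two, f) <= getattr(one, f) + COORD_TOL
        for f in ("a_lat", "a_long", "b_lat", "b_long")
    )
    type_ok = two.type == one.type or two.type == "S"
    freq_ok = one.freq - FREQ_TOL <= two.freq <= one.freq + FREQ_TOL
    return coords_ok and type_ok and freq_ok


def unmatched_ids(set_one: List[Row], set_two: List[Row]) -> List[str]:
    """Ids of set-one rows with no matching set-two row, ordered by line_id."""
    keep = [
        one for one in set_one
        if not any(matches(one, two) for two in set_two)
    ]
    return [one.id for one in sorted(keep, key=lambda r: r.line_id)]
```

In this sketch each set-one row is tested independently of every other set-one row, which is the property any attempt to spread the work across cores or machines would rely on.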