Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>To distill the described result, this would probably simplest and fastest:</p> <pre><code>SELECT DISTINCT ON (devicename, objectid) * FROM tbl ORDER BY devicename, objectid, ts DESC; </code></pre> <p>Details and explanation <a href="https://stackoverflow.com/a/7630564/939860">in this related answer</a>.</p> <p>From your sample data, I conclude that you are going to <strong>delete large portions</strong> of the original table. It is probably faster to just <a href="http://www.postgresql.org/docs/current/interactive/sql-truncate.html" rel="nofollow noreferrer"><code>TRUNCATE</code></a> the table (or <code>DROP</code> &amp; recreate, since you should add a surrogate pk column anyway) and write the remaining rows to it. This would also provide you with a prestine table, implicitly clustered (ordered) the way it's best for your queries and save the work that VACUUM would have to do otherwise. And it's probably still faster overall:</p> <p>I would also strongly advise to add a surrogate primary key to your table, preferably a <a href="http://www.postgresql.org/docs/current/interactive/datatype-numeric.html#DATATYPE-SERIAL" rel="nofollow noreferrer"><strong><code>serial</code></strong></a> column.</p> <pre><code>BEGIN; CREATE TEMP TABLE tmp_tbl ON COMMIT DROP AS SELECT DISTINCT ON (devicename, objectid) * FROM tbl ORDER BY devicename, objectid, ts DESC; TRUNCATE tbl; ALTER TABLE tbl ADD column tbl_id serial PRIMARY KEY; -- or, if you can afford to drop &amp; recreate: -- DROP TABLE tbl; -- CREATE TABLE tbl ( -- tbl_id serial PRIMARY KEY -- , address text -- , devicename text -- , objectid text -- , ts timestamp); INSERT INTO tbl (address, devicename, objectid, ts) SELECT address, devicename, objectid, ts FROM tmp_tbl; COMMIT; </code></pre> <p>Do it all within a transaction to make sure you are not going to fail half way through.</p> <p>This is fast as long as your setting for <a href="http://www.postgresql.org/docs/current/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY" rel="nofollow noreferrer"><code>temp_buffers</code></a> is big enough to hold the temporary table. Else the system will start to swap data to disk and performance takes a dive. You can set <code>temp_buffers</code> just for the current session like this:</p> <pre><code>SET temp_buffers = 1000MB; </code></pre> <p>So you don't waste RAM that you don't normally need for <code>temp_buffers</code>. Has to be before the first use of temporary objects in the session. More information in <a href="https://stackoverflow.com/a/8290958/939860">this related answer</a>.</p> <p>Also, as the <code>INSERT</code> follows a <code>TRUNCATE</code> inside a transaction, it will be easy on the <a href="http://www.postgresql.org/docs/current/interactive/wal-intro.html" rel="nofollow noreferrer">Write Ahead Log</a> - improving performance.</p> <p>Consider <code>CREATE TABLE AS</code> for the alternative route:</p> <ul> <li><a href="https://dba.stackexchange.com/q/114856/3684">What causes large INSERT to slow down and disk usage to explode?</a></li> </ul> <p>The only downside: You need an <strong>exclusive lock</strong> on the table. This may be a problem in databases with heavy concurrent load.</p> <p>Finally, never use <code>timestamp</code> as column name. It's a <a href="http://www.postgresql.org/docs/current/interactive/sql-keywords-appendix.html" rel="nofollow noreferrer">reserved word</a> in every SQL standard and a type name in PostgreSQL. I renamed the column to <code>ts</code> as you may have noticed.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload