
Parallel execution in cx_Oracle
<p>I have recently joined a new company and am new to Python (their preferred scripting language), and I have been working with cx_Oracle to create some ETL processes. The scripts I have built so far are single-threaded jobs that select the subset of columns I need from an Oracle source DB and write the output to a named pipe, where an external process is waiting to read that data and insert it into the target.</p>
<p>This worked fine until I got to some tables in the 500 million to 2 billion row range. The job still works, but it takes many hours to complete. These large source tables are partitioned, so I have been researching ways to coordinate parallel reads of different partitions so that two or more threads can work concurrently, each writing to a separate named pipe.</p>
<p>Is there an elegant way in cx_Oracle to handle multiple threads reading from different partitions of the same table?</p>
<p>Here's my current (simple) code:</p>
<pre><code>import csv

import cx_Oracle

# Connect via a SQL*Net string, or pass each segment as a separate argument.
connection = cx_Oracle.connect("user/password@TNS")

# Pipe-delimited output with no quoting; special characters are backslash-escaped.
csv.register_dialect('pipe_delimited', escapechar='\\', delimiter='|', quoting=csv.QUOTE_NONE)

cursor = connection.cursor()
f = open("&lt;path_to_named_pipe&gt;", "w")
writer = csv.writer(f, dialect='pipe_delimited', lineterminator="\n")

# Stream every row of the source table into the named pipe.
r = cursor.execute("""SELECT &lt;column_list&gt; from &lt;SOURCE_TABLE&gt;""")
for row in cursor:
    writer.writerow(row)

f.close()
</code></pre>
<p>Some of my source tables have over 1000 partitions, so hard-coding the partition names isn't the preferred option. I have been thinking about setting up arrays of partition names and iterating through them, but if folks have other ideas I'd love to hear them.</p>
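<p>To make the partition-array idea concrete, here is a rough sketch of what I have in mind. It has not been run against a real database. The function names (<code>list_partitions</code>, <code>partition_query</code>, <code>export_partitions</code>, <code>export_parallel</code>) and the <code>connect</code> callable are hypothetical names I made up for illustration. It assumes the partition names can be read from Oracle's <code>ALL_TAB_PARTITIONS</code> dictionary view, that each partition can be selected with a <code>PARTITION (name)</code> clause, and that each thread opens its own connection.</p>

```python
import csv
from concurrent.futures import ThreadPoolExecutor

# Assumption: partition names are looked up in the ALL_TAB_PARTITIONS view.
PARTITION_SQL = """SELECT partition_name FROM all_tab_partitions
                   WHERE table_owner = :tab_owner AND table_name = :tab_name
                   ORDER BY partition_position"""


def list_partitions(connection, owner, table):
    """Read partition names from the data dictionary instead of hard-coding them."""
    cursor = connection.cursor()
    cursor.execute(PARTITION_SQL, {"tab_owner": owner, "tab_name": table})
    return [row[0] for row in cursor]


def partition_query(columns, table, partition):
    """Build a SELECT restricted to a single partition."""
    # Identifiers cannot be bind variables, so only pass names that came
    # from the data dictionary, never untrusted input.
    return "SELECT {} FROM {} PARTITION ({})".format(columns, table, partition)


def export_partitions(connect, columns, table, partitions, pipe_path):
    """Worker: stream the given partitions, one after another, into one pipe."""
    connection = connect()  # assumption: one connection per thread
    try:
        cursor = connection.cursor()
        rows = 0
        with open(pipe_path, "w", newline="") as f:
            writer = csv.writer(f, delimiter="|", escapechar="\\",
                                quoting=csv.QUOTE_NONE, lineterminator="\n")
            for partition in partitions:
                cursor.execute(partition_query(columns, table, partition))
                for row in cursor:
                    writer.writerow(row)
                    rows += 1
        return rows
    finally:
        connection.close()


def export_parallel(connect, columns, table, partitions, pipe_paths):
    """Deal partitions round-robin to one worker thread per named pipe."""
    n = len(pipe_paths)
    chunks = [partitions[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(export_partitions, connect, columns, table, chunk, path)
                   for chunk, path in zip(chunks, pipe_paths)]
        # result() re-raises any exception from a worker thread.
        return sum(f.result() for f in futures)
```

<p>Here <code>connect</code> would be something like <code>lambda: cx_Oracle.connect("user/password@TNS")</code>, and <code>pipe_paths</code> a list with one named pipe per external reader. Each pipe needs its reader already waiting, since opening a named pipe for writing blocks until the other end is opened.</p>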
 
