How to force STORE (overwrite) to HDFS in Pig?
<p>When developing Pig scripts that use the <em>STORE</em> command, I have to delete the output directory before every run, or the script stops with:</p>
<pre><code>2012-06-19 19:22:49,680 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 6000: Output Location Validation Failed for: 'hdfs://[server]/user/[user]/foo/bar More info to follow: Output directory hdfs://[server]/user/[user]/foo/bar already exists </code></pre>
<p>So I'm <strong>looking for an in-Pig solution that automatically removes the directory</strong> and that doesn't choke if the directory doesn't exist at call time.</p>
<p>In the Pig Latin Reference I found the shell command invoker <em>fs</em>. Unfortunately, the Pig script breaks whenever anything produces an error. So I can't use</p>
<pre><code>fs -rmr foo/bar </code></pre>
<p>(i.e. remove recursively), since it breaks if the directory doesn't exist. For a moment I thought I might use</p>
<pre><code>fs -test -e foo/bar </code></pre>
<p>which is a test and shouldn't break, or so I thought. However, Pig again interprets <code>test</code>'s return code for a non-existent directory as a failure code and breaks.</p>
<p>There is a <a href="https://issues.apache.org/jira/browse/PIG-259" rel="noreferrer">JIRA ticket</a> for the Pig project that addresses my problem and suggests an optional parameter <em>OVERWRITE</em> or <em>FORCE_WRITE</em> for the <em>STORE</em> command. However, I'm using Pig 0.8.1 out of necessity, and there is no such parameter.</p>
 
