I would first try to just scrape the links to the relevant data files and use the resulting information to construct the full download path, including user logins and so on. As others have suggested, `lapply` is convenient for batch downloading.

Here's an easy way to extract the URLs. Obviously, modify the example to suit your actual scenario.

Here, we're going to use the `XML` package to identify all the links available at the CRAN archives for the Amelia package (http://cran.r-project.org/src/contrib/Archive/Amelia/).

```
> library(XML)
> url <- "http://cran.r-project.org/src/contrib/Archive/Amelia/"
> doc <- htmlParse(url)
> links <- xpathSApply(doc, "//a/@href")
> free(doc)
> links
                   href                    href                    href 
             "?C=N;O=D"              "?C=M;O=A"              "?C=S;O=A" 
                   href                    href                    href 
             "?C=D;O=A" "/src/contrib/Archive/"  "Amelia_1.1-23.tar.gz" 
                   href                    href                    href 
 "Amelia_1.1-29.tar.gz"  "Amelia_1.1-30.tar.gz"  "Amelia_1.1-32.tar.gz" 
                   href                    href                    href 
 "Amelia_1.1-33.tar.gz"   "Amelia_1.2-0.tar.gz"   "Amelia_1.2-1.tar.gz" 
                   href                    href                    href 
  "Amelia_1.2-2.tar.gz"   "Amelia_1.2-9.tar.gz"  "Amelia_1.2-12.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-13.tar.gz"  "Amelia_1.2-14.tar.gz"  "Amelia_1.2-15.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-16.tar.gz"  "Amelia_1.2-17.tar.gz"  "Amelia_1.2-18.tar.gz" 
                   href                    href                    href 
  "Amelia_1.5-4.tar.gz"   "Amelia_1.5-5.tar.gz"   "Amelia_1.6.1.tar.gz" 
                   href                    href                    href 
  "Amelia_1.6.3.tar.gz"   "Amelia_1.6.4.tar.gz"     "Amelia_1.7.tar.gz" 
```

For the sake of demonstration, imagine that, ultimately, we only want the links for the 1.2 versions of the package.

```
> wanted <- links[grepl("Amelia_1\\.2.*", links)]
> wanted
                   href                    href                    href 
  "Amelia_1.2-0.tar.gz"   "Amelia_1.2-1.tar.gz"   "Amelia_1.2-2.tar.gz" 
                   href                    href                    href 
  "Amelia_1.2-9.tar.gz"  "Amelia_1.2-12.tar.gz"  "Amelia_1.2-13.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-14.tar.gz"  "Amelia_1.2-15.tar.gz"  "Amelia_1.2-16.tar.gz" 
                   href                    href 
 "Amelia_1.2-17.tar.gz"  "Amelia_1.2-18.tar.gz" 
```

You can now use that vector as follows:

```
wanted <- links[grepl("Amelia_1\\.2.*", links)]
GetMe <- paste(url, wanted, sep = "")
lapply(seq_along(GetMe), function(x)
    download.file(GetMe[x], wanted[x], mode = "wb"))
```

---

### Update (to address your question in the comments)

The last step in the example above *downloads* the specified files to your current working directory (use `getwd()` to verify where that is). If, instead, you know for sure that `read.csv` works on the data, you can also modify your anonymous function to read the files directly:

```
lapply(seq_along(GetMe), function(x)
    read.csv(GetMe[x], header = TRUE, sep = "|", as.is = TRUE))
```

However, I think a *safer* approach might be to download all the files into a single directory first, and then use `read.delim` or `read.csv` or whatever works to read in the data, similar to what @Andreas suggested. I say *safer* because it gives you more flexibility if some files aren't fully downloaded: instead of having to redownload everything, you would only need to redownload the files that failed.
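For completeness, here is a minimal sketch of that safer two-step workflow. It reuses `GetMe` and `wanted` from above; the directory name `downloads` and the pipe-delimited format are just illustrative assumptions (the Amelia tarballs in the demo obviously aren't delimited text), so adapt both to your actual data.

```r
## Sketch of the "download first, read second" workflow.
## Assumes `GetMe` and `wanted` exist as constructed above.

## Step 1: download everything into a dedicated directory,
## skipping files already present from an earlier run.
dir.create("downloads", showWarnings = FALSE)
destfiles <- file.path("downloads", wanted)
for (i in seq_along(GetMe)) {
    if (!file.exists(destfiles[i])) {
        download.file(GetMe[i], destfiles[i], mode = "wb")
    }
}

## Step 2: read the local copies, assuming pipe-delimited text.
mydata <- lapply(destfiles, read.csv,
                 header = TRUE, sep = "|", as.is = TRUE)
```

Note that `file.exists()` only checks for presence, not completeness; if a transfer was interrupted partway, delete the suspect file and rerun Step 1, and only that file will be fetched again.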