I would first try to just scrape the links to the relevant data files and use the resulting information to construct the full download path, including user logins and so on. As others have suggested, `lapply` is convenient for batch downloading.

Here's an easy way to extract the URLs. Obviously, modify the example to suit your actual scenario.

Here, we're going to use the `XML` package to identify all the links available at the CRAN archives for the Amelia package (http://cran.r-project.org/src/contrib/Archive/Amelia/).

```
> library(XML)
> url <- "http://cran.r-project.org/src/contrib/Archive/Amelia/"
> doc <- htmlParse(url)
> links <- xpathSApply(doc, "//a/@href")
> free(doc)
> links
                   href                    href                    href 
             "?C=N;O=D"              "?C=M;O=A"              "?C=S;O=A" 
                   href                    href                    href 
             "?C=D;O=A" "/src/contrib/Archive/"  "Amelia_1.1-23.tar.gz" 
                   href                    href                    href 
 "Amelia_1.1-29.tar.gz"  "Amelia_1.1-30.tar.gz"  "Amelia_1.1-32.tar.gz" 
                   href                    href                    href 
 "Amelia_1.1-33.tar.gz"   "Amelia_1.2-0.tar.gz"   "Amelia_1.2-1.tar.gz" 
                   href                    href                    href 
  "Amelia_1.2-2.tar.gz"   "Amelia_1.2-9.tar.gz"  "Amelia_1.2-12.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-13.tar.gz"  "Amelia_1.2-14.tar.gz"  "Amelia_1.2-15.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-16.tar.gz"  "Amelia_1.2-17.tar.gz"  "Amelia_1.2-18.tar.gz" 
                   href                    href                    href 
  "Amelia_1.5-4.tar.gz"   "Amelia_1.5-5.tar.gz"   "Amelia_1.6.1.tar.gz" 
                   href                    href                    href 
  "Amelia_1.6.3.tar.gz"   "Amelia_1.6.4.tar.gz"     "Amelia_1.7.tar.gz" 
```

For the sake of demonstration, imagine that, ultimately, we only want the links for the 1.2 versions of the package.

```
> wanted <- links[grepl("Amelia_1\\.2.*", links)]
> wanted
                   href                    href                    href 
  "Amelia_1.2-0.tar.gz"   "Amelia_1.2-1.tar.gz"   "Amelia_1.2-2.tar.gz" 
                   href                    href                    href 
  "Amelia_1.2-9.tar.gz"  "Amelia_1.2-12.tar.gz"  "Amelia_1.2-13.tar.gz" 
                   href                    href                    href 
 "Amelia_1.2-14.tar.gz"  "Amelia_1.2-15.tar.gz"  "Amelia_1.2-16.tar.gz" 
                   href                    href 
 "Amelia_1.2-17.tar.gz"  "Amelia_1.2-18.tar.gz" 
```

You can now use that vector as follows:

```
wanted <- links[grepl("Amelia_1\\.2.*", links)]
GetMe <- paste(url, wanted, sep = "")
lapply(seq_along(GetMe), function(x)
    download.file(GetMe[x], wanted[x], mode = "wb"))
```

---

### Update (to address your question in the comments)

The last step in the example above *downloads* the specified files to your current working directory (use `getwd()` to verify where that is). If, instead, you know for sure that `read.csv` works on the data, you can also modify your anonymous function to read the files directly:

```
lapply(seq_along(GetMe), function(x)
    read.csv(GetMe[x], header = TRUE, sep = "|", as.is = TRUE))
```

However, I think a *safer* approach might be to download all the files into a single directory first, and then use `read.delim` or `read.csv` or whatever works to read in the data, similar to what @Andreas suggested. I say *safer* because it gives you more flexibility if some files aren't fully downloaded: instead of having to redownload everything, you would only need to redownload the files that failed.
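For completeness, here is a minimal sketch of that safer two-step workflow. It reuses `GetMe` and `wanted` from above; the directory name `downloads` and the pipe-delimited format are just illustrative assumptions (the Amelia tarballs in the demo obviously aren't delimited text), so adapt both to your actual data.

```r
## Sketch of the "download first, read second" workflow.
## Assumes `GetMe` and `wanted` exist as constructed above.

## Step 1: download everything into a dedicated directory,
## skipping files already present from an earlier run.
dir.create("downloads", showWarnings = FALSE)
destfiles <- file.path("downloads", wanted)
for (i in seq_along(GetMe)) {
    if (!file.exists(destfiles[i])) {
        download.file(GetMe[i], destfiles[i], mode = "wb")
    }
}

## Step 2: read the local copies, assuming pipe-delimited text.
mydata <- lapply(destfiles, read.csv,
                 header = TRUE, sep = "|", as.is = TRUE)
```

Note that `file.exists()` only checks for presence, not completeness; if a transfer was interrupted partway, delete the suspect file and rerun Step 1, and only that file will be fetched again.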