<pre><code>library(RCurl)
library(XML)

# Download page using RCurl
# You may need to set proxy details, etc., in the call to getURL
theurl &lt;- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
webpage &lt;- getURL(theurl)

# Process escape characters
webpage &lt;- readLines(tc &lt;- textConnection(webpage)); close(tc)

# Parse the html tree, ignoring errors on the page
pagetree &lt;- htmlTreeParse(webpage, error=function(...){})

# Navigate your way through the tree. It may be possible to do this more efficiently using getNodeSet
body &lt;- pagetree$children$html$children$body
divbodyContent &lt;- body$children$div$children[[1]]$children$div$children[[4]]
tables &lt;- divbodyContent$children[names(divbodyContent)=="table"]

# In this case, the required table is the only one with class "wikitable sortable"
tableclasses &lt;- sapply(tables, function(x) x$attributes["class"])
thetable &lt;- tables[which(tableclasses=="wikitable sortable")]$table

# Get column headers
headers &lt;- thetable$children[[1]]$children
columnnames &lt;- unname(sapply(headers, function(x) x$children$text$value))

# Get rows from table
content &lt;- c()
for(i in 2:length(thetable$children))
{
   tablerow &lt;- thetable$children[[i]]$children
   opponent &lt;- tablerow[[1]]$children[[2]]$children$text$value
   others &lt;- unname(sapply(tablerow[-1], function(x) x$children$text$value))
   content &lt;- rbind(content, c(opponent, others))
}

# Convert to data frame
colnames(content) &lt;- columnnames
as.data.frame(content)
</code></pre>
<p><strong>Edited to add:</strong></p>
<p>Sample output</p>
<pre><code>   Opponent Played Won Drawn Lost Goals for Goals against % Won
1 Argentina     94  36    24   34       148           150 38.3%
2  Paraguay     72  44    17   11       160            61 61.1%
3   Uruguay     72  33    19   20       127            93 45.8%
...
</code></pre>
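The code comment above suggests that getNodeSet might make the tree navigation more efficient. A minimal sketch of that idea follows, assuming the XML package is installed. The inline HTML string is a made-up stand-in for the downloaded page (it reuses two rows from the sample output), and the XPath expressions are illustrative, not taken from the original answer, so they may need adjusting for the real page structure.

```r
library(XML)

# Hypothetical stand-in for the downloaded page: a tiny table with the
# same class attribute the answer looks for on the Wikipedia page.
html <- paste0(
  "<html><body>",
  "<table class='wikitable sortable'>",
  "<tr><th>Opponent</th><th>Played</th><th>Won</th></tr>",
  "<tr><td>Argentina</td><td>94</td><td>36</td></tr>",
  "<tr><td>Paraguay</td><td>72</td><td>44</td></tr>",
  "</table>",
  "</body></html>"
)

# Parse into an internal document so XPath queries can be used
doc <- htmlParse(html, asText = TRUE)

# Select the table by its class instead of walking the tree by position
tbl <- getNodeSet(doc, "//table[@class='wikitable sortable']")[[1]]

# Column headers come from the th cells of the first row
headers <- xpathSApply(tbl, ".//tr[1]/th", xmlValue)

# Every later row contributes one vector of td cell values
rows <- getNodeSet(tbl, ".//tr[position() > 1]")
cells <- t(sapply(rows, function(r) xpathSApply(r, "./td", xmlValue)))

# Convert to data frame
result <- as.data.frame(cells, stringsAsFactors = FALSE)
colnames(result) <- headers
result
```

Selecting by class attribute avoids the hard-coded child positions (such as `children[[4]]`), which are likely to break whenever the page layout changes. All values come back as character strings, so numeric columns would still need converting with `as.numeric`.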
 
