<p>Also note that some data is proprietary and is regarded by its owners as intellectual property. Some sites, such as currency exchange sites, search engines and stock market trackers, particularly dislike having their data crawled, since their business is essentially selling the very data you're crawling.</p>
<p>That being said, in the US you cannot copyright data itself, only how you format the data. So according to US law it's OK to grab crawled data as long as you don't store it in its original formatting (HTML).</p>
<p>But in a lot of European countries data itself can be copyrighted. And the web is a global beast: people from Europe can visit your site, which according to the law in some countries means that you are doing business in those countries. So even if you are protected legally in the US, it doesn't mean that you won't get sued elsewhere in the world.</p>
<p>My advice is to go through the site and read its usage policy. If the site explicitly disallows crawling then you shouldn't do it. And as Jim mentioned, respect robots.txt.</p>
<p>Then again, there is ample legal precedent from courts around the world that makes search engines legal, and search engines are themselves voracious web crawlers. On the other hand, it looks like almost every year at least one news agency sues or tries to sue Google for web crawling.</p>
<p>With all the above in mind, be very careful what you do with crawled data. I would say private use is OK as long as you don't overload the servers. I myself do it regularly to get TV programming schedules etc.</p>
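<p>For the robots.txt and server-load points, here is a minimal sketch of how a small private crawler might check permission before fetching, using Python's standard <code>urllib.robotparser</code>. The user agent name, the sample rules and the one-second fallback delay are all assumptions made up for illustration, not something any particular site requires.</p>

```python
from urllib import robotparser

# Hypothetical user agent name for a private crawler (an assumption).
USER_AGENT = "example-crawler"

# Sample robots.txt content, invented for illustration only.
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /private/",
    "Crawl-delay: 2",
]


def build_parser(robots_lines):
    """Build a parser from robots.txt lines that were already downloaded."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser


def allowed(parser, url, user_agent=USER_AGENT):
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser, user_agent=USER_AGENT, default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay if given,
    otherwise a fallback chosen here (an assumption, not a standard)."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS)
    for url in ("https://example.com/tv/schedule",
                "https://example.com/private/data"):
        print(url, "allowed" if allowed(rules, url) else "disallowed")
    print("wait", polite_delay(rules), "seconds between requests")
```

<p>In a real crawler you would load the rules from the site's own robots.txt (for example with <code>set_url()</code> and <code>read()</code>) and sleep for the returned delay between requests. Passing this check says nothing about the legal questions above; it only covers the robots.txt and server-load courtesy.</p>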
 
