<p>The original answer and my comments needed more detail than fits comfortably in the comments section, so I decided to write a new answer.</p>
<p>First off, what you are attempting to do <em>IS</em> a programming task already, which <em>WILL</em> require some programming aptitude, depending on your exact needs.</p>
<p>Secondly, some of the other answers suggest character-by-character search loops or regular expressions. Both are <em>horribly</em> error-prone ways to parse HTML, as discussed, for example, <a href="http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html" rel="nofollow">here</a>.</p>
<p>The normal way to parse HTML or XML nowadays is to use a library designed for the job. These libraries are more or less standard by now, and many programming languages ship one built in.</p>
<p>I am rusty on both C and XPath, but for your particular needs it should work approximately like this:</p>
<ul>
<li>Start up an XML/HTML parser.</li>
<li>Load your HTML document into it as a character string.</li>
<li>Tell the parser to find all instances of the tag you are interested in (using XPath).</li>
<li>It will return a "set of nodes".</li>
<li>Process the set of nodes in a loop, doing with each tag whatever you need.</li>
</ul>
<p>I found some other examples, maybe this one is better: <a href="http://xmlsoft.org/example.html" rel="nofollow">http://xmlsoft.org/example.html</a></p>
<p>That example works on an XML document. HTML is not a subset of XML, and real-world pages are often not well-formed XML, so load your document with the library's HTML parser (libxml2 provides one) rather than its XML parser. Once the document is parsed, the XPath search works the same way.</p>
<p>In Python or a similar language this would be extremely easy. In pseudocode it would look like this:</p>
<pre><code>p = new HTMLParser
p-&gt;load(my html document)
resultset = p-&gt;XPath_Search("//a")  # this will find all A elements in the HTML document
for each result of resultset:
    write(result.href)
end for
</code></pre>
<p>This writes out the <code>href</code> attribute of every <code>a</code> element in the document. A decent tutorial on what you can use XPath for is, for example, <a href="http://zvon.org/comp/r/tut-XPath_1.html#Pages~List_of_XPaths" rel="nofollow">here</a>.</p>
<p>I am afraid that in C this would be somewhat more convoluted, but the idea is the same, and it IS a programming task.</p>
<p>If this is quick-and-dirty work, you might use the suggested <code>strstr()</code> or regexp searches, with no external libraries. However, please keep in mind that, depending on your exact task, you are very likely to miss a number of outgoing links or misread their contents.</p>
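<p>As a runnable counterpart to the pseudocode above, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the libxml2/XPath approach itself: it uses the standard library's event-based <code>html.parser</code> module (no XPath) so that it runs without installing anything. The names <code>LinkCollector</code>, <code>extract_links</code> and <code>SAMPLE_HTML</code> are made up for this example, and the sample markup is hypothetical.</p>

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> element it encounters."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lowercased.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def extract_links(html_text):
    """Return the href values of all <a> elements, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hrefs


# Hypothetical input with markup that naive string searches tend to mishandle:
# upper-case names, attributes split across lines, anchors without href,
# unquoted attribute values and an unclosed paragraph.
SAMPLE_HTML = """
<html><body>
  <A HREF="http://example.com/one">upper-case tag and attribute</A>
  <a class="x"
     href='http://example.com/two'>attribute on a second line</a>
  <a name="anchor-only">no href here</a>
  <p>Unclosed paragraph with <a href=/relative/three>an unquoted link</a>
</body></html>
"""

if __name__ == "__main__":
    for href in extract_links(SAMPLE_HTML):
        print(href)
```

<p>With a library that does support XPath, such as lxml in Python or libxml2 in C, the collector class would likely reduce to a single query along the lines of <code>//a/@href</code>, which is the approach the steps above describe.</p>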
 
