<p>Since you are new to this, I'll explain that you can use PHP's HTML parser, <a href="http://php.net/domdocument" rel="nofollow"><code>DOMDocument</code></a>, to extract what you need. You should <strong>not</strong> use regular expressions, as they are inherently error-prone when it comes to parsing HTML and can easily produce false positives.</p>
<p>To start, let's say you have your HTML:</p>
<pre><code>$html = '&lt;a href="http://www.mydomain.com/galeria/thumbnails.php?album=774" target="_blank"&gt;&lt;img alt="/" src="http://img255.imageshack.us/img00/000/000001.png" height="133" width="113"&gt;&lt;/a&gt;';
</code></pre>
<p>And now, we load that into DOMDocument:</p>
<pre><code>$doc = new DOMDocument;
$doc-&gt;loadHTML( $html);
</code></pre>
<p>Now that we have that HTML loaded, it's time to find the elements we need. Let's assume that you can encounter other <code>&lt;a&gt;</code> tags within your document, so we want to find only those <code>&lt;a&gt;</code> tags that have an <code>&lt;img&gt;</code> tag as a direct child. Then, once we have the correct nodes, we need to make sure we extract the correct information. So, let's have at it:</p>
<pre><code>$results = array();

// Loop over all of the &lt;a&gt; tags in the document
foreach( $doc-&gt;getElementsByTagName( 'a') as $a) {

    // If there are no children, continue on
    if( !$a-&gt;hasChildNodes()) continue;

    // Find the child &lt;img&gt; tag, if it exists
    foreach( $a-&gt;childNodes as $child) {
        if( $child-&gt;nodeType == XML_ELEMENT_NODE &amp;&amp; $child-&gt;tagName == 'img') {

            // Now we have the &lt;a&gt; tag in $a and the &lt;img&gt; tag in $child
            // Get the information we need:
            parse_str( parse_url( $a-&gt;getAttribute('href'), PHP_URL_QUERY), $a_params);
            $results[] = array( $a_params['album'], $child-&gt;getAttribute('src'));
        }
    }
}
</code></pre>
<p>A <code>print_r( $results);</code> now <a href="http://3v4l.org/2e2m4#v512" rel="nofollow">leaves us with</a>:</p>
<pre><code>Array
(
    [0] =&gt; Array
        (
            [0] =&gt; 774
            [1] =&gt; http://img255.imageshack.us/img00/000/000001.png
        )

)
</code></pre>
<p>Note that this omits basic error checking. One thing you can add in the inner <code>foreach</code> loop is a check that an <code>album</code> parameter was successfully parsed from the <code>&lt;a&gt;</code>'s <code>href</code> attribute, like so:</p>
<pre><code>if( isset( $a_params['album'])) {
    $results[] = array( $a_params['album'], $child-&gt;getAttribute('src'));
}
</code></pre>
<p>Every function I've used in this can be found in the <a href="http://php.net/manual" rel="nofollow">PHP documentation</a>.</p>
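<p>As a possible variation on the loop above, the same "<code>&lt;img&gt;</code> as a direct child of <code>&lt;a&gt;</code>" selection can be sketched with <code>DOMXPath</code>, with the <code>album</code> check folded in. This is only a sketch under assumptions: the function name <code>extractAlbumImages</code> is hypothetical and not part of the original answer, and suppressing libxml warnings for imperfect markup is a choice made for this example.</p>

```php
<?php
// Hypothetical helper (not from the answer above): returns an array of
// [album, src] pairs for every <img> that is a direct child of an <a>.
function extractAlbumImages(string $html): array
{
    $doc = new DOMDocument;

    // Collect libxml parse warnings internally instead of emitting them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = array();

    // "//a/img" matches only <img> elements whose parent is an <a>.
    foreach ($xpath->query('//a/img') as $img) {
        $href = $img->parentNode->getAttribute('href');

        // parse_url() gives null (no query string) or false (malformed URL)
        // when there is nothing usable to parse, so skip those links.
        $query = parse_url($href, PHP_URL_QUERY);
        if (!is_string($query)) {
            continue;
        }

        parse_str($query, $params);
        if (isset($params['album'])) {
            $results[] = array($params['album'], $img->getAttribute('src'));
        }
    }

    return $results;
}
```

<p>Calling <code>extractAlbumImages( $html)</code> with the <code>$html</code> string from the start of this answer should give the same single album/src pair as the loop version. Links without an <code>album</code> parameter or without a direct <code>&lt;img&gt;</code> child are skipped.</p>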