<p><strong>Edit, Dec 2013:</strong> Google has deprecated the old <code>Xml</code> service, replacing it with <a href="https://developers.google.com/apps-script/reference/xml-service/xml-servic" rel="nofollow noreferrer"><code>XmlService</code></a>. The script in this answer has been updated to use the new service. The new service requires standards-compliant XML &amp; HTML, while the old one was forgiving of such problems as missing close-tags.</p>
<hr>
<p>Have a look at the <a href="https://developers.google.com/apps-script/articles/XML_tutorial" rel="nofollow noreferrer">Tutorial: Parsing an XML Document</a>. (As of Dec 2013, this tutorial is still online, although the <code>Xml</code> service is deprecated.) Starting with that foundation, you can take advantage of the XML parsing in Script Services to navigate the page. Here's a small script operating on your example:</p>
<pre><code>function getProgrammeList() {
  // The closing &lt;/body&gt; tag is included so that the markup is well-formed.
  var txt = '&lt;html&gt; &lt;body&gt; &lt;div&gt; &lt;div&gt; &lt;div id="here"&gt;hello world!!&lt;/div&gt; &lt;/div&gt; &lt;/div&gt; &lt;/body&gt; &lt;/html&gt;';

  // Put the received XML response into XmlDocument format.
  // Note: Xml.parse belongs to the old, deprecated Xml service.
  var doc = Xml.parse(txt, true);
  Logger.log(doc.html.body.div.div.div.id + " = " + doc.html.body.div.div.div.Text);  /// here = hello world!!
  debugger;  // Pause in debugger - examine content of doc
}
</code></pre>
<p>To get the real page, start with this:</p>
<pre><code>var url = 'http://blah.blah/whatever?querystring=foobar';
var txt = UrlFetchApp.fetch(url).getContentText();
....
</code></pre>
<p>If you look at the documentation for <a href="https://developers.google.com/apps-script/reference/xml/xml-element#getElements%28String%29" rel="nofollow noreferrer"><code>getElements</code></a> you'll see that there is support for retrieving specific tags, for example "div". That finds only the direct children of a specific element; it doesn't explore the entire XML document. You should be able to write a function that traverses the document, examining the <code>id</code> of each <code>div</code> element until it finds your programme list.</p>
<pre><code>var programmeList = getDivById(doc, "here");
</code></pre>
<hr>
<h3>Edit - I couldn't help myself...</h3>
<p>Here's a utility function that will do just that.</p>
<pre><code>/**
 * Find a &lt;div&gt; tag with the given id.
 * &lt;pre&gt;
 * Example: getDivById( html, 'tagVal' ) will find
 *
 *          &lt;div id="tagVal"&gt;
 * &lt;/pre&gt;
 *
 * @param {Element|Document} element  XML document or element to start search at.
 * @param {String}           id       HTML &lt;div&gt; id to find.
 *
 * @return {Element}  First matching element (in doc order) or null.
 */
function getDivById( element, id ) {
  // Call utility function to do the work.
  return getElementByVal( element, 'div', 'id', id );
}

/**
 * !Now updated for XmlService!
 *
 * Traverse the given Xml Document or Element looking for a match.
 * Note: 'class' is stripped during parsing and cannot be used for
 * searching, I don't know why.
 * &lt;pre&gt;
 * Example: getElementByVal( body, 'input', 'value', 'Go' ); will find
 *
 *          &lt;input type="submit" name="btn" value="Go" id="btn" class="submit buttonGradient" /&gt;
 * &lt;/pre&gt;
 *
 * @param {Element|Document} element      XML document or element to start search at.
 * @param {String}           elementType  XML element type, e.g. 'div' for &lt;div&gt;
 * @param {String}           attr         Attribute or Property to compare.
 * @param {String}           val          Search value to locate
 *
 * @return {Element}  First matching element (in doc order) or null.
 */
function getElementByVal( element, elementType, attr, val ) {
  // Get all descendants, in document order
  var descendants = element.getDescendants();
  for (var i = 0; i &lt; descendants.length; i++) {
    var content = descendants[i];
    // We'll only examine ELEMENTs
    if (content.getType() == XmlService.ContentTypes.ELEMENT) {
      // Use a separate name so the 'element' parameter is not overwritten.
      var candidate = content.asElement();
      if (candidate.getName() === elementType) {
        // getAttribute() returns null when the attribute is absent,
        // so check before calling getValue().
        var attribute = candidate.getAttribute(attr);
        if (attribute !== null &amp;&amp; val === attribute.getValue()) {
          return candidate;
        }
      }
    }
  }
  // No matches in document
  return null;
}
</code></pre>
<p>Applying this to your example, we get this:</p>
<pre><code>function getProgrammeList() {
  // XmlService needs well-formed markup, so the closing &lt;/body&gt; tag is included.
  var txt = '&lt;html&gt; &lt;body&gt; &lt;div&gt; &lt;div&gt; &lt;div id="here"&gt;hello world!!&lt;/div&gt; &lt;/div&gt; &lt;/div&gt; &lt;/body&gt; &lt;/html&gt;';

  // Get the received XML response into an XML document
  var doc = XmlService.parse(txt);

  var found = getDivById(doc.getRootElement(), 'here');
  Logger.log(found.getAttribute('id').getValue() + " = " + found.getValue());  /// here = hello world!!
}
</code></pre>
<p><strong>Note:</strong> See <a href="https://stackoverflow.com/questions/16858731/html-div-nesting-using-google-fetchurl/16860598#16860598">this answer</a> for a practical example of the use of these utilities.</p>
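<p>If it helps to see the traversal idea on its own, outside Apps Script, here is a minimal sketch in plain JavaScript. It is only an illustration under stated assumptions: the node shape (<code>name</code>, <code>attrs</code>, <code>text</code>, <code>children</code>) and the function name <code>findByAttr</code> are hypothetical and are not part of <code>XmlService</code>. It walks the tree depth first, returns the first match in document order, and treats a missing attribute as a non-match.</p>

```javascript
// Hypothetical node shape, for illustration only: { name, attrs, text, children }.
// This is NOT the XmlService Element API; it only mirrors the traversal idea.
function findByAttr(node, tagName, attr, val) {
  // Check the current node first so that matches come back in document order.
  // A node without the attribute yields undefined here, which never equals val.
  if (node.name === tagName && node.attrs && node.attrs[attr] === val) {
    return node;
  }
  // Then search each child subtree, depth first.
  var children = node.children || [];
  for (var i = 0; i < children.length; i++) {
    var hit = findByAttr(children[i], tagName, attr, val);
    if (hit) {
      return hit;
    }
  }
  // No match anywhere under this node.
  return null;
}

// Same nesting as the example page in the answer, written as plain objects.
var page = {
  name: 'html', attrs: {}, children: [
    { name: 'body', attrs: {}, children: [
      { name: 'div', attrs: {}, children: [
        { name: 'div', attrs: {}, children: [
          { name: 'div', attrs: { id: 'here' }, text: 'hello world!!', children: [] }
        ] }
      ] }
    ] }
  ]
};

var found = findByAttr(page, 'div', 'id', 'here');
```

<p>Here <code>found</code> refers to the innermost <code>div</code>, and a search for an id that does not exist returns <code>null</code>, which is the same contract as <code>getElementByVal</code>.</p>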