Build a text from HTML using XPath
<p>I receive HTML like the sample below from a server. I rebuild its textual part by evaluating the XPath expression @"//text()" and appending the "nodeContent" value of each result to a string. The code looks something like this:</p>
<pre><code>// Append the content of each text node, one per line.
for (int i=2; i&lt;[resultXPathQuery count]; i++) {
    [mytext appendString:[[resultXPathQuery objectAtIndex:i] objectForKey:@"nodeContent"]];
    [mytext appendString:@"\n"];
}
</code></pre>
<p>I obtain:</p>
<pre><code>Line 1
line 2
line 3
line 4
</code></pre>
<p>How can I build the text so that the empty nodes are also taken into account?<br>I would like to obtain:</p>
<pre><code>Line 1
line 2

line 3



line 4
</code></pre>
<hr>
<pre><code>&lt;html&gt;&lt;head&gt;&lt;title&gt;A title&lt;/title&gt;&lt;style type="text/css"&gt;
ol{margin:0;padding:0}p{margin:0}
.c0{font-size:12pt;background-color:#ffffff;font-family:Times New Roman}
.c6{width:432.0pt;background-color:#ffffff;padding:72.0pt 90.0pt 72.0pt 90.0pt}
.c7{color:#aaaaaa;font-family:Times New Roman}
.c3{color:#0000ee;text-decoration:underline}
.c5{color:inherit;text-decoration:inherit}
.c2{font-size:12pt;font-family:Times New Roman}
.c4{height:12pt}.c1{direction:ltr}
body{color:#000000;font-size:12pt;font-family:Times New Roman}
h1{padding-top:12.0pt;line-height:1.0;text-align:left;color:#000000;font-size:24pt;font-family:Times New Roman;font-weight:bold;padding-bottom:12.0pt}
h2{padding-top:11.25pt;line-height:1.0;text-align:left;color:#000000;font-size:18pt;font-family:Times New Roman;font-weight:bold;padding-bottom:11.25pt}
h3{padding-top:12.0pt;line-height:1.0;text-align:left;color:#000000;font-size:14pt;font-family:Times New Roman;font-weight:bold;padding-bottom:12.0pt}
h4{padding-top:12.75pt;line-height:1.0;text-align:left;color:#000000;font-size:12pt;font-family:Times New Roman;font-weight:bold;padding-bottom:12.75pt}
h5{padding-top:12.75pt;line-height:1.0;text-align:left;color:#000000;font-size:9pt;font-family:Times New Roman;font-weight:bold;padding-bottom:12.75pt}
h6{padding-top:18.0pt;line-height:1.0;text-align:left;color:#000000;font-size:8pt;font-family:Times New Roman;font-weight:bold;padding-bottom:18.0pt}&lt;/style&gt;
&lt;/head&gt;
&lt;body class="c6"&gt;
&lt;p class="c1"&gt;&lt;span class="c2"&gt;A title&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1 c4"&gt;&lt;span class="c2"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4 c1"&gt;&lt;span class="c2"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c7"&gt;Line 1&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c7"&gt;line 2&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4 c1"&gt;&lt;span class="c7"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c7"&gt;line 3&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4 c1"&gt;&lt;span class="c7"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4 c1"&gt;&lt;span class="c7"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c3 c2"&gt;&lt;span class="c1"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c7"&gt;line 4&lt;/span&gt;&lt;/p&gt;
&lt;/body&gt;&lt;/html&gt;
</code></pre>
<p><strong>EDIT</strong></p>
<p>Actually, I noticed that the HTML can be more "complicated", so selecting all the span elements or all the p elements is not enough. Moreover, several span elements can appear in the same p element, and in that case I must not start a new line in my string.</p>
<p>This is the body of a more complicated HTML response:</p>
<pre><code>&lt;body class="c13"&gt;
&lt;p class="c5"&gt;&lt;span&gt;gfgfgfd&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5 c10"&gt;&lt;span&gt;ghhgfhgfh hghg hgkfhjgk ghjgkh ghjgjhg gjhjg gjhj gjhgjhgjhg gfhjkgjg jghjgfhjgf fghfj jghfj fghjggf jhgjgjgkjg&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1 c10"&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4"&gt;&lt;span&gt;gfgfgfd&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4"&gt;&lt;span&gt;f&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4"&gt; &lt;span&gt;gfdgfdg&lt;/span&gt; &lt;span class="c7"&gt;hg&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4"&gt;&lt;span class="c7"&gt;ghgfhgfh&lt;/span&gt;&lt;/p&gt;
&lt;p class="c4"&gt;&lt;span class="c7"&gt;gfhgfhgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt; &lt;span class="c7"&gt;hgfh &lt;/span&gt; &lt;span class="c0"&gt;gfdgfg&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;fgfdgfdgfd&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;gdfgdfgfd&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;gfgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0 c8"&gt;&lt;a class="c12" href="http://www.google.com"&gt;www.google.com&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;fgfdgfdg&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt; &lt;span class="c0"&gt;fgffgfdgfg&lt;/span&gt; &lt;span class="c0 c11"&gt;gfgfdgfd fgd fd&lt;/span&gt; &lt;span class="c0"&gt;fdgfdg&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;fgfdgfdgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;gfd&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;gfgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0 c8"&gt;&lt;a class="c12" href="mailto:…."&gt;...&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ol class="c9" start="1"&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;gfgfd&lt;/span&gt;&lt;/li&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;gfdgfd&lt;/span&gt;&lt;/li&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;gfdgfd&lt;/span&gt;&lt;/li&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;gdfgfd&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;hgfhgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;gfhgfh&lt;/span&gt;&lt;/p&gt;
&lt;p class="c5"&gt;&lt;span class="c0"&gt;hgfhgf&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ol class="c2" start="1"&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;gfhg&lt;/span&gt;&lt;/li&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;hgfh&lt;/span&gt;&lt;/li&gt;
&lt;li class="c3"&gt;&lt;span class="c0"&gt;hgf&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p class="c1"&gt;&lt;span class="c0"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 class="c5 c15"&gt;&lt;a name="h.kafwflosthlg"&gt;&lt;/a&gt;&lt;span class="c7 c14"&gt;hgfhgfh&lt;/span&gt;&lt;/h1&gt;
&lt;p class="c1"&gt;&lt;span class="c6"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c6"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p class="c1"&gt;&lt;span class="c6"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;/body&gt;
</code></pre>
<p>I need an XPath expression that selects the p, h1, h2, ..., h6 and li elements and takes their inner text into account in such a way that line breaks and empty lines are properly detected.</p>
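One possible direction, sketched here in Python rather than Objective-C so that it can be run with the standard library alone: select the block elements themselves instead of the text nodes, and take each element's whole text content as one line. With a full XPath 1.0 engine the selection could be written as a union such as `//p | //h1 | //h2 | //h3 | //h4 | //h5 | //h6 | //li`; `xml.etree.ElementTree` does not support unions, so the sketch filters by tag name instead. The names `blocks_to_text`, `BLOCK_TAGS` and `SAMPLE` are made up for illustration, the input is assumed to be well-formed markup, and nested block elements (for example a p inside an li) would be emitted twice.

```python
import xml.etree.ElementTree as ET

# Block-level elements that should each produce exactly one output line.
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


def blocks_to_text(markup: str) -> str:
    """Return one line per block element; empty elements yield empty lines."""
    root = ET.fromstring(markup)  # assumption: the markup is well-formed
    lines = []
    for element in root.iter():  # document order
        if element.tag in BLOCK_TAGS:
            # itertext() visits every descendant text node, so several
            # <span> children of one <p> end up on the same line.
            text = "".join(element.itertext())
            # Collapse runs of whitespace and trim both ends.
            lines.append(" ".join(text.split()))
    return "\n".join(lines)


# A short excerpt assembled from the two samples in the question.
SAMPLE = """<body class="c6">
<p class="c1"><span class="c7">Line 1</span></p>
<p class="c1"><span class="c7">line 2</span></p>
<p class="c4 c1"><span class="c7"></span></p>
<p class="c1"><span class="c7">line 3</span></p>
<p class="c4"> <span>gfdgfdg</span> <span class="c7">hg</span></p>
</body>"""

print(blocks_to_text(SAMPLE))
```

The empty p element contributes an empty string, which becomes a blank line between "line 2" and "line 3", while the two span elements of the last p are joined on a single line. The same idea should carry over to a libxml2-based query: iterate over the selected elements and read each element's full text content rather than iterating over `//text()`.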
 
