  1. Replacing <p>, <div> tags within <td> tags?
<p>I'm working on a specialized HTML stripper. The current stripper replaces &lt;td&gt; tags with tabs, then &lt;p&gt; and &lt;div&gt; tags with double carriage returns. However, when stripping code like this:</p>
<pre><code>&lt;td&gt;First Text&lt;/td&gt;&lt;td style="background:#330000"&gt;&lt;p style="color:#660000;text-align:center"&gt;Some Text&lt;/p&gt;&lt;/td&gt;
</code></pre>
<p>It (obviously) produces</p>
<pre><code>First Text

Some Text
</code></pre>
<p>We'd like to have the &lt;p&gt; replaced with nothing in this case, so it produces:</p>
<pre><code>First Text (tab) Some Text
</code></pre>
<p>However, we'd like to keep the double carriage-return replacement for other code where the &lt;p&gt; tag is not surrounded by &lt;td&gt; tags.</p>
<p>Basically, we're trying to replace &lt;td&gt; tags with \t always, and &lt;p&gt; and &lt;div&gt; tags with \r\r ONLY when they're not surrounded by &lt;td&gt; tags.</p>
<p>Current code (C#):</p>
<pre><code>// insert tabs in place of &lt;TD&gt; tags
result = System.Text.RegularExpressions.Regex.Replace(result,
    @"&lt;td\b(?:[^&gt;""']|""[^""]*""|'[^']*')*&gt;", "\t",
    System.Text.RegularExpressions.RegexOptions.IgnoreCase);

// insert paragraph breaks (double line breaks) in place
// of &lt;P&gt;, &lt;DIV&gt; and &lt;TR&gt; tags
result = System.Text.RegularExpressions.Regex.Replace(result,
    @"&lt;(div|tr|p)\b(?:[^&gt;""']|""[^""]*""|'[^']*')*&gt;", "\r\r",
    System.Text.RegularExpressions.RegexOptions.IgnoreCase);
</code></pre>
<p>(There's more code to the stripper; this is the relevant part.)</p>
<p>Any ideas on how to do this without completely rewriting the entire stripper?</p>
<p>EDIT: I'd prefer not to use a library, due to the headaches of getting it signed off on and included with the project (which itself is a library to be included in another project), not to mention the legal issues. If there is no other solution, though, I'll probably use the HTML Agility Pack.</p>
<p>Mostly, the stripper just strips out anything it finds that looks like a tag (done with a large regex based on a regex in Regular Expressions Cookbook). That, replacing line break tags with \r, and dealing with multiple tabs make up the bulk of the custom stripping code.</p>
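<p>One possible direction, offered only as a sketch: merge the two replacements into a single pass that counts how many &lt;td&gt; elements are currently open, and emit \r\r for &lt;p&gt; and &lt;div&gt; only when that count is zero. The class name <code>TableAwareStripper</code> and the method name <code>ReplaceBlockTags</code> are hypothetical and not part of the original stripper. The sketch assumes every &lt;td&gt; has an explicit &lt;/td&gt;, and it leaves closing tags in place on the assumption that the later generic tag-stripping regex removes them.</p>

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names here are illustrative, not from the original stripper.
public static class TableAwareStripper
{
    // Matches opening and closing <td>, <div>, <tr> and <p> tags, reusing the
    // attribute-skipping pattern from the question's regexes.
    private static readonly Regex TagPattern = new Regex(
        @"<(/?)(td|div|tr|p)\b(?:[^>""']|""[^""]*""|'[^']*')*>",
        RegexOptions.IgnoreCase);

    public static string ReplaceBlockTags(string html)
    {
        int tdDepth = 0; // number of <td> elements currently open

        // Replace visits matches left to right, so tdDepth reflects
        // the nesting at the position of each tag.
        return TagPattern.Replace(html, match =>
        {
            bool isClosing = match.Groups[1].Length > 0;
            string name = match.Groups[2].Value.ToLowerInvariant();

            if (name == "td")
            {
                if (isClosing)
                {
                    if (tdDepth > 0) tdDepth--;
                    return match.Value; // assumed to be removed by the generic stripper later
                }
                tdDepth++;
                return "\t"; // <td> always becomes a tab
            }

            // Other closing tags are left untouched, as in the original code.
            if (isClosing) return match.Value;

            // <tr> keeps its paragraph break unconditionally.
            if (name == "tr") return "\r\r";

            // <p> and <div>: nothing inside a cell, a paragraph break elsewhere.
            return tdDepth > 0 ? "" : "\r\r";
        });
    }
}
```

<p>Calling <code>result = TableAwareStripper.ReplaceBlockTags(result);</code> would stand in for both <code>Regex.Replace</code> calls above. The depth counter is a heuristic rather than real HTML parsing: markup that omits &lt;/td&gt; or uses a self-closing &lt;td/&gt; would leave the counter too high, so resetting it at each &lt;tr&gt; may be worth considering if the input has no nested tables.</p>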
 
