VERY slow running regular expression when using large documents
<p>I need to convert inline CSS style attributes to their HTML tag equivalents. The solution I have works but runs VERY slowly using the Microsoft .NET Regex class on long documents (~40 pages of HTML). I've tried several variations but with no useful results. I've put a thin wrapper around executing the expressions, but in the end it is just the built-in Regex Replace method that gets called.</p>
<p>I'm sure I'm abusing the greediness of the regex, but I'm not sure of a way around it to achieve what I want using a single regex.</p>
<p>I want to be able to run the following unit tests:</p>
<pre><code>[Test]
public void TestCleanReplacesFontWeightWithB()
{
    string html = "&lt;font style=\"font-weight:bold\"&gt;Bold Text&lt;/font&gt;";
    html = Q4.PrWorkflow.Helper.CleanFormatting(html);
    Assert.AreEqual("&lt;b&gt;Bold Text&lt;/b&gt;", html);
}

[Test]
public void TestCleanReplacesMultipleAttributesFontWeightWithB()
{
    string html = "&lt;font style=\"font-weight:bold; color: blue; \"&gt;Bold Text&lt;/font&gt;";
    html = Q4.PrWorkflow.Helper.CleanFormatting(html);
    Assert.AreEqual("&lt;b&gt;Bold Text&lt;/b&gt;", html);
}

[Test]
public void TestCleanReplaceAttributesBoldAndUnderlineWithHtml()
{
    string html = "&lt;span style=\"font-weight:bold; color: blue; text-decoration: underline; \"&gt;Bold Text&lt;/span&gt;";
    html = Q4.PrWorkflow.Helper.CleanFormatting(html);
    Assert.AreEqual("&lt;u&gt;&lt;b&gt;Bold Text&lt;/b&gt;&lt;/u&gt;", html);
}

[Test]
public void TestCleanReplaceAttributesBoldUnderlineAndItalicWithHtml()
{
    string html = "&lt;span style=\"font-weight:bold; color: blue; font-style: italic; text-decoration: underline; \"&gt;Bold Text&lt;/span&gt;";
    html = Q4.PrWorkflow.Helper.CleanFormatting(html);
    Assert.AreEqual("&lt;u&gt;&lt;b&gt;&lt;i&gt;Bold Text&lt;/i&gt;&lt;/b&gt;&lt;/u&gt;", html);
}

[Test]
public void TestCleanReplacesFontWeightWithSpaceWithB()
{
    string html = "&lt;font size=\"10\" style=\"font-weight: bold\"&gt;Bold Text&lt;/font&gt;";
    html = Q4.PrWorkflow.Helper.CleanFormatting(html);
    Assert.AreEqual("&lt;b&gt;Bold Text&lt;/b&gt;", html);
}
</code></pre>
<p>The regular expression I am using to achieve this logic works but is VERY slow. The regex in the C# code looks like this:</p>
<pre><code>public static IReplacePattern IncludeInlineItalicToITag(ICleanUpHtmlFactory factory)
{
    return factory.CreateReplacePattern(
        "(&lt;(span|font) .*?style=\".*?font-style:\\s*italic[^&gt;]*&gt;)(.*?)&lt;/\\2&gt;",
        "$1&lt;i&gt;$3&lt;/i&gt;&lt;/$2&gt;");
}

public static IReplacePattern IncludeInlineBoldToBTag(ICleanUpHtmlFactory factory)
{
    return factory.CreateReplacePattern(
        "(&lt;(span|font) .*?style=\".*?font-weight:\\s*bold[^&gt;]*&gt;)(.*?)&lt;/\\2&gt;",
        "$1&lt;b&gt;$3&lt;/b&gt;&lt;/$2&gt;");
}

public static IReplacePattern IncludeInlineUnderlineToUTag(ICleanUpHtmlFactory factory)
{
    return factory.CreateReplacePattern(
        "(&lt;(span|font) .*?style=\".*?text-decoration:\\s*underline[^&gt;]*&gt;)(.*?)&lt;/\\2&gt;",
        "$1&lt;u&gt;$3&lt;/u&gt;&lt;/$2&gt;");
}
</code></pre>
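<p>One possible direction, offered as a sketch rather than a measured fix: in the patterns above, the two lazy <code>.*?</code> runs before <code>style="</code> and before the CSS property are free to run past the end of the opening tag, so every <code>span</code> or <code>font</code> tag that lacks the property can trigger a long forward scan before the attempt fails. Restricting those runs to <code>[^&gt;]</code> and <code>[^"]</code> keeps a failed attempt inside the tag. The class name <code>InlineStyleSketch</code> and the method <code>WrapBold</code> are hypothetical and stand in for whatever <code>CreateReplacePattern</code> does internally; the use of <code>RegexOptions.Compiled</code> is also an assumption.</p>

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original code base.
public static class InlineStyleSketch
{
    // Same shape as the original bold pattern, with two changes:
    //   [^>]*? replaces the first .*? so the search for style=" stays inside the opening tag
    //   [^"]*? replaces the second .*? so the search for font-weight stays inside the attribute value
    // The trailing (.*?)</\2> is unchanged and is only reached once the opening tag has matched.
    private static readonly Regex BoldPattern = new Regex(
        "(<(span|font)\\s[^>]*?style=\"[^\"]*?font-weight:\\s*bold[^>]*>)(.*?)</\\2>",
        RegexOptions.Compiled);

    // Wraps the element content in <b>...</b>, keeping the original tag
    // exactly as the original replacement string does.
    public static string WrapBold(string html)
    {
        return BoldPattern.Replace(html, "$1<b>$3</b></$2>");
    }
}
```

<p>With this sketch, <code>InlineStyleSketch.WrapBold("&lt;font style=\"font-weight:bold\"&gt;Bold Text&lt;/font&gt;")</code> returns <code>&lt;font style="font-weight:bold"&gt;&lt;b&gt;Bold Text&lt;/b&gt;&lt;/font&gt;</code>, the same intermediate result the original pattern produces. Nested tags of the same name are still not handled, which is a limitation shared with the original expression.</p>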