
<p>Since some people have already given you warnings, I'll skip ahead to the regex solution.</p>
<p>First off, I'll lay out a couple of assumptions that aren't set in stone but allow the problem to be approached as you presented it without extra work on my part:</p>
<ol>
<li>You can use LINQ (otherwise this will need to be updated)</li>
<li>Font/Span tags will be in lowercase (<code>font</code> and <code>span</code>, not <code>FONT</code> or <code>SpAn</code>)</li>
<li>Each style attribute value will be properly formatted, ending with a semicolon <code>;</code> as in your samples</li>
</ol>
<p>Case-insensitivity can be added fairly simply via <code>RegexOptions.IgnoreCase</code>, although the dictionary keys will then need to be stored with <code>ToLower</code> so that later lookups stay consistent. The third assumption ensures that splitting the style text doesn't go haywire.</p>
<p>Below is a sample program that demonstrates the replacements.</p>
<pre><code>' Requires: Imports System.Linq and Imports System.Text.RegularExpressions
Sub Main
    Dim inputs As String() = { _
        "&lt;font size=""10""&gt;some text&lt;/font&gt;", _
        "&lt;font color=""#000000""&gt;some text&lt;/font&gt;", _
        "&lt;font size=""10"" color=""#000000""&gt;some text&lt;/font&gt;", _
        "&lt;font color=""#000000"" size=""10""&gt;some text&lt;/font&gt;", _
        "&lt;font size=""10""&gt;some text&lt;/font&gt; other text &lt;font color=""#000000""&gt;some text&lt;/font&gt;", _
        "&lt;span style=""font-size:10px;""&gt;some text&lt;/span&gt;", _
        "&lt;span style=""color:#000000;""&gt;some text&lt;/span&gt;", _
        "&lt;span style=""font-size:10px; color:#000000;""&gt;some text&lt;/span&gt;", _
        "&lt;span style=""color:#000000; font-size:10px;""&gt;some text&lt;/span&gt;", _
        "&lt;span style=""color:#000000; font-size:10px;""&gt;some text&lt;/span&gt; other &lt;font color=""#000000"" size=""10""&gt;some text&lt;/font&gt;" _
    }

    ' Matches a whole font/span element: tag name, attribute text, and inner content.
    Dim pattern As String = "&lt;(?&lt;Tag&gt;font|span)\b(?&lt;Attributes&gt;[^&gt;]+)&gt;(?&lt;Content&gt;.+?)&lt;/\k&lt;Tag&gt;&gt;"
    Dim rx As New Regex(pattern)

    For Each input As String In inputs
        Dim result As String = rx.Replace(input, AddressOf TransformTags)
        Console.WriteLine("Before: " &amp; input)
        Console.WriteLine("After: " &amp; result)
        Console.WriteLine()
    Next
End Sub

Public Function TransformTags(ByVal m As Match) As String
    ' Collect the tag's attributes as key/value pairs.
    Dim rx As New Regex("(?&lt;Key&gt;\b[a-zA-Z]+)=""(?&lt;Value&gt;.+?)""")
    Dim attributes = rx.Matches(m.Groups("Attributes").Value).Cast(Of Match)() _
        .ToDictionary(Function(attribute) attribute.Groups("Key").Value, _
                      Function(attribute) attribute.Groups("Value").Value)

    If m.Groups("Tag").Value = "font" Then
        ' font -&gt; span: size becomes font-size with a px suffix.
        Dim newAttributes = String.Join("; ", attributes.Select(Function(item) _
                If(item.Key = "size", "font-size", item.Key) _
                &amp; ":" _
                &amp; If(item.Key = "size", item.Value &amp; "px", item.Value)) _
            .ToArray()) _
            &amp; ";"
        Return "&lt;span style=""" &amp; newAttributes &amp; """&gt;" &amp; m.Groups("Content").Value &amp; "&lt;/span&gt;"
    Else
        ' span -&gt; font: split the style value on ";" and turn each entry into an attribute.
        Dim newAttributes = String.Join(" ", attributes("style") _
            .Split(New Char() {";"c}, StringSplitOptions.RemoveEmptyEntries) _
            .Select(Function(s) _
                s.Trim().Replace("px", "").Replace("font-", "").Replace(":", "=""") _
                &amp; """") _
            .ToArray())
        Return "&lt;font " &amp; newAttributes &amp; "&gt;" &amp; m.Groups("Content").Value &amp; "&lt;/font&gt;"
    End If
End Function
</code></pre>
<p>If you have any questions, let me know. Some enhancements can be made if a large amount of text is expected to be processed. For example, the regex object in the <code>TransformTags</code> method can be moved to the class level so it isn't recreated on every transformation.</p>
<p><strong>EDIT:</strong> Here's the explanation of the first pattern: <code>&lt;(?&lt;Tag&gt;font|span)\b(?&lt;Attributes&gt;[^&gt;]+)&gt;(?&lt;Content&gt;.+?)&lt;/\k&lt;Tag&gt;&gt;</code></p>
<ul>
<li><code>&lt;(?&lt;Tag&gt;font|span)\b</code> - matches the opening <code>&lt;</code> followed by either the font or span tag name, captured in a named group, <code>Tag</code>. The <code>\b</code> matches a word boundary to ensure that nothing beyond the specified tag names is matched.</li>
<li><code>(?&lt;Attributes&gt;[^&gt;]+)&gt;</code> - named group, <code>Attributes</code>, matches everything else in the tag as long as it is not a <code>&gt;</code> symbol, then matches the closing <code>&gt;</code></li>
<li><code>(?&lt;Content&gt;.+?)</code> - named group, <code>Content</code>, matches anything between the opening and closing tags</li>
<li><code>&lt;/\k&lt;Tag&gt;&gt;</code> - matches the closing tag by back-referencing the <code>Tag</code> group</li>
</ul>
<p>The second pattern is used to match key-value pairs for the attributes: <code>(?&lt;Key&gt;\b[a-zA-Z]+)=""(?&lt;Value&gt;.+?)""</code> (the doubled quotation marks are how VB escapes a single literal <code>"</code> inside a string)</p>
<ul>
<li><code>(?&lt;Key&gt;\b[a-zA-Z]+)</code> - named group, <code>Key</code>, matches any word (letters only) starting at a word boundary</li>
<li><code>=""</code> - matches the equals sign and the opening quotation mark</li>
<li><code>(?&lt;Value&gt;.+?)</code> - named group, <code>Value</code>, matches anything up to the closing quotation mark. It is made non-greedy by the <code>?</code> symbol after the <code>+</code> symbol. It could have been <code>[^""]+</code>, similar to how the <code>Attributes</code> group was handled in the first pattern.</li>
<li><code>""</code> - matches the closing quotation mark</li>
</ul>
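<p>As a rough sketch of the two enhancements mentioned above (case-insensitive matching and building the regex objects only once), something along these lines might work. The module name <code>TagPatterns</code> and the helper <code>ParseAttributes</code> are hypothetical names introduced for illustration, and the <code>Value</code> group uses the <code>[^""]+</code> alternative rather than <code>.+?</code>:</p>
<pre><code>Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module TagPatterns ' hypothetical name, not part of the sample above
    ' Built once and reused instead of being recreated on every transformation.
    Private ReadOnly TagRx As New Regex( _
        "&lt;(?&lt;Tag&gt;font|span)\b(?&lt;Attributes&gt;[^&gt;]+)&gt;(?&lt;Content&gt;.+?)&lt;/\k&lt;Tag&gt;&gt;", _
        RegexOptions.IgnoreCase)
    Private ReadOnly AttributeRx As New Regex( _
        "(?&lt;Key&gt;\b[a-zA-Z]+)=""(?&lt;Value&gt;[^""]+)""")

    ' Hypothetical helper: keys are lower-cased so lookups such as
    ' attributes("size") still work when the markup says SIZE or Size.
    Public Function ParseAttributes(ByVal raw As String) As Dictionary(Of String, String)
        Return AttributeRx.Matches(raw).Cast(Of Match)() _
            .ToDictionary(Function(a) a.Groups("Key").Value.ToLowerInvariant(), _
                          Function(a) a.Groups("Value").Value)
    End Function

    Sub Main()
        Dim m As Match = TagRx.Match("&lt;FONT SIZE=""10""&gt;some text&lt;/FONT&gt;")
        ' Normalise the tag name before comparing it against "font" or "span".
        Console.WriteLine(m.Groups("Tag").Value.ToLowerInvariant())   ' prints: font
        Dim attributes = ParseAttributes(m.Groups("Attributes").Value)
        Console.WriteLine(attributes("size"))                         ' prints: 10
    End Sub
End Module
</code></pre>
<p>Note that <code>ToDictionary</code> throws if the same key appears twice after lower-casing (for example both <code>size</code> and <code>SIZE</code> on one tag), so real input may need a more forgiving way of collecting the attributes.</p>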