<p>Assuming there can be one or more spaces after <code>&lt;a</code>, and zero or more spaces around the <code>=</code> signs, the following should work:</p>
<pre><code>$ cat in.txt
&lt;a href="http://www.wowhead.com/?search=Superior Mana Oil"&gt;
&lt;a href="http://www.wowhead.com/?search=Tabard of Brute Force"&gt;
&lt;a href="http://www.wowhead.com/?search=Tabard of the Wyrmrest Accord"&gt;
&lt;a href="http://www.wowhead.com/?search=Tattered Hexcloth Sack"&gt;
#
# The command to do the substitution
#
$ sed -e 's#&lt;a[ \t][ \t]*href[ \t]*=[ \t]*".*search[ \t]*=[ \t]*\([^"]*\)"&gt;#&amp;\1&lt;/a&gt;#' in.txt
&lt;a href="http://www.wowhead.com/?search=Superior Mana Oil"&gt;Superior Mana Oil&lt;/a&gt;
&lt;a href="http://www.wowhead.com/?search=Tabard of Brute Force"&gt;Tabard of Brute Force&lt;/a&gt;
&lt;a href="http://www.wowhead.com/?search=Tabard of the Wyrmrest Accord"&gt;Tabard of the Wyrmrest Accord&lt;/a&gt;
&lt;a href="http://www.wowhead.com/?search=Tattered Hexcloth Sack"&gt;Tattered Hexcloth Sack&lt;/a&gt;
</code></pre>
<p>If you're sure you don't have the extra spaces, the pattern simplifies to:</p>
<pre><code>s#&lt;a href=".*search=\([^"]*\)"&gt;#&amp;\1&lt;/a&gt;#
</code></pre>
<p>In <code>sed</code>, <code>s</code> followed by a delimiter character (<code>#</code> in this case) starts a substitution. The pattern to match runs up to the next occurrence of that same character. So, in our second example, the pattern to match is: <code>&lt;a href=".*search=\([^"]*\)"&gt;</code>. I used <code>\([^"]*\)</code> to mean any sequence of non-<code>"</code> characters, captured so that it can be referred to as backreference <code>\1</code> (the <code>\(\)</code> pair marks the captured group). Finally, the next token, up to the following <code>#</code>, is the replacement. <code>&amp;</code> in <code>sed</code> stands for "whatever matched", which in this case is the whole line, and <code>\1</code> holds just the link text.</p>
<p>Here's the pattern again:</p>
<pre><code>'s#&lt;a[ \t][ \t]*href[ \t]*=[ \t]*".*search[ \t]*=[ \t]*\([^"]*\)"&gt;#&amp;\1&lt;/a&gt;#'
</code></pre>
<p>and its explanation:</p>
<pre><code>'                       quote so as to avoid shell interpreting the characters
s                       substitute
#                       delimiter
&lt;a[ \t][ \t]*           &lt;a followed by one or more whitespace characters
href[ \t]*=[ \t]*       href followed by optional space, = followed by optional space
".*search[ \t]*=[ \t]*  " followed by as many characters as needed, followed by search, optional space, =, followed by optional space
\([^"]*\)               a sequence of non-" characters, saved in \1
"&gt;                      followed by "&gt;
#                       delimiter, replacement pattern starts
&amp;\1                     the matched pattern, followed by backreference \1
&lt;/a&gt;                    the closing &lt;/a&gt; tag
#                       end delimiter
'                       end quote
</code></pre>
<p>If you're <em>really</em> sure that there will always be <code>search=</code> followed by the text you want, you can do:</p>
<pre><code>$ sed -e 's#.*search=\(.*\)"&gt;#&amp;\1&lt;/a&gt;#'
</code></pre>
<p>Hope that helps.</p>
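<p>One portability caveat on the whitespace-tolerant pattern: GNU <code>sed</code> reads <code>\t</code> inside a bracket expression as a tab, but POSIX treats a backslash there as a literal character, so other implementations (BSD <code>sed</code>, for example) may match a space, a backslash, or the letter <code>t</code> instead. The following is a minimal sketch of the same substitution using the POSIX character class <code>[[:blank:]]</code> (space or tab). The function name <code>add_link_text</code> is hypothetical and only there to make the command reusable.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): reads lines on stdin,
# appends the search term as link text, and closes the tag.
# [[:blank:]] stands in for [ \t] so that tabs are matched on any POSIX sed.
add_link_text() {
    sed -e 's#<a[[:blank:]][[:blank:]]*href[[:blank:]]*=[[:blank:]]*".*search[[:blank:]]*=[[:blank:]]*\([^"]*\)">#&\1</a>#'
}

# Usage: one line in the plain form, one with extra spaces around "href" and "=".
printf '%s\n' \
    '<a href="http://www.wowhead.com/?search=Superior Mana Oil">' \
    '<a  href = "http://www.wowhead.com/?search=Tattered Hexcloth Sack">' |
    add_link_text
```

<p>Both input lines should come back with the search term appended as link text and a closing tag. Lines that do not match the pattern pass through unchanged, since <code>s</code> only rewrites what it matches.</p>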
 
