<blockquote>
<p>Should I use XPath? Is there something more efficient?</p>
</blockquote>
<p>I would use Ruby with the Nokogiri library; it gives you much more flexibility than XPath/XSLT alone:</p>
<pre><code>#!/usr/bin/ruby
require 'nokogiri'

src = &lt;&lt;EOS
&lt;div&gt;
  &lt;span style="display:block;float:left;color:#888"&gt;
    2:56 PM&amp;nbsp;
  &lt;/span&gt;
  &lt;span style="display:block;padding-left:6em"&gt;
    &lt;span&gt;
      &lt;span style="font-weight:bold"&gt;me&lt;/span&gt;: i'm trying to think of a good way to parse gmail chat logs
    &lt;/span&gt;
  &lt;/span&gt;
  &lt;span style="display:block;float:left;color:#888"&gt;
    &amp;nbsp;&amp;nbsp;
  &lt;/span&gt;
  &lt;span style="display:block;padding-left:6em"&gt;
    &lt;span&gt;
      and reformat that into something like an xml format
    &lt;/span&gt;
  &lt;/span&gt;
&lt;/div&gt;
EOS

chatlog = []
last_timestamp = nil

doc = Nokogiri::HTML(src)

# Only the top-level spans matter: grey ones hold a timestamp,
# indented ones hold the message that belongs to it.
doc.xpath('//div/span').each do |span|
  style = span['style'].to_s
  if style.include?('color:')
    last_timestamp = span.content.strip
  elsif style.include?('padding-left:')
    chatlog &lt;&lt; {:timestamp =&gt; last_timestamp, :message =&gt; span.content.strip}
  end
end

# Build the XML document from the collected lines.
builder = Nokogiri::XML::Builder.new do |xml|
  xml.chatlog {
    chatlog.each do |line|
      xml.line {
        xml.time line[:timestamp]
        xml.message line[:message]
      }
    end
  }
end

puts builder.to_xml
</code></pre>
<p>This prints:</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;chatlog&gt;
  &lt;line&gt;
    &lt;time&gt;2:56 PM &lt;/time&gt;
    &lt;message&gt;me: i'm trying to think of a good way to parse gmail chat logs&lt;/message&gt;
  &lt;/line&gt;
  &lt;line&gt;
    &lt;time&gt;  &lt;/time&gt;
    &lt;message&gt;and reformat that into something like an xml format&lt;/message&gt;
  &lt;/line&gt;
&lt;/chatlog&gt;
</code></pre>
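<p>The output above leaves the second line's <code>&lt;time&gt;</code> effectively blank, because the continuation row's timestamp cell holds only non-breaking spaces, which Ruby's <code>String#strip</code> does not remove. One possible follow-up, sketched here in plain Ruby, is to carry the last non-blank timestamp forward. The helper names <code>blank_timestamp?</code> and <code>carry_timestamps</code> are hypothetical and not part of the original answer; the sample rows only mirror the shape of the <code>chatlog</code> array.</p>

```ruby
# Hypothetical post-processing step (an assumption, not part of the
# original answer): fill blank timestamps with the most recent real one.
NBSP = "\u00A0" # non-breaking space, which String#strip leaves in place

# True when the text is nil or contains only (non-breaking) whitespace.
def blank_timestamp?(text)
  text.nil? || text.delete(NBSP).strip.empty?
end

# Returns new rows in which every blank timestamp is replaced by the
# last non-blank timestamp seen so far (nil if none has been seen yet).
def carry_timestamps(rows)
  last_seen = nil
  rows.map do |row|
    unless blank_timestamp?(row[:timestamp])
      last_seen = row[:timestamp].delete(NBSP).strip
    end
    { timestamp: last_seen, message: row[:message] }
  end
end

# Sample rows shaped like the chatlog array built by the parser.
rows = [
  { timestamp: "2:56 PM\u00A0",
    message: "me: i'm trying to think of a good way to parse gmail chat logs" },
  { timestamp: "\u00A0\u00A0",
    message: "and reformat that into something like an xml format" }
]

filled = carry_timestamps(rows)
filled.each { |row| puts "#{row[:timestamp]} | #{row[:message]}" }
```

<p>Applied to <code>chatlog</code> before the builder runs, a step like this would give both lines the same cleaned timestamp. Whether a continuation line should repeat the time or keep it empty is a design choice that depends on how the XML is consumed.</p>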
 
