How to create robust access logs using Apache Tomcat Valve Component?
<p>We are working with Apache Tomcat 7 and trying to set up the Valve Component to store our access logs, ready for processing in <a href="https://github.com/snowplow/snowplow" rel="nofollow noreferrer">SnowPlow</a>.</p>
<p>The problem we have is how to make these logs robust. To give an example, we can separate fields with tabs and extract the user agent string like so:</p>
<pre><code>pattern="%{yyyy-MM-dd}t&amp;#9;%{hh:mm:ss}t&amp;#9;%{User-Agent}i&amp;#9;" </code></pre>
<p>The problem is that the Valve Component does not (as far as I can see) escape <code>%{User-Agent}i</code>, so a stray tab in a user agent will corrupt the data (the row will look like it contains four fields, not three).</p>
<p>Unless there's a way of escaping the user agent which I've missed, I can see a couple of options:</p>
<ol> <li>Use a really obscure field delimiter (or combination of field delimiters) which is very unlikely to crop up in a user agent string. We tried Ctrl-A (HTML <code>&amp;#1;</code>?) but that didn't seem to work.</li> <li>Write a custom <code>AccessLogValve</code> which either supports escaping or sanitizes tabs, perhaps similar to this post: <a href="https://stackoverflow.com/questions/5812238/sanitizing-tomcat-access-log-entries">Sanitizing Tomcat access log entries</a>.</li> </ol>
<p>I'm a bit puzzled that I can't find anything else about this online. Does nobody parse their Tomcat access logs?</p>
<p>What do you recommend? We're a little stuck...</p>
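<p>To make the second option concrete, here is a minimal sketch of the escaping step on its own. It deliberately has no Tomcat dependency: the class name <code>LogFieldEscaper</code>, the method <code>escape</code>, and the backslash-based escaping scheme are all assumptions made for illustration, and wiring this into a custom <code>AccessLogValve</code> is not shown.</p>

```java
/**
 * Hypothetical helper (not part of Tomcat): escapes characters that would
 * break a tab-separated log line. The backslash escaping scheme is an
 * assumption; any scheme the downstream parser can reverse would do.
 */
public final class LogFieldEscaper {

    private LogFieldEscaper() {
        // static utility, no instances
    }

    public static String escape(String value) {
        if (value == null) {
            // Assumed placeholder for a missing header value.
            return "-";
        }
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // escape the escape character first
                case '\t': out.append("\\t");  break; // field delimiter
                case '\n': out.append("\\n");  break; // record delimiter
                case '\r': out.append("\\r");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A user agent containing a stray tab no longer adds a fourth field.
        String userAgent = "Mozilla/5.0\t(stray tab)";
        System.out.println("2013-01-01\t12:00:00\t" + escape(userAgent) + "\t");
    }
}
```

<p>Escaping the backslash itself keeps the transformation reversible, so a downstream parser can tell a literal <code>\t</code> in the original header from an escaped tab.</p>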
 
