Python - lexical analysis and tokenization
<p>I'm looking to speed along my discovery process here quite a bit, as this is my first venture into the world of lexical analysis. Maybe this is even the wrong path. First, I'll describe my problem:</p>
<p>I've got very large properties files (on the order of 1,000 properties) which, when distilled, come down to about 15 important properties; the rest can be generated or rarely change.</p>
<p>So, for example:</p>
<pre><code>general {
  name = myname
  ip = 127.0.0.1
}

component1 {
  key = value
  foo = bar
}
</code></pre>
<p>This is the kind of format I want to create, in order to turn something like:</p>
<pre><code>property.${general.name}blah.home.directory = /blah
property.${general.name}.ip = ${general.ip}
property.${component1}.ip = ${general.ip}
property.${component1}.foo = ${component1.foo}
</code></pre>
<p>into</p>
<pre><code>property.mynameblah.home.directory = /blah
property.myname.ip = 127.0.0.1
property.component1.ip = 127.0.0.1
property.component1.foo = bar
</code></pre>
<p>Lexical analysis and tokenization sound like my best route, but this is a very simple form of it. It's a simple grammar and a simple substitution, and I'd like to make sure that I'm not bringing a sledgehammer to knock in a nail.</p>
<p>I could create my own lexer and tokenizer, or ANTLR is a possibility, but I don't like re-inventing the wheel and ANTLR sounds like overkill.</p>
<p>I'm not familiar with compiler techniques, so pointers in the right direction &amp; code would be most appreciated.</p>
<p><strong>Note</strong>: I can change the input format.</p>
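<p>To make the intended transformation concrete, here is a minimal sketch of it using only regular expressions rather than a full lexer. It is an illustration under assumptions, not a settled design: the names <code>parse_blocks</code>, <code>expand</code>, <code>CONFIG</code> and <code>TEMPLATE</code> are hypothetical, each property is assumed to sit on its own line, blocks are assumed not to nest, and a bare <code>${section}</code> is assumed to expand to the section's own name (inferred from the expected output above).</p>

```python
import re

# Assumed grammar: "name { key = value ... }" blocks, no nesting.
BLOCK_RE = re.compile(r"(\w+)\s*\{([^}]*)\}")
# A reference looks like ${section.key} or ${section}.
REF_RE = re.compile(r"\$\{([^}]+)\}")

CONFIG = """\
general {
  name = myname
  ip = 127.0.0.1
}

component1 {
  key = value
  foo = bar
}
"""

TEMPLATE = """\
property.${general.name}blah.home.directory = /blah
property.${general.name}.ip = ${general.ip}
property.${component1}.ip = ${general.ip}
property.${component1}.foo = ${component1.foo}
"""


def parse_blocks(text):
    """Parse 'name { key = value ... }' blocks into {section: {key: value}}."""
    sections = {}
    for name, body in BLOCK_RE.findall(text):
        props = {}
        # Assumption: one 'key = value' pair per line inside a block.
        for line in body.splitlines():
            if "=" in line:
                key, _, value = line.partition("=")
                props[key.strip()] = value.strip()
        sections[name] = props
    return sections


def expand(template, sections):
    """Substitute ${section.key} with its value and ${section} with the section name."""
    def lookup(match):
        ref = match.group(1).strip()
        section, dot, key = ref.partition(".")
        if not dot:
            # Assumption: a bare ${section} stands for the section's name.
            if section not in sections:
                raise KeyError(ref)
            return section
        # Unknown sections or keys raise KeyError instead of passing silently.
        return sections[section][key]

    return REF_RE.sub(lookup, template)


if __name__ == "__main__":
    print(expand(TEMPLATE, parse_blocks(CONFIG)), end="")
```

<p>Run as a script, this prints the four expanded lines from the expected output above. Failing loudly on an unknown reference is a deliberate choice here, since a silently unexpanded <code>${...}</code> in a 1,000-line properties file would be hard to spot.</p>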
 
