Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Long time ago, I made up my own parsing algorithm, that I couldn't find in any books on parsing (like the Dragon Book). Looking at the pointers to the Shunting Yard algorithm, I do see the resemblance.</p> <p>About 2 years ago, I made a post about it, complete with Perl source code, on <a href="http://www.perlmonks.org/?node_id=554516" rel="noreferrer">http://www.perlmonks.org/?node_id=554516</a>. It's easy to port to other languages: the first implementation I did was in Z80 assembler.</p> <p>It's ideal for direct calculation with numbers, but you can use it to produce a parse tree if you must.</p> <p><strong>Update</strong> Because more people can read (or run) Javascript, I've reimplemented my parser in Javascript, after the code has been reorganized. The whole parser is under 5k of Javascript code (about 100 lines for the parser, 15 lines for a wrapper function) including error reporting, and comments.</p> <p>You can find a live demo at <a href="http://users.telenet.be/bartl/expressionParser/expressionParser.html" rel="noreferrer">http://users.telenet.be/bartl/expressionParser/expressionParser.html</a>.</p> <pre class="lang-js prettyprint-override"><code>// operator table var ops = { '+' : {op: '+', precedence: 10, assoc: 'L', exec: function(l,r) { return l+r; } }, '-' : {op: '-', precedence: 10, assoc: 'L', exec: function(l,r) { return l-r; } }, '*' : {op: '*', precedence: 20, assoc: 'L', exec: function(l,r) { return l*r; } }, '/' : {op: '/', precedence: 20, assoc: 'L', exec: function(l,r) { return l/r; } }, '**' : {op: '**', precedence: 30, assoc: 'R', exec: function(l,r) { return Math.pow(l,r); } } }; // constants or variables var vars = { e: Math.exp(1), pi: Math.atan2(1,1)*4 }; // input for parsing // var r = { string: '123.45+33*8', offset: 0 }; // r is passed by reference: any change in r.offset is returned to the caller // functions return the parsed/calculated value function parseVal(r) { var startOffset = r.offset; var value; var m; // floating point number // example of parsing ("lexing") without aid of regular expressions value = 0; while("0123456789".indexOf(r.string.substr(r.offset, 1)) &gt;= 0 &amp;&amp; r.offset &lt; r.string.length) r.offset++; if(r.string.substr(r.offset, 1) == ".") { r.offset++; while("0123456789".indexOf(r.string.substr(r.offset, 1)) &gt;= 0 &amp;&amp; r.offset &lt; r.string.length) r.offset++; } if(r.offset &gt; startOffset) { // did that work? // OK, so I'm lazy... return parseFloat(r.string.substr(startOffset, r.offset-startOffset)); } else if(r.string.substr(r.offset, 1) == "+") { // unary plus r.offset++; return parseVal(r); } else if(r.string.substr(r.offset, 1) == "-") { // unary minus r.offset++; return negate(parseVal(r)); } else if(r.string.substr(r.offset, 1) == "(") { // expression in parens r.offset++; // eat "(" value = parseExpr(r); if(r.string.substr(r.offset, 1) == ")") { r.offset++; return value; } r.error = "Parsing error: ')' expected"; throw 'parseError'; } else if(m = /^[a-z_][a-z0-9_]*/i.exec(r.string.substr(r.offset))) { // variable/constant name // sorry for the regular expression, but I'm too lazy to manually build a varname lexer var name = m[0]; // matched string r.offset += name.length; if(name in vars) return vars[name]; // I know that thing! r.error = "Semantic error: unknown variable '" + name + "'"; throw 'unknownVar'; } else { if(r.string.length == r.offset) { r.error = 'Parsing error at end of string: value expected'; throw 'valueMissing'; } else { r.error = "Parsing error: unrecognized value"; throw 'valueNotParsed'; } } } function negate (value) { return -value; } function parseOp(r) { if(r.string.substr(r.offset,2) == '**') { r.offset += 2; return ops['**']; } if("+-*/".indexOf(r.string.substr(r.offset,1)) &gt;= 0) return ops[r.string.substr(r.offset++, 1)]; return null; } function parseExpr(r) { var stack = [{precedence: 0, assoc: 'L'}]; var op; var value = parseVal(r); // first value on the left for(;;){ op = parseOp(r) || {precedence: 0, assoc: 'L'}; while(op.precedence &lt; stack[stack.length-1].precedence || (op.precedence == stack[stack.length-1].precedence &amp;&amp; op.assoc == 'L')) { // precedence op is too low, calculate with what we've got on the left, first var tos = stack.pop(); if(!tos.exec) return value; // end reached // do the calculation ("reduce"), producing a new value value = tos.exec(tos.value, value); } // store on stack and continue parsing ("shift") stack.push({op: op.op, precedence: op.precedence, assoc: op.assoc, exec: op.exec, value: value}); value = parseVal(r); // value on the right } } function parse (string) { // wrapper var r = {string: string, offset: 0}; try { var value = parseExpr(r); if(r.offset &lt; r.string.length){ r.error = 'Syntax error: junk found at offset ' + r.offset; throw 'trailingJunk'; } return value; } catch(e) { alert(r.error + ' (' + e + '):\n' + r.string.substr(0, r.offset) + '&lt;*&gt;' + r.string.substr(r.offset)); return; } } </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload