StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
18211221
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2013-08-13T14:01:26.427
FavoriteCount
0
LastActivityDate
2014-04-13T17:31:20.257
LastEditDate
2014-04-13T17:31:20.257
LastEditorUserId
926251
OwnerUserId
926251
ParentId
18187249
PostTypeId
2
Score
24
ViewCount
0
LastEditorDisplayName
text
Body
Well there are two parts to your question. The first part is parsing the expression <pre><code>(+ 1 2 (+ 3 4)) </code></pre> The second part is transforming the output to the result that you want. To get a good understanding of these principles, I highly recommend Udacity's <a href="https://www.udacity.com/course/cs262">Programming Languages course.</a> Carin Meier's <a href="http://gigasquidsoftware.com/blog/2013/05/01/growing-a-language-with-clojure-and-instaparse/">blog post</a> is also quite helpful. The best way to understand how the parser will work, is to break it down into smaller parts. So in the first we'll just examine some parsing rules, and in the second part we'll build our sexps. <ol> <li>A simple example You will first need to write a grammar that tells instaparse how to parse the given expression. We'll start by just parsing the number <code>1</code>: <pre><code>(def parser (insta/parser "sexp = number number = #'[0-9]+' ")) </code></pre> sexp describes the highest level grammar for the sexpression. Our grammar states that the sexp can only have a number. The next line states that the number can be any digit 0-9, and the <code>+</code> is similar to the regex <code>+</code> which means that it must have one number repeated any number of times. If we run our parser we get the following parse tree: <pre><code>(parser "1") => [:sexp [:number "1"]] </code></pre> Ingoring Parenthesis We can ignore certain values by adding angled brackets <code><</code> to our grammar. So if we want to parse <code>"(1)"</code> as simply <code>1</code> we can right our grammar as: <pre><code>(def parser (insta/parser "sexp = lparen number rparen <lparen> = <'('> <rparen> = <')'> number = #'[0-9]+' ")) </code></pre> and if we run the parser again, it will ignore the left and right parenthesis: <pre><code>(parser "(1)") => [:sexp [:number "1"]] </code></pre> This will become helpful when we write the grammar for sexp below. Adding Spaces Now happens if we add spaces and run <code>(parser "( 1 )")</code>? Well we get an error: <pre><code>(parser "( 1 )") => Parse error at line 1, column 2: ( 1 ) ^ Expected: #"[0-9]+" </code></pre> That's because we haven't defined the concept of space in our grammar! So we can add spaces as such: <pre><code>(def parser (insta/parser "sexp = lparen space number space rparen <lparen> = <'('> <rparen> = <')'> number = #'[0-9]+' <space> = <#'[ ]*'> ")) </code></pre> Again the <code>*</code> is similar to the regex <code>*</code> and it means zero or more than one occurrence of a space. That means the following examples will all return the same result: <pre><code>(parser "(1)") => [:sexp [:number "1"]] (parser "( 1 )") => [:sexp [:number "1"]] (parser "( 1 )") => [:sexp [:number "1"]] </code></pre></li> <li>Building the Sexp We're slowly going to build our grammar from the ground up. It might be useful to look at the final product <a href="https://gist.github.com/pooster/6221672">here</a>, just to give an overview of where we're headed. So, an sexp contains more than just numbers as defined by our simple grammar. One high level view we can have of sexp is to view them as an operation between two parenthesis. So basically as a <code>( operation )</code>. We can write this directly into our grammar. <pre><code>(def parser (insta/parser "sexp = lparen operation rparen <lparen> = <'('> <rparen> = <')'> operation = ??? ")) </code></pre> As I stated above, the angled brackets <code><</code> tell instaparse to ignore these values when it is making the parse tree. Now what is an operation? Well an operation consists of an operator, like <code>+</code>, and some arguments, like the numbers <code>1</code> and <code>2</code>. So we can say write our grammar as: <pre><code>(def parser (insta/parser "sexp = lparen operation rparen <lparen> = <'('> <rparen> = <')'> operation = operator + args operator = '+' args = number number = #'[0-9]+' ")) </code></pre> We stated only one possible operator, <code>+</code>, just to keep things simple. We have also included the number grammar rule from the simple example above. Our grammar, however, is very limited. The only valid sexp it can parse is <code>(+1)</code>. That's because we haven't included the concept of spaces, and have stated that args can have only one number. So in this step we will do two things. We will add spaces, and we will state that args can have more than one number. <pre><code>(def parser (insta/parser "sexp = lparen operation rparen <lparen> = <'('> <rparen> = <')'> operation = operator + args operator = '+' args = snumber+ <snumber> = space number <space> = <#'[ ]*'> number = #'[0-9]+' ")) </code></pre> We added <code>space</code> by using the space grammar rule we defined in the simple example. We created a new <code>snumber</code> which is defined as <code>space</code> and a <code>number</code>, and added the <code>+</code> to snumber to state that it must appear once but it can repeat any number of times. So we can run our parser as so: <pre><code>(parser "(+ 1 2)") => [:sexp [:operation [:operator "+"] [:args [:number "1"] [:number "2"]]]] </code></pre> We can make our grammar more robust by having <code>args</code> reference back to <code>sexp</code>. That way we can have sexp in our sexp! We can do this by creating <code>ssexp</code> which adds a <code>space</code> to <code>sexp</code> and then add <code>ssexp</code> to <code>args</code>. <pre><code>(def parser (insta/parser "sexp = lparen operation rparen <lparen> = <'('> <rparen> = <')'> operation = operator + args operator = '+' args = snumber+ ssexp* <ssexp> = space sexp <snumber> = space number <space> = <#'[ ]*'> number = #'[0-9]+' ")) </code></pre> Now we can run <pre><code>(parser "(+ 1 2 (+ 1 2))") => [:sexp [:operation [:operator "+"] [:args [:number "1"] [:number "2"] [:sexp [:operation [:operator "+"] [:args [:number "1"] [:number "2"]]]]]]] </code></pre></li> <li>Transformations This step can be done using any number of tools that work on trees, such enlive, zippers, match, and tree-seq. Instaparse, however, also includes its own useful function called <code>insta\transform</code>. We can build our transformations by replacing the keys in our parse tree by the valid clojure functions. For example, <code>:number</code> becomes <code>read-string</code> to turn our strings into valid numbers, <code>:args</code> becomes <code>vector</code> to build our arguments. So, we want to transform this: <pre><code> [:sexp [:operation [:operator "+"] [:args [:number "1"] [:number "2"]]]] </code></pre> Into this: <pre><code> (identity (apply + (vector (read-string "1") (read-string "2")))) => 3 </code></pre> We can do that by defining our transformation options: <pre><code>(defn choose-op [op] (case op "+" +)) (def transform-options {:number read-string :args vector :operator choose-op :operation apply :sexp identity }) </code></pre> The only tricky thing here was adding the function <code>choose-op</code>. What we want, is to pass the function <code>+</code> to <code>apply</code>, but if we replace <code>operator</code> with <code>+</code> it will use <code>+</code> as a regular function. So it will transform our tree to this: <pre><code> ... (apply (+ (vector ... </code></pre> But by using <code>choose-op</code> it will pass <code>+</code> as an argument to <code>apply</code> as such: <pre><code> ... (apply + (vector ... </code></pre></li> </ol> Conclusion We can now run our little interpreter by putting the parser and transformer together: <pre><code>(defn lisp [input] (->> (parser input) (insta/transform transform-options))) (lisp "(+ 1 2)") => 3 (lisp "(+ 1 2(+ 3 4))") => 10 </code></pre> You can find the final code used in this tutorial <a href="https://gist.github.com/pooster/6221672">here</a>. Hopefully, this short introduction is enough to get going on your own projects. You can new lines by declaring a grammar for <code>\n</code> and you can even choose to not ignore spaces in your parse tree by removing the angled brackets <code><</code>. That might be helpful given that you're trying to keep the indentation. Hope this helps, If not just write a comment! 
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow do we define a grammar for clojure code using instaparse?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USpooya72
UserOwnerUserId
1. USpooya72
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POHow do we define a grammar for clojure code using instaparse?
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
CommentsPostId
1. This table or related slice is empty.

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.