Pattern matching on abstract forms
<p><strong>Disclaimer</strong>: I kept this because some of it may be useful to others; however, it does not solve what I initially set out to do.</p>
<p>Right now, I'm trying to solve the following:</p>
<p>Given something like {a, B, {c, D}}, I want to scan through the Erlang forms given to parse_transform/2 and find each use of the send operator (!). Then I want to check the message being sent and determine whether it would fit the pattern {a, B, {c, D}}.</p>
<p>Therefore, consider finding the following form:</p>
<pre><code>{op,17,'!',
    {var,17,'Pid'},
    {tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}}
</code></pre>
<p>Since the message being sent is:</p>
<pre><code>{tuple,17,[{atom,17,a},{integer,17,5},{var,17,'SomeVar'}]}
</code></pre>
<p>which is an encoding of {a, 5, SomeVar}, this could match the original pattern of {a, B, {c, D}}, depending on what SomeVar is bound to at runtime.</p>
<p>I'm not exactly sure how I'm going to go about this, but do you know of any API functions which could help?</p>
<p>Turning the given {a, B, {c, D}} into a form is possible by first substituting the variables with something, e.g. strings (and taking note of this), since otherwise they would be unbound, and then using:</p>
<pre><code>&gt; erl_syntax:revert(erl_syntax:abstract({a, "B", {c, "D"}})).
{tuple,0,
       [{atom,0,a},
        {string,0,"B"},
        {tuple,0,[{atom,0,c},{string,0,"D"}]}]}
</code></pre>
<p>I was thinking that after getting them in the same format like this, I could analyze them together:</p>
<pre><code>&gt; erl_syntax:type({tuple,0,[{atom,0,a},{string,0,"B"},{tuple,0,[{atom,0,c},{string,0,"D"}]}]}).
tuple
%% check whether send argument is also a tuple.
%% then, since it's a tuple, use erl_syntax:tuple_elements/1 and keep comparing in this way,
%% matching anything when you come across a string which was a variable...
</code></pre>
<p>I think I'll end up missing something (for example, recognizing some things but not others, even though they should have matched).
Are there any API functions which I could use to ease this task? And as for a pattern-match test operator or something along those lines, that does not exist, right? (i.e. it has only been suggested here: <a href="http://erlang.org/pipermail/erlang-questions/2007-December/031449.html" rel="nofollow">http://erlang.org/pipermail/erlang-questions/2007-December/031449.html</a>).</p>
<p><strong>Edit:</strong> (Explaining things from the beginning this time)</p>
<p>Using erl_types as Daniel suggests below is probably doable if you play around with the erl_type() returned by t_from_term/1. Since t_from_term/1 takes a term with no free variables, you'd have to change something like <code>{a, B, {c, D}}</code> into <code>{a, '_', {c, '_'}}</code> (i.e. fill the variables), use t_from_term/1, and then go through the returned data structure and change the '_' atoms to variables using the module's t_var/1 or something.</p>
<p>Before explaining how I ended up going about it, let me state the problem a bit better.</p>
<p><strong>Problem</strong></p>
<p>I'm working on a pet project (ErlAOP extension) which I'll be hosting on SourceForge when ready. Basically, another project already exists (<a href="https://sourceforge.net/projects/erlaop/" rel="nofollow">ErlAOP</a>) through which one can inject code before/after/around/etc... function calls (see <a href="http://erlaop.sourceforge.net/" rel="nofollow">doc</a> if interested).</p>
<p>I wanted to extend this to support injection of code at the send/receive level (because of another project). I've already done this, but before hosting the project, I'd like to make some improvements.</p>
<p>Currently, my implementation simply finds each use of the send operator or receive expression and injects <em>a function</em> before/after/around (receive expressions have a little gotcha because of tail recursion).
Let's call this function <strong>dmfun</strong> (dynamic match function).</p>
<p>The user will specify that when a message of the form e.g. {a, B, {c, D}} is being sent, the function do_something/1 should be evaluated before the sending takes place. Therefore, the current implementation injects dmfun before each use of the send op in the source code. dmfun would then contain something like:</p>
<pre><code>case Arg of
    {a, B, {c, D}} -&gt; do_something(Arg);
    _ -&gt; continue
end
</code></pre>
<p>where Arg can simply be passed to dmfun/1 because you have access to the forms generated from the source code.</p>
<p>So the problem is that <em>any</em> send operator will have dmfun/1 injected before it (with the send op's message passed as a parameter). But when sending messages like 50, {a, b}, [6, 4, 3] etc., these messages will certainly not match {a, B, {c, D}}, so injecting dmfun/1 at sends with these messages is a waste.</p>
<p>I want to be able to pick out <em>plausible</em> send operations like e.g. Pid ! {a, 5, SomeVar}, or Pid ! {a, X, SomeVar}. In both of these cases, it makes sense to inject dmfun/1 because if at runtime SomeVar = {c, 50}, then the user-supplied do_something/1 should be evaluated (but if SomeVar = 50, then it should not, because we're interested in {a, B, {c, D}} and 50 does not match {c, D}).</p>
<p><strong>I wrote the following prematurely. It doesn't solve the problem I had. I ended up not including this feature. I left the explanation anyway, but if it were up to me, I'd delete this post entirely... I was still experimenting and I don't think what there is here will be of any use to anyone.</strong></p>
<p>Before the explanation, let:</p>
<p>msg_format = the user-supplied message format which will determine which messages being sent/received are interesting (e.g. {a, B, {c, D}}).</p>
<p>msg = the actual message being sent in the source code (e.g. Pid !
{a, X, Y}).</p>
<p>I gave the explanation below in a previous edit, but later found out that it wouldn't match some things it should. E.g. when msg_format = {a, B, {c, D}}, msg = {a, 5, SomeVar} wouldn't match when it should (by "match" I mean that dmfun/1 should be injected).</p>
<p>Let's call the "algorithm" outlined below Alg. The approach I took was to execute Alg(msg_format, msg) and Alg(msg, msg_format). The explanation below only goes through one of these. The other direction repeats the same steps with a different matching function (<code>matching_fun(msg_format)</code> instead of <code>matching_fun(msg)</code>). By injecting dmfun/1 only if at least one of Alg(msg_format, msg) or Alg(msg, msg_format) returns true, the result should be the injection of dmfun/1 wherever the desired message can actually be generated at runtime.</p>
<ol>
<li><p>Take the message form you find in the [Forms] given to parse_transform/2, e.g. let's say you find: <code>{op,24,'!',{var,24,'Pid'},{tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}}</code> So you would take <code>{tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}</code>, which is the message being sent (bind it to Msg).</p></li>
<li><p>Do fill_vars(Msg) where:</p>
<pre><code>-define(VARIABLE_FILLER, "_").

-spec fill_vars(erl_parse:abstract_form()) -&gt; erl_parse:abstract_form().
%% @doc This function takes an abstract_form() and replaces all
%% {var, LineNum, Variable} forms with {string, LineNum, ?VARIABLE_FILLER}.
fill_vars(Form) -&gt;
    erl_syntax:revert(
      erl_syntax_lib:map(
        fun(DeltaTree) -&gt;
                case erl_syntax:type(DeltaTree) of
                    variable -&gt; erl_syntax:string(?VARIABLE_FILLER);
                    _ -&gt; DeltaTree
                end
        end,
        Form)).
</code></pre></li>
<li><p>Do form_to_term/1 on 2's output, where:</p>
<pre><code>form_to_term(Form) -&gt;
    element(2, erl_eval:exprs([Form], [])).
</code></pre></li>
<li><p>Do term_to_str/1 on 3's output, where:</p>
<pre><code>-define(inject_str(FormatStr, TermList), lists:flatten(io_lib:format(FormatStr, TermList))).

term_to_str(Term) -&gt;
    ?inject_str("~p", [Term]).
</code></pre></li>
<li><p>Do <code>gsub(v(4), "\"_\"", "_")</code>, where v(4) is 4's output and gsub is: (taken from <a href="http://onerlang.blogspot.com/2009/06/string-substitution-missing-in-erlang.html" rel="nofollow">here</a>)</p>
<pre><code>gsub(Str,Old,New) -&gt;
    RegExp = "\\Q"++Old++"\\E",
    re:replace(Str,RegExp,New,[global, multiline, {return, list}]).
</code></pre></li>
<li><p>Bind a variable (e.g. M) to matching_fun(v(5)), where:</p>
<pre><code>matching_fun(StrPattern) -&gt;
    form_to_term(
      str_to_form(
        ?inject_str(
           "fun(MsgFormat) -&gt; case MsgFormat of ~s -&gt; true; _ -&gt; false end end.",
           [StrPattern]))).

str_to_form(MsgFStr) -&gt;
    {_, Tokens, _} = erl_scan:string(end_with_period(MsgFStr)),
    {_, Exprs} = erl_parse:parse_exprs(Tokens),
    hd(Exprs).

end_with_period(String) -&gt;
    case lists:last(String) of
        $. -&gt; String;
        _ -&gt; String ++ "."
    end.
</code></pre></li>
<li><p>Finally, take the user-supplied message format (which is given as a string), e.g. MsgFormat = "{a, B, {c, D}}", and do: MsgFormatTerm = form_to_term(fill_vars(str_to_form(MsgFormat))). Then you can call M(MsgFormatTerm).</p></li>
</ol>
<p>e.g. with user-supplied message format = {a, B, {c, D}}, and Pid ! {a, B, C} found in code:</p>
<pre><code>2&gt; weaver_ext:fill_vars({tuple,24,[{atom,24,a},{var,24,'B'},{var,24,'C'}]}).
{tuple,24,[{atom,24,a},{string,0,"_"},{string,0,"_"}]}
3&gt; weaver_ext:form_to_term(v(2)).
{a,"_","_"}
4&gt; weaver_ext:term_to_str(v(3)).
"{a,\"_\",\"_\"}"
5&gt; weaver_ext:gsub(v(4), "\"_\"", "_").
"{a,_,_}"
6&gt; M = weaver_ext:matching_fun(v(5)).
#Fun&lt;erl_eval.6.13229925&gt;
7&gt; MsgFormatTerm = weaver_ext:form_to_term(weaver_ext:fill_vars(weaver_ext:str_to_form("{a, B, {c, D}}"))).
{a,"_",{c,"_"}}
8&gt; M(MsgFormatTerm).
true
9&gt; M({a, 10, 20}).
true
10&gt; M({b, "_", 20}).
false
</code></pre>
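<p>For the static filtering described under <strong>Problem</strong>, one possible alternative is to compare the two abstract forms structurally rather than building a match fun from strings. What follows is only a minimal sketch under stated assumptions, not the approach described above and not part of ErlAOP: the module name <code>plausible_match</code> and the functions <code>may_match/2</code>, <code>kind/1</code> and <code>parse_expr/1</code> are hypothetical names introduced for illustration. It assumes that only tuples and simple literals are analyzed, and that everything else (variables, function calls, other expressions) is treated as "could be anything at runtime". It uses erl_syntax:type/1, erl_syntax:tuple_elements/1 and erl_syntax:concrete/1.</p>

```erlang
-module(plausible_match).
-export([may_match/2, parse_expr/1]).

%% Hypothetical helper: parse a single expression string into an abstract form.
parse_expr(Str) ->
    {ok, Tokens, _} = erl_scan:string(Str ++ "."),
    {ok, [Form]} = erl_parse:parse_exprs(Tokens),
    Form.

%% Coarse classification of a form (assumption: this granularity is enough).
kind(Form) ->
    case erl_syntax:type(Form) of
        variable -> any;
        underscore -> any;
        tuple -> tuple;
        T when T =:= atom; T =:= integer; T =:= float; T =:= char -> literal;
        T when T =:= list; T =:= nil; T =:= string -> listy;
        _ -> any  % calls, case expressions, etc.: unknown until runtime
    end.

%% Conservative check: returns false only when Msg can never match Pattern.
may_match(Pattern, Msg) ->
    case {kind(Pattern), kind(Msg)} of
        {any, _} -> true;
        {_, any} -> true;
        {tuple, tuple} ->
            Ps = erl_syntax:tuple_elements(Pattern),
            Ms = erl_syntax:tuple_elements(Msg),
            length(Ps) =:= length(Ms) andalso
                lists:all(fun({P, M}) -> may_match(P, M) end,
                          lists:zip(Ps, Ms));
        {literal, literal} ->
            erl_syntax:concrete(Pattern) =:= erl_syntax:concrete(Msg);
        {listy, listy} -> true;  % lists and strings are not analyzed further
        _ -> false               % e.g. a tuple pattern against a number or a list
    end.
```

<p>With the pattern {a, B, {c, D}}, this sketch accepts Pid ! {a, 5, SomeVar} and Pid ! {a, X, SomeVar}, and rejects 50, {a, b} and [6, 4, 3]. It does not track repeated variables in the pattern (e.g. {X, X}), so it can only over-approximate: dmfun/1 would still have to do the real match at runtime.</p>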
 
