Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>I think what you are after for "instruction dependency trees" is <a href="http://en.wikipedia.org/wiki/Data-flow_analysis" rel="nofollow">data flow analysis</a>. This is classic compiler technology that determines for each code element (primitive operations in a programming language), what information is delivered to it from other code elements. When you are done, you end up with what amounts to a graph with nodes being code elements (in your case, individual instructions) and directed arcs between them showing what information has to flow so that the later elements can execute on results produced by "earlier" elements in the graph.</p> <p>You can see <a href="http://www.semanticdesigns.com/Products/DMS/FlowAnalysis.html" rel="nofollow">some examples of such flow analysis</a> at my website (focused on tool that does program analysis; this tool is probably not appropriate for binary analysis but the examples should be helpful).</p> <p>There's a ton of literature in the compiler books on <em>doing</em> data flow analysis. See any compiler text book.</p> <p>There are a number of issues you need to handle:</p> <ul> <li><p>Parsing the code to extract the code elements. You sound like you have access to all the instructions already.</p></li> <li><p>Determining the operands required by, and the values produced, by each code element. This is pretty simple for "ADD register,register", but you may find this daunting for production x86 CPU which has an astonishingly big and crazy instruction set. You have to collect this for every instruction the CPU might execute, and that means all of them pretty much. Nontheless, this is just sweat and a lot of time spent looking at the instruction reference manuals.</p></li> <li><p>Loops. Values can flow from an instruction, through other instructions, back to that same instruction, so the dataflows can form cycles (lots of cycles for complex loops). The dataflow literature will tell you how to handle this in terms of computing the dataflow arcs in the graph. What these mean for your protection scheme I don't know.</p></li> <li><p>Conservative Analysis: You can't get the ideal data flow, because in fact you are analyzing arbitrary algorithms (e.g., a Turing machine); pointers aggravate this problem pretty severely and machine code is full of pointers. So what the data flow analysis engines often do when unable to decide if "x feeds y", is to simply assume "x (may) feed y". The dataflow graph turns conceptually from "x (must) feed y" into the pragmatic "x (may) feed y" type arcs; the literature in fact is full of "must" and "may" algorithms because of this. Again, the literature tells you many ways to do [conservative] flow analysis (mostly having different degrees of conservatism; in fact the most conservatinve data flow analysis simply says "every x feeds every y"!). What this means in practice for your scheme, I don't know.</p></li> </ul> <p>There are a lot of people interested in binary code analysis (e.g., the NSA), and they do data flow analysis on machines instructions complete with pointer analysis. You might find this presentation interesting: <a href="http://research.cs.wisc.edu/wisa/presentations/2002/0114/gogul/gogul.1.14.02.pdf" rel="nofollow">http://research.cs.wisc.edu/wisa/presentations/2002/0114/gogul/gogul.1.14.02.pdf</a></p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload