How does address operand affect performance and size of machine code?
<p>Starting with 32-bit CPU mode, extended address operands are available on the x86 architecture. One can specify a base address, a displacement, an index register and a scaling factor.</p>
<p>For example, suppose we want to stride through a list of 32-bit integers, loading the first two from each element of an array of 32-byte-long data structures, with <code>%rdi</code> as the data index and <code>%rbx</code> as the base pointer.</p>
<pre><code>addq $8, %rdi                # skip eight values: advance index by 8
movl (%rbx, %rdi, 4), %eax   # load data: pointer + scaled index
movl 4(%rbx, %rdi, 4), %edx  # load data: pointer + scaled index + displacement
</code></pre>
<p>As far as I know, such complex addressing fits into a single machine-code instruction. But what is the cost of such an operation, and how does it compare to simple addressing with an independent pointer calculation?</p>
<pre><code>addq $32, %rbx               # skip eight values: move pointer forward by 32 bytes
movl (%rbx), %eax            # load data: pointer
addq $4, %rbx                # point to next value: move pointer forward by 4 bytes
movl (%rbx), %edx            # load data: pointer
</code></pre>
<p>In the latter example, I have introduced one extra instruction and a dependency. But integer addition is very fast, I gained simpler address operands, and there are no multiplications any more. On the other hand, since the allowed scaling factors are powers of 2, the multiplication comes down to a bit shift, which is also a very fast operation. Still, two additions and a bit shift can be replaced with one addition.</p>
<p>What are the performance and code size differences between these two approaches? Are there any best practices for using the extended addressing operands?</p>
<p>Or, asking it from a C programmer's point of view, what is faster: array indexing or pointer arithmetic?</p>
<hr>
<p>Is there any assembly editor meant for size/performance tuning? I wish I could see the machine-code size of each assembly instruction, its execution time in clock cycles or a dependency graph. There are thousands of assembly freaks that would benefit from such an application, so I bet that something like this already exists!</p>
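<p>To make the C side of the question concrete, here is a minimal sketch of the two styles being compared. The function names <code>sum_indexed</code> and <code>sum_pointer</code> and the constant <code>RECORD_INTS</code> are made up for illustration. The mapping to the two assembly snippets above is only approximate: an optimizing compiler is free to pick either addressing form for either function and may well emit identical code for both.</p>

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* int32_t, int64_t */

/* Hypothetical layout: each record is eight 32-bit integers (32 bytes). */
#define RECORD_INTS 8

static_assert(RECORD_INTS * sizeof(int32_t) == 32,
              "a record is expected to span 32 bytes");

/* Array indexing: fixed base pointer, running index advanced by 8.
   Source-level analogue of (%rbx,%rdi,4) and 4(%rbx,%rdi,4). */
int64_t sum_indexed(const int32_t *data, size_t records)
{
    int64_t sum = 0;
    for (size_t i = 0; i < records * RECORD_INTS; i += RECORD_INTS) {
        sum += data[i];      /* first value of the record  */
        sum += data[i + 1];  /* second value of the record */
    }
    return sum;
}

/* Pointer arithmetic: the pointer itself is moved forward by 32 bytes.
   Source-level analogue of (%rbx) with explicit pointer updates. */
int64_t sum_pointer(const int32_t *data, size_t records)
{
    int64_t sum = 0;
    const int32_t *end = data + records * RECORD_INTS;
    for (const int32_t *p = data; p != end; p += RECORD_INTS) {
        sum += p[0];  /* first value of the record  */
        sum += p[1];  /* second value of the record */
    }
    return sum;
}
```

<p>Compiling this with something like <code>gcc -O2 -S</code> and reading the generated assembly is one way to see which addressing form the compiler actually chooses for each loop.</p>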