Class template for cache-aligned memory usage in C++
<p>(There is a lot of information needed to understand my question; this is already the compressed version.)</p>
<p>I am trying to implement a class template that allocates and accesses data in a cache-aligned way. This works very well for single elements, but adding support for arrays is a problem.</p>
<p>Semantically, the code shall produce this memory layout for a single element:</p>
<pre><code>cache_aligned&lt;element_type&gt;* my_el = new(cache_line_size) cache_aligned&lt;element_type&gt;();

| element | buffer |
</code></pre>
<p>The access (so far) looks like this:</p>
<pre><code>*my_el;                        // returns cache_aligned&lt;element_type&gt;
**my_el;                       // returns element_type
*my_el-&gt;member_of_element();
</code></pre>
<p>HOWEVER, for an array I'd like to have this:</p>
<pre><code>cache_aligned&lt;element_type&gt;* my_el_array = new(cache_line_size) cache_aligned&lt;element_type&gt;[N];

| element 0 | buffer | element 1 | buffer | ... | element (N-1) | buffer |
</code></pre>
<p>So far I have the following code:</p>
<pre><code>template &lt;typename T&gt;
class cache_aligned {
    private:
        T instance;
    public:
        cache_aligned() {}
        // copy the wrapped value itself (T has no member named instance)
        cache_aligned(const T&amp; other) : instance(other) {}

        static void* operator new (size_t size, uint c_line_size) {
            return c_a_malloc(size, c_line_size);
        }
        static void* operator new[] (size_t size, uint c_line_size) {
            int num_el = (size - sizeof(cache_aligned&lt;T&gt;*)) / sizeof(cache_aligned&lt;T&gt;);
            return c_a_array(sizeof(cache_aligned&lt;T&gt;), num_el, c_line_size);
        }
        static void operator delete (void* ptr) {
            free_c_a(ptr);
        }

        T* operator-&gt; () { return &amp;instance; }
        T&amp; operator * () { return instance; }
};
</code></pre>
<p>The allocation helper functions:</p>
<pre><code>void* c_a_array(uint size, ulong num_el, uint c_line_size) {
    void* mem = malloc((size + c_line_size) * num_el + sizeof(void*));
    void** ptr = (void**)((long)mem + sizeof(void*));
    ptr[-1] = mem;   // remember the original pointer for free()
    return ptr;
}

void free_c_a(void* ptr) {
    free(((void**)ptr)[-1]);
}
</code></pre>
<p>The problem is here: the access to the data should work like this:</p>
<pre><code>my_el_array[i];                        // returns cache_aligned&lt;element_type&gt;
*(my_el_array[i]);                     // returns element_type
my_el_array[i]-&gt;member_of_element();
</code></pre>
<p>My ideas for solving it are:</p>
<p>(1) Something similar to this, overloading the sizeof operator:</p>
<pre><code>static size_t operator sizeof () {
    return sizeof(cache_aligned&lt;T&gt;) + c_line_size;
}
</code></pre>
<p>--&gt; not possible, since overloading the sizeof operator is illegal</p>
<p>(2) Something like this, overloading operator [] for the pointer type:</p>
<pre><code>static T&amp; operator [] (uint index, cache_aligned&lt;T&gt;* ptr) {
    return ptr + ((sizeof(cache_aligned&lt;T&gt;) + c_line_size) * index);
}
</code></pre>
<p>--&gt; not possible in C++ either</p>
<p>(3) The totally trivial solution:</p>
<pre><code>template &lt;typename T&gt;
class cache_aligned {
    private:
        T instance;
        bool buffer[CACHE_LINE_SIZE]; // CACHE_LINE_SIZE defined as macro
    public:
        // trivial operators and methods ;)
};
</code></pre>
<p>--&gt; I don't know whether this is reliable; I'm currently using gcc-4.5.1 on Linux ...</p>
<p>(4) Replacing T instance; with T* instance_ptr; in the class template and using operator [] to calculate the position of the element, like this:</p>
<p>| pointer-to-instance | ----&gt; | element 0 | buffer | ... | element (N-1) | buffer |</p>
<p>This is not the intended semantics, since the instance of the class template becomes the bottleneck when calculating the addresses of the elements.</p>
<p>Thanks for reading! I don't know how to shorten the problem. It would be great if you could help! Any workaround would help a lot.</p>
<p>I know alignment is an extension in C++0x. However, it is not available in gcc yet.</p>
<p>Greetz, sema</p>
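<p>For illustration, idea (3) can be sketched so that ordinary pointer arithmetic already yields the desired array layout. This is only a sketch under assumptions that are not part of the question: it relies on the C++11 <code>alignas</code> specifier (which, as noted above, the gcc version in use does not offer yet), it assumes a 64-byte cache line, and the names <code>kCacheLine</code> and <code>all_on_line_boundaries</code> are hypothetical. Because <code>alignas</code> rounds <code>sizeof</code> up to a multiple of the alignment, each array element occupies whole cache lines and <code>my_el_array[i]</code> needs no overloaded operator.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;

// Padded, over-aligned wrapper: the compiler inserts the "buffer" itself.
template <typename T>
class alignas(kCacheLine) cache_aligned {
    T instance;
public:
    cache_aligned() : instance() {}
    cache_aligned(const T& other) : instance(other) {}

    T* operator->() { return &instance; }
    T& operator*() { return instance; }
};

// sizeof is a multiple of the alignment, so consecutive array elements
// never share a cache line.
static_assert(sizeof(cache_aligned<int>) % kCacheLine == 0,
              "wrapper must occupy whole cache lines");
static_assert(alignof(cache_aligned<int>) == kCacheLine,
              "wrapper must be aligned to a cache line");

// Hypothetical helper: checks that every element starts on a line boundary.
template <typename T>
bool all_on_line_boundaries(cache_aligned<T>* arr, std::size_t n) {
    assert(arr != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&arr[i]);
        if (addr % kCacheLine != 0) {
            return false;
        }
    }
    return true;
}
```

<p>With a local array such as <code>cache_aligned&lt;int&gt; arr[4];</code>, the expressions <code>arr[i]</code>, <code>*arr[i]</code> and <code>arr[i]-&gt;member()</code> then behave as requested. For heap arrays, a plain <code>new[]</code> is only guaranteed to honor the over-alignment from C++17 on, so with older compilers an aligned allocation function along the lines of <code>c_a_array</code> would still be needed.</p>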
 
