Memory access after ioremap very slow
<p>I'm working on a Linux kernel driver that makes a chunk of physical memory available to user space. I have a working version of the driver, but it's currently very slow. So, I've gone back a few steps and tried making a small, simple driver to recreate the problem.</p>
<p>I reserve the memory at boot time using the kernel parameter <code>memmap=2G$1G</code>. Then, in the driver's <code>__init</code> function, I <code>ioremap</code> some of this memory, and initialize it to a known value. I put in some code to measure the timing as well:</p>
<pre><code>#define RESERVED_REGION_SIZE    (1 * 1024 * 1024 * 1024)   // 1GB
#define RESERVED_REGION_OFFSET  (1 * 1024 * 1024 * 1024)   // 1GB

static int __init memdrv_init(void)
{
    struct timeval t1, t2;
    printk(KERN_INFO "[memdriver] init\n");

    // Remap reserved physical memory (that we grabbed at boot time)
    do_gettimeofday( &amp;t1 );
    reservedBlock = ioremap( RESERVED_REGION_OFFSET, RESERVED_REGION_SIZE );
    do_gettimeofday( &amp;t2 );
    printk( KERN_ERR "[memdriver] ioremap() took %d usec\n", usec_diff( &amp;t2, &amp;t1 ) );

    // Set the memory to a known value
    do_gettimeofday( &amp;t1 );
    memset( reservedBlock, 0xAB, RESERVED_REGION_SIZE );
    do_gettimeofday( &amp;t2 );
    printk( KERN_ERR "[memdriver] memset() took %d usec\n", usec_diff( &amp;t2, &amp;t1 ) );

    // Register the character device
    ...

    return 0;
}
</code></pre>
<p>I load the driver, and check dmesg. It reports:</p>
<pre><code>[memdriver] init
[memdriver] ioremap() took 76268 usec
[memdriver] memset() took 12622779 usec
</code></pre>
<p>That's 12.6 seconds for the memset. That means the memset is running at <em><strong>81 MB/sec</strong></em>. Why on earth is it so slow?</p>
<p>This is kernel 2.6.34 on Fedora 13, and it's an x86_64 system.</p>
<p>EDIT:</p>
<p>The goal behind this scheme is to take a chunk of physical memory and make it available to both a PCI device (via the memory's bus/physical address) and a user space application (via a call to <code>mmap</code>, supported by the driver). The PCI device will then continually fill this memory with data, and the user-space app will read it out. If <code>ioremap</code> is a bad way to do this (as Ben suggested below), I'm open to other suggestions that'll allow me to get any large chunk of memory that can be directly accessed by both hardware and software. I can probably make do with a smaller buffer also.</p>
<hr>
<p>See my eventual solution below.</p>
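<p>For reference, the 81 MB/sec figure comes from dividing the 1GB region size by the measured memset time. A minimal sketch of that arithmetic in plain user-space C follows. The function name <code>throughput_mib_per_sec</code> is hypothetical and not part of the driver, and the sketch assumes the elapsed time is given in microseconds (as in the dmesg output) and that "MB" means 2^20 bytes.</p>

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not part of the driver).
 * Converts a byte count and an elapsed time in microseconds into a
 * throughput in MiB per second. Returns 0.0 if no time has elapsed.
 */
double throughput_mib_per_sec(uint64_t bytes, uint64_t elapsed_usec)
{
    if (elapsed_usec == 0)
        return 0.0;

    double mib = (double)bytes / (1024.0 * 1024.0);   /* bytes -> MiB */
    double seconds = (double)elapsed_usec / 1e6;      /* usec  -> s   */

    return mib / seconds;
}
```

<p>Calling <code>throughput_mib_per_sec(1024ULL * 1024 * 1024, 12622779)</code> with the region size and the reported memset time gives a value just above 81, which matches the number quoted above.</p>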