Note that there are some explanatory texts on larger screens.

plurals
  1. POBest way of "looping over a 2-D array", using Repa
    primarykey
    data
    text
    <p>I find the array library Repa for Haskell very interesting, and wanted to make a simple program, to try to understand how to use it. I also made a simple implementation using lists, which proved to be much faster. My main question is how I could improve the Repa code below to make it the most efficient (and hopefully also very readable). I am quite new using Haskell, and I couldn't find any easily understandable tutorial on Repa [<strong>edit</strong> there is one at the <a href="http://haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial" rel="noreferrer">Haskell Wiki</a>, that I somehow forgot when I wrote this], so don't assume I know anything. :) For example, I'm not sure when to use force or deepSeqArray.</p> <p>The program is used to approximately calculate the volume of a sphere in the following way: </p> <ol> <li>The center point and radius of the sphere is specified, as well as equally spaced coordinates within a cuboid, which are known to encompass the sphere.</li> <li>The program takes each of the coordinates, calculates the distance to the center of the sphere, and if it is smaller than the radius of the sphere, it is used to add up on the total (approximate) volume of the sphere.</li> </ol> <p>Two versions are shown below, one using lists and one using repa. I know the code is inefficient, especially for this use case, but the idea is to make it more complicated later on.</p> <p>For the values below, and compiling with "ghc -Odph -fllvm -fforce-recomp -rtsopts -threaded", the list version takes 1.4 s, while the repa version takes 12 s with +RTS -N1 and 10 s with +RTS -N2, though no sparks are converted (I have a dual-core Intel machine (Core 2 Duo E7400 @ 2.8 GHz) running Windows 7 64, GHC 7.0.2 and llvm 2.8). (Comment out the correct line in main below to just run one of the versions.)</p> <p>Thank you for any help!</p> <pre><code>import Data.Array.Repa as R import qualified Data.Vector.Unboxed as V import Prelude as P -- Calculate the volume of a sphere by putting it in a bath of coordinates. Generate coordinates (x,y,z) in a cuboid. Then, for each coordinate, check if it is inside the sphere. Sum those coordinates and multiply by the coordinate grid step size to find an approximate volume. particles = [(0,0,0)] -- used for the list alternative --[(0,0,0),(0,2,0)] particles_repa = [0,0,0::Double] -- used for the repa alternative, can currently just be one coordinate -- Radius of the particle a = 4 -- Generate the coordinates. Could this be done more efficiently, and at the same time simple? In Matlab I would use ndgrid. step = 0.1 --0.05 xrange = [-10,-10+step..10] :: [Double] yrange = [-10,-10+step..10] zrange = [-10,-10+step..10] -- All coordinates as triples. These are used directly in the list version below. coords = [(x,y,z) | x &lt;- xrange, y &lt;- yrange, z &lt;- zrange] ---- List code ---- volumeIndividuals = fromIntegral (length particles) * 4*pi*a**3/3 volumeInside = step**3 * fromIntegral (numberInsideParticles particles coords) numberInsideParticles particles coords = length $ filter (==True) $ P.map (insideParticles particles) coords insideParticles particles coord = any (==True) $ P.map (insideParticle coord) particles insideParticle (xc,yc,zc) (xp,yp,zp) = ((xc-xp)^2+(yc-yp)^2+(zc-zp)^2) &lt; a**2 ---- End list code ---- ---- Repa code ---- -- Put the coordinates in a Nx3 array. xcoords = P.map (\(x,_,_) -&gt; x) coords ycoords = P.map (\(_,y,_) -&gt; y) coords zcoords = P.map (\(_,_,z) -&gt; z) coords -- Total number of coordinates num_coords = (length xcoords) ::Int xcoords_r = fromList (Z :. num_coords :. (1::Int)) xcoords ycoords_r = fromList (Z :. num_coords :. (1::Int)) ycoords zcoords_r = fromList (Z :. num_coords :. (1::Int)) zcoords rcoords = xcoords_r R.++ ycoords_r R.++ zcoords_r -- Put the particle coordinates in an array, then extend (replicate) this array so that its size becomes the same as that of rcoords particle = fromList (Z :. (1::Int) :. (3::Int)) particles_repa particle_slice = slice particle (Z :. (0::Int) :. All) particle_extended = extend (Z :. num_coords :. All) particle_slice -- Calculate the squared difference between the (x,y,z) coordinates of the particle and the coordinates of the cuboid. squared_diff = deepSeqArrays [rcoords,particle_extended] ((force2 rcoords) -^ (force2 particle_extended)) **^ 2 (**^) arr pow = R.map (**pow) arr xslice = slice squared_diff (Z :. All :. (0::Int)) yslice = slice squared_diff (Z :. All :. (1::Int)) zslice = slice squared_diff (Z :. All :. (2::Int)) -- Calculate the distance between each coordinate and the particle center sum_squared_diff = [xslice,yslice,zslice] `deepSeqArrays` xslice +^ yslice +^ zslice -- Do the rest using vector, since I didn't get the repa variant working. ssd_vec = toVector sum_squared_diff -- Determine the number of the coordinates that are within the particle (instead of taking the square root to get the distances above, I compare to the square of the radius here, to improve performance) total_within = fromIntegral (V.length $ V.filter (&lt;a**2) ssd_vec) --total_within = foldAll (\x acc -&gt; if x &lt; a**2 then acc+1 else acc) 0 sum_squared_diff -- Finally, calculate an approximation of the volume of the sphere by taking the volume of the cubes with side step, multiplied with the number of coordinates within the sphere. volumeInside_repa = step**3 * total_within -- Helper function that shows the size of a 2-D array. rsize = reverse . listOfShape . (extent :: Array DIM2 Double -&gt; DIM2) ---- End repa code ---- -- Comment out the list or the repa version if you want to time the calculations separately. main = do putStrLn $ "Step = " P.++ show step putStrLn $ "Volume of individual particles = " P.++ show volumeIndividuals putStrLn $ "Volume of cubes inside particles (list) = " P.++ show volumeInside putStrLn $ "Volume of cubes inside particles (repa) = " P.++ show volumeInside_repa </code></pre> <p><strong>Edit</strong>: Some background that explains why I have written the code as it is above:</p> <p>I mostly write code in Matlab, and my experience of performance improvement comes mostly from that area. In Matlab, you usually want to make your calculations using functions operating on matrices directly, to improve performance. My implementation of the problem above, in Matlab R2010b, takes 0.9 seconds using the matrix version shown below, and 15 seconds using nested loops. Although I know Haskell is very different from Matlab, my hope was that going from using lists to using Repa arrays in Haskell would improve the performance of the code. The conversions from lists->Repa arrays->vectors are there because I'm not skilled enough to replace them with something better. This is why I ask for input. :) The timing numbers above is thus subjective, since it may measure <em>my</em> performance more than that of the abilities of the languages, but it is a valid metric for me right now, since what decides what I will use depends on if <em>I</em> can make it work or not.</p> <p>tl;dr: I understand that my Repa code above may be stupid or pathological, but it's the best I can do right now. I would love to be able to write better Haskell code, and I hope that you can help me in that direction (dons already did). :)</p> <pre><code>function archimedes_simple() particles = [0 0 0]'; a = 4; step = 0.1; xrange = [-10:step:10]; yrange = [-10:step:10]; zrange = [-10:step:10]; [X,Y,Z] = ndgrid(xrange,yrange,zrange); dists2 = bsxfun(@minus,X,particles(1)).^2+ ... bsxfun(@minus,Y,particles(2)).^2+ ... bsxfun(@minus,Z,particles(3)).^2; inside = dists2 &lt; a^2; num_inside = sum(inside(:)); disp(''); disp(['Step = ' num2str(step)]); disp(['Volume of individual particles = ' num2str(size(particles,2)*4*pi*a^3/3)]); disp(['Volume of cubes inside particles = ' num2str(step^3*num_inside)]); end </code></pre> <p><strong>Edit 2</strong>: New, faster and simpler version of the Repa code</p> <p>I have now read up a bit more on Repa, and thought a bit. Below is a new Repa version. In this case, I create the x, y, and z coordinates as 3-D arrays, using the Repa extend function, from a list of values (similar to how ndgrid works in Matlab). I then map over these arrays to calculate the distance to the spherical particle. Finally, I fold over the resulting 3-D distance array, count how many coordinates are within the sphere, and then multiply it by a constant factor to get the approximate volume. My implementation of the algorithm is now much more similar to the Matlab version above, and there are no longer any conversion to vector.</p> <p>The new version runs in about 5 seconds on my computer, a considerable improvement from above. The timing is the same if I use "threaded" while compiling, combined with "+RTS -N2" or not, but the threaded version does max out both cores of my computer. I did, however, see a few drops of the "-N2" run to 3.1 seconds, but couldn't reproduce them later. Maybe it is very sensitive to other processes running at the same time? I have shut most programs on my computer when benchmarking, but there are still some programs running, such as background processes. </p> <p>If we use "-N2" and add the runtime switch to turn off parallel GC (-qg), the time consistently goes down to ~4.1 seconds, and using -qa to "use the OS to set thread affinity (experimental)", the time was shaved down to ~3.5 seconds. Looking at the output from running the program with "+RTS -s", much less GC is performed using -qg.</p> <p>This afternoon I will see if I can run the code on an 8-core computer, just for fun. :)</p> <pre><code>import Data.Array.Repa as R import Prelude as P import qualified Data.List as L -- Calculate the volume of a spherical particle by putting it in a bath of coordinates. Generate coordinates (x,y,z) in a cuboid. Then, for each coordinate, check if it is inside the sphere. Sum those coordinates and multiply by the coordinate grid step size to find an approximate volume. particles :: [(Double,Double,Double)] particles = [(0,0,0)] -- Radius of the spherical particle a = 4 volume_individuals = fromIntegral (length particles) * 4*pi*a^3/3 -- Generate the coordinates. step = 0.1 coords_list = [-10,-10+step..10] :: [Double] num_coords = (length coords_list) :: Int coords :: Array DIM1 Double coords = fromList (Z :. (num_coords ::Int)) coords_list coords_slice :: Array DIM1 Double coords_slice = slice coords (Z :. All) -- x, y and z are 3-D arrays, where the same index into each array can be used to find a single coordinate, e.g. (x(i,j,k),y(i,j,k),z(i,j,k)). x,y,z :: Array DIM3 Double x = extend (Z :. All :. num_coords :. num_coords) coords_slice y = extend (Z :. num_coords :. All :. num_coords) coords_slice z = extend (Z :. num_coords :. num_coords :. All) coords_slice -- Calculate the squared distance from each coordinate to the center of the spherical particle. dist2 :: (Double, Double, Double) -&gt; Array DIM3 Double dist2 particle = ((R.map (squared_diff xp) x) + (R.map (squared_diff yp) y) + (R.map ( squared_diff zp) z)) where (xp,yp,zp) = particle squared_diff xi xa = (xa-xi)^2 -- Count how many of the coordinates are within the spherical particle. num_inside_particle :: (Double,Double,Double) -&gt; Double num_inside_particle particle = foldAll (\acc x -&gt; if x&lt;a^2 then acc+1 else acc) 0 (force $ dist2 particle) -- Calculate the approximate volume covered by the spherical particle. volume_inside :: [Double] volume_inside = P.map ((*step^3) . num_inside_particle) particles main = do putStrLn $ "Step = " P.++ show step putStrLn $ "Volume of individual particles = " P.++ show volume_individuals putStrLn $ "Volume of cubes inside each particle (repa) = " P.++ (P.concat . ( L.intersperse ", ") . P.map show) volume_inside -- As an alternative, y and z could be generated from x, but this was slightly slower in my tests (~0.4 s). --y = permute_dims_3D x --z = permute_dims_3D y -- Permute the dimensions in a 3-D array, (forward, cyclically) permute_dims_3D a = backpermute (swap e) swap a where e = extent a swap (Z :. i:. j :. k) = Z :. k :. i :. j </code></pre> <p><strong>Space profiling for the new code</strong></p> <p>The same types of profiles as Don Stewart made below, but for the new Repa code.</p> <p><img src="https://i.stack.imgur.com/uDKDy.png" alt=""> <img src="https://i.stack.imgur.com/hriMd.png" alt=""> <img src="https://i.stack.imgur.com/1Zmvl.png" alt=""></p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload