<p>I had this same problem. I think what needs to be developed is a framework built around a HexagonalGrid object that can be applied to many different data sets (and it would be awesome to do it for N dimensions). This is possible, and it surprises me that neither SciPy nor NumPy has anything for it. Furthermore, there seems to be nothing else like it, except perhaps <a href="https://github.com/kevinschaul/binify" rel="nofollow">binify</a>.</p>
<p>That said, I assume you want to use hexbinning to compare multiple binned data sets. This requires a common base. I got this to work using matplotlib's hexbin in the following way:</p>
<pre><code>import numpy as np
import matplotlib.pyplot as plt


def get_data(mean, cov, n=1e3):
    """ Quick fake data builder """
    np.random.seed(101)
    points = np.random.multivariate_normal(mean=mean, cov=cov, size=int(n))
    x, y = points.T
    return x, y


def get_centers(hexbin_output):
    """
    About 40% faster than the previous post, only because the min/max
    is not recalculated every time.
    """
    paths = hexbin_output.get_paths()
    offsets = hexbin_output.get_offsets()
    if len(paths) == 1 and len(offsets) &gt; 1:
        # Newer Matplotlib versions draw one shared hexagon path placed at
        # each offset, so the offsets already are the centers
        return np.asarray(offsets)
    v = paths[0].vertices[:-1]  # adds a value [0,0] to the end
    vx, vy = v.T

    idx = [3, 0, 5, 2]  # index for [xmin,xmax,ymin,ymax]
    xmin, xmax, ymin, ymax = vx[idx[0]], vx[idx[1]], vy[idx[2]], vy[idx[3]]

    half_width_x = abs(xmax - xmin) / 2.0
    half_width_y = abs(ymax - ymin) / 2.0

    centers = []
    for i in range(len(paths)):  # range instead of the Python 2 xrange
        cx = paths[i].vertices[idx[0], 0] + half_width_x
        cy = paths[i].vertices[idx[2], 1] + half_width_y
        centers.append((cx, cy))
    return np.asarray(centers)


# important parts ==&gt;

class Hexagonal2DGrid(object):
    """ Used to fix the gridsize, extent, and bins """

    def __init__(self, gridsize, extent, bins=None):
        self.gridsize = gridsize
        self.extent = extent
        self.bins = bins


def hexbin(x, y, hexgrid):
    """ To hexagonally bin the data in 2 dimensions """
    fig = plt.figure()
    ax = fig.add_subplot(111)

    # Note mincnt=0 so that it will return a value for every point in the
    # hexgrid, not just those with count&gt;mincnt

    # Basically you fix the gridsize, extent, and bins to keep them the same,
    # then the resulting count array is the same
    hb = ax.hexbin(x, y, mincnt=0,
                   gridsize=hexgrid.gridsize,
                   extent=hexgrid.extent,
                   bins=hexgrid.bins)
    # you could close the figure if you don't want it
    # plt.close(fig.number)
    counts = hb.get_array().copy()
    return counts, hb


# Example ===&gt;
if __name__ == "__main__":
    hexgrid = Hexagonal2DGrid((21, 5), [-70, 70, -20, 20])

    # NOTE: these fake covariance matrices are not symmetric, so NumPy may
    # warn that they are not positive-semidefinite; it still draws samples
    x_data, y_data = get_data((0, 0), [[-40, 95], [90, 10]])
    x_model, y_model = get_data((0, 10), [[100, 30], [3, 30]])

    counts_data, hexbin_data = hexbin(x_data, y_data, hexgrid)
    counts_model, hexbin_model = hexbin(x_model, y_model, hexgrid)

    # if you want the centers, they will be the same for both
    centers = get_centers(hexbin_data)

    # if you want to ignore the cells with zeros then use the following mask.
    # If you want zeros for some bins and not others, I'm not sure of an
    # elegant way to do this without using the centers
    nonzero = counts_data != 0

    # now you can compare the two data sets
    variance_data = counts_data[nonzero]
    square_diffs = (counts_data[nonzero] - counts_model[nonzero])**2
    chi2 = np.sum(square_diffs / variance_data)
    print(" chi2={}".format(chi2))
</code></pre>
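<p>For what it's worth, here is a rough sketch of what a standalone grid object could look like without matplotlib. Everything in it is hypothetical: <code>HexGrid</code>, <code>cell</code>, <code>center</code>, <code>bin_points</code> and <code>chi2</code> are names I made up, and it is not how matplotlib's hexbin works internally. It uses axial coordinates with cube rounding on a pointy-top layout, 2D only, standard library only. The point is just that once the grid is fixed, every data set binned with it shares the same cell indices, so the counts can be compared directly.</p>

```python
import math
from collections import Counter


class HexGrid:
    """Hypothetical fixed hexagonal grid (pointy-top, axial coordinates).

    Illustrative stand-in only; not matplotlib's hexbin algorithm.
    """

    def __init__(self, size, origin=(0.0, 0.0)):
        self.size = float(size)  # distance from a hexagon's center to a corner
        self.origin = origin

    def cell(self, x, y):
        """Return the axial (q, r) index of the hexagon containing (x, y)."""
        px = (x - self.origin[0]) / self.size
        py = (y - self.origin[1]) / self.size
        # Fractional axial coordinates for a pointy-top layout
        q = math.sqrt(3) / 3 * px - py / 3
        r = 2 * py / 3
        return self._round(q, r)

    @staticmethod
    def _round(q, r):
        """Cube rounding: snap fractional coordinates to the nearest hexagon."""
        s = -q - r
        rq, rr, rs = round(q), round(r), round(s)
        dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
        # Recompute the component with the largest rounding error so that
        # q + r + s == 0 still holds
        if dq > dr and dq > ds:
            rq = -rr - rs
        elif dr > ds:
            rr = -rq - rs
        return int(rq), int(rr)

    def center(self, q, r):
        """Return the (x, y) center of the hexagon with axial index (q, r)."""
        x = self.size * math.sqrt(3) * (q + r / 2) + self.origin[0]
        y = self.size * 1.5 * r + self.origin[1]
        return x, y

    def bin_points(self, points):
        """Count how many (x, y) points fall in each hexagon."""
        return Counter(self.cell(x, y) for x, y in points)


def chi2(counts_data, counts_model):
    """Sum of (data - model)^2 / data over cells where data is nonzero,
    the same mask and variance choice as in the answer above."""
    return sum((n - counts_model.get(cell, 0)) ** 2 / n
               for cell, n in counts_data.items() if n != 0)


if __name__ == "__main__":
    grid = HexGrid(size=1.0)
    data = grid.bin_points([(0.0, 0.0), (0.1, 0.1), (math.sqrt(3), 0.0)])
    model = grid.bin_points([(0.0, 0.0)])
    print("chi2={}".format(chi2(data, model)))
```

<p>Cells missing from the model simply count as zero, which sidesteps the masking question for sparse data, although you lose the dense count array that hexbin gives you.</p>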
 
