Python: sliding window of variable width

<p>I'm writing a program in Python that processes data generated during experiments, and it needs to estimate the slope of the data. I've written a piece of code that does this quite nicely, but it's horribly slow (and I'm not very patient). Let me explain how this code works:</p>
<p>1) It grabs a small piece of data of size dx (starting with 3 datapoints)</p>
<p>2) It evaluates whether the difference (i.e. |y(x+dx)-y(x-dx)|) is larger than a certain minimum value (40x std. dev. of noise)</p>
<p>3) If the difference is large enough, it calculates the slope using OLS regression. If the difference is too small, it increases dx and redoes the loop with this new dx</p>
<p>4) This continues for all the datapoints</p>
<p>[See updated code further down]</p>
<p>For a dataset of about 100k measurements, this takes about 40 minutes, whereas the rest of the program (it does more processing than just this bit) takes about 10 seconds. I am certain there is a much more efficient way of doing these operations. Could you guys please help me out?</p>
<p>Thanks</p>
<p>EDIT:</p>
<p>Ok, so I've got the problem solved by using only binary searches, limiting the number of allowed steps to 200.
I thank everyone for their input and I selected the answer that helped me most.</p>
<p>FINAL UPDATED CODE:</p>
<pre><code>def slope(self, data, time):
    (wave1, wave2) = wt.dwt(data, "db3")
    std = 2*np.std(wave2)
    e = std/0.05
    de = 5*std
    N = len(data)
    slopes = np.ones(shape=(N,))
    data2 = np.concatenate((-data[::-1]+2*data[0], data, -data[::-1]+2*data[N-1]))
    time2 = np.concatenate((-time[::-1]+2*time[0], time, -time[::-1]+2*time[N-1]))
    for n in xrange(N+1, 2*N):
        left = N+1
        right = 2*N
        for i in xrange(200):
            mid = int(0.5*(left+right))
            diff = np.abs(data2[n-mid+N]-data2[n+mid-N])
            if diff &gt;= e:
                if diff &lt; e + de:
                    break
                right = mid - 1
                continue
            left = mid + 1
        leftlim = n - mid + N
        rightlim = n + mid - N
        y = data2[leftlim:rightlim:int(0.05*(rightlim-leftlim)+1)]
        x = time2[leftlim:rightlim:int(0.05*(rightlim-leftlim)+1)]
        xavg = np.average(x)
        yavg = np.average(y)
        xlen = len(x)
        slopes[n-N] = (np.dot(x,y)-xavg*yavg*xlen)/(np.dot(x,x)-xavg*xavg*xlen)
    return np.array(slopes)
</code></pre>
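<p>For readers who want to try the idea in isolation, here is a minimal Python 3 sketch of the two ingredients described above: a binary search for the window half-width, and the closed-form OLS slope. It is not the code from the post. The function names (ols_slope, smallest_half_width, local_slope) are made up for illustration, it uses plain lists instead of NumPy, it clamps the window at the array edges instead of mirroring the data, and it assumes the difference |y(x+dx)-y(x-dx)| grows monotonically with dx, which real noisy data may not satisfy.</p>

```python
def ols_slope(x, y):
    """Closed-form least-squares slope of y against x."""
    n = len(x)
    xavg = sum(x) / n
    yavg = sum(y) / n
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return (sxy - n * xavg * yavg) / (sxx - n * xavg * xavg)


def smallest_half_width(y, center, threshold, max_steps=200):
    """Binary-search a half-width w with |y[center+w] - y[center-w]| at least threshold.

    Assumption: the difference is non-decreasing in w. Under that assumption
    the result is the smallest such w. Returns None when even the widest
    window that fits around `center` stays below the threshold.
    """
    lo = 1
    hi = min(center, len(y) - 1 - center)  # widest half-width that fits
    if hi < 1 or abs(y[center + hi] - y[center - hi]) < threshold:
        return None
    # Invariant: `hi` always satisfies the threshold; everything below `lo` fails.
    for _ in range(max_steps):
        if lo >= hi:
            break
        mid = (lo + hi) // 2
        if abs(y[center + mid] - y[center - mid]) >= threshold:
            hi = mid
        else:
            lo = mid + 1
    return hi


def local_slope(x, y, center, threshold):
    """OLS slope over the window chosen by smallest_half_width, or None."""
    w = smallest_half_width(y, center, threshold)
    if w is None:
        return None
    return ols_slope(x[center - w:center + w + 1], y[center - w:center + w + 1])


# Usage on a noiseless straight line y = 3x + 1
x = [float(i) for i in range(21)]
y = [3.0 * v + 1.0 for v in x]
print(smallest_half_width(y, 10, 12.0))
print(local_slope(x, y, 10, 12.0))
```

<p>Each point needs at most about log2(N) threshold checks instead of growing dx one step at a time, which is where the speed-up comes from. Unlike the posted code, this sketch does not stop early once the difference lands in a tolerance band above the threshold, and it does not subsample the window before the regression.</p>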
 
