
How can I perform a least-squares fitting over multiple data sets fast?

I am trying to make a Gaussian fit over many data points. E.g. I have a 256 x 262144 array of data, where each set of 256 points needs to be fitted to a Gaussian distribution, and I need 262144 of them.

Sometimes the peak of the Gaussian distribution is outside the data range, so curve fitting is the best approach to get an accurate mean. Even when the peak is inside the range, curve fitting gives a better sigma, because other parts of the distribution fall outside the range.

I have this working for one data point, using code from http://www.scipy.org/Cookbook/FittingData .

I have tried simply repeating this algorithm for every data set, but it looks like it is going to take on the order of 43 minutes to solve all of them. Is there an already-written way of doing this fast, either in parallel or more efficiently?

```python
from scipy import optimize
import numpy

# Fitting code taken from: http://www.scipy.org/Cookbook/FittingData

class Parameter:
    def __init__(self, value):
        self.value = value

    def set(self, value):
        self.value = value

    def __call__(self):
        return self.value


def fit(function, parameters, y, x=None):
    def f(params):
        for i, p in enumerate(parameters):
            p.set(params[i])
        return y - function(x)

    if x is None:
        x = numpy.arange(y.shape[0])
    p = [param() for param in parameters]
    optimize.leastsq(f, p)


def nd_fit(function, parameters, y, x=None, axis=0):
    """Fits every 1-D slice of an n-dimensional array along the given axis,
    as though each slice were a separate dataset."""
    y = y.swapaxes(0, axis)
    shape = y.shape
    axis_of_interest_len = shape[0]
    prod = numpy.array(shape[1:]).prod()
    y = y.reshape(axis_of_interest_len, prod)
    params = numpy.zeros([len(parameters), prod])
    for i in range(prod):
        print("at %d of %d" % (i, prod))
        fit(function, parameters, y[:, i], x)
        for p in range(len(parameters)):
            params[p, i] = parameters[p]()
    # shape is a tuple and can't be modified in place, so build a new one
    # with the parameter count as the leading axis
    params = params.reshape((len(parameters),) + shape[1:])
    return params
```

Note that the data isn't necessarily 256 x 262144, and I've done some fudging around in nd_fit to make this work.

The code I use to get this to work is:

```python
from curve_fitting import *
import numpy

frames = numpy.load("data.npy")
y = frames[:, 0, 0, 20, 40]
x = numpy.arange(0, 512, 2)

mu = Parameter(x[numpy.argmax(y)])
height = Parameter(y.max())
sigma = Parameter(50)

def f(x):
    return height() * numpy.exp(-((x - mu()) / sigma()) ** 2)

ls_data = nd_fit(f, [mu, sigma, height], frames, x, 0)
```

Note: The solution posted below by @JoeKington is great and solves really fast. However, it doesn't appear to work unless the significant area of the Gaussian is inside the data range. I will still have to test whether the mean it returns is accurate, as that is the main thing I use this for.

![Analysis of gaussian distribution estimations](https://imgur.com/E38eJ.png)
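
For context on the fast solution mentioned above: one standard trick (which I believe is what the referenced answer uses, though I haven't confirmed the exact code) is to exploit the fact that the logarithm of a Gaussian is a parabola, so a single degree-2 `numpy.polyfit` against `log(y)` fits every dataset at once with no Python-level loop. Here is a minimal sketch under that assumption; the function name `gauss_polyfit` and the reshaping are mine, and it requires all samples to be strictly positive:

```python
import numpy as np

def gauss_polyfit(x, y):
    """Fit y ~ height * exp(-((x - mu) / sigma)**2) for every column of y
    at once by fitting a parabola to log(y).

    x is 1-D with length N; y has shape (N, n_datasets) and must be > 0.
    """
    # log(h * exp(-((x - mu)/sigma)**2)) = log(h) - (x - mu)**2 / sigma**2
    # is quadratic in x: c2*x**2 + c1*x + c0, with
    #   c2 = -1/sigma**2,  c1 = 2*mu/sigma**2,  c0 = log(h) - (mu/sigma)**2
    c2, c1, c0 = np.polyfit(x, np.log(y), 2)  # one parabola per column
    sigma = np.sqrt(-1.0 / c2)                # NaN where c2 >= 0 (bad columns)
    mu = 0.5 * c1 * sigma ** 2
    height = np.exp(c0 + (mu / sigma) ** 2)
    return mu, sigma, height

# Hypothetical usage against the data layout above: flatten everything but
# the fitted axis into columns, then fit all columns in one call.
frames = np.load("data.npy")
y = frames.reshape(frames.shape[0], -1)
mu, sigma, height = gauss_polyfit(np.arange(0, 512, 2), y)
```

Because the fit is done on log(y), noisy samples in the tails get disproportionate weight, which is consistent with the failure mode noted above when most of the Gaussian lies outside the data range; weighting the fit toward the high-signal samples, or clipping near-zero values before taking the log, is a common mitigation.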