StackOverflow2013


plurals

PO How to get data in a histogram bin
primarykey
Id
2275924
data
AcceptedAnswerId
2277669
AnswerCount
3
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2010-02-16T20:05:45.293
FavoriteCount
7
LastActivityDate
2015-09-23T18:09:02.640
LastEditDate
LastEditorUserId
0
OwnerUserId
159595
ParentId
0
PostTypeId
1
Score
15
ViewCount
31343
LastEditorDisplayName
text
Body
I want to get a list of the data contained in a histogram bin. I am using numpy and Matplotlib. I know how to traverse the data and check the bin edges. However, I want to do this for a 2D histogram, and the code to do this is rather ugly. Does numpy have any constructs to make this easier?

For the 1D case, I can use searchsorted(). But the logic is not that much better, and I don’t really want to do a binary search on each data point when I don’t have to.

Most of the nasty logic is due to the bin boundary regions. All regions have boundaries like this: [left edge, right edge). The exception is the last bin, which has a region like this: [left edge, right edge].

Here is some sample code for the 1D case:

<pre><code>import numpy as np

data = [0, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 3]

hist, edges = np.histogram(data, bins=3)

print('data =', data)
print('histogram =', hist)
print('edges =', edges)

getbin = 2  #0, 1, or 2

print('---')
print('alg 1:')

#for i in range(len(data)):
for d in data:
    if d >= edges[getbin]:
        if (getbin == len(edges)-2) or d < edges[getbin+1]:
            print('found:', d)
        #end if
    #end if
#end for

print('---')
print('alg 2:')

for d in data:
    val = np.searchsorted(edges, d, side='right')-1
    # a value on the right-most edge only counts for the last bin
    if val == getbin or (getbin == len(edges)-2 and val == len(edges)-1):
        print('found:', d)
    #end if
#end for
</code></pre>

Here is some sample code for the 2D case:

<pre><code>import numpy as np

xdata = [0, 1.5, 1.5, 2.5, 2.5, 2.5, \
         0.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, \
         0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 3]
ydata = [0, 5,5, 5, 5, 5, \
         15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, \
         25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 30]

xbins = 3
ybins = 3
hist2d, xedges, yedges = np.histogram2d(xdata, ydata, bins=(xbins, ybins))

print('data2d =', list(zip(xdata, ydata)))
print('hist2d =')
print(hist2d)
print('xedges =', xedges)
print('yedges =', yedges)

getbin2d = 5  #0 through 8

print('find data in bin #', getbin2d)

xedge_i = getbin2d % xbins
yedge_i = getbin2d // xbins  #IMPORTANT: this is xbins

for x, y in zip(xdata, ydata):
    # x and y left edges
    if x >= xedges[xedge_i] and y >= yedges[yedge_i]:
        #x right edge
        if xedge_i == xbins-1 or x < xedges[xedge_i + 1]:
            #y right edge
            if yedge_i == ybins-1 or y < yedges[yedge_i + 1]:
                print('found:', x, y)
            #end if
        #end if
    #end if
#end for
</code></pre>

Is there a cleaner / more efficient way to do this? It seems like numpy would have something for this.
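The boundary rule described above (every bin is [left edge, right edge) except the last, which is [left edge, right edge]) can also be expressed without a per-point loop. What follows is only a minimal sketch of one possible approach using `np.digitize`; the helper name `bin_indices` and the small 2D data set are made up for illustration, and the sketch assumes the edges come straight from `np.histogram` or `np.histogram2d`.

```python
import numpy as np

def bin_indices(values, edges):
    """Hypothetical helper: bin index of each value, using the histogram
    convention [left, right) for all bins except the last, [left, right].
    Values outside the edges get index -1."""
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    nbins = len(edges) - 1
    # digitize returns i such that edges[i-1] <= x < edges[i]
    idx = np.digitize(values, edges) - 1
    # fold values sitting exactly on the right-most edge into the last bin
    idx[values == edges[-1]] = nbins - 1
    # anything still out of range lies outside the histogram
    idx[(idx < 0) | (idx >= nbins)] = -1
    return idx

# 1D case, with the data from the question
data = np.array([0, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 3])
hist, edges = np.histogram(data, bins=3)
idx = bin_indices(data, edges)
getbin = 2
in_bin = data[idx == getbin]
print('1D bin', getbin, ':', in_bin)

# 2D case, with a small made-up data set
xdata = np.array([0, 0.5, 1.5, 2.5, 2.5, 3])
ydata = np.array([0, 15, 15, 15, 25, 30])
hist2d, xedges, yedges = np.histogram2d(xdata, ydata, bins=(3, 3))
xi = bin_indices(xdata, xedges)
yi = bin_indices(ydata, yedges)
mask = (xi == 2) & (yi == 1)   # x bin 2, y bin 1
points_in_bin = np.column_stack([xdata[mask], ydata[mask]])
print('2D bin (2, 1):', points_in_bin)
```

Computing the index arrays once and then selecting with a boolean mask means any bin can be queried afterwards without touching the edges again; whether that beats the explicit loop depends on the data size.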
Tags
<python><numpy><matplotlib><histogram>
Title
How to get data in a histogram bin
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PT Answer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PT Question
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. US Ben
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PT Answer
2. PO
 singulars
 PostTypePostTypeId
 PT Answer
3. PO
 singulars
 PostTypePostTypeId
 PT Answer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO How to get data in a histogram bin
 UserUserId
 US unutbu
 VoteTypeVoteTypeId
 VT Favorite
2. VO
 singulars
 PostPostId
 PO How to get data in a histogram bin
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VT UpMod
3. VO
 singulars
 PostPostId
 PO How to get data in a histogram bin
 UserUserId
 US Mikoala
 VoteTypeVoteTypeId
 VT Favorite
CommentsPostId
1. CO Just out of curiosity: why do you use comments like #end if in your code? "Every pixel counts." By doing that you are ignoring the purpose of indentation.
 singulars
 PostPostId
 PO How to get data in a histogram bin
 UserUserId
 US Gökhan Sever
2. CO 2 reasons. I am a C++ developer first, and a python developer second. Python's lack of braces irritates me to no end. When I have complicated code blocks with lots of varying indentation, I don't want to be counting whitespace. And I do most of my development in Emacs. By putting closing comments on code blocks, it lets me press TAB on every line and Emacs won't try to wrongly indent something.
 singulars
 PostPostId
 PO How to get data in a histogram bin
 UserUserId
 US Ben


Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Rows can be related in a to-one or a to-many fashion. The caption of each to-many relation links to a list view of the related table, filtered to the matching rows.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.