StackOverflow2013

plurals

PO
primarykey
Id
17755370
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2013-07-19T20:57:51.117
FavoriteCount
0
LastActivityDate
2013-07-19T21:05:04.083
LastEditDate
2013-07-19T21:05:04.083
LastEditorUserId
1951065
OwnerUserId
1951065
ParentId
17688155
PostTypeId
2
Score
2
ViewCount
0
LastEditorDisplayName
text
Body
Ok, this took longer than I expected, but here's a more general answer that works with an arbitrary number of choices per individual. I'm sure there are simpler ways, so it would be great if somebody could chime in with something better for some of the following code.

<pre><code>import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'location' : ['A', 'A', 'A', 'B', 'B', 'B'],
     'dist_to_A' : [0, 0, 0, 50, 50, 50],
     'dist_to_B' : [50, 50, 50, 0, 0, 0],
     'location_var': [10, 10, 10, 14, 14, 14],
     'ind_var': [3, 8, 10, 1, 3, 4]})
</code></pre>

which gives

<pre><code>   dist_to_A  dist_to_B  ind_var location  location_var
0          0         50        3        A            10
1          0         50        8        A            10
2          0         50       10        A            10
3         50          0        1        B            14
4         50          0        3        B            14
5         50          0        4        B            14
</code></pre>

Then we do:

<pre><code>df.index.names = ['ind']

# Add choice var
df['choice'] = 1

# Create dictionaries we'll use later
ind_to_loc = dict(df['location'])
# gives ind_to_loc equal to {0 : 'A', 1 : 'A', 2 : 'A', 3 : 'B', 4 : 'B', 5: 'B'}

ind_dict = dict(df['ind_var'])
# gives {0: 3, 1 : 8, 2 : 10, 3: 1, 4 : 3, 5: 4}

loc_dict = dict( df.groupby('location').agg(lambda x : int(np.mean(x)) )['location_var'] )
# gives {'A' : 10, 'B' : 14}
</code></pre>

Now I create a MultiIndex and reindex to get a long shape:

<pre><code>df = df.set_index( [df.index, df['location']] )
df.index.names = ['ind', 'location']

# re-index to long shape
loc_list = ['A', 'B']
ind_list = [0, 1, 2, 3, 4, 5]
new_shape = [ (ind, loc) for ind in ind_list for loc in loc_list]
idx = pd.Index(new_shape)
df_long = df.reindex(idx, method = None)
df_long.index.names = ['ind', 'loc']
</code></pre>

The long shape looks like this:

<pre><code>         dist_to_A  dist_to_B  ind_var location  location_var  choice
ind loc
0   A            0         50        3        A            10       1
    B          NaN        NaN      NaN      NaN           NaN     NaN
1   A            0         50        8        A            10       1
    B          NaN        NaN      NaN      NaN           NaN     NaN
2   A            0         50       10        A            10       1
    B          NaN        NaN      NaN      NaN           NaN     NaN
3   A          NaN        NaN      NaN      NaN           NaN     NaN
    B           50          0        1        B            14       1
4   A          NaN        NaN      NaN      NaN           NaN     NaN
    B           50          0        3        B            14       1
5   A          NaN        NaN      NaN      NaN           NaN     NaN
    B           50          0        4        B            14       1
</code></pre>

So now fill the NaN values with the dictionaries:

<pre><code># x is an (ind, loc) tuple from the MultiIndex
df_long['ind_var'] = df_long.index.map(lambda x : ind_dict[x[0]] )
df_long['location'] = df_long.index.map(lambda x : ind_to_loc[x[0]] )
df_long['location_var'] = df_long.index.map(lambda x : loc_dict[x[1]] )

# Fill in choice
df_long['choice'] = df_long['choice'].fillna(0)
</code></pre>

Finally, all that is left is creating dist_S. I'll cheat here and assume I can create a nested dictionary like this one:

<pre><code>nested_loc = {'A' : {'A' : 0, 'B' : 50}, 'B' : {'A' : 50, 'B' : 0}}
</code></pre>

(This reads: if you're in location A, then location A is at 0 km and location B is at 50 km.)

<pre><code>def nested_f(x):
    # outer key: the individual's own location; inner key: the candidate location
    return nested_loc[x['location']][x['loc']]

df_long = df_long.reset_index()
df_long['dist_S'] = df_long[['loc', 'location']].apply(nested_f, axis=1)
df_long = df_long.drop(['dist_to_A', 'dist_to_B', 'location'], axis = 1 )
df_long
</code></pre>

gives the desired result

<pre><code>    ind loc  ind_var  location_var  choice  dist_S
0     0   A        3            10       1       0
1     0   B        3            14       0      50
2     1   A        8            10       1       0
3     1   B        8            14       0      50
4     2   A       10            10       1       0
5     2   B       10            14       0      50
6     3   A        1            10       0      50
7     3   B        1            14       1       0
8     4   A        3            10       0      50
9     4   B        3            14       1       0
10    5   A        4            10       0      50
11    5   B        4            14       1       0
</code></pre>
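Following up on the "simpler ways" remark: one possible shortcut is to let `pd.melt` produce the long shape instead of reindexing and filling by hand. This is a sketch, not a drop-in replacement, and it assumes the `dist_to_*` columns already hold every pairwise distance, as they do in the toy data. The names `long`, `loc_var` and `result` are mine and do not appear in the code above.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({
    'location': ['A', 'A', 'A', 'B', 'B', 'B'],
    'dist_to_A': [0, 0, 0, 50, 50, 50],
    'dist_to_B': [50, 50, 50, 0, 0, 0],
    'location_var': [10, 10, 10, 14, 14, 14],
    'ind_var': [3, 8, 10, 1, 3, 4],
})
df.index.name = 'ind'

# One row per (individual, candidate location): each dist_to_* column
# becomes a block of rows, with the distance stored in dist_S.
long = df.reset_index().melt(
    id_vars=['ind', 'location', 'ind_var'],
    value_vars=['dist_to_A', 'dist_to_B'],
    var_name='loc',
    value_name='dist_S',
)

# 'dist_to_A' -> 'A', 'dist_to_B' -> 'B'
long['loc'] = long['loc'].str.replace('dist_to_', '', regex=False)

# The chosen alternative is the one matching the individual's own location.
long['choice'] = (long['loc'] == long['location']).astype(int)

# location_var describes the candidate location, so look it up by 'loc'.
loc_var = df.groupby('location')['location_var'].first()
long['location_var'] = long['loc'].map(loc_var)

result = (
    long.sort_values(['ind', 'loc'])
        .reset_index(drop=True)
        [['ind', 'loc', 'ind_var', 'location_var', 'choice', 'dist_S']]
)
print(result)
```

On the toy data this should reproduce the table above. It also avoids the hand-written `nested_loc` dictionary, at the cost of relying on the `dist_to_` column naming.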
Tags
Title
singulars
PostParentId
1. POComplicated (for me) reshaping from wide to long in Pandas
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. UScd98
UserOwnerUserId
1. UScd98
