StackOverflow2013


plurals

PO
primarykey
Id
20821065
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2013-12-29T00:58:25.400
FavoriteCount
0
LastActivityDate
2013-12-29T00:58:25.400
LastEditDate
LastEditorUserId
0
OwnerUserId
3083138
ParentId
20818522
PostTypeId
2
Score
2
ViewCount
0
LastEditorDisplayName
text
Body
<p>The code appended below works in Python 3.3 and produces the desired output, with a few minor caveats:</p> <ul> <li>It grabs the initial comment line from the first file that it processes, but doesn't bother to check that all of the other ones after that still match (i.e., if you have several files that start with #A and one that starts with #C, it won't reject the #C, even though it probably should). I mainly wanted to illustrate how the merge function would work in Python, and figured that adding in this type of miscellaneous validity check is best left as a "homework" problem.</li> <li>It also doesn't bother to check that the number of rows and columns match, and will likely crash if they don't. Consider it another minor homework problem.</li> <li>It prints all columns to the right of the first one as float values, since in some cases, that's what they might be. The initial column is treated as a label or line number, and is therefore printed as an integer value.</li> </ul> <p>You can call the code in almost the same way as before; e.g., if you name the script file merge.py, you can run <code>python merge.py data0001.dat data0002.dat</code> and it will print the merged average result to stdout just as with the bash script. The code also has added flexibility compared to one of the earlier answers: the way it's written, it should in principle (I haven't actually tested this to make sure) be able to merge files with any number of columns, not just files that have precisely three columns.
Another nice benefit: it doesn't keep files open after it is done with them; the <code>with open(name, 'r') as infile:</code> line is a Python idiom that automatically closes the file as soon as the script has finished reading from it, even though <code>close()</code> is never explicitly called.</p> <pre><code>#!/usr/bin/env python
import argparse
import re

# Give help description
parser = argparse.ArgumentParser(description='Merge some data files')
# Add to help description
parser.add_argument('fname', metavar='f', nargs='+',
                    help='Names of files to be merged')
# Parse the input arguments!
args = parser.parse_args()
argdct = vars(args)

topcomment = None
output = {}
# Loop over file names
for name in argdct['fname']:
    with open(name, "r") as infile:
        # Loop over lines in each file
        for line in infile:
            line = str(line)
            # Skip comment lines, except to take note of first one that
            # matches "#A"
            if re.search('^#', line):
                if re.search('^#A', line) is not None and topcomment is None:
                    topcomment = line
                continue
            items = line.split()
            # Skip blank lines, which would otherwise raise an IndexError
            if not items:
                continue
            # If a line matching this one has been encountered in a previous
            # file, add the column values
            currkey = float(items[0])
            if currkey in output:
                for ii in range(len(output[currkey])):
                    output[currkey][ii] += float(items[ii+1])
            # Otherwise, add a new key to the output and create the columns
            else:
                output[currkey] = list(map(float, items[1:]))

# Print the comment line (if one starting with "#A" was found)
if topcomment is not None:
    print(topcomment, end='')
# Get total number of files for calculating average
nfile = len(argdct['fname'])
# Sort the output keys
skey = sorted(output.keys())
# Loop through sorted keys and print each averaged column to stdout
for key in skey:
    outline = str(int(key))
    for item in output[key]:
        outline += ' ' + str(item/nfile)
    outline += '\n'
    print(outline, end='')
</code></pre>
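The two "homework" checks mentioned in the caveats could be sketched roughly as follows. This is only a minimal sketch and not part of the script above: `merge_average` is a hypothetical helper that takes the file contents as strings rather than file names, and it assumes that the first comment line of each file is the header that should match across files (the script itself only looks for a line starting with #A). It raises `ValueError` on a header mismatch, a column-count mismatch, or a row that does not appear in every file.

```python
def merge_average(texts):
    """Average matching rows across several data-file contents.

    Hypothetical helper for illustration: `texts` is a list of strings,
    one per file. Returns (header, {row_label: [averaged columns]}).
    """
    header = None
    sums = {}    # row label -> running column sums
    counts = {}  # row label -> number of times the row was seen
    for idx, text in enumerate(texts):
        file_header = None
        for line in text.splitlines():
            if line.startswith('#'):
                # Assumption: the first comment line is the header
                if file_header is None:
                    file_header = line
                continue
            items = line.split()
            if not items:
                continue  # ignore blank lines
            key = float(items[0])
            values = [float(x) for x in items[1:]]
            if key not in sums:
                sums[key] = values
                counts[key] = 1
                continue
            # Check 1: every occurrence of a row has the same column count
            if len(values) != len(sums[key]):
                raise ValueError('column count mismatch in row %d' % int(key))
            sums[key] = [a + b for a, b in zip(sums[key], values)]
            counts[key] += 1
        # Check 2: every file starts with the same header comment
        if idx == 0:
            header = file_header
        elif file_header != header:
            raise ValueError('header mismatch: %r vs %r' % (file_header, header))
    nfile = len(texts)
    # Check 3: every row appears exactly once in every file
    for key, n in counts.items():
        if n != nfile:
            raise ValueError('row %d does not appear once per file' % int(key))
    return header, {key: [v / nfile for v in sums[key]] for key in sorted(sums)}
```

To use it with files on disk, one could read each file inside a `with open(name, 'r') as infile:` block and pass the resulting list of strings to the function.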
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POcombine/average multiple data files
  singulars
  PostTypePostTypeId
  PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USstachyra
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POcombine/average multiple data files
  singulars
  PostTypePostTypeId
  PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTAcceptedByOriginator
2. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
CommentsPostId
1. This table or related slice is empty.
