StackOverflow2013


plurals

Python: appending/merging multiple csv files respecting headers and write to csv
primarykey
Id
17192115
data
AcceptedAnswerId
17192320
AnswerCount
1
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2013-06-19T13:11:11.103
FavoriteCount
0
LastActivityDate
2016-10-11T07:11:17.730
LastEditDate
LastEditorUserId
0
OwnerUserId
2445114
ParentId
0
PostTypeId
1
Score
1
ViewCount
6539
LastEditorDisplayName
text
Body
[Using Python3] I'm very new to (Python) programming, but I am nonetheless writing a script that scans a folder for certain csv files. I then want to read them all, append them, and write them into another csv file. In between, only the rows where the values in a certain column match a set of criteria should be kept.

All csv files have the same columns, and would look something like this:

<pre><code>header1  header2  header3  header4  ...
string   float    string   float    ...
string   float    string   float    ...
string   float    string   float    ...
string   float    string   float    ...
...      ...      ...      ...      ...
</code></pre>

The code I'm working with right now is below; however, it keeps overwriting the data from the previous file. That does make sense to me, I just cannot figure out how to get it working.

Code:

<pre><code>import csv
import datetime
import sys
import glob
import itertools
from collections import defaultdict

# Raw data files have the format like '2013-06-04'. To be able to use this script
# during the whole of 2013, the glob is set to search for the pattern '2013-*.csv'
files = [f for f in glob.glob('2013-*.csv')]

# Output file looks like '20130620-filtered.csv'
outfile = '{:%Y%m%d}-filtered.csv'.format(datetime.datetime.now())

# List of 'Header4' values to be filtered for writing output
header4 = ['string1', 'string2', 'string3', 'string4']

for f in files:
    with open(f, 'r') as f_in:
        dict_reader = csv.DictReader(f_in)

        with open(outfile, 'w') as f_out:
            dict_writer = csv.DictWriter(f_out, lineterminator='\n', fieldnames=dict_reader.fieldnames)
            dict_writer.writeheader()
            for row in dict_reader:
                if row['Campaign'] in campaign_names:
                    dict_writer.writerow(row)
</code></pre>

I also tried something like <code>readers = list(itertools.chain(*map(lambda f: csv.DictReader(open(f)), files)))</code> and iterating over the readers, but then I cannot figure out how to work with the headers (I get the error that itertools.chain() does not have the fieldnames attribute).

Any help is very much appreciated!
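For illustration, one possible way around the overwriting is to open the output file once, before looping over the inputs, and to write the header only for the first file. This is a minimal sketch and not necessarily the approach taken in the accepted answer. The function name `merge_filtered` and its parameters are hypothetical, and the check that every file shares the same header is an added assumption based on the statement that all csv files have the same columns.

```python
import csv


def merge_filtered(in_paths, out_path, column, allowed):
    """Append matching rows from several CSV files into one output file.

    Hypothetical helper: `column` is the header to filter on and `allowed`
    is the collection of values to keep.
    """
    allowed = set(allowed)
    header = None
    writer = None
    # Open the output once, outside the loop. Reopening it with mode 'w'
    # for every input file truncates whatever was written before.
    with open(out_path, "w", newline="") as f_out:
        for path in in_paths:
            with open(path, newline="") as f_in:
                reader = csv.DictReader(f_in)
                if reader.fieldnames is None:
                    continue  # empty input file, nothing to copy
                if writer is None:
                    # First non-empty file: take its header and write it once.
                    header = list(reader.fieldnames)
                    writer = csv.DictWriter(
                        f_out, fieldnames=header, lineterminator="\n"
                    )
                    writer.writeheader()
                elif list(reader.fieldnames) != header:
                    raise ValueError("header mismatch in {}".format(path))
                for row in reader:
                    if row[column] in allowed:
                        writer.writerow(row)
```

With the names from the question, a call might look like `merge_filtered(sorted(glob.glob('2013-*.csv')), outfile, 'Campaign', campaign_names)`. As for the `itertools.chain()` attempt: chaining the readers yields only the row dictionaries, so the chained object has no `fieldnames`; the header has to be taken from an individual `DictReader`, as above.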
Tags
<python><csv><python-3.x>
Title
Python: appending/merging multiple csv files respecting headers and write to csv
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USMatthijs
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POPython: appending/merging multiple csv files respecting headers and write to csv
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. This table or related slice is empty.
