  1. Python: appending/merging multiple CSV files respecting headers and writing to CSV
    <p>[Using Python 3] I'm very new to (Python) programming, but I'm writing a script that scans a folder for certain CSV files. I then want to read them all, append them, and write the result to another CSV file.</p>
    <p>Along the way, only rows whose value in a certain column matches a set of criteria should be written.</p>
    <p>All CSV files have the same columns and look something like this:</p>
<pre><code>header1 header2 header3 header4 ...
string  float   string  float   ...
string  float   string  float   ...
string  float   string  float   ...
string  float   string  float   ...
...     ...     ...     ...     ...
</code></pre>
    <p>The code I'm working with right now is below. However, it keeps overwriting the data from the previous file. That makes sense to me, but I cannot figure out how to get it working.</p>
    <p>Code:</p>
<pre><code>import csv
import datetime
import sys
import glob
import itertools
from collections import defaultdict

# Raw data files have names like '2013-06-04'. To be able to use this script
# during the whole of 2013, the glob is set to search for the pattern '2013-*.csv'
files = [f for f in glob.glob('2013-*.csv')]

# Output file looks like '20130620-filtered.csv'
outfile = '{:%Y%m%d}-filtered.csv'.format(datetime.datetime.now())

# List of 'header4' values to be filtered for writing output
header4 = ['string1', 'string2', 'string3', 'string4']

for f in files:
    with open(f, 'r') as f_in:
        dict_reader = csv.DictReader(f_in)

        with open(outfile, 'w') as f_out:
            dict_writer = csv.DictWriter(f_out, lineterminator='\n', fieldnames=dict_reader.fieldnames)
            dict_writer.writeheader()
            for row in dict_reader:
                if row['header4'] in header4:
                    dict_writer.writerow(row)
</code></pre>
    <p>I also tried something like <code>readers = list(itertools.chain(*map(lambda f: csv.DictReader(open(f)), files)))</code> and then iterating over the readers, but then I cannot figure out how to work with the headers (I get an error saying that itertools.chain() does not have the fieldnames attribute).</p>
    <p>Any help is very much appreciated!</p>
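    <p>One possible way around the overwriting, offered as a minimal sketch rather than a definitive answer: open the output file once, before the loop, and write the header only when the first input file is read. The function name <code>merge_filtered_csvs</code> and its parameters are hypothetical, and the sketch assumes (as stated above) that every input file has the same columns.</p>

```python
import csv


def merge_filtered_csvs(in_paths, out_path, column, allowed_values):
    """Append matching rows from every input file into one output file.

    Hypothetical helper: assumes all input files share the same header row.
    Returns the number of data rows written.
    """
    allowed = set(allowed_values)
    rows_written = 0

    # Open the output exactly once, outside the loop. Re-opening it with
    # mode 'w' for each input file truncates it, which is why earlier
    # data disappears.
    with open(out_path, 'w', newline='') as f_out:
        writer = None
        for path in in_paths:
            with open(path, 'r', newline='') as f_in:
                reader = csv.DictReader(f_in)
                if reader.fieldnames is None:
                    continue  # empty file: no header, nothing to copy
                if writer is None:
                    # Take the header from the first non-empty file and
                    # write it a single time.
                    writer = csv.DictWriter(
                        f_out,
                        fieldnames=reader.fieldnames,
                        lineterminator='\n',
                    )
                    writer.writeheader()
                for row in reader:
                    if row[column] in allowed:
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

    <p>With the names from the question, this could be called as <code>merge_filtered_csvs(sorted(glob.glob('2013-*.csv')), outfile, 'header4', header4)</code>. Sorting the paths is an added assumption so that rows are appended in date order.</p>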