How do I cycle through a CSV in Python, writing the lines that meet new criteria to a new file?
<p>I've been at this a while now, and I think it is in my best interest to ask the advice of the experts. I know I'm not writing this the best way possible, and I've gone down a rabbit hole and confused myself.</p>
<p>I have a CSV. A bunch, actually. That part is not the problem.</p>
<p>The lines at the top of the CSV are not really CSV data, but they do contain an important piece of info: the date for which the data is valid. For some kinds of report it is on one line, and for others it is on another.</p>
<p>My data starts on some line down from the top, usually 10 or 11, but I can't always be certain. I do know that the first column always has the same info (the header of the table of data).</p>
<p>I want to pull the report date from the preceding lines, and for file type A, do stuffA, and for file type B, do stuffB, then write out that row to a new file. I'm having a problem incrementing the row and I have no idea what I'm doing wrong.</p>
<p>Sample data:</p>
<pre><code>"Attribute ""OPSURVEYLEVEL2_O"" [Category = ""Retail v1""]"
Date exported: 2/16/13
Exported by user: William
Project:
Classification: Online Retail v1
Report type: Attributes
Date range: from 12/14/12 to 12/14/12
"Filter OpSurvey Level 2(mine): [ LEVEL:SENTENCE TYPE:KEYWORD {OPSURVEYLEVEL2_O:""gift certificate redemption"", OPSURVEYLEVEL2_O:""combine accounts"", OPSURVEYLEVEL2_O:""cancel account"", OPSURVEYLEVEL2_O:""saved project moved to purchased project"", OPSURVEYLEVEL2_O:""unlock account"", OPSURVEYLEVEL2_O:""affiliate promotions"", OPSURVEYLEVEL2_O:""print to store coupons"", OPSURVEYLEVEL2_O:""disclaimer not clear"", OPSURVEYLEVEL2_O:""prepaid issue"", OPSURVEYLEVEL2_O:""customer wants to use coupons for print to store"", OPSURVEYLEVEL2_O:""customer received someone else's order"", OPSURVEYLEVEL2_O:""hi-res images unavailable"", OPSURVEYLEVEL2_O:""how to re-order"", OPSURVEYLEVEL2_O:""missing items"", OPSURVEYLEVEL2_O:""missing envelopes: print to store"", OPSURVEYLEVEL2_O:""missing 
envelopes: mail order"", OPSURVEYLEVEL2_O:""group rooms"", OPSURVEYLEVEL2_O:""print to store"", OPSURVEYLEVEL2_O:""print to store coupons"", OPSURVEYLEVEL2_O:""publisher: card not available for print to store"", OPSURVEYLEVEL2_O:publisher}]"
Total: 905
OPSURVEYLEVEL2_O,Distinct Document,% of Document,Sentiment Score
PRINT TO STORE,297,32.82,-0.1
...
</code></pre>
<p>Sample code:</p>
<pre><code>#!/usr/bin/python
import csv, os, glob, sys, errno

path = '/path/to/Downloads'

for infile in glob.glob(os.path.join(path,'report_ATTRIBUTE_OP*.csv')):
    if 'OPSURVEYLEVEL2' in infile:
        prime_column = 'ops2'
    elif 'OPSURVEYLEVEL3' in infile:
        prime_column = 'ops3'
    else:
        sys.exit(errno.ENOENT)
    with open(infile, "r") as csvfile:
        reader = csv.reader(csvfile)
        report_date = 'DATE NOT FOUND'
        # import pdb; pdb.set_trace()
        for row in reader:
            foo = 0
            while foo &lt; 1:
                if row[0][0:].find('OPSURVEYLEVEL') == 0:
                    foo = 1
                if "Date range" in row:
                    report_date = row[0][-8:]
                    break
            if foo &gt;= 1:
                if row[0][0:].find('OPSURVEYLEVEL') == 0:
                    break
                if 'ops2' in prime_column:
                    dup_col = row[0]
                    row.insert(0,dup_col)
                    row.append(report_date)
                elif 'ops3' in prime_column:
                    row.append(report_date)
                with open('report_merge.csv', 'a') as outfile:
                    outfile.write(row)
            reader.next()
</code></pre>
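<p>For comparison, here is a minimal sketch of one way the goal described above could be structured: a single pass over the rows that remembers the report date, waits for the header row, and only then collects data rows. It is an illustration under assumptions, not a definitive fix. It assumes Python 3, that the date line always looks like the "Date range: from ... to ..." line in the sample, that the end date of the range is the one wanted, and that the header row is the only row whose first column starts with OPSURVEYLEVEL. The names DATE_RANGE, split_report, transform_row and merge_report are hypothetical.</p>

```python
import csv
import re

# Assumed format of the date line, based only on the sample data above.
DATE_RANGE = re.compile(r"Date range: from (\S+) to (\S+)")


def split_report(lines, header_prefix="OPSURVEYLEVEL"):
    """Return (report_date, header, data_rows) from an iterable of text lines."""
    report_date = "DATE NOT FOUND"
    header = None
    data_rows = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if header is not None:
            # Everything after the header row is treated as table data.
            data_rows.append(row)
            continue
        match = DATE_RANGE.search(row[0])
        if match:
            report_date = match.group(2)  # end date of the range (assumption)
        elif row[0].startswith(header_prefix):
            header = row
    return report_date, header, data_rows


def transform_row(row, report_date, prime_column):
    """Apply the per-file-type change: 'ops2' duplicates the first column."""
    out = list(row)
    if prime_column == "ops2":
        out.insert(0, out[0])
    out.append(report_date)
    return out


def merge_report(in_file, out_file, prime_column):
    """Write the transformed data rows of one report; return the row count."""
    report_date, header, data_rows = split_report(in_file)
    if header is None:
        raise ValueError("no header row starting with OPSURVEYLEVEL found")
    writer = csv.writer(out_file)
    for row in data_rows:
        writer.writerow(transform_row(row, report_date, prime_column))
    return len(data_rows)
```

<p>In this sketch the header row itself is not written out. With real files, both the input and the output (opened in append mode for report_merge.csv) would be opened with newline="", as the csv module documentation recommends, and merge_report would be called once per file found by glob.</p>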