Problems handling multiple data types in CSV input with Python
<p>I have a csv dump that I'm trying to import to run analysis on metrics therein, cherry picking certain metrics to look at. Some of the cells are strings and some are numbers. However, I can't get csv.reader to handle the numbers properly. A snippet:</p>
<pre><code>with open('t0.csv', 'rU') as file:
    reader = csv.reader(file, delimiter=",", quotechar='|')
    reader.next() # Burn header row
    for row in reader:
        if row[0] != "": # Burn footer row
            t0_company.extend([unicode(row[3], 'utf-8')])
            t0_revenue.extend([row[9]])
            t0_clicks.extend([row[10]])
            t0_pageviews.extend([row[11]])
            t0_avg_cpc.extend([row[13]])
            t0_monthly_budget.extend([row[16]])
</code></pre>
<p>I input another file of the same format for metrics at t1. Then I create two dicts for each metric (one at t0 and the other at t1) with the form metric_dict = {'company': 'metric'} like this:</p>
<pre><code>metric = dict(zip(company, metric))
</code></pre>
<p>Running simple math on these metrics is problematic however:</p>
<pre><code>percent_change = float(t1_metric_dict[company]) / float(t0_metric_dict[company]) - 1
</code></pre>
<p>Returns errors like:</p>
<pre><code>Traceback (most recent call last):
  File "report.py", line 104, in &lt;module&gt;
    start_revenue_dict[company], end_revenue_dict[company], float(end_revenue_dict[company]) / float(start_revenue_dict[company]) - 1,
ValueError: could not convert string to float: "6.18"
</code></pre>
<p>It seems to pick the same number to complain about every time.</p>
<p>I'm fairly certain the error happens in the division as everything works normally if I swap in a placeholder string as the third element.</p>
<p>I also tried using quoting=csv.QUOTE_NONNUMERIC, changing the second line in the first snippet to</p>
<pre><code>reader = csv.reader(file, delimiter=",", quotechar='|', quoting=csv.QUOTE_NONNUMERIC)
</code></pre>
<p>Which gets me this error:</p>
<pre><code>Traceback (most recent call last):
  File "report.py", line 30, in &lt;module&gt;
    reader.next()
ValueError: could not convert string to float: "Type"
</code></pre>
<p>I've tried making sure the csv doesn't have any weird cell types (everything is text) even though I doubt it matters. I'd appreciate any help on this one.</p>
<p>------ Update ------</p>
<p>One of the columns in my input file contains email addresses. As an experiment, I removed all @s from the input docs which changed the error message I'm getting to:</p>
<pre><code>Traceback (most recent call last):
  File "report.py", line 129, in &lt;module&gt;
    unicode_row = [str(item).encode('utf8') for item in utf8_row]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 2: ordinal not in range(128)
</code></pre>
<p>The code it's referencing is the csv out section:</p>
<pre><code>writer = csv.writer(open('output.csv', 'wb'), delimiter=",", quotechar='|')
for utf8_row in report:
    unicode_row = [str(item).encode('utf8') for item in utf8_row]
    writer.writerow(unicode_row)
</code></pre>
<p>----- Update #2 -----</p>
<p>As requested, here is the full snippet that's causing problems:</p>
<pre><code>for company in companies_in_both:
    report.append([company,
        start_revenue_dict[company], end_revenue_dict[company], float(end_revenue_dict[company]) / float(start_revenue_dict[company]) - 1,
        start_clicks_dict[company], end_clicks_dict[company], float(end_clicks_dict[company]) / float(start_clicks_dict[company]) - 1,
        start_pageviews_dict[company], end_pageviews_dict[company], float(end_pageviews_dict[company]) / float(start_pageviews_dict[company]) - 1,
        start_avg_cpc_dict[company], end_avg_cpc_dict[company], float(end_avg_cpc_dict[company]) / float(start_avg_cpc_dict[company]) - 1,
        start_monthly_budget_dict[company], end_monthly_budget_dict[company], float(end_monthly_budget_dict[company]) / float(start_monthly_budget_dict[company]) - 1])
</code></pre>
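<p>One possible reading of the tracebacks above, offered as an assumption rather than a confirmed diagnosis: the failing values are printed as <code>"6.18"</code> and <code>"Type"</code> with literal double quotes, which would be consistent with a file whose fields are quoted with <code>"</code> while the reader is told <code>quotechar='|'</code>, so the quotes stay inside each cell. The sketch below is written for Python 3 (the snippets in the question are Python 2). The sample data, the column positions, and the helper names <code>to_float</code> and <code>read_metric</code> are hypothetical and only illustrate the idea.</p>

```python
import csv
import io


def to_float(cell):
    """Convert a CSV cell to float, tolerating stray double quotes and whitespace.

    Returns None when the cell is empty or not numeric.
    """
    cleaned = cell.strip().strip('"')
    try:
        return float(cleaned)
    except ValueError:
        return None


def read_metric(lines, key_col, value_col):
    """Map the text in key_col to the float in value_col, skipping the header row."""
    # Assumption: fields are quoted with the standard double quote.
    reader = csv.reader(lines, delimiter=",", quotechar='"')
    next(reader)  # skip header row
    result = {}
    for row in reader:
        if not row or row[0] == "":  # skip blank and footer rows
            continue
        value = to_float(row[value_col])
        if value is not None:
            result[row[key_col]] = value
    return result


# With quotechar='|', a double quote is an ordinary character and stays in the cell.
unstripped = next(csv.reader(['"6.18"'], quotechar='|'))[0]

# Hypothetical sample data standing in for the t0 and t1 files.
t0_sample = io.StringIO('Type,Company,Revenue\n"a","Café Uno","6.18"\n"b","Acme","10"\n,,\n')
t1_sample = io.StringIO('Type,Company,Revenue\n"a","Café Uno","9.27"\n"b","Acme","12.5"\n,,\n')

t0_revenue = read_metric(t0_sample, key_col=1, value_col=2)
t1_revenue = read_metric(t1_sample, key_col=1, value_col=2)

# Percent change for companies present in both periods, avoiding division by zero.
percent_change = {
    company: t1_revenue[company] / t0_revenue[company] - 1
    for company in t0_revenue.keys() & t1_revenue.keys()
    if t0_revenue[company] != 0
}

# In Python 3 the csv module writes text, so non-ASCII names need no manual encode().
out = io.StringIO()
writer = csv.writer(out)
for company in sorted(percent_change):
    writer.writerow([company, t0_revenue[company], t1_revenue[company], percent_change[company]])
```

<p>When writing to a real file in Python 3, opening it with <code>open('output.csv', 'w', newline='', encoding='utf-8')</code> lets the file object handle the encoding, which sidesteps the <code>str(item).encode('utf8')</code> step that raises the <code>UnicodeEncodeError</code> in Python 2.</p>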