
  1. Numpy: with dtype=None, cannot slice columns; with dtype=object, cannot set dtype.names
    <p>I am running Python 2.6. I have the following example where I am trying to concatenate the date and time string columns from a csv file. Based on the dtype I set (None vs object), I am seeing some differences in behavior that I cannot explain; see Question 1 and 2 at the end of the post. The exception returned is not very descriptive, and the dtype documentation doesn't mention any specific behavior to expect when dtype is set to object.</p>
    <p>Here is the snippet:</p>
<pre><code>#! /usr/bin/python

import numpy as np

# simulate a csv file
from StringIO import StringIO
data = StringIO("""
Title
Date,Time,Speed
,,(m/s)
2012-04-01,00:10, 85
2012-04-02,00:20, 86
2012-04-03,00:30, 87
""".strip())

# (Fail) case 1: dtype=None slicing a column fails

next(data)  # eat away the title line
header = [item.strip() for item in next(data).split(',')]  # get the headers
arr1 = np.genfromtxt(data, dtype=None, delimiter=',',skiprows=1)  # skiprows=1 for the row with units
arr1.dtype.names = header  # assign the header to names
                           # so we can do y=arr['Speed']
y1 = arr1['Speed']

# Q1 IndexError: invalid index
#a1 = arr1[:,0]
#print a1

# EDIT1:
print "arr1.shape "
print arr1.shape  # (3,)

# Fails as expected TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'
# z1 = arr1['Date'] + arr1['Time']
# This can be worked around by specifying dtype=object, which leads to case 2

data.seek(0)  # resets

# (Fail) case 2: dtype=object assign header fails
next(data)  # eat away the title line
header = [item.strip() for item in next(data).split(',')]  # get the headers
arr2 = np.genfromtxt(data, dtype=object, delimiter=',',skiprows=1)  # skiprows=1 for the row with units

# Q2 ValueError: there are no fields defined
#arr2.dtype.names = header  # assign the header to names,
                            # so we can use it to do indexing
                            # ie y=arr['Speed']
# y2 = arr['Date'] + arr['Time']  # column headings were assigned previously by arr.dtype.names = header

data.seek(0)  # resets

# (Good) case 3: dtype=object but don't assign headers
next(data)  # eat away the title line
header = [item.strip() for item in next(data).split(',')]  # get the headers
arr3 = np.genfromtxt(data, dtype=object, delimiter=',',skiprows=1)  # skiprows=1 for the row with units
y3 = arr3[:,0] + arr3[:,1]  # slice the columns
print y3

# case 4: dtype=None, all data are ints, array dimension 2-D

# simulate a csv file
from StringIO import StringIO
data2 = StringIO("""
Title
Date,Time,Speed
,,(m/s)
45,46,85
12,13,86
50,46,87
""".strip())

next(data2)  # eat away the title line
header = [item.strip() for item in next(data2).split(',')]  # get the headers
arr4 = np.genfromtxt(data2, dtype=None, delimiter=',',skiprows=1)  # skiprows=1 for the row with units
#arr4.dtype.names = header  # Value error
print "arr4.shape "
print arr4.shape  # (3,3)

data2.seek(0)  # resets
</code></pre>
    <p><strong>Question 1:</strong> At comment Q1, why can I not slice a column when dtype=None? This could be avoided if a) arr1 = np.genfromtxt(...) was initialized with dtype=object like case 3, b) arr1.dtype.names = ... was commented out to avoid the ValueError in case 2.</p>
    <p><strong>Question 2:</strong> At comment Q2, why can I not set the dtype.names when dtype=object?</p>
    <p><strong>EDIT1:</strong></p>
    <p>Added a case 4 that shows that the array would be 2-D if the values in the simulated csv file are all ints instead. One can slice the column, but assigning the dtype.names would still fail.</p>
    <p>Updated the term 'splice' to 'slice'.</p>
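    <p>One likely reading of both questions, offered as a sketch rather than a definitive answer: with dtype=None and columns of mixed type, genfromtxt returns a 1-D structured array (one record per row, hence the shape (3,)), so columns are reached by field name and arr1[:,0] has no second axis to index. A plain 2-D array has no fields at all, which would explain why dtype.names cannot be assigned in case 2. The example below assumes Python 3 and a recent NumPy rather than the Python 2.6 setup above. The names CSV_TEXT, rec, stamps and plain are illustrative, and the object array is built by hand instead of through genfromtxt(dtype=object).</p>

```python
import io

import numpy as np

# Same simulated csv as in the question (illustrative constant name).
CSV_TEXT = """Title
Date,Time,Speed
,,(m/s)
2012-04-01,00:10, 85
2012-04-02,00:20, 86
2012-04-03,00:30, 87
"""

# dtype=None lets genfromtxt pick a type per column. Because the columns
# differ in type, the result is a 1-D structured array with named fields.
rec = np.genfromtxt(
    io.StringIO(CSV_TEXT),
    dtype=None,
    delimiter=",",
    skip_header=3,                    # title, header and units rows
    names=["Date", "Time", "Speed"],  # field names given up front
    autostrip=True,
    encoding="utf-8",
)

print(rec.shape)        # one axis only, so rec[:, 0] cannot work
print(rec.dtype.names)  # fields are addressed by name instead

# String fields cannot be combined with "+"; np.char.add concatenates
# them element by element.
stamps = np.char.add(np.char.add(rec["Date"], " "), rec["Time"])
print(stamps)

# A plain 2-D object array (assumption: hand-built stand-in for the
# dtype=object case). It has no fields, so there are no names to assign,
# but column slicing and "+" on the Python strings both work.
plain = np.array([list(row) for row in rec.tolist()], dtype=object)
print(plain.shape, plain.dtype.names)
joined = plain[:, 0] + " " + plain[:, 1]
print(joined)
```

    <p>Passing names to genfromtxt directly avoids assigning dtype.names after the fact, which only makes sense when the array already has fields.</p>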