Note that there are some explanatory texts on larger screens.

plurals
  1. POHaving an issue changing index from integer to date in pandas
    primarykey
    data
    text
    <p>I'm having an issue changing a pandas DataFrame index to a datetime from an integer. I want to do it so that I can call reindex and fill in the dates between those listed in the table. Note that I have to use pandas 0.7.3 at the moment because I'm also using qstk, and qstk relies on pandas 0.7.3</p> <p>First, here's my layout:</p> <pre><code>(Pdb) df AAPL GOOG IBM XOM date 1 0 0 4000 0 2011-01-13 16:00:00 2 0 1000 4000 0 2011-01-26 16:00:00 3 0 1000 4000 0 2011-02-02 16:00:00 4 0 1000 4000 4000 2011-02-10 16:00:00 6 0 0 1800 4000 2011-03-03 16:00:00 7 0 0 3300 4000 2011-06-03 16:00:00 8 0 0 0 4000 2011-05-03 16:00:00 9 1200 0 0 4000 2011-06-10 16:00:00 11 1200 0 0 4000 2011-08-01 16:00:00 12 0 0 0 4000 2011-12-20 16:00:00 (Pdb) type(df['date']) &lt;class 'pandas.core.series.Series'&gt; (Pdb) df2 = DataFrame(index=df['date']) (Pdb) df2 Empty DataFrame Columns: array([], dtype=object) Index: array([2011-01-13 16:00:00, 2011-01-26 16:00:00, 2011-02-02 16:00:00, 2011-02-10 16:00:00, 2011-03-03 16:00:00, 2011-06-03 16:00:00, 2011-05-03 16:00:00, 2011-06-10 16:00:00, 2011-08-01 16:00:00, 2011-12-20 16:00:00], dtype=object) (Pdb) df2.merge(df,left_index=True,right_on='date') AAPL GOOG IBM XOM date 1 0 0 4000 0 2011-01-13 16:00:00 2 0 1000 4000 0 2011-01-26 16:00:00 3 0 1000 4000 0 2011-02-02 16:00:00 4 0 1000 4000 4000 2011-02-10 16:00:00 6 0 0 1800 4000 2011-03-03 16:00:00 8 0 0 0 4000 2011-05-03 16:00:00 7 0 0 3300 4000 2011-06-03 16:00:00 9 1200 0 0 4000 2011-06-10 16:00:00 11 1200 0 0 4000 2011-08-01 16:00:00 12 0 0 0 4000 2011-12-20 16:00:00 </code></pre> <p>I have tried multiple things to get a datetime index:</p> <p>1.) Using the reindex() method with a list of datetime values. This creates a datetime index, but then fills in NaNs for the data in the DataFrame. I'm guessing that this is because the original values are tied to the integer index and reindexing to datetime tries to fill the new indices with default values (NaNs if no fill method is indicated). Thusly:</p> <pre><code>(Pdb) df.reindex(index=df['date']) AAPL GOOG IBM XOM date date 2011-01-13 16:00:00 NaN NaN NaN NaN NaN 2011-01-26 16:00:00 NaN NaN NaN NaN NaN 2011-02-02 16:00:00 NaN NaN NaN NaN NaN 2011-02-10 16:00:00 NaN NaN NaN NaN NaN 2011-03-03 16:00:00 NaN NaN NaN NaN NaN 2011-06-03 16:00:00 NaN NaN NaN NaN NaN 2011-05-03 16:00:00 NaN NaN NaN NaN NaN 2011-06-10 16:00:00 NaN NaN NaN NaN NaN 2011-08-01 16:00:00 NaN NaN NaN NaN NaN 2011-12-20 16:00:00 NaN NaN NaN NaN NaN </code></pre> <p>2.) Using DataFrame.merge with my original df and a second dataframe, df2, that is basically just a datetime index with nothing else. So I end up doing something like:</p> <pre><code>(pdb) df2.merge(df,left_index=True,right_on='date') AAPL GOOG IBM XOM date 1 0 0 4000 0 2011-01-13 16:00:00 2 0 1000 4000 0 2011-01-26 16:00:00 3 0 1000 4000 0 2011-02-02 16:00:00 4 0 1000 4000 4000 2011-02-10 16:00:00 6 0 0 1800 4000 2011-03-03 16:00:00 8 0 0 0 4000 2011-05-03 16:00:00 7 0 0 3300 4000 2011-06-03 16:00:00 9 1200 0 0 4000 2011-06-10 16:00:00 11 1200 0 0 4000 2011-08-01 16:00:00 </code></pre> <p>(and vice-versa). But I always end up with this kind of thing, with integer indices.</p> <p>3.) Starting with an empty DataFrame with a datetime index (created from the 'date' field of df) and a bunch of empty columns. Then I attempt to assign each column by setting the columns with the same names to be equal to the columns from df:</p> <pre><code>(Pdb) df2['GOOG']=0 (Pdb) df2 GOOG date 2011-01-13 16:00:00 0 2011-01-26 16:00:00 0 2011-02-02 16:00:00 0 2011-02-10 16:00:00 0 2011-03-03 16:00:00 0 2011-06-03 16:00:00 0 2011-05-03 16:00:00 0 2011-06-10 16:00:00 0 2011-08-01 16:00:00 0 2011-12-20 16:00:00 0 (Pdb) df2['GOOG'] = df['GOOG'] (Pdb) df2 GOOG date 2011-01-13 16:00:00 NaN 2011-01-26 16:00:00 NaN 2011-02-02 16:00:00 NaN 2011-02-10 16:00:00 NaN 2011-03-03 16:00:00 NaN 2011-06-03 16:00:00 NaN 2011-05-03 16:00:00 NaN 2011-06-10 16:00:00 NaN 2011-08-01 16:00:00 NaN 2011-12-20 16:00:00 NaN </code></pre> <p>So, how in pandas 0.7.3 do I get df to be re-created with an datetime index instead of the integer index? What am I missing? </p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload