StackOverflow2013

plurals

POReindexing pandas timeseries from object dtype to datetime dtype
primarykey
Id
13654699
data
AcceptedAnswerId
13655271
AnswerCount
1
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2012-11-30T23:35:15.957
FavoriteCount
10
LastActivityDate
2016-11-14T21:38:31.830
LastEditDate
2012-12-01T04:26:42.320
LastEditorUserId
1574687
OwnerUserId
1574687
ParentId
0
PostTypeId
1
Score
29
ViewCount
26738
LastEditorDisplayName
text
Body
I have a time series that is not recognized as a DatetimeIndex despite being indexed by standard YYYY-MM-DD strings with valid dates. Coercing them to a valid DatetimeIndex seems inelegant enough to make me think I'm doing something wrong. I read in (someone else's lazily formatted) data that contains invalid datetime values and remove these invalid observations.
<pre><code>In [1]: df = pd.read_csv('data.csv',index_col=0)

In [2]: print df['2008-02-27':'2008-03-02']
Out[2]:
            count
2008-02-27     20
2008-02-28      0
2008-02-29     27
2008-02-30      0
2008-02-31      0
2008-03-01      0
2008-03-02     17

In [3]: def clean_timestamps(df):
    # remove invalid dates like '2008-02-30' and '2009-04-31'
    to_drop = list()
    for d in df.index:
        try:
            datetime.date(int(d[0:4]),int(d[5:7]),int(d[8:10]))
        except ValueError:
            to_drop.append(d)
    df2 = df.drop(to_drop,axis=0)
    return df2

In [4]: df2 = clean_timestamps(df)

In [5]: print df2['2008-02-27':'2008-03-02']
Out[5]:
            count
2008-02-27     20
2008-02-28      0
2008-02-29     27
2008-03-01      0
2008-03-02     17
</code></pre>
This new index is still only recognized as an 'object' dtype rather than a DatetimeIndex.
<pre><code>In [6]: df2.index
Out[6]: Index([2008-01-01, 2008-01-02, 2008-01-03, ..., 2012-11-27, 2012-11-28, 2012-11-29], dtype=object)
</code></pre>
Reindexing produces NaNs because the two indexes have different dtypes.
<pre><code>In [7]: i = pd.date_range(start=min(df2.index),end=max(df2.index))

In [8]: df3 = df2.reindex(index=i,columns=['count'])

In [9]: df3['2008-02-27':'2008-03-02']
Out[9]:
            count
2008-02-27    NaN
2008-02-28    NaN
2008-02-29    NaN
2008-03-01    NaN
2008-03-02    NaN
</code></pre>
I create a fresh dataframe with the appropriate index, dump the data to a dictionary, then populate the new dataframe from the dictionary values (skipping missing values).
<pre><code>In [10]: df3 = pd.DataFrame(columns=['count'],index=i)

In [11]: values = dict(df2['count'])

In [12]: for d in i:
    try:
        df3.set_value(index=d,col='count',value=values[d.isoformat()[0:10]])
    except KeyError:
        pass

In [13]: print df3['2008-02-27':'2008-03-02']
Out[13]:
            count
2008-02-27     20
2008-02-28      0
2008-02-29     27
2008-03-01      0
2008-03-02     17

In [14]: df3.index
Out[14]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2008-01-01 00:00:00, ..., 2012-11-29 00:00:00]
Length: 1795, Freq: D, Timezone: None
</code></pre>
This last part, setting values via lookups in a dictionary keyed by strings, seems especially hacky and makes me think I've missed something important.
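For comparison, the dictionary lookup described above may be avoidable by converting the string labels directly. The following is only a minimal sketch, written for Python 3 and a recent pandas rather than the Python 2.7 session in the post, and it is not taken from the accepted answer. The small in-memory frame is a hypothetical stand-in for `data.csv`, and the names `parsed`, `valid`, and `full_range` are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question: string labels, two of them invalid dates.
df = pd.DataFrame(
    {"count": [20, 0, 27, 0, 0, 0, 17]},
    index=["2008-02-27", "2008-02-28", "2008-02-29", "2008-02-30",
           "2008-02-31", "2008-03-01", "2008-03-02"],
)

# Parse the labels; anything that is not a real calendar date becomes NaT.
parsed = pd.to_datetime(df.index, format="%Y-%m-%d", errors="coerce")

# Keep only the rows whose label parsed, then swap in the datetime labels.
valid = parsed.notna()
df2 = df[valid].copy()
df2.index = pd.DatetimeIndex(parsed[valid])

# With datetime labels on both sides, reindexing aligns rows instead of producing NaN.
full_range = pd.date_range(df2.index.min(), df2.index.max(), freq="D")
df3 = df2.reindex(full_range)
```

Under these assumptions `df3.index` is a DatetimeIndex and the counts carry over without a string-keyed lookup. Whether this matches the real file depends on how its invalid dates are written.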
Tags
<python><datetime><python-2.7><pandas>
Title
Reindexing pandas timeseries from object dtype to datetime dtype
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USBrian Keegan
UserOwnerUserId
1. USBrian Keegan
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POReindexing pandas timeseries from object dtype to datetime dtype
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POReindexing pandas timeseries from object dtype to datetime dtype
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POReindexing pandas timeseries from object dtype to datetime dtype
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. This table or related slice is empty.
