HDFStore.append(string, DataFrame) fails when string column contents are longer than those already there
    <p>I have a pandas DataFrame, stored via an HDFStore, that holds summary rows about the test runs I am doing.</p>
    <p>Several of the fields in each row contain descriptive strings of variable length.</p>
    <p>When I do a test run, I create a new DataFrame with a single row in it:</p>
    <pre><code>def export_as_df(self):
    return pd.DataFrame(data=[self._to_dict()], index=[datetime.datetime.now()])
</code></pre>
    <p>I then call <code>HDFStore.append(string, DataFrame)</code> to add the new row to the existing DataFrame.</p>
    <p>This works fine except when the contents of one of the string columns are longer than the longest value already stored, in which case I get the following error:</p>
    <pre><code>File "&lt;ipython-input-302-a33c7955df4a&gt;", line 516, in save_pytables
    store.append('tests', test.export_as_df())
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 532, in append
    self._write_to_group(key, value, table=True, append=True, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 788, in _write_to_group
    s.write(obj = value, append=append, complib=complib, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 2491, in write
    min_itemsize=min_itemsize, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 2254, in create_axes
    raise Exception("cannot find the correct atom type -&gt; [dtype-&gt;%s,items-&gt;%s] %s" % (b.dtype.name, b.items, str(detail)))
Exception: cannot find the correct atom type -&gt; [dtype-&gt;object,items-&gt;Index([bp, id, inst, per, sp, st, title], dtype=object)] [values_block_3] column has a min_itemsize of [51] but itemsize [46] is required!
</code></pre>
    <p>I can't find any documentation on how to specify the string length when creating a DataFrame. What is the solution here?</p>
    <p>Update:</p>
    <p>Code that is failing:</p>
    <pre><code>store = pd.HDFStore(pytables_store)
for test in self.backtests:
    try:
        min_itemsizes = {'buy_pattern': 60, 'sell_pattern': 60, 'strategy': 60, 'title': 60}
        store.append('tests', test.export_as_df(), min_itemsize=min_itemsizes)
</code></pre>
    <p>Here's the error under 0.11rc1:</p>
    <pre><code>File "&lt;ipython-input-110-492b7b6603d7&gt;", line 522, in save_pytables
    store.append('tests', test.export_as_df(), min_itemsize = min_itemsizes)
File "/Users/admin/dev/pandas/pandas-0.11.0rc1/pandas/io/pytables.py", line 610, in append
    self._write_to_group(key, value, table=True, append=True, **kwargs)
File "/Users/admin/dev/pandas/pandas-0.11.0rc1/pandas/io/pytables.py", line 871, in _write_to_group
    s.write(obj = value, append=append, complib=complib, **kwargs)
File "/Users/admin/dev/pandas/pandas-0.11.0rc1/pandas/io/pytables.py", line 2707, in write
    min_itemsize=min_itemsize, **kwargs)
File "/Users/admin/dev/pandas/pandas-0.11.0rc1/pandas/io/pytables.py", line 2447, in create_axes
    self.validate_min_itemsize(min_itemsize)
File "/Users/admin/dev/pandas/pandas-0.11.0rc1/pandas/io/pytables.py", line 2184, in validate_min_itemsize
    raise ValueError("min_itemsize has [%s] which is not an axis or data_column" % k)
ValueError: min_itemsize has [buy_pattern] which is not an axis or data_column
</code></pre>
    <p>Data sample:</p>
    <pre><code>all_day buy_pattern \
2013-04-14 12:11:44.377695 False Hammer() and LowerLow()

id instrument \
2013-04-14 12:11:44.377695 tafdcc96ba4eb11e2a86d14109fcecd49 EURUSD

open_margin periodicity sell_pattern strategy \
2013-04-14 12:11:44.377695 0.0001 1:00:00 Tsl()

title top_bottom wick_body
2013-04-14 12:11:44.377695 tsl 0.5 2
</code></pre>
    <p>dtypes:</p>
    <pre><code>print prob_test.export_as_df().get_dtype_counts()
bool       1
float64    2
int64      1
object     7
dtype: int64
</code></pre>
    <p>I delete the h5 file before each run because I want clean results. Could it be failing simply because the DataFrame, and hence its columns, does not yet exist in the h5 file at the time of the first append?</p>
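    <p>To make the width problem concrete, here is a minimal sketch, not taken from the original post, of how the widths for a <code>min_itemsize</code> dict might be derived from the rows themselves instead of being hard-coded to 60. The helper name <code>string_widths</code>, the <code>padding</code> parameter, and the second sample row are hypothetical; the sketch uses only the standard library and does not touch pandas.</p>

```python
def string_widths(rows, padding=0):
    """Return {column: widest string size in bytes, plus padding}.

    `rows` is an iterable of dicts, like the output of a _to_dict()
    method. Non-string fields are ignored because only string columns
    are stored with a fixed item size.
    """
    widths = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                size = len(value.encode("utf-8")) + padding
                widths[name] = max(widths.get(name, 0), size)
    return widths


# First row uses values from the data sample above; the second row is
# hypothetical and only there to show a longer later value.
rows = [
    {"buy_pattern": "Hammer() and LowerLow()", "sell_pattern": "Tsl()",
     "title": "tsl", "open_margin": 0.0001},
    {"buy_pattern": "Hammer()", "sell_pattern": "Tsl() and UpperHigh()",
     "title": "tsl v2", "open_margin": 0.0002},
]

# Leave headroom so that somewhat longer strings still fit later on.
min_itemsizes = string_widths(rows, padding=10)
print(min_itemsizes)
```

    <p>The resulting dict could then be passed on the first append, as in <code>store.append('tests', df, min_itemsize=min_itemsizes)</code>. Whether column-name keys are accepted depends on the pandas version: the 0.11rc1 traceback above shows keys being rejected when they are not an axis or data column.</p>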