<p>You picked an example that seems very simple on the surface but is actually fairly complicated behind the scenes. It ends up storing 3 different blocks of data (1 for each dtype), and each of these stores an index and the data.</p>
<p>The object you stored is in what I call a <code>Storer</code> format, meaning the numpy arrays are written all at once, so once written they cannot be changed. See the docs here: <a href="http://pandas.pydata.org/pandas-docs/dev/io.html#hdf5-pytables" rel="nofollow">http://pandas.pydata.org/pandas-docs/dev/io.html#hdf5-pytables</a></p>
<p>PyTables docs are here: <a href="http://pytables.github.io/usersguide/libref/declarative_classes.html#the-atom-class-and-its-descendants" rel="nofollow">http://pytables.github.io/usersguide/libref/declarative_classes.html#the-atom-class-and-its-descendants</a></p>
<p>Unfortunately, in this particular storage format the strings are stored as a Python pickle, so I don't know if you can decode them cross-platform.</p>
<p>You will have an easier time reading a <code>Table</code> object, which is stored using more basic types that are easily exported (there is a section in the docs on exporting to R, for example).</p>
<p>Try reading this format:</p>
<pre><code>In [1]: import pandas as pd

In [2]: df = pd.DataFrame({0: [1, 2, 3], 1: ["a", "b", "c"], 2: [1.5, 2.5, 3.5]})

In [4]: h = pd.HDFStore('tmp.h5')

In [6]: h.put('df', df, table=True)

In [7]: h.close()
</code></pre>
<p>Running the PyTables <code>ptdump -avd tmp.h5</code> utility on this file yields the following. If you are using PyTables 3.0.0 or later (which just came out), or py3 (which we are going to support in 0.11.1), then strings are all utf-8 encoded and written as bytes.
Prior to PyTables 3.0.0, strings are written as ASCII, I believe.</p>
<pre><code>/ (RootGroup) ''
  /._v_attrs (AttributeSet), 4 attributes:
   [CLASS := 'GROUP',
    PYTABLES_FORMAT_VERSION := '2.0',
    TITLE := '',
    VERSION := '1.0']
/df (Group) ''
  /df._v_attrs (AttributeSet), 12 attributes:
   [CLASS := 'GROUP',
    TITLE := '',
    VERSION := '1.0',
    data_columns := [],
    index_cols := [(0, 'index')],
    levels := 1,
    nan_rep := b'nan',
    non_index_axes := b"(lp1\n(I1\n(lp2\ncnumpy.core.multiarray\nscalar\np3\n(cnumpy\ndtype\np4\n(S'i8'\nI0\nI1\ntRp5\n(I3\nS'&lt;'\nNNNI-1\nI-1\nI0\ntbS'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp6\nag3\n(g5\nS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp7\nag3\n(g5\nS'\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp8\natp9\na.",
    pandas_type := b'frame_table',
    pandas_version := b'0.10.1',
    table_type := b'appendable_frame',
    values_cols := ['values_block_0', 'values_block_1', 'values_block_2']]
/df/table (Table(3,)) ''
  description := {
  "index": Int64Col(shape=(), dflt=0, pos=0),
  "values_block_0": Float64Col(shape=(1,), dflt=0.0, pos=1),
  "values_block_1": Int64Col(shape=(1,), dflt=0, pos=2),
  "values_block_2": StringCol(itemsize=1, shape=(1,), dflt=b'', pos=3)}
  byteorder := 'little'
  chunkshape := (2621,)
  autoindex := True
  colindexes := {
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False}
  /df/table._v_attrs (AttributeSet), 19 attributes:
   [CLASS := 'TABLE',
    FIELD_0_FILL := 0,
    FIELD_0_NAME := 'index',
    FIELD_1_FILL := 0.0,
    FIELD_1_NAME := 'values_block_0',
    FIELD_2_FILL := 0,
    FIELD_2_NAME := 'values_block_1',
    FIELD_3_FILL := b'',
    FIELD_3_NAME := 'values_block_2',
    NROWS := 3,
    TITLE := '',
    VERSION := '2.6',
    index_kind := b'integer',
    values_block_0_dtype := b'float64',
    values_block_0_kind := b"(lp1\ncnumpy.core.multiarray\nscalar\np2\n(cnumpy\ndtype\np3\n(S'i8'\nI0\nI1\ntRp4\n(I3\nS'&lt;'\nNNNI-1\nI-1\nI0\ntbS'\\x02\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp5\na.",
    values_block_1_dtype := b'int64',
    values_block_1_kind := b"(lp1\ncnumpy.core.multiarray\nscalar\np2\n(cnumpy\ndtype\np3\n(S'i8'\nI0\nI1\ntRp4\n(I3\nS'&lt;'\nNNNI-1\nI-1\nI0\ntbS'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp5\na.",
    values_block_2_dtype := b'string8',
    values_block_2_kind := b"(lp1\ncnumpy.core.multiarray\nscalar\np2\n(cnumpy\ndtype\np3\n(S'i8'\nI0\nI1\ntRp4\n(I3\nS'&lt;'\nNNNI-1\nI-1\nI0\ntbS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\ntRp5\na."]
  Data dump:
[0] (0, [1.5], [1], [b'a'])
[1] (1, [2.5], [2], [b'b'])
[2] (2, [3.5], [3], [b'c'])
</code></pre>
<p>Probably best to contact me off-line to discuss further.</p>
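<p>To make the string handling concrete, here is a minimal sketch of how a reader on another platform might turn the byte strings in the data dump back into text. It uses only the Python standard library and assumes the rows have already been read into plain tuples shaped like the dump (index, float block, int block, string block). The helper names <code>decode_value</code> and <code>decode_row</code> are hypothetical, not part of pandas or PyTables. Decoding as utf-8 also covers the older ASCII case, since ASCII is a subset of utf-8.</p>

```python
# Rows copied from the "Data dump" section of the ptdump output:
# (index, values_block_0, values_block_1, values_block_2)
rows = [
    (0, [1.5], [1], [b"a"]),
    (1, [2.5], [2], [b"b"]),
    (2, [3.5], [3], [b"c"]),
]


def decode_value(value, encoding="utf-8"):
    """Decode bytes to str, recurse into lists, leave other values untouched.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, list):
        return [decode_value(item, encoding) for item in value]
    return value


def decode_row(row, encoding="utf-8"):
    """Apply decode_value to every field of one dumped row."""
    return tuple(decode_value(field, encoding) for field in row)


decoded = [decode_row(row) for row in rows]
for row in decoded:
    print(row)
```

<p>The numeric blocks pass through unchanged; only the <code>values_block_2</code> entries are converted from bytes to text.</p>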
 
