Most appropriate conversion of .mat file for database purposes
<p>I am trying to create a database of my experimental results with a very flexible structure (as different experiments require different experimental conditions). For now, I am thinking about going with JSON as the most appropriate format due to its "dictionary-like" nature.</p>
<p>My raw data files come in as Matlab files (.mat extension), but I have noticed that after conversion the file size increases by almost a factor of 10. I tried different conversion methods, but they all give me a huge increase in file size, and I was wondering whether this is an inherent problem with the formats I have chosen or whether anything can be done about it.</p>
<p>Here is some sample code I wrote (Python 2) to test the conversion efficiency, followed by the results for a sample file I ran through it:</p>
<pre><code>import numpy as np
import scipy.io as spio
import json
import pickle
import os


def json_dump(data):
    # Write the data as JSON text and report the resulting file size
    with open('json.txt', 'w') as outfile:
        json.dump(data, outfile)
    print 'JSON file size: ', os.path.getsize('json.txt')/1000, ' kB'


def pickle_dump(data):
    # Write the data with the default (ASCII) pickle protocol
    with open('pickle.pkl', 'w') as outfile:
        pickle.dump(data, outfile)
    print 'Pickle file size: ', os.path.getsize('pickle.pkl')/1000, ' kB'


def numpy_dump(data):
    # Binary .npy file, then plain-text output via savetxt
    np.save('numpy.npy', data)
    print 'NPY file size: ', os.path.getsize('numpy.npy')/1000, ' kB'
    np.savetxt('numpy.txt', data)
    print 'Numpy text file size: ', os.path.getsize('numpy.txt')/1000, ' kB'


def get_data(path):
    data = spio.loadmat(path)
    # Drop the metadata entries so only the recorded variables remain
    del data['__function_workspace__']
    del data['__globals__']
    del data['__version__']
    del data['__header__']
    spio.savemat('mat.mat', data)
    print 'Converted mat file size: ', os.path.getsize('mat.mat')/1000, ' kB'
    # Extract the array that is later converted into a list
    data = data['data'][0][0][0]
    return data


path = 'myrecording.mat'
print 'Original file size: ', os.path.getsize(path)/1000, ' kB'
data = get_data(path)
json_dump(data.tolist())
pickle_dump(data.tolist())
numpy_dump(data)
</code></pre>
<p>I get an output of:</p>
<pre><code>Original file size: 706 kB
Converted mat file size: 4007 kB
JSON file size: 9104 kB
Pickle file size: 10542 kB
NPY file size: 4000 kB
Numpy text file size: 12550 kB
</code></pre>
<p>Is there anything I can do with the encoding to limit the file size? I would ideally stick with the JSON format, but I am open to suggestions.</p>
<p>Thanks in advance!</p>
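<p>To make the size comparison above easier to reason about, here is a minimal sketch that uses only the standard library. It assumes the recording is essentially a long list of 64-bit floats; the <code>samples</code> list and the <code>encoded_sizes</code> helper are hypothetical stand-ins, not part of the original script. It compares a raw binary encoding (8 bytes per value), the JSON text encoding, and gzip-compressed versions of both. A likely, but unverified, reading of the numbers above is that the original .mat file was stored compressed while the re-saved copies were not.</p>

```python
import array
import gzip
import json
import random


def encoded_sizes(values):
    """Return the byte count of several encodings of a list of floats."""
    raw = array.array('d', values).tobytes()    # 8 bytes per float64
    text = json.dumps(values).encode('utf-8')   # decimal digits as text
    return {
        'binary': len(raw),
        'binary_gzip': len(gzip.compress(raw)),
        'json': len(text),
        'json_gzip': len(gzip.compress(text)),
    }


random.seed(0)
# Hypothetical stand-in for one recorded channel (not the real data)
samples = [random.gauss(0.0, 1.0) for _ in range(10000)]

sizes = encoded_sizes(samples)
for name, n_bytes in sizes.items():
    print('%s: %.1f kB' % (name, n_bytes / 1000.0))

# JSON survives a compress/decompress round trip unchanged
packed = gzip.compress(json.dumps(samples).encode('utf-8'))
restored = json.loads(gzip.decompress(packed).decode('utf-8'))
```

<p>In this sketch the JSON text is larger than the binary form because each full-precision float is written out as roughly 17 to 20 characters plus a separator. Gzipping the JSON recovers part of that overhead while keeping the format dictionary-like. How well real measurements compress depends on the data, so these ratios should not be read as a prediction for the file in the question.</p>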
 
