
This sounds like it's going to be largely a question of filesystem choice. One option to look at might be [ZFS](http://en.wikipedia.org/wiki/ZFS), which is designed for high-volume applications.

You may also want to consider using a relational database for this sort of thing. 750 million rows is a medium-sized database, so any robust DBMS (e.g. [PostgreSQL](http://postgresql.org)) would handle it well. You can store arbitrary blobs in the database too, so whatever you were going to store in files on disk you can store in the database itself.

**Update:** Your additional information is certainly helpful. Given a choice between FAT32 and NTFS, *definitely* choose NTFS. Don't store too many files in a single directory; 100,000 might be a reasonable upper limit to consider (though you will have to experiment, as there is no hard and fast rule). Your friend's suggestion of a new directory for every letter is probably too fine-grained; you might consider breaking the name up every four letters or so. The best split to choose depends on the shape of your dataset.

The reason breaking up the name is a good idea is that the performance of most filesystems degrades as the number of files in a directory increases. How quickly depends on the filesystem in use: FAT32, for example, becomes dreadful with probably only a few thousand files per directory. At the same time, don't break the filenames up *too* much, so that you minimise the number of directory lookups the filesystem has to perform.
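For illustration, here is a minimal sketch of the kind of prefix sharding described above. This is not code from the question; the `shard_path` and `store` helpers are hypothetical names, and the four-character split width and two-level depth are just the example values mentioned in the answer:

```python
import os

def shard_path(root: str, filename: str, width: int = 4, depth: int = 2) -> str:
    """Map a filename onto a nested directory path by slicing its name.

    e.g. shard_path("/data", "abcdefgh.bin") -> "/data/abcd/efgh/abcdefgh.bin"
    Short names simply use however many characters are available.
    """
    stem, _ext = os.path.splitext(filename)
    parts = [stem[i * width:(i + 1) * width] for i in range(depth)]
    parts = [p for p in parts if p]  # drop empty slices for short names
    return os.path.join(root, *parts, filename)

def store(root: str, filename: str, data: bytes) -> str:
    """Write data under the sharded path, creating directories as needed."""
    path = shard_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Assuming the names are reasonably uniformly distributed, two four-character levels keep each leaf directory well below the 100,000-file ballpark discussed above; if your names cluster heavily on common prefixes, adjust `width` and `depth` (or hash the name first) to match the actual distribution.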