    <p>One idea would be to use work queues (directories or a DB), assuming you will work out storage so that it meets your criteria for redundancy.</p>
    <p>\retrieve</p>
    <p>\retrieve\server1</p>
    <p>\retrieve\server...</p>
    <p>\retrieve\server10000</p>
    <p>\in-process</p>
    <p>\complete</p>
    <p>1.) Every page that is to be a seed is hashed and placed in the retrieve queue, using the hash as the file root.</p>
    <p>2.) Before putting a page in the queue, check the complete and in-process queues to make sure you don't re-queue it.</p>
    <p>3.) Each server picks a random batch of 1 to N files from the retrieve queue and attempts to move them to its private queue.</p>
    <p>4.) Files that fail the rename are assumed to have been “claimed” by another process.</p>
    <p>5.) Files that can be moved are to be processed; put a marker in the in-process directory to prevent re-queuing.</p>
    <p>6.) Download the file and place it into the \complete queue.</p>
    <p>7.) Clean the file out of the in-process and server directories.</p>
    <p>8.) Every 1,000 runs, check the oldest 10 in-process files by trying to move them from their server queues back into the general retrieve queue. This will help if a server hangs and should also load balance slow servers.</p>
    <p>For the retrieve, in-process and complete directories: most file systems hate millions of files in one directory, so divide storage into segments based on the characters of the hash. \abc\def\123\ would be the directory for file abcdef123FFFFFF…, if you were scaling to billions of downloads.</p>
    <p>If you are using MongoDB instead of a regular file store, many of these problems would be avoided and you could benefit from sharding etc…</p>
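Steps 1 to 7 above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the function names (`init_queues`, `enqueue`, `claim`, `complete`), the use of SHA-1 as the hash, and the flat (unsharded) directories are all choices made for the example, and it relies on `os.rename` being atomic, which holds when source and destination are on the same local file system. Step 8 (re-queuing stale in-process files) and the hash-based directory segmentation are left out for brevity.

```python
import hashlib
import os
import random
from pathlib import Path

# Queue names follow the directory layout above; everything else is hypothetical.
QUEUES = ("retrieve", "in-process", "complete")


def init_queues(base: Path, servers: list[str]) -> None:
    """Create the shared queues and one private queue per server."""
    for name in QUEUES:
        (base / name).mkdir(parents=True, exist_ok=True)
    for server in servers:
        (base / "retrieve" / server).mkdir(exist_ok=True)


def enqueue(base: Path, url: str) -> bool:
    """Steps 1-2: hash the URL and queue it unless it is already known."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()  # assumed hash choice
    if any((base / name / digest).exists() for name in QUEUES):
        return False
    (base / "retrieve" / digest).write_text(url, encoding="utf-8")
    return True


def claim(base: Path, server: str, batch_size: int) -> list[Path]:
    """Steps 3-5: try to move a random batch into this server's private queue."""
    general = base / "retrieve"
    candidates = [p for p in general.iterdir() if p.is_file()]
    claimed = []
    for path in random.sample(candidates, min(batch_size, len(candidates))):
        private = general / server / path.name
        try:
            os.rename(path, private)
        except OSError:
            continue  # rename failed: assume another process claimed the file
        # Marker in in-process prevents the page from being re-queued.
        (base / "in-process" / path.name).write_text(server, encoding="utf-8")
        claimed.append(private)
    return claimed


def complete(base: Path, private: Path, body: bytes) -> None:
    """Steps 6-7: store the download, then clear the marker and private file."""
    (base / "complete" / private.name).write_bytes(body)
    (base / "in-process" / private.name).unlink(missing_ok=True)
    private.unlink(missing_ok=True)
```

A worker would call `claim(base, "server1", n)` in a loop, download each claimed URL, and pass the result to `complete`. There is a short window between the rename and the in-process marker in which the same URL could be queued again; a real deployment would need to decide whether that is acceptable.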
 
