<p>Here's a really basic structure idea: each folder gets its own thread. You would have two classes, one that gathers the data, "Helper", and one that stores it, "Directory".</p>
<p>Two classes are required because a thread can only be started once, and you need to be able to start a new thread for a directory that has already been listed without losing its data.</p>
<p>The root directory would be a Directory instance that lists its given path ('C:\mydocs'). It would store the file list in self.files and create a new Directory instance for every directory it contains (not forgetting to store them in self.dirs so they remain accessible).</p>
<p>Refreshing could run on a timer and check the directory's modification date, as you suggested.</p>
<p>Here's some code to help you understand my idea:</p>
<pre><code>import os
import threading
import time


class Helper(threading.Thread):
    """Gathers the listing for one Directory in its own thread."""

    def __init__(self, directory):
        super(Helper, self).__init__()
        self.directory = directory
        self.start()

    def run(self):
        # Start from empty lists so a later refresh does not duplicate entries
        self.directory.files, self.directory.dirs = ([], [])
        for path, folders, files in os.walk(self.directory.path):
            for f in files:
                self.directory.files.append(os.path.join(path, f))
            for d in folders:
                self.directory.dirs.append(
                    Directory(os.path.join(path, d),
                              self.directory.interval,
                              self.directory.do))


class Directory(threading.Thread):
    """Stores the listing of one path and refreshes it periodically."""

    def __init__(self, path, interval=5, do=None):
        super(Directory, self).__init__()
        self.path = path
        self.files, self.dirs = ([], [])
        self.interval = interval
        self.last_update = 0
        self.helper = None
        self.do = do  # One flag to stop refreshing all instances
        if do is None:
            self.do = True

    def run(self):
        while self.do:
            self.refresh()
            time.sleep(self.interval)

    def refresh(self):
        # Only start a refresh if the previous helper is done and the
        # directory has changed. (Assigning "self = None" inside the helper
        # has no effect, so check is_alive() instead.)
        helper_done = self.helper is None or not self.helper.is_alive()
        if helper_done and self.has_changed():
            self.last_update = int(time.time())
            self.helper = Helper(self)

    def has_changed(self):
        return int(os.path.getmtime(self.path)) &gt; self.last_update
</code></pre>
<p>I think this should be enough to get you started!</p>
<p>Edit: I changed the code a bit so that it is actually in a working state. Or at least I hope it is (I haven't tested it)!</p>
<p>Edit 2: I actually took the time to test this and fix it. I ran:</p>
<pre><code>if __name__ == '__main__':
    root = Directory('/home/plg')
    root.refresh()
    root.helper.join()
    for d in [root] + root.dirs:
        for f in d.files:
            print(f)
</code></pre>
<p>And:</p>
<pre><code>$ time python bin/dirmon.py | wc -l # wc -l == len(sys.stdout.readlines())
7805

real    0m0.078s
user    0m0.048s
sys     0m0.028s
</code></pre>
<p>That's 7805 / 0.078 = 100,064 files per second. Not too bad! :)</p>
<p>Edit 3 (last one!): I ran the test on '/'. First run (without cache): 147551 / 4.103 = 35,961 files per second.</p>
<p>Second and third:</p>
<pre><code>$ time python bin/dirmon.py | wc -l
147159

real    0m1.213s
user    0m0.940s
sys     0m0.272s
$ time python bin/dirmon.py | wc -l
147159

real    0m1.209s
user    0m0.928s
sys     0m0.284s
</code></pre>
<p>147551 / 1.213 = 121,641 files per second</p>
<p>147551 / 1.209 = 122,044 files per second</p>
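<p>A rough in-process version of the measurement above, as a sketch only: <code>count_files</code>, <code>measure</code> and <code>files_per_second</code> are hypothetical names that are not part of the code above, and the block assumes Python 3 (for <code>time.perf_counter</code>). It counts entries with a plain <code>os.walk</code> rather than the threaded classes, so it only illustrates how the files-per-second figures are derived.</p>

```python
import os
import time


def count_files(root):
    """Count the non-directory entries os.walk reports under root.

    Hypothetical helper, not part of the Directory/Helper classes.
    """
    total = 0
    for _path, _folders, files in os.walk(root):
        total += len(files)
    return total


def files_per_second(n_files, seconds):
    """Throughput as computed in the answer: file count / wall-clock time."""
    return n_files / seconds


def measure(root):
    """Walk root once and return (file count, elapsed seconds)."""
    start = time.perf_counter()
    n_files = count_files(root)
    elapsed = time.perf_counter() - start
    return n_files, elapsed
```

<p>Calling <code>measure('/home/plg')</code> and passing the result to <code>files_per_second</code> should give a figure comparable to the ones above, although it leaves out interpreter start-up, which <code>time python ...</code> includes.</p>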