StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POOptimizing Divisor Sieve
primarykey
Id
13280198
data
AcceptedAnswerId
0
AnswerCount
2
ClosedDate
2012-11-08T03:48:38.147
CommentCount
2
CommunityOwnedDate
CreationDate
2012-11-07T23:39:12.860
FavoriteCount
0
LastActivityDate
2012-11-08T00:04:42.157
LastEditDate
LastEditorUserId
0
OwnerUserId
1765721
ParentId
0
PostTypeId
1
Score
1
ViewCount
402
LastEditorDisplayName
text
Body
I have two sieves that I wrote in python and would like help optimizing them if at all possible. The divisorSieve calculates the divisors of all numbers up to <code>n</code>. Each index of the list contains a list of its divisors. The numDivisorSieve just counts the number of divisors each index has but doesn't store the divisors themselves. These sieves work in a similar way as you would do a Sieve of Eratosthenes to calculate all prime numbers up to <code>n</code>. <pre><code>def divisorSieve(n): divs = [[1] for x in xrange(0, n + 1)] divs[0] = [0] for i in xrange(2, n + 1): for j in xrange(1, n / i + 1): divs[i * j] += [i] return divs def numDivisorSieve(n): divs = [1] * (n + 1) divs[0] = 0 for i in xrange(2, n + 1): for j in xrange(1, n / i + 1): divs[i * j] += 1 return divs #Timer test for function if __name__=='__main__': from timeit import Timer n = ... t1 = Timer(lambda: divisorSieve(n)) print n, t1.timeit(number=1) </code></pre> Results: <pre><code> -----n-----|--time(divSieve)--|--time(numDivSieve)-- 100,000 | 0.452978167569 | 0.187762331281 200,000 | 0.98932170181 | 0.362314797537 300,000 | 1.6338203645 | 0.55124339118 400,000 | 2.17913634385 | 0.748340797412 500,000 | 2.84024755862 | 0.959312993718 600,000 | 3.46395907197 | 1.17777010636 700,000 | 4.11840766312 | 1.38268800149 800,000 | 4.77613733906 | 1.62560614543 900,000 | 5.4634148162 | 1.83002270324 1,000,000 | 6.11017136282 | 2.10247496423 2,000,000 | 13.0507389438 | 4.59150618897 3,000,000 | 20.6068050631 | 7.24799900479 4,000,000 | 28.3432841948 | 10.1484527586 5,000,000 | 36.4113970268 | 12.7670585308 6,000,000 | 44.4017867139 | 15.4226118057 7,000,000 | 52.4697580394 | 18.2902677738 8,000,000 | 61.4056966474 | 21.1247001928 9,000,000 | 69.7948510294 | 23.8988925173 10,000,000 | 78.639434285 | 26.8588813211 20,000,000 | N/A | 56.2527237669 (waited for a while but divisorSieve didn't finish) 30,000,000 | MemoryError | 86.8917332214 40,000,000 | MemoryError | 118.457179822 50,000,000 | MemoryError | 149.526622815 60,000,000 | MemoryError | 181.627320396 70,000,000 | MemoryError | 214.17467749 80,000,000 | MemoryError | 246.23677614 90,000,000 | MemoryError | 279.53308422 100,000,000 | MemoryError | 314.813166014 </code></pre> Results are pretty good and I'm happy I was able to get it this far, but I'm looking to get it even faster. If at all possible, I'd like to get 100,000,000 at a reasonable speed with the divisorSieve. Although this also brings into the issue that anything over 30,000,000+ throws a <code>MemoryError</code> at <code>divs = [[1] for x in xrange(0, n + 1)]</code>) in divisorSieve. numDivisorSieve does allow the full 100,000,000 to run. I've tried replacing numDivisorSieve's <code>divs = [1] * (n + 1)</code> with both <code>divs = array.array('i', [1] * (n + 1))</code> and <code>divs = numpy.ones((n + 1), dtype='int')</code> but both resulted in a loss of speed (slight difference for array, much larger difference for numpy). I expect that since numDivisorSieve had a loss in efficiency, then so would divisorSieve. Of course there's always the chance I'm using one or both of these incorrectly since I'm not used to either of them. I would appreciate any help you can give me. I hope i have provided enough details. Thank you.
Tags
<python><optimization><factors><sieve>
Title
Optimizing Divisor Sieve
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USJeremy K
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POOptimizing Divisor Sieve
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTClose
2. VO
 singulars
 PostPostId
 POOptimizing Divisor Sieve
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTClose
3. VO
 singulars
 PostPostId
 POOptimizing Divisor Sieve
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTClose
CommentsPostId
1. COYou may want to also try your luck on http://codereview.stackexchange.com
 singulars
 PostPostId
 POOptimizing Divisor Sieve
 UserUserId
 USnewfurniturey
2. COif anyone sees this and is interested, I have taken newfurniturey's advice and continued here http://codereview.stackexchange.com/questions/18346/optimizing-divisor-sieve
 singulars
 PostPostId
 POOptimizing Divisor Sieve
 UserUserId
 USJeremy K

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.