Increasing speed of Python code
<p>I have some Python code that has many classes. Using <code>cProfile</code>, I found that the program takes 68 seconds in total to run, and that the following function in a class called <code>Buyers</code> accounts for about 60 of those 68 seconds. I have to run the program about 100 times, so any increase in speed will help. Can you suggest ways to increase the speed by modifying the code? If you need more information that will help, please let me know.</p>
<pre><code>def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.

    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] &lt;= utility
    '''

    ## Initialize count of customers to zero
    ## Set self.customers and self.nonCustomers to empty lists
    price = priceVector[-1]
    count = 0
    self.customers = []
    self.nonCustomers = []

    for person in self.people:
        if person.utility &gt;= price:
            person.customer = 1
            self.customers.append(person)
        else:
            person.customer = 0
            self.nonCustomers.append(person)

    return len(self.customers)
</code></pre>
<p><code>self.people</code> is a list of <code>person</code> objects. Each <code>person</code> has <code>customer</code> and <code>utility</code> as its attributes.</p>
<p><strong>EDIT - responses added</strong></p>
<p><strong>-------------------------------------</strong></p>
<p>Thanks so much for the suggestions. Here are my responses to some of the questions and suggestions people have kindly made. I have not tried them all, but will try the others and write back later.</p>
<p>(1) @amber - the function is called 80,000 times.</p>
<p>(2) @gnibbler and others - <code>self.people</code> is a list of <code>Person</code> objects in memory.
It is not connected to a database.</p>
<p>(3) @Hugh Bothwell</p>
<p>cumtime taken by the original function - 60.8 s (called 80,000 times)</p>
<p>cumtime taken by the new function with local function aliases as suggested - 56.4 s (called 80,000 times)</p>
<p>(4) @rotoglup and @Martin Thomas</p>
<p>I have not tried your solutions yet. I need to check the rest of the code to see where I use <code>self.customers</code> before I can stop appending the customers to the <code>self.customers</code> list. But I will try this and write back.</p>
<p>(5) @TryPyPy - thanks for your kind offer to check the code.</p>
<p>Let me first read a little on the suggestions you have made to see if those will be feasible to use.</p>
<p><strong>EDIT 2</strong> Some suggested that since I am flagging the customers and non-customers in <code>self.people</code>, I should try without creating the separate lists <code>self.customers</code> and <code>self.nonCustomers</code> using append. Instead, I should loop over <code>self.people</code> to find the number of customers. I tried the following code and timed both functions below, <code>f_w_append</code> and <code>f_wo_append</code>. I did find that the latter takes less time, but it still takes 96% of the time taken by the former. That is, it is a very small increase in speed.</p>
<p>@TryPyPy - The following piece of code is complete enough to check the bottleneck function, in case your offer to check it with other compilers still stands.
</p>
<p>Thanks again to everyone who replied.</p>
<pre><code>import numpy

class person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class population(object):
    def __init__(self, numpeople):
        self.people = []
        self.cus = []
        self.noncus = []
        numpy.random.seed(1)
        utils = numpy.random.uniform(0, 300, numpeople)
        for u in utils:
            per = person(u)
            self.people.append(per)

popn = population(300)

def f_w_append():
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    for per in popn.people:
        if per.utility &gt;= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    return len(cus)

def f_wo_append():
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility &gt;= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers
</code></pre>
<p><strong>EDIT 3: It seems numpy is the problem</strong></p>
<p>This is in response to what John Machin said below. Below you see two ways of defining the <code>Population</code> class, one that uses numpy and one that does not. I ran the program below twice, once with each definition. The one <strong>without</strong> numpy takes a similar time to what John found in his runs. The one with numpy takes much longer. What is not clear to me is why: the <code>popn</code> instance is created before the timing begins (at least that is how it appears from the code), so why does the numpy version take longer? I also thought numpy was supposed to be more efficient. Anyhow, the problem seems to be with numpy and not so much with the append, even though the append does slow things down a little. Can someone please confirm with the code below?
Thanks.</p>
<pre><code>import random  # instead of numpy
import numpy
import time
timer_func = time.time  # using Mac OS X 10.5.8

class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class Population(object):
    def __init__(self, numpeople):
        random.seed(1)
        self.people = [Person(random.uniform(0, 300)) for i in xrange(numpeople)]
        self.cus = []
        self.noncus = []

# Numpy based
# class Population(object):
#     def __init__(self, numpeople):
#         numpy.random.seed(1)
#         utils = numpy.random.uniform(0, 300, numpeople)
#         self.people = [Person(u) for u in utils]
#         self.cus = []
#         self.noncus = []

def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility &gt;= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers

# Create the population before the timer starts (size 300, as in the earlier snippet)
popn = Population(300)

t0 = timer_func()
for i in xrange(20000):
    x = f_wo_append(popn)
t1 = timer_func()
print t1-t0
</code></pre>
<p><strong>Edit 4: See the answers by John Machin and TryPyPy</strong></p>
<p>Since there have been so many edits and updates here, those who find themselves here for the first time may be a little confused. See the answers by John Machin and TryPyPy. Both of these can help in improving the speed of the code substantially. I am grateful to them and to the others who alerted me to the slowness of <code>append</code>. Since in this instance I am going to use John Machin's solution and not use numpy for generating utilities, I am accepting his response as the answer. However, I really appreciate the directions pointed out by TryPyPy as well.</p>
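<p>For readers landing here, the "without append" idea from Edit 2 can be taken one step further by flagging and counting in a single loop instead of two. The following is only a minimal sketch under stated assumptions, not code from any of the answers: it is written for Python 3 (<code>range</code>, <code>print()</code>), the name <code>count_customers</code> is hypothetical, and the utilities are drawn with the standard <code>random</code> module, as in the non-numpy variant of Edit 3.</p>

```python
import random
import timeit


class Person:
    """Stand-in for the Person class in the question."""

    def __init__(self, util):
        self.utility = util
        self.customer = 0


class Population:
    """Population whose utilities are plain Python floats (no numpy)."""

    def __init__(self, numpeople, seed=1):
        rng = random.Random(seed)
        self.people = [Person(rng.uniform(0, 300)) for _ in range(numpeople)]


def count_customers(people, price):
    """Hypothetical helper: flag each person and count customers in one pass."""
    count = 0
    for per in people:
        if per.utility >= price:
            per.customer = 1
            count += 1
        else:
            per.customer = 0
    return count


popn = Population(300)
num_customers = count_customers(popn.people, 75)

# Cross-check the count against the flags that were just set.
assert num_customers == sum(per.customer for per in popn.people)

# Rough timing only; absolute numbers depend on the machine and interpreter.
elapsed = timeit.timeit(lambda: count_customers(popn.people, 75), number=2000)
print(num_customers, elapsed)
```

<p>If the utilities do come from numpy, as in the commented-out variant in Edit 3, <code>numpy.random.uniform(0, 300, numpeople).tolist()</code> returns plain Python floats rather than numpy scalars. Whether that conversion accounts for the timing difference reported in Edit 3 is an assumption that would have to be measured on the real program.</p>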