
# Better method than readlines?
Using Python 2.5, I am reading an HTML file for three different pieces of information. The way I find each piece is by matching a *regex*\* and then counting a specific number of lines down from the matching line to get the actual information I'm looking for. **The problem is that I have to re-open the site three times (once for each piece of info I'm looking up). I think that's inefficient and I want to be able to look up all three things while opening the site only once.** Does anyone have a better method or suggestion?

\**I will learn a better way, such as BeautifulSoup, but for now I need a quick fix.*

Code:

```python
def scrubdividata(ticker):
    try:
        f = urllib2.urlopen('http://dividata.com/stock/%s' % (ticker))
        lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Annual Dividend:" in line:
                s = str(lines[i+1])
                start = '>\$'
                end = '</td>'
                AnnualDiv = re.search('%s(.*)%s' % (start, end), s).group(1)
        f = urllib2.urlopen('http://dividata.com/stock/%s' % (ticker))
        lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Last Dividend:" in line:
                s = str(lines[i+1])
                start = '>\$'
                end = '</td>'
                LastDiv = re.search('%s(.*)%s' % (start, end), s).group(1)
        f = urllib2.urlopen('http://dividata.com/stock/%s' % (ticker))
        lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Last Ex-Dividend Date:" in line:
                s = str(lines[i+1])
                start = '>'
                end = '</td>'
                LastExDivDate = re.search('%s(.*)%s' % (start, end), s).group(1)
        divlist.append((ticker, LastDiv, AnnualDiv, LastExDivDate))
    except:
        if ticker not in errorlist:
            errorlist.append(ticker)
        else:
            pass
    pass
```

Thanks,

B

---

I found a solution that works! I deleted the two extraneous `urlopen` and `readlines` calls, leaving only the one before the first loop (before, I was deleting only the `urlopen` calls but leaving `readlines`).
Here is my corrected code:

```python
def scrubdividata(ticker):
    try:
        f = urllib2.urlopen('http://dividata.com/stock/%s' % (ticker))
        lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Annual Dividend:" in line:
                s = str(lines[i+1])
                start = '>\$'
                end = '</td>'
                AnnualDiv = re.search('%s(.*)%s' % (start, end), s).group(1)
        #f = urllib2.urlopen('http://dividata.com/stock/%s'%(ticker))
        #lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Last Dividend:" in line:
                s = str(lines[i+1])
                start = '>\$'
                end = '</td>'
                LastDiv = re.search('%s(.*)%s' % (start, end), s).group(1)
        #f = urllib2.urlopen('http://dividata.com/stock/%s'%(ticker))
        #lines = f.readlines()
        for i in range(0, len(lines)):
            line = lines[i]
            if "Last Ex-Dividend Date:" in line:
                s = str(lines[i+1])
                start = '>'
                end = '</td>'
                LastExDivDate = re.search('%s(.*)%s' % (start, end), s).group(1)
        divlist.append((ticker, LastDiv, AnnualDiv, LastExDivDate))
        print '@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
        print ticker, LastDiv, AnnualDiv, LastExDivDate
        print '@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
    except:
        if ticker not in errorlist:
            errorlist.append(ticker)
        else:
            pass
    pass
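The three loops above can also be merged into a single pass over `lines`, so the page is both downloaded once and traversed once. Below is a minimal sketch of that idea (written in Python 3 syntax, though the loop logic ports directly back to 2.5). The sample `lines` list and the dividend values in it are made-up placeholders that only mimic the label-on-one-line, value-on-the-next structure the original scraper assumes:

```python
import re

# Hypothetical sample of the page's lines (structure assumed from the
# original scraper: each label sits on one line, its value on the next).
lines = [
    '<td>Annual Dividend:</td>',
    '<td>$2.92</td>',
    '<td>Last Dividend:</td>',
    '<td>$0.73</td>',
    '<td>Last Ex-Dividend Date:</td>',
    '<td>2013-08-07</td>',
]

# Map each label to the regex for extracting its value from the next line.
patterns = {
    'AnnualDiv': ('Annual Dividend:', r'>\$(.*)</td>'),
    'LastDiv': ('Last Dividend:', r'>\$(.*)</td>'),
    'LastExDivDate': ('Last Ex-Dividend Date:', r'>(.*)</td>'),
}

results = {}
# Single pass: check every label against each line, so all three values
# are collected in one traversal.
for i, line in enumerate(lines[:-1]):
    for key, (label, pattern) in patterns.items():
        if label in line:
            m = re.search(pattern, lines[i + 1])
            if m:
                results[key] = m.group(1)

print(results)
```

A dict of label/pattern pairs keeps the loop body identical for all three fields, so adding a fourth field later only means adding one entry to `patterns`.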
 
