YouTube video_id from Firefox bookmark.html source code [almost there]
<p>bookmarks.html looks like this:</p>
<pre><code>&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=Gg81zi0pheg" ADD_DATE="1320876124" LAST_MODIFIED="1320878745" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=Gg81zi0pheg&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=pP9VjGmmhfo" ADD_DATE="1320876156" LAST_MODIFIED="1320878756" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=pP9VjGmmhfo&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=yTA1u6D1fyE" ADD_DATE="1320876163" LAST_MODIFIED="1320878762" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=yTA1u6D1fyE&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=4v8HvQf4fgE" ADD_DATE="1320876186" LAST_MODIFIED="1320878767" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=4v8HvQf4fgE&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=e9zG20wQQ1U" ADD_DATE="1320876195" LAST_MODIFIED="1320878773" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=e9zG20wQQ1U&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=khL4s2bvn-8" ADD_DATE="1320876203" LAST_MODIFIED="1320878782" ICON_URI="http://s.ytimg.com/yt/favicon-vflZlzSbU.ico" ICON=""&gt;http://www.youtube.com/watch?v=khL4s2bvn-8&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=XTndQ7bYV0A" ADD_DATE="1320876271" LAST_MODIFIED="1320876271"&gt;Paramore - Walmart Soundcheck 6-For a pessimist(HQ)&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=xTT2MqgWRRc" ADD_DATE="1320876284" LAST_MODIFIED="1320876284"&gt;Paramore - Walmart Soundcheck 5-Pressure(HQ)&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=J2ZYQngwSUw" ADD_DATE="1320876291" LAST_MODIFIED="1320876291"&gt;Paramore - Wal-Mart Soundcheck Interview&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=9RZwvg7unrU" ADD_DATE="1320878207" LAST_MODIFIED="1320878207"&gt;Paramore - 08 - Interview [ Wal-Mart Soundcheck ]&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=vz3qOYWwm10" ADD_DATE="1320878295" LAST_MODIFIED="1320878295"&gt;Paramore - 04 - That&amp;#39;s What You Get [ Wal-Mart Soundcheck ]&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=yarv52QX_Yw" ADD_DATE="1320878301" LAST_MODIFIED="1320878301"&gt;Paramore - 05 - Pressure [ Wal-Mart Soundcheck ]&lt;/A&gt;
&lt;DT&gt;&lt;A HREF="http://www.youtube.com/watch?v=LRREY1H3GCI" ADD_DATE="1320878317" LAST_MODIFIED="1320878317"&gt;Paramore - Walmart Promo&lt;/A&gt;
</code></pre>
<p>It's a standard bookmarks export file from Firefox.</p>
<p>I feed it into bookmarks.py, which looks like this:</p>
<pre><code>#!/usr/bin/env python
import sys
import BeautifulSoup as bs
from BeautifulSoup import BeautifulSoup
url_list = sys.argv[1]
urls = [tag['href'] for tag in BeautifulSoup(open(url_list)).findAll('a')]
print urls
</code></pre>
<p>This returns a much cleaner list of URLs:</p>
<pre><code>[u'http://www.youtube.com/watch?v=Gg81zi0pheg', u'http://www.youtube.com/watch?v=pP9VjGmmhfo', u'http://www.youtube.com/watch?v=yTA1u6D1fyE', u'http://www.youtube.com/watch?v=4v8HvQf4fgE', u'http://www.youtube.com/watch?v=e9zG20wQQ1U', u'http://www.youtube.com/watch?v=khL4s2bvn-8', u'http://www.youtube.com/watch?v=XTndQ7bYV0A', u'http://www.youtube.com/watch?v=xTT2MqgWRRc', u'http://www.youtube.com/watch?v=J2ZYQngwSUw', u'http://www.youtube.com/watch?v=9RZwvg7unrU', u'http://www.youtube.com/watch?v=vz3qOYWwm10', u'http://www.youtube.com/watch?v=yarv52QX_Yw', u'http://www.youtube.com/watch?v=LRREY1H3GCI']
</code></pre>
<p>My next step is to get each of the YouTube URLs into video_info.py:</p>
<pre><code>#!/usr/bin/python
import urlparse
import sys
import gdata.youtube
import gdata.youtube.service
import re
import urlparse
import urllib2
youtube_url = sys.argv[1]
url_data = urlparse.urlparse(youtube_url)
query = urlparse.parse_qs(url_data.query)
youtube_id = query["v"][0]
print youtube_id
yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'AI39si4yOmI0GEhSTXH0nkiVDf6tQjCkqoys5BBYLKEr-PQxWJ0IlwnUJAcdxpocGLBBCapdYeMLIsB7KVC_OA8gYK0VKV726g'
entry = yt_service.GetYouTubeVideoEntry(video_id=youtube_id)
print 'Video title: %s' % entry.media.title.text
print 'Video view count: %s' % entry.statistics.view_count
</code></pre>
<p>When I pass it this URL, "http://www.youtube.com/watch?v=aXrgwC1rsw4", the output looks like this:</p>
<pre><code>aXrgwC1rsw4
Video title: OneRepublic Good Life Live Walmart Soundcheck
Video view count: 202
</code></pre>
<p><strong>How do I feed the list of URLs from bookmarks.py into video_info.py?</strong></p>
<p><em>Extra points for output to CSV format, and extra extra points for checking for duplicates in bookmarks.html before passing data to video_info.py.</em></p>
<p>Thanks for all your help, guys. Because of Stack Overflow I've gotten this far.</p>
<p>David</p>
<p>So combined I now have:</p>
<pre><code>#!/usr/bin/env python
import urlparse
import gdata.youtube
import gdata.youtube.service
import re
import urlparse
import urllib2
import sys
import BeautifulSoup as bs
from BeautifulSoup import BeautifulSoup
yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'AI39si4yOmI0GEhSTXH0nkiVDf6tQjCkqoys5BBYLKEr-PQxWJ0IlwnUJAcdxpocGLBBCapdYeMLIsB7KVC_OA8gYK0VKV726g'
url_list = sys.argv[1]
urls = [tag['href'] for tag in BeautifulSoup(open(url_list)).findAll('a')]
print urls
youtube_url = urls
url_data = urlparse.urlparse(youtube_url)
query = urlparse.parse_qs(url_data.query)
youtube_id = query["v"][0]
#list(set(my_list))
entry = yt_service.GetYouTubeVideoEntry(video_id=youtube_id)
myyoutubes = []
myyoutubes.append(", ".join([youtube_id, entry.media.title.text,entry.statistics.view_count]))
print "\n".join(myyoutubes)
</code></pre>
<p><strong>How do I pass the list of URLs to the youtube_url variable?</strong> <em>They need to be cleaned up further and passed one at a time, I believe.</em></p>
<p>I've got it down to this now:</p>
<pre><code>#!/usr/bin/env python
import urlparse
import gdata.youtube
import gdata.youtube.service
import re
import urlparse
import urllib2
import sys
import BeautifulSoup as bs
from BeautifulSoup import BeautifulSoup
yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'AI39si4yOmI0GEhSTXH0nkiVDf6tQjCkqoys5BBYLKEr-PQxWJ0IlwnUJAcdxpocGLBBCapdYeMLIsB7KVC_OA8gYK0VKV726g'
url_list = sys.argv[1]
urls = [tag['href'] for tag in BeautifulSoup(open(url_list)).findAll('a')]
for url in urls:
    youtube_url = url
url_data = urlparse.urlparse(youtube_url)
query = urlparse.parse_qs(url_data.query)
youtube_id = query["v"][0]
#list(set(my_list))
entry = yt_service.GetYouTubeVideoEntry(video_id=youtube_id)
myyoutubes = []
myyoutubes.append(", ".join([youtube_id, entry.media.title.text,entry.statistics.view_count]))
print "\n".join(myyoutubes)
</code></pre>
<p>I can pass bookmarks.html to combined.py, but it only returns the first line.</p>
<p><strong>How do I loop through each line of youtube_url?</strong></p>
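<p>One way to approach the loop, the duplicate check, and the CSV output asked about above, offered as a minimal sketch rather than a definitive answer. It assumes Python 3, where <code>urllib.parse</code> replaces the Python 2 <code>urlparse</code> module used in the question. The names <code>video_ids</code>, <code>write_csv</code>, and <code>lookup</code> are hypothetical; <code>lookup</code> stands in for the <code>GetYouTubeVideoEntry</code> call and is stubbed here so the sketch runs without network access.</p>

```python
import csv
import io
from urllib.parse import urlparse, parse_qs


def video_ids(urls):
    """Return the unique YouTube video ids found in urls, in first-seen order."""
    seen = set()
    ids = []
    for url in urls:
        # parse_qs maps each query key to a list of values, e.g. {"v": ["Gg81zi0pheg"]}
        values = parse_qs(urlparse(url).query).get("v")
        if not values:
            continue  # skip bookmarks that carry no v= parameter
        video_id = values[0]
        if video_id not in seen:  # duplicate check happens before any lookup
            seen.add(video_id)
            ids.append(video_id)
    return ids


def write_csv(ids, lookup, out):
    """Write one CSV row per id; lookup(video_id) must return (title, view_count)."""
    writer = csv.writer(out)
    writer.writerow(["video_id", "title", "view_count"])
    for video_id in ids:
        title, view_count = lookup(video_id)
        writer.writerow([video_id, title, view_count])


if __name__ == "__main__":
    urls = [
        "http://www.youtube.com/watch?v=Gg81zi0pheg",
        "http://www.youtube.com/watch?v=pP9VjGmmhfo",
        "http://www.youtube.com/watch?v=Gg81zi0pheg",  # duplicate bookmark
    ]

    # Hypothetical stand-in for the per-video lookup; it makes no network call.
    def fake_lookup(video_id):
        return ("title of " + video_id, 0)

    buffer = io.StringIO()
    write_csv(video_ids(urls), fake_lookup, buffer)
    print(buffer.getvalue(), end="")
```

<p>The point of the sketch is that everything that depends on a single URL sits inside the <code>for</code> body, and the rows are collected across iterations instead of being reset each time. Using the <code>csv</code> module rather than <code>", ".join(...)</code> means titles that contain commas are quoted correctly.</p>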