
  1. Parsing a Twitter feed in Python into a table
    <p>I have a set of tweets that have been saved to a .txt file.</p> <p>I want to place certain attributes in a SQLite table in Python. I successfully created the table.</p> <pre><code>import pandas
import sqlite3

conn = sqlite3.connect('twitter.db')
c = conn.cursor()
# Column names follow the keys in the tweet data
c.execute("""CREATE TABLE Tweet (
    created_at VARCHAR2(25),
    id VARCHAR2(25),
    text VARCHAR2(25),
    source VARCHAR2(25),
    in_reply_to_user_id VARCHAR2(25),
    retweet_count VARCHAR2(25)
)""")
</code></pre> <p>Before I even attempted to add the parsed data to the database, I tried to load it into a data frame just to view it.</p> <pre><code>tweets = pandas.read_table('file.txt', sep=',') </code></pre> <p>I get the error:</p> <pre><code>CParserError: Error tokenizing data. C error: Expected 63 fields in line 3, saw 69 </code></pre> <p>My assumption is that commas not only separate the fields but also appear within the strings.</p> <p>Also, Twitter data comes in a format that I have not worked with before. Each field starts with the variable name in quotation marks, followed by a colon and then the value, which is also in quotation marks when it is a string. For example:</p> <pre><code>"created_at":"Fri Oct 11 00:00:03 +0000 2013", </code></pre> <p>So how can I get this into a standard table format with the variable names at the top?</p> <p>A full example of a tweet is this:</p> <pre><code>{"created_at":"Fri Oct 11 00:00:03 +0000 2013","id":388453908911095800,"id_str":"388453908911095809","text":"LAGI PUN VISITORS DATANG PUKUL 9 AH","source":"&lt;a href=\"http://www.tweetdeck.com\" rel=\"nofollow\"&gt;TweetDeck&lt;/a&gt;","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":447800506,"id_str":"447800506","name":"§yazwina·","screen_name":"_SAireen","location":"SSP","url":"http://flavors.me/syazwinaaireen#","description":"Absence makes the heart grow fonder. 
Stay us x @_DFitri's","protected":false,"followers_count":806,"friends_count":702,"listed_count":2,"created_at":"Tue Dec 27 08:29:53 +0000 2011","favourites_count":7478,"utc_offset":28800,"time_zone":"Beijing","geo_enabled":true,"verified":false,"statuses_count":32558,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"DBE9ED","profile_background_image_url":"http://a0.twimg.com/profile_background_images/378800000056283804/65d84665fbb81deba13427e8078a3eff.png","profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/378800000056283804/65d84665fbb81deba13427e8078a3eff.png","profile_background_tile":true,"profile_image_url":"http://a0.twimg.com/profile_images/378800000264138431/fd9d57bd1b1609f36fd7159499a94b6e_normal.jpeg","profile_image_url_https":"https://si0.twimg.com/profile_images/378800000264138431/fd9d57bd1b1609f36fd7159499a94b6e_normal.jpeg","profile_banner_url":"https://pbs.twimg.com/profile_banners/447800506/1369969522","profile_link_color":"FA0096","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"E6F6F9","profile_text_color":"333333","profile_use_background_image":true,"default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"symbols":[],"urls":[],"user_mentions":[]},"favorited":false,"retweeted":false,"filter_level":"medium","lang":"it"} </code></pre>
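<p>The sample tweet is a JSON object rather than a comma-separated row, which is one likely reason the comma-based tokenizer reports a varying field count. A minimal sketch of one possible approach follows, assuming the file holds one complete JSON object per line. The helper names <code>parse_tweets</code> and <code>load_tweets</code>, the <code>FIELDS</code> tuple, and the lowercase <code>tweet</code> table are hypothetical and chosen only for illustration; the keys are taken from the sample tweet.</p>

```python
import json
import sqlite3

# Keys to keep; the names follow the sample tweet above.
FIELDS = ("created_at", "id_str", "text", "source",
          "in_reply_to_user_id_str", "retweet_count")


def parse_tweets(lines):
    """Yield one tuple of FIELDS values per line that holds a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between tweets
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: malformed lines are simply skipped
        if not isinstance(tweet, dict):
            continue  # ignore JSON values that are not objects
        # Missing keys become None, which sqlite3 stores as NULL.
        yield tuple(tweet.get(name) for name in FIELDS)


def load_tweets(conn, lines):
    """Create the (hypothetical) tweet table and insert the parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweet ("
        "created_at TEXT, id TEXT, text TEXT, source TEXT, "
        "in_reply_to_user_id TEXT, retweet_count INTEGER)"
    )
    # Parameter placeholders keep commas and quotes inside tweet text intact.
    conn.executemany(
        "INSERT INTO tweet VALUES (?, ?, ?, ?, ?, ?)", parse_tweets(lines)
    )
    conn.commit()
```

<p>With this sketch, something like <code>load_tweets(sqlite3.connect('twitter.db'), open('file.txt', encoding='utf-8'))</code> would fill the table. The string field <code>id_str</code> is used instead of <code>id</code> because the numeric <code>id</code> in the sample differs from <code>id_str</code> in its last digits. If a tweet contains a raw line break inside a string, as the description in the sample appears to, the one-object-per-line assumption does not hold and that record would be skipped.</p>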
 
