Weird characters in exported CSV files when converting
    <p>I came across a problem I cannot solve on my own. It concerns the downloadable CSV-formatted trends data files from Google Insights for Search (I4S). </p> <p>I'm too lazy to manually reformat the files I4S gives me, which means extracting the section with the actual trends data and reformatting the columns so that I can use it with a modelling program I am doing for school. </p> <p>So I wrote a tiny script that should do the work for me: take a file, do some magic, and give me a new file in the proper format. </p> <p>It is supposed to read the file contents, extract the trends section, split it by newlines, split each line, and then reorder the columns and maybe reformat them. </p> <p>An untouched I4S CSV file looks normal, with CR LF characters at the line breaks (maybe that's only because I'm using Windows). </p> <p>But when the script just reads the contents and writes them to a new file, weird Asian characters appear between CR and LF. I tried the script with a manually written, similar-looking file and even with a CSV file from Google Trends, and it works fine. </p> <p>I use Python, and the script (snippet) I used for the following example looks like this: </p> <pre><code># Read from an input file
file = open(file,"r")
contents = file.read()
file.close()

cfile = open("m.log","w+")
cfile.write(contents)
cfile.close()
</code></pre> <p>Does anybody have an idea why those characters appear? Thank you for your help! 
</p> <p>I'll give you and example:</p> <h2>First few lines of I4S csv file:</h2> <pre><code>Web Search Interest: foobar Worldwide; 2004 - present Interest over time Week foobar 2004-01-04 - 2004-01-10 44 2004-01-11 - 2004-01-17 44 2004-01-18 - 2004-01-24 37 2004-01-25 - 2004-01-31 40 2004-02-01 - 2004-02-07 49 2004-02-08 - 2004-02-14 51 2004-02-15 - 2004-02-21 45 2004-02-22 - 2004-02-28 61 2004-02-29 - 2004-03-06 51 2004-03-07 - 2004-03-13 48 2004-03-14 - 2004-03-20 50 2004-03-21 - 2004-03-27 56 2004-03-28 - 2004-04-03 59 </code></pre> <hr> <h2>Output file when reading and writing contents:</h2> <pre><code>Web Search Interest: foobar ਍圀漀爀氀搀眀椀搀攀㬀 ㈀  㐀 ⴀ 瀀爀攀猀攀渀琀ഀഀ ਍䤀渀琀攀爀攀猀琀 漀瘀攀爀 琀椀洀攀ഀഀ Week foobar ਍㈀  㐀ⴀ ㄀ⴀ 㐀 ⴀ ㈀  㐀ⴀ ㄀ⴀ㄀ ऀ㐀㐀ഀഀ 2004-01-11 - 2004-01-17 44 ਍㈀  㐀ⴀ ㄀ⴀ㄀㠀 ⴀ ㈀  㐀ⴀ ㄀ⴀ㈀㐀ऀ㌀㜀ഀഀ 2004-01-25 - 2004-01-31 40 ਍㈀  㐀ⴀ ㈀ⴀ ㄀ ⴀ ㈀  㐀ⴀ ㈀ⴀ 㜀ऀ㐀㤀ഀഀ 2004-02-08 - 2004-02-14 51 ਍㈀  㐀ⴀ ㈀ⴀ㄀㔀 ⴀ ㈀  㐀ⴀ ㈀ⴀ㈀㄀ऀ㐀㔀ഀഀ 2004-02-22 - 2004-02-28 61 ਍㈀  㐀ⴀ ㈀ⴀ㈀㤀 ⴀ ㈀  㐀ⴀ ㌀ⴀ 㘀ऀ㔀㄀ഀഀ 2004-03-07 - 2004-03-13 48 ਍㈀  㐀ⴀ ㌀ⴀ㄀㐀 ⴀ ㈀  㐀ⴀ ㌀ⴀ㈀ ऀ㔀 ഀഀ 2004-03-21 - 2004-03-27 56 ਍㈀  㐀ⴀ ㌀ⴀ㈀㠀 ⴀ ㈀  㐀ⴀ 㐀ⴀ ㌀ऀ㔀㤀ഀഀ 2004-04-04 - 2004-04-10 69 ਍㈀  㐀ⴀ 㐀ⴀ㄀㄀ ⴀ ㈀  㐀ⴀ 㐀ⴀ㄀㜀ऀ㘀㔀ഀഀ 2004-04-18 - 2004-04-24 51 ਍㈀  㐀ⴀ 㐀ⴀ㈀㔀 ⴀ ㈀  㐀ⴀ 㔀ⴀ ㄀ऀ㔀㄀ഀഀ 2004-05-02 - 2004-05-08 56 ਍㈀  㐀ⴀ 㔀ⴀ 㤀 ⴀ ㈀  㐀ⴀ 㔀ⴀ㄀㔀ऀ㔀㈀ഀഀ 2004-05-16 - 2004-05-22 54 ਍㈀  㐀ⴀ 㔀ⴀ㈀㌀ ⴀ ㈀  㐀ⴀ 㔀ⴀ㈀㤀ऀ㔀㔀ഀഀ 2004-05-30 - 2004-06-05 74 ਍㈀  㐀ⴀ 㘀ⴀ 㘀 ⴀ ㈀  㐀ⴀ 㘀ⴀ㄀㈀ऀ㔀㜀ഀഀ 2004-06-13 - 2004-06-19 50 ਍㈀  㐀ⴀ 㘀ⴀ㈀  ⴀ ㈀  㐀ⴀ 㘀ⴀ㈀㘀ऀ㔀㐀ഀഀ 2004-06-27 - 2004-07-03 58 ਍㈀  㐀ⴀ 㜀ⴀ 㐀 ⴀ ㈀  㐀ⴀ 㜀ⴀ㄀ ऀ㔀㤀ഀഀ 2004-07-11 - 2004-07-17 59 ਍㈀  㐀ⴀ 㜀ⴀ㄀㠀 ⴀ ㈀  㐀ⴀ 㜀ⴀ㈀㐀ऀ㘀㈀ഀഀ </code></pre> <hr>
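<p>The steps described above (read the file, extract the trends section, split it into lines and columns) can be sketched as follows. This is only a sketch under assumptions the post does not confirm: that the I4S export is UTF-16 encoded with a byte-order mark (the pattern of every second line being garbled is consistent with UTF-16 bytes passing through Windows text-mode newline translation), that the columns are tab-separated, and that Python 3 is available. The names <code>decode_export</code>, <code>read_export</code> and <code>extract_trends</code> are hypothetical helpers, not part of the original script.</p>

```python
import codecs


def decode_export(raw):
    """Decode raw bytes; assume UTF-16 if a byte-order mark is present."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and picks the byte order from it.
        return raw.decode("utf-16")
    # Assumption: anything without a UTF-16 BOM is UTF-8 (or plain ASCII).
    return raw.decode("utf-8")


def read_export(path):
    """Read the file in binary mode so no newline translation touches the bytes."""
    with open(path, "rb") as handle:
        return decode_export(handle.read())


def extract_trends(text):
    """Return (week_start, week_end, value) tuples from the 'Week' section."""
    rows = []
    in_section = False
    for line in text.splitlines():
        if not in_section:
            # The data rows start after the header line beginning with "Week".
            in_section = line.startswith("Week")
            continue
        if not line.strip():
            break  # a blank line is assumed to end the section
        period, value = line.split("\t")
        start, end = period.split(" - ")
        rows.append((start, end, int(value)))
    return rows


# Usage with an in-memory stand-in for the first lines of an export.
sample = (
    "Web Search Interest: foobar\r\n"
    "Worldwide; 2004 - present\r\n"
    "\r\n"
    "Interest over time\r\n"
    "Week\tfoobar\r\n"
    "2004-01-04 - 2004-01-10\t44\r\n"
    "2004-01-11 - 2004-01-17\t44\r\n"
    "\r\n"
)
raw = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")
rows = extract_trends(decode_export(raw))
```

<p>Reading in binary mode and decoding explicitly keeps the two-byte code units intact. If the file turns out to use a different encoding, only <code>decode_export</code> would need to change.</p>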