Relationships created by a rake task are not persisted through the rails server
<p>I'm working on my first project using Neo4j. I'm parsing Wikipedia's page and pagelinks dumps to create a graph where the nodes are pages and the edges are links. I've defined some rake tasks that download the dumps, parse the data, and save it in a Neo4j database. At the end of the rake task I print the number of pages and links created, and some of the pages with the most links. Here is the output of the rake task for the <a href="https://za.wikipedia.org/wiki/Yiebdaeuz" rel="nofollow">zawiki</a>.</p>
<pre><code>$ rake wiki[zawiki]
[ omitted ]
...
:: Done parsing zawiki
:: 1984 pages
:: 2144 links
:: The pages with the most links are:
9625.0 - Emijrp/List_of_Wikipedians_by_number_of_edits_(bots_included): 40
1363.0 - Gvangjsih_Bouxcuengh_Swcigih: 30
9112.0 - Fuzsuih: 27
1367.0 - Cungzcoj: 26
9279.0 - Vangz_Yenfanh: 19
</code></pre>
<p>It looks like pages and links are being created, but when I start a rails console or the server, the links aren't found.</p>
<pre><code>$ rails c
jruby-1.7.5 :013 &gt; Pages.all.count
 =&gt; 1984
jruby-1.7.5 :003 &gt; Pages.all.reduce(0) { |count, page| count + page.links.count}
 =&gt; 0
jruby-1.7.5 :012 &gt; Pages.all.sort_by { |p| p.links.count }.reverse[0...5].map { |p| p.links.count }
 =&gt; [0, 0, 0, 0, 0]
</code></pre>
<p>Here is the rake task, and <a href="https://github.com/everett1992/wiki_graph" rel="nofollow">this is the project's GitHub page</a>. Can anyone tell me why the links aren't saved?</p>
<pre><code>DUMP_DIR = Rails.root.join('lib','assets')

desc "Download wiki dumps and parse them"
task :wiki, [:wiki] =&gt; 'wiki:all'

namespace :wiki do
  task :all, [:wiki] =&gt; [:get, :parse] do |t, args|
    # Print info about the newly created pages and links.
    link_count = 0
    Pages.all.each do |page|
      link_count += page.links.count
    end
    indent "Done parsing #{args[:wiki]}"
    indent "#{Pages.count} pages"
    indent "#{link_count} links"
    indent "The pages with the most links are:"
    Pages.all.sort_by { |a| a.links.count }.reverse[0...5].each do |page|
      puts "#{page.page_id} - #{page.title}: #{page.links.count}"
    end
  end

  desc "Download wiki page and page links database dumps to /lib/assets"
  task :get, :wiki do |t, args|
    indent "Downloading dumps"
    sh "#{Rails.root.join('lib', "get_wiki").to_s} #{args[:wiki]}"
    indent "Done"
  end

  desc "Parse all dumps"
  task :parse, [:wiki] =&gt; 'parse:all'

  namespace :parse do
    task :all, [:wiki] =&gt; [:pages, :pagelinks]

    desc "Read wiki page dumps from lib/assests into the database"
    task :pages, [:wiki] =&gt; :environment do |t, args|
      parse_dumps('page', args[:wiki]) do |obj|
        page = Pages.create_from_dump(obj)
      end
      indent = "Created #{Pages.count} pages"
    end

    desc "Read wiki pagelink dumps from lib/assests into the database"
    task :pagelinks, [:wiki] =&gt; :environment do |t, args|
      errors = 0
      parse_dumps('pagelinks', args[:wiki]) do |from_id, namespace, to_title|
        from = Pages.find(:page_id =&gt; from_id)
        to = Pages.find(:title =&gt; to_title)
        if to.nil? || from.nil?
          errors = errors.succ
        else
          from.links &lt;&lt; to
          from.save
        end
      end
    end
  end
end

def indent *args
  print ":: "
  puts args
end

def parse_dumps(dump, wiki_match, &amp;block)
  wiki_match ||= /\w+/
  DUMP_DIR.entries.each do |file|
    file, wiki = *(file.to_s.match(Regexp.new "(#{wiki_match})-#{dump}.sql"))
    if file
      indent "Parsing #{wiki} #{dump.pluralize} from #{file}"
      each_value(DUMP_DIR.join(file), &amp;block)
    end
  end
end

def each_value(filename)
  f = File.open(filename)
  num_read = 0
  begin # read file until line starting with INSERT INTO
    line = f.gets
  end until line.match /^INSERT INTO/

  begin
    line = line.match(/\(.*\)[,;]/)[0] # ignore beginning of line until (...) object
    begin
      yield line[1..-3].split(',').map { |e| e.match(/^['"].*['"]$/) ? e[1..-2] : e.to_f }
      num_read = num_read.succ
      line = f.gets.chomp
    end while(line[0] == '(') # until next insert block, or end of file
  end while line.match /^INSERT INTO/ # Until line doesn't start with (...
  f.close
end
</code></pre>
<p>app/models/pages.rb</p>
<pre><code>class Pages &lt; Neo4j::Rails::Model
  include Neo4j::NodeMixin

  has_n(:links).to(Pages)

  property :page_id
  property :namespace, :type =&gt; Fixnum
  property :title, :type =&gt; String
  property :restrictions, :type =&gt; String
  property :counter, :type =&gt; Fixnum
  property :is_redirect, :type =&gt; Fixnum
  property :is_new, :type =&gt; Fixnum
  property :random, :type =&gt; Float
  property :touched, :type =&gt; String
  property :latest, :type =&gt; Fixnum
  property :length, :type =&gt; Fixnum
  property :no_title_convert, :type =&gt; Fixnum

  def self.create_from_dump(obj)
    # TODO: I wonder if there is a way to combine these calls
    page = {}
    # order of this array is important, it corresponds to the data in obj
    attrs = [:page_id, :namespace, :title, :restrictions, :counter, :is_redirect,
             :is_new, :random, :touched, :latest, :length, :no_title_convert]
    attrs.each_index { |i| page[attrs[i]] = obj[i] }
    page = Pages.create(page)
    return page
  end
end
</code></pre>
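<p>For readers following the parsing step, here is a minimal standalone sketch of what the block inside <code>each_value</code> yields for a single tuple line. The helper name <code>parse_tuple</code> and the sample input are hypothetical and not part of the project. The mapping expression is copied from the rake task. It only illustrates why <code>page_id</code> prints as <code>9625.0</code> in the output above, and is not offered as an explanation for the missing links.</p>

```ruby
# Hypothetical helper that isolates the per-line mapping used in each_value.
# Input is one "(...)," or "(...);" tuple line from an SQL dump.
def parse_tuple(line)
  # Drop the leading "(" and the trailing ")," or ");", then split on commas.
  line[1..-3].split(',').map do |e|
    # Quoted fields become strings without their quotes;
    # every other field is converted with to_f, so integers become Floats.
    e.match(/^['"].*['"]$/) ? e[1..-2] : e.to_f
  end
end

# Made-up sample tuple in the shape (page_id, namespace, title).
p parse_tuple("(9625,2,'Emijrp'),")
```

<p>Because the split is a plain <code>split(',')</code>, a quoted title that itself contains a comma would be broken into several fields under this approach.</p>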
 
