<p>As Assaf said, I wouldn't worry about storing duplicated articles if they come from different feeds, for now at least. The complication it would add isn't worth the few kilobytes of space you'd save.</p>
<p>I suppose you could take a SHA-1 hash of the content, run <code>SELECT id FROM articles WHERE hash = $hash</code>, and if a row exists, set a <code>content_link_id</code> column that points the article's content at that other row. But what if you have two articles:</p>
<pre><code>id: 1
title: My First Post!
feed: Bobs site
content: Hi!
hash: abc
link: no
content_link_id:

id: 2
title: My First Post!
feed: Planet Randompeople Aggregator
content:
hash: abc
content_link_id: 1
</code></pre>
<p>This works fine, and you've saved 3 bytes by not duplicating the article (obviously more if the article were longer).</p>
<p>But what happens when Bob decides to add adverts to his RSS feed, changing the content from <code>Hi!</code> to <code>Hi!&lt;p&gt;&lt;img src='...'&gt;&lt;/p&gt;</code>, while Planet Randompeople strips out all images? Then, to update a feed item, you have to check every row whose <code>content_link_id</code> points at the article you are updating and compare the new item's hash with the hash of those linking rows. If it differs, you have to break the link, copy the old data to the linking item, and then copy the new content to the original item.</p>
<p>There are possibly neater ways to do that, but my point is that it can get very complicated, and you will probably only save a few kilobytes (assuming the database engine doesn't do any compression itself) on a very limited subset of posts.</p>
<p>Other than that, having a table of <code>feeds</code> and a table of <code>items</code> seems sensible, and it is how most other RSS-storing databases I've seen deal with it.</p>
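<p>To make the bookkeeping concrete, here is a rough sketch of the linking scheme in Python with an in-memory SQLite database. The column names follow the example above; the function names (<code>open_db</code>, <code>add_article</code>, <code>update_article</code>, <code>get_content</code>) and the exact schema are assumptions made up for illustration, not something from the question. Treat it as one possible way to do it rather than a recommended design.</p>

```python
import hashlib
import sqlite3


def sha1(text: str) -> str:
    """Hex SHA-1 digest of the article content."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    # Hypothetical schema: a single articles table, as in the example rows.
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE articles (
               id INTEGER PRIMARY KEY,
               title TEXT,
               feed TEXT,
               content TEXT,            -- NULL when content_link_id is set
               hash TEXT,
               content_link_id INTEGER REFERENCES articles(id)
           )"""
    )
    return db


def add_article(db, title, feed, content) -> int:
    h = sha1(content)
    # Only rows that hold their own content may be used as link targets.
    row = db.execute(
        "SELECT id FROM articles WHERE hash = ? AND content_link_id IS NULL",
        (h,),
    ).fetchone()
    if row:
        stored_content, link = None, row[0]   # duplicate: store a link only
    else:
        stored_content, link = content, None  # first copy: store the content
    cur = db.execute(
        "INSERT INTO articles (title, feed, content, hash, content_link_id)"
        " VALUES (?, ?, ?, ?, ?)",
        (title, feed, stored_content, h, link),
    )
    return cur.lastrowid


def get_content(db, article_id) -> str:
    content, link = db.execute(
        "SELECT content, content_link_id FROM articles WHERE id = ?",
        (article_id,),
    ).fetchone()
    return get_content(db, link) if link is not None else content


def update_article(db, article_id, new_content) -> None:
    old_content, old_hash, link = db.execute(
        "SELECT content, hash, content_link_id FROM articles WHERE id = ?",
        (article_id,),
    ).fetchone()
    new_hash = sha1(new_content)
    if new_hash == old_hash:
        return  # nothing changed
    if link is None:
        # This row is an original: every row linking to it must receive a
        # copy of the old content before the original is overwritten.
        db.execute(
            "UPDATE articles SET content = ?, content_link_id = NULL"
            " WHERE content_link_id = ?",
            (old_content, article_id),
        )
    # Store the new content on the row itself and drop any link it had.
    db.execute(
        "UPDATE articles SET content = ?, hash = ?, content_link_id = NULL"
        " WHERE id = ?",
        (new_content, new_hash, article_id),
    )
```

<p>Even this small version has rough edges: once a link is broken, several former linking rows may each hold an identical copy again, and nothing re-links them. That is the kind of extra complexity the simple <code>feeds</code> and <code>items</code> layout avoids.</p>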
 
