Note that there are some explanatory texts on larger screens.

plurals
  1. POPreferred method for Materialized Views (Summary Tables) with MySQL
    primarykey
    data
    text
    <p>I am developing a project at work for which I need to create and maintain <strong>Summary Tables</strong> for performance reasons. I believe the correct term for this is <strong>Materialized Views</strong>.</p> <p>I have 2 main reasons to do this:</p> <ol> <li><p><strong>Denormalization</strong></p> <p>I normalized the tables as much as possible. So there are situations where I would have to join many tables to pull data. We work with MySQL Cluster, which has pretty poor performance when it comes to JOIN's.</p> <p>So I need to create Denormalized Tables that can run faster SELECT's.</p></li> <li><p><strong>Summarize Data</strong></p> <p>For example, I have a Transactions table with a few million records. The transactions come from different websites. The application needs to generate a report will display the daily or monthly transaction counts, and total revenue amounts per website. I don't want the report script to calculate this every time, so I need to generate a Summary Table that will have a breakdown by [site,date].</p> <p>That is just one simple example. There are many different kinds of summary tables I need to generate and maintain.</p></li> </ol> <p>In the past I have done these things by writing several cron scripts to keep each summary table updated. But in this new project, I am hoping to implement a more elegant and proper solution.</p> <p>I would prefer a PHP based solution, as I am not a server administrator, and I feel the most comfortable when I can control everything through my application code.</p> <hr> <p><em><strong>Solutions that I have considered:</em></strong></p> <ol> <li><p><strong>Copying VIEW's</strong></p> <p>If the resulting table can be represented as a single SELECT query, I can generate a VIEW. Since they are slow, there can be a cronjob that copies this VIEW into a real table.</p> <p>However, some of these SELECT queries can be so slow that it's not acceptable even for cronjobs. It is not very efficient to recreate the whole summary data, if older rows are not even being updated much.</p></li> <li><p><strong>Custom Cronjobs for each Summary Table</strong></p> <p>This is the solution I have used before, but now I am trying to avoid it if possible. If there will be many summary tables, it can be messy to maintain.</p></li> <li><p><strong>MySQL Triggers</strong></p> <p>It is possible to add triggers to the main tables so that every time there is an INSERT, UPDATE or DELETE, the summary tables get updated accordingly.</p> <p>There would be no cronjobs and the summaries would be in real time. However if there is ever a need to rebuild a summary table from scratch, it would have to be done with another solution (probably #1 above).</p></li> <li><p><strong>Using ORM Hooks/Triggers</strong></p> <p>I am using Doctrine as my ORM. There is a way to add event listeners that will trigger stuff on INSERT/UPDATE/DELETE, which in turn can update the summary tables. In a sense this solution is similar to #3 above, but I will have better control over these triggers since they will be implemented in PHP.</p></li> </ol> <hr> <p><em><strong>Implementation Considerations:</em></strong></p> <ol> <li><p><strong>Complete Rebuilds</strong></p> <p>I want to avoid having to rebuild the summary tables, for efficiency, and only update for new data. But in case something goes wrong, I need the capability to rebuild the summary table from scratch using existing data on the main tables.</p></li> <li><p><strong>Ignoring UPDATE/DELETE on Old Data</strong></p> <p>Some summaries can assume that older records will never be updated or deleted, but only new records will be inserted. The summary process can save a lot of work by making the assumption that it doesn't need to check for updates on older data.</p> <p>But of course this won't apply to all tables.</p></li> <li><p><strong>Keeping a Log</strong></p> <p>Let's assume that I won't have access to, or do not want to use the binary MySQL logs. </p> <p>For summarizing new data, the summary process just needs to remember the last primary key id's for the last records it summarized. Next time it runs, it can summarize everything after that id. However, to keep track of older records that have been updated/deleted, it needs another log so it can go back and re-summarize that data.</p></li> </ol> <hr> <p>I would appreciate any kind of strategies, suggestions or links that can help. Thank you!</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload