Note that there are some explanatory texts on larger screens.

plurals
  1. POTable schema design while using innodb
    text
    copied!<p>I have encountered a problem when designing the table schema for our system.</p> <p>Here is the situation:</p> <ol> <li><p>our system has a lot of items ( more than 20 millions ), each item has an unique id, but for each item there can be lots of records. For example for the item with id 1 there are about 5000 records and each record has more than 20 attributes. The needs to be identified by its id and status of one or more of its attributes for use in <code>select</code>, <code>update</code> or <code>delete</code>.</p></li> <li><p>I want to use innodb</p></li> </ol> <p>But the problem is when using innodb, there must be an cluster index. Due to the situation described above it seems had to find a cluster index so I can only use an <code>auto_increment int</code> as the key</p> <p>The current design is as follows:</p> <pre><code>create table record ( item_key int(10) unsigned NOT NULL AUTO_INCREMENT, item_id int(10) unsigned NOT NULL, attribute_1 char(32) NOT NULL, attribute_2 int(10) unsigned NOT NULL, . . . . . attribute_20 int(10) unsigned NOT NULL, PRIMARY KEY (`item_key`), KEY `iattribute_1` (`item_id`,`attribute_1`), KEY `iattribute_2` (`item_id`,`attribute_2`) ) ENGINE=InnoDB AUTO_INCREMENT=22 DEFAULT CHARSET=latin1 </code></pre> <p>the sql statement:</p> <pre><code>select * from records where item_id=1 and attribute_1='a1' and attribute_2 between 10 and 1000; </code></pre> <p>the <code>update</code> and <code>delete</code> statement are similar.</p> <p>I don't think this a good design, but I can't think of anything else; all suggestions welcome.</p> <hr> <p>Sorry if I didn't make the question clear.</p> <ol> <li><p>What I want to access ( select, update, delete, insert) is the records, not the items. The items have there own attributes, but in the descriptions above, the attributes that I mentioned are belongs to the records.</p></li> <li><p>Every item can have many records, like item 1 have about 5000 records.</p></li> <li><p>Every record have 42 attributes, some of them can be NULL, every record has an unique id, this id is unique among different items, but this id is an string not an number</p></li> <li><p>I want to access the records in this way:</p> <p>A. I will only get(or update or delete) the records that belongs to one specific item at on time or in one query</p> <p>B. I will get or update the values of all attributes or some specific attributes in the query</p> <p>C. The attributes that in the condition of the query may not the same as the attributes that I want.</p></li> </ol> <p>So there could be some SQL statements like:</p> <pre><code>Select attribute_1, attribute_N from record_table_1 where item_id=1 and attribute_K='some value' and attribute_M between 10 and 100 </code></pre> <p>And the reasons that why I think the original design is not good are:</p> <ol> <li><p>I can't choose an attribute or the record id as the primary key, because it is no use, in every query, I have to assign the item id and some attributes as the query condition ( like "where item_id=1 and attribute_1='value1' and attribte_2 between 2 and 3), so I can only use an auto_increment int number as the primary key. The result of this is that each query have to scan two b-trees, and it look like that scan of the secondary index is not effective.</p></li> <li><p>Also compound keys seems useless, because the condition of the query could vary among many attributes.</p></li> <li><p>With the original design, it seems that I have add a lot of indexes to satisfy different queries, otherwise I have to deal with the full table scan problem, but it is obviously that too many indexes is not good for update, delete, insert operations.</p></li> </ol>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload