StackOverflow2013


plurals

POOptimizing SDF filesize
primarykey
Id
11178699
data
AcceptedAnswerId
0
AnswerCount
1
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2012-06-24T15:32:38.180
FavoriteCount
1
LastActivityDate
2014-10-03T18:34:19.167
LastEditDate
2014-10-03T18:34:19.167
LastEditorUserId
64046
OwnerUserId
1443292
ParentId
0
PostTypeId
1
Score
0
ViewCount
297
LastEditorDisplayName
text
Body
I recently started learning LINQ and SQL. As a small project I'm writing a dictionary application for Windows Phone. The project is split into two applications. One application (which currently runs on my PC) generates an SDF file. The second app runs on my Windows Phone and searches the database. However, I would like to optimize the data usage.

The raw entries of the dictionary are stored in a TXT file of around 39MB. The file has the following layout:

<pre><code>germanWord \tab englishWord \tab group
germanWord \tab englishWord \tab group
</code></pre>

The file is parsed into an SDF database with the following tables.

Table Word with columns _version (rowversion), Id (int IDENTITY), Word (nvarchar(250)), Language (int)
This table contains every single word in the file. The language is a flag from my code that I use in case I want to add more languages later. A word-language pair is unique.

Table Group with columns _version (rowversion), GroupId (int IDENTITY), Caption (nvarchar(250))
This table contains the different groups. Every group is present exactly once.

Table Entry with columns _version (rowversion), EntryId (int IDENTITY), WordOneId (int), WordTwoId (int), GroupId (int)
This table links translations together. WordOneId and WordTwoId are foreign keys into the Word table; each contains the Id of a row there. GroupId defines the group the words belong to.

I chose this layout to reduce the data footprint. The raw text file contains some German (or English) words multiple times. There are around 60 groups that repeat themselves. Programmatically I reduce the word count from around 1.800.000 to around 1.100.000. There are around 50 rows in the Group table. Despite the reduced number of words, the SDF is around 80MB in filesize. That's more than twice the size of the raw data. In addition, to speed up searching for translations I plan to index the Word column of the Word table. By adding this index the file grows to over 130MB.

How can it be that the SDF with ~60% of the original data is twice as large? Is there a way to optimize the filesize?
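The normalization step described above (one row per unique word-language pair, one row per group, and an entry row linking them) can be sketched roughly as follows. This is a minimal illustration in Python, not the asker's C#/LINQ code. The function name `normalize`, the `GERMAN`/`ENGLISH` flag values and the sample lines are assumptions made for the example, and the ids are assumed to start at 1 like a default IDENTITY column.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical values for the Language flag; the question does not state them.
GERMAN, ENGLISH = 0, 1


def normalize(lines: Iterable[str]):
    """Split 'german<TAB>english<TAB>group' lines into three in-memory tables."""
    words: Dict[Tuple[str, int], int] = {}   # (word, language) -> Word.Id
    groups: Dict[str, int] = {}              # caption -> Group.GroupId
    entries: List[Tuple[int, int, int]] = [] # (WordOneId, WordTwoId, GroupId)

    def intern(table: dict, key) -> int:
        # Return the existing id, or assign the next 1-based id to a new key.
        return table.setdefault(key, len(table) + 1)

    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        german, english, group = (p.strip() for p in parts)
        entries.append((
            intern(words, (german, GERMAN)),
            intern(words, (english, ENGLISH)),
            intern(groups, group),
        ))
    return words, groups, entries


sample = ["Haus\thouse\tnoun\n", "Haus\thome\tnoun\n", "gehen\tgo\tverb\n"]
words, groups, entries = normalize(sample)
print(len(words), len(groups), len(entries))
```

In the sample, "Haus" appears twice in the input but only once in `words`, which is the kind of reduction the question reports (about 1.800.000 words down to about 1.100.000). The sketch only shows the deduplication; it says nothing about how the SDF file stores the resulting rows on disk.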
Tags
<c#><windows-phone><sql-server-ce>
Title
Optimizing SDF filesize
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USJasonMArcher
UserOwnerUserId
1. USChrisK
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POOptimizing SDF filesize
 UserUserId
 USChrisK
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
1. This table or related slice is empty.

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.