Parsing the same file in parallel threads does NOT add speed; it just costs extra resources.

A less problematic and more efficient text2db optimisation consists of:

- bulk read the file (rather than line by line: read 1 MB at once, process it, then read the next MB)

- bulk insert into the database - in MySQL like this:

  ```sql
  insert into urtable
  values ('val1','val2'),
         ('val1','val2');
  ```

  (example stolen from http://bytes.com/topic/sql-server/answers/585793-insert-into-using-select-values-inserting-multiple-rows - sorry for being too lazy to make one up by myself)

- try to prevent SQL back and forth (meaning: if SELECT output is needed from the database to enrich your dataset, read it upfront instead of over and over while walking through the file)

A Python sketch of bulk reading, batched inserts and the upfront lookup follows at the end of this answer.

UPDATE ----

From the comment I took that there might be a need to get data from the database while parsing the file. Well, if you have to, you have to. BUT: try not to have to.

First of all: reading specific data upfront can be seen as caching or not. In a narrow sense, caching just means moving disk data to memory by some heuristic (without knowing what is going on), and I personally try to avoid that because heuristics can play against you. In a wider sense, caching is what I described before PLUS moving data from disk to memory that you can pinpoint (e.g. by ID or by any filter criteria). So I still dislike the narrow-sense part, but I do like the behaviour of selecting well-defined data upfront.

Secondly: in my experience, IF you work on a fully normalized data model, database reads during file parsing very often boil down to "give me the primary key(s) of what I dumped into the database before". This looks tricky when you write multiple rows at once. However, especially in MySQL you can rely on each INSERT statement (even a multiple-row insert) being atomic; you get the ID of the first inserted row from last_insert_id() and from there you can work out the IDs of all the records you just wrote (see the second sketch below). I am pretty sure other database systems offer something similar.

Thirdly: parsing LARGE files is something I would run as a job, triggered by only ONE technical user, while ensuring that no more than one of these processes runs in parallel. Otherwise you have to work around all sorts of issues, starting with file locking and going on to session and read/write permission management. Running it as a job also justifies (at least in my personal policy) allocating LOTS of RAM - depending on cost and on how important speed is. With that much RAM available, loading even a 100 K row keyword-to-id table into memory upfront is not a problem.
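Here is the first sketch: the bullet points above turned into a small loader. This is only a minimal illustration, assuming the mysql-connector-python driver, a tab-separated input file, a hypothetical target table `urtable(val1, val2)` and a hypothetical `keywords(keyword, id)` lookup table; the connection details, file name, chunk size and batch size are placeholders, not recommendations.

```python
import mysql.connector  # assumption: mysql-connector-python is the driver in use

CHUNK_SIZE = 1024 * 1024   # read ~1 MB at a time instead of line by line
BATCH_SIZE = 1000          # rows per multi-row INSERT

def iter_lines(path, chunk_size=CHUNK_SIZE):
    """Yield complete lines while reading the file in large chunks."""
    remainder = ""
    with open(path, "r", encoding="utf-8") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            chunk = remainder + chunk
            lines = chunk.split("\n")
            remainder = lines.pop()   # keep a possibly incomplete last line
            for line in lines:
                yield line
    if remainder:
        yield remainder

conn = mysql.connector.connect(user="loader", password="secret", database="mydb")
cur = conn.cursor()

# read lookup data upfront instead of issuing one SELECT per parsed line
cur.execute("SELECT keyword, id FROM keywords")
keyword_to_id = {keyword: kw_id for keyword, kw_id in cur.fetchall()}

def flush(batch):
    # executemany lets the driver send the batch as one multi-row INSERT
    cur.executemany("INSERT INTO urtable (val1, val2) VALUES (%s, %s)", batch)
    conn.commit()

batch = []
for line in iter_lines("big_input.txt"):
    keyword, val2 = line.split("\t")   # assumed tab-separated input
    batch.append((keyword_to_id.get(keyword), val2))
    if len(batch) >= BATCH_SIZE:
        flush(batch)
        batch = []
if batch:
    flush(batch)

cur.close()
conn.close()
```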
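And the second sketch, the last_insert_id() trick from the "Secondly" point. Again only an illustration: it assumes `urtable` has an AUTO_INCREMENT primary key and that either the consecutive auto-increment lock mode is in effect or no other session inserts into the table at the same time (which the "Thirdly" point enforces anyway), so a single multi-row INSERT gets a contiguous block of IDs starting at LAST_INSERT_ID(). The example rows and connection details are hypothetical.

```python
import mysql.connector  # assumption: same driver and connection details as above

conn = mysql.connector.connect(user="loader", password="secret", database="mydb")
cur = conn.cursor()

rows = [("alpha", "a1"), ("beta", "b1"), ("gamma", "g1")]

# build ONE multi-row INSERT so the whole batch is a single atomic statement
placeholders = ", ".join(["(%s, %s)"] * len(rows))
params = [value for row in rows for value in row]
cur.execute("INSERT INTO urtable (val1, val2) VALUES " + placeholders, params)

# LAST_INSERT_ID() after a multi-row INSERT is the auto-increment id of the
# FIRST row of that statement; the remaining rows follow it consecutively
cur.execute("SELECT LAST_INSERT_ID()")
first_id = cur.fetchone()[0]
ids = list(range(first_id, first_id + len(rows)))

# map what we just wrote back to its primary keys, no extra SELECT needed
id_by_val1 = dict(zip((val1 for val1, _ in rows), ids))
print(id_by_val1)   # e.g. {'alpha': 101, 'beta': 102, 'gamma': 103}

conn.commit()
cur.close()
conn.close()
```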