StackOverflow2013


plurals

POSpring Batch: migrating 1 to n relationship where n is potentially huge
primarykey
Id
3529254
data
AcceptedAnswerId
3530046
AnswerCount
1
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2010-08-20T08:21:47.353
FavoriteCount
2
LastActivityDate
2010-08-23T13:59:52.337
LastEditDate
LastEditorUserId
0
OwnerUserId
342852
ParentId
0
PostTypeId
1
Score
3
ViewCount
1071
LastEditorDisplayName
text
Body
I am experienced with Spring, but new to Spring Batch. I now have the task of migrating data from a simple structure in one database to a more complex one in another. The data structure corresponds to an object hierarchy that I will name like this:
<pre><code>OldParent 1 --> n OldChild // old system
NewParent 1 --> n NewChild // new system
</code></pre>
In the old db there are only two tables. In the new system things get a lot more complex and there are 8 tables, but that is irrelevant for now.

Basically I would like to use a simple JDBC-based solution with row mappers reading from OldParent and converting to NewParent. So here would be a basic configuration snippet:
<pre><code><batch:job id="migration">
    <batch:step id="convertLegacyData">
        <batch:tasklet>
            <batch:chunk reader="parentReader" writer="parentWriter" commit-interval="200" />
        </batch:tasklet>
    </batch:step>
</batch:job>
</code></pre>
In this scenario, the parentReader would acquire and convert the OldChild objects, probably delegating to childReader / childWriter objects. The problem is this: while there are several hundred thousand parents, each parent can have anywhere from zero to several million children. A commit interval counted in parents would therefore not help at all, because a single chunk of 200 parents could contain hundreds of millions of child rows, yet I would very much like to have a configurable commit interval.

So another solution would be to make the workflow child-based:
<pre><code><batch:job id="migration">
    <batch:step id="convertLegacyData">
        <batch:tasklet>
            <batch:chunk reader="childReader" writer="childWriter" commit-interval="200" />
        </batch:tasklet>
    </batch:step>
</batch:job>
</code></pre>
In this scenario, the childReader would also have to read OldParent objects and write NewParents, delegating to parentReader and parentWriter objects. The major drawback here is that I would lose all OldParents that don't have associated OldChild objects.

The third possible scenario would be to have two different workflows, one for <code>OldParent -> NewParent</code> and one for <code>OldChild -> NewChild</code>. I would have to maintain a mapping table that stores the relationship between OldParent and NewParent ids, but I could use standard configurations including commit-interval.

Are there other possibilities? Which of these would you recommend as best practice?
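To make the third scenario concrete, here is a minimal in-memory sketch of the two-pass idea in plain Java (Java 16+ for records). It is not Spring Batch API: the class `TwoPassMigrationSketch`, the records `OldChild` and `NewChild`, and both method names are hypothetical, and the `Map` merely stands in for the mapping table between OldParent and NewParent ids. Each inner list returned by `migrateChildren` stands in for one committed chunk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two-workflow approach; all names are assumptions. */
final class TwoPassMigrationSketch {

    /** Old child row: its own id plus the id of its OldParent. */
    record OldChild(long id, long oldParentId) {}

    /** New child row, already pointing at the NewParent id. */
    record NewChild(long oldChildId, long newParentId) {}

    /**
     * Pass 1: assign a new id to every OldParent, including those without
     * children, and record old id -> new id (the "mapping table").
     */
    static Map<Long, Long> migrateParents(List<Long> oldParentIds, long firstNewId) {
        Map<Long, Long> oldToNew = new HashMap<>();
        long next = firstNewId;
        for (long oldId : oldParentIds) {
            oldToNew.put(oldId, next++);
        }
        return oldToNew;
    }

    /**
     * Pass 2: convert children in chunks of commitInterval items, resolving
     * each parent through the mapping built in pass 1.
     */
    static List<List<NewChild>> migrateChildren(
            List<OldChild> children, Map<Long, Long> oldToNew, int commitInterval) {
        List<List<NewChild>> committedChunks = new ArrayList<>();
        List<NewChild> chunk = new ArrayList<>();
        for (OldChild child : children) {
            Long newParentId = oldToNew.get(child.oldParentId());
            if (newParentId == null) {
                throw new IllegalStateException(
                        "No migrated parent for OldChild " + child.id());
            }
            chunk.add(new NewChild(child.id(), newParentId));
            if (chunk.size() == commitInterval) {
                committedChunks.add(chunk); // stands in for a commit
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            committedChunks.add(chunk); // final, possibly smaller chunk
        }
        return committedChunks;
    }
}
```

In this sketch the chunk size depends only on the number of children processed, never on how many children a single parent has, and parents without children still get migrated in the first pass.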
Tags
<java><spring><jdbc><data-migration><spring-batch>
Title
Spring Batch: migrating 1 to n relationship where n is potentially huge
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USSean Patrick Floyd
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POSpring Batch: migrating 1 to n relationship where n is potentially huge
 UserUserId
 USSean Patrick Floyd
 VoteTypeVoteTypeId
 VTBountyStart
2. VO
 singulars
 PostPostId
 POSpring Batch: migrating 1 to n relationship where n is potentially huge
 UserUserId
 USPascal Thivent
 VoteTypeVoteTypeId
 VTFavorite
3. VO
 singulars
 PostPostId
 POSpring Batch: migrating 1 to n relationship where n is potentially huge
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. This table or related slice is empty.
