StackOverflow2013


plurals

PO
primarykey
Id
19573406
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2013-10-24T18:10:38.003
FavoriteCount
0
LastActivityDate
2013-10-28T15:53:54.630
LastEditDate
2013-10-28T15:53:54.630
LastEditorUserId
78792
OwnerUserId
78792
ParentId
19502723
PostTypeId
2
Score
1
ViewCount
0
LastEditorDisplayName
text
Body
I think that @j_random_hacker and @Ashalynd are on the right track regarding using this algorithm in most Perl implementations. The datatypes you're using are going to use more memory than is absolutely needed for the calculations. So this is "normal" in that you should expect to see this kind of memory usage for how you've written this algorithm in Perl. You may have other problems in surrounding code that are using a lot of memory, but this algorithm will hit your memory hard with large sequences.
You can address some of the memory issues by changing the datatypes that you're using, as @Ashalynd suggests. You could try changing the hash which holds score and pointer into an array, and changing the string pointers into integer values. Something like this might get you some benefit while still maintaining readability:
<pre><code>use strict;
use warnings;

# define constants for array positions and pointer values
# so the code is still readable.
# (If you have the "Readonly" CPAN module you may want to use it for constants
# instead although none of the downsides of the "constant" pragma apply in this code.)
use constant {
    SCORE    => 0,
    POINTER  => 1,
    DIAGONAL => 0,
    LEFT     => 1,
    UP       => 2,
    NONE     => 3,
};

...

sub semiGlobal2 {
    my ( $seq1, $seq2, $MATCH, $MISMATCH, $GAP ) = @_;

    # initialization: first row to 0 ;
    my @matrix;

    # score and pointer are now stored in an array
    # using the defined constants as indices
    $matrix[0][0][SCORE] = 0;

    # pointer value is now a constant integer
    $matrix[0][0][POINTER] = NONE;

    for ( my $j = 1 ; $j <= length($seq1) ; $j++ ) {
        $matrix[0][$j][SCORE]   = 0;
        $matrix[0][$j][POINTER] = NONE;
    }

    for ( my $i = 1 ; $i <= length($seq2) ; $i++ ) {
        $matrix[$i][0][SCORE]   = $GAP * $i;
        $matrix[$i][0][POINTER] = UP;
    }

    ... # continue to make the appropriate changes throughout the code
</code></pre>
However, when I tested this I didn't get a huge benefit when attempting to align a 3600-character string within a 5500-character string of random data.
I programmed my code to abort when it consumed more than 2 GB of memory. The original code aborted after 23 seconds, while the one using constants and an array instead of a hash aborted after 32 seconds. If you really want to use this specific algorithm, I'd check out the performance of <a href="http://search.cpan.org/dist/Algorithm-NeedlemanWunsch" rel="nofollow">Algorithm::NeedlemanWunsch</a>. It doesn't look like it's very mature, but it may have addressed your performance issues. Otherwise, look into writing an <a href="http://search.cpan.org/dist/Inline" rel="nofollow">Inline</a> or <a href="http://perldoc.perl.org/perlxs.html" rel="nofollow">Perl XS</a> wrapper around a C implementation.
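If you stay in pure Perl, one further way to shrink the datatypes is to stop storing a Perl scalar per cell at all. The following is only a rough sketch that I have not benchmarked against the original code: it assumes the traceback needs only the pointers, so it keeps just two rows of scores and packs each pointer into 2 bits with the built-in vec(). The sub name fill_compact and the choice to return the best score in the last row are my own assumptions, not part of the original semiGlobal2.

```perl
use strict;
use warnings;

# Pointer values; all four fit in 2 bits.
use constant { DIAGONAL => 0, LEFT => 1, UP => 2, NONE => 3 };

# Hypothetical helper (not from the original code): fills the table for
# aligning $seq2 inside $seq1 and returns the best score in the last row,
# the column where it occurs, and a reference to the packed pointer rows.
sub fill_compact {
    my ( $seq1, $seq2, $MATCH, $MISMATCH, $GAP ) = @_;
    my ( $cols, $rows ) = ( length $seq1, length $seq2 );

    # One packed string per row; vec() stores each pointer in 2 bits.
    my @pointers = ('') x ( $rows + 1 );
    vec( $pointers[0], $_, 2 ) = NONE for 0 .. $cols;

    # First row is all zeros, as in the original initialization.
    my @prev = (0) x ( $cols + 1 );

    for my $i ( 1 .. $rows ) {
        my @curr = ( $GAP * $i );    # first column costs one gap per row
        vec( $pointers[$i], 0, 2 ) = UP;
        my $c2 = substr( $seq2, $i - 1, 1 );

        for my $j ( 1 .. $cols ) {
            my $sub = substr( $seq1, $j - 1, 1 ) eq $c2 ? $MATCH : $MISMATCH;

            # Start with the diagonal move, then see if a gap scores higher.
            my ( $best, $ptr ) = ( $prev[ $j - 1 ] + $sub, DIAGONAL );
            my $left = $curr[ $j - 1 ] + $GAP;
            ( $best, $ptr ) = ( $left, LEFT ) if $left > $best;
            my $up = $prev[$j] + $GAP;
            ( $best, $ptr ) = ( $up, UP ) if $up > $best;

            $curr[$j] = $best;
            vec( $pointers[$i], $j, 2 ) = $ptr;
        }
        @prev = @curr;    # only the previous row of scores is kept
    }

    # Leftmost column holding the highest score in the last row.
    my $best_col = 0;
    for my $j ( 1 .. $cols ) {
        $best_col = $j if $prev[$j] > $prev[$best_col];
    }
    return ( $prev[$best_col], $best_col, \@pointers );
}
```

Calling fill_compact( "ACGTACGT", "GTAC", 1, -1, -2 ) gives a score of 4 ending at column 6, and a traceback would read each pointer back with vec( $pointers->[$i], $j, 2 ). At 2 bits per cell, the pointers for a 3600 x 5500 table come to roughly 5 MB, but the vec() calls cost CPU time, so this may well be slower than the nested arrays.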
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POperl blowing up in sequence alignment by dynamic programming
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USbenrifkah
UserOwnerUserId
1. USbenrifkah
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COA reference to a single array holding two scalars is cheaper than two arrays each holding a single scalar. For example: $matrix[$x][$y] = [ 'score', 'pointer' ]. My testing showed a savings of just under 10%.
 singulars
 PostPostId
 PO
 UserUserId
 USJim Black
2. CO@JimBlack Are you suggesting that the code in my answer has "two arrays having a single scalar"? I'm not sure where you're seeing this.
 singulars
 PostPostId
 PO
 UserUserId
 USbenrifkah
