Implementing an Algorithm in Hadoop and Java
<p>Hi, I am trying to implement a newly developed bioinformatics algorithm in Hadoop and Java (I am not sure whether it can be done). I have searched the internet extensively for guidance on implementing algorithms on Hadoop, but all I find is "identify the parallel tasks and execute them over Hadoop". I would really appreciate pointers to good online resources on Hadoop with Java that include a solid example other than word count. I know Java well, but this is my first time using Hadoop. Any help would be appreciated.</p>

<p>This is what I want to do.</p>

<p>I have a very large text file (approx. 100 MB) whose lines are randomized sequences of the characters A, G, T and C. Within these long sequences, a short substring of length k (a k-mer), e.g. ATCGAGC, may form an important sequence. The same k-mer may occur in many lines of the file; each line is called a read 'r'.</p>

<p>I have to perform the following tasks:</p>

<ol>
<li><p>Identify the positions of the various k-mers in every line of text (r) in R (the whole set/file).</p></li>
<li><p>Keep track of the positions of each k-mer within a particular r.</p></li>
<li><p>Use two parameters to compare the k-mers in different r.</p></li>
<li><p>If the k-mers in two r satisfy the parameter comparison above, update the neighbor set N.</p></li>
</ol>

<p>If you are interested, here is the pseudocode:</p>

<pre><code>Given k, ĥ, ȇ

1. Make K by extracting all possible k-mers from Reads
2. for all reads r ∈ R do
       construct Gk[r] by scanning through r
   end for
3. for all k ∈ K do
       for all read pairs (r,s) ∈ Gk × GK
           if h(r,s) ≥ ĥ and dk &lt; ȇ h(r,s) then
               update the N
           end if
       end for
   end for

k       is a k-mer
K       is the set of all k
ĥ       minimum overlap distance
ȇ       maximum mismatch tolerance
N       neighbor set
h(r,s)  overlap length of r and s wrt k
d(r,s)  distance between r and s
</code></pre>
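<p>To make steps 1 and 2 and the pair enumeration in step 3 concrete, here is a minimal sketch in plain Java. It is not Hadoop code: ordinary collections stand in for the map and reduce phases, and the class and method names (<code>KmerIndex</code>, <code>kmerPositions</code>, <code>buildIndex</code>, <code>candidatePairs</code>) are made up for illustration. In an actual MapReduce job the map step would presumably emit the k-mer as the key with the read id and position as the value, and the framework would group the values by key. The threshold test on ĥ and ȇ is left out because h and d are not fully specified above.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper class; plain collections stand in for Hadoop's map/reduce phases.
class KmerIndex {

    // "Map" step for one read: record the start position of every k-mer it contains.
    static Map<String, List<Integer>> kmerPositions(String read, int k) {
        Map<String, List<Integer>> positions = new TreeMap<>();
        for (int i = 0; i + k <= read.length(); i++) {
            String kmer = read.substring(i, i + k);
            positions.computeIfAbsent(kmer, key -> new ArrayList<>()).add(i);
        }
        return positions;
    }

    // "Reduce" step stand-in: group by k-mer across all reads.
    // Result layout: k-mer -> (read id -> positions of that k-mer in the read).
    static Map<String, Map<Integer, List<Integer>>> buildIndex(List<String> reads, int k) {
        Map<String, Map<Integer, List<Integer>>> index = new TreeMap<>();
        for (int readId = 0; readId < reads.size(); readId++) {
            Map<String, List<Integer>> perRead = kmerPositions(reads.get(readId), k);
            for (Map.Entry<String, List<Integer>> e : perRead.entrySet()) {
                index.computeIfAbsent(e.getKey(), key -> new TreeMap<>())
                     .put(readId, e.getValue());
            }
        }
        return index;
    }

    // Enumerate read pairs (r, s) with r < s that share at least one k-mer.
    // These are only candidates; the h/d threshold check would be applied to each.
    static Set<String> candidatePairs(Map<String, Map<Integer, List<Integer>>> index) {
        Set<String> pairs = new TreeSet<>();
        for (Map<Integer, List<Integer>> byRead : index.values()) {
            List<Integer> ids = new ArrayList<>(byRead.keySet());
            for (int a = 0; a < ids.size(); a++) {
                for (int b = a + 1; b < ids.size(); b++) {
                    pairs.add(ids.get(a) + "," + ids.get(b));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Toy reads, invented for this example.
        List<String> reads = Arrays.asList("ATCGAGC", "GGATCGA", "TTTTTTT");
        Map<String, Map<Integer, List<Integer>>> index = buildIndex(reads, 4);
        System.out.println(index);
        System.out.println(candidatePairs(index));
    }
}
```

<p>Grouping by k-mer first means only reads that actually share a k-mer are ever paired, which avoids comparing every read against every other read.</p>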
 
