    <p>Although I'm not sure how to translate the following algorithm into GA (and I'm not sure why you need to use GA for this problem), and I could be off base in proposing it, here goes. </p> <p>The simple technique I would propose is to count the number of black pixels per row. (Actually it's the dark pixel density per row.) This requires very few operations, and with a few additional calculations it's not difficult to find peaks in the pixel-sum histogram.</p> <p>A raw histogram will look something like this, where the profile along the left side shows the number of dark pixels in a row. For visibility, the actual count is normalized to stretch out to x = 200.</p> <p><img src="https://i.stack.imgur.com/sKuXe.png" alt="raw horizontal count"></p> <p>After some additional, simple processing is added (described below), we can generate a histogram like this that can be clipped at some threshold value. What remains are peaks indicating the center of lines of text.</p> <p><img src="https://i.stack.imgur.com/OjdkJ.png" alt="processed horizontal count"></p> <p>From there it's a simple matter to find the lines: just clip (threshold) the histogram at some value such as 1/2 or 2/3 the maximum, and optionally check that the width of the peak at your clipping threshold is some minimum value w.</p> <p>One implementation of the full (yet still simple!) algorithm to find the nicer histogram is as follows:</p> <ol> <li>Binarize the image using a "moving average" threshold or similar local thresholding technique in case a standard Otsu threshold operating on pixels near edges isn't satisfactory. Or, if you have a nice black-on-white image, just use 128 as your binarization threshold.</li> <li>Create an array to store your histogram. This array's length will be the height of the image.</li> <li>For each pixel (x,y) in the binarized image, find the number of dark pixels above and below (x,y) at some radius R. 
That is, count the number of dark pixels from (x, y - R) to (x, y + R), inclusive. </li> <li>If the number of dark pixels within a vertical radius R is greater than or equal to R--that is, roughly half the pixels are dark--then pixel (x,y) has sufficient vertical dark neighbors. Increment your bin count for row y. </li> <li>As you march along each row, track the leftmost and rightmost x-values for pixels with sufficient neighbors. As long as the width (right - left + 1) exceeds some minimum value, divide the total count of dark pixels by this width. This normalizes the count to ensure that short lines, like the very last line of text, are included.</li> <li>(Optional) Smooth the resulting histogram. I just used the mean over 3 rows.</li> </ol> <p>The "vertical count" (step 3) eliminates horizontal strokes that happen to be located above or below the center line of text. A more sophisticated algorithm would check not just directly above and below (x,y), but also to the upper left, upper right, lower left, and lower right.</p> <p>With my rather crude implementation in C# I was able to process the image in less than 75 milliseconds. In C++, and with some basic optimization, I've little doubt the time could be cut down considerably. </p> <p>This histogram method assumes the text is horizontal. Since the algorithm is reasonably fast, you may have enough time to calculate pixel count histograms at increments of every 5 degrees from the horizontal. The scan orientation with the greatest peak/valley differences would indicate the rotation.</p> <p>I'm not familiar with GA terminology, but if what I've suggested is of some value I'm sure you can translate it into GA terms. In any case, I was interested in this problem anyway, so I might as well share.</p> <p>EDIT: maybe for use with GA, it's better to think in terms of "distance since previous dark pixel in X" (or along angle theta) and "distance since previous dark pixel in Y" (or along angle [theta - pi/2]). 
You might also check the distance from a white pixel to a dark pixel in all radial directions (to find loops).</p>
<pre><code>byte[,] arr = get2DArrayFromBitamp(); //source array from originalBitmap
int w = arr.GetLength(0); //width of 2D array
int h = arr.GetLength(1); //height of 2D array

//we can use a second 2D array of dark pixels that belong to vertical strokes
byte[,] bytes = new byte[w, h]; //dark pixels in vertical strokes

//initial morph
int r = 4; //radius to check for dark pixels
int count = 0; //number of dark pixels within radius

//fill the bytes[,] array only with pixels belonging to vertical strokes
for (int x = 0; x &lt; w; x++)
{
    //for the first r rows, just set pixels to white
    for (int y = 0; y &lt; r; y++)
    {
        bytes[x, y] = 255;
    }

    //assume pixels of value &lt; 128 are dark pixels in text
    for (int y = r; y &lt; h - r - 1; y++)
    {
        count = 0;

        //count the dark pixels above and below (x,y)
        //the check covers 2r + 1 pixels, from -r to +r inclusive
        for (int j = -r; j &lt;= r; j++)
        {
            if (arr[x, y + j] &lt; 128) count++;
        }

        //if roughly half the pixels are dark, [x,y] is part of a vertical stroke
        bytes[x, y] = count &gt;= r ? (byte)0 : (byte)255;
    }

    //for the last r + 1 rows, just set pixels to white
    for (int y = h - r - 1; y &lt; h; y++)
    {
        bytes[x, y] = 255;
    }
}

//count the number of valid dark pixels in each row
float max = 0;
float[] bins = new float[h]; //normalized "dark pixel strength" for all h rows
int left, right, width; //leftmost and rightmost dark pixels in row
bool dark = false; //tracking variable

for (int y = 0; y &lt; h; y++)
{
    //initialize values at beginning of loop iteration
    left = -1; //-1 means no dark pixel has been seen in this row yet
    right = 0;

    for (int x = 0; x &lt; w; x++)
    {
        //use value of 128 as threshold between light and dark
        dark = bytes[x, y] &lt; 128;

        //increment bin if pixel is dark
        bins[y] += dark ? 1 : 0;

        //update leftmost and rightmost dark pixels
        if (dark)
        {
            if (left &lt; 0) left = x;
            if (x &gt; right) right = x;
        }
    }

    //width from leftmost to rightmost dark pixel (1 if the row has none)
    width = left &lt; 0 ? 1 : right - left + 1;

    //for bins with few pixels, treat them as empty
    if (bins[y] &lt; 10) bins[y] = 0;

    //normalize value according to width
    //divide bin count by width (leftmost to rightmost)
    bins[y] /= width;

    //calculate the maximum bin value so that bins can be scaled when drawn
    if (bins[y] &gt; max) max = bins[y];
}

//calculate the smoothed value of each bin i by averaging bin i-1, i, and i+1
float[] smooth = new float[bins.Length];
smooth[0] = bins[0];
smooth[smooth.Length - 1] = bins[bins.Length - 1];

for (int i = 1; i &lt; bins.Length - 1; i++)
{
    smooth[i] = (bins[i - 1] + bins[i] + bins[i + 1]) / 3;
}

//create a new bitmap based on the original bitmap, then draw bins on top
Bitmap bmp = new Bitmap(originalBitmap);

using (Graphics gr = Graphics.FromImage(bmp))
{
    for (int y = 0; y &lt; bins.Length; y++)
    {
        //scale each bin so that it is drawn 200 pixels wide from the left edge
        float value = 200 * smooth[y] / max;
        gr.DrawLine(Pens.Red, new PointF(0, y), new PointF(value, y));
    }
}

pictureBox1.Image = bmp;
</code></pre>
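<p>The code above stops at drawing the histogram. The clipping step described earlier (threshold at 1/2 or 2/3 of the maximum, then require a minimum peak width w) could be sketched roughly as follows. This is only an illustration under assumptions: the names LineFinder, FindLines, fraction and minWidth are hypothetical and not part of my original implementation, and the input is assumed to be the smoothed histogram array.</p>

```csharp
using System;
using System.Collections.Generic;

public static class LineFinder
{
    // Hypothetical helper: returns (Top, Bottom) row ranges where the histogram
    // stays at or above fraction * max for at least minWidth consecutive rows.
    public static List<(int Top, int Bottom)> FindLines(
        float[] smooth, float fraction, int minWidth)
    {
        var lines = new List<(int Top, int Bottom)>();

        // find the tallest bin; the clipping level is relative to it
        float max = 0;
        foreach (float v in smooth)
        {
            if (v > max) max = v;
        }
        if (max <= 0) return lines; // empty histogram: no text rows

        float clip = fraction * max;
        int start = -1; // -1 means "not currently inside a peak"

        // iterate one step past the end so a peak touching the last row is closed
        for (int y = 0; y <= smooth.Length; y++)
        {
            bool above = y < smooth.Length && smooth[y] >= clip;

            if (above && start < 0)
            {
                start = y; // a peak begins here
            }
            else if (!above && start >= 0)
            {
                // a peak ended at row y - 1; keep it only if it is wide enough
                if (y - start >= minWidth) lines.Add((start, y - 1));
                start = -1;
            }
        }
        return lines;
    }
}
```

<p>For example, LineFinder.FindLines(smooth, 0.5f, 3) would keep only peaks that are at least 3 rows tall at half the maximum; the center of each text line can then be estimated as (Top + Bottom) / 2. Both the fraction and the minimum width are tuning values that depend on the font size and scan resolution.</p>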