Count occurrences in byte list/array using another byte list/array
    <p>I am trying to count how many times one byte sequence occurs in another byte sequence, without reusing bytes that have already been counted. For example, given the string<br> <code>k.k.k.k.k.k.</code> and the byte sequence <code>k.k</code>, it should find only 3 occurrences rather than 5, because the matches would be broken down like <code>[k.k].[k.k].[k.k].</code> and not like <code>[k.[k].[k].[k].[k].k]</code>, where they overlap and each match just shifts 2 to the right.</p>
    <p>Ultimately, the idea is to get a sense of how a compression dictionary or run-length encoding might look, so the goal would be to get</p>
    <p><code>k.k.k.k.k.k.</code> down to just 2 parts, since (k.k.k.) is the biggest and best symbol you can have.</p>
    <p>Here is the source so far:</p>
    <pre><code>using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.IO;

static class Compression
{
    static int Main(string[] args)
    {
        List&lt;byte&gt; bytes = File.ReadAllBytes("ok.txt").ToList();
        List&lt;List&lt;int&gt;&gt; list = new List&lt;List&lt;int&gt;&gt;();

        // Starting Numbers of bytes - This can be changed manually.
        int StartingNumBytes = bytes.Count;
        for (int i = StartingNumBytes; i &gt; 0; i--)
        {
            Console.WriteLine("i: " + i);

            for (int ii = 0; ii &lt; bytes.Count - i; ii++)
            {
                Console.WriteLine("ii: " + ii);

                // New pattern comes with refresh data.
                List&lt;byte&gt; pattern = new List&lt;byte&gt;();
                for (int iii = 0; iii &lt; i; iii++)
                {
                    pattern.Add(bytes[ii + iii]);
                }

                DisplayBinary(bytes, "red");
                DisplayBinary(pattern, "green");

                int matches = 0;
                // foreach (var position in bytes.ToArray().Locate(pattern.ToArray()))
                for (int position = 0; position &lt; bytes.Count; position++)
                {
                    if (pattern.Count &gt; (bytes.Count - position))
                    {
                        continue;
                    }

                    for (int iiii = 0; iiii &lt; pattern.Count; iiii++)
                    {
                        if (bytes[position + iiii] != pattern[iiii])
                        {
                            //Have to use goto because C# doesn't support continue &lt;level&gt;
                            goto outer;
                        }
                    }

                    // If it made it this far, it has found a match.
                    matches++;
                    Console.WriteLine("Matches: " + matches + " Orig Count: " + bytes.Count + " POS: " + position);
                    if (matches &gt; 1)
                    {
                        int numBytesToRemove = pattern.Count;
                        for (int ra = 0; ra &lt; numBytesToRemove; ra++)
                        {
                            // Remove it at the position it was found at, once it
                            // deletes the first one, the list will shift left and you'll need to be here again.
                            bytes.RemoveAt(position);
                        }
                        DisplayBinary(bytes, "red");
                        Console.WriteLine(pattern.Count + " Bytes removed.");
                        // Since you deleted some bytes, set the position less because you will need to redo the pos.
                        position = position - 1;
                    }

                    outer:
                        continue;
                }

                List&lt;int&gt; sublist = new List&lt;int&gt;();
                sublist.Add(matches);
                sublist.Add(pattern.Count);
                // Some sort of calculation to determine how good the symbol was
                sublist.Add(bytes.Count-((matches * pattern.Count)-matches));
                list.Add(sublist);
            }
        }

        Display(list);
        Console.Read();
        return 0;
    }

    static void DisplayBinary(List&lt;byte&gt; bytes, string color="white")
    {
        switch(color){
            case "green":
                Console.ForegroundColor = ConsoleColor.Green;
                break;
            case "red":
                Console.ForegroundColor = ConsoleColor.Red;
                break;
            default:
                break;
        }

        for (int i=0; i&lt;bytes.Count; i++)
        {
            if (i % 8 ==0)
                Console.WriteLine();
            Console.Write(GetIntBinaryString(bytes[i]) + " ");
        }
        Console.WriteLine();
        Console.ResetColor();
    }

    static string GetIntBinaryString(int n)
    {
        char[] b = new char[8];
        int pos = 7;
        int i = 0;

        while (i &lt; 8)
        {
            if ((n &amp; (1 &lt;&lt; i)) != 0)
            {
                b[pos] = '1';
            }
            else
            {
                b[pos] = '0';
            }
            pos--;
            i++;
        }
        //return new string(b).TrimStart('0');
        return new string(b);
    }

    static void Display(List&lt;List&lt;int&gt;&gt; list)
    {
        //
        // Display everything in the List.
        //
        Console.WriteLine("Elements:");
        foreach (var sublist in list)
        {
            foreach (var value in sublist)
            {
                Console.Write("{0,4}", value);
            }
            Console.WriteLine();
        }

        //
        // Display total count.
        //
        int count = 0;
        foreach (var sublist in list)
        {
            count += sublist.Count;
        }
        Console.WriteLine("Count:");
        Console.WriteLine(count);
    }

    static public int SearchBytePattern(byte[] pattern, byte[] bytes)
    {
        int matches = 0;
        // precomputing this shaves some seconds from the loop execution
        int maxloop = bytes.Length - pattern.Length;
        // &lt;= so that a match ending on the last byte is still checked
        for (int i = 0; i &lt;= maxloop; i++)
        {
            if (pattern[0] == bytes[i])
            {
                bool ismatch = true;
                for (int j = 1; j &lt; pattern.Length; j++)
                {
                    if (bytes[i + j] != pattern[j])
                    {
                        ismatch = false;
                        break;
                    }
                }
                if (ismatch)
                {
                    matches++;
                    i += pattern.Length - 1;
                }
            }
        }
        return matches;
    }
}
</code></pre>
    <p>See the example above for the non-binary contents of the file. Here is the binary data: <code>011010110010111001101011001011100110101100101110011010110010111001101011001011100110101100101110</code> I am hoping to end up with something smaller than what I started with.</p>
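For reference, the non-overlapping count described in the question can be sketched roughly as follows. This is only a minimal illustration, assuming a greedy left-to-right scan that jumps past each match so no byte is counted twice. The names `PatternCounter`, `CountNonOverlapping` and `MatchesAt` are hypothetical and do not come from the code above.

```csharp
using System;
using System.Text;

static class PatternCounter
{
    // Counts occurrences of pattern in data without ever reusing a byte
    // that already belongs to an earlier match (assumed greedy scan).
    public static int CountNonOverlapping(byte[] data, byte[] pattern)
    {
        if (pattern.Length == 0 || pattern.Length > data.Length)
            return 0;

        int matches = 0;
        int i = 0;
        // <= so that a match ending on the very last byte is still tested.
        while (i <= data.Length - pattern.Length)
        {
            if (MatchesAt(data, pattern, i))
            {
                matches++;
                i += pattern.Length; // jump past the bytes just consumed
            }
            else
            {
                i++;
            }
        }
        return matches;
    }

    // Returns true when pattern occurs in data starting at index start.
    private static bool MatchesAt(byte[] data, byte[] pattern, int start)
    {
        for (int j = 0; j < pattern.Length; j++)
        {
            if (data[start + j] != pattern[j])
                return false;
        }
        return true;
    }

    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("k.k.k.k.k.k.");
        Console.WriteLine(CountNonOverlapping(data, Encoding.ASCII.GetBytes("k.k")));    // 3
        Console.WriteLine(CountNonOverlapping(data, Encoding.ASCII.GetBytes("k.k.k."))); // 2
    }
}
```

Unlike the loop in the question, this sketch leaves the input untouched, so every candidate pattern can be scored against the same original bytes. Whether that fits the intended compression scheme is an open design choice.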