Note that there are some explanatory texts on larger screens.

plurals
  1. POAre there any cleverly efficient algorithms to perform a calculation over the space of partitionings of a string?
    text
    copied!<p>I'm working on a statistical project that involves iterating over every possible way to partition a collection of strings and running a simple calculation on each. Specifically, each possible substring has a probability associated with it, and I'm trying to get the sum across all partitions of the product of the substring probability in the partition. </p> <p>For example, if the string is 'abc', then there would be probabilities for 'a', 'b', 'c', 'ab, 'bc' and 'abc'. There are four possible partitionings of the string: 'abc', 'ab|c', 'a|bc' and 'a|b|c'. The algorithm needs to find the product of the component probabilities for each partitioning, then sum the four resultant numbers. </p> <p>Currently, I've written a python iterator that uses binary representations of integers for the partitions (eg 00, 01, 10, 11 for the example above) and simply runs through the integers. Unfortunately, this is immensely slow for strings longer than 20 or so characters. </p> <p>Can anybody think of a clever way to perform this operation without simply running through every partition one at a time? I've been stuck on this for days now. </p> <p>In response to some comments here is some more information:<br> The string can be just about anything, eg "foobar(foo2)" -- our alphabet is lowercase alphanumeric plus all three type of braces ("(","[","{"), hyphens and spaces.<br> The goal is to get the likelihood of the string given individual 'word' likelihoods. So L(S='abc')=P('abc') + P('ab')P('c') + P('a')P('bc') + P('a')P('b')P('c') (Here "P('abc')" indicates the probability of the 'word' 'abc', while "L(S='abc')" is the statistical likelihood of observing the string 'abc').</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload