Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>I've been working on this kind of thing in my spare time for quite a while. I think I'm on my third or fourth version of this same problem right now. I'm actually getting ready to release another version of Fathom (https://github.com/davidrichards/fathom/wiki) with dynamic bayesian models included and a different persistence layer.</p> <p>As I've tried to make my answer clear, it's gotten quite long. I apologize for that. Here's how I've been attacking the problem, which seems to answer some of your questions (somewhat indirectly):</p> <p>I've started with Judea Pearl's breakdown of belief propagation in a Bayesian Network. That is, it's a graph with prior odds (causal support) coming from parents and likelihoods (diagnostic support) coming from children. In this way, the basic class is just a BeliefNode, much like what you described with an extra node between BeliefNodes, a LinkMatrix. In this way, I explicitly choose the type of likelihood I'm using by the type of LinkMatrix I use. It makes it easier to explain what the belief network is doing afterwards as well as keeps the computation simpler.</p> <p>Any subclassing or changes that I'd make to the basic BeliefNode would be for binning continuous variables, rather than changing propagation rules or node associations.</p> <p>I've decided on keeping all data inside the BeliefNode, and only fixed data in the LinkedMatrix. This has to do with ensuring that I maintain clean belief updates with minimal network activity. This means that my BeliefNode stores:</p> <ul> <li>an array of children references, along with the filtered likelihoods coming from each child and the link matrix that is doing the filtering for that child</li> <li>an array of parent references, along with the filtered prior odds coming from each parent and the link matrix that is doing the filtering for that parent</li> <li>the combined likelihood of the node</li> <li>the combined prior odds of the node</li> <li>the computed belief, or posterior probability</li> <li>an ordered list of attributes that all prior odds and likelihoods adhere to</li> </ul> <p>The LinkMatrix can be constructed with a number of different algorithms, depending on the nature of the relationship between the nodes. All of the models that you're describing would just be different classes that you'd employ. Probably the easiest thing to do is default to an or-gate, and then choose other ways to handle the LinkMatrix if we have a special relationship between the nodes. </p> <p>I use MongoDB for persistence and caching. I access this data inside of an evented model for speed and asynchronous access. This makes the network fairly performant while also having the opportunity to be very large if it needs to be. Also, since I'm using Mongo in this way, I can easily create a new context for the same knowledge base. So, for example, if I have a diagnostic tree, some of the diagnostic support for a diagnosis will come from a patient's symptoms and tests. What I do is create a context for that patient and then propagate my beliefs based on the evidence from that particular patient. Likewise, if a doctor said that a patient was likely experiencing two or more diseases, then I could change some of my link matrices to propagate the belief updates differently.</p> <p>If you don't want to use something like Mongo for your system, but you are planning on having more than one consumer working on the knowledge base, you will need to adopt some sort of caching system to make sure that you are working on freshly-updated nodes at all times.</p> <p>My work is open source, so you can follow along if you'd like. It's all Ruby, so it would be similar to your Python, but not necessarily a drop-in replacement. One thing that I like about my design is that all of the information needed for humans to interpret the results can be found in the nodes themselves, rather than in the code. This can be done in the qualitative descriptions, or in the structure of the network.</p> <p>So, here are some important differences I have with your design:</p> <ul> <li>I don't compute the likelihood model inside the class, but rather between nodes, inside the link matrix. In this way, I don't have the problem of combining several likelihood functions inside the same class. I also don't have the problem of one model vs. another, I can just use two different contexts for the same knowledge base and compare results.</li> <li>I'm adding a lot of transparency by making the human decisions apparent. I.e., if I decide to use a default or-gate between two nodes, I know when I added that and that it was just a default decision. If I come back later and change the link matrix and re-calculate the knowledge base, I have a note about why I did that, rather than just an application that chose one method over another. You could have your consumers take notes about that kind of thing. However you solve that, it's probably a good idea to get the step-wise dialog from the analyst about why they are setting things up one way over another.</li> <li>I may be more explicit about prior odds and likelihoods. I don't know for sure on that, I just saw that you were using different models to change your likelihood numbers. Much of what I'm saying may be completely irrelevant if your model for computing posterior beliefs doesn't break down this way. I have the benefit of being able to make three asynchronous steps that can be called in any order: pass changed likelihoods up the network, pass changed prior odds down the network, and re-calculate the combined belief (posterior probability) of the node itself.</li> </ul> <p>One big caveat: some of what I'm talking about hasn't been released yet. I worked on the stuff I am talking about until about 2:00 this morning, so it's definitely current and definitely getting regular attention from me, but isn't all available to the public just yet. Since this is a passion of mine, I'd be happy to answer any questions or work together on a project if you'd like.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload