Python Joint Distribution of N Variables
<p>I need to calculate the joint probability distribution for N variables. I have code for two variables, but I am having trouble generalizing it to higher dimensions. I imagine there is some sort of Pythonic vectorization that could help, but right now my code is very C-like (and yes, I know that is not the right way to write Python). My 2D code is below:</p>
<pre><code>import numpy
import math

feature1 = numpy.array([1.1,2.2,3.0,1.2,5.4,3.4,2.2,6.8,4.5,5.6,1.9,2.8,3.7,4.4,7.3,8.3,8.1,7.0,8.0,6.8,6.2,4.9,5.7,6.3,3.7,2.4,4.5,8.5,9.5,9.9]);
feature2 = numpy.array([11.1,12.8,13.0,11.6,15.2,13.8,11.1,17.8,12.5,15.2,11.6,20.8,14.7,14.4,15.3,18.3,11.4,17.0,16.0,16.8,12.2,14.9,15.7,16.3,13.7,12.4,14.2,18.5,19.8,19.0]);

#===Concatenate All Features===#
numFrames = len(feature1);
allFeatures = numpy.zeros((2,numFrames));
allFeatures[0,:] = feature1;
allFeatures[1,:] = feature2;

#===Create the Array to hold all the Bins===#
numBins = int(0.25*numFrames);
allBins = numpy.zeros((allFeatures.shape[0],numBins+1));

#===Find the maximum and minimum of each feature===#
allRanges = numpy.zeros((allFeatures.shape[0],2));
for f in range(allFeatures.shape[0]):
    allRanges[f,0] = numpy.amin(allFeatures[f,:]);
    allRanges[f,1] = numpy.amax(allFeatures[f,:]);

#===Create the Array to hold all the individual feature probabilities===#
allIndividualProbs = numpy.zeros((allFeatures.shape[0],numBins));

#===Grab all the Individual Probs and the Bins===#
for f in range(allFeatures.shape[0]):
    freqhist, binedges = numpy.histogram(allFeatures[f,:],bins=numBins,range=[allRanges[f,0],allRanges[f,1]],density=False);
    allBins[f,:] = binedges;
    allIndividualProbs[f,:] = freqhist;

#===Create the joint probability array===#
jointProbs = numpy.zeros((numBins,numBins));

#===Compute the joint probability distribution===#
numElements = 0;
for b1 in range(numBins):
    for b2 in range(numBins):
        for f1 in range(numFrames):
            for f2 in range(numFrames):
                if ( ( (feature1[f1] &gt;= allBins[0,b1]) and
                       (feature1[f1] &lt;= allBins[0,b1+1]) ) and
                     ((feature2[f2] &gt;= allBins[1,b2]) and (feature2[f2] &lt;= allBins[1,b2+1])) ):
                    jointProbs[b1,b2] += 1;
                    numElements += 1;

jointProbs /= numElements;

#===But what if I add the following===#
feature3 = numpy.array([21.1,21.8,23.5,27.6,25.2,23.8,22.1,22.8,26.5,25.2,28.6,20.8,24.7,24.4,29.3,28.3,27.4,26.0,26.2,26.1,25.9,24.0,22.7,22.3,23.7,26.4,24.2,28.5,29.8,29.0]);
</code></pre>
<p>How can I generalize the large loop? For N variables (features) this loop would be enormous. Is there a Pythonic way to do this easily?</p>
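<p>One possible way to generalize the nested loop, offered as a minimal sketch rather than a definitive answer, is <code>numpy.histogramdd</code>, which bins an array of shape (numFrames, N) along all N dimensions in a single call. The helper name <code>joint_distribution</code> is hypothetical and is not part of the question's code. The sketch also assumes that values sharing the same frame index belong together; it counts co-occurrences within a frame, whereas the loop in the question pairs every frame of one feature with every frame of the other, so the two will generally not give the same numbers.</p>

```python
import numpy as np

def joint_distribution(features, num_bins):
    """Estimate the joint probabilities of N features.

    `features` is a sequence of equal-length 1-D arrays, one per variable.
    Hypothetical helper for illustration only.
    """
    # Stack the features as columns: one row per frame, one column per variable.
    samples = np.column_stack(features)
    # With no explicit range, each dimension is binned between its own min and max.
    counts, edges = np.histogramdd(samples, bins=num_bins)
    # Normalize the counts so that the whole array sums to 1.
    return counts / counts.sum(), edges

feature1 = np.array([1.1,2.2,3.0,1.2,5.4,3.4,2.2,6.8,4.5,5.6,1.9,2.8,3.7,4.4,7.3,
                     8.3,8.1,7.0,8.0,6.8,6.2,4.9,5.7,6.3,3.7,2.4,4.5,8.5,9.5,9.9])
feature2 = np.array([11.1,12.8,13.0,11.6,15.2,13.8,11.1,17.8,12.5,15.2,11.6,20.8,14.7,14.4,15.3,
                     18.3,11.4,17.0,16.0,16.8,12.2,14.9,15.7,16.3,13.7,12.4,14.2,18.5,19.8,19.0])
feature3 = np.array([21.1,21.8,23.5,27.6,25.2,23.8,22.1,22.8,26.5,25.2,28.6,20.8,24.7,24.4,29.3,
                     28.3,27.4,26.0,26.2,26.1,25.9,24.0,22.7,22.3,23.7,26.4,24.2,28.5,29.8,29.0])

num_bins = int(0.25 * len(feature1))  # same bin-count rule as in the question
joint, edges = joint_distribution([feature1, feature2, feature3], num_bins)
print(joint.shape)
```

<p>Here <code>joint[i, j, k]</code> is the estimated probability that a frame falls in bin <code>i</code> of the first feature, bin <code>j</code> of the second, and bin <code>k</code> of the third. Adding a fourth feature only means adding another array to the list. A marginal distribution can be recovered by summing over the other axes, for example <code>joint.sum(axis=(1, 2))</code> for the first feature.</p>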