<p>You decomposed the wrong matrix.</p>
<p>Principal Component Analysis works with the eigenvectors and eigenvalues of the <strong><em>covariance matrix</em></strong>, not of the data itself. For an m x n data matrix (m data points as rows, n features as columns), the covariance matrix is an n x n matrix. Its normalized form, the correlation matrix, has ones along the main diagonal.</p>
<p>You can indeed use the <em>cov</em> function, but you need further manipulation of your data. It's probably a little easier to use a similar function, <strong><em>corrcoef</em></strong>, which returns the covariance matrix normalized so that every feature has unit variance:</p>
<pre><code>import numpy as NP
import numpy.linalg as LA

# a simulated data set with 8 data points, each point having five features
# (cast to float so the in-place mean centering below works)
data = NP.random.randint(0, 10, 40).reshape(8, 5).astype(float)

# usually a good idea to mean center your data first:
data -= NP.mean(data, axis=0)

# calculate the correlation matrix (the normalized covariance matrix);
# rowvar=0 treats each column as a feature
C = NP.corrcoef(data, rowvar=0)
# returns an n x n matrix, here a 5 x 5 matrix

# now get the eigenvalues/eigenvectors of C:
evals, evecs = LA.eig(C)
</code></pre>
<p>To get the eigenvectors/eigenvalues, I did not decompose the covariance matrix using SVD, though you certainly can. My preference is to calculate them using <em>eig</em> in NumPy's (or SciPy's) linalg module. It is a little easier to work with than <em>svd</em>: the return values are the eigenvalues and eigenvectors themselves, and nothing else. By contrast, as you know, <em>svd</em> doesn't return these directly.</p>
<p>Granted, the SVD function will decompose any matrix, not just square ones (to which the <em>eig</em> function is limited); however, when doing PCA, you'll always have a square matrix to decompose, regardless of the form that your data is in. This is because the matrix you are decomposing in PCA is a <em>covariance matrix</em>, which by definition is always square: its rows and its columns both correspond to the features of the original data, and each cell is the covariance of those two features. In the correlation matrix computed above, this shows up as the ones down the main diagonal, since a given feature is perfectly correlated with itself.</p>
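<p>As a rough illustration of the eig-versus-svd point, the sketch below computes the principal components from the covariance matrix and cross-checks the eigenvalues against an SVD of the centered data. It is only a minimal sketch under stated assumptions: the data set is simulated, all variable names are made up for the example, it uses <em>cov</em> rather than <em>corrcoef</em>, and it swaps in <em>eigh</em> (NumPy's eigensolver for symmetric matrices) in place of <em>eig</em>, because <em>eigh</em> returns real eigenvalues in a known (ascending) order.</p>

```python
import numpy as np

# Hypothetical data set: 8 data points (rows), 5 features (columns).
rng = np.random.default_rng(0)
data = rng.integers(0, 10, size=(8, 5)).astype(float)

# Mean-center each feature (column).
centered = data - data.mean(axis=0)

# Covariance matrix of the features: 5 x 5 and symmetric.
cov = np.cov(centered, rowvar=False)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
evals, evecs = np.linalg.eigh(cov)

# Reorder so the component with the largest variance comes first.
order = np.argsort(evals)[::-1]
evals, evecs = evals[order], evecs[:, order]

# Project the centered data onto the first two principal components.
scores = centered @ evecs[:, :2]

# Cross-check with SVD of the centered data (not of the covariance matrix):
# squared singular values divided by (n_points - 1) equal the eigenvalues of cov.
singular_values = np.linalg.svd(centered, compute_uv=False)
evals_from_svd = singular_values ** 2 / (centered.shape[0] - 1)
```

<p>Under these assumptions, <code>evals</code> and <code>evals_from_svd</code> should agree up to floating-point error, and the sample variance of each column of <code>scores</code> should match the corresponding eigenvalue.</p>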