<h2>Summary</h2>
<p>In a nutshell, just move the view before the slicing.</p>
<p>Instead of:</p>
<pre><code>import numpy as np
ar2 = np.zeros((1000, 2000), dtype=np.uint16)
ar2 = ar2[:, 1000:]
ar2 = ar2.view(dtype=np.uint8)
</code></pre>
<p>Do:</p>
<pre><code>import numpy as np
ar2 = np.zeros((1000, 2000), dtype=np.uint16)
ar2 = ar2.view(dtype=np.uint8)  # ar2 is now a 1000x4000 array...
ar2 = ar2[:, 2000:]  # Note the 2000 instead of 1000!
</code></pre>
<p>What's happening is that the sliced array isn't contiguous (as @Craig noted), and <code>view</code> errs on the conservative side and doesn't try to re-interpret non-contiguous memory buffers. (It happens to be possible in this exact case, but in some cases it would result in a non-evenly-strided array, which numpy doesn't allow.)</p>
<hr>
<p>If you're not very familiar with <code>numpy</code>, it's possible that you're misunderstanding <code>view</code>, and you actually want <code>astype</code> instead.</p>
<hr>
<h2>What does <code>view</code> do?</h2>
<p>First off, let's take a detailed look at what <code>view</code> does. In this case, it re-interprets the memory buffer of a numpy array as a new datatype, if possible. That means that the <em>number of elements in the array</em> will often change when you use <code>view</code>.
(You can also use it to view the array as a different subclass of <code>ndarray</code>, but we'll skip that part for now.)</p>
<p>You may already be aware of the following (your problem is a bit more subtle), but if not, here's an explanation.</p>
<p>As an example:</p>
<pre><code>In [1]: import numpy as np

In [2]: x = np.zeros(2, dtype=np.uint16)

In [3]: x
Out[3]: array([0, 0], dtype=uint16)

In [4]: x.view(np.uint8)
Out[4]: array([0, 0, 0, 0], dtype=uint8)

In [5]: x.view(np.uint32)
Out[5]: array([0], dtype=uint32)
</code></pre>
<p>If you want to make a copy of the array with the new datatype instead, use <code>astype</code>:</p>
<pre><code>In [6]: x
Out[6]: array([0, 0], dtype=uint16)

In [7]: x.astype(np.uint8)
Out[7]: array([0, 0], dtype=uint8)

In [8]: x.astype(np.uint32)
Out[8]: array([0, 0], dtype=uint32)
</code></pre>
<hr>
<p>Now let's take a look at what happens when viewing a 2D array.</p>
<pre><code>In [9]: y = np.arange(4, dtype=np.uint16).reshape(2, 2)

In [10]: y
Out[10]:
array([[0, 1],
       [2, 3]], dtype=uint16)

In [11]: y.view(np.uint8)
Out[11]:
array([[0, 0, 1, 0],
       [2, 0, 3, 0]], dtype=uint8)
</code></pre>
<p>Notice that the shape of the array has changed, and that the changes have happened along the last axis (in this case, extra columns have been added).</p>
<p>At first glance it may appear that extra zeros have been added. It's <em>not</em> that extra zeros are being inserted; it's that the <code>uint16</code> representation of <code>2</code> occupies two bytes, which show up as two <code>uint8</code>s, one with a value of <code>2</code> and one with a value of <code>0</code> (low byte first on a little-endian machine). Therefore, any <code>uint16</code> of 255 or less will result in the value and a zero, while any larger value will spill over into the second <code>uint8</code>.
As an example:</p>
<pre><code>In [13]: y * 100
Out[13]:
array([[  0, 100],
       [200, 300]], dtype=uint16)

In [14]: (y * 100).view(np.uint8)
Out[14]:
array([[  0,   0, 100,   0],
       [200,   0,  44,   1]], dtype=uint8)
</code></pre>
<hr>
<h2>What's happening behind the scenes</h2>
<p>Numpy arrays consist of a "raw" memory buffer that's interpreted through a shape, a dtype, and strides (and an offset, but let's ignore that for now). For more detail, there are several good overviews: <a href="http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html" rel="nofollow">the official documentation</a>, <a href="http://csc.ucdavis.edu/~chaos/courses/nlp/Software/NumPyBook.pdf" rel="nofollow">the numpy book</a>, or <a href="http://scipy-lectures.github.io/advanced/advanced_numpy/" rel="nofollow">scipy-lectures</a>.</p>
<p>This allows numpy to be very memory efficient and "slice and dice" the underlying memory buffer in many different ways without making a copy.</p>
<p>Strides tell numpy how many bytes to jump within the memory buffer to go one increment along a particular axis.</p>
<p>For example:</p>
<pre><code>In [17]: y
Out[17]:
array([[0, 1],
       [2, 3]], dtype=uint16)

In [18]: y.strides
Out[18]: (4, 2)
</code></pre>
<p>So, to go one row deeper in the array, numpy needs to step forward 4 bytes in the memory buffer, while to go one column farther in the array, numpy needs to step 2 bytes. Transposing the array just amounts to reversing the strides (and shape, but in this case, <code>y</code> is 2x2):</p>
<pre><code>In [19]: y.T.strides
Out[19]: (2, 4)
</code></pre>
<p>When we view the array as <code>uint8</code>, the strides change. We still step forward 4 bytes per row, but only one byte per column:</p>
<pre><code>In [20]: y.view(np.uint8).strides
Out[20]: (4, 1)
</code></pre>
<p>However, numpy arrays have to have exactly one stride length per dimension. This is what "evenly-strided" means.
In other words, to move forward one row/column/whatever, numpy needs to be able to step the same amount through the underlying memory buffer each time. There's no way to tell numpy to step a different amount at each row/column/whatever.</p>
<p>For that reason, <code>view</code> takes a very conservative route. If the array isn't contiguous, and the view would change the shape and strides of the array, it doesn't try to handle it. As @Craig noted, it's because your sliced array isn't contiguous that <code>view</code> isn't working.</p>
<p>There are plenty of cases (yours is one) where the resulting array would be valid, but the <code>view</code> method doesn't try to be too smart about it.</p>
<p>To really play around with what's possible, you can use <code>numpy.lib.stride_tricks.as_strided</code> or directly use the <a href="http://docs.scipy.org/doc/numpy/reference/arrays.interface.html#__array_interface__" rel="nofollow"><code>__array_interface__</code></a>. It's a good learning tool to experiment with it, but you have to really understand what you're doing to use it effectively.</p>
<p>Hopefully that helps a bit, anyway! Sorry for the long-winded answer!</p>
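<p>As a minimal sketch of the view-before-slice advice (the small array <code>a</code> and the other variable names are made up for illustration, not taken from the question): viewing first and then slicing should give the same bytes as copying the slice into a contiguous buffer and viewing that copy. The difference is that the first result still shares memory with the original array. Newer numpy releases may accept <code>view</code> on such a slice directly, so the sketch deliberately doesn't rely on the error being raised.</p>

```python
import numpy as np

# Hypothetical small stand-in for the 1000x2000 array in the question.
a = np.arange(12, dtype=np.uint16).reshape(3, 4)

# View first, then slice: uint16 columns 2: correspond to uint8 columns 4:.
v = a.view(np.uint8)[:, 4:]

# Alternative: force a contiguous copy of the slice, then view the copy.
c = np.ascontiguousarray(a[:, 2:]).view(np.uint8)

same_bytes = np.array_equal(v, c)         # both expose the same byte values
view_shares = np.shares_memory(v, a)      # v is still a window onto a's buffer
copy_shares = np.shares_memory(c, a)      # c lives in its own buffer

print(v.shape, v.strides, same_bytes, view_shares, copy_shares)
```

<p>Which one you want depends on whether writes through the result should show up in the original array: they do for <code>v</code>, but not for <code>c</code>.</p>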