How does NumPy multidimensional array iteration work? (With and without nditer)

Question

Note: I am not sure if this is a duplicate or not -- please let me know if it is (and close the question).

If one has a 1-dimensional NumPy array `vector`, then if one writes a for loop of the form:

``````
for element in vector :
    print(element)
``````

This prints each element of the NumPy array.

If one has a 2-dimensional NumPy array `matrix`, then if one writes a for loop of the form:

``````
for vector in matrix :
    print(vector)
``````

This prints each row of the 2-dimensional NumPy array, i.e. it prints 1-dimensional NumPy arrays rather than each element of the array individually.

However, if one instead writes the for loop as:

``````
import numpy
for element in numpy.nditer(matrix) :
    print(element)
``````

This prints each element of the 2-dimensional NumPy array.

Question: What happens if one has a 3-dimensional NumPy array, `tensor`?

a. If one writes a for loop of the form:

``````
for unknownType in tensor :
    print(unknownType)
``````

Will this print the constituent 2-dimensional NumPy (sub-)arrays of `tensor`?

I.e. for an n-dimensional NumPy array `nArray`, does `for unknownType in nArray :` iterate over the constituent (n-1)-dimensional NumPy (sub-)arrays of `nArray`?

b. If one writes a for loop of the form:

``````
for unknownType in numpy.nditer(tensor) :
    print(unknownType)
``````

Will this print the elements of `tensor`? Or will it print the constituent 1-dimensional NumPy (sub-)arrays of the constituent 2-dimensional NumPy (sub-)arrays of `tensor`?

I.e. for an n-dimensional NumPy array `nArray`, does `for unknownType in nditer(nArray) :` iterate over the elements of `nArray`? Or does it iterate over the constituent (n-2)-dimensional NumPy (sub-)arrays of the constituent (n-1)-dimensional NumPy (sub-)arrays of `nArray`?

It is unclear to me from the name `nditer`, since I don't know what "nd" stands for ("iter" is obviously short for "iteration"). And presumably one can think of the elements as "0-dimensional NumPy arrays", so the examples given to me for 2-dimensional NumPy arrays are ambiguous.

I've looked at the `np.nditer` documentation but honestly I didn't understand the examples or what they were trying to demonstrate -- it seems like it was written for programmers (which I am not) by programmers.


Answers (2)

1. A plain `for` loop iterates over the first dimension only. For a 1D array that yields the elements, for a 2D array the rows, for a 3D array the 2D planes, and so on.

`nditer`, by contrast, is an N-D (n-dimensional) iterator: it visits every element of the array. It's (roughly!) equivalent to `for item in your_array.ravel()`, i.e. iterating over a flattened "view" of the array. For 1D arrays it iterates over the elements; for 2D arrays (in the default C-ordered layout) it iterates over the elements of the first row, then those of the second row, and so on.
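A minimal sketch of that difference, assuming a small 3D array named `tensor` as in the question (the names `slices` and `elements` are only for illustration):

```python
import numpy as np

# Illustrative 3D array with shape (2, 3, 4); the values are arbitrary.
tensor = np.arange(24).reshape(2, 3, 4)

# A plain for loop walks the first axis only, yielding 2D sub-arrays.
slices = [sub.shape for sub in tensor]
print(slices)  # [(3, 4), (3, 4)]

# nditer visits every scalar element, however many dimensions there are.
elements = [int(x) for x in np.nditer(tensor)]
print(len(elements))  # 24

# For this C-ordered array the visiting order matches ravel().
print(elements == tensor.ravel().tolist())  # True
```

So for the 3D case in the question, (a) yields the two `(3, 4)` sub-arrays and (b) yields the 24 individual elements.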

Note that `nditer` is much more powerful than that: it can iterate over multiple arrays at once, buffer the iteration, and a lot more.
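As a rough illustration of the "multiple arrays at once" part, here is a small sketch; the arrays `matrix` and `row` are made up for the example and simply need broadcast-compatible shapes:

```python
import numpy as np

# Two illustrative arrays with broadcast-compatible shapes (2, 3) and (3,).
matrix = np.arange(6).reshape(2, 3)
row = np.array([10, 20, 30])

# Passing a list of arrays makes nditer yield one tuple of elements per step,
# with `row` broadcast across both rows of `matrix`.
pairs = [(int(m), int(r)) for m, r in np.nditer([matrix, row])]
print(pairs)
# [(0, 10), (1, 20), (2, 30), (3, 10), (4, 20), (5, 30)]
```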

However, with NumPy you generally don't want to use a `for` loop or `np.nditer` at all. There are lots of "vectorized" operations that make manual iteration unnecessary in most cases.
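To make "vectorized" concrete, the sketch below doubles every element twice, once with an explicit loop and once with a single array expression. Doubling is just an arbitrary stand-in for whatever per-element work is needed:

```python
import numpy as np

tensor = np.arange(24).reshape(2, 3, 4)

# Element-by-element loop: fill the result one scalar at a time.
looped = np.empty_like(tensor)
for idx in np.ndindex(tensor.shape):
    looped[idx] = tensor[idx] * 2

# Vectorized version: one expression operates on the whole array.
vectorized = tensor * 2

print(np.array_equal(looped, vectorized))  # True
```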

2. a)

`for x in arr:` iterates over the first dimension of an array.

``````
In [233]: for x in np.arange(24).reshape((2,3,4)):
     ...:     print(x.shape)
     ...:
(3, 4)
(3, 4)
``````

I think of it as `for x in list(arr):...`. It breaks the array into a list of subarrays.
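A quick way to check that mental model, as a sketch (the name `pieces` is only for illustration):

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# list(arr) splits along the first axis into a Python list of sub-arrays.
pieces = list(arr)
print(len(pieces))      # 2
print(pieces[0].shape)  # (3, 4)

# Each piece is the same as indexing the first axis directly.
print(np.array_equal(pieces[1], arr[1]))  # True
```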

b)

It's tricky to control the depth of iteration with `nditer`. By default it iterates at the element level. The tutorial page shows some tricks using buffers and order, but the best way I have seen is to use `ndindex`.

`ndindex` constructs a dummy array of the right size, and does `multi_index` iteration.

For example, to iterate over the first two dimensions of a 3D array:

``````
In [237]: arr = np.arange(24).reshape(2,3,4)
In [240]: for idx in np.ndindex(arr.shape[:2]):
     ...:     print(idx, arr[idx], arr[idx].sum())
     ...:
(0, 0) [0 1 2 3] 6
(0, 1) [4 5 6 7] 22
(0, 2) [ 8  9 10 11] 38
(1, 0) [12 13 14 15] 54
(1, 1) [16 17 18 19] 70
(1, 2) [20 21 22 23] 86
``````

I could do the same iteration with

``````
for i in range(2):
    for j in range(3):
        arr[i, j]  # ... work with this 1D sub-array
``````

or

``````
arr1 = arr.reshape(-1, 4)
for ij in range(6):
    arr1[ij]  # ... work with this 1D sub-array
``````

Speed will be basically the same - all poor compared to array functions that work on the whole 3d array at once, or ones that take some sort of `axis` parameter.

``````
In [241]: arr.sum(axis=2)
Out[241]:
array([[ 6, 22, 38],
       [54, 70, 86]])
``````
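As a sanity check that the loop and the `axis` version compute the same thing, here is a small sketch (the names `looped` and `reduced` are only for illustration):

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Loop version: sum each 1D sub-array found along the first two axes.
looped = np.empty(arr.shape[:2], dtype=arr.dtype)
for idx in np.ndindex(arr.shape[:2]):
    looped[idx] = arr[idx].sum()

# Whole-array version: reduce over the last axis in a single call.
reduced = arr.sum(axis=2)

print(np.array_equal(looped, reduced))  # True
```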

The class for NumPy arrays is `np.ndarray`; presumably `nditer` is named by analogy. `nditer` was written as a way of consolidating the various ways that C-level code could iterate over arrays, especially several broadcastable ones. The `np.nditer` function gives access to that C-level iterator. But since the actual iteration is still being done in Python code, there's little to no speed advantage.
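On the question's remark about "0-dimensional NumPy arrays": the items produced by `nditer` are indeed zero-dimensional `ndarray` objects rather than plain Python numbers. A small sketch to inspect them (the name `kinds` is only for illustration):

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)

# Record the type name and number of dimensions of each item nditer yields.
kinds = [(type(x).__name__, x.ndim) for x in np.nditer(matrix)]

print(kinds[0])    # ('ndarray', 0)
print(len(kinds))  # 6
```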