## Pandas MultiIndex lookup with Numpy arrays

I'm working with a pandas DataFrame that represents a graph. The dataframe is indexed by a MultiIndex that indicates the node endpoints.

Setup:

```
import pandas as pd
import numpy as np
import itertools as it
edges = list(it.combinations([1, 2, 3, 4], 2))
# Define a dataframe to represent a graph
index = pd.MultiIndex.from_tuples(edges, names=['u', 'v'])
df = pd.DataFrame.from_dict({
    'edge_id': list(range(len(edges))),
    'edge_weight': np.random.RandomState(0).rand(len(edges)),
})
df.index = index
print(df)
## -- End pasted text --
     edge_id  edge_weight
u v
1 2        0       0.5488
  3        1       0.7152
  4        2       0.6028
2 3        3       0.5449
  4        4       0.4237
3 4        5       0.6459
```

I want to be able to index into the graph using an edge subset, which is why I've chosen to use a `MultiIndex`. I'm able to do this just fine as long as the input to `df.loc` is a list of tuples.

```
# Select subset of graph using list-of-tuple indexing
edge_subset1 = [edges[x] for x in [0, 3, 2]]
df.loc[edge_subset1]
## -- End pasted text --
     edge_id  edge_weight
u v
1 2        0       0.5488
2 3        3       0.5449
1 4        2       0.6028
```

However, when my list of edges is a NumPy array (as it often is), or a list of lists, I seem to be unable to use the `df.loc` property.

```
# Why can't I do this if `edge_subset2` is a numpy array?
edge_subset2 = np.array(edge_subset1)
df.loc[edge_subset2]
## -- End pasted text --
TypeError: unhashable type: 'numpy.ndarray'
```

It would be OK if I could just call `arr.tolist()`, but this results in a seemingly different error.

```
# Why can't I do this if `edge_subset3` is a list-of-lists?
edge_subset3 = edge_subset2.tolist()
df.loc[edge_subset3]
## -- End pasted text --
TypeError: '[1, 2]' is an invalid key
```

It's a real pain to have to use `list(map(tuple, arr.tolist()))` every time I want to select a subset. It would be nice if there were another way to do this.
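For reference, here is a minimal sketch of the workaround wrapped in a function so the conversion lives in one place. The name `loc_edges` is hypothetical (it is not part of pandas), and it assumes the input is an `(N, 2)` array-like of `(u, v)` labels.

```python
import itertools as it

import numpy as np
import pandas as pd

# Rebuild the graph dataframe from the setup above
edges = list(it.combinations([1, 2, 3, 4], 2))
df = pd.DataFrame(
    {
        'edge_id': list(range(len(edges))),
        'edge_weight': np.random.RandomState(0).rand(len(edges)),
    },
    index=pd.MultiIndex.from_tuples(edges, names=['u', 'v']),
)


def loc_edges(frame, edge_array):
    """Hypothetical helper: select rows by an (N, 2) array-like of labels."""
    # Convert each row to a tuple of plain Python scalars, which `.loc`
    # accepts as MultiIndex keys (the same trick as in the question).
    keys = list(map(tuple, np.asarray(edge_array).tolist()))
    return frame.loc[keys]


edge_subset2 = np.array([edges[x] for x in [0, 3, 2]])
subset = loc_edges(df, edge_subset2)
print(subset)
```

This only hides the verbosity; it still does the tuple conversion on every call.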

The main questions are:

Why can't I use a NumPy array with `.loc`? Is it because under the hood a dictionary is being used to map the multi-index labels to positional indices?

Why does a list-of-lists give a different error? Maybe it's really the same problem, just caught a different way?
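A quick way to probe the hashability guess, independent of pandas: the sketch below only checks which of the three key types Python can hash. It does not show what pandas does internally, but the `unhashable type` message above is the same one `hash()` raises for an ndarray. The helper name `is_hashable` is made up for this example.

```python
import numpy as np


def is_hashable(obj):
    """Return True if Python's built-in hash() accepts the object."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, 2)))            # a tuple, as in edge_subset1
print(is_hashable([1, 2]))            # a list, as in edge_subset3
print(is_hashable(np.array([1, 2])))  # an ndarray row, as in edge_subset2
```

Only the tuple is hashable, which is at least consistent with the two failures being the same underlying problem reported through different code paths.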

Is there another (ideally less-verbose) way to lookup a subset of a dataframe with a numpy array of multi-index labels that I'm unaware of?
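One candidate I have not fully explored is to build a `MultiIndex` straight from the array's columns and pass that to `.loc`. This is a sketch under the assumption that `.loc` accepts a `MultiIndex` as a list-like key on the pandas version in use; the variable names `key_index`, `via_index`, and `via_tuples` are just for illustration.

```python
import itertools as it

import numpy as np
import pandas as pd

# Rebuild the graph dataframe from the setup above
edges = list(it.combinations([1, 2, 3, 4], 2))
df = pd.DataFrame(
    {
        'edge_id': list(range(len(edges))),
        'edge_weight': np.random.RandomState(0).rand(len(edges)),
    },
    index=pd.MultiIndex.from_tuples(edges, names=['u', 'v']),
)

edge_subset2 = np.array([edges[x] for x in [0, 3, 2]])

# Each column of the (N, 2) array becomes one level of the key index
key_index = pd.MultiIndex.from_arrays(
    list(edge_subset2.T), names=df.index.names
)

# Assumption: `.loc` treats a MultiIndex like a list of tuple labels
via_index = df.loc[key_index]

# The tuple-based lookup from the question, for comparison
via_tuples = df.loc[list(map(tuple, edge_subset2.tolist()))]

print(via_index)
```

If that assumption holds, both lookups return the same rows in the same order, and no per-row tuple conversion is written out by hand.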
