## Index out of bounds when replacing NaNs through a function in Pandas

Question

I have created a function that replaces the NaNs in a Pandas dataframe with the means of the respective columns. I tested the function with a small dataframe and it worked. However, when I applied it to a much larger dataframe (30,000 rows, 9 columns), I got the error message `IndexError: index out of bounds`.

The function is the following:

```
# The 'update' function will replace all the NaNs in a dataframe with the mean of the respective columns
def update(df): # the function takes one argument, the dataframe that will be updated
    ncol = df.shape[1] # number of columns in the dataframe
    for i in range(0 , ncol): # loops over all the columns
        df.iloc[:,i][df.isnull().iloc[:, i]]=df.mean()[i] # subsets the df using the isnull() method, extracting the positions
                                                          # in each column where the
    return(df)
```

The small dataframe I used to test the function is the following:

```
     0     1   2   3
0  NaN   NaN   3   4
1  NaN   NaN   7   8
2  9.0  10.0  11  12
```

Could you explain the error? Your advice will be appreciated.


## Answers ( 2 )

I would use the `DataFrame.fillna()` method in conjunction with the `DataFrame.mean()` method:
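As a minimal sketch of that idea, assuming the small sample dataframe from the question (rebuilt by hand here, with every value stored as a float for simplicity):

```python
import numpy as np
import pandas as pd

# Sample frame from the question, rebuilt by hand for illustration
df = pd.DataFrame(
    [[np.nan, np.nan, 3.0, 4.0],
     [np.nan, np.nan, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0]]
)

# df.mean() returns one mean per column; fillna matches those means
# to the columns by label and fills only the missing cells
filled = df.fillna(df.mean())
print(filled)
```

`fillna` returns a new dataframe here, so the original `df` keeps its NaNs unless the result is assigned back.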

Mean values:
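For the sample dataframe in the question, the column means could be inspected with a sketch like the following (the frame is rebuilt by hand, so the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample frame from the question, rebuilt by hand for illustration
df = pd.DataFrame(
    [[np.nan, np.nan, 3.0, 4.0],
     [np.nan, np.nan, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0]]
)

# NaNs are skipped by default, so column 0 averages its single value (9.0)
means = df.mean()
print(means)  # a Series indexed by the column labels 0, 1, 2, 3
```

The result is a `Series` whose index holds the column labels of `df`.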

The reason you are getting "index out of bounds" is that you are assigning the value `df.mean()[i]`, where `i` is one iteration of what are supposed to be ordinal positions. `df.mean()` is a `Series` whose index is the columns of `df`, so `df.mean()[something]` implies `something` had better be a column name. The values of `i` aren't, and that's why you get your error.

Your code can be fixed by looking up each mean by its column label rather than by `i`.

Also, your function is altering `df` directly. You may want to be careful; I'm not sure that's what you intended.

All that said, I'd recommend another approach.
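For reference, here is a minimal sketch of how the loop from the question could be repaired by looking up each mean by column label and working on a copy. The `numeric_only=True` argument, the copy, and the `frame` example data are assumptions of this sketch, not part of the original code:

```python
import numpy as np
import pandas as pd

def update(df):
    """Return a copy of df with NaNs in numeric columns replaced by column means."""
    df = df.copy()                      # leave the caller's dataframe untouched
    means = df.mean(numeric_only=True)  # assumption: skip non-numeric columns
    for col in means.index:             # iterate over column labels, not positions
        df.loc[df[col].isnull(), col] = means.loc[col]
    return df

# Hypothetical example data with named columns and one text column
frame = pd.DataFrame({
    "a": [np.nan, 2.0, 4.0],
    "b": [1.0, np.nan, 3.0],
    "name": ["x", "y", "z"],
})
result = update(frame)
print(result)
```

Because the means are computed once and indexed by label, the lookup no longer depends on the position of a column in the dataframe.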

You could use any number of methods to fill missing with the mean. I'd suggest using @MaxU's answer.

Other options include `df.where`, which takes `df` where the first argument is `True` and the second argument otherwise; `df.combine_first`, with awkward `pandas` broadcasting; and `np.where`.
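A rough sketch of how `df.where`, `df.combine_first`, and `np.where` might each be used for this, assuming the sample dataframe from the question with all values stored as floats. The way the means are expanded into a full frame for `combine_first` is just one possible choice, not necessarily what the original answer did:

```python
import numpy as np
import pandas as pd

# Sample frame from the question, rebuilt by hand for illustration
df = pd.DataFrame(
    [[np.nan, np.nan, 3.0, 4.0],
     [np.nan, np.nan, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0]]
)
means = df.mean()

# df.where: keep df where the condition is True, otherwise take the
# column mean (axis=1 aligns the Series of means with the columns)
filled_where = df.where(df.notnull(), means, axis=1)

# combine_first: needs a frame of the same shape, so repeat the means row-wise
means_frame = pd.DataFrame(
    np.tile(means.to_numpy(), (len(df), 1)),
    index=df.index,
    columns=df.columns,
)
filled_combine = df.combine_first(means_frame)

# np.where: works on plain arrays, so the result is wrapped back in a DataFrame
filled_np = pd.DataFrame(
    np.where(df.notnull().to_numpy(), df.to_numpy(), means.to_numpy()),
    index=df.index,
    columns=df.columns,
)
```

All three produce the same filled values on this sample; `fillna` remains the shortest way to express it.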