Increase a numpy array's elements by 1 at particular indices (for use with grouping an astropy table)

Question

That perhaps wasn't the best description in the title, but hopefully I can describe my problem below. There are really two parts to it.

Ultimately, I'm trying to group certain times together within an astropy table. Because the values are not the same for each time that will go into a particular group, I don't believe I can just pass the column name to the group_by() method.

So, what I'm trying to do is produce an array describing which group each time will be associated with, so that I can pass it to group_by(). I can get the bin edges with, for example (the 10 is arbitrary),

``````
>>> np.where(np.diff(table['Times']) > 10)[0]
array([ 2,  8, 9, 12])
``````
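
To make the step above reproducible without a table, here is a minimal sketch that uses a plain NumPy array in place of `table['Times']`. The sample values and the names `times`, `gap`, `spacing` and `edges` are assumptions chosen for illustration; the threshold of 10 is the same arbitrary value as above.

```python
import numpy as np

# Hypothetical sorted time values standing in for table['Times']
times = np.array([1, 2, 3, 15, 16, 17, 17, 18, 19, 30, 41, 42, 43, 55, 56])
gap = 10  # arbitrary threshold, as in the question

# np.diff gives the spacing between neighbours (length len(times) - 1)
spacing = np.diff(times)

# Index i is reported when times[i + 1] - times[i] > gap,
# i.e. i is the last element before a jump
edges = np.where(spacing > gap)[0]
print(edges)
```

For these sample values the jumps fall after indices 2, 8, 9 and 12, which matches the indices shown above.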

Let's say the table has length 15. What I want to know is how I could use the array above to create the following array without using loops:

``````
array([0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3, 4, 4])
``````

such that when I place that array in the group_by() method it groups the table according to those bin edges.

Alternatively, is there a better way of grouping an astropy table according to time ranges?


Answers to Increase a numpy array's elements by 1 at particular indices (for use with grouping an astropy table) (2)

1. One approach with `np.repeat` -

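``````
import numpy as np

def repeat_based(bin_edges, n):
    # Group sizes: differences between consecutive edges, padded with -1 and n-1
    reps = np.diff(np.hstack((-1, bin_edges, n - 1)))
    return np.repeat(np.arange(bin_edges.size + 1), reps)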
``````

Another approach with `np.cumsum` -

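``````
def cumsum_based(bin_edges, n):
    # Mark the first element of every new group with a 1
    id_arr = np.zeros(n, dtype=int)
    id_arr[bin_edges + 1] = 1
    # The running sum turns those markers into group labels
    return id_arr.cumsum()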
``````

Sample run -

``````
In [400]: bin_edges = np.array([ 2,  8, 9, 12])

In [401]: repeat_based(bin_edges, n = 15)
Out[401]: array([0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3, 4, 4])

In [402]: cumsum_based(bin_edges, n = 15)
Out[402]: array([0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 3, 3, 3, 4, 4])
``````
2. It sounds like `np.digitize` should do what you want. Using `arr` in place of your table's time column, try

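``````
import numpy as np

arr = np.array([1, 2, 3, 15, 16, 17, 17, 18, 19, 30, 41, 42, 43, 55, 56])
# Use the last value before each gap larger than 10 as a bin edge
bin_edges = arr[np.where(np.diff(arr) > 10)[0]]
# right=True keeps a value equal to an edge in the group below it
indices = np.digitize(arr, bin_edges, right=True)
print(indices)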
``````
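
The `np.where` step and the `cumsum`-based labelling can also be folded into one expression, since the boolean mask `np.diff(times) > gap` already marks where a new group starts. The following is a minimal sketch rather than part of either answer; the function name `group_ids` and the sample data are assumptions made for illustration.

```python
import numpy as np

def group_ids(times, gap):
    """Label each element of a sorted 1-D array with a group number.

    A new group starts wherever the step to the next element exceeds `gap`.
    (Hypothetical helper, not part of NumPy or astropy.)
    """
    times = np.asarray(times)
    # True at position i means element i + 1 opens a new group
    jumps = np.diff(times) > gap
    # The running count of jumps is the label of elements 1..n-1;
    # the first element always belongs to group 0
    return np.concatenate(([0], np.cumsum(jumps)))

# Hypothetical sample times, sorted in increasing order
times = np.array([1, 2, 3, 15, 16, 17, 17, 18, 19, 30, 41, 42, 43, 55, 56])
ids = group_ids(times, gap=10)
print(ids)
```

The result has one label per element, so, assuming `table['Times']` is a plain numeric column, it could be passed to `group_by()` in the way the question describes, for example `table.group_by(group_ids(table['Times'], 10))`.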