sort Pandas dataframe with NaT values on top

Question

Im trying to sort a pandas dataframe with NaT values on top. I'm using the df.sort_values function:

df=df.sort_values(by='date_of_last_hoorah_given')

Its working fine and I'm getting a sorted dataframe with NaT values in the bottom

    date_of_last_hoorah_given                              email   first_name  \
16 2016-12-19 07:36:08.000000              mindy.lyndi@hoorah.io        Mindy   
29 2016-12-19 07:36:08.000000              judi.seward@hoorah.io         Judi   
7  2016-12-19 07:36:08.000000                  chao.an@hoorah.io         Chao   
21 2016-12-19 07:36:08.000000              bala.harish@hoorah.io         Bala   
12 2016-12-19 07:36:08.000000            pushpa.swaran@hoorah.io       Pushpa   
30 2016-12-22 07:36:08.000000       sparrow.freespirit@hoorah.io      Sparrow   
28 2016-12-22 07:36:08.000000         sanjeev.prasanna@hoorah.io      Sanjeev   
27 2016-12-22 07:36:08.000000     twinklenose.snowleaf@hoorah.io  Twinklenose   
25 2016-12-22 07:36:08.000000       sweetgaze.sugarshy@hoorah.io    Sweetgaze   
23 2016-12-22 07:36:08.000000            shreya.sarika@hoorah.io       Shreya   
19 2016-12-22 07:36:08.000000              jiahao.dong@hoorah.io       Jiahao   
15 2016-12-22 07:36:08.000000            jannine.tyson@hoorah.io       Janine   
14 2016-12-22 07:36:08.000000                arlo.reed@hoorah.io         Arlo   
0  2016-12-22 07:36:08.000000         aditya.hariharan@hoorah.io       Aditya   
11 2016-12-22 07:36:08.000000        shirley.madalitso@hoorah.io      Shirley   
2  2016-12-22 07:36:08.000000             minerva.jena@hoorah.io     Minerva    
3  2016-12-22 07:36:08.000000             colby.brandi@hoorah.io        Colby   
13 2016-12-22 07:36:08.000000            beverly.cohen@hoorah.io      Beverly   
6  2016-12-22 07:36:08.000000             guanting.jun@hoorah.io     Guanting   
5  2016-12-22 07:36:08.000000                  chen.tu@hoorah.io         Chen   
18 2016-12-22 10:55:03.474683                  fen.lin@hoorah.io          Fen   
9  2016-12-23 07:36:08.000000             kourtney.pam@hoorah.io     Kourtney   
10 2016-12-23 14:30:55.206581             kailee.alfie@hoorah.io       Kailee   
4  2016-12-24 07:36:08.000000                jing.chao@hoorah.io        Jing    
31 2016-12-24 16:02:28.945809               rich.oswin@hoorah.io         Rich   
24 2016-12-25 07:36:08.000000           ganesh.vasanta@hoorah.io       Ganesh   
8  2016-12-26 07:36:08.000000               xia.yaling@hoorah.io          Xia   
20 2016-12-27 07:36:08.000000              kinley.joan@hoorah.io       Kinley   
22 2016-12-28 07:36:08.000000   honeygleam.dazzlesmile@hoorah.io   Honeygleam   
26 2016-12-28 15:29:48.629929             indira.padma@hoorah.io       Indira   
17 2016-12-29 02:27:11.125078             ileen.gaynor@hoorah.io        Ileen   
32 2016-12-29 15:38:02.335296            ragnar.lestat@hoorah.io       Ragnar   
1                         NaT  flitterbeam.clovergaze@hoorah.com  Flitterbeam   

but when I try to get it on top with the following piece of code:

df=df.sort_values(by='date_of_last_hoorah_given',ascending=[1,0])

Im getting a valueError: Length of ascending (2) != length of by (1) Full stack trace given below:

ValueError                                Traceback (most recent call last)
<ipython-input-107-948a8354aeeb> in <module>()
      1 cd = ClientData(1)
----> 2 cd.get_inactive_users()

<ipython-input-106-ed230054ea86> in get_inactive_users(self)
    346             inactive_users_result.append(user_dict)
    347         df=pd.DataFrame(inactive_users_result)
--> 348         df=df.sort_values(by='date_of_last_hoorah_given',ascending=[1,0])
    349         print(df)

C:\Users\aditya\Anaconda3\lib\site-packages\pandas\core\frame.py in sort_values(self, by, axis, ascending, inplace, kind, na_position)
   3126         if com.is_sequence(ascending) and len(by) != len(ascending):
   3127             raise ValueError('Length of ascending (%d) != length of by (%d)' %
-> 3128                              (len(ascending), len(by)))
   3129         if len(by) > 1:
   3130             from pandas.core.groupby import _lexsort_indexer

ValueError: Length of ascending (2) != length of by (1)

Show source
| numpy   | sorting   | pandas   | python   2017-01-05 07:01 2 Answers

Answers to sort Pandas dataframe with NaT values on top ( 2 )

  1. 2017-01-05 07:01

    You cannot use 2 values in ascending=[1,0], because sort only one column:

    If need descending sort use False, True is by default:

    df=df.sort_values(by='date_of_last_hoorah_given',ascending=False)
    print (df)
         date_of_last_hoorah_given                              email   first_name
    1                          NaT  flitterbeam.clovergaze@hoorah.com  Flitterbeam
    32  2016-12-29 15:38:02.335296            ragnar.lestat@hoorah.io       Ragnar
    17  2016-12-29 02:27:11.125078             ileen.gaynor@hoorah.io        Ileen
    26  2016-12-28 15:29:48.629929             indira.padma@hoorah.io       Indira
    22  2016-12-28 07:36:08.000000   honeygleam.dazzlesmile@hoorah.io   Honeygleam
    20  2016-12-27 07:36:08.000000              kinley.joan@hoorah.io       Kinley
    8   2016-12-26 07:36:08.000000               xia.yaling@hoorah.io          Xia
    24  2016-12-25 07:36:08.000000           ganesh.vasanta@hoorah.io       Ganesh
    31  2016-12-24 16:02:28.945809               rich.oswin@hoorah.io         Rich
    4   2016-12-24 07:36:08.000000                jing.chao@hoorah.io         Jing
    10  2016-12-23 14:30:55.206581             kailee.alfie@hoorah.io       Kailee
    9   2016-12-23 07:36:08.000000             kourtney.pam@hoorah.io     Kourtney
    18  2016-12-22 10:55:03.474683                  fen.lin@hoorah.io          Fen
    3   2016-12-22 07:36:08.000000             colby.brandi@hoorah.io        Colby
    5   2016-12-22 07:36:08.000000                  chen.tu@hoorah.io         Chen
    6   2016-12-22 07:36:08.000000             guanting.jun@hoorah.io     Guanting
    13  2016-12-22 07:36:08.000000            beverly.cohen@hoorah.io      Beverly
    2   2016-12-22 07:36:08.000000             minerva.jena@hoorah.io      Minerva
    11  2016-12-22 07:36:08.000000        shirley.madalitso@hoorah.io      Shirley
    0   2016-12-22 07:36:08.000000         aditya.hariharan@hoorah.io       Aditya
    14  2016-12-22 07:36:08.000000                arlo.reed@hoorah.io         Arlo
    15  2016-12-22 07:36:08.000000            jannine.tyson@hoorah.io       Janine
    ...
    ...
    

    If need sort by 2 columns, first ascending and second descending:

    df=df.sort_values(by=['date_of_last_hoorah_given', 'email'],ascending=[True, False])
    

    If need ascending sort with NaN first one possible solution is concat splited DataFrame:

    df.date_of_last_hoorah_given = pd.to_datetime(df.date_of_last_hoorah_given)
    df=df.sort_values(by='date_of_last_hoorah_given')
    mask = df.date_of_last_hoorah_given.isnull()
    print (pd.concat([df[mask], df[~mask]]))
        date_of_last_hoorah_given                              email   first_name
    1                         NaT  flitterbeam.clovergaze@hoorah.com  Flitterbeam
    16 2016-12-19 07:36:08.000000              mindy.lyndi@hoorah.io        Mindy
    29 2016-12-19 07:36:08.000000              judi.seward@hoorah.io         Judi
    7  2016-12-19 07:36:08.000000                  chao.an@hoorah.io         Chao
    21 2016-12-19 07:36:08.000000              bala.harish@hoorah.io         Bala
    12 2016-12-19 07:36:08.000000            pushpa.swaran@hoorah.io       Pushpa
    5  2016-12-22 07:36:08.000000                  chen.tu@hoorah.io         Chen
    6  2016-12-22 07:36:08.000000             guanting.jun@hoorah.io     Guanting
    13 2016-12-22 07:36:08.000000            beverly.cohen@hoorah.io      Beverly
    3  2016-12-22 07:36:08.000000             colby.brandi@hoorah.io        Colby
    11 2016-12-22 07:36:08.000000        shirley.madalitso@hoorah.io      Shirley
    0  2016-12-22 07:36:08.000000         aditya.hariharan@hoorah.io       Aditya
    14 2016-12-22 07:36:08.000000                arlo.reed@hoorah.io         Arlo
    2  2016-12-22 07:36:08.000000             minerva.jena@hoorah.io      Minerva
    19 2016-12-22 07:36:08.000000              jiahao.dong@hoorah.io       Jiahao
    23 2016-12-22 07:36:08.000000            shreya.sarika@hoorah.io       Shreya
    15 2016-12-22 07:36:08.000000            jannine.tyson@hoorah.io       Janine
    25 2016-12-22 07:36:08.000000       sweetgaze.sugarshy@hoorah.io    Sweetgaze
    27 2016-12-22 07:36:08.000000     twinklenose.snowleaf@hoorah.io  Twinklenose
    28 2016-12-22 07:36:08.000000         sanjeev.prasanna@hoorah.io      Sanjeev
    30 2016-12-22 07:36:08.000000       sparrow.freespirit@hoorah.io      Sparrow
    18 2016-12-22 10:55:03.474683                  fen.lin@hoorah.io          Fen
    9  2016-12-23 07:36:08.000000             kourtney.pam@hoorah.io     Kourtney
    10 2016-12-23 14:30:55.206581             kailee.alfie@hoorah.io       Kailee
    4  2016-12-24 07:36:08.000000                jing.chao@hoorah.io         Jing
    ...
    ...
    
  2. 2017-01-05 07:01

    The issue is that NaT is maximal when sorting and therefore will always come last. In order to sort by ascending date while putting NaT in front or on top, you need to sort with two conditions.

    np.lexsort will sort an array by any number of conditions and return a sorting slice similar to np.argsort

    Also note that I will put the notnull condition last in the array of conditions passed to np.lexsort. np.lexsort sorts the last elements first... I don't know why, but that's the way it is.

    So we should sort by df.date_of_last_hoorah_given.notnull() first because those that aren't null will have a value of True which is greater that False in a sorting context. Then we can sort by the rest of the dates.

    dates = df.date_of_last_hoorah_given
    sort_slice = np.lexsort([dates.values, dates.notnull().values])
    df.iloc[sort_slice]
    

    OR! as OP said in comments, this gives the same thing and is much more straight forward

    df.sort_values('date_of_last_hoorah_given', na_position='first')
    
         date_of_last_hoorah_given                              email   first_name
    1                          NaT  flitterbeam.clovergaze@hoorah.com  Flitterbeam
    16  2016-12-19 07:36:08.000000              mindy.lyndi@hoorah.io        Mindy
    29  2016-12-19 07:36:08.000000              judi.seward@hoorah.io         Judi
    7   2016-12-19 07:36:08.000000                  chao.an@hoorah.io         Chao
    21  2016-12-19 07:36:08.000000              bala.harish@hoorah.io         Bala
    12  2016-12-19 07:36:08.000000            pushpa.swaran@hoorah.io       Pushpa
    30  2016-12-22 07:36:08.000000       sparrow.freespirit@hoorah.io      Sparrow
    28  2016-12-22 07:36:08.000000         sanjeev.prasanna@hoorah.io      Sanjeev
    27  2016-12-22 07:36:08.000000     twinklenose.snowleaf@hoorah.io  Twinklenose
    25  2016-12-22 07:36:08.000000       sweetgaze.sugarshy@hoorah.io    Sweetgaze
    23  2016-12-22 07:36:08.000000            shreya.sarika@hoorah.io       Shreya
    19  2016-12-22 07:36:08.000000              jiahao.dong@hoorah.io       Jiahao
    15  2016-12-22 07:36:08.000000            jannine.tyson@hoorah.io       Janine
    14  2016-12-22 07:36:08.000000                arlo.reed@hoorah.io         Arlo
    0   2016-12-22 07:36:08.000000         aditya.hariharan@hoorah.io       Aditya
    11  2016-12-22 07:36:08.000000        shirley.madalitso@hoorah.io      Shirley
    2   2016-12-22 07:36:08.000000             minerva.jena@hoorah.io      Minerva
    3   2016-12-22 07:36:08.000000             colby.brandi@hoorah.io        Colby
    13  2016-12-22 07:36:08.000000            beverly.cohen@hoorah.io      Beverly
    6   2016-12-22 07:36:08.000000             guanting.jun@hoorah.io     Guanting
    5   2016-12-22 07:36:08.000000                  chen.tu@hoorah.io         Chen
    18  2016-12-22 10:55:03.474683                  fen.lin@hoorah.io          Fen
    9   2016-12-23 07:36:08.000000             kourtney.pam@hoorah.io     Kourtney
    10  2016-12-23 14:30:55.206581             kailee.alfie@hoorah.io       Kailee
    4   2016-12-24 07:36:08.000000                jing.chao@hoorah.io         Jing
    31  2016-12-24 16:02:28.945809               rich.oswin@hoorah.io         Rich
    24  2016-12-25 07:36:08.000000           ganesh.vasanta@hoorah.io       Ganesh
    8   2016-12-26 07:36:08.000000               xia.yaling@hoorah.io          Xia
    20  2016-12-27 07:36:08.000000              kinley.joan@hoorah.io       Kinley
    22  2016-12-28 07:36:08.000000   honeygleam.dazzlesmile@hoorah.io   Honeygleam
    26  2016-12-28 15:29:48.629929             indira.padma@hoorah.io       Indira
    17  2016-12-29 02:27:11.125078             ileen.gaynor@hoorah.io        Ileen
    32  2016-12-29 15:38:02.335296            ragnar.lestat@hoorah.io       Ragnar
    

Leave a reply to - sort Pandas dataframe with NaT values on top

◀ Go back