On what data does sklearn Transformation operate?

I am writing a set of custom transformations in sklearn in order to clean the data in a pipeline. Each custom transformation takes two Pandas DataFrames as parameters for fit and transform, and transform returns two DataFrames as well (see examples below)...
2017-10-04 21:10 (1) Answers

Formatting concatenated columns in Pandas/Python.

I am very new to Python/Pandas and am using the Spyder IDE, via the Anaconda distribution with Python 3.6 (maybe 3.7?). I am importing an Excel file via the code below and would like to know how to get an output that looks as follows: VOLTS/PHASE Cu...
2017-10-04 17:10 (1) Answers

How to merge/combine columns in pandas?

I have an example dataframe with 4 columns: data = {'A': ['a', 'b', 'c', 'd', 'e', 'f'], 'B': [42, 52, np.nan, np.nan, np.nan, np.nan], 'C': [np.nan, np.nan, 31, 2, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan, 62, 70]} df =...
2017-10-04 13:10 (2) Answers
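
The excerpt above is cut off before it states the desired output, so the following is only a minimal sketch. It assumes the goal is to collapse columns B, C and D into a single column holding the one non-null value from each row; the column name `merged` is hypothetical.

```python
import numpy as np
import pandas as pd

# Data copied from the question excerpt.
data = {'A': ['a', 'b', 'c', 'd', 'e', 'f'],
        'B': [42, 52, np.nan, np.nan, np.nan, np.nan],
        'C': [np.nan, np.nan, 31, 2, np.nan, np.nan],
        'D': [np.nan, np.nan, np.nan, np.nan, 62, 70]}
df = pd.DataFrame(data)

# Back-fill across columns so the first column holds the first
# non-null value found in B, C, D for each row, then keep that column.
# "merged" is a hypothetical name chosen for this illustration.
df['merged'] = df[['B', 'C', 'D']].bfill(axis=1).iloc[:, 0]

result = df[['A', 'merged']]
print(result)
```

Because the source columns contain NaN, the merged column stays floating point; cast it with `astype(int)` only if every row is guaranteed to have a value.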

Python Pandas Data frame creation

I tried to create a data frame df using the code below: import numpy as np import pandas as pd index = [0,1,2,3,4,5] s = pd.Series([1,2,3,4,5,6],index= index) t = pd.Series([2,4,6,8,10,12],index= index) df = pd.DataFrame(s,columns = ["MUL1"]) df["M...
2017-10-04 12:10 (2) Answers

Count words in a column of strings in Pandas

I have a pandas dataframe that contains queries and counts for a given time period and I'm hoping to turn this dataframe into a count of unique words. For example, if the dataframe contained the below: query count foo bar 10 super ...
2017-10-03 22:10 (2) Answers
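
The sample data in the excerpt above is truncated, so this is a rough sketch on made-up rows rather than the asker's data. It assumes that each word should be credited with the `count` of every query it appears in; the rows and the names `words` and `word_counts` are hypothetical.

```python
import pandas as pd

# Hypothetical rows; the original sample is cut off in the excerpt.
df = pd.DataFrame({'query': ['foo bar', 'super foo'],
                   'count': [10, 3]})

# Split each query into a list of words, then give each word its own row.
words = df.assign(word=df['query'].str.split()).explode('word')

# Sum the query counts per word.
word_counts = (words.groupby('word')['count']
                    .sum()
                    .sort_values(ascending=False))
print(word_counts)
```

If each query should count only once regardless of the `count` column, replacing the `sum()` with `size()` would give plain occurrence counts instead.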

Merge two dataframes with multi-index

I have seen several posts about this but I could not get my head around how merge, join and concat would deal with this. How can I merge two dataframes to find matching indexes? in: import pandas as pd import numpy as np row_x1 = ['a1','b1','c1'] r...
2017-10-03 21:10 (1) Answers

From dic pandas with nested dictionaries

I have a dictionary like this: {12: {'Soccer': {'value': 31, 'year': 2013}}, 23: {'Volley': {'value': 24, 'year': 2012},'Yoga': {'value': 3, 'year': 2014}}, 39: {'Baseball': {'value': 2, 'year': 2014},'basket': {'value': 4, 'year': 2012}}} and I ...
2017-10-03 16:10 (2) Answers

Inheritance and Pandas

I am trying to create a file writer based on Pandas' ExcelWriter. I proceeded as I usually do with classes in Python (3) with inheritance: import pandas as pd class Writer(pd.ExcelWriter): def __init__(self, fname, engine='openpyxl'): p...
2017-10-03 15:10 (1) Answers

Removing character from string in dataframe

I have a dataframe, one column of which is filled with entries like this: 2017-03-01T09:30:00.436 2017-03-01T09:30:00.444 ... Is there a way to convert the entire column into datetime format? So far I have tried using str.replace('T',' ') over ...
2017-10-03 14:10 (2) Answers
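
For the datetime conversion asked about above, a minimal sketch: `pd.to_datetime` accepts ISO 8601 strings with the `T` separator directly, so replacing the `T` first should not be necessary. The column name `timestamp` is hypothetical; the two values are the ones shown in the excerpt.

```python
import pandas as pd

# Two sample values taken from the question excerpt;
# "timestamp" is a hypothetical column name.
df = pd.DataFrame({'timestamp': ['2017-03-01T09:30:00.436',
                                 '2017-03-01T09:30:00.444']})

# Parse the whole column at once; the 'T' separator is part of ISO 8601.
df['timestamp'] = pd.to_datetime(df['timestamp'])

print(df.dtypes)
```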

Pandas: Find row wise frequent value

I have a dataset with binary values. I want to find the most frequent value in each row. The dataset has a couple of million records. What would be the most efficient way to do it? The following is a sample of the dataset. import pandas as pd data = pd.re...
2017-10-03 08:10 (2) Answers

Pandas DatetimeIndex - Shifting over index

I'm working on some technical analysis using Pandas, but I'm struggling with the DatetimeIndex, since a lot of financial data doesn't have a consistent frequency. I use pandas_datareader to get yahoo finance data containing DatetimeIndex, Ope...
2017-10-02 22:10 (1) Answers

pandas: slicing along first level of multiindex

I've set up a DataFrame with two indices. But slicing doesn't behave as expected. I realize that this is a very basic problem, so I searched for similar questions: pandas: slice a MultiIndex by range of secondary index Python Pandas slice multiinde...
2017-10-02 21:10 (1) Answers

Delete zeros from a pandas dataframe

This question was asked in multiple other posts but I could not get any of the methods to work. This is my dataframe: df = pd.DataFrame([[1,2,3,4.5],[1,2,0,4,5]]) I would like to know how I can either: 1) Delete rows that contain any/all zeros 2)...
2017-10-02 21:10 (1) Answers

Pandas Test Failure

After installing Python 3.6.2 and Pandas on Windows 10 64-bit I ran the test described here. The failures seem to be largely linked to two issues. The first 3 errors have to do with indexing: TestMixedIntIndex.test_argsort & TestMixedIntInde...
2017-10-02 19:10 (1) Answers

Colour fill on matplotlib time series chart

Consider the following pandas dataframe:
time                 val  server    state
2015-01-01 00:00:00  10   server01  normal
2015-01-01 00:02:00  18   server01  high
2015-01-01 00:03:00  41   server01  high
2015-01-01 00:04:00  22   server01  high
2015-01-01 01:...
2017-10-02 17:10 (1) Answers