Pandas - Replace all NaN values in DataFrame with empty Python dict object


Valear

I have a Pandas DataFrame where each cell contains a Python dict.

>>> from pandas import DataFrame
>>> data = {'Q': {'X': {2: 2010}, 'Y': {2: 2011, 3: 2009}}, 'R': {'X': {1: 2013}}}
>>> frame = DataFrame(data)
>>> frame
                    Q          R
X           {2: 2010}  {1: 2013}
Y  {2: 2011, 3: 2009}        NaN

I want to replace NaN with an empty dict to get the following result:

                    Q          R
X           {2: 2010}  {1: 2013}
Y  {2: 2011, 3: 2009}         {}

However, because the fillna function interprets the empty dict not as a scalar value but as a mapping of column -> value, it does nothing if I just do:

>>> frame.fillna(inplace=True, value={})
>>> frame
                    Q          R
X           {2: 2010}  {1: 2013}
Y  {2: 2011, 3: 2009}        NaN
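
The column -> value reading of a dict argument can be seen in a minimal sketch like the one below. It assumes a reasonably recent pandas; the fill value 0 and the names `filled` and `unchanged` are only illustrative and are not part of the original question.

```python
import pandas as pd

data = {'Q': {'X': {2: 2010}, 'Y': {2: 2011, 3: 2009}}, 'R': {'X': {1: 2013}}}
frame = pd.DataFrame(data)

# A dict passed as `value` is read as {column name: fill value}.
filled = frame.fillna({'R': 0})   # fills the NaN cells of column R with 0

# An empty dict names no columns, so no cell is filled.
unchanged = frame.fillna({})

print(filled.loc['Y', 'R'])
print(pd.isnull(unchanged.loc['Y', 'R']))
```

Because the empty dict is taken as "no columns to fill", it never reaches the cells as a value.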

Is there any way I can use fillna to accomplish what I want? Do I have to iterate through the entire DataFrame, or construct a stupid dictionary that maps all my columns to an empty dictionary?

Valear

I was able to use DataFrame.applymap this way:

>>> from pandas import isnull
>>> frame=frame.applymap(lambda x: {} if isnull(x) else x)
>>> frame
                    Q          R
X           {2: 2010}  {1: 2013}
Y  {2: 2011, 3: 2009}         {}

This solution avoids the pitfalls of both EdChum's solution (where all NaN cells end up pointing to the same underlying dict object in memory, preventing them from being updated independently of each other) and Shashank's solution (where a potentially large data structure of nested dicts has to be constructed just to specify a single empty dict value).
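
The shared-object pitfall can be illustrated with a small sketch. It is not a reproduction of the other answers' code: the extra column 'S', the `shared_dict` variable and the `elementwise` helper are hypothetical names introduced here so that the frame has two NaN cells to compare. Since pandas 2.1, DataFrame.applymap has been renamed to DataFrame.map, so the sketch uses whichever one is available.

```python
import pandas as pd

# Hypothetical frame with two NaN cells (Y/R and Y/S).
data = {'Q': {'X': {2: 2010}, 'Y': {2: 2011, 3: 2009}},
        'R': {'X': {1: 2013}},
        'S': {'X': {5: 2000}}}
frame = pd.DataFrame(data)

# applymap was renamed to map in pandas 2.1; pick the one that exists.
elementwise = frame.map if hasattr(pd.DataFrame, "map") else frame.applymap

# The lambda evaluates `{}` on every call, so each NaN cell gets its own dict.
fresh = elementwise(lambda x: {} if pd.isnull(x) else x)
fresh.loc['Y', 'R'][1] = 2014           # mutate one cell's dict in place
print(fresh.loc['Y', 'S'])              # the other cell's dict is untouched

# For contrast: reusing one dict object makes every filled cell an alias of it.
shared_dict = {}
shared = elementwise(lambda x: shared_dict if pd.isnull(x) else x)
print(shared.loc['Y', 'R'] is shared.loc['Y', 'S'])
```

The difference comes down to where the empty dict is created: inside the function called per cell, or once outside it.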
