Asked  7 Months ago    Answers:  5   Viewed   38 times

I have a DataFrame with four columns. I want to convert this DataFrame to a Python dictionary, with the elements of the first column as the keys and the elements of the other columns in the same row as the values.

DataFrame:

    ID   A   B   C
0   p    1   3   2
1   q    4   3   2
2   r    4   0   9  

Output should be like this:

Dictionary:

{'p': [1,3,2], 'q': [4,3,2], 'r': [4,0,9]}

 Answers

66

The to_dict() method sets the column names as dictionary keys so you'll need to reshape your DataFrame slightly. Setting the 'ID' column as the index and then transposing the DataFrame is one way to achieve this.

to_dict() also accepts an 'orient' argument which you'll need in order to output a list of values for each column. Otherwise, a dictionary of the form {index: value} will be returned for each column.

These steps can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
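
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question by hand (the constructor call is my assumption about how the data was created) and passes orient as a keyword, which behaves the same as the positional form above:

```python
import pandas as pd

# Rebuild the DataFrame shown in the question (construction is an assumption).
df = pd.DataFrame({
    "ID": ["p", "q", "r"],
    "A": [1, 4, 4],
    "B": [3, 3, 0],
    "C": [2, 2, 9],
})

# Make 'ID' the index, transpose so each ID becomes a column,
# then ask to_dict for a list of values per column.
result = df.set_index("ID").T.to_dict(orient="list")
print(result)
```

Because the IDs become column labels after transposing, this relies on the values in 'ID' being unique.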

In case a different dictionary format is needed, here are examples of the possible orient arguments. Consider the following simple DataFrame:

>>> df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
>>> df
        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125

Then the options are as follows.

dict - the default: column names are keys, values are dictionaries of index:data pairs

>>> df.to_dict('dict')
{'a': {0: 'red', 1: 'yellow', 2: 'blue'}, 
 'b': {0: 0.5, 1: 0.25, 2: 0.125}}

list - keys are column names, values are lists of column data

>>> df.to_dict('list')
{'a': ['red', 'yellow', 'blue'], 
 'b': [0.5, 0.25, 0.125]}

series - like 'list', but values are Series

>>> df.to_dict('series')
{'a': 0       red
      1    yellow
      2      blue
      Name: a, dtype: object, 

 'b': 0    0.500
      1    0.250
      2    0.125
      Name: b, dtype: float64}

split - splits columns/data/index as keys with values being column names, data values by row and index labels respectively

>>> df.to_dict('split')
{'columns': ['a', 'b'],
 'data': [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]],
 'index': [0, 1, 2]}

records - each row becomes a dictionary where key is column name and value is the data in the cell

>>> df.to_dict('records')
[{'a': 'red', 'b': 0.5}, 
 {'a': 'yellow', 'b': 0.25}, 
 {'a': 'blue', 'b': 0.125}]

index - like 'records', but a dictionary of dictionaries with keys as index labels (rather than a list)

>>> df.to_dict('index')
{0: {'a': 'red', 'b': 0.5},
 1: {'a': 'yellow', 'b': 0.25},
 2: {'a': 'blue', 'b': 0.125}}
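
To tie the orient options back to the original question, here is a small sketch applying 'index' and 'records' to the question's DataFrame (rebuilt by hand here, which is an assumption). With 'ID' as the index, 'index' keeps the column names alongside each value instead of producing a bare list:

```python
import pandas as pd

# The question's DataFrame, rebuilt by hand (an assumption).
df = pd.DataFrame({
    "ID": ["p", "q", "r"],
    "A": [1, 4, 4],
    "B": [3, 3, 0],
    "C": [2, 2, 9],
})

# One nested dictionary per ID, keyed by column name.
nested = df.set_index("ID").to_dict(orient="index")

# One dictionary per row, with 'ID' kept as an ordinary field.
records = df.to_dict(orient="records")

print(nested)
print(records)
```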
Tuesday, June 1, 2021
 
PeanutsMcgee
answered 7 Months ago
24

Let's try this, using stack, to_frame, and T:

df.index = df.index + 1
df_out = df.stack()
df_out.index = df_out.index.map('{0[1]}_{0[0]}'.format)
df_out.to_frame().T

Output:

   A_1  B_1  C_1  D_1  E_1  A_2  B_2  C_2  D_2  E_2  A_3  B_3  C_3  D_3  E_3
0    1    2    3    4    5    6    7    8    9   10   11   12   13   14    5
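
The input frame is not shown above, so here is a small runnable sketch of the same steps on a hypothetical 2x2 DataFrame (the data and the name df_small are made up for illustration):

```python
import pandas as pd

# Hypothetical input; the original answer's data is not shown.
df_small = pd.DataFrame({"A": [1, 3], "B": [2, 4]})

# Shift the row labels so numbering starts at 1 instead of 0.
df_small.index = df_small.index + 1

# stack() gives a Series indexed by (row label, column name) pairs.
stacked = df_small.stack()

# Join each (row, column) pair into a single label such as 'A_1'.
stacked.index = stacked.index.map("{0[1]}_{0[0]}".format)

# Turn the Series into a one-row DataFrame.
wide = stacked.to_frame().T
print(wide)
```

Recent pandas versions may emit a FutureWarning about the stack implementation; that should not affect the result for a frame without missing values.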
Friday, July 30, 2021
 
bumperbox
answered 4 Months ago
40

I believe you need lists as the values of the dict - use groupby + apply + to_dict:

d = df.groupby('key')['id'].apply(list).to_dict()
print (d)
{1: ['a1', 'a2', 'a3'], 2: ['a4', 'a5'], 3: ['a6']}

Or, if groups with a single element should be stored as scalars rather than one-item lists, add an if/else to the apply:

d = df.groupby('key')['id'].apply(lambda x: list(x) if len(x) > 1 else x.iat[0]).to_dict()
print (d)
{1: ['a1', 'a2', 'a3'], 2: ['a4', 'a5'], 3: 'a6'}
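
Since the input is not shown, here is a self-contained sketch with a hypothetical DataFrame chosen to reproduce the output above (the column names 'key' and 'id' come from the answer; the data itself is my assumption):

```python
import pandas as pd

# Hypothetical input matching the printed results.
df = pd.DataFrame({
    "key": [1, 1, 1, 2, 2, 3],
    "id": ["a1", "a2", "a3", "a4", "a5", "a6"],
})

# Collect the 'id' values of each group into a list, then convert to a dict.
as_lists = df.groupby("key")["id"].apply(list).to_dict()

# Variant: keep single-element groups as scalars instead of one-item lists.
mixed = (
    df.groupby("key")["id"]
    .apply(lambda x: list(x) if len(x) > 1 else x.iat[0])
    .to_dict()
)

print(as_lists)
print(mixed)
```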
Saturday, August 7, 2021
 
NewPHP
answered 4 Months ago
98

You can use stack:

df = pd.DataFrame(d).stack().reset_index()
df.columns = ['word','category','count']
print(df)
        word   category  count
0     boring  analytics    5.0
1  important       data    2.0
2      sleep  analytics    3.0
3       very       data    3.0

df = pd.DataFrame.from_dict(d, orient='index').stack().reset_index()
df.columns = ['category','word','count']
print(df)

    category       word  count
0  analytics     boring    5.0
1  analytics      sleep    3.0
2       data  important    2.0
3       data       very    3.0

Another solution with nested list comprehension:

df = pd.DataFrame([(key,key1,val1) for key,val in d.items() for key1,val1 in val.items()])
df.columns = ['category','word','count']
print(df)
    category       word  count
0  analytics     boring      5
1  analytics      sleep      3
2       data  important      2
3       data       very      3
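
The dictionary d is never defined above, so here is a runnable sketch with a nested dictionary inferred from the printed output (that definition is an assumption). The explicit dropna() is there because, depending on the pandas version, stack may or may not drop the missing category/word combinations on its own:

```python
import pandas as pd

# Nested dict inferred from the outputs above (an assumption).
d = {
    "analytics": {"boring": 5, "sleep": 3},
    "data": {"important": 2, "very": 3},
}

# Rows are categories, columns are words; missing pairs become NaN.
wide = pd.DataFrame.from_dict(d, orient="index")

# Stack to long form and drop the pairs that never occurred.
long_df = wide.stack().dropna().reset_index()
long_df.columns = ["category", "word", "count"]
print(long_df)
```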
Friday, August 20, 2021
 
Jeremy Pare
answered 4 Months ago
89

Well you could use a dictionary comprehension and iterrows:

print({key: row.tolist() for key, row in df.set_index('Label1').iterrows()})

{'key3': ['col1value3', 'col2value3'],
 'key2': ['col1value2', 'col2value2'], 
 'key1': ['col1value1', 'col2value1']}

Also, I think the following will work too (on older pandas versions, where the keyword was still outtype):

df = df.set_index('Label1')
print(df.T.to_dict(outtype='list'))

{'key3': ['col1value3', 'col2value3'],
 'key2': ['col1value2', 'col2value2'],
 'key1': ['col1value1', 'col2value1']}

Update as of fall 2017: outtype is no longer a valid keyword argument. Use orient instead:

In [11]: df.T.to_dict(orient='list')
Out[11]: 
{'key1': ['col1value1', 'col2value1'],
 'key2': ['col1value2', 'col2value2'],
 'key3': ['col1value3', 'col2value3']}
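
Here is a self-contained sketch comparing the two approaches on a hypothetical DataFrame built from the values printed above (the column names 'col1' and 'col2' are my assumption; only 'Label1' appears in the answer):

```python
import pandas as pd

# Hypothetical input reconstructed from the printed output.
df = pd.DataFrame({
    "Label1": ["key1", "key2", "key3"],
    "col1": ["col1value1", "col1value2", "col1value3"],
    "col2": ["col2value1", "col2value2", "col2value3"],
})

indexed = df.set_index("Label1")

# Approach 1: dictionary comprehension over the rows.
via_iterrows = {key: row.tolist() for key, row in indexed.iterrows()}

# Approach 2: transpose and let to_dict build the lists.
via_to_dict = indexed.T.to_dict(orient="list")

print(via_iterrows)
print(via_iterrows == via_to_dict)
```

Both give the same mapping here. For large frames, iterrows tends to be the slower of the two, since it builds a Series for every row.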
Wednesday, October 13, 2021
 
Maxime VAST
answered 2 Months ago