Putting a dict / JSON into a DataFrame

Posted: 2020-07-06 07:57

I have the following input:

{'1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0}, '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0}, '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34},
...

I want to put it into a DataFrame, but when I run this code:

p = pd.DataFrame.from_dict(input)

I get the error:

ValueError: If using all scalar values, you must pass an index

Expected output in xlsx:

                          addresse     balance
'1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp'          0
'1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka'          0
'1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St'         34

Any help would be appreciated.

Answer 1

Pass orient='index' to DataFrame.from_dict, then set the index name with DataFrame.rename_axis, and finally convert the index to a column with DataFrame.reset_index:

d = {'1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0}, 
     '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0}, 
     '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34}}

p = pd.DataFrame.from_dict(d, orient='index').rename_axis('addresse').reset_index()
print(p)
                             addresse  balance
0  1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St       34
1  1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka        0
2  1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp        0
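
To make the chained call above easier to follow, here is a minimal sketch that splits it into its three steps. The intermediate names by_index and named are illustrative only and are not part of the original answer. Row order is not printed here, since it may depend on the pandas version.

```python
import pandas as pd

d = {
    '1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0},
    '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0},
    '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34},
}

# Step 1: outer keys become the row index, inner keys become the columns.
by_index = pd.DataFrame.from_dict(d, orient='index')

# Step 2: give the (so far unnamed) index a name.
named = by_index.rename_axis('addresse')

# Step 3: move the named index into an ordinary column.
p = named.reset_index()

print(p.columns.tolist())  # ['addresse', 'balance']
```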

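Another idea is to use a list comprehension that puts the new key at the front of each dictionary, then pass the resulting list of dictionaries to the DataFrame constructor: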

p = pd.DataFrame([dict(**{'addresse':k}, **v) for k, v in d.items()])
print(p)
                             addresse  balance
0  1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp        0
1  1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka        0
2  1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St       34
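
The same list-of-dictionaries idea can also be written with dict-literal unpacking, which some readers may find easier to scan than dict(**..., **...). This is a sketch of an equivalent spelling, not the answerer's exact code. The name rows is illustrative.

```python
import pandas as pd

d = {
    '1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0},
    '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0},
    '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34},
}

# Build one flat record per address; 'addresse' comes first so it
# becomes the first column of the resulting DataFrame.
rows = [{'addresse': k, **v} for k, v in d.items()]

p = pd.DataFrame(rows)
print(p)
```

Because the rows are built by iterating over d.items(), they follow the dictionary's insertion order.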

Answer 2

Alternative code, modified from the code posted in the question. It uses pd.DataFrame.from_dict(), then .stack().unstack(0), and finally .reset_index():

# Import libraries
import pandas as pd

# Create dictionary
d = {'1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0}, 
     '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0}, 
     '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34}
    }

# Create DataFrame from dictionary
df = pd.DataFrame.from_dict(d).stack().unstack(0).reset_index()

# Rename columns
df.columns = ['address', 'balance']
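
To see what each call in that chain contributes, the following sketch runs the same steps one at a time. The intermediate names wide, stacked and tall are illustrative and not from the original answer.

```python
import pandas as pd

d = {
    '1LsquDfKDtz1uFz7txAVixkgFc82PHwqqp': {"balance": 0},
    '1FBGyQnLZrfwVZRdYNxbrqnKukm9trH5Ka': {"balance": 0},
    '1DSBqLVtDFgMypdo2yC77C5LZuTCHZS7St': {"balance": 34},
}

# Default orient='columns': one column per address, a single row 'balance'.
wide = pd.DataFrame.from_dict(d)

# stack() folds the columns into the index: a Series keyed by (balance, address).
stacked = wide.stack()

# unstack(0) moves the first index level ('balance') back out to the columns,
# leaving the addresses as the row index.
tall = stacked.unstack(0)

# Turn the address index into a regular column and name both columns.
df = tall.reset_index()
df.columns = ['address', 'balance']

print(wide.shape)  # (1, 3)
print(df.columns.tolist())  # ['address', 'balance']
```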