Copying a column from one DataFrame to another gives NaN values?
Problem description
This question has been asked many times, and the usual approach seems to work for others. However, I am getting NaN values when I copy a column from a different DataFrame (df1 and df2 are the same length).
df1
         date      hour    var1
a  2017-05-01  00:00:00  456585
b  2017-05-01  01:00:00  899875
c  2017-05-01  02:00:00  569566
d  2017-05-01  03:00:00  458756
e  2017-05-01  04:00:00  231458
f  2017-05-01  05:00:00  986545
df2
        MyVar1       MyVar2
0  6169.719338  3688.045368
1  5861.148007  3152.238704
2  5797.053347  2700.469871
3  5779.102340  2730.471948
4  6708.219647  3181.298291
5  8550.380343  3793.580394
I need df2 to look like this:
        MyVar1       MyVar2        date      hour
0  6169.719338  3688.045368  2017-05-01  00:00:00
1  5861.148007  3152.238704  2017-05-01  01:00:00
2  5797.053347  2700.469871  2017-05-01  02:00:00
3  5779.102340  2730.471948  2017-05-01  03:00:00
4  6708.219647  3181.298291  2017-05-01  04:00:00
5  8550.380343  3793.580394  2017-05-01  05:00:00
I tried the following:
df2['date'] = df1['date']
df2['hour'] = df1['hour']
type(df1)
>> pandas.core.frame.DataFrame
type(df2)
>> pandas.core.frame.DataFrame
This is what I get:
        MyVar1       MyVar2 date hour
0  6169.719338  3688.045368  NaN  NaN
1  5861.148007  3152.238704  NaN  NaN
2  5797.053347  2700.469871  NaN  NaN
Why is this happening? There is another post that discusses merge, but I just need to copy the columns. Any help would be appreciated.
The culprit is unalignable indexes
Your DataFrames' indexes are different (and, correspondingly, so are the indexes of their columns). When you assign a column of one DataFrame to another, pandas tries to align the two indexes, and wherever it finds no matching label it inserts NaN.
Consider the following examples to understand what this means:
# Setup
import pandas as pd

A = pd.DataFrame(index=['a', 'b', 'c'])
B = pd.DataFrame(index=['b', 'c', 'd', 'f'])
C = pd.DataFrame(index=[1, 2, 3])
# Example of alignable indexes - A & B (complete or partial overlap of indexes)
A.index  B.index
      a
      b        b   (overlap)
      c        c   (overlap)
               d
               f

# Example of unalignable indexes - A & C (no overlap at all)
A.index  C.index
      a
      b
      c
               1
               2
               3
When there are no overlaps, pandas cannot match even a single value between the two DataFrames to put in the result of the assignment, so the output is a column full of NaNs.
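A minimal sketch of both cases, reusing the index labels of A, B and C from the setup above. The columns x, y and z and the result columns from_B and from_C are hypothetical names added only so there is something to assign:

```python
import pandas as pd

# Same index labels as A, B, C above; the columns x, y, z are made up
# for illustration and are not part of the original setup.
A = pd.DataFrame({'x': [1, 2, 3]}, index=['a', 'b', 'c'])
B = pd.DataFrame({'y': [10, 20, 30, 40]}, index=['b', 'c', 'd', 'f'])
C = pd.DataFrame({'z': [100, 200, 300]}, index=[1, 2, 3])

# Partial overlap: only the labels 'b' and 'c' exist in both indexes,
# so row 'a' has nothing to align with and becomes NaN.
A['from_B'] = B['y']

# No overlap: no label of A appears in C, so every row becomes NaN.
A['from_C'] = C['z']

print(A)
```

Note that the assignment never raises an error in either case. The mismatch only shows up as missing values in the result.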
In an IPython session or notebook, you can confirm that this is the root cause with:
df1.index.equals(df2.index)
# False
df1.index.intersection(df2.index).empty
# True
You can use any of the following solutions to fix this issue.
Solution 1: Reset both DataFrames' indexes
You may prefer this option if you didn't mean to have different indices in the first place, or if you don't particularly care about preserving the index.
# Optional, if you want a RangeIndex => [0, 1, 2, ...]
# df1.index = pd.RangeIndex(len(df1))
# Homogenize the index values,
df2.index = df1.index
# Assign the columns.
df2[['date', 'hour']] = df1[['date', 'hour']]
If you want to keep the existing index, but as a column, you may use reset_index() instead.
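The reset_index() variant could look roughly like the sketch below. The frames are shortened stand-ins for the df1 and df2 in the question, and df1_reset is a hypothetical name. It assumes df2 still has its default RangeIndex:

```python
import pandas as pd

# Shortened stand-ins for the question's df1 and df2 (values abbreviated).
df1 = pd.DataFrame(
    {'date': ['2017-05-01'] * 3,
     'hour': ['00:00:00', '01:00:00', '02:00:00']},
    index=['a', 'b', 'c'],
)
df2 = pd.DataFrame({'MyVar1': [6169.7, 5861.1, 5797.1]})

# reset_index() moves the old labels ('a', 'b', 'c') into a column
# named 'index' and gives the result a fresh RangeIndex (0, 1, 2),
# which lines up with df2's default index.
df1_reset = df1.reset_index()

# The indexes now match, so alignment finds a partner for every row.
df2[['date', 'hour']] = df1_reset[['date', 'hour']]

print(df2)
```

reset_index() returns a new DataFrame here, so df1 itself keeps its original labels.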
Solution 2: Assign NumPy arrays (bypass index alignment)
This solution will only work if the lengths of the two DataFrames match.
# pandas >= 0.24
df2['date'] = df1['date'].to_numpy()
# pandas < 0.24
df2['date'] = df1['date'].values
To assign multiple columns at once, use:
df2[['date', 'hour']] = df1[['date', 'hour']].to_numpy()
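To illustrate the length requirement, here is a small sketch with shortened stand-in frames. The column name short and the variable mismatch_error are hypothetical. It shows that values are copied by position when the lengths match, and that pandas rejects an array of the wrong length:

```python
import pandas as pd

# Shortened stand-ins for the question's df1 and df2.
df1 = pd.DataFrame(
    {'hour': ['00:00:00', '01:00:00', '02:00:00']},
    index=['a', 'b', 'c'],
)
df2 = pd.DataFrame({'MyVar1': [1.0, 2.0, 3.0]})

# Same length: a NumPy array carries no index, so the values are
# copied purely by position and df1's labels are ignored.
df2['hour'] = df1['hour'].to_numpy()

# Different length: the array has 2 values but df2 has 3 rows,
# so pandas raises a ValueError instead of padding with NaN.
try:
    df2['short'] = df1['hour'].to_numpy()[:2]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(df2)
print(mismatch_error)
```

Because the copy is positional, this approach is only appropriate when the rows of both DataFrames are already in the same order.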