Pandas, groupby and summing over specific months
Question
I have a DataFrame:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 982 entries, 2009-10-30 00:00:00 to 2012-12-16 00:00:00
Data columns (total 4 columns):
rain 981 non-null values
temp_max 982 non-null values
temp_min 982 non-null values
temp 982 non-null values
dtypes: float64(4)
To sum per year and month, I use:
mdata = data.groupby([lambda x: x.year, lambda x: x.month]).agg([sum])
But I need a seasonal analysis (summer, winter, etc.), so how can I compute the sum over specific months, such as [1, 2, 3], for each year?
Thanks.
Recommended answer
Yes, one solution that seems neat to me is to use a seasons dictionary and then group the data using a function. Any function passed as a group key is called once per index value, and its return values are used as the group names.
import pandas as pd
import numpy as np
from pandas import DataFrame
import datetime
# Create a year's worth of data
base = datetime.date.today() - datetime.timedelta(365)
Datelist = [base + datetime.timedelta(days = x) for x in range(365)]
DF = DataFrame(np.random.rand(365), index = Datelist)
# Create a Seasonal Dictionary that will map months to seasons
SeasonDict = {11: 'Winter', 12: 'Winter', 1: 'Winter',
              2: 'Spring', 3: 'Spring', 4: 'Spring',
              5: 'Summer', 6: 'Summer', 7: 'Summer',
              8: 'Autumn', 9: 'Autumn', 10: 'Autumn'}
# Write a function that will be used to group the data
def GroupFunc(x):
    return SeasonDict[x.month]
# Call the function with the groupby operation.
Grouped = DF.groupby(GroupFunc)
Grouped.sum()
The function takes each index value, looks up its month in the seasons dictionary, and returns the value stored under that month key. This value then becomes the group name.
Alternatively, you can use a lambda as in your example (which is more compact, but I thought the above would be easier to understand):
DF.groupby(lambda x: SeasonDict[x.month]).sum()
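The question asks for seasonal sums for each year, whereas the grouping above pools all years together. A minimal sketch of one way to get both levels is to pass two keys to groupby. The sample data, the `rain` column values, and the names `df`, `season_of_month`, and `per_year_season` are assumptions made for illustration; the month-to-season mapping is copied from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two full years of daily values, all equal to 1,
# so each sum is simply the number of days in the group.
idx = pd.date_range("2010-01-01", "2011-12-31", freq="D")
df = pd.DataFrame({"rain": np.ones(len(idx))}, index=idx)

# Same month-to-season mapping as in the answer.
season_of_month = {11: "Winter", 12: "Winter", 1: "Winter",
                   2: "Spring", 3: "Spring", 4: "Spring",
                   5: "Summer", 6: "Summer", 7: "Summer",
                   8: "Autumn", 9: "Autumn", 10: "Autumn"}

# Build one key per row for the year and one for the season,
# then group on both keys at once.
years = df.index.year
seasons = df.index.month.map(season_of_month)
per_year_season = df.groupby([years, seasons]).sum()

print(per_year_season)
```

Note that with this mapping, grouping by calendar year puts November and December in the same "Winter" group as January of the same year, not January of the following year. If a winter should span the year boundary, the year key would need to be shifted for those months.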
Additional code, as per the comments: it seems to me that you would be better off slicing the data. You could do the following:
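# Label each row with its season; assigning the whole column at once
# avoids chained indexing (DF.Season[row] = ...), which recent pandas rejects
DF['Season'] = [SeasonDict[d.month] for d in DF.index]
DFWinter = DF[DF.Season == 'Winter']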
Now you have a new DataFrame containing only the winter data, to work with as you wish. The difference is that groupby applies the same operation to every group, whereas it sounds like you want to investigate different parts of your data set in different ways. For that it's better to slice, in this case using Boolean slicing.
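Boolean slicing can also answer the original question directly: select only the rows whose month is in a chosen list, then sum per year. This is a sketch under assumptions; the sample data and the names `df`, `months`, `selected`, and `sum_per_year` are hypothetical, and it relies on the index being a DatetimeIndex, as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two full years of daily values, all equal to 1.
idx = pd.date_range("2010-01-01", "2011-12-31", freq="D")
df = pd.DataFrame({"rain": np.ones(len(idx))}, index=idx)

# Keep only the rows that fall in the months of interest.
months = [1, 2, 3]
selected = df[df.index.month.isin(months)]

# Sum the selected rows separately for each year.
sum_per_year = selected.groupby(selected.index.year).sum()

print(sum_per_year)
```

Changing `months` to another list, such as [6, 7, 8], gives the per-year sum for a different season without needing a dictionary.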