Pandas DataFrame Manipulation Issue: Calculating Monthly Average from Daily Data
Posted
I'm working on a data analysis project using Python and Pandas, and I'm having trouble manipulating a DataFrame that contains daily data. The DataFrame has two columns: date and value. I want to calculate the monthly average of the value column from the daily data.
Here's a simplified version of my code:
import pandas as pd
# Sample data
data = {'date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-02-01', '2023-02-02'],
        'value': [10, 15, 20, 5, 8]}
df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'])
# Calculate monthly average
monthly_avg = df.resample('M', on='date').mean()
When I run this code, the monthly_avg DataFrame seems to have NaN values in every row. I suspect this is because I don't have data for every day of the month. Is there a way to calculate the monthly average even when days are missing within a month, or do I need to preprocess the data differently first? I have searched other websites, including this one, but could not find an answer. I would appreciate any advice on how to correctly compute the monthly average from this daily data using Pandas. Thanks in advance for your help.
Replied
Here is some Python code to calculate the average number of days per month:
# Number of days in each month
days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
# Total number of days in a year
total_days = sum(days_in_month)
# Number of months in a year
num_months = len(days_in_month)
# Calculate average number of days per month
average_days = total_days / num_months
Replied
import pandas as pd
# Sample data
data = {'date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-02-01', '2023-02-02'],
        'value': [10, 15, 20, 5, 8]}
df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'])
# Set date column as index
df.set_index('date', inplace=True)
# Calculate monthly average
monthly_avg = df.resample('M').mean()
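For comparison, here is a minimal sketch of the same calculation done by grouping on the calendar month instead of resampling. It reuses the sample data from the question. The groupby/to_period approach is just one alternative, not the only way to do it. It assumes the value column is numeric; if it holds strings, converting it first (for example with pd.to_numeric) would be needed.

```python
import pandas as pd

# Sample data from the question: only a few days per month are present.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-01-01", "2023-01-02", "2023-01-03", "2023-02-01", "2023-02-02"]
    ),
    "value": [10, 15, 20, 5, 8],
})

# Group rows by calendar month. mean() averages only the rows that exist,
# so days missing from a month do not turn the result into NaN.
monthly_avg = df.groupby(df["date"].dt.to_period("M"))["value"].mean()

print(monthly_avg)
```

With this data, January averages the three available days and February the two available days. A month only shows up as NaN with resample when it has no rows at all, or when its values are themselves missing.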
Comments
jeersurprised 0 Reputation
Commented
I like the second code sample the most because it's simple and easy to understand.