ENH: Add auto_correlation as function itself not only as plot. #52269

Closed
2 of 3 tasks
pokecheater opened this issue Mar 29, 2023 · 2 comments
Labels
Enhancement Needs Discussion Requires discussion from core team before further action

Comments

@pokecheater

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Dear Pandas Team,

I hope this message finds you well. I am reaching out to request a new feature that I believe would be incredibly useful for many users. Currently, the autocorrelation_plot function is available in Pandas, but it is only accessible as a plot. For users who need to automate their data analysis workflows, it would be extremely valuable to have access to the autocorrelation values directly through a method.

I recently found myself in this exact situation and ended up copying the code from the plot function and deleting the plot section to obtain the values I needed. However, I believe that having a dedicated method to calculate the autocorrelation values would greatly streamline this process and save users a lot of time.

I understand that the Pandas Team is likely very busy, but I wanted to bring this idea to your attention in case it is something that could be implemented relatively easily. I have included the code I used below in the Feature Description to make it easier for you to understand my request.

Thank you for considering my suggestion. I greatly appreciate all of the hard work that goes into developing and maintaining such a powerful and versatile data analysis library.

Best regards

Feature Description

import numpy as np
import pandas as pd


def auto_correlation(series: pd.DataFrame, column: str) -> pd.DataFrame:
    """
    Uses code from pandas to calculate the same autocorrelation values as the autocorrelation_plot function.

    Args:
        series (pd.DataFrame): The pd.DataFrame to auto_correlate.
        column (str): The column to auto_correlate.

    Returns:
        pd.DataFrame: The autocorrelated values inside a DataFrame. Corr column will contain the correlation value and lag the corresponding lag value.
    """
    series = series[column]
    # SRC: https://github.com/pandas-dev/pandas/blob/2e218d10984e9919f0296931d92ea851c6a6faf5/pandas/plotting/_matplotlib/misc.py#L447
    n = len(series)
    data = np.asarray(series)
    mean = np.mean(data)
    # Variance of the whole column, used to normalize every lag.
    c0 = np.sum((data - mean) ** 2) / n

    def r(h):
        # Autocovariance at lag h (divided by n, not n - h), normalized by c0.
        return ((data[: n - h] - mean) * (data[h:] - mean)).sum() / n / c0

    # Lags 1..n; at lag n the slices are empty, so the value is 0.
    x = np.arange(n) + 1
    y = [r(loc) for loc in x]
    return pd.DataFrame({"corr": y, "lag": x})
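For larger inputs, the per-lag loop above could be replaced by a single call to `np.correlate`. The following is only a sketch of that idea, not pandas code: the name `autocorrelation_values` is hypothetical, it takes a Series rather than a DataFrame plus column name, and it returns lags 1 to n - 1 (it omits lag n, which is always 0 in the loop version). It assumes the input is numeric, non-constant, and free of NaN.

```python
import numpy as np
import pandas as pd


def autocorrelation_values(values: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: autocorrelation for lags 1..n-1 via np.correlate."""
    data = np.asarray(values, dtype=float)
    n = len(data)
    centered = data - data.mean()
    # mode="full" returns 2n - 1 values; index n - 1 is lag 0.
    full = np.correlate(centered, centered, mode="full")
    acov = full[n - 1:]  # unnormalized autocovariance for lags 0..n-1
    # acov[0] equals n * c0 from the loop version, so this matches r(h).
    corr = acov / acov[0]
    return pd.DataFrame({"corr": corr[1:], "lag": np.arange(1, n)})
```

Calling `autocorrelation_values(df["some_column"])` should give the same `corr` values as the loop version for lags 1 to n - 1, where `df` and `some_column` stand in for your own data.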

Alternative Solutions

null

Additional Context

No response

@pokecheater pokecheater added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 29, 2023
@DeaMariaLeon DeaMariaLeon added Needs Discussion Requires discussion from core team before further action and removed Needs Triage Issue that has not been reviewed by a pandas team member Needs Discussion Requires discussion from core team before further action labels Mar 31, 2023
@DeaMariaLeon
Member

DeaMariaLeon commented Mar 31, 2023

Hi @pokecheater, thank you for opening an issue with your suggestion.

I will (re)label it as "needs discussion", as core developers' opinions are needed.

In the meantime, maybe this is useful to you:
https://pandas.pydata.org/docs/reference/api/pandas.Series.autocorr.html

This link is a bit old, but might give you some ideas:
https://stackoverflow.com/questions/26083293/calculating-autocorrelation-of-pandas-dataframe-along-each-column
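As a rough illustration of the `Series.autocorr` suggestion, the sketch below collects values for several lags in one Series. The helper name `autocorr_by_lag` and its `max_lag` parameter are made up for this example. Note that `Series.autocorr` computes the Pearson correlation between the Series and its shifted self, so the numbers will generally differ from those produced by the code behind `autocorrelation_plot`, which normalizes every lag by the overall mean and variance.

```python
import pandas as pd


def autocorr_by_lag(values: pd.Series, max_lag: int) -> pd.Series:
    """Hypothetical helper: Series.autocorr evaluated for lags 1..max_lag."""
    lags = range(1, max_lag + 1)
    return pd.Series(
        [values.autocorr(lag=lag) for lag in lags],
        index=pd.Index(lags, name="lag"),
        name="corr",
    )


# A perfectly linear series correlates perfectly with its shifted self.
example = autocorr_by_lag(pd.Series([1.0, 2.0, 3.0, 4.0, 5.0]), max_lag=2)
```

Keep `max_lag` well below the length of the Series, since large lags leave too few overlapping points for a meaningful correlation.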

@DeaMariaLeon DeaMariaLeon added the Needs Discussion Requires discussion from core team before further action label Mar 31, 2023
@mroeschke
Member

Thanks for the issue, but it appears this hasn't gotten traction in a while, so closing.

3 participants