drew2a/git-insights

git-insights

A collection of scripts for analyzing and visualizing Git repositories. Explore commit history, contributor activity, code changes, and more with easy-to-use tools designed to help you gain insights into your codebase.

Prerequisites

You can install the required Python packages using:

pip install -r requirements.txt

plot_number_of_contributors.py

plot_number_of_contributors.py analyzes and visualizes contributor activity within a Git repository. It examines the commit history, identifies continuous contribution periods, and aggregates contributor data over time, showing both individual and collective engagement patterns. Configurable parameters let you specify the branch, exclude certain contributors, and define the time window for activity periods. The script generates visualizations of both all contributors and those with sustained activity.

The original work for this script was done in a GitHub issue comment.

Configurable Parameters

The script plot_number_of_contributors.py accepts the following configurable parameters:

  • --repo_path: Path to the repository. Default is the current directory (.).
  • --branch: Branch to analyze. Default is main.
  • --exclusions: List of contributors to exclude. Default is ["dependabot", "snyk"].
  • --delta_days: Number of days to look back for commits. Default is 30 years (365 * 30 days).
  • --window_days: The maximum allowed gap between consecutive commits for them to count as part of the same activity period. For example, with a 7-day window, two commits that are 7 days apart or less belong to one continuous contribution period. Default is 90 days.
  • --granularity_days: The minimum length of a contribution period. Shorter periods are extended to this length; for instance, with a 1-day granularity, any period shorter than 1 day is extended to 1 day. Default is 15 days.
  • --contribution_duration: The minimum total number of days a contributor must have contributed to be included in the analysis. For example, a filter of "at least two days in total" means that only contributors who have made commits on two or more separate days throughout the entire period are included. Default is 1 day.
  • --less_than_year: Use less frequent date ticks on x-axis. This is a flag, so it has no default value.
  • --activity_plot_file: File name for the activity plot. Default is out/activity_plot.png.
  • --contributor_count_plot_file: File name for the contributor count plot. Default is out/contributor_count_plot.png.
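
The interplay of --window_days, --granularity_days, and --contribution_duration described above can be illustrated with a minimal sketch. It is not the script's actual implementation: the function names activity_periods and is_included, and the sample commit dates, are hypothetical, and the sketch assumes the parameters behave exactly as the descriptions say.

```python
from datetime import datetime, timedelta


def activity_periods(commit_dates, window_days=90, granularity_days=15):
    """Group commit dates into (start, end) activity periods (illustrative only)."""
    window = timedelta(days=window_days)
    granularity = timedelta(days=granularity_days)
    periods = []
    for date in sorted(commit_dates):
        if periods and date - periods[-1][1] <= window:
            # Gap is within the window: extend the current period.
            periods[-1][1] = date
        else:
            # First commit, or the gap is too large: start a new period.
            periods.append([date, date])
    # Stretch periods shorter than the granularity to the minimum length.
    return [(start, max(end, start + granularity)) for start, end in periods]


def is_included(commit_dates, contribution_duration=1):
    """Keep contributors with commits on enough distinct calendar days."""
    return len({d.date() for d in commit_dates}) >= contribution_duration


# Hypothetical commit dates for a single contributor.
commits = [datetime(2024, 1, 1), datetime(2024, 1, 5), datetime(2024, 6, 1)]
for start, end in activity_periods(commits, window_days=7, granularity_days=1):
    print(start.date(), "->", end.date())
# 2024-01-01 -> 2024-01-05
# 2024-06-01 -> 2024-06-02
```

In this sketch the first two commits fall within the 7-day window and form one period, while the isolated June commit becomes its own period, extended to the 1-day granularity.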

Examples

To generate the graphs, first clone the target repository:

git clone <repository-url>

Then, use the following commands:

All Contributors

This example visualizes all contributors over time with a window of 90 days, granularity of 15 days, and a minimum contribution duration of 1 day. This setting captures all contributors who have made at least one commit within any 90-day period, providing a broad view of contributor activity.

python plot_number_of_contributors.py --repo_path /path/to/repo --branch main --window_days 90 --granularity_days 15 --contribution_duration 1 --activity_plot_file all_contributors.png

[Image: all_contributors]

Continuous Contributors

This example focuses on continuous contributors, using a window of 90 days, granularity of 1 day, and a minimum contribution duration of 30 days. It highlights contributors who have been consistently active, making contributions over a longer period, thus offering insights into sustained engagement.

python plot_number_of_contributors.py --repo_path /path/to/repo --branch main --window_days 90 --granularity_days 1 --contribution_duration 30 --contributor_count_plot_file continuous_contributors.png

[Image: ccontributors2]

calculate_branch_age.py

calculate_branch_age.py is a script designed to calculate and visualize the age of branches in a Git repository. It fetches all branches, determines the fork and latest commit dates for each branch, and calculates the age in days. The script then generates a horizontal bar plot showing the age of each branch, with additional labels for the start and end dates of each branch's age.

Configurable Parameters

The script calculate_branch_age.py accepts the following configurable parameters:

  • --repo_path: Path to the repository. This parameter is required.
  • --output_file: File name for the branch age plot. Default is out/branch_ages.png.
  • --branch_regex: Regex pattern to filter branches. Default is .+.
  • --min_age: Minimum age of branches to include in days. Default is 0.
  • --main_branch: Name of the main branch to compare against. Default is main.
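
As a rough illustration of how --branch_regex and --min_age might narrow the plotted branches, here is a minimal sketch. It is not taken from the script: the function select_branches and the sample data are hypothetical, and it assumes that the pattern is applied with a search (not a full match) and that --min_age is an inclusive lower bound.

```python
import re
from datetime import date


def select_branches(branches, branch_regex=".+", min_age=0):
    """Return {name: age_in_days} for branches that pass both filters.

    `branches` maps a branch name to (fork_date, latest_commit_date).
    """
    pattern = re.compile(branch_regex)
    ages = {}
    for name, (fork_date, latest_commit_date) in branches.items():
        # Age is the span between the fork point and the latest commit.
        age = (latest_commit_date - fork_date).days
        # Assumption: regex is searched, and min_age is inclusive.
        if pattern.search(name) and age >= min_age:
            ages[name] = age
    return ages


# Hypothetical branches with made-up fork and latest-commit dates.
branches = {
    "feature/a": (date(2024, 1, 1), date(2024, 1, 31)),
    "feature/b": (date(2023, 1, 1), date(2024, 1, 1)),
    "release/1.0": (date(2023, 6, 1), date(2023, 6, 11)),
}
print(select_branches(branches, branch_regex=r"^feature/", min_age=100))
# {'feature/b': 365}
```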

Examples

To generate the branch age plot, follow these steps:

Clone the target repository:

git clone <repository-url>

Then, use the following command to plot all branches older than 100 days for https://github.com/arvidn/libtorrent:

python calculate_branch_age.py --repo_path ../../arvidn/libtorrent --main_branch master --min_age 100

[Image: libtorrent_branch_ages]

plot_open_issues.py

plot_open_issues.py is a script designed to fetch and visualize open issues from a GitHub repository over time. It allows you to analyze the trend of open issues and visualize release periods with optional coloring and timestamp display.

The public GitHub REST API limits the number of requests a client may make. To avoid hitting GitHub repeatedly, the script operates in two stages: first, it fetches all issues and releases and saves them to files; then, it analyzes these files.
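
The fetch-then-analyze approach can be sketched as a small caching helper. This is an assumption about the general pattern, not the script's code: load_or_fetch and fake_fetch are hypothetical names, and the fetch step is a stand-in that makes no real GitHub requests.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(path, fetch, override=False):
    """Return cached JSON from `path`, calling `fetch()` only when needed."""
    path = Path(path)
    if path.exists() and not override:
        # Stage 2 only: reuse the data saved by an earlier run.
        return json.loads(path.read_text(encoding="utf-8"))
    # Stage 1: fetch fresh data and save it for later runs.
    data = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data


calls = []


def fake_fetch():
    # Stand-in for real GitHub API requests (hypothetical data).
    calls.append(1)
    return [{"number": 1, "state": "open"}]


with tempfile.TemporaryDirectory() as tmp:
    issues_file = Path(tmp) / "out" / "issues.json"
    load_or_fetch(issues_file, fake_fetch)                 # fetches and saves
    load_or_fetch(issues_file, fake_fetch)                 # reads the saved file
    load_or_fetch(issues_file, fake_fetch, override=True)  # fetches again

print(len(calls))  # 2
```

Only the first call and the one with override=True reach the fetch function, which mirrors the role the --override flag is described as playing.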

Configurable Parameters

The script plot_open_issues.py accepts the following configurable parameters:

  • --repo: GitHub repository in the format "owner/repo". Default is Tribler/tribler.
  • --issues_file: File to save issues data. Default is out/issues.json.
  • --releases_file: File to save releases data. Default is out/releases.json.
  • --state: State of issues to fetch (e.g., open, closed, all). Default is all.
  • --labels: Labels to filter issues by. Default is type: bug.
  • --override: Override existing files and fetch new data. This is a flag, so it has no default value.
  • --output_plot: Output file for the plot. Default is out/open_issues_plot.png.
  • --show_release_timestamps: Display release timestamps on the plot. This is a flag, so it has no default value.
  • --color_releases: Color the release periods on the plot. This is a flag, so it has no default value.

Examples

To generate the plot, use the following command:

python plot_open_issues.py --repo Tribler/tribler --labels "type: bug" --output_plot out/open_issues_plot.png --show_release_timestamps --color_releases

This command will fetch issues and releases from the specified repository, save the data to JSON files, and generate a plot of open bugs over time with colored release periods and timestamps.
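
For intuition about how a count of open issues over time can be derived from saved issue data, here is a minimal sketch. It is not the script's implementation: open_issue_counts and the sample records are hypothetical, and the only assumption about the data is that each issue carries the created_at and closed_at timestamps that the GitHub REST API returns.

```python
from datetime import datetime


def parse(timestamp):
    """Parse a GitHub-style UTC timestamp such as 2024-01-01T00:00:00Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def open_issue_counts(issues):
    """Return [(time, open_count)] after every open or close event."""
    events = []
    for issue in issues:
        events.append((parse(issue["created_at"]), 1))
        if issue.get("closed_at"):
            # Issues that are still open have no closing event.
            events.append((parse(issue["closed_at"]), -1))
    events.sort()
    series, open_count = [], 0
    for when, delta in events:
        open_count += delta
        series.append((when, open_count))
    return series


# Hypothetical issue records.
issues = [
    {"created_at": "2024-01-01T00:00:00Z", "closed_at": "2024-01-10T00:00:00Z"},
    {"created_at": "2024-01-05T00:00:00Z", "closed_at": None},
]
for when, count in open_issue_counts(issues):
    print(when.date(), count)
# 2024-01-01 1
# 2024-01-05 2
# 2024-01-10 1
```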

[Image: open_issues_plot]
