Welcome to Git-Pandas Documentation
Git-Pandas is a powerful Python library that transforms Git repository data into pandas DataFrames, making it easy to analyze and visualize your codebase’s history, contributors, and development patterns.

Quick Start
Install Git-Pandas using pip:
pip install git-pandas
Basic Usage
Analyze a single repository:
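from gitpandas import Repository
# Wrap a local clone of the repository
repo = Repository('/path/to/repo')
# Commit history as a pandas DataFrame
commits_df = repo.commit_history()
# Blame information as a pandas DataFrame
blame_df = repo.blame()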
Analyze multiple repositories:
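from gitpandas import ProjectDirectory
# Wrap a directory that contains several Git repositories
project = ProjectDirectory('/path/to/project')
# Summary information about the repositories in the project
project_info = project.general_information()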
Key Features
Repository Analysis: Extract commit history, file changes, and blame information
Project Insights: Calculate bus factor, development time, and contributor metrics
GitHub Integration: Analyze GitHub profiles and repository metrics
Visualization Tools: Built-in plotting utilities for common Git analytics
Performance Optimization: Optional caching support for memory-intensive operations
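To make the bus factor metric concrete, here is a minimal sketch of one common definition: the smallest number of contributors whose combined lines of code reach a given share of the total. The function name `bus_factor`, the `threshold` parameter, and the sample data are assumptions made for illustration. This is not Git-Pandas' own implementation, which may define or compute the metric differently.

```python
def bus_factor(loc_by_author, threshold=0.5):
    """Return the smallest number of authors whose combined lines of code
    reach `threshold` (a fraction) of the total.

    `loc_by_author` is a hypothetical mapping of author name to lines of
    code, such as one might derive from blame output.
    """
    total = sum(loc_by_author.values())
    if total == 0:
        return 0

    covered = 0
    # Count authors from the largest contribution downwards.
    ranked = sorted(loc_by_author.values(), reverse=True)
    for count, loc in enumerate(ranked, start=1):
        covered += loc
        if covered >= threshold * total:
            return count
    return len(ranked)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = {"alice": 600, "bob": 250, "carol": 150}
    print(bus_factor(sample))
```

A low value suggests that knowledge of the codebase is concentrated in few people. Raising `threshold` makes the measure stricter.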
Core Components
The library is built around two main components:
Repository: A wrapper around a single Git repository
ProjectDirectory: A collection of Git repositories for aggregate analysis
For detailed information about these components, see the Repository and Project Directory documentation.
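The idea of treating a directory as a collection of repositories can be illustrated with a short standard-library sketch that walks a directory tree and collects every folder containing a `.git` directory. The helper name `find_git_repos` is hypothetical, and this only illustrates the concept. It does not show how ProjectDirectory discovers repositories internally, and it ignores cases such as worktrees or submodules where `.git` is a file.

```python
import os


def find_git_repos(root):
    """Return a sorted list of directories under `root` that contain a .git directory."""
    repos = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".git" in dirnames:
            repos.append(dirpath)
            # Do not descend further into a repository's own tree.
            dirnames[:] = []
    return sorted(repos)


if __name__ == "__main__":
    for path in find_git_repos("."):
        print(path)
```

Each discovered path could then be analyzed on its own or combined with the others for aggregate analysis.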
Additional Resources
Index - Complete API reference
Module Index - Module index
Search Page - Search the documentation
License
This project is BSD licensed (see LICENSE.md).
Detailed Documentation
The repository and project pages are currently the two main sources of documentation. Each contains the Sphinx-generated API docs for its class, along with instructions for creating the objects. For detailed examples, see the use-cases page.