Home
diffhouse is a Python solution for structuring Git metadata, designed to enable large-scale codebase analysis at practical speeds.
Key features are:
- 🚀 Fast access to commit data, file changes and more
- 📊 Easy integration with pandas and Polars
- 🐍 Simple-to-use Python interface
Performance
Processing times for tween.js. Lower is better.
For more details, see benchmarks.
Requirements
| Dependency | Version |
| --- | --- |
| Python | 3.10 or higher |
| Git | 2.22 or higher |
Git also needs to be added to the system PATH.
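The requirements above can be checked before installing with a short script. This is a minimal sketch using only the Python standard library; the helper names `parse_git_version` and `check_requirements` are made up for illustration and are not part of diffhouse.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions taken from the requirements table.
MIN_PYTHON = (3, 10)
MIN_GIT = (2, 22)


def parse_git_version(output: str) -> tuple[int, int]:
    """Extract (major, minor) from the output of `git --version`."""
    match = re.search(r"(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized git version string: {output!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements() -> list[str]:
    """Return a list of problems; an empty list means everything is in place."""
    problems = []

    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(map(str, sys.version_info[:2]))
        problems.append(f"Python 3.10 or higher is required, found {found}")

    # shutil.which looks the executable up on the system PATH.
    git_path = shutil.which("git")
    if git_path is None:
        problems.append("git was not found on the system PATH")
        return problems

    result = subprocess.run(
        [git_path, "--version"], capture_output=True, text=True
    )
    if result.returncode != 0:
        problems.append("`git --version` did not run successfully")
    elif parse_git_version(result.stdout) < MIN_GIT:
        problems.append(
            f"Git 2.22 or higher is required, found: {result.stdout.strip()}"
        )
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print(issue)
    if not issues:
        print("All requirements are met.")
```

Versions are compared as `(major, minor)` tuples rather than strings, so that, for example, 2.9 correctly sorts below 2.22.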
Limitations
At its core, diffhouse is a data extraction tool, so it does not calculate software metrics such as code churn or cyclomatic complexity. If you need these, take a look at PyDriller instead.