Home
diffhouse is a Python solution for structuring Git metadata, designed to enable large-scale codebase analysis at practical speeds.
Key features are:
- Fast access to commit data, file changes and more
- Easy integration with pandas and polars
- Simple-to-use Python interface
Requirements
| Requirement | Version |
| --- | --- |
| Python | 3.10 or higher |
| Git | 2.22 or higher |
Git must also be available on the system PATH.
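As a rough illustration, the requirements above could be checked up front with a short standard-library script. This is only a sketch: the helper names (`parse_git_version`, `installed_git_version`, `requirements_met`) are hypothetical and are not part of diffhouse.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional, Tuple

# Minimum versions taken from the requirements listed above.
MIN_PYTHON = (3, 10)
MIN_GIT = (2, 22)

# Hypothetical helpers for illustration; none of these belong to diffhouse.
# typing.Optional/Tuple are used so the script also runs on older Pythons
# and can report that the interpreter itself is too old.


def parse_git_version(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `git --version` output, or None."""
    match = re.search(r"(\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_git_version() -> Optional[Tuple[int, int]]:
    """Return (major, minor) of the git found on PATH, or None."""
    if shutil.which("git") is None:  # git is not on the system PATH
        return None
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=False
    )
    return parse_git_version(result.stdout)


def requirements_met() -> bool:
    """True if both the Python and the Git minimum versions are satisfied."""
    git_version = installed_git_version()
    return (
        sys.version_info[:2] >= MIN_PYTHON
        and git_version is not None
        and git_version >= MIN_GIT
    )


if __name__ == "__main__":
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    print("Git:", installed_git_version() or "not found on PATH")
    print("Requirements met:", requirements_met())
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "2.9" would sort after "2.22".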
Limitations
At its core, diffhouse is a data extraction tool. It does not calculate software metrics such as code churn or cyclomatic complexity; if you need those, take a look at PyDriller instead.
Also note that revision data is limited to default branches.