GitSweeper Performance Analysis & Optimization Report¶
Current State Analysis¶
Binary Size¶
- Current binary size: 18MB (unoptimized)
- Stripped binary size: 12MB (with
-ldflags="-s -w") - Size reduction potential: ~33% with basic optimizations
Dependencies Analysis¶
Major Dependencies Contributing to Binary Size¶
- github.com/go-git/go-git/v5 - Full Git implementation in Go
- code.gitea.io/git - Additional Git shell operations
- github.com/sirupsen/logrus - Structured logging
- gopkg.in/alecthomas/kingpin.v2 - Command-line parsing
- golang.org/x/crypto - Cryptographic operations for Git
Redundant Dependencies Identified¶
- Both
go-git/v5andcode.gitea.io/gitare used for Git operations - Multiple versions of similar packages (e.g.,
go-git.v4andgo-git/v5)
Performance Bottlenecks Identified¶
1. Git Operations Performance¶
Location: internal/githelpers.go:GetMergedBranches()
Issues: - Iterates through ALL master commits to find merged branches - No early termination or optimization for large repositories - Loads entire commit history into memory
Impact: O(n*m) complexity where n = commits in master, m = remote branches
2. Memory Usage¶
Location: internal/githelpers.go:94-130
Issues: - Stores all branch heads in memory simultaneously - Loads full commit history without pagination - No garbage collection optimization
3. Dependency Bloat¶
Issues: - Dual Git libraries (go-git + gitea) - Heavy logging framework for simple CLI tool - Unused features in large dependencies
4. Build Optimizations Missing¶
Issues: - No build constraints for size optimization - Debug symbols included in production builds - No dead code elimination
Optimization Strategies¶
1. Algorithm Optimizations¶
A. Optimize Branch Comparison Algorithm¶
// Current: O(n*m) - checks every commit against every branch
// Optimized: O(n+m) - use hash lookup and early termination
B. Implement Commit History Pagination¶
- Process commits in batches instead of loading all at once
- Early termination when all branches are found
C. Use Git Merge-Base¶
- Leverage
git merge-basefor more efficient merged branch detection - Avoid manual commit iteration
2. Dependency Optimizations¶
A. Consolidate Git Libraries¶
- Remove:
code.gitea.io/git(only used for push operation) - Keep:
github.com/go-git/go-git/v5for all Git operations - Savings: ~2-3MB binary size
B. Replace Heavy Dependencies¶
- logrus → log (standard library) or lightweight alternative
- kingpin → flag (standard library) for simple CLI
- Savings: ~1-2MB binary size
C. Use Build Tags for Optional Features¶
- Separate debug logging into build-tagged files
- Optional features only included when needed
3. Build Optimizations¶
A. Enhanced Build Flags¶
# Current
go build -ldflags "-X main.gitCommit=${GIT_COMMIT}"
# Optimized
go build -ldflags="-s -w -X main.gitCommit=${GIT_COMMIT}" -trimpath
B. Dead Code Elimination¶
- Use
go build -tags netgofor static builds - Enable dead code elimination with proper build constraints
4. Runtime Optimizations¶
A. Lazy Loading¶
- Load Git repository data only when needed
- Implement caching for repeated operations
B. Concurrent Processing¶
- Process multiple branches concurrently
- Use worker pools for Git operations
Implementation Priority¶
High Priority (Immediate Impact)¶
- Build optimization - Add stripped builds to Makefile
- Remove gitea dependency - Replace with go-git equivalent
- Algorithm optimization - Fix O(n*m) complexity in GetMergedBranches
Medium Priority (Significant Impact)¶
- Replace heavy dependencies - logrus → stdlib, kingpin → flag
- Implement pagination - Process commits in batches
- Add concurrency - Parallel branch processing
Low Priority (Polish)¶
- Memory profiling - Add memory usage monitoring
- Caching layer - Cache Git operations
- Progressive loading - Stream results to user
Expected Performance Improvements¶
Binary Size¶
- Current: 18MB
- After optimizations: 6-8MB (60-65% reduction)
Runtime Performance¶
- Large repositories: 50-70% faster due to algorithm improvements
- Memory usage: 40-60% reduction
- Cold start time: 30-40% faster due to smaller binary
Load Time Improvements¶
- Binary loading: Faster due to smaller size
- Dependency resolution: Fewer dependencies to load
- Git operations: More efficient algorithms
Monitoring & Measurement¶
Benchmarks to Add¶
- Binary size tracking in CI
- Runtime performance tests with various repository sizes
- Memory usage profiling
- Cold start time measurement
Metrics to Track¶
- Binary size over time
- Git operation performance
- Memory allocation patterns
- User-perceived performance (time to first result)