Skip to content

GitSweeper Performance Analysis & Optimization Report

Current State Analysis

Binary Size

  • Current binary size: 18MB (unoptimized)
  • Stripped binary size: 12MB (with -ldflags="-s -w")
  • Size reduction potential: ~33% with basic optimizations

Dependencies Analysis

Major Dependencies Contributing to Binary Size

  1. github.com/go-git/go-git/v5 - Full Git implementation in Go
  2. code.gitea.io/git - Additional Git shell operations
  3. github.com/sirupsen/logrus - Structured logging
  4. gopkg.in/alecthomas/kingpin.v2 - Command-line parsing
  5. golang.org/x/crypto - Cryptographic operations for Git

Redundant Dependencies Identified

  • Both go-git/v5 and code.gitea.io/git are used for Git operations
  • Multiple versions of similar packages (e.g., go-git.v4 and go-git/v5)

Performance Bottlenecks Identified

1. Git Operations Performance

Location: internal/githelpers.go:GetMergedBranches()

Issues: - Iterates through ALL master commits to find merged branches - No early termination or optimization for large repositories - Loads entire commit history into memory

Impact: O(n*m) complexity where n = commits in master, m = remote branches

2. Memory Usage

Location: internal/githelpers.go:94-130

Issues: - Stores all branch heads in memory simultaneously - Loads full commit history without pagination - No garbage collection optimization

3. Dependency Bloat

Issues: - Dual Git libraries (go-git + gitea) - Heavy logging framework for simple CLI tool - Unused features in large dependencies

4. Build Optimizations Missing

Issues: - No build constraints for size optimization - Debug symbols included in production builds - No dead code elimination

Optimization Strategies

1. Algorithm Optimizations

A. Optimize Branch Comparison Algorithm

// Current: O(n*m) - checks every commit against every branch
// Optimized: O(n+m) - use hash lookup and early termination

B. Implement Commit History Pagination

  • Process commits in batches instead of loading all at once
  • Early termination when all branches are found

C. Use Git Merge-Base

  • Leverage git merge-base for more efficient merged branch detection
  • Avoid manual commit iteration

2. Dependency Optimizations

A. Consolidate Git Libraries

  • Remove: code.gitea.io/git (only used for push operation)
  • Keep: github.com/go-git/go-git/v5 for all Git operations
  • Savings: ~2-3MB binary size

B. Replace Heavy Dependencies

  • logruslog (standard library) or lightweight alternative
  • kingpinflag (standard library) for simple CLI
  • Savings: ~1-2MB binary size

C. Use Build Tags for Optional Features

  • Separate debug logging into build-tagged files
  • Optional features only included when needed

3. Build Optimizations

A. Enhanced Build Flags

# Current
go build -ldflags "-X main.gitCommit=${GIT_COMMIT}"

# Optimized
go build -ldflags="-s -w -X main.gitCommit=${GIT_COMMIT}" -trimpath

B. Dead Code Elimination

  • Use go build -tags netgo for static builds
  • Enable dead code elimination with proper build constraints

4. Runtime Optimizations

A. Lazy Loading

  • Load Git repository data only when needed
  • Implement caching for repeated operations

B. Concurrent Processing

  • Process multiple branches concurrently
  • Use worker pools for Git operations

Implementation Priority

High Priority (Immediate Impact)

  1. Build optimization - Add stripped builds to Makefile
  2. Remove gitea dependency - Replace with go-git equivalent
  3. Algorithm optimization - Fix O(n*m) complexity in GetMergedBranches

Medium Priority (Significant Impact)

  1. Replace heavy dependencies - logrus → stdlib, kingpin → flag
  2. Implement pagination - Process commits in batches
  3. Add concurrency - Parallel branch processing

Low Priority (Polish)

  1. Memory profiling - Add memory usage monitoring
  2. Caching layer - Cache Git operations
  3. Progressive loading - Stream results to user

Expected Performance Improvements

Binary Size

  • Current: 18MB
  • After optimizations: 6-8MB (60-65% reduction)

Runtime Performance

  • Large repositories: 50-70% faster due to algorithm improvements
  • Memory usage: 40-60% reduction
  • Cold start time: 30-40% faster due to smaller binary

Load Time Improvements

  • Binary loading: Faster due to smaller size
  • Dependency resolution: Fewer dependencies to load
  • Git operations: More efficient algorithms

Monitoring & Measurement

Benchmarks to Add

  1. Binary size tracking in CI
  2. Runtime performance tests with various repository sizes
  3. Memory usage profiling
  4. Cold start time measurement

Metrics to Track

  • Binary size over time
  • Git operation performance
  • Memory allocation patterns
  • User-perceived performance (time to first result)