# Performance Optimization Guide

## Overview

Ghost detects process injection and exposes configuration options that trade detection coverage for speed. This guide covers the optimization strategies available today and the performance to expect from them.

## Performance Characteristics

### Expected Detection Engine Performance

- **Process Enumeration**: 10-50ms for all system processes
- **Memory Region Analysis**: 1-5ms per process (platform-dependent)
- **Thread Enumeration**: 1-10ms per process
- **Detection Heuristics**: <1ms per process
- **Memory Usage**: ~10-20MB for core engine

**Note**: Actual performance varies significantly by:
- Number of processes (100-1000+ typical)
- Memory region count per process
- Thread count per process
- Platform (Windows APIs vs Linux procfs)
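
To turn the per-stage figures above into a rough budget for one full scan, the sketch below adds the one-off enumeration cost to the per-process costs multiplied by the process count. `CostRange` and `estimate_scan_ms` are illustrative names, not part of Ghost, and the `<1ms` heuristics figure is treated as 0 to 1 ms.

```rust
/// Per-stage cost range in milliseconds, taken from the list above.
#[derive(Clone, Copy)]
struct CostRange {
    low_ms: u64,
    high_ms: u64,
}

/// Estimate the (low, high) duration in milliseconds of one sequential
/// full scan. Hypothetical helper for back-of-the-envelope planning only.
fn estimate_scan_ms(process_count: u64) -> (u64, u64) {
    let enumeration = CostRange { low_ms: 10, high_ms: 50 };
    let per_process = [
        CostRange { low_ms: 1, high_ms: 5 },  // memory region analysis
        CostRange { low_ms: 1, high_ms: 10 }, // thread enumeration
        CostRange { low_ms: 0, high_ms: 1 },  // detection heuristics (<1ms)
    ];
    let low: u64 = per_process.iter().map(|c| c.low_ms).sum();
    let high: u64 = per_process.iter().map(|c| c.high_ms).sum();
    (
        enumeration.low_ms + process_count * low,
        enumeration.high_ms + process_count * high,
    )
}
```

For 500 processes, `estimate_scan_ms(500)` returns `(1010, 8050)`, so a sequential scan with every stage enabled lands somewhere between about 1 and 8 seconds under these assumptions.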

### Configuration Options

#### 1. Selective Detection

```rust
use ghost_core::config::DetectionConfig;

// Disable expensive detections for performance
let mut config = DetectionConfig::default();
config.rwx_detection = true;      // Fast: O(n) memory regions
config.shellcode_detection = false; // Skip pattern matching
config.hook_detection = false;    // Skip module enumeration
config.thread_detection = true;   // Moderate: thread enum
config.hollowing_detection = false; // Skip heuristics
```

#### 2. Preset Modes

```rust
// Fast scanning mode
let config = DetectionConfig::performance_mode();

// Thorough scanning mode
let config = DetectionConfig::thorough_mode();
```

#### 3. Process Filtering

```rust
// Skip system processes
config.skip_system_processes = true;

// Limit memory scan size
config.max_memory_scan_size = 10 * 1024 * 1024; // 10MB per process
```
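
This guide does not show how Ghost enforces `max_memory_scan_size` internally. As a rough illustration of what a per-process byte budget means, the sketch below spreads a budget across regions in order and stops reading once it is spent. `plan_reads` is a hypothetical helper, not a Ghost API.

```rust
/// Split a per-process byte budget across memory regions, in order.
/// Returns how many bytes to read from each region; regions that fall
/// past the budget get 0. Illustrative only, not Ghost's implementation.
fn plan_reads(region_sizes: &[usize], max_scan_size: usize) -> Vec<usize> {
    let mut remaining = max_scan_size;
    region_sizes
        .iter()
        .map(|&size| {
            // Read the whole region if it fits, otherwise what is left.
            let take = size.min(remaining);
            remaining -= take;
            take
        })
        .collect()
}
```

With a 10MB budget and regions of 4MB, 8MB and 2MB, the plan reads 4MB, then 6MB, then nothing, so a lower budget shortens scans at the cost of leaving later regions unread.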

## Performance Considerations

### Platform-Specific Performance

**Windows**:
- CreateToolhelp32Snapshot: One snapshot call followed by in-memory iteration, fast
- VirtualQueryEx: Iterative, slower for processes with many regions
- ReadProcessMemory: Cross-process, requires proper handles
- NtQueryInformationThread: Native (ntdll) API with limited official documentation, one call per thread

**Linux**:
- /proc enumeration: Directory reads, fast
- /proc/[pid]/maps parsing: File I/O, moderate
- /proc/[pid]/mem reading: Requires ptrace-level access to the target (same user, subject to the Yama `ptrace_scope` setting, or CAP_SYS_PTRACE)
- /proc/[pid]/task parsing: Per-thread file I/O
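
Most of the `/proc/[pid]/maps` cost on Linux is text parsing: each line holds an address range, permissions, offset, device, inode and an optional path. The sketch below parses the first two fields and flags RWX mappings. `MapsEntry`, `parse_maps_line` and `is_rwx` are illustrative names and may not match Ghost's own parser.

```rust
/// One parsed line of /proc/[pid]/maps (address range and permissions only).
#[derive(Debug, PartialEq)]
struct MapsEntry {
    start: u64,
    end: u64,
    perms: String,
}

/// Parse a single maps line such as
/// "7f1c2a000000-7f1c2a021000 rwxp 00000000 00:00 0".
/// Returns None if the line does not have the expected shape.
fn parse_maps_line(line: &str) -> Option<MapsEntry> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?;
    let (start, end) = range.split_once('-')?;
    Some(MapsEntry {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        perms: perms.to_string(),
    })
}

/// True when the mapping is readable, writable and executable at once.
fn is_rwx(entry: &MapsEntry) -> bool {
    entry.perms.starts_with("rwx")
}
```

Because the work is linear in the number of lines, processes with thousands of mappings dominate the Linux scan time.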

**macOS**:
- sysctl KERN_PROC_ALL: Single syscall, fast
- Memory/thread analysis: Not yet implemented

### Running Tests

```bash
# Run all tests including performance assertions
cargo test

# Show test stdout, including any timings the tests print
cargo test -- --nocapture
```

## Tuning Guidelines

### For Continuous Monitoring

1. **Adjust scan interval**: Configure `scan_interval_ms` in `DetectionConfig`
2. **Skip system processes**: Set `skip_system_processes = true`
3. **Limit memory scans**: Reduce `max_memory_scan_size`
4. **Disable heavy detections**: Turn off `hook_detection` and `shellcode_detection`
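
When a scan takes a noticeable fraction of the interval, the loop that drives it matters too. The sketch below is one possible timer loop that subtracts the scan duration from `scan_interval_ms`, so scans start at a roughly fixed cadence. `sleep_after_scan` and `run_scans` are hypothetical helpers, and `scan_once` stands in for whatever performs a scan.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// How long to sleep after a scan so that scans start roughly every
/// `scan_interval_ms`. If the scan overran the interval, do not sleep.
fn sleep_after_scan(scan_interval_ms: u64, scan_took: Duration) -> Duration {
    Duration::from_millis(scan_interval_ms).saturating_sub(scan_took)
}

/// Run `scan_once` a fixed number of times at the configured interval.
/// A real monitor would loop until asked to stop instead.
fn run_scans<F: FnMut()>(scan_interval_ms: u64, iterations: usize, mut scan_once: F) {
    for _ in 0..iterations {
        let started = Instant::now();
        scan_once();
        thread::sleep(sleep_after_scan(scan_interval_ms, started.elapsed()));
    }
}
```

If scans regularly overrun the interval, the sleep collapses to zero and the monitor runs back to back, which is a sign to raise `scan_interval_ms` or disable a detection.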

### For One-Time Analysis

1. **Enable all detections**: Use `DetectionConfig::thorough_mode()`
2. **Full memory scanning**: Increase `max_memory_scan_size`
3. **Include system processes**: Set `skip_system_processes = false`

## Platform-Specific Optimizations

### Windows

- Run as Administrator for full process access
- Use `PROCESS_QUERY_LIMITED_INFORMATION` when `PROCESS_QUERY_INFORMATION` fails
- Handle access denied errors gracefully (system processes)
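
The access-right fallback can be sketched independently of the Windows API. In the example below, `open` is a caller-supplied closure standing in for a call such as `OpenProcess`, and `open_with_fallback` is a hypothetical helper. The two constants use the values from the Windows SDK headers.

```rust
const PROCESS_QUERY_INFORMATION: u32 = 0x0400;
const PROCESS_QUERY_LIMITED_INFORMATION: u32 = 0x1000;

/// Try the richer access right first, then fall back to the limited one.
/// Returns the handle together with the access right that succeeded.
/// If both attempts fail, the error from the second attempt is returned.
fn open_with_fallback<H, E>(
    mut open: impl FnMut(u32) -> Result<H, E>,
) -> Result<(H, u32), E> {
    match open(PROCESS_QUERY_INFORMATION) {
        Ok(handle) => Ok((handle, PROCESS_QUERY_INFORMATION)),
        Err(_) => open(PROCESS_QUERY_LIMITED_INFORMATION)
            .map(|handle| (handle, PROCESS_QUERY_LIMITED_INFORMATION)),
    }
}
```

Returning the granted access right lets later stages skip queries that the limited handle cannot serve, rather than failing on each one.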

### Linux

- Run with appropriate privileges (root or CAP_SYS_PTRACE)
- Handle permission denied for /proc/[pid]/mem gracefully
- Consider using process groups for batch access
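
Handling `/proc` read failures gracefully mostly means telling "no access" and "process already exited" apart from real errors. The sketch below classifies a `std::io::Error` on that basis. `ReadOutcome` and `classify_read_error` are illustrative names, and the `ESRCH` value is the Linux one.

```rust
use std::io;

/// "No such process" on Linux; reads under /proc can fail with it
/// when the target exits mid-scan.
const ESRCH: i32 = 3;

/// What the scanner should do with a process after a failed /proc read.
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    /// Insufficient privileges: skip this process and keep scanning.
    SkipNoAccess,
    /// The process exited between enumeration and the read.
    SkipExited,
    /// Anything else is unexpected and worth surfacing.
    Fail,
}

fn classify_read_error(err: &io::Error) -> ReadOutcome {
    if err.raw_os_error() == Some(ESRCH) {
        return ReadOutcome::SkipExited;
    }
    match err.kind() {
        io::ErrorKind::PermissionDenied => ReadOutcome::SkipNoAccess,
        io::ErrorKind::NotFound => ReadOutcome::SkipExited,
        _ => ReadOutcome::Fail,
    }
}
```

Treating the first two cases as skips keeps a single inaccessible or short-lived process from aborting the whole scan.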

### macOS

- Limited functionality (process enumeration only)
- Most detection features require kernel extensions or Endpoint Security framework

## Troubleshooting Performance Issues

### High CPU Usage

1. Reduce scan frequency (`scan_interval_ms`)
2. Disable thread analysis (`thread_detection = false`)
3. Skip memory region enumeration
4. Filter out known-good processes

### High Memory Usage

1. Reduce the baseline cache size (track fewer processes)
2. Clear detection history periodically
3. Limit memory reading buffer sizes
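
An alternative to clearing history on a timer is to cap it, so memory stays bounded without any housekeeping. The sketch below is a fixed-capacity buffer that drops the oldest entry when full. `BoundedHistory` is a hypothetical container; Ghost's own history type is not shown in this guide.

```rust
use std::collections::VecDeque;

/// Fixed-capacity history that drops the oldest entry when full.
struct BoundedHistory<T> {
    capacity: usize,
    entries: VecDeque<T>,
}

impl<T> BoundedHistory<T> {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: VecDeque::with_capacity(capacity),
        }
    }

    /// Append an entry, evicting the oldest one if the buffer is full.
    fn push(&mut self, entry: T) {
        if self.capacity == 0 {
            return; // a zero-capacity history stores nothing
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(entry);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }

    fn oldest(&self) -> Option<&T> {
        self.entries.front()
    }
}
```

The capacity then becomes the single knob for history memory: it never grows past `capacity` entries regardless of how long the monitor runs.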

### Slow Detection Response

1. Disable hook detection (expensive module enumeration)
2. Skip shellcode pattern matching
3. Use the performance preset (`DetectionConfig::performance_mode()`)

## Current Implementation Limits

**What's NOT implemented**:
- No performance metrics collection system
- No Prometheus/monitoring integration
- No SIMD-accelerated pattern matching
- No parallel/async process scanning (single-threaded)
- No LRU caching of results
- No batch processing APIs

**Current architecture**:
- Sequential process scanning
- Simple HashMap for baseline tracking
- Basic confidence scoring
- Manual timer-based intervals (TUI)

## Testing Performance

```rust
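// Import paths are an assumption; adjust them to wherever your version
// of ghost_core exports these types.
use ghost_core::{DetectionEngine, ProcessInfo};

#[test]
fn test_detection_performance() {
    use std::time::Instant;

    let mut engine = DetectionEngine::new().unwrap();
    let process = ProcessInfo::new(1234, 4, "test.exe".to_string());
    let regions = vec![/* test regions */];

    // Time 100 back-to-back analyses of the same synthetic process
    let start = Instant::now();
    for _ in 0..100 {
        let _ = engine.analyze_process(&process, &regions, None);
    }
    let duration = start.elapsed();

    // Should complete 100 analyses in under 100ms.
    // Use `cargo test --release` for representative timings.
    assert!(
        duration.as_millis() < 100,
        "100 analyses took {:?}",
        duration
    );
}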
```

## Best Practices

1. **Start with defaults**: Use `DetectionConfig::default()` initially
2. **Profile specific modules**: Identify which detection is slow
3. **Adjust based on needs**: Disable features you don't need
4. **Handle errors gracefully**: Processes may exit during scan
5. **Test on target hardware**: Performance varies by system
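
Since Ghost has no built-in metrics collection, profiling a specific module can start with plain `std::time::Instant` measurements around each stage. The sketch below is a minimal version of that idea. `timed` and `slowest` are illustrative helpers, and the stage labels are up to the caller.

```rust
use std::time::{Duration, Instant};

/// Run `stage` once and return its result together with the elapsed time.
fn timed<T>(stage: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = stage();
    (value, start.elapsed())
}

/// Return the label of the slowest recorded stage, if any were recorded.
fn slowest<'a>(timings: &[(&'a str, Duration)]) -> Option<&'a str> {
    timings
        .iter()
        .max_by_key(|(_, elapsed)| *elapsed)
        .map(|(name, _)| *name)
}
```

Wrapping each detection call in `timed` and collecting the `(label, duration)` pairs over a few scans is usually enough to decide which detection to disable first.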

## Future Performance Improvements

Potential enhancements (not yet implemented):
- Parallel process analysis using rayon
- Async I/O for file system operations (Linux)
- Result caching with TTL
- Incremental scanning (only changed processes)
- Memory-mapped file parsing
- SIMD pattern matching for shellcode