nodeprofiling module #17

Open
opened 2025-12-21 11:48:09 +00:00 by thabeta · 0 comments
Owner

The profiling module provides system profiling and benchmarking for nodes, supporting performance analysis, capacity planning, and workload optimization. It integrates the industry-standard iperf3 tool for network testing and provides custom benchmarks for disk I/O, memory performance, and CPU capabilities.

The module is designed to help operators understand the performance characteristics of their infrastructure, identify bottlenecks, and make informed decisions about workload placement. It generates detailed reports with historical comparisons, enabling trend analysis and performance regression detection across the fleet.

Functionality

  • Network performance testing: Integrate iperf3 for measuring bandwidth, latency, jitter, and packet loss with configurable test parameters and parallel stream support
  • Disk I/O benchmarking: Perform comprehensive disk performance tests including sequential and random read/write operations, IOPS measurements, and latency analysis
  • Memory profiling: Test memory bandwidth, latency, and performance under various access patterns including NUMA-aware testing
  • CPU performance analysis: Benchmark single-thread and multi-thread performance, instruction throughput, and cache performance
  • Comprehensive reporting: Generate detailed benchmark reports in multiple formats with historical comparisons and trend analysis
  • Resource utilization monitoring: Profile system resource usage during benchmarks to understand performance characteristics under load

Module API

Network Profiling

  • IperfClient::new(server: &str) -> Result<Self, ProfilingError> - Create iperf client for server
  • run_iperf_test(config: IperfConfig) -> Result<IperfResult, ProfilingError> - Execute iperf test
  • measure_bandwidth(server: &str, duration: Duration) -> Result<BandwidthResult, ProfilingError> - Measure bandwidth
  • measure_latency(host: &str, count: u32) -> Result<LatencyResult, ProfilingError> - Measure network latency
  • run_jitter_test(server: &str) -> Result<JitterResult, ProfilingError> - Measure network jitter
  • test_udp_throughput(server: &str, bandwidth: u64) -> Result<ThroughputResult, ProfilingError> - Test UDP throughput
  • IperfConfig::new() -> Self - Create iperf configuration
  • IperfConfig::with_duration(self, duration: Duration) -> Self - Set test duration
  • IperfConfig::with_parallel_streams(self, streams: u32) -> Self - Set parallel stream count
  • IperfConfig::with_window_size(self, size: u32) -> Self - Set TCP window size
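
As an illustration of how the configuration builder above might map onto an iperf3 invocation, here is a minimal sketch. It assumes the builder methods consume `self` so they can be chained, and that the client shells out to the `iperf3` binary; the `to_args` helper, the struct fields, and the 10-second default are hypothetical and not part of the listed API. The flags used (`-c`, `-t`, `-P`, `-w`, `-J`) are standard iperf3 client options.

```rust
use std::time::Duration;

/// Hypothetical configuration type mirroring the builder methods listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct IperfConfig {
    duration: Duration,
    parallel_streams: u32,
    window_size: Option<u32>, // bytes; None keeps the OS default
}

impl Default for IperfConfig {
    fn default() -> Self {
        Self::new()
    }
}

impl IperfConfig {
    /// Assumed defaults: 10 seconds, one stream, OS-default window.
    pub fn new() -> Self {
        Self {
            duration: Duration::from_secs(10),
            parallel_streams: 1,
            window_size: None,
        }
    }

    pub fn with_duration(mut self, duration: Duration) -> Self {
        self.duration = duration;
        self
    }

    pub fn with_parallel_streams(mut self, streams: u32) -> Self {
        self.parallel_streams = streams.max(1); // at least one stream
        self
    }

    pub fn with_window_size(mut self, size: u32) -> Self {
        self.window_size = Some(size);
        self
    }

    /// Hypothetical helper: build the argument list for an `iperf3` client run.
    pub fn to_args(&self, server: &str) -> Vec<String> {
        let mut args = vec![
            "-c".to_string(),
            server.to_string(),
            "-t".to_string(),
            self.duration.as_secs().max(1).to_string(), // whole seconds, minimum 1
            "-P".to_string(),
            self.parallel_streams.to_string(),
            "-J".to_string(), // JSON output, easier to parse into a result type
        ];
        if let Some(size) = self.window_size {
            args.push("-w".to_string());
            args.push(size.to_string());
        }
        args
    }
}
```

The resulting vector could be passed to `std::process::Command::new("iperf3").args(...)`; parsing the JSON output into `IperfResult` is left out of this sketch.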

Disk Profiling

  • DiskProfiler::new(device: &str) -> Result<Self, ProfilingError> - Create disk profiler
  • run_disk_benchmark(path: &Path) -> Result<DiskBenchmarkResult, ProfilingError> - Run full benchmark
  • measure_sequential_read(path: &Path, size: u64) -> Result<SequentialResult, ProfilingError> - Sequential read test
  • measure_sequential_write(path: &Path, size: u64) -> Result<SequentialResult, ProfilingError> - Sequential write test
  • measure_random_read(path: &Path, size: u64) -> Result<RandomResult, ProfilingError> - Random read test
  • measure_random_write(path: &Path, size: u64) -> Result<RandomResult, ProfilingError> - Random write test
  • measure_iops(path: &Path) -> Result<IOPSResult, ProfilingError> - Measure IOPS performance
  • test_disk_latency(path: &Path) -> Result<LatencyResult, ProfilingError> - Test disk latency
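
The sequential tests above could be approximated with the standard library alone. The sketch below is one possible shape, not the module's implementation: it returns `io::Result` instead of `ProfilingError`, and the `SequentialResult` fields, the 1 MiB block size, and the `mib_per_sec` helper are assumptions. Reads served from the page cache will look faster than the device really is; a real benchmark would likely need direct I/O or cache dropping.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical result type: bytes transferred and wall-clock time.
#[derive(Debug)]
pub struct SequentialResult {
    pub bytes: u64,
    pub elapsed: Duration,
}

impl SequentialResult {
    /// Throughput in MiB/s, guarding against a zero-length interval.
    pub fn mib_per_sec(&self) -> f64 {
        let secs = self.elapsed.as_secs_f64().max(f64::EPSILON);
        self.bytes as f64 / (1024.0 * 1024.0) / secs
    }
}

const BLOCK: usize = 1024 * 1024; // assumed 1 MiB I/O block

/// Write `size` bytes sequentially to `path` and time it, including the final sync.
pub fn measure_sequential_write(path: &Path, size: u64) -> io::Result<SequentialResult> {
    let block = vec![0xA5u8; BLOCK];
    let mut file = File::create(path)?;
    let start = Instant::now();
    let mut written = 0u64;
    while written < size {
        let n = (size - written).min(BLOCK as u64) as usize;
        file.write_all(&block[..n])?;
        written += n as u64;
    }
    file.sync_all()?; // count the flush to the device in the measurement
    Ok(SequentialResult { bytes: written, elapsed: start.elapsed() })
}

/// Read up to `size` bytes sequentially from `path` and time it.
pub fn measure_sequential_read(path: &Path, size: u64) -> io::Result<SequentialResult> {
    let mut buf = vec![0u8; BLOCK];
    let mut file = File::open(path)?;
    let start = Instant::now();
    let mut read = 0u64;
    while read < size {
        let want = (size - read).min(BLOCK as u64) as usize;
        let n = file.read(&mut buf[..want])?;
        if n == 0 {
            break; // end of file reached before `size` bytes
        }
        read += n as u64;
    }
    Ok(SequentialResult { bytes: read, elapsed: start.elapsed() })
}
```

Timing the `sync_all` call is a deliberate choice here: without it, a write test mostly measures how fast the kernel can buffer data in memory.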

Memory Profiling

  • MemoryProfiler::new() -> Self - Create memory profiler
  • measure_memory_bandwidth() -> Result<MemoryBandwidthResult, ProfilingError> - Measure bandwidth
  • test_memory_latency(size: u64) -> Result<MemoryLatencyResult, ProfilingError> - Test memory latency
  • run_memory_stress_test(duration: Duration) -> Result<StressTestResult, ProfilingError> - Stress test memory
  • measure_cache_performance() -> Result<CacheResult, ProfilingError> - Test cache performance
  • test_numa_bandwidth() -> Result<NumaResult, ProfilingError> - Test NUMA bandwidth
  • profile_memory_usage(pid: u32) -> Result<MemoryUsageProfile, ProfilingError> - Profile memory usage
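
For the bandwidth measurement, a rough approach is to time repeated copies of a buffer much larger than the last-level cache. The sketch below assumes that approach; unlike the listed `measure_memory_bandwidth()`, it takes the buffer size and pass count as explicit (hypothetical) parameters and returns a plain struct rather than a `Result`. It is single-threaded and not NUMA-aware.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical result type for a copy-based bandwidth test.
#[derive(Debug)]
pub struct MemoryBandwidthResult {
    pub bytes_moved: u64, // bytes read plus bytes written
    pub seconds: f64,
}

impl MemoryBandwidthResult {
    /// Bandwidth in GiB/s, guarding against a zero-length interval.
    pub fn gib_per_sec(&self) -> f64 {
        self.bytes_moved as f64 / (1024.0 * 1024.0 * 1024.0) / self.seconds.max(f64::EPSILON)
    }
}

/// Copy a `buffer_bytes`-sized buffer `passes` times and time the copies.
pub fn measure_memory_bandwidth(buffer_bytes: usize, passes: u32) -> MemoryBandwidthResult {
    let src = vec![1u8; buffer_bytes];
    let mut dst = vec![0u8; buffer_bytes];

    let start = Instant::now();
    for _ in 0..passes {
        // black_box keeps the optimizer from eliding the repeated copies.
        dst.copy_from_slice(black_box(&src));
        black_box(&mut dst);
    }
    let seconds = start.elapsed().as_secs_f64();

    // Each pass reads the source once and writes the destination once.
    let bytes_moved = 2 * buffer_bytes as u64 * u64::from(passes);
    MemoryBandwidthResult { bytes_moved, seconds }
}
```

A call such as `measure_memory_bandwidth(256 * 1024 * 1024, 8)` would exercise main memory on most machines; small buffers mostly measure cache bandwidth instead.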

CPU Profiling

  • CPUProfiler::new() -> Self - Create CPU profiler
  • run_cpu_benchmark(threads: u32, duration: Duration) -> Result<CPUBenchmarkResult, ProfilingError> - Run benchmark
  • measure_single_thread_performance() -> Result<SingleThreadResult, ProfilingError> - Single thread test
  • measure_multi_thread_scaling(threads: u32) -> Result<ScalingResult, ProfilingError> - Multi-thread scaling
  • test_instruction_throughput() -> Result<InstructionResult, ProfilingError> - Test instruction throughput
  • measure_cache_misses() -> Result<CacheMissResult, ProfilingError> - Measure cache misses
  • profile_cpu_usage(pid: u32, duration: Duration) -> Result<CPUUsageProfile, ProfilingError> - Profile CPU usage
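
One way to read "multi-thread scaling" is throughput scaling with a fixed amount of work per thread: run the same CPU-bound loop on 1..n threads and compare aggregate throughput with the single-thread baseline. The sketch below assumes that interpretation; the xorshift workload, the `ScalingPoint` type, and the extra `iterations_per_thread` parameter are hypothetical.

```rust
use std::hint::black_box;
use std::thread;
use std::time::{Duration, Instant};

/// CPU-bound busy work: a xorshift64 loop that the optimizer cannot fold away.
fn spin_work(iterations: u64) -> u64 {
    let mut x = 0x9E37_79B9_7F4A_7C15u64;
    for _ in 0..iterations {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    x
}

/// Run the same workload on `threads` threads and return the wall-clock time.
fn time_with_threads(threads: u32, iterations_per_thread: u64) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(move || black_box(spin_work(black_box(iterations_per_thread)))))
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    start.elapsed()
}

/// Hypothetical per-thread-count measurement.
#[derive(Debug)]
pub struct ScalingPoint {
    pub threads: u32,
    pub speedup: f64,    // aggregate throughput relative to one thread
    pub efficiency: f64, // speedup divided by thread count; 1.0 is ideal
}

/// Measure throughput scaling for 1..=max_threads threads.
pub fn measure_multi_thread_scaling(max_threads: u32, iterations_per_thread: u64) -> Vec<ScalingPoint> {
    let baseline = time_with_threads(1, iterations_per_thread)
        .as_secs_f64()
        .max(f64::EPSILON);
    (1..=max_threads)
        .map(|n| {
            let elapsed = time_with_threads(n, iterations_per_thread)
                .as_secs_f64()
                .max(f64::EPSILON);
            // n threads complete n times the baseline work in `elapsed` seconds.
            let speedup = f64::from(n) * baseline / elapsed;
            ScalingPoint { threads: n, speedup, efficiency: speedup / f64::from(n) }
        })
        .collect()
}
```

Efficiency well below 1.0 at higher thread counts typically points at SMT siblings, thermal throttling, or fewer physical cores than requested threads. Results from a single run are noisy, so repeated runs and a median would be a reasonable refinement.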

Benchmark Reports

  • BenchmarkReport::new() -> Self - Create new benchmark report
  • add_network_result(&mut self, result: IperfResult) -> &mut Self - Add network results
  • add_disk_result(&mut self, result: DiskBenchmarkResult) -> &mut Self - Add disk results
  • add_memory_result(&mut self, result: MemoryBandwidthResult) -> &mut Self - Add memory results
  • add_cpu_result(&mut self, result: CPUBenchmarkResult) -> &mut Self - Add CPU results
  • generate_json(&self) -> Result<String, ProfilingError> - Generate JSON report
  • generate_html(&self) -> Result<String, ProfilingError> - Generate HTML report
  • compare_baselines(&self, baseline: &BenchmarkReport) -> ComparisonResult - Compare with baseline
  • export_to_csv(&self, path: &Path) -> Result<(), ProfilingError> - Export to CSV format
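
The chained `add_*` methods and `compare_baselines` could fit together roughly as follows. This is a sketch under simplifying assumptions: each result type is reduced to a single hypothetical field, the report stores metrics in a map keyed by made-up names, every metric is treated as higher-is-better, and `ComparisonResult` with its `regressions` helper is invented for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily reduced result types.
pub struct IperfResult {
    pub bits_per_second: f64,
}

pub struct DiskBenchmarkResult {
    pub seq_read_mib_s: f64,
}

/// Report holding named metrics; all are assumed to be higher-is-better.
#[derive(Default)]
pub struct BenchmarkReport {
    metrics: BTreeMap<&'static str, f64>,
}

/// Percent change of each shared metric relative to the baseline.
pub struct ComparisonResult {
    pub percent_change: BTreeMap<&'static str, f64>,
}

impl ComparisonResult {
    /// Metrics that dropped by more than `tolerance_pct` percent.
    pub fn regressions(&self, tolerance_pct: f64) -> Vec<&'static str> {
        self.percent_change
            .iter()
            .filter(|(_, pct)| **pct < -tolerance_pct)
            .map(|(name, _)| *name)
            .collect()
    }
}

impl BenchmarkReport {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn add_network_result(&mut self, result: IperfResult) -> &mut Self {
        self.metrics.insert("network_bits_per_second", result.bits_per_second);
        self
    }

    pub fn add_disk_result(&mut self, result: DiskBenchmarkResult) -> &mut Self {
        self.metrics.insert("disk_seq_read_mib_s", result.seq_read_mib_s);
        self
    }

    /// Compare metrics present in both reports; zero baselines are skipped.
    pub fn compare_baselines(&self, baseline: &BenchmarkReport) -> ComparisonResult {
        let mut percent_change = BTreeMap::new();
        for (name, &current) in &self.metrics {
            if let Some(&base) = baseline.metrics.get(name) {
                if base != 0.0 {
                    percent_change.insert(*name, (current - base) / base * 100.0);
                }
            }
        }
        ComparisonResult { percent_change }
    }
}
```

Returning `&mut Self` from the `add_*` methods allows calls like `report.add_network_result(net).add_disk_result(disk)` on a mutable binding. Latency-style metrics, where lower is better, would need the sign of the regression check inverted.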
despiegk added this to the later milestone 2025-12-21 20:38:29 +00:00
Reference
geomind_research/herolib_rust#17