Integration Guide - Performance Variants
🎯 Purpose
This guide explains how to integrate the Performance Variants subsystem with other AudioLab components, particularly:
- 05_15_REFERENCE_IMPLEMENTATIONS - For validation
- 05_18_QUALITY_METRICS - For accuracy measurement
- 05_MODULES - For actual audio processing
🏗️ Architecture Overview
┌─────────────────────────────────────────────────────────┐
│           AudioLab Audio Processing Pipeline            │
└─────────────────────────────────────────────────────────┘
                             │
                             ├─► Gain, Mix, Filter, etc.
                             │
                             ▼
             ┌────────────────────────────────┐
             │       VariantDispatcher        │
             │  (Auto-selects best variant)   │
             └────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          ▼                  ▼                  ▼
    ┌───────────┐      ┌───────────┐      ┌───────────┐
    │  Scalar   │      │   SIMD    │      │    GPU    │
    │  Variant  │      │ Variants  │      │ Variants  │
    └───────────┘      └───────────┘      └───────────┘
          │                  │                  │
          │      ┌───────────┴────────┐         │
          │      │                    │         │
          ▼      ▼                    ▼         ▼
      ┌──────────────────────────────────────────────┐
      │          Reference Implementations           │
      │        (Gold standard for validation)        │
      └──────────────────────────────────────────────┘
                             │
                             ▼
                   ┌────────────────────┐
                   │  Quality Metrics   │
                   │  (THD, SNR, etc.)  │
                   └────────────────────┘
📦 Component Integration
1. With Reference Implementations (05_15)
Purpose: Validate that optimized variants produce correct results.
Integration Points:
// In validation tests
#include "SSE4Variants.h"
#include "kernels/oscillators/sine_kernel.hpp" // Reference
// Compare SIMD variant against reference
SSE4GainVariant simdGain;
ReferenceGain refGain; // Scalar implementation
// Process same input
simdGain.process(input, outputSIMD, numSamples);
refGain.process(input, outputRef, numSamples);
// Validate accuracy
float maxError = calculateMaxError(outputSIMD, outputRef, numSamples);
REQUIRE(maxError < tolerance);
Workflow:
- Development Phase:
  - Implement scalar reference first (05_15)
  - Certify reference as "Gold Standard"
  - Implement SIMD variant (05_16)
- Validation Phase:
  - Create validation test comparing SIMD vs reference
  - Run tests with various inputs (sine, noise, edge cases)
  - Measure max error, RMS error
  - Verify accuracy within tolerance
- Certification Phase:
  - All tests pass → SIMD variant certified
  - Document any numerical differences
  - Add to certified variants list
Tolerances:
| Operation | Max Error | RMS Error | Notes |
|---|---|---|---|
| Gain | < 1e-6 | < 1e-7 | Should be bit-exact |
| Mix | < 1e-6 | < 1e-7 | Should be bit-exact |
| IIR Filter | < 1e-5 | < 1e-6 | Accumulation errors OK |
| FIR Filter | < 1e-6 | < 1e-7 | Should be near bit-exact |
| FFT | < 1e-4 | < 1e-5 | More relaxed for complex ops |
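The validation snippet earlier in this section calls `calculateMaxError` without showing it. A minimal sketch of how the two error columns in the table could be computed, assuming the signature implied by that call site; `calculateRmsError` is a hypothetical companion helper, not a confirmed AudioLab API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Largest absolute per-sample difference between two buffers.
// Signature assumed from the call site in the validation snippet.
inline float calculateMaxError(const float* a, const float* b, std::size_t n) {
    float maxErr = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        maxErr = std::max(maxErr, std::fabs(a[i] - b[i]));
    }
    return maxErr;
}

// Root-mean-square difference (hypothetical helper). Accumulates in double
// so the metric itself adds as little rounding error as possible.
inline float calculateRmsError(const float* a, const float* b, std::size_t n) {
    if (n == 0) return 0.0f;
    double sumSq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sumSq += d * d;
    }
    return static_cast<float>(std::sqrt(sumSq / static_cast<double>(n)));
}
```

Both values can then be compared against the row of the tolerance table that matches the operation under test.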
2. With Quality Metrics (05_18)
Purpose: Measure audio quality metrics (THD, SNR, frequency response).
Integration Points:
#include "metrics_core.hpp"
#include "SSE4Variants.h"
// Generate test tone
generateSineWave(input, 1000.0f, 48000.0f);
// Process with variant
SSE4GainVariant gain;
gain.setGain(0.5f);
gain.process(input, output, numSamples);
// Measure quality
THDAnalyzer thd;
float thdValue = thd.analyze(output, numSamples, 48000.0f);
REQUIRE(thdValue < -120.0f); // Gain gate: THD+N < -120dB
SNRAnalyzer snr;
float snrValue = snr.analyze(output, numSamples);
REQUIRE(snrValue > 120.0f); // Gain gate: SNR > 120dB
Quality Gates:
| Variant Type | THD+N | SNR | Freq Response | Phase |
|---|---|---|---|---|
| Gain | < -120dB | > 120dB | ±0.01dB | ±0.1° |
| Mix | < -120dB | > 110dB | ±0.01dB | ±0.1° |
| IIR Filter | < -90dB | > 90dB | ±0.5dB | ±2° |
| FIR Filter | < -100dB | > 100dB | ±0.1dB | ±0.5° |
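The internals of `SNRAnalyzer` are not shown in this guide. As an illustration of what an SNR gate measures, here is a sketch of one common definition: the power of a reference signal divided by the power of the error between the variant's output and that reference. The function name `snrDb` and its reference-based signature are assumptions for this example and differ from the `analyze(output, numSamples)` call above.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// SNR in dB of `output` relative to `reference` (hypothetical helper).
// Signal power is taken from the reference, noise power from the difference.
inline double snrDb(const float* reference, const float* output, std::size_t n) {
    double signalPower = 0.0;
    double noisePower = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double s = reference[i];
        const double e = static_cast<double>(output[i]) - s;
        signalPower += s * s;
        noisePower += e * e;
    }
    if (noisePower == 0.0) {
        // Output matches the reference exactly: no measurable noise.
        return std::numeric_limits<double>::infinity();
    }
    return 10.0 * std::log10(signalPower / noisePower);
}
```

A variant that matches its reference exactly returns infinity here, so it passes any finite SNR gate.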
3. With Audio Engines (05_13)
Purpose: Use variants in actual audio processing engines.
Integration Example:
#include "VariantDispatcher.h"
#include "SSE4Variants.h"
#include "AVX2Variants.h"
class AudioEngine {
public:
void init(double sampleRate) {
// Register all available variants
dispatcher_.registerVariant(
std::make_unique<SSE4GainVariant>(),
VariantType::SIMD,
1.2f
);
if (HAS_FEATURE(AVX2)) {
dispatcher_.registerVariant(
std::make_unique<AVX2GainVariant>(),
VariantType::SIMD,
1.5f
);
}
dispatcher_.init(sampleRate);
// Configure for low-latency
RuntimeContext context;
context.bufferSize = 128;
context.latencyBudgetMs = 3.0f;
context.maxCPUUsage = 0.7f;
dispatcher_.setScoringWeights(ScoringProfiles::speed());
dispatcher_.selectOptimalVariant(context);
}
void processBlock(float* input, float* output, int numSamples) {
// Dispatcher automatically uses best variant
dispatcher_.process(input, output, numSamples);
}
private:
VariantDispatcher dispatcher_;
};
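To make the registration and selection flow above more concrete, the following sketch shows one way score-based selection could work. `MiniDispatcher` and all of its members are hypothetical; the real `VariantDispatcher` also weighs a `RuntimeContext` and scoring profile, which this simplified version leaves out.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a variant dispatcher.
class MiniDispatcher {
public:
    using Kernel = std::function<void(const float*, float*, std::size_t)>;

    // `supported` would normally come from a runtime CPU feature check.
    void registerVariant(std::string name, float score, bool supported, Kernel kernel) {
        variants_.push_back({std::move(name), score, supported, std::move(kernel)});
    }

    // Picks the supported variant with the highest score.
    // Returns false if no usable variant is registered.
    bool selectOptimalVariant() {
        selected_ = kNone;
        for (std::size_t i = 0; i < variants_.size(); ++i) {
            const Entry& v = variants_[i];
            if (!v.supported) continue;
            if (selected_ == kNone || v.score > variants_[selected_].score) {
                selected_ = i;
            }
        }
        return selected_ != kNone;
    }

    // Name of the selected variant, or an empty string if none is selected.
    std::string selectedName() const {
        return selected_ == kNone ? std::string() : variants_[selected_].name;
    }

    // Forwards to the selected kernel; returns false if nothing is selected.
    bool process(const float* in, float* out, std::size_t n) const {
        if (selected_ == kNone) return false;
        variants_[selected_].kernel(in, out, n);
        return true;
    }

private:
    struct Entry {
        std::string name;
        float score;
        bool supported;
        Kernel kernel;
    };
    static constexpr std::size_t kNone = static_cast<std::size_t>(-1);
    std::vector<Entry> variants_;
    std::size_t selected_ = kNone;
};
```

Storing the selection as an index rather than a pointer keeps it valid if more variants are registered later. Selection happens once, outside the audio callback, so `process` only forwards to the chosen kernel.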
🔧 Build Integration
CMake Integration
Option 1: As Subdirectory
# In your project's CMakeLists.txt
add_subdirectory(05_16_PERFORMANCE_VARIANTS/05_16_00_variant_framework)
add_subdirectory(05_16_PERFORMANCE_VARIANTS/05_16_01_simd_variants)
# Link against variants
target_link_libraries(your_target
PRIVATE
variant_framework
simd_variants
)
Option 2: Find Package
# After installation
find_package(VariantFramework REQUIRED)
find_package(SIMDVariants REQUIRED)
target_link_libraries(your_target
PRIVATE
AudioLab::variant_framework
AudioLab::simd_variants
)
Option 3: FetchContent
include(FetchContent)
FetchContent_Declare(
audiolab_variants
GIT_REPOSITORY https://github.com/audiolab/variants.git
GIT_TAG v1.0.0
)
FetchContent_MakeAvailable(audiolab_variants)
Compiler Flags
Important: SIMD variants require appropriate compiler flags:
# For SSE4
if(ENABLE_SSE4)
if(MSVC)
# MSVC needs no flag for SSE4 intrinsics (the x64 baseline is SSE2); check CPU support at runtime
else()
target_compile_options(your_target PRIVATE -msse4.1 -msse4.2)
endif()
endif()
# For AVX2
if(ENABLE_AVX2)
if(MSVC)
target_compile_options(your_target PRIVATE /arch:AVX2)
else()
target_compile_options(your_target PRIVATE -mavx2 -mfma)
endif()
endif()
✅ Validation Workflow
Step 1: Implement Reference
// In 05_15_REFERENCE_IMPLEMENTATIONS
class ReferenceGain {
public:
void process(const float* input, float* output, size_t n) {
for (size_t i = 0; i < n; ++i) {
output[i] = input[i] * gain_;
}
}
private:
float gain_ = 1.0f;
};
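Step 2: Implement SIMD Variant
// In 05_16_01_simd_variants
#include <xmmintrin.h> // _mm_set1_ps, _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps

class SSE4GainVariant : public IVariant {
public:
    bool process(const float* input, float* output, size_t n) override {
        // Vector body: four samples per iteration. The bound i + 4 <= n
        // keeps the unaligned load from reading past the end of the buffer.
        __m128 gainVec = _mm_set1_ps(gain_);
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            __m128 in = _mm_loadu_ps(input + i);
            __m128 out = _mm_mul_ps(in, gainVec);
            _mm_storeu_ps(output + i, out);
        }
        // Scalar tail for the remaining n % 4 samples
        for (; i < n; ++i) {
            output[i] = input[i] * gain_;
        }
        return true;
    }
private:
    float gain_ = 1.0f; // set via setGain(); other IVariant members omitted
};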
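Step 3: Create Validation Test
// In 05_16_01_simd_variants/tests/
#include <cmath>
#include <vector>

// Tagged [validation] so the tag filter in Step 4 selects it.
TEST_CASE("SSE4Gain vs Reference", "[validation]") {
    ReferenceGain ref;
    SSE4GainVariant simd;
    // Test signal; n is deliberately not a multiple of 4 to exercise the remainder path
    constexpr size_t n = 1027;
    std::vector<float> input(n), outputRef(n), outputSIMD(n);
    for (size_t i = 0; i < n; ++i) {
        input[i] = std::sin(0.01f * static_cast<float>(i));
    }
    // Process same input
    ref.process(input.data(), outputRef.data(), n);
    simd.process(input.data(), outputSIMD.data(), n);
    // Validate
    float maxErr = calculateMaxError(outputRef.data(), outputSIMD.data(), n);
    REQUIRE(maxErr < 1e-6f);
}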
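Steps 1 to 3 can also be condensed into a framework-free sketch that compiles on its own. The names `gainScalar`, `gainSimd`, `gainMatchesReference`, and the `SKETCH_HAS_SSE` macro are hypothetical and stand in for the `ReferenceGain` and `SSE4GainVariant` classes; on targets without SSE the vector loop is skipped and only the scalar loop runs.

```cpp
#include <cstddef>
#include <vector>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Scalar reference: one multiply per sample.
inline void gainScalar(const float* in, float* out, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * gain;
}

// Vectorized gain: four samples per iteration, then a scalar loop for the tail.
inline void gainSimd(const float* in, float* out, std::size_t n, float gain) {
    std::size_t i = 0;
#if SKETCH_HAS_SSE
    const __m128 gainVec = _mm_set1_ps(gain);
    for (; i + 4 <= n; i += 4) {  // bound prevents reading past the end
        const __m128 v = _mm_loadu_ps(in + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(v, gainVec));
    }
#endif
    for (; i < n; ++i) out[i] = in[i] * gain;  // remaining n % 4 samples
}

// Runs both paths on the same input and reports whether every sample is equal.
inline bool gainMatchesReference(const float* in, std::size_t n, float gain) {
    std::vector<float> ref(n), simd(n);
    gainScalar(in, ref.data(), n, gain);
    gainSimd(in, simd.data(), n, gain);
    for (std::size_t i = 0; i < n; ++i) {
        if (ref[i] != simd[i]) return false;
    }
    return true;
}
```

A single IEEE 754 multiply is correctly rounded in both the scalar and the vector instruction, which is why a gain stage can be expected to match exactly rather than merely within tolerance.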
Step 4: Run Tests
cd build
./test_simd_variants --reporter compact
# Run validation tests specifically
./test_simd_variants "[validation]"
Step 5: Measure Quality
// Optional: Deep quality analysis
#include "metrics_core.hpp"
THDAnalyzer thd;
float thdValue = thd.analyze(outputSIMD, n, 48000.0f);
FrequencyResponseAnalyzer freq;
auto response = freq.analyze(outputSIMD, n);
// Generate report
QualityReport report;
report.addMetric("THD+N", thdValue);
report.addMetric("Frequency Response", response);
report.save("validation_report.json");
📊 Performance Benchmarking
Benchmark Against Reference
#include <chrono>
#include <iostream>
using Clock = std::chrono::steady_clock; // monotonic, suited to interval timing
using Micros = std::chrono::duration<double, std::micro>;
// Benchmark reference
auto start = Clock::now();
for (int i = 0; i < iterations; ++i) {
    ref.process(input, output, n);
}
auto end = Clock::now();
double refTime = Micros(end - start).count();
// Benchmark SIMD
start = Clock::now();
for (int i = 0; i < iterations; ++i) {
    simd.process(input, output, n);
}
end = Clock::now();
double simdTime = Micros(end - start).count();
// Calculate speedup
double speedup = refTime / simdTime;
std::cout << "Speedup: " << speedup << "x" << std::endl;
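A single timed loop is sensitive to cold caches and scheduler noise. One way to reduce that, sketched here under the assumption that a warm-up call and a best-of-several measurement are acceptable for your workload, is a small reusable timer; `bestTimeMicros` is a hypothetical helper, not part of AudioLab.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Returns the fastest of `repeats` timed runs, in microseconds.
// Each run calls `work()` `iterations` times. Hypothetical helper.
template <typename Work>
double bestTimeMicros(Work&& work, int iterations, int repeats = 5) {
    using Clock = std::chrono::steady_clock;
    work();  // warm-up: fault in pages and fill caches before timing
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < repeats; ++r) {
        const auto start = Clock::now();
        for (int i = 0; i < iterations; ++i) work();
        const std::chrono::duration<double, std::micro> elapsed = Clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

It could be used as `bestTimeMicros([&] { simd.process(input, output, n); }, iterations)`. Make sure the output buffer is read afterwards so the compiler cannot discard the work being measured.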
Expected Speedups
| Variant | Expected Speedup | Measured Speedup |
|---|---|---|
| SSE4Gain | 3.5-4.5x | TBD |
| AVX2Gain | 6-8x | TBD |
| SSE4Mix | 4-6x | TBD |
| AVX2Mix | 7-10x | TBD |
| SSE4Biquad | 1.5-2x | TBD |
| AVX2Biquad | 2-3x | TBD |
🐛 Troubleshooting Integration
Problem: Linker Errors
Symptom: Undefined references to variant functions
Solution:
# Ensure proper linking
target_link_libraries(your_target PRIVATE simd_variants)
# Check that library was built
ls build/lib/libsimd_variants.a # Linux/Mac
dir build\lib\simd_variants.lib # Windows
Problem: Runtime Crashes
Symptom: Illegal instruction / SIGILL
Cause: Using SIMD instructions on CPU without support
Solution:
// Always check CPU features
if (!HAS_FEATURE(AVX2)) {
// Fall back to SSE4 or scalar
dispatcher.registerVariant(
std::make_unique<SSE4GainVariant>(),
VariantType::SIMD
);
} else {
dispatcher.registerVariant(
std::make_unique<AVX2GainVariant>(),
VariantType::SIMD
);
}
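The `HAS_FEATURE` macro is used above without its definition. As a rough illustration of what a runtime AVX2 check can look like, the sketch below uses the GCC/Clang builtin `__builtin_cpu_supports` and, on MSVC, the `__cpuid`, `__cpuidex`, and `_xgetbv` intrinsics. The function name `cpuSupportsAvx2` is hypothetical, and the real macro may be implemented differently.

```cpp
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <immintrin.h>  // _xgetbv
#include <intrin.h>     // __cpuid, __cpuidex
#endif

// Returns true when AVX2 instructions can be executed on this machine.
// Hypothetical helper; not the actual HAS_FEATURE implementation.
inline bool cpuSupportsAvx2() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];
    __cpuid(regs, 0);
    if (regs[0] < 7) return false;                    // CPUID leaf 7 unavailable
    __cpuid(regs, 1);
    const bool osxsave = (regs[2] & (1 << 27)) != 0;  // OS uses XSAVE/XRSTOR
    const bool avx = (regs[2] & (1 << 28)) != 0;      // CPU supports AVX
    if (!osxsave || !avx) return false;
    if ((_xgetbv(0) & 0x6) != 0x6) return false;      // OS saves XMM and YMM state
    __cpuidex(regs, 7, 0);
    return (regs[1] & (1 << 5)) != 0;                 // CPUID.(7,0):EBX bit 5 = AVX2
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // harmless if the runtime has already initialized it
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;  // non-x86 target: AVX2 does not exist here
#endif
}
```

The result does not change while the process runs, so it is enough to evaluate it once during initialization and reuse it when registering variants.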
Problem: Numerical Differences
Symptom: Validation tests fail with small errors
Expected: IIR filters may have minor differences due to:
- Floating-point operation reordering
- FMA (fused multiply-add), which rounds once instead of twice
- Accumulation order differences
Solution:
// Use relaxed tolerance for IIR
if (isIIRFilter) {
REQUIRE(maxError < 1e-5f); // More relaxed
} else {
REQUIRE(maxError < 1e-6f); // Strict
}
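The effect of accumulation order can be reproduced without any SIMD hardware. The sketch below compares a strict left-to-right sum with a sum that emulates four independent SIMD lanes; both function names are hypothetical, and the example assumes default IEEE 754 single-precision arithmetic (no fast-math options).

```cpp
#include <cstddef>

// Adds samples strictly left to right, as a scalar loop would.
inline float sumSequential(const float* x, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += x[i];
    return acc;
}

// Emulates a 4-lane SIMD accumulator: lane k sums x[k], x[k+4], x[k+8], ...
// and the four partial sums are combined at the end.
inline float sumFourLanes(const float* x, std::size_t n) {
    float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (std::size_t k = 0; k < 4; ++k) lanes[k] += x[i + k];
    }
    float acc = ((lanes[0] + lanes[1]) + lanes[2]) + lanes[3];
    for (; i < n; ++i) acc += x[i];  // remaining n % 4 samples
    return acc;
}
```

For the input `{1e8f, 1, 1, 1, -1e8f, 1, 1, 1}` the sequential sum is 3 while the lane-wise sum is 6, which is the exact answer: in single precision `1e8f + 1` rounds back to `1e8f`, so the sequential loop loses the first three ones. Real audio data rarely shows differences this large, but the mechanism is the same one behind the relaxed IIR tolerance.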
📚 API Usage Examples
Example 1: Simple Gain Processing
#include "VariantDispatcher.h"
#include "SSE4Variants.h"
void processAudio(float* buffer, int numSamples) {
// Create and setup dispatcher (once at startup)
static VariantDispatcher dispatcher;
static bool initialized = false;
if (!initialized) {
// Register variants
auto variants = createSSE4Variants();
for (auto& variant : variants) {
dispatcher.registerVariant(std::move(variant), VariantType::SIMD);
}
dispatcher.init(48000.0);
initialized = true;
}
// Process (in real-time callback)
dispatcher.process(buffer, buffer, numSamples);
}
Example 2: Stereo Mixer
void mixTracks(
const float* track1L, const float* track1R,
const float* track2L, const float* track2R,
float* outputL, float* outputR,
int numSamples
) {
AVX2MixVariant mixer;
mixer.init(48000.0);
mixer.setGain1(0.7f);
mixer.setGain2(0.5f);
mixer.mixStereo(
track1L, track1R,
track2L, track2R,
outputL, outputR,
numSamples
);
}
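Example 3: Filter Chain
// Biquads carry state from one block to the next, so build and design the
// filters once and reuse them; recreating them on every call would reset
// that state and cause discontinuities at block boundaries.
struct FilterChain {
    SSE4BiquadVariant lpf; // Lowpass @ 5kHz
    SSE4BiquadVariant hpf; // Highpass @ 100Hz

    // Call once at startup or when the sample rate changes
    void prepare(double sampleRate) {
        lpf.init(sampleRate);
        lpf.designLowpass(sampleRate, 5000.0, 0.707);
        hpf.init(sampleRate);
        hpf.designHighpass(sampleRate, 100.0, 0.707);
    }

    // Call for every audio block
    void process(float* buffer, int numSamples) {
        lpf.process(buffer, buffer, numSamples);
        hpf.process(buffer, buffer, numSamples);
    }
};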
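The `designLowpass` call above does not show what it computes. As a point of reference, here is a sketch of a lowpass biquad using the widely used RBJ Audio EQ Cookbook formulas with a Direct Form I update. `BiquadSketch` is a hypothetical scalar stand-in; whether `SSE4BiquadVariant` uses the same formulas or filter structure is an assumption.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical scalar biquad; not the SSE4BiquadVariant implementation.
struct BiquadSketch {
    // Coefficients normalized so that a0 == 1
    double b0 = 1.0, b1 = 0.0, b2 = 0.0, a1 = 0.0, a2 = 0.0;
    // Direct Form I state: previous two inputs and outputs
    double xz1 = 0.0, xz2 = 0.0, yz1 = 0.0, yz2 = 0.0;

    // Lowpass design following the RBJ Audio EQ Cookbook.
    void designLowpass(double sampleRate, double cutoffHz, double q) {
        const double pi = 3.14159265358979323846;
        const double w0 = 2.0 * pi * cutoffHz / sampleRate;
        const double cosW0 = std::cos(w0);
        const double alpha = std::sin(w0) / (2.0 * q);
        const double a0 = 1.0 + alpha;
        b0 = ((1.0 - cosW0) / 2.0) / a0;
        b1 = (1.0 - cosW0) / a0;
        b2 = b0;
        a1 = (-2.0 * cosW0) / a0;
        a2 = (1.0 - alpha) / a0;
    }

    // y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    // State persists across calls, so consecutive blocks join seamlessly.
    // Safe for in-place use (in == out).
    void process(const float* in, float* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            const double x0 = in[i];
            const double y0 = b0 * x0 + b1 * xz1 + b2 * xz2 - a1 * yz1 - a2 * yz2;
            xz2 = xz1;
            xz1 = x0;
            yz2 = yz1;
            yz1 = y0;
            out[i] = static_cast<float>(y0);
        }
    }
};
```

The recursion on `yz1` and `yz2` makes each output depend on the previous two, which is also why the expected speedups for biquad variants are lower than for gain and mix.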
🎯 Best Practices
1. Always Validate
✅ DO: Create validation tests for every SIMD variant
✅ DO: Compare against reference implementation
✅ DO: Test with various input types (sine, noise, edge cases)
✅ DO: Measure quality metrics (THD, SNR)
❌ DON'T: Assume SIMD is correct without testing
❌ DON'T: Skip edge case testing
❌ DON'T: Ignore small numerical differences without understanding them
2. Profile Before Optimizing
✅ DO: Measure actual performance gain
✅ DO: Test on target hardware
✅ DO: Compare against baseline
❌ DON'T: Assume theoretical speedup matches reality
❌ DON'T: Optimize without profiling first
3. Handle Feature Detection
✅ DO: Check CPU features at runtime
✅ DO: Provide fallbacks
✅ DO: Test on various CPUs
❌ DON'T: Assume all CPUs have AVX2
❌ DON'T: Crash on unsupported instructions
📞 Status
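Integration Status: ✅ Complete for the variant framework, SIMD variants, and validation tests; quality metrics and audio engine integration are still planned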
Components Integrated:
- ✅ Variant Framework
- ✅ SIMD Variants (SSE4, AVX2)
- ✅ Validation Tests
- 🔄 Quality Metrics (planned)
- 🔄 Audio Engines (planned)
Next Steps:
- Integrate with actual audio engines
- Add comprehensive benchmarking
- Create quality metric reports
- Document real-world performance
Last Updated: 2025-10-15
Version: 1.0.0
Maintainer: AudioLab Performance Team