
Persistence Optimization Module

Location: 2 - FOUNDATION/04_CORE/04_11_state_serialization/04_11_04_persistence_optimization/

Compression and atomic file operations for efficient, crash-safe state persistence in audio plugins.


Overview

This module provides:

  1. Compression Strategy Pattern - Pluggable compression algorithms (LZ4, ZSTD, None)
  2. Atomic File Operations - Crash-safe file saves using temp-file + rename pattern
  3. Performance Benchmarks - Compare compression algorithms on real-world data
  4. Comprehensive Tests - Unit tests for all components using Catch2

Key Features

  • Multiple compression algorithms with different speed/ratio trade-offs
  • Thread-safe stateless compression operations
  • Crash-safe atomic saves - no partial writes
  • Comprehensive metrics - ratio, throughput, timing for all operations
  • Zero-copy design - caller manages memory for optimal performance

Files

  • Core Implementation
  • Tests
  • Benchmarks
  • Build


Architecture

Strategy Pattern

ICompressionStrategy (interface)
├── LZ4Compressor (fast, moderate ratio)
├── ZSTDCompressor (slower, high ratio)
└── NoneCompressor (passthrough for debugging)
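
All three compressors implement a single interface. The sketch below shows roughly what the usage examples in this document assume; the CompressionResult fields and method signatures are inferred from those examples, and the authoritative declarations live in the module's compression_strategy.hpp:

#include <cstddef>
#include <cstdint>
#include <memory>

// Inferred result type - fields match the usage shown throughout this document.
struct CompressionResult {
    bool success = false;
    size_t compressed_size = 0;      // bytes actually written to the output buffer
    double compression_ratio = 1.0;  // uncompressed size / compressed size
    double elapsed_ms = 0.0;         // wall-clock time of the operation
};

class ICompressionStrategy {
public:
    virtual ~ICompressionStrategy() = default;

    // Worst-case output size for a given input; use it to allocate the output buffer.
    virtual size_t get_max_compressed_size(size_t uncompressed_size) const = 0;

    virtual CompressionResult compress(const uint8_t* src, size_t src_size,
                                       uint8_t* dst, size_t dst_capacity) = 0;

    virtual CompressionResult decompress(const uint8_t* src, size_t src_size,
                                         uint8_t* dst, size_t dst_capacity) = 0;
};

// Factory functions used throughout this document.
std::unique_ptr<ICompressionStrategy> create_lz4_strategy();
std::unique_ptr<ICompressionStrategy> create_zstd_strategy(int level = 3);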

Compression Flow

// 1. Create strategy
auto strategy = create_lz4_strategy();

// 2. Allocate output buffer
size_t max_size = strategy->get_max_compressed_size(data.size());
std::vector<uint8_t> compressed(max_size);

// 3. Compress
auto result = strategy->compress(
    data.data(), data.size(),
    compressed.data(), compressed.size()
);

if (result.success) {
    compressed.resize(result.compressed_size);
    // Use compressed data...
}

Atomic Save Flow

// 1. Prepare data
std::vector<uint8_t> state_data = serialize_plugin_state();

// 2. Atomic save (temp file + rename)
auto result = AtomicFileOps::save_atomic(
    "plugin_state.dat",
    state_data.data(),
    state_data.size()
);

if (result.success) {
    // File saved atomically, crash-safe
}

Compression Algorithms

LZ4 - Fast Compression

Characteristics:

  • Compression: ~500 MB/s
  • Decompression: ~2000 MB/s
  • Ratio: 2-3x typical for structured data
  • Best for: real-time autosaves, frequent operations

When to use:

  • Autosave during playback
  • Quick session save/load
  • Frequent undo/redo state snapshots

Example:

auto strategy = create_lz4_strategy();
auto result = strategy->compress(data, size, out, out_size);
// Typical: 256 KB → 100 KB in 0.5 ms

ZSTD - High Compression

Characteristics:

  • Compression: ~100 MB/s (level 3, the default)
  • Decompression: ~500 MB/s
  • Ratio: 3-5x typical for structured data
  • Tunable: levels 1-22 (higher = better ratio, slower)
  • Best for: archival, network transfer, large states

When to use:

  • Preset library storage
  • Project archival
  • Network transfer to the cloud
  • Large sample-based states

Example:

auto strategy = create_zstd_strategy(3);  // Default level
auto result = strategy->compress(data, size, out, out_size);
// Typical: 256 KB → 70 KB in 2.5 ms

// High compression for archival
auto archival = create_zstd_strategy(9);
// Typical: 256 KB → 50 KB in 15 ms

None - No Compression

Characteristics:

  • Compression: memory-bandwidth limited (~10 GB/s)
  • Ratio: 1.0 (no compression)
  • Best for: debugging, very small data (<1 KB)

When to use:

  • Debugging compression issues
  • Benchmarking overhead
  • Data that doesn't compress further (e.g. already-compressed audio)
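
For illustration, a passthrough strategy reduces to a bounds check and a memcpy. This is only a sketch against the interface outlined earlier, not the module's actual NoneCompressor:

#include <cstring>

class PassthroughCompressor : public ICompressionStrategy {
public:
    size_t get_max_compressed_size(size_t uncompressed_size) const override {
        return uncompressed_size;  // passthrough never grows the data
    }

    CompressionResult compress(const uint8_t* src, size_t src_size,
                               uint8_t* dst, size_t dst_capacity) override {
        CompressionResult result;
        if (dst_capacity < src_size) return result;  // output buffer too small
        std::memcpy(dst, src, src_size);
        result.success = true;
        result.compressed_size = src_size;
        result.compression_ratio = 1.0;
        return result;
    }

    CompressionResult decompress(const uint8_t* src, size_t src_size,
                                 uint8_t* dst, size_t dst_capacity) override {
        return compress(src, src_size, dst, dst_capacity);  // identical copy
    }
};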


Atomic File Operations

Crash Safety Guarantee

Problem: Direct file writes can leave partial data on crash/power loss.

Solution: Temp-file + atomic rename pattern:

  1. Write to <path>.tmp
  2. Flush to disk (fsync)
  3. Atomic rename to <path> (OS-level atomic operation)
  4. Clean up temp file on failure

Result: File always contains either old data or new data, never partial.
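
A minimal sketch of the pattern using POSIX calls (illustrative only; the module's AtomicFileOps also covers Windows and returns detailed error information):

#include <cstddef>
#include <cstdio>
#include <string>
#include <fcntl.h>    // open
#include <unistd.h>   // write, fsync, close

bool save_atomic_sketch(const std::string& path, const void* data, size_t size) {
    const std::string tmp_path = path + ".tmp";

    // 1. Write the new contents to a temporary file next to the target
    int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    const char* bytes = static_cast<const char*>(data);
    size_t written = 0;
    while (written < size) {
        ssize_t n = ::write(fd, bytes + written, size - written);
        if (n < 0) { ::close(fd); std::remove(tmp_path.c_str()); return false; }
        written += static_cast<size_t>(n);
    }

    // 2. Flush file contents to disk so they survive a power loss
    if (::fsync(fd) != 0) { ::close(fd); std::remove(tmp_path.c_str()); return false; }
    ::close(fd);

    // 3. Atomically replace the destination (rename() is atomic on POSIX filesystems)
    if (std::rename(tmp_path.c_str(), path.c_str()) != 0) {
        std::remove(tmp_path.c_str());  // 4. Clean up the temp file on failure
        return false;
    }
    // A fully robust version would also fsync the containing directory here.
    return true;
}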

API

// Save atomically
auto save_result = AtomicFileOps::save_atomic(
    "plugin_state.dat",
    data.data(),
    data.size()
);

// Load file
auto load_result = AtomicFileOps::load("plugin_state.dat");
if (load_result.success) {
    deserialize_state(load_result.data);
}

// File management
bool exists = AtomicFileOps::exists(path);
size_t size = AtomicFileOps::file_size(path);
AtomicFileOps::remove(path);
AtomicFileOps::create_backup(path);  // Creates .backup file

Streaming API

For large files that don't fit in memory:

auto result = AtomicFileOps::save_atomic_streamed(
    "large_state.dat",
    [&](std::ostream& out) {
        for (const auto& chunk : large_data) {
            out.write(chunk.data(), chunk.size());
            if (!out.good()) return false;
        }
        return true;
    }
);

Performance Characteristics

Compression Speed (typical, 256 KB plugin state)

Algorithm | Compress Time | Decompress Time | Ratio | Total Time
----------|---------------|-----------------|-------|-----------
None      | 0.02 ms       | 0.02 ms         | 1.0x  | 0.04 ms
LZ4       | 0.5 ms        | 0.13 ms         | 2.5x  | 0.63 ms
LZ4-HC    | 2.0 ms        | 0.13 ms         | 3.0x  | 2.13 ms
ZSTD-1    | 1.0 ms        | 0.5 ms          | 3.0x  | 1.5 ms
ZSTD-3    | 2.5 ms        | 0.5 ms          | 3.5x  | 3.0 ms
ZSTD-9    | 15 ms         | 0.5 ms          | 4.5x  | 15.5 ms

File I/O Performance (100 KB, SSD)

Operation    | Time   | Throughput
-------------|--------|-----------
Direct write | 0.3 ms | 320 MB/s
Atomic save  | 2-5 ms | 20-50 MB/s
Load         | 0.2 ms | 500 MB/s

Note: Atomic save is slower due to fsync() overhead but provides crash safety.


Usage Examples

Example 1: Quick Autosave with LZ4

#include "compression_strategy.hpp"
#include "atomic_file_ops.hpp"

void autosave_plugin_state(const PluginState& state) {
    // Serialize to binary
    auto serialized = binary_serialize(state);

    // Compress with LZ4 (fast)
    auto compressor = create_lz4_strategy();
    size_t max_compressed = compressor->get_max_compressed_size(serialized.size());
    std::vector<uint8_t> compressed(max_compressed);

    auto compress_result = compressor->compress(
        serialized.data(), serialized.size(),
        compressed.data(), compressed.size()
    );

    if (!compress_result.success) {
        log_error("Compression failed");
        return;
    }

    compressed.resize(compress_result.compressed_size);

    // Atomic save
    auto save_result = AtomicFileOps::save_atomic(
        "autosave.dat",
        compressed.data(),
        compressed.size()
    );

    if (save_result.success) {
        log_info("Autosave: {} KB → {} KB ({:.1f}x) in {:.1f} ms",
            serialized.size() / 1024,
            compressed.size() / 1024,
            compress_result.compression_ratio,
            compress_result.elapsed_ms + save_result.elapsed_ms
        );
    }
}

Example 2: Load and Decompress State

std::optional<PluginState> load_plugin_state(const std::filesystem::path& path) {
    // Load compressed file
    auto load_result = AtomicFileOps::load(path);
    if (!load_result.success) {
        log_error("Load failed: {}", load_result.error_message);
        return std::nullopt;
    }

    // Decompress (algorithm stored in file header - not shown)
    auto decompressor = create_lz4_strategy();

    // Allocate decompression buffer (size from header - not shown)
    size_t uncompressed_size = read_header_size(load_result.data);
    std::vector<uint8_t> decompressed(uncompressed_size);

    // Note: a complete implementation would skip the header bytes here;
    // the offset is omitted for brevity, matching the "not shown" header handling above.
    auto decompress_result = decompressor->decompress(
        load_result.data.data(), load_result.data.size(),
        decompressed.data(), decompressed.size()
    );

    if (!decompress_result.success) {
        log_error("Decompression failed");
        return std::nullopt;
    }

    // Deserialize
    return binary_deserialize<PluginState>(decompressed);
}

Example 3: Preset Library with ZSTD

void save_preset_to_library(const Preset& preset, const std::string& name) {
    auto serialized = binary_serialize(preset);

    // Use ZSTD for better compression (presets stored long-term)
    auto compressor = create_zstd_strategy(9);  // High compression

    size_t max_compressed = compressor->get_max_compressed_size(serialized.size());
    std::vector<uint8_t> compressed(max_compressed);

    auto result = compressor->compress(
        serialized.data(), serialized.size(),
        compressed.data(), compressed.size()
    );

    if (result.success) {
        compressed.resize(result.compressed_size);

        std::filesystem::path preset_path =
            get_preset_library_path() / (name + ".preset");

        AtomicFileOps::save_atomic(
            preset_path,
            compressed.data(),
            compressed.size()
        );

        log_info("Preset '{}' saved: {:.1f}x compression",
            name, result.compression_ratio);
    }
}

Example 4: Streaming Large Project Save

void save_large_project(const Project& project, const std::filesystem::path& path) {
    // Use streaming API for large project
    auto result = AtomicFileOps::save_atomic_streamed(
        path,
        [&](std::ostream& out) {
            // Write header
            ProjectHeader header = create_header(project);
            out.write(reinterpret_cast<const char*>(&header), sizeof(header));

            // Compress and write tracks
            auto compressor = create_zstd_strategy(3);
            for (const auto& track : project.tracks) {
                auto serialized = binary_serialize(track);

                size_t max_size = compressor->get_max_compressed_size(serialized.size());
                std::vector<uint8_t> compressed(max_size);

                auto compress_result = compressor->compress(
                    serialized.data(), serialized.size(),
                    compressed.data(), compressed.size()
                );

                if (!compress_result.success) return false;

                // Write compressed track
                uint32_t compressed_size = compress_result.compressed_size;
                out.write(reinterpret_cast<const char*>(&compressed_size), sizeof(compressed_size));
                out.write(reinterpret_cast<const char*>(compressed.data()), compressed_size);
            }

            return out.good();
        }
    );

    if (result.success) {
        log_info("Project saved: {} MB in {:.1f} seconds",
            result.bytes_written / (1024 * 1024),
            result.elapsed_ms / 1000.0
        );
    }
}

Building and Testing

Prerequisites

# Install dependencies via vcpkg
vcpkg install lz4 zstd catch2

CMake Configuration

cd "2 - FOUNDATION"
mkdir build && cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=<vcpkg-root>/scripts/buildsystems/vcpkg.cmake
cmake --build . --config Release

Running Tests

# All tests
ctest -C Release

# Specific tests
ctest -C Release -R test_compression
ctest -C Release -R test_atomic_file_ops

# Verbose output
ctest -C Release -VV

Running Benchmark

cd build/04_CORE/04_11_state_serialization/04_11_04_persistence_optimization
./compression_benchmark

Expected output:

╔═══════════════════════════════════════════════════════════════╗
║         COMPRESSION BENCHMARK - AudioLab Persistence          ║
╚═══════════════════════════════════════════════════════════════╝

Structured Data (256 KB)
----------------------------------------------------------------------
  Algorithm  :  Ratio  | Comp MB/s | Decomp MB/s | Total ms
----------------------------------------------------------------------
        None :    1.00x |  10234 MB/s |  10123 MB/s |   0.05 ms
         LZ4 :    2.56x |    512 MB/s |   2048 MB/s |   0.63 ms
      LZ4-HC :    3.12x |    128 MB/s |   2048 MB/s |   2.13 ms
      ZSTD-1 :    2.89x |    256 MB/s |    512 MB/s |   1.50 ms
      ZSTD-3 :    3.45x |    102 MB/s |    512 MB/s |   3.01 ms
      ZSTD-9 :    4.67x |     17 MB/s |    512 MB/s |  15.12 ms

...


Integration with State Serialization

This module is designed to integrate with the existing state serialization system:

// In your serialization layer
class CompressedBinarySerializer {
public:
    template<typename T>
    std::vector<uint8_t> serialize(const T& obj, CompressionAlgorithm algo) {
        // 1. Binary serialize
        auto binary = binary_serialize(obj);

        // 2. Compress
        auto strategy = create_compression_strategy(algo);
        size_t max_size = strategy->get_max_compressed_size(binary.size());
        std::vector<uint8_t> compressed(max_size);

        auto result = strategy->compress(
            binary.data(), binary.size(),
            compressed.data(), compressed.size()
        );

        if (!result.success) {
            throw std::runtime_error("Compression failed");
        }

        // 3. Prepend header with algorithm and uncompressed size
        CompressedHeader header{
            .magic = MAGIC_NUMBER,
            .algorithm = algo,
            .uncompressed_size = binary.size(),
            .compressed_size = result.compressed_size
        };

        std::vector<uint8_t> output;
        output.reserve(sizeof(header) + result.compressed_size);

        // Copy header
        const auto* header_bytes = reinterpret_cast<const uint8_t*>(&header);
        output.insert(output.end(), header_bytes, header_bytes + sizeof(header));

        // Copy compressed data
        output.insert(output.end(), compressed.begin(),
                     compressed.begin() + result.compressed_size);

        return output;
    }

    template<typename T>
    T deserialize(const std::vector<uint8_t>& data) {
        // 1. Read header
        const auto* header = reinterpret_cast<const CompressedHeader*>(data.data());

        // 2. Decompress
        auto strategy = create_compression_strategy(header->algorithm);
        std::vector<uint8_t> decompressed(header->uncompressed_size);

        auto result = strategy->decompress(
            data.data() + sizeof(CompressedHeader),
            header->compressed_size,
            decompressed.data(),
            decompressed.size()
        );

        if (!result.success) {
            throw std::runtime_error("Decompression failed");
        }

        // 3. Binary deserialize
        return binary_deserialize<T>(decompressed);
    }
};
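
The snippet above assumes a small fixed header. One possible layout is sketched below; it is illustrative, not the module's actual on-disk format, and a production format would add a version field, a checksum, and explicit endianness handling rather than relying on reinterpret_cast of an in-memory struct:

#include <cstdint>

enum class CompressionAlgorithm : uint8_t { None = 0, LZ4 = 1, ZSTD = 2 };

constexpr uint32_t MAGIC_NUMBER = 0x414C5053;  // arbitrary example tag

// Field order matches the designated initializers used in the snippet above.
struct CompressedHeader {
    uint32_t magic;                  // identifies the file format
    CompressionAlgorithm algorithm;  // which strategy to use when loading
    uint64_t uncompressed_size;      // needed to size the decompression buffer
    uint64_t compressed_size;        // payload length following the header
};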

Design Decisions

1. Strategy Pattern for Compression

Why: Allows runtime selection of compression algorithm based on use case.

Alternative considered: Template-based compile-time selection. Rejected: requires recompilation to change the algorithm and can't be made user-configurable.

2. Caller-Managed Memory

Why: Zero-copy design, optimal performance, no hidden allocations.

Alternative considered: Returning a std::vector from compress(). Rejected: forces an allocation on every call, prevents buffer reuse, and is less flexible.

3. Atomic Rename Pattern

Why: OS-level atomic operation, portable, simple implementation.

Alternative considered: Write-ahead logging (WAL). Rejected: more complex, slower, and overkill for plugin state.

4. fsync() for Durability

Why: Guarantees data on disk, prevents data loss on power failure.

Alternative considered: Skipping fsync for speed. Rejected: not crash-safe; recent saves can be lost on power failure.


Performance Optimization Tips

1. Choose Algorithm Based on Use Case

// Real-time autosave (< 1ms budget)
auto fast_save = create_lz4_strategy();

// User-initiated save (< 10ms acceptable)
auto balanced_save = create_zstd_strategy(3);

// Preset library (no time constraint)
auto archival_save = create_zstd_strategy(9);

2. Reuse Compression Buffers

class StateManager {
    std::unique_ptr<ICompressionStrategy> strategy_;
    std::vector<uint8_t> compression_buffer_;

    void save() {
        auto data = serialize_state();

        // Resize buffer if needed (doesn't reallocate if same size)
        size_t max_size = strategy_->get_max_compressed_size(data.size());
        if (compression_buffer_.size() < max_size) {
            compression_buffer_.resize(max_size);
        }

        auto result = strategy_->compress(
            data.data(), data.size(),
            compression_buffer_.data(), compression_buffer_.size()
        );

        // Use result...
    }
};

3. Batch Small Saves

// Instead of many small saves
for (const auto& param : params) {
    save_parameter(param);  // Each triggers fsync - SLOW
}

// Batch into single save
save_all_parameters(params);  // Single fsync - FAST
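
A sketch of the batched variant, assuming the parameters can be packed into one contiguous buffer (the parameter layout here is purely illustrative):

#include "atomic_file_ops.hpp"

#include <cstdint>
#include <cstring>
#include <vector>

void save_all_parameters_sketch(const std::vector<float>& params) {
    // Pack every parameter value into a single buffer...
    std::vector<uint8_t> buffer(params.size() * sizeof(float));
    std::memcpy(buffer.data(), params.data(), buffer.size());

    // ...then write it with one atomic save: one temp file, one fsync, one rename.
    AtomicFileOps::save_atomic("parameters.dat", buffer.data(), buffer.size());
}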

4. Use Streaming for Large Data

// Don't load entire project into memory
auto project_data = load_entire_project();  // BAD: huge allocation

// Stream chunks instead
save_atomic_streamed(path, [&](std::ostream& out) {
    for (const auto& chunk : get_project_chunks()) {
        out.write(chunk.data(), chunk.size());
    }
});

Thread Safety

Compression Operations

  • Thread-safe: All compression/decompression operations are stateless
  • Can compress different data concurrently from multiple threads
  • No synchronization needed for compression itself

File Operations

  • Thread-safe for different files: Multiple threads can save to different files
  • NOT thread-safe for same file: Requires external synchronization
  • Atomic rename is OS-level atomic: Safe even with concurrent readers

Example: Concurrent Saves

// SAFE: Different files
std::thread t1([&]{ save_state("plugin1.dat", state1); });
std::thread t2([&]{ save_state("plugin2.dat", state2); });

// UNSAFE: Same file (requires mutex)
std::mutex file_mutex;
std::thread t3([&]{
    std::lock_guard lock(file_mutex);
    save_state("shared.dat", state3);
});

Troubleshooting

Compression Fails

Symptom: compress_result.success == false

Causes:

  1. Output buffer too small - Fix: size the buffer with get_max_compressed_size()
  2. Input data too large (> INT_MAX bytes for LZ4) - Fix: split the data into chunks (see the sketch below) or use ZSTD
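
If the input exceeds the LZ4 limit, one option is to compress it in fixed-size chunks. This is a sketch only; the chunk size is arbitrary, and each chunk's compressed size must also be stored so the data can be decompressed later:

#include <algorithm>
#include <cstdint>
#include <vector>

std::vector<std::vector<uint8_t>> compress_in_chunks(ICompressionStrategy& strategy,
                                                     const uint8_t* data, size_t size,
                                                     size_t chunk_size = 64 * 1024 * 1024) {
    std::vector<std::vector<uint8_t>> chunks;
    for (size_t offset = 0; offset < size; offset += chunk_size) {
        const size_t this_chunk = std::min(chunk_size, size - offset);

        // Compress each chunk independently into its own buffer
        std::vector<uint8_t> out(strategy.get_max_compressed_size(this_chunk));
        auto result = strategy.compress(data + offset, this_chunk, out.data(), out.size());
        if (!result.success) return {};  // caller treats an empty result as failure

        out.resize(result.compressed_size);
        chunks.push_back(std::move(out));
    }
    return chunks;
}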

Decompression Fails

Symptom: decompress_result.success == false

Causes:

  1. Corrupted compressed data - Fix: add checksums/CRC to detect corruption (see the sketch below)
  2. Output buffer too small - Fix: store the uncompressed size in the file header
  3. Wrong decompression algorithm - Fix: store the algorithm ID in the file header
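
One way to detect corruption before attempting decompression is to store a checksum next to the compressed payload. The module does not currently ship a checksum helper (see the Integration TODO), so this is a dependency-free sketch of a plain CRC-32:

#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 (polynomial 0xEDB88320) - slow but simple and dependency-free.
uint32_t crc32(const uint8_t* data, size_t size) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// When saving: write crc32(compressed, size) alongside the payload (e.g. in the header).
// When loading: recompute it and compare before calling decompress().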

Atomic Save Fails

Symptom: save_result.success == false

Causes:

  1. Disk full - Check save_result.error_message for details
  2. Permission denied - Fix: check file and directory permissions
  3. Invalid path - Fix: validate the path before saving

Benchmark Shows Low Performance

Possible reasons:

  1. Debug build (much slower than release) - Fix: build with -DCMAKE_BUILD_TYPE=Release
  2. Cold cache (first run is slower) - Fix: run the benchmark several times and ignore the first run
  3. Antivirus scanning files - Fix: exclude the build directory from antivirus scanning


Future Enhancements

Planned Features

  1. Encryption Support
     • AES-256 encryption for sensitive state data
     • Integrate with compression (compress, then encrypt)

  2. Delta Compression
     • Only save data that changed since the last save
     • Faster saves for large states with small changes

  3. Background Compression
     • Compress on a worker thread
     • Non-blocking autosave

  4. Adaptive Algorithm Selection
     • Automatically choose the algorithm based on data characteristics
     • Profile the compression ratio vs. time trade-off

  5. Additional Algorithms
     • Brotli (better compression than ZSTD for text)
     • Snappy (faster than LZ4, but lower ratio)

Integration TODO

  • Add compression to XML/JSON serializers
  • Implement compressed file header format
  • Add CRC32 checksums for data integrity
  • Create unified StateManager API
  • Add preset library compression utilities

References

LZ4

ZSTD

Atomic File Operations


License

Part of the AudioLab project. See root LICENSE file.


Contact

For questions or issues with this module, please open an issue on the AudioLab GitHub repository.