
Persistence Optimization Module

Location: 2 - FOUNDATION/04_CORE/04_11_state_serialization/04_11_04_persistence_optimization/

Compression and atomic file operations for efficient, crash-safe state persistence in audio plugins.


Overview

This module provides:

  1. Compression Strategy Pattern - Pluggable compression algorithms (LZ4, ZSTD, None)
  2. Atomic File Operations - Crash-safe file saves using temp-file + rename pattern
  3. Performance Benchmarks - Compare compression algorithms on real-world data
  4. Comprehensive Tests - Unit tests for all components using Catch2

Key Features

  • Multiple compression algorithms with different speed/ratio trade-offs
  • Thread-safe stateless compression operations
  • Crash-safe atomic saves - no partial writes
  • Comprehensive metrics - ratio, throughput, timing for all operations
  • Zero-copy design - caller manages memory for optimal performance

Files

  • Core Implementation
  • Tests
  • Benchmarks
  • Build


Architecture

Strategy Pattern

ICompressionStrategy (interface)
├── LZ4Compressor (fast, moderate ratio)
├── ZSTDCompressor (slower, high ratio)
└── NoneCompressor (passthrough for debugging)
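
All three compressors implement a single interface. The sketch below shows roughly what the usage examples in this document assume; the CompressionResult fields and method signatures are inferred from those examples, and the authoritative declarations live in the module's compression_strategy.hpp:

#include <cstddef>
#include <cstdint>
#include <memory>

// Inferred result type - fields match the usage shown throughout this document.
struct CompressionResult {
    bool success = false;
    size_t compressed_size = 0;      // bytes actually written to the output buffer
    double compression_ratio = 1.0;  // uncompressed size / compressed size
    double elapsed_ms = 0.0;         // wall-clock time of the operation
};

class ICompressionStrategy {
public:
    virtual ~ICompressionStrategy() = default;

    // Worst-case output size for a given input; use it to allocate the output buffer.
    virtual size_t get_max_compressed_size(size_t uncompressed_size) const = 0;

    virtual CompressionResult compress(const uint8_t* src, size_t src_size,
                                       uint8_t* dst, size_t dst_capacity) = 0;

    virtual CompressionResult decompress(const uint8_t* src, size_t src_size,
                                         uint8_t* dst, size_t dst_capacity) = 0;
};

// Factory functions used throughout this document.
std::unique_ptr<ICompressionStrategy> create_lz4_strategy();
std::unique_ptr<ICompressionStrategy> create_zstd_strategy(int level = 3);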

Compression Flow

// 1. Create strategy
auto strategy = create_lz4_strategy();

// 2. Allocate output buffer
size_t max_size = strategy->get_max_compressed_size(data.size());
std::vector<uint8_t> compressed(max_size);

// 3. Compress
auto result = strategy->compress(
    data.data(), data.size(),
    compressed.data(), compressed.size()
);

if (result.success) {
    compressed.resize(result.compressed_size);
    // Use compressed data...
}

Atomic Save Flow

// 1. Prepare data
std::vector<uint8_t> state_data = serialize_plugin_state();

// 2. Atomic save (temp file + rename)
auto result = AtomicFileOps::save_atomic(
    "plugin_state.dat",
    state_data.data(),
    state_data.size()
);

if (result.success) {
    // File saved atomically, crash-safe
}

Compression Algorithms

LZ4 - Fast Compression

Characteristics:

  • Compression: ~500 MB/s
  • Decompression: ~2000 MB/s
  • Ratio: 2-3x typical for structured data
  • Best for: real-time autosaves, frequent operations

When to use:

  • Autosave during playback
  • Quick session save/load
  • Frequent undo/redo state snapshots

Example:

auto strategy = create_lz4_strategy();
auto result = strategy->compress(data, size, out, out_size);
// Typical: 256 KB → 100 KB in 0.5 ms

ZSTD - High Compression

Characteristics:

  • Compression: ~100 MB/s (level 3, the default)
  • Decompression: ~500 MB/s
  • Ratio: 3-5x typical for structured data
  • Tunable: levels 1-22 (higher = better ratio, slower)
  • Best for: archival, network transfer, large states

When to use:

  • Preset library storage
  • Project archival
  • Network transfer to the cloud
  • Large sample-based states

Example:

auto strategy = create_zstd_strategy(3);  // Default level
auto result = strategy->compress(data, size, out, out_size);
// Typical: 256 KB → 70 KB in 2.5 ms

// High compression for archival
auto archival = create_zstd_strategy(9);
// Typical: 256 KB → 50 KB in 15 ms

None - No Compression

Characteristics:

  • Compression: memory-bandwidth limited (~10 GB/s)
  • Ratio: 1.0 (no compression)
  • Best for: debugging, very small data (<1 KB)

When to use:

  • Debugging compression issues
  • Benchmarking overhead
  • Data that doesn't compress further (e.g. already-compressed audio)
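
For illustration, a passthrough strategy reduces to a bounds check and a memcpy. This is only a sketch against the interface outlined earlier, not the module's actual NoneCompressor:

#include <cstring>

class PassthroughCompressor : public ICompressionStrategy {
public:
    size_t get_max_compressed_size(size_t uncompressed_size) const override {
        return uncompressed_size;  // passthrough never grows the data
    }

    CompressionResult compress(const uint8_t* src, size_t src_size,
                               uint8_t* dst, size_t dst_capacity) override {
        CompressionResult result;
        if (dst_capacity < src_size) return result;  // output buffer too small
        std::memcpy(dst, src, src_size);
        result.success = true;
        result.compressed_size = src_size;
        result.compression_ratio = 1.0;
        return result;
    }

    CompressionResult decompress(const uint8_t* src, size_t src_size,
                                 uint8_t* dst, size_t dst_capacity) override {
        return compress(src, src_size, dst, dst_capacity);  // identical copy
    }
};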


Atomic File Operations

Crash Safety Guarantee

Problem: Direct file writes can leave partial data on crash/power loss.

Solution: Temp-file + atomic rename pattern:

  1. Write to <path>.tmp
  2. Flush to disk (fsync)
  3. Atomic rename to <path> (OS-level atomic operation)
  4. Clean up temp file on failure

Result: File always contains either old data or new data, never partial.
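
A minimal sketch of the pattern using POSIX calls (illustrative only; the module's AtomicFileOps also covers Windows and returns detailed error information):

#include <cstddef>
#include <cstdio>
#include <string>
#include <fcntl.h>    // open
#include <unistd.h>   // write, fsync, close

bool save_atomic_sketch(const std::string& path, const void* data, size_t size) {
    const std::string tmp_path = path + ".tmp";

    // 1. Write the new contents to a temporary file next to the target
    int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    const char* bytes = static_cast<const char*>(data);
    size_t written = 0;
    while (written < size) {
        ssize_t n = ::write(fd, bytes + written, size - written);
        if (n < 0) { ::close(fd); std::remove(tmp_path.c_str()); return false; }
        written += static_cast<size_t>(n);
    }

    // 2. Flush file contents to disk so they survive a power loss
    if (::fsync(fd) != 0) { ::close(fd); std::remove(tmp_path.c_str()); return false; }
    ::close(fd);

    // 3. Atomically replace the destination (rename() is atomic on POSIX filesystems)
    if (std::rename(tmp_path.c_str(), path.c_str()) != 0) {
        std::remove(tmp_path.c_str());  // 4. Clean up the temp file on failure
        return false;
    }
    // A fully robust version would also fsync the containing directory here.
    return true;
}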

API

// Save atomically
auto save_result = AtomicFileOps::save_atomic(
    "plugin_state.dat",
    data.data(),
    data.size()
);

// Load file
auto load_result = AtomicFileOps::load("plugin_state.dat");
if (load_result.success) {
    deserialize_state(load_result.data);
}

// File management
bool exists = AtomicFileOps::exists(path);
size_t size = AtomicFileOps::file_size(path);
AtomicFileOps::remove(path);
AtomicFileOps::create_backup(path);  // Creates .backup file

Streaming API

For large files that don't fit in memory:

auto result = AtomicFileOps::save_atomic_streamed(
    "large_state.dat",
    [&](std::ostream& out) {
        for (const auto& chunk : large_data) {
            out.write(chunk.data(), chunk.size());
            if (!out.good()) return false;
        }
        return true;
    }
);

Performance Characteristics

Compression Speed (typical, 256 KB plugin state)

Algorithm | Compress Time | Decompress Time | Ratio | Total Time
----------|---------------|-----------------|-------|-----------
None      | 0.02 ms       | 0.02 ms         | 1.0x  | 0.04 ms
LZ4       | 0.5 ms        | 0.13 ms         | 2.5x  | 0.63 ms
LZ4-HC    | 2.0 ms        | 0.13 ms         | 3.0x  | 2.13 ms
ZSTD-1    | 1.0 ms        | 0.5 ms          | 3.0x  | 1.5 ms
ZSTD-3    | 2.5 ms        | 0.5 ms          | 3.5x  | 3.0 ms
ZSTD-9    | 15 ms         | 0.5 ms          | 4.5x  | 15.5 ms

File I/O Performance (100 KB, SSD)

Operation    | Time   | Throughput
-------------|--------|-----------
Direct write | 0.3 ms | 320 MB/s
Atomic save  | 2-5 ms | 20-50 MB/s
Load         | 0.2 ms | 500 MB/s

Note: Atomic save is slower due to fsync() overhead but provides crash safety.


Usage Examples

Example 1: Quick Autosave with LZ4

#include "compression_strategy.hpp"
#include "atomic_file_ops.hpp"

void autosave_plugin_state(const PluginState& state) {
    // Serialize to binary
    auto serialized = binary_serialize(state);

    // Compress with LZ4 (fast)
    auto compressor = create_lz4_strategy();
    size_t max_compressed = compressor->get_max_compressed_size(serialized.size());
    std::vector<uint8_t> compressed(max_compressed);

    auto compress_result = compressor->compress(
        serialized.data(), serialized.size(),
        compressed.data(), compressed.size()
    );

    if (!compress_result.success) {
        log_error("Compression failed");
        return;
    }

    compressed.resize(compress_result.compressed_size);

    // Atomic save
    auto save_result = AtomicFileOps::save_atomic(
        "autosave.dat",
        compressed.data(),
        compressed.size()
    );

    if (save_result.success) {
        log_info("Autosave: {} KB → {} KB ({:.1f}x) in {:.1f} ms",
            serialized.size() / 1024,
            compressed.size() / 1024,
            compress_result.compression_ratio,
            compress_result.elapsed_ms + save_result.elapsed_ms
        );
    }
}

Example 2: Load and Decompress State

std::optional<PluginState> load_plugin_state(const std::filesystem::path& path) {
    // Load compressed file
    auto load_result = AtomicFileOps::load(path);
    if (!load_result.success) {
        log_error("Load failed: {}", load_result.error_message);
        return std::nullopt;
    }

    // Decompress (algorithm stored in file header - not shown)
    auto decompressor = create_lz4_strategy();

    // Allocate decompression buffer (size from header - not shown)
    size_t uncompressed_size = read_header_size(load_result.data);
    std::vector<uint8_t> decompressed(uncompressed_size);

    // Note: a complete implementation would skip the header bytes here;
    // the offset is omitted for brevity, matching the "not shown" header handling above.
    auto decompress_result = decompressor->decompress(
        load_result.data.data(), load_result.data.size(),
        decompressed.data(), decompressed.size()
    );

    if (!decompress_result.success) {
        log_error("Decompression failed");
        return std::nullopt;
    }

    // Deserialize
    return binary_deserialize<PluginState>(decompressed);
}

Example 3: Preset Library with ZSTD

void save_preset_to_library(const Preset& preset, const std::string& name) {
    auto serialized = binary_serialize(preset);

    // Use ZSTD for better compression (presets stored long-term)
    auto compressor = create_zstd_strategy(9);  // High compression

    size_t max_compressed = compressor->get_max_compressed_size(serialized.size());
    std::vector<uint8_t> compressed(max_compressed);

    auto result = compressor->compress(
        serialized.data(), serialized.size(),
        compressed.data(), compressed.size()
    );

    if (result.success) {
        compressed.resize(result.compressed_size);

        std::filesystem::path preset_path =
            get_preset_library_path() / (name + ".preset");

        AtomicFileOps::save_atomic(
            preset_path,
            compressed.data(),
            compressed.size()
        );

        log_info("Preset '{}' saved: {:.1f}x compression",
            name, result.compression_ratio);
    }
}

Example 4: Streaming Large Project Save

void save_large_project(const Project& project, const std::filesystem::path& path) {
    // Use streaming API for large project
    auto result = AtomicFileOps::save_atomic_streamed(
        path,
        [&](std::ostream& out) {
            // Write header
            ProjectHeader header = create_header(project);
            out.write(reinterpret_cast<const char*>(&header), sizeof(header));

            // Compress and write tracks
            auto compressor = create_zstd_strategy(3);
            for (const auto& track : project.tracks) {
                auto serialized = binary_serialize(track);

                size_t max_size = compressor->get_max_compressed_size(serialized.size());
                std::vector<uint8_t> compressed(max_size);

                auto compress_result = compressor->compress(
                    serialized.data(), serialized.size(),
                    compressed.data(), compressed.size()
                );

                if (!compress_result.success) return false;

                // Write compressed track
                uint32_t compressed_size = compress_result.compressed_size;
                out.write(reinterpret_cast<const char*>(&compressed_size), sizeof(compressed_size));
                out.write(reinterpret_cast<const char*>(compressed.data()), compressed_size);
            }

            return out.good();
        }
    );

    if (result.success) {
        log_info("Project saved: {} MB in {:.1f} seconds",
            result.bytes_written / (1024 * 1024),
            result.elapsed_ms / 1000.0
        );
    }
}

Building and Testing

Prerequisites

# Install dependencies via vcpkg
vcpkg install lz4 zstd catch2

CMake Configuration

cd "2 - FOUNDATION"
mkdir build && cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=<vcpkg-root>/scripts/buildsystems/vcpkg.cmake
cmake --build . --config Release

Running Tests

# All tests
ctest -C Release

# Specific tests
ctest -C Release -R test_compression
ctest -C Release -R test_atomic_file_ops

# Verbose output
ctest -C Release -VV

Running Benchmark

cd build/04_CORE/04_11_state_serialization/04_11_04_persistence_optimization
./compression_benchmark

Expected output:

╔═══════════════════════════════════════════════════════════════╗
║         COMPRESSION BENCHMARK - AudioLab Persistence          ║
╚═══════════════════════════════════════════════════════════════╝

Structured Data (256 KB)
----------------------------------------------------------------------
  Algorithm  :  Ratio  | Comp MB/s | Decomp MB/s | Total ms
----------------------------------------------------------------------
        None :    1.00x |  10234 MB/s |  10123 MB/s |   0.05 ms
         LZ4 :    2.56x |    512 MB/s |   2048 MB/s |   0.63 ms
      LZ4-HC :    3.12x |    128 MB/s |   2048 MB/s |   2.13 ms
      ZSTD-1 :    2.89x |    256 MB/s |    512 MB/s |   1.50 ms
      ZSTD-3 :    3.45x |    102 MB/s |    512 MB/s |   3.01 ms
      ZSTD-9 :    4.67x |     17 MB/s |    512 MB/s |  15.12 ms

...


Integration with State Serialization

This module is designed to integrate with the existing state serialization system:

// In your serialization layer
class CompressedBinarySerializer {
public:
    template<typename T>
    std::vector<uint8_t> serialize(const T& obj, CompressionAlgorithm algo) {
        // 1. Binary serialize
        auto binary = binary_serialize(obj);

        // 2. Compress
        auto strategy = create_compression_strategy(algo);
        size_t max_size = strategy->get_max_compressed_size(binary.size());
        std::vector<uint8_t> compressed(max_size);

        auto result = strategy->compress(
            binary.data(), binary.size(),
            compressed.data(), compressed.size()
        );

        if (!result.success) {
            throw std::runtime_error("Compression failed");
        }

        // 3. Prepend header with algorithm and uncompressed size
        CompressedHeader header{
            .magic = MAGIC_NUMBER,
            .algorithm = algo,
            .uncompressed_size = binary.size(),
            .compressed_size = result.compressed_size
        };

        std::vector<uint8_t> output;
        output.reserve(sizeof(header) + result.compressed_size);

        // Copy header
        const auto* header_bytes = reinterpret_cast<const uint8_t*>(&header);
        output.insert(output.end(), header_bytes, header_bytes + sizeof(header));

        // Copy compressed data
        output.insert(output.end(), compressed.begin(),
                     compressed.begin() + result.compressed_size);

        return output;
    }

    template<typename T>
    T deserialize(const std::vector<uint8_t>& data) {
        // 1. Read header
        const auto* header = reinterpret_cast<const CompressedHeader*>(data.data());

        // 2. Decompress
        auto strategy = create_compression_strategy(header->algorithm);
        std::vector<uint8_t> decompressed(header->uncompressed_size);

        auto result = strategy->decompress(
            data.data() + sizeof(CompressedHeader),
            header->compressed_size,
            decompressed.data(),
            decompressed.size()
        );

        if (!result.success) {
            throw std::runtime_error("Decompression failed");
        }

        // 3. Binary deserialize
        return binary_deserialize<T>(decompressed);
    }
};
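
The snippet above assumes a small fixed header. One possible layout is sketched below; it is illustrative, not the module's actual on-disk format, and a production format would add a version field, a checksum, and explicit endianness handling rather than relying on reinterpret_cast of an in-memory struct:

#include <cstdint>

enum class CompressionAlgorithm : uint8_t { None = 0, LZ4 = 1, ZSTD = 2 };

constexpr uint32_t MAGIC_NUMBER = 0x414C5053;  // arbitrary example tag

// Field order matches the designated initializers used in the snippet above.
struct CompressedHeader {
    uint32_t magic;                  // identifies the file format
    CompressionAlgorithm algorithm;  // which strategy to use when loading
    uint64_t uncompressed_size;      // needed to size the decompression buffer
    uint64_t compressed_size;        // payload length following the header
};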

Design Decisions

1. Strategy Pattern for Compression

Why: Allows runtime selection of compression algorithm based on use case.

Alternative considered: Template-based compile-time selection. Rejected: requires recompilation to change the algorithm and can't be made user-configurable.

2. Caller-Managed Memory

Why: Zero-copy design, optimal performance, no hidden allocations.

Alternative considered: Returning a std::vector from compress(). Rejected: forces an allocation on every call, prevents buffer reuse, and is less flexible.

3. Atomic Rename Pattern

Why: OS-level atomic operation, portable, simple implementation.

Alternative considered: Write-ahead logging (WAL). Rejected: more complex, slower, and overkill for plugin state.

4. fsync() for Durability

Why: Guarantees data on disk, prevents data loss on power failure.

Alternative considered: Skipping fsync for speed. Rejected: not crash-safe; recent saves can be lost on power failure.


Performance Optimization Tips

1. Choose Algorithm Based on Use Case

// Real-time autosave (< 1ms budget)
auto fast_save = create_lz4_strategy();

// User-initiated save (< 10ms acceptable)
auto balanced_save = create_zstd_strategy(3);

// Preset library (no time constraint)
auto archival_save = create_zstd_strategy(9);

2. Reuse Compression Buffers

class StateManager {
    std::unique_ptr<ICompressionStrategy> strategy_;
    std::vector<uint8_t> compression_buffer_;

    void save() {
        auto data = serialize_state();

        // Resize buffer if needed (doesn't reallocate if same size)
        size_t max_size = strategy_->get_max_compressed_size(data.size());
        if (compression_buffer_.size() < max_size) {
            compression_buffer_.resize(max_size);
        }

        auto result = strategy_->compress(
            data.data(), data.size(),
            compression_buffer_.data(), compression_buffer_.size()
        );

        // Use result...
    }
};

3. Batch Small Saves

// Instead of many small saves
for (const auto& param : params) {
    save_parameter(param);  // Each triggers fsync - SLOW
}

// Batch into single save
save_all_parameters(params);  // Single fsync - FAST
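
A sketch of the batched variant, assuming the parameters can be packed into one contiguous buffer (the parameter layout here is purely illustrative):

#include "atomic_file_ops.hpp"

#include <cstdint>
#include <cstring>
#include <vector>

void save_all_parameters_sketch(const std::vector<float>& params) {
    // Pack every parameter value into a single buffer...
    std::vector<uint8_t> buffer(params.size() * sizeof(float));
    std::memcpy(buffer.data(), params.data(), buffer.size());

    // ...then write it with one atomic save: one temp file, one fsync, one rename.
    AtomicFileOps::save_atomic("parameters.dat", buffer.data(), buffer.size());
}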

4. Use Streaming for Large Data

// Don't load entire project into memory
auto project_data = load_entire_project();  // BAD: huge allocation

// Stream chunks instead
save_atomic_streamed(path, [&](std::ostream& out) {
    for (const auto& chunk : get_project_chunks()) {
        out.write(chunk.data(), chunk.size());
    }
});

Thread Safety

Compression Operations

  • Thread-safe: All compression/decompression operations are stateless
  • Can compress different data concurrently from multiple threads
  • No synchronization needed for compression itself

File Operations

  • Thread-safe for different files: Multiple threads can save to different files
  • NOT thread-safe for same file: Requires external synchronization
  • Atomic rename is OS-level atomic: Safe even with concurrent readers

Example: Concurrent Saves

// SAFE: Different files
std::thread t1([&]{ save_state("plugin1.dat", state1); });
std::thread t2([&]{ save_state("plugin2.dat", state2); });

// UNSAFE: Same file (requires mutex)
std::mutex file_mutex;
std::thread t3([&]{
    std::lock_guard lock(file_mutex);
    save_state("shared.dat", state3);
});

Troubleshooting

Compression Fails

Symptom: compress_result.success == false

Causes:

  1. Output buffer too small - Fix: size the buffer with get_max_compressed_size()
  2. Input data too large (> INT_MAX bytes for LZ4) - Fix: split the data into chunks (see the sketch below) or use ZSTD
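
If the input exceeds the LZ4 limit, one option is to compress it in fixed-size chunks. This is a sketch only; the chunk size is arbitrary, and each chunk's compressed size must also be stored so the data can be decompressed later:

#include <algorithm>
#include <cstdint>
#include <vector>

std::vector<std::vector<uint8_t>> compress_in_chunks(ICompressionStrategy& strategy,
                                                     const uint8_t* data, size_t size,
                                                     size_t chunk_size = 64 * 1024 * 1024) {
    std::vector<std::vector<uint8_t>> chunks;
    for (size_t offset = 0; offset < size; offset += chunk_size) {
        const size_t this_chunk = std::min(chunk_size, size - offset);

        // Compress each chunk independently into its own buffer
        std::vector<uint8_t> out(strategy.get_max_compressed_size(this_chunk));
        auto result = strategy.compress(data + offset, this_chunk, out.data(), out.size());
        if (!result.success) return {};  // caller treats an empty result as failure

        out.resize(result.compressed_size);
        chunks.push_back(std::move(out));
    }
    return chunks;
}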

Decompression Fails

Symptom: decompress_result.success == false

Causes:

  1. Corrupted compressed data - Fix: add checksums/CRC to detect corruption (see the sketch below)
  2. Output buffer too small - Fix: store the uncompressed size in the file header
  3. Wrong decompression algorithm - Fix: store the algorithm ID in the file header
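
One way to detect corruption before attempting decompression is to store a checksum next to the compressed payload. The module does not currently ship a checksum helper (see the Integration TODO), so this is a dependency-free sketch of a plain CRC-32:

#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 (polynomial 0xEDB88320) - slow but simple and dependency-free.
uint32_t crc32(const uint8_t* data, size_t size) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < size; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// When saving: write crc32(compressed, size) alongside the payload (e.g. in the header).
// When loading: recompute it and compare before calling decompress().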

Atomic Save Fails

Symptom: save_result.success == false

Causes:

  1. Disk full - Check save_result.error_message for details
  2. Permission denied - Fix: check file and directory permissions
  3. Invalid path - Fix: validate the path before saving

Benchmark Shows Low Performance

Possible reasons:

  1. Debug build (much slower than release) - Fix: build with -DCMAKE_BUILD_TYPE=Release
  2. Cold cache (first run is slower) - Fix: run the benchmark several times and ignore the first run
  3. Antivirus scanning files - Fix: exclude the build directory from antivirus scanning


Future Enhancements

Planned Features

  1. Encryption Support
     • AES-256 encryption for sensitive state data
     • Integrate with compression (compress, then encrypt)

  2. Delta Compression
     • Only save data that changed since the last save
     • Faster saves for large states with small changes

  3. Background Compression
     • Compress on a worker thread
     • Non-blocking autosave

  4. Adaptive Algorithm Selection
     • Automatically choose the algorithm based on data characteristics
     • Profile the compression ratio vs. time trade-off

  5. Additional Algorithms
     • Brotli (better compression than ZSTD for text)
     • Snappy (faster than LZ4, but lower ratio)

Integration TODO

  • Add compression to XML/JSON serializers
  • Implement compressed file header format
  • Add CRC32 checksums for data integrity
  • Create unified StateManager API
  • Add preset library compression utilities

References

LZ4

ZSTD

Atomic File Operations


License

Part of the AudioLab project. See root LICENSE file.


Contact

For questions or issues with this module, please open an issue on the AudioLab GitHub repository.