05_20_00_specification_parser - The Blueprint Interpreter

PURPOSE

The parser is the entry point of the fabrication pipeline. It converts specifications written in multiple formats (YAML, JSON, XML, custom DSL) into a unified Intermediate Representation (IR) that the rest of the system can process. It is the universal translator that understands every specification dialect.

Responsibility: lexical, syntactic, and semantic analysis of specifications → a valid, consistent IR.


STRUCTURE

05_20_00_specification_parser/
├── include/
│   ├── specification_parser.hpp        # Parser base interface
│   ├── yaml_parser.hpp                  # YAML implementation
│   ├── json_parser.hpp                  # JSON implementation
│   ├── xml_parser.hpp                   # XML implementation
│   ├── dsp_dsl_parser.hpp               # Custom DSL for DSP
│   ├── ast.hpp                          # Abstract Syntax Tree
│   ├── ir_generator.hpp                 # IR generation
│   └── parser_factory.hpp               # Auto-detection
├── src/
│   ├── yaml_parser.cpp
│   ├── json_parser.cpp
│   ├── xml_parser.cpp
│   ├── dsp_dsl_parser.cpp
│   ├── ast_builder.cpp
│   ├── ir_generator.cpp
│   ├── semantic_validator.cpp
│   └── error_reporter.cpp
├── tests/
│   ├── test_yaml_parser.cpp
│   ├── test_json_parser.cpp
│   ├── test_semantic_validation.cpp
│   ├── test_error_reporting.cpp
│   └── test_malformed_inputs.cpp
├── examples/
│   ├── example_specs/
│   │   ├── simple_filter.yaml
│   │   ├── complex_synth.json
│   │   ├── modulator.xml
│   │   └── math_dsl.dsp
│   └── parse_example.cpp
└── README.md

PARSING PIPELINE

┌─────────────────────┐
│ Input Specification │  YAML/JSON/XML/DSL file
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Format Detection    │  Auto-detect or explicit
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Lexical Analysis    │  Tokenization
│ (Tokenizer)         │  String → Tokens
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Syntax Analysis     │  Parse tree construction
│ (Parser)            │  Tokens → AST
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Semantic Analysis   │  Type checking, validation
│ (Validator)         │  AST → Validated AST
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ IR Generation       │  Platform-agnostic representation
│ (IR Generator)      │  AST → IR
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ IR Output           │  To Template Engine (01)
└─────────────────────┘

MAIN API

ISpecificationParser

class ISpecificationParser {
public:
    virtual ~ISpecificationParser() = default;

    // Parse file to AST
    virtual AST parse(const std::string& file_path) = 0;

    // Parse string to AST
    virtual AST parse_string(const std::string& content) = 0;

    // Validate specification
    virtual ValidationResult validate(const AST& ast) = 0;

    // Generate IR from AST
    virtual IR generate_ir(const AST& ast) = 0;

    // Get parser name
    virtual std::string get_name() const = 0;

    // Supported file extensions
    virtual std::vector<std::string> get_extensions() const = 0;
};

ParserFactory

class ParserFactory {
public:
    // Auto-detect format and parse
    static IR parse_auto(const std::string& file_path);

    // Explicit format parsing
    static IR parse_yaml(const std::string& file_path);
    static IR parse_json(const std::string& file_path);
    static IR parse_xml(const std::string& file_path);
    static IR parse_dsl(const std::string& file_path);

    // Register custom parser
    static void register_parser(std::unique_ptr<ISpecificationParser> parser);

    // Get parser for extension
    static ISpecificationParser* get_parser(const std::string& extension);
};
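
The auto-detection step behind parse_auto could work roughly as sketched below. This is only an illustration under assumptions: the SpecFormat enum and the detect_by_extension / detect_by_content helpers are hypothetical names that do not appear in the module, and the extension list is inferred from the example files (.yaml, .json, .xml, .dsp).

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

// Hypothetical format tag; the real factory may represent formats differently.
enum class SpecFormat { Yaml, Json, Xml, Dsl, Unknown };

// Guess the format from the file extension (case-insensitive).
SpecFormat detect_by_extension(const std::string& file_path) {
    std::string ext = std::filesystem::path(file_path).extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (ext == ".yaml" || ext == ".yml") return SpecFormat::Yaml;
    if (ext == ".json") return SpecFormat::Json;
    if (ext == ".xml") return SpecFormat::Xml;
    if (ext == ".dsp") return SpecFormat::Dsl;
    return SpecFormat::Unknown;
}

// Fallback: inspect the first non-whitespace character of the content.
SpecFormat detect_by_content(const std::string& content) {
    auto it = std::find_if(content.begin(), content.end(),
                           [](unsigned char c) { return !std::isspace(c); });
    if (it == content.end()) return SpecFormat::Unknown;
    if (*it == '{' || *it == '[') return SpecFormat::Json;
    if (*it == '<') return SpecFormat::Xml;
    // YAML and the custom DSL cannot be told apart this cheaply.
    return SpecFormat::Unknown;
}
```

Checking the extension first keeps the common case cheap; content sniffing is only a fallback for files with a missing or unfamiliar extension.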

SPECIFICATION EXAMPLE (YAML)

# simple_filter.yaml
module:
  name: ButterworthLowPass
  type: dsp_filter
  level: L0  # Kernel level

parameters:
  - name: cutoff_frequency
    type: float
    range: [20.0, 20000.0]
    default: 1000.0
    unit: Hz
    modulation: continuous

  - name: resonance
    type: float
    range: [0.1, 10.0]
    default: 0.707
    unit: Q
    modulation: continuous

processing:
  algorithm: butterworth_iir
  order: 2
  topology: biquad_cascade
  precision: double

optimization:
  simd: [sse, avx, neon]
  vectorization: auto
  cache_friendly: true

metadata:
  author: AudioLab Generator
  license: MIT
  tags: [filter, lowpass, butterworth]

IR (INTERMEDIATE REPRESENTATION)

struct IR {
    struct Module {
        std::string name;
        std::string type;  // dsp_filter, dsp_oscillator, dsp_effect, etc.
        std::string level; // L0, L1, L2, L3
    };

    struct Parameter {
        std::string name;
        std::string type;  // float, int, bool, enum
        Range range;
        Value default_value;
        std::string unit;
        std::string modulation; // continuous, discrete, none
    };

    struct Processing {
        std::string algorithm;
        int order;
        std::string topology;
        std::string precision; // float, double, fixed
    };

    struct Optimization {
        std::vector<std::string> simd;
        std::string vectorization; // auto, manual, none
        bool cache_friendly;
    };

    Module module;
    std::vector<Parameter> parameters;
    Processing processing;
    Optimization optimization;
    Metadata metadata;
};
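
The IR struct refers to Range, Value, and Metadata without showing them. The sketch below is one plausible shape for those types, not the module's actual definitions: only Range's min and max members are grounded in the usage example, while the Value variant, the Metadata fields (mirroring the YAML metadata block), and the as_number helper are assumptions.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical definitions; only min/max are confirmed by the usage example.
struct Range {
    double min = 0.0;
    double max = 0.0;
    bool contains(double v) const { return v >= min && v <= max; }
};

// A default value may be a float, an int, a bool, or an enum label (assumption).
using Value = std::variant<double, long long, bool, std::string>;

// Mirrors the metadata block of the YAML example.
struct Metadata {
    std::string author;
    std::string license;
    std::vector<std::string> tags;
};

// Extract a numeric Value as double for range checks.
// Returns false for bool and enum-label values.
bool as_number(const Value& v, double& out) {
    if (const double* d = std::get_if<double>(&v)) {
        out = *d;
        return true;
    }
    if (const long long* i = std::get_if<long long>(&v)) {
        out = static_cast<double>(*i);
        return true;
    }
    return false;
}
```

A variant keeps the parameter list homogeneous while still letting the validator distinguish numeric defaults from enum labels.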

SEMANTIC VALIDATION

Checks Performed

  1. Type Consistency
     • Parameters have valid types
     • Ranges are compatible with their types
     • Default values fall within the range

  2. Algorithm Validity
     • Algorithm exists in 03_ALGORITHM_SPEC
     • Required parameters are present
     • Topology is compatible with the algorithm

  3. Optimization Compatibility
     • SIMD targets are valid (sse, avx, neon)
     • Vectorization is feasible for the algorithm
     • Precision is supported

  4. Constraint Satisfaction
     • Dependencies are resolvable
     • No circular dependencies
     • Resource constraints are realistic

  5. Naming Conventions
     • Names follow standards (snake_case, etc.)
     • No reserved keywords
     • Unique identifiers
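
A few of these checks (valid parameter types, well-formed ranges, defaults inside the range, unique identifiers) can be sketched as follows. ParamSpec is a simplified, hypothetical stand-in for IR::Parameter that models numeric defaults only, and check_parameters is an illustrative name, not part of the module's API.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for IR::Parameter (assumption); numeric defaults only.
struct ParamSpec {
    std::string name;
    std::string type;
    double range_min;
    double range_max;
    double default_value;
};

// Returns human-readable messages; an empty vector means every check passed.
std::vector<std::string> check_parameters(const std::vector<ParamSpec>& params) {
    static const std::set<std::string> valid_types = {"float", "int", "bool", "enum"};
    std::vector<std::string> errors;
    std::set<std::string> seen;
    for (const auto& p : params) {
        if (valid_types.count(p.type) == 0)
            errors.push_back(p.name + ": unknown type '" + p.type + "'");
        if (p.range_min > p.range_max)
            errors.push_back(p.name + ": range minimum is greater than maximum");
        else if (p.default_value < p.range_min || p.default_value > p.range_max)
            errors.push_back(p.name + ": default value is outside the range");
        // insert() reports whether the name was already present.
        if (!seen.insert(p.name).second)
            errors.push_back(p.name + ": duplicate identifier");
    }
    return errors;
}
```

Collecting all messages instead of stopping at the first failure lets a single validation pass report every problem in a specification.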

ERROR REPORTING

Error Example

ERROR: Invalid parameter range
  File: specifications/simple_filter.yaml
  Line: 14
  Column: 12

  13 |   - name: cutoff_frequency
  14 |     range: [20000.0, 20.0]
                  ^^^^^^^^^^^^^^^^
  15 |     default: 1000.0

  Error: Range minimum (20000.0) is greater than maximum (20.0)
  Suggestion: Swap range values to [20.0, 20000.0]

Error Categories

  • Syntax Errors: Malformed YAML/JSON/XML
  • Semantic Errors: Invalid algorithm, missing required fields
  • Type Errors: Type mismatch, invalid range
  • Constraint Errors: Unsatisfiable constraints
  • Warning: Suboptimal but valid configuration
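
A report in the layout of the error example could be rendered along these lines. The Diagnostic struct and the render function are hypothetical; the sketch prints only the offending line (no neighboring context lines) and assumes 1-based line and column numbers.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical diagnostic record; field names are assumptions.
struct Diagnostic {
    std::string message;
    std::string file;
    int line;    // 1-based line of the offending text
    int column;  // 1-based column where the underline starts
    int length;  // number of characters to underline
};

// Render the header, the offending source line, and a caret underline.
std::string render(const Diagnostic& d, const std::vector<std::string>& source_lines) {
    std::ostringstream out;
    out << "ERROR: " << d.message << "\n"
        << "  File: " << d.file << "\n"
        << "  Line: " << d.line << "\n"
        << "  Column: " << d.column << "\n\n";
    const bool in_bounds = d.line >= 1 &&
                           d.line <= static_cast<int>(source_lines.size()) &&
                           d.column >= 1 && d.length >= 0;
    if (in_bounds) {
        const std::string prefix = "  " + std::to_string(d.line) + " | ";
        out << prefix << source_lines[d.line - 1] << "\n"
            // Pad by the prefix width plus the column offset so carets align.
            << std::string(prefix.size() + d.column - 1, ' ')
            << std::string(d.length, '^') << "\n";
    }
    return out.str();
}
```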

CUSTOM DSL EXAMPLE

// math_dsl.dsp - Custom DSL for DSP math expressions

filter butterworth_lpf {
    parameters {
        float fc = 1000.0 [20.0..20000.0] "Cutoff Frequency (Hz)";
        float q = 0.707 [0.1..10.0] "Resonance (Q)";
    }

    processing {
        // Biquad low-pass coefficients, normalized by a0
        omega = 2.0 * PI * fc / sample_rate;
        alpha = sin(omega) / (2.0 * q);

        a0 = 1.0 + alpha;
        b0 = (1.0 - cos(omega)) / (2.0 * a0);
        b1 = (1.0 - cos(omega)) / a0;
        b2 = (1.0 - cos(omega)) / (2.0 * a0);
        a1 = -2.0 * cos(omega) / a0;
        a2 = (1.0 - alpha) / a0;

        // Direct Form I difference equation
        output = b0*x[n] + b1*x[n-1] + b2*x[n-2]
               - a1*y[n-1] - a2*y[n-2];
    }

    optimize {
        simd: sse, avx;
        vectorize: auto;
    }
}
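
The lexical-analysis stage for this DSL might look like the sketch below. It is a minimal tokenizer under assumptions: the TokenKind categories and the tokenize function are hypothetical, and the grammar is inferred only from the example above (identifiers, decimal numbers, double-quoted strings, `//` comments, the `..` range operator, and single-character symbols).

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Number, String, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    const std::size_t n = src.size();
    std::size_t i = 0;
    auto is_digit = [&](std::size_t k) {
        return k < n && std::isdigit(static_cast<unsigned char>(src[k])) != 0;
    };
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        // Line comment: skip to the end of the line.
        if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            while (i < n && src[i] != '\n') ++i;
            continue;
        }
        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            while (is_digit(i)) ++i;
            // Take a fraction only if '.' is followed by a digit,
            // so "20.0..20000.0" splits into number, "..", number.
            if (i < n && src[i] == '.' && is_digit(i + 1)) {
                ++i;
                while (is_digit(i)) ++i;
            }
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (c == '"') {
            ++i;
            while (i < n && src[i] != '"') ++i;
            if (i == n) throw std::runtime_error("unterminated string literal");
            tokens.push_back({TokenKind::String, src.substr(start + 1, i - start - 1)});
            ++i;  // skip the closing quote
        } else if (c == '.' && i + 1 < n && src[i + 1] == '.') {
            tokens.push_back({TokenKind::Symbol, ".."});
            i += 2;
        } else {
            tokens.push_back({TokenKind::Symbol, std::string(1, src[i])});
            ++i;
        }
    }
    return tokens;
}
```

The one subtle point is the range syntax: a number may only consume a decimal point when a digit follows it, otherwise the first dot of `..` would be absorbed into the preceding literal.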

USAGE EXAMPLE

#include <iostream>

#include "parser_factory.hpp"
// TemplateEngine comes from the downstream Template Engine module (01);
// its header is not part of this module and is not shown here.

int main() {
    // Auto-detect format and parse
    try {
        IR ir = ParserFactory::parse_auto("specs/simple_filter.yaml");

        std::cout << "Module: " << ir.module.name << "\n";
        std::cout << "Parameters: " << ir.parameters.size() << "\n";

        for (const auto& param : ir.parameters) {
            std::cout << "  - " << param.name
                      << " [" << param.range.min << ".." << param.range.max << "]\n";
        }

        // Pass IR to next stage (Template Engine)
        TemplateEngine engine;
        engine.process(ir);

    } catch (const ParserException& e) {
        std::cerr << "Parse error: " << e.what() << "\n";
        std::cerr << "Location: " << e.file() << ":" << e.line() << "\n";
        return 1;
    }
    return 0;
}

DEVELOPMENT TASKS

  • Implement the ISpecificationParser interface
  • Implement YAMLParser (yaml-cpp)
  • Implement JSONParser (nlohmann/json)
  • Implement XMLParser (pugixml)
  • Implement DSL Parser (custom lexer/parser)
  • Implement AST builder
  • Implement IR generator
  • Implement semantic validator
  • Implement error reporting system with context
  • Implement ParserFactory with auto-detection
  • Unit tests for each parser (>95% coverage)
  • Malformed-input tests (robustness)
  • Performance tests (large files >1MB)
  • Fuzzing tests for security
  • Documentation of supported formats
  • Example library of specifications

PERFORMANCE TARGETS

  • Small spec (<10KB): <10ms parsing time
  • Medium spec (10-100KB): <50ms parsing time
  • Large spec (>1MB): <500ms parsing time
  • Memory usage: <2x file size
  • Error detection: 100% syntax errors caught
  • Test coverage: >95%
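
One way to check the parsing-time targets in a performance test is a small timing helper like the one below. Both time_ms and budget_ms are hypothetical names. The sketch treats 1KB as 1024 bytes, and because no target is stated for sizes between 100KB and 1MB, it applies the large-spec budget to that gap; both choices are assumptions.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
double time_ms(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Parsing-time budget in milliseconds for an input of `bytes` bytes.
// Assumption: sizes between 100KB and 1MB fall under the large-spec budget.
double budget_ms(std::size_t bytes) {
    if (bytes < 10 * 1024) return 10.0;
    if (bytes <= 100 * 1024) return 50.0;
    return 500.0;
}
```

A test would then compare time_ms of a parse call against budget_ms of the file size; steady_clock is used because it is monotonic and unaffected by system clock adjustments.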


EXTERNAL DEPENDENCIES

  • yaml-cpp (YAML parser)
  • nlohmann/json (JSON parser)
  • pugixml (XML parser)


REFERENCES


TESTING

# Run all parser tests
cd tests
cmake -B build -S ..
cmake --build build
./build/test_yaml_parser
./build/test_json_parser
./build/test_semantic_validation

# Run with coverage
cmake -B build -S .. -DCMAKE_BUILD_TYPE=Coverage
cmake --build build
ctest --test-dir build --output-on-failure

This is the system's entry module. The quality of parsing determines the quality of everything generated afterwards.