05_20_00_specification_parser - The Blueprint Interpreter
PURPOSE
The parser is the entry point of the fabrication pipeline. It converts specifications written in several formats (YAML, JSON, XML, custom DSL) into a unified Intermediate Representation (IR) that the rest of the system can process. It is the universal translator that understands every specification dialect.
Responsibility: Lexical, syntactic, and semantic analysis of specifications → valid, consistent IR.
STRUCTURE
05_20_00_specification_parser/
├── include/
│ ├── specification_parser.hpp # Parser base interface
│ ├── yaml_parser.hpp # YAML implementation
│ ├── json_parser.hpp # JSON implementation
│ ├── xml_parser.hpp # XML implementation
│ ├── dsp_dsl_parser.hpp # Custom DSL for DSP
│ ├── ast.hpp # Abstract Syntax Tree
│ ├── ir_generator.hpp # IR generation
│ └── parser_factory.hpp # Auto-detection
├── src/
│ ├── yaml_parser.cpp
│ ├── json_parser.cpp
│ ├── xml_parser.cpp
│ ├── dsp_dsl_parser.cpp
│ ├── ast_builder.cpp
│ ├── ir_generator.cpp
│ ├── semantic_validator.cpp
│ └── error_reporter.cpp
├── tests/
│ ├── test_yaml_parser.cpp
│ ├── test_json_parser.cpp
│ ├── test_semantic_validation.cpp
│ ├── test_error_reporting.cpp
│ └── test_malformed_inputs.cpp
├── examples/
│ ├── example_specs/
│ │ ├── simple_filter.yaml
│ │ ├── complex_synth.json
│ │ ├── modulator.xml
│ │ └── math_dsl.dsp
│ └── parse_example.cpp
└── README.md
PARSING PIPELINE
┌─────────────────────┐
│ Input Specification │   YAML/JSON/XML/DSL file
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ Format Detection    │   Auto-detect or explicit
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ Lexical Analysis    │   Tokenization
│ (Tokenizer)         │   String → Tokens
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ Syntax Analysis     │   Parse tree construction
│ (Parser)            │   Tokens → AST
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ Semantic Analysis   │   Type checking, validation
│ (Validator)         │   AST → Validated AST
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ IR Generation       │   Platform-agnostic representation
│ (IR Generator)      │   AST → IR
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ IR Output           │   To Template Engine (01)
└─────────────────────┘
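The lexical analysis stage (String → Tokens) can be illustrated with a small hand-written tokenizer. This is a minimal sketch, not the module's actual lexer: `TokenKind`, `Token`, and `tokenize` are hypothetical names, and the token set is only what a DSL parameter declaration such as `float fc = 1000.0 [20.0..20000.0] "Cutoff Frequency (Hz)";` would need.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for the custom DSL; the real lexer may differ.
enum class TokenKind { Identifier, Number, String, Range, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits DSL source text into tokens. Whitespace is skipped, and an
// unterminated string literal simply runs to the end of the input.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    const auto is_digit = [&](std::size_t k) {
        return k < src.size() && std::isdigit(static_cast<unsigned char>(src[k]));
    };
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            tokens.push_back({TokenKind::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            while (is_digit(i)) ++i;
            // Treat '.' as a decimal point only when a digit follows, so that
            // "20.0..20000.0" lexes as Number, Range, Number.
            if (i < src.size() && src[i] == '.' && is_digit(i + 1)) {
                ++i;
                while (is_digit(i)) ++i;
            }
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (c == '"') {
            ++i;  // skip the opening quote
            while (i < src.size() && src[i] != '"') ++i;
            tokens.push_back({TokenKind::String, src.substr(start + 1, i - start - 1)});
            if (i < src.size()) ++i;  // skip the closing quote
        } else if (c == '.' && i + 1 < src.size() && src[i + 1] == '.') {
            i += 2;
            tokens.push_back({TokenKind::Range, ".."});
        } else {
            ++i;
            tokens.push_back({TokenKind::Symbol, std::string(1, src[start])});
        }
    }
    return tokens;
}
```

The one subtle rule is the decimal-point lookahead: without it, the range operator `..` would be swallowed by the preceding number.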
MAIN API
ISpecificationParser
class ISpecificationParser {
public:
    virtual ~ISpecificationParser() = default;
    // Parse file to AST
    virtual AST parse(const std::string& file_path) = 0;
    // Parse string to AST
    virtual AST parse_string(const std::string& content) = 0;
    // Validate specification
    virtual ValidationResult validate(const AST& ast) = 0;
    // Generate IR from AST
    virtual IR generate_ir(const AST& ast) = 0;
    // Get parser name
    virtual std::string get_name() const = 0;
    // Supported file extensions
    virtual std::vector<std::string> get_extensions() const = 0;
};
ParserFactory
class ParserFactory {
public:
    // Auto-detect format and parse
    static IR parse_auto(const std::string& file_path);
    // Explicit format parsing
    static IR parse_yaml(const std::string& file_path);
    static IR parse_json(const std::string& file_path);
    static IR parse_xml(const std::string& file_path);
    static IR parse_dsl(const std::string& file_path);
    // Register custom parser
    static void register_parser(std::unique_ptr<ISpecificationParser> parser);
    // Get parser for extension
    static ISpecificationParser* get_parser(const std::string& extension);
};
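One way `register_parser` and `get_parser` could cooperate is an extension-keyed registry. The sketch below is an assumption about the internals, not the module's implementation: `ParserRegistry`, `IParserInfo` (a reduced stand-in for `ISpecificationParser`), `extension_of`, and `FakeYamlParser` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Reduced stand-in for ISpecificationParser: only the two methods the
// registry needs. The full interface also declares parse(), validate(), etc.
class IParserInfo {
public:
    virtual ~IParserInfo() = default;
    virtual std::string get_name() const = 0;
    virtual std::vector<std::string> get_extensions() const = 0;
};

// Hypothetical registry that a ParserFactory could delegate to.
class ParserRegistry {
public:
    // Takes ownership and indexes the parser under each extension it reports.
    void register_parser(std::unique_ptr<IParserInfo> parser) {
        for (const auto& ext : parser->get_extensions()) {
            by_extension_[normalize(ext)] = parser.get();
        }
        owned_.push_back(std::move(parser));
    }

    // Returns nullptr when no registered parser claims the extension.
    IParserInfo* get_parser(const std::string& extension) const {
        const auto it = by_extension_.find(normalize(extension));
        return it == by_extension_.end() ? nullptr : it->second;
    }

    // Extracts the extension from a path, e.g. "specs/a.YAML" -> "yaml".
    static std::string extension_of(const std::string& path) {
        const auto dot = path.find_last_of('.');
        const auto slash = path.find_last_of("/\\");
        if (dot == std::string::npos) return "";
        if (slash != std::string::npos && dot < slash) return "";
        return normalize(path.substr(dot + 1));
    }

private:
    // Lower-cases and strips a leading dot so ".YAML" and "yaml" match.
    static std::string normalize(std::string ext) {
        if (!ext.empty() && ext.front() == '.') ext.erase(0, 1);
        std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char ch) {
            return static_cast<char>(std::tolower(ch));
        });
        return ext;
    }

    std::vector<std::unique_ptr<IParserInfo>> owned_;
    std::map<std::string, IParserInfo*> by_extension_;
};

// Toy parser used only to exercise the registry.
class FakeYamlParser : public IParserInfo {
public:
    std::string get_name() const override { return "yaml"; }
    std::vector<std::string> get_extensions() const override { return {".yaml", ".yml"}; }
};
```

Under these assumptions, auto-detection reduces to `registry.get_parser(ParserRegistry::extension_of(file_path))`, with a null result signalling an unsupported format. Content sniffing for files without a recognizable extension would be a separate step.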
SPECIFICATION EXAMPLE (YAML)
# simple_filter.yaml
module:
  name: ButterworthLowPass
  type: dsp_filter
  level: L0  # Kernel level
parameters:
  - name: cutoff_frequency
    type: float
    range: [20.0, 20000.0]
    default: 1000.0
    unit: Hz
    modulation: continuous
  - name: resonance
    type: float
    range: [0.1, 10.0]
    default: 0.707
    unit: Q
    modulation: continuous
processing:
  algorithm: butterworth_iir
  order: 2
  topology: biquad_cascade
  precision: double
optimization:
  simd: [sse, avx, neon]
  vectorization: auto
  cache_friendly: true
metadata:
  author: AudioLab Generator
  license: MIT
  tags: [filter, lowpass, butterworth]
IR (INTERMEDIATE REPRESENTATION)
struct IR {
    struct Module {
        std::string name;
        std::string type;   // dsp_filter, dsp_oscillator, dsp_effect, etc.
        std::string level;  // L0, L1, L2, L3
    };
    struct Parameter {
        std::string name;
        std::string type;        // float, int, bool, enum
        Range range;
        Value default_value;
        std::string unit;
        std::string modulation;  // continuous, discrete, none
    };
    struct Processing {
        std::string algorithm;
        int order;
        std::string topology;
        std::string precision;   // float, double, fixed
    };
    struct Optimization {
        std::vector<std::string> simd;
        std::string vectorization;  // auto, manual, none
        bool cache_friendly;
    };
    Module module;
    std::vector<Parameter> parameters;
    Processing processing;
    Optimization optimization;
    Metadata metadata;
};
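The IR refers to `Range` and `Value` without defining them. The sketch below shows one plausible shape, under stated assumptions: `Range` holds `double` bounds named `min` and `max` (the usage example reads `param.range.min` and `param.range.max`), and `Value` is a `std::variant` with one alternative per parameter type listed in the IR. The `contains` and `type_name_of` helpers are hypothetical additions.

```cpp
#include <string>
#include <variant>

// Assumed shape: the field names match how the usage example reads them,
// but the numeric type is a guess.
struct Range {
    double min;
    double max;
    // Inclusive bounds check.
    bool contains(double v) const { return v >= min && v <= max; }
};

// Assumed: one alternative per IR parameter type
// (float, int, bool, and enum stored as its string label).
using Value = std::variant<double, int, bool, std::string>;

// Returns the IR type name that matches the alternative held by `v`.
std::string type_name_of(const Value& v) {
    struct Visitor {
        std::string operator()(double) const { return "float"; }
        std::string operator()(int) const { return "int"; }
        std::string operator()(bool) const { return "bool"; }
        std::string operator()(const std::string&) const { return "enum"; }
    };
    return std::visit(Visitor{}, v);
}
```

With a tagged value type like this, the "default value matches the declared type" check becomes a string comparison between `Parameter::type` and `type_name_of(default_value)`.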
SEMANTIC VALIDATION
Checks Performed
- Type Consistency
  - Parameters have valid types
  - Ranges are compatible with their types
  - Default values fall within the range
- Algorithm Validity
  - Algorithm exists in 03_ALGORITHM_SPEC
  - Required parameters are present
  - Topology is compatible with the algorithm
- Optimization Compatibility
  - SIMD targets are valid (sse, avx, neon)
  - Vectorization is feasible for the algorithm
  - Precision is supported
- Constraint Satisfaction
  - Dependencies are resolvable
  - No circular dependencies
  - Resource constraints are realistic
- Naming Conventions
  - Names follow standards (snake_case, etc.)
  - No reserved keywords
  - Unique identifiers
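A few of the checks listed above (range ordering, default within range, snake_case names, unique identifiers) can be sketched as a single pass over the parameters. This is an illustrative sketch only: `ParamSpec`, `is_snake_case`, and `validate_params` are hypothetical names, the record is a trimmed-down stand-in for the IR parameter, and the snake_case rule `[a-z][a-z0-9_]*` is an assumption about the naming standard.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, trimmed-down parameter record for this sketch only.
struct ParamSpec {
    std::string name;
    double range_min;
    double range_max;
    double default_value;
};

// True when `s` is non-empty lower snake_case: [a-z][a-z0-9_]*.
bool is_snake_case(const std::string& s) {
    if (s.empty() || !std::islower(static_cast<unsigned char>(s[0]))) return false;
    for (unsigned char c : s) {
        if (!(std::islower(c) || std::isdigit(c) || c == '_')) return false;
    }
    return true;
}

// Collects human-readable problems instead of stopping at the first one.
std::vector<std::string> validate_params(const std::vector<ParamSpec>& params) {
    std::vector<std::string> errors;
    for (std::size_t i = 0; i < params.size(); ++i) {
        const ParamSpec& p = params[i];
        if (!is_snake_case(p.name)) {
            errors.push_back(p.name + ": name is not snake_case");
        }
        if (p.range_min > p.range_max) {
            errors.push_back(p.name + ": range minimum is greater than maximum");
        } else if (p.default_value < p.range_min || p.default_value > p.range_max) {
            // Only meaningful when the range itself is well-formed.
            errors.push_back(p.name + ": default value is outside the range");
        }
        // Unique identifiers: compare against every earlier parameter.
        for (std::size_t j = 0; j < i; ++j) {
            if (params[j].name == p.name) {
                errors.push_back(p.name + ": duplicate parameter name");
                break;
            }
        }
    }
    return errors;
}
```

Accumulating every problem in one pass lets the error reporter show all issues in a specification at once rather than forcing a fix-and-rerun loop.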
ERROR REPORTING
Error Example
ERROR: Invalid parameter range
File: specifications/simple_filter.yaml
Line: 14
Column: 12
13 |   - name: cutoff_frequency
14 |     range: [20000.0, 20.0]
                ^^^^^^^^^^^^^^^
15 |     default: 1000.0
Error: Range minimum (20000.0) is greater than maximum (20.0)
Suggestion: Swap range values to [20.0, 20000.0]
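A report like the one above can be produced from a line, a column, and a span length. The following is a minimal sketch under stated assumptions: `Diagnostic` and `render` are hypothetical names, columns are taken to be 1-based, and exactly one line of context is shown on each side of the offending line.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical diagnostic record; field names are illustrative.
struct Diagnostic {
    std::string title;
    std::string file;
    std::size_t line;    // 1-based line number
    std::size_t column;  // 1-based column where the offending span starts
    std::size_t length;  // number of characters to underline
    std::string message;
};

// Renders the diagnostic with one line of context before and after.
std::string render(const Diagnostic& d, const std::vector<std::string>& source_lines) {
    std::ostringstream out;
    out << "ERROR: " << d.title << "\n"
        << "File: " << d.file << "\n"
        << "Line: " << d.line << "\n"
        << "Column: " << d.column << "\n";
    const std::size_t first = d.line > 1 ? d.line - 1 : 1;
    const std::size_t last =
        d.line < source_lines.size() ? d.line + 1 : source_lines.size();
    for (std::size_t n = first; n <= last; ++n) {
        const std::string gutter = std::to_string(n) + " | ";
        out << gutter << source_lines[n - 1] << "\n";
        if (n == d.line) {
            // Pad past the gutter and up to the column, then underline the span.
            out << std::string(gutter.size() + d.column - 1, ' ')
                << std::string(d.length, '^') << "\n";
        }
    }
    out << "Error: " << d.message << "\n";
    return out.str();
}
```

The caret padding is derived from the gutter width, so the underline stays aligned when line numbers gain digits.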
Error Categories
- Syntax Errors: Malformed YAML/JSON/XML
- Semantic Errors: Invalid algorithm, missing required fields
- Type Errors: Type mismatch, invalid range
- Constraint Errors: Unsatisfiable constraints
- Warning: Suboptimal but valid configuration
CUSTOM DSL EXAMPLE
// math_dsl.dsp - Custom DSL for DSP math expressions
filter butterworth_lpf {
    parameters {
        float fc = 1000.0 [20.0..20000.0] "Cutoff Frequency (Hz)";
        float q = 0.707 [0.1..10.0] "Resonance (Q)";
    }
    processing {
        // Biquad coefficient calculation, normalized by a0
        omega = 2.0 * PI * fc / sample_rate;
        alpha = sin(omega) / (2.0 * q);
        a0 = 1.0 + alpha;
        b0 = (1.0 - cos(omega)) / (2.0 * a0);
        b1 = (1.0 - cos(omega)) / a0;
        b2 = (1.0 - cos(omega)) / (2.0 * a0);
        a1 = -2.0 * cos(omega) / a0;
        a2 = (1.0 - alpha) / a0;
        // Direct Form I difference equation
        output = b0*x[n] + b1*x[n-1] + b2*x[n-2]
               - a1*y[n-1] - a2*y[n-2];
    }
    optimize {
        simd: sse, avx;
        vectorize: auto;
    }
}
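For comparison, the `processing` block can be written out in C++. This is a sketch of what generated code might compute, not the generator's actual output: `BiquadCoefficients`, `BiquadState`, `lowpass_coefficients`, and `process_sample` are hypothetical names. It uses the low-pass formulas from the widely used Audio EQ Cookbook and divides every coefficient by a0 = 1 + alpha, which the difference equation requires for unity gain at DC.

```cpp
#include <cmath>

struct BiquadCoefficients {
    double b0, b1, b2;  // feed-forward coefficients
    double a1, a2;      // feedback coefficients, already divided by a0
};

// Low-pass biquad coefficients, normalized so that a0 == 1.
BiquadCoefficients lowpass_coefficients(double fc, double q, double sample_rate) {
    const double pi = std::acos(-1.0);
    const double omega = 2.0 * pi * fc / sample_rate;
    const double alpha = std::sin(omega) / (2.0 * q);
    const double cos_omega = std::cos(omega);
    const double a0 = 1.0 + alpha;
    return {
        (1.0 - cos_omega) / 2.0 / a0,  // b0
        (1.0 - cos_omega) / a0,        // b1
        (1.0 - cos_omega) / 2.0 / a0,  // b2
        -2.0 * cos_omega / a0,         // a1
        (1.0 - alpha) / a0,            // a2
    };
}

// Previous two inputs and outputs, as needed by the difference equation.
struct BiquadState {
    double x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0;
};

// One sample of
// y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
double process_sample(const BiquadCoefficients& c, BiquadState& s, double x) {
    const double y = c.b0 * x + c.b1 * s.x1 + c.b2 * s.x2 - c.a1 * s.y1 - c.a2 * s.y2;
    s.x2 = s.x1;
    s.x1 = x;
    s.y2 = s.y1;
    s.y1 = y;
    return y;
}
```

A quick sanity check for any low-pass design is the DC gain `(b0 + b1 + b2) / (1 + a1 + a2)`, which should equal 1.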
USAGE EXAMPLE
#include <iostream>
#include "parser_factory.hpp"
// TemplateEngine comes from the next pipeline stage; include its header as well.

int main() {
    // Auto-detect format and parse
    try {
        IR ir = ParserFactory::parse_auto("specs/simple_filter.yaml");
        std::cout << "Module: " << ir.module.name << "\n";
        std::cout << "Parameters: " << ir.parameters.size() << "\n";
        for (const auto& param : ir.parameters) {
            std::cout << " - " << param.name
                      << " [" << param.range.min << ".." << param.range.max << "]\n";
        }
        // Pass IR to next stage (Template Engine)
        TemplateEngine engine;
        engine.process(ir);
    } catch (const ParserException& e) {
        std::cerr << "Parse error: " << e.what() << "\n";
        std::cerr << "Location: " << e.file() << ":" << e.line() << "\n";
        return 1;
    }
    return 0;
}
DEVELOPMENT TASKS
- Implement the ISpecificationParser interface
- Implement YAMLParser (yaml-cpp)
- Implement JSONParser (nlohmann/json)
- Implement XMLParser (pugixml)
- Implement the DSL parser (custom lexer/parser)
- Implement the AST builder
- Implement the IR generator
- Implement the semantic validator
- Implement the error reporting system with source context
- Implement ParserFactory with auto-detection
- Unit tests for each parser (>95% coverage)
- Malformed-input tests (robustness)
- Performance tests (large files, >1MB)
- Fuzzing tests for security
- Documentation of supported formats
- Example library of specifications
PERFORMANCE TARGETS
- ✅ Small spec (<10KB): <10ms parsing time
- ✅ Medium spec (10-100KB): <50ms parsing time
- ✅ Large spec (>1MB): <500ms parsing time
- ✅ Memory usage: <2x file size
- ✅ Error detection: 100% syntax errors caught
- ✅ Test coverage: >95%
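The parsing-time targets could be enforced in a performance test with a small helper. This is a sketch under assumptions: `budget_for` and `within_budget` are hypothetical names, 1 KB is taken as 1024 bytes, and specs between 100 KB and 1 MB get no budget because the targets do not state one for that range.

```cpp
#include <chrono>
#include <cstddef>
#include <optional>

// Returns the parsing-time budget for a spec of `bytes` size, or nullopt
// for sizes the stated targets do not cover.
std::optional<std::chrono::milliseconds> budget_for(std::size_t bytes) {
    using std::chrono::milliseconds;
    constexpr std::size_t kKB = 1024;  // assumption: binary kilobytes
    constexpr std::size_t kMB = 1024 * kKB;
    if (bytes < 10 * kKB) return milliseconds(10);    // small spec
    if (bytes <= 100 * kKB) return milliseconds(50);  // medium spec
    if (bytes > kMB) return milliseconds(500);        // large spec
    return std::nullopt;  // 100 KB to 1 MB: no target stated
}

// Runs `parse` once and reports whether it finished inside the budget.
// Sizes without a budget always pass.
template <typename Fn>
bool within_budget(std::size_t bytes, Fn&& parse) {
    const auto budget = budget_for(bytes);
    const auto start = std::chrono::steady_clock::now();
    parse();
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return !budget || elapsed < *budget;
}
```

A single timed run is noisy; a real benchmark would likely repeat the measurement and compare a median or percentile against the budget.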
EXTERNAL DEPENDENCIES
- yaml-cpp: YAML parsing (https://github.com/jbeder/yaml-cpp)
- nlohmann/json: JSON parsing (https://github.com/nlohmann/json)
- pugixml: XML parsing (https://github.com/zeux/pugixml)
- fmt: String formatting (https://github.com/fmtlib/fmt)
REFERENCES
- YAML Spec: https://yaml.org/spec/1.2/spec.html
- JSON Spec: https://www.json.org/json-en.html
- XML Spec: https://www.w3.org/TR/xml/
- Compiler Design: "Engineering a Compiler" - Cooper & Torczon
TESTING
# Run all parser tests
cd tests
cmake -B build -S ..
cmake --build build
./build/test_yaml_parser
./build/test_json_parser
./build/test_semantic_validation
# Run with coverage
cmake -B build -S .. -DCMAKE_BUILD_TYPE=Coverage
cmake --build build
ctest --test-dir build --output-on-failure
This is the system's entry module. The quality of parsing determines the quality of everything generated downstream.