Validating with Constraints
This guide explains how to define Metaschema constraints and validate content against them.
Metaschema constraints provide validation beyond schema structure:
- Cardinality - Required fields, occurrence limits
- Allowed values - Enumerated restrictions
- Patterns - Regex-based validation
- Uniqueness - Key constraints
- Cross-references - Reference integrity
- Custom rules - Metapath-based assertions
Restrict values to a defined set:
<define-flag name="status">
<constraint>
<allowed-values allow-other="no">
<enum value="active">Active status</enum>
<enum value="inactive">Inactive status</enum>
<enum value="pending">Pending status</enum>
</allowed-values>
</constraint>
</define-flag>
Validate against a regular expression:
<define-flag name="code">
<constraint>
<matches regex="[A-Z]{2}-[0-9]{4}" />
</constraint>
</define-flag>
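To see which values an expression like the one above accepts, the same regular expression can be tried with java.util.regex. This is only a sketch: it assumes the constraint is applied to the whole value (a full match), and the PatternSketch class and isValidCode helper are hypothetical names, not part of the Metaschema API.

```java
import java.util.regex.Pattern;

final class PatternSketch {
  // Same expression as the example above: two uppercase letters, a hyphen, four digits.
  private static final Pattern CODE = Pattern.compile("[A-Z]{2}-[0-9]{4}");

  // Assumption: the whole value must match, so matches() is used rather than find().
  static boolean isValidCode(String value) {
    return CODE.matcher(value).matches();
  }

  public static void main(String[] args) {
    String[] candidates = {"AB-1234", "ab-1234", "AB-12345", "ABC-1234"};
    for (String candidate : candidates) {
      System.out.println(candidate + " -> " + isValidCode(candidate));
    }
  }
}
```

Only the first candidate satisfies the expression under the full-match assumption; the others fail on letter case, digit count, or prefix length.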
Control occurrence requirements:
<define-assembly name="catalog">
<model>
<field ref="title">
<constraint>
<has-cardinality min-occurs="1" max-occurs="1" />
</constraint>
</field>
</model>
</define-assembly>
Ensure unique values within a scope:
<define-assembly name="catalog">
<constraint>
<index name="control-id-index" target="//control">
<key-field target="@id" />
</index>
</constraint>
</define-assembly>
Validate references point to existing values:
<define-assembly name="profile">
<constraint>
<index-has-key id="control-reference-check"
name="control-id-index"
target="//include-controls/with-id">
<key-field target="." />
</index-has-key>
</constraint>
</define-assembly>
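Conceptually, an index collects the key of every targeted item and rejects duplicates, while a reference check looks each referencing value up in that index. The following sketch models both steps with plain Java collections. It is an illustration of the idea only, not how the framework implements it, and the IndexSketch class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class IndexSketch {
  // Step 1: build the index from the key of each target item.
  // A key that is already present is reported as a uniqueness violation.
  static Set<String> buildIndex(List<String> keys, List<String> violations) {
    Set<String> index = new LinkedHashSet<>();
    for (String key : keys) {
      if (!index.add(key)) {
        violations.add("duplicate key: " + key);
      }
    }
    return index;
  }

  // Step 2: every reference must resolve to a key in the index.
  static List<String> findDanglingReferences(Set<String> index, List<String> references) {
    List<String> dangling = new ArrayList<>();
    for (String reference : references) {
      if (!index.contains(reference)) {
        dangling.add(reference);
      }
    }
    return dangling;
  }

  public static void main(String[] args) {
    List<String> violations = new ArrayList<>();
    Set<String> index = buildIndex(Arrays.asList("ac-1", "ac-2", "ac-1"), violations);
    System.out.println("Index keys: " + index);
    System.out.println("Uniqueness violations: " + violations);
    System.out.println("Dangling references: "
        + findDanglingReferences(index, Arrays.asList("ac-2", "ac-9")));
  }
}
```

The index has to be complete before references are checked, which is why reference checks are naturally deferred until all indexed items have been seen.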
Custom Metapath-based validation:
<define-assembly name="metadata">
<constraint>
<expect test="last-modified >= published"
message="Last modified must not be earlier than the published date" />
</constraint>
</define-assembly>
Enable constraint validation while loading data:
import dev.metaschema.databind.IBindingContext;
import dev.metaschema.databind.io.DeserializationFeature;
import dev.metaschema.databind.io.IBoundLoader;
import java.nio.file.Path;
IBindingContext context = IBindingContext.newInstance();
IBoundLoader loader = context.newBoundLoader();
// Enable constraint validation during loading
loader.enableFeature(DeserializationFeature.DESERIALIZE_VALIDATE_CONSTRAINTS);
Object model = loader.load(Path.of("data.json"));
Validate a document directly and report any findings:
import dev.metaschema.core.model.validation.IValidationResult;
import dev.metaschema.databind.IBindingContext;
import java.net.URI;
import java.nio.file.Path;
IBindingContext context = IBindingContext.newInstance();
URI target = Path.of("data.json").toUri();
IValidationResult result = context.validateWithConstraints(target, null);
if (!result.isPassing()) {
result.getFindings().forEach(finding -> {
System.err.println(finding.getSeverity() + ": " +
finding.getMessage() + " at " + finding.getLocation());
});
}
Inspect the validation result and handle findings by severity:
import dev.metaschema.core.model.constraint.IConstraint.Level;
import dev.metaschema.core.model.validation.IValidationFinding;
import dev.metaschema.core.model.validation.IValidationResult;
import dev.metaschema.databind.IBindingContext;
import java.net.URI;
import java.nio.file.Path;
IBindingContext context = IBindingContext.newInstance();
URI target = Path.of("data.json").toUri();
IValidationResult result = context.validateWithConstraints(target, null);
// Check overall status
if (result.isPassing()) {
System.out.println("Validation passed");
}
// Process findings by severity
for (IValidationFinding finding : result.getFindings()) {
Level severity = finding.getSeverity();
if (severity == Level.CRITICAL) {
handleCritical(finding);
} else if (severity == Level.ERROR) {
handleError(finding);
} else if (severity == Level.WARNING) {
handleWarning(finding);
} else {
logInfo(finding);
}
}
The framework provides FindingCollectingConstraintValidationHandler for collecting validation findings:
import dev.metaschema.core.model.constraint.FindingCollectingConstraintValidationHandler;
import dev.metaschema.core.model.constraint.IConstraint.Level;
import dev.metaschema.core.model.validation.IValidationResult;
// The handler implements IValidationResult
FindingCollectingConstraintValidationHandler handler =
new FindingCollectingConstraintValidationHandler();
// After validation completes, check results
if (!handler.isPassing()) {
handler.getFindings().forEach(finding -> {
System.err.println(finding.getSeverity() + ": " +
finding.getMessage());
});
}
// Check highest severity level
Level highestSeverity = handler.getHighestSeverity();
if (highestSeverity.ordinal() >= Level.ERROR.ordinal()) {
System.err.println("Validation failed with errors");
}
Constraint validation can be parallelized using ValidationConfig. This is useful for large documents with many constraints.
By default, constraints are evaluated sequentially:
import dev.metaschema.core.model.constraint.DefaultConstraintValidator;
import dev.metaschema.core.model.constraint.ValidationConfig;
// handler: a validation handler, e.g. the FindingCollectingConstraintValidationHandler shown above
// Sequential is the default - no configuration needed
DefaultConstraintValidator validator =
new DefaultConstraintValidator(handler);
// Or explicitly use the SEQUENTIAL constant
DefaultConstraintValidator explicitValidator =
new DefaultConstraintValidator(handler, ValidationConfig.SEQUENTIAL);
Create a configuration with an internal thread pool:
import dev.metaschema.core.model.constraint.DefaultConstraintValidator;
import dev.metaschema.core.model.constraint.ValidationConfig;
// Use try-with-resources to ensure thread pool cleanup
try (ValidationConfig config = ValidationConfig.withThreads(4)) {
DefaultConstraintValidator validator =
new DefaultConstraintValidator(handler, config);
validator.validate(rootItem, dynamicContext);
validator.finalizeValidation(dynamicContext);
}
If your application already manages a thread pool, you can provide it directly:
import dev.metaschema.core.model.constraint.DefaultConstraintValidator;
import dev.metaschema.core.model.constraint.ValidationConfig;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
ExecutorService myPool = new ForkJoinPool(8);
// The config does NOT shut down your executor on close
try (ValidationConfig config = ValidationConfig.withExecutor(myPool)) {
DefaultConstraintValidator validator =
new DefaultConstraintValidator(handler, config);
validator.validate(rootItem, dynamicContext);
validator.finalizeValidation(dynamicContext);
}
// Your pool is still running - shut it down when you're done
myPool.shutdown();
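The difference between the two configurations above is who owns the thread pool. A minimal sketch of that ownership rule, using a hypothetical PoolHandle class (not the library's ValidationConfig, whose internals may differ), could look like this:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolHandle implements AutoCloseable {
  private final ExecutorService executor;
  private final boolean owned;

  private PoolHandle(ExecutorService executor, boolean owned) {
    this.executor = executor;
    this.owned = owned;
  }

  // The handle creates the pool, so it is responsible for shutting it down.
  static PoolHandle withThreads(int threads) {
    return new PoolHandle(Executors.newFixedThreadPool(threads), true);
  }

  // The caller created the pool, so the handle only borrows it.
  static PoolHandle withExecutor(ExecutorService executor) {
    return new PoolHandle(executor, false);
  }

  ExecutorService executor() {
    return executor;
  }

  @Override
  public void close() {
    if (owned) {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    ExecutorService shared = Executors.newFixedThreadPool(2);
    try (PoolHandle borrowed = PoolHandle.withExecutor(shared)) {
      borrowed.executor().execute(() -> System.out.println("task ran on the shared pool"));
    }
    System.out.println("Shared pool shut down by close()? " + shared.isShutdown());
    shared.shutdown(); // the caller remains responsible for its own pool
  }
}
```

Keeping ownership explicit means a borrowed pool can be reused for other work after validation finishes, while an internally created pool never leaks threads.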
The validation pipeline supports event-based instrumentation through the ValidationEventListener interface. This allows you to measure the performance of individual constraints, let-statements, and validation phases.
The TimingCollector class implements ValidationEventListener and records hierarchical timing data:
import dev.metaschema.core.model.constraint.DefaultConstraintValidator;
import dev.metaschema.core.model.constraint.TimingCollector;
import dev.metaschema.core.model.constraint.TimingRecord;
import dev.metaschema.core.model.constraint.ValidationConfig;
import dev.metaschema.core.model.constraint.ValidationPhase;
import java.util.Map;
// Create a timing collector
TimingCollector timings = new TimingCollector();
// Wire it into the validation configuration
try (ValidationConfig config = ValidationConfig.withThreads(4)
.withListener(timings)) {
DefaultConstraintValidator validator =
new DefaultConstraintValidator(handler, config);
validator.validate(rootItem, dynamicContext);
validator.finalizeValidation(dynamicContext);
}
// Query overall validation timing
TimingRecord overall = timings.getValidationTiming();
if (overall != null) {
System.out.printf("Total validation: %.3f ms (%d invocations)%n",
overall.getTotalTimeNs() / 1_000_000.0,
overall.getCount());
}
// Query per-phase timing
for (Map.Entry<ValidationPhase, TimingRecord> entry
: timings.getPhaseTimings().entrySet()) {
TimingRecord record = entry.getValue();
System.out.printf(" %s: %.3f ms%n",
entry.getKey(),
record.getTotalTimeNs() / 1_000_000.0);
}
// Find the slowest constraints
timings.getConstraintTimings().entrySet().stream()
.sorted((a, b) -> Long.compare(
b.getValue().getTotalTimeNs(),
a.getValue().getTotalTimeNs()))
.limit(10)
.forEach(entry -> {
TimingRecord record = entry.getValue();
System.out.printf(" %s: %.3f ms (%d evals, min=%.3f, max=%.3f)%n",
entry.getKey(),
record.getTotalTimeNs() / 1_000_000.0,
record.getCount(),
record.getMinTimeNs() / 1_000_000.0,
record.getMaxTimeNs() / 1_000_000.0);
});
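The count, total, minimum, and maximum values queried above are simple running aggregates over individual measurements. The sketch below shows one way such a record could be accumulated. TimingStats is a hypothetical stand-in written for illustration; the actual TimingRecord class may be implemented differently.

```java
final class TimingStats {
  private long count;
  private long totalNs;
  private long minNs = Long.MAX_VALUE;
  private long maxNs;

  // Synchronized so that measurements from several worker threads can be merged safely.
  synchronized void record(long elapsedNs) {
    count++;
    totalNs += elapsedNs;
    minNs = Math.min(minNs, elapsedNs);
    maxNs = Math.max(maxNs, elapsedNs);
  }

  synchronized long getCount() {
    return count;
  }

  synchronized long getTotalTimeNs() {
    return totalNs;
  }

  // Reports 0 rather than Long.MAX_VALUE when nothing has been recorded yet.
  synchronized long getMinTimeNs() {
    return count == 0 ? 0 : minNs;
  }

  synchronized long getMaxTimeNs() {
    return maxNs;
  }

  public static void main(String[] args) {
    TimingStats stats = new TimingStats();
    for (int i = 0; i < 3; i++) {
      long start = System.nanoTime();
      Math.sqrt(i); // stand-in for evaluating one constraint
      stats.record(System.nanoTime() - start);
    }
    System.out.printf("%d evals, total=%.3f ms%n",
        stats.getCount(), stats.getTotalTimeNs() / 1_000_000.0);
  }
}
```

A large gap between the minimum and maximum for one constraint usually points at a few expensive target nodes rather than a uniformly slow expression.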
The ValidationPhase enum identifies three distinct phases:
| Phase | Description |
|---|---|
| SCHEMA_VALIDATION | Schema validation against XML Schema or JSON Schema |
| CONSTRAINT_VALIDATION | Evaluation of Metaschema constraints |
| FINALIZATION | Post-validation finalization (cross-document constraints, index resolution) |
Implement ValidationEventListener to create custom instrumentation:
import dev.metaschema.core.model.constraint.ValidationEventListener;
import dev.metaschema.core.model.constraint.ValidationPhase;
import dev.metaschema.core.model.constraint.IConstraint;
import dev.metaschema.core.model.constraint.ILet;
import dev.metaschema.core.metapath.item.node.INodeItem;
import java.net.URI;
public class LoggingEventListener implements ValidationEventListener {
@Override
public void beforeValidation(URI document) {
System.out.println("Starting validation of " + document);
}
@Override
public void afterValidation(URI document) {
System.out.println("Finished validation of " + document);
}
@Override
public void beforePhase(ValidationPhase phase) {
System.out.println(" Entering phase: " + phase);
}
@Override
public void afterPhase(ValidationPhase phase) {
System.out.println(" Completed phase: " + phase);
}
@Override
public void beforeConstraintEvaluation(IConstraint constraint,
INodeItem target) {
// Called for each constraint evaluation
}
@Override
public void afterConstraintEvaluation(IConstraint constraint,
INodeItem target) {
// Called after each constraint evaluation
}
@Override
public void beforeLetEvaluation(ILet let) {
// Called before each let-statement binding
}
@Override
public void afterLetEvaluation(ILet let) {
// Called after each let-statement binding
}
}
When no listener is configured, ValidationConfig uses NoOpValidationEventListener, which provides empty implementations for all callbacks. This ensures zero overhead from instrumentation when timing is not needed.
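Falling back to a do-nothing listener is an instance of the null object pattern: the validator can invoke callbacks unconditionally instead of checking for null at every call site. The sketch below illustrates the pattern with a deliberately simplified, hypothetical Listener interface. It is not the ValidationEventListener API shown earlier.

```java
final class ListenerSketch {
  // Hypothetical, reduced listener interface used only for this illustration.
  interface Listener {
    void beforeEvaluation(String constraintId);

    void afterEvaluation(String constraintId);
  }

  // Null object: every callback is an empty method.
  static final Listener NO_OP = new Listener() {
    @Override
    public void beforeEvaluation(String constraintId) {
      // intentionally empty
    }

    @Override
    public void afterEvaluation(String constraintId) {
      // intentionally empty
    }
  };

  private final Listener listener;

  // A missing listener is replaced by NO_OP once, at construction time.
  ListenerSketch(Listener listener) {
    this.listener = listener == null ? NO_OP : listener;
  }

  Listener listener() {
    return listener;
  }

  // Callbacks are always invoked; with NO_OP they simply do nothing.
  boolean evaluate(String constraintId, boolean outcome) {
    listener.beforeEvaluation(constraintId);
    try {
      return outcome; // stand-in for the real constraint evaluation
    } finally {
      listener.afterEvaluation(constraintId);
    }
  }

  public static void main(String[] args) {
    ListenerSketch sketch = new ListenerSketch(null);
    System.out.println("Uses NO_OP: " + (sketch.listener() == NO_OP));
    System.out.println("Result: " + sketch.evaluate("example-constraint", true));
  }
}
```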
Timing data can be included in SARIF output using the --sarif-timing CLI flag or by programmatically setting a TimingCollector on SarifValidationHandler. See the SARIF Output guide for details.
Control validation behavior with levels:
<constraint>
<!-- Causes validation failure -->
<expect level="ERROR" test="@id" message="ID is required" />
<!-- Logged but doesn't fail -->
<expect level="WARNING" test="title" message="Title recommended" />
<!-- Informational only -->
<expect level="INFORMATIONAL" test="version" message="Consider adding version" />
</constraint>
Define constraints in separate files:
<!-- main-module.xml -->
<metaschema>
<import-constraints href="additional-constraints.xml" />
</metaschema>
<!-- additional-constraints.xml -->
<constraints>
<context>
<metapath>/catalog//control</metapath>
<constraints>
<expect test="title" message="Controls must have titles" />
</constraints>
</context>
</constraints>
Require a field to be present:
<constraint>
<expect test="title" message="Title is required" />
</constraint>
Require a field only when a condition holds:
<constraint>
<expect test="not(@type = 'formal') or description"
message="Formal items require descriptions" />
</constraint>
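The test above relies on the logical identity that "if A then B" is equivalent to "not(A) or B". A small truth-table sketch makes the behavior explicit. The ImplicationSketch class and its parameter names are hypothetical and only mirror the example's wording.

```java
final class ImplicationSketch {
  // "Formal items require descriptions", written the same way as the test expression:
  // not(formal) or hasDescription
  static boolean passes(boolean formal, boolean hasDescription) {
    return !formal || hasDescription;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    for (boolean formal : values) {
      for (boolean hasDescription : values) {
        System.out.println("formal=" + formal
            + ", hasDescription=" + hasDescription
            + " -> passes=" + passes(formal, hasDescription));
      }
    }
  }
}
```

Only the combination of a formal item without a description fails; items that are not formal pass regardless of whether a description is present.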
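Constrain a value to a range:
<constraint>
<!-- '<' must be escaped as &lt; inside an XML attribute value -->
<expect test="@count >= 0 and @count &lt;= 100"
message="Count must be between 0 and 100" />
</constraint>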
Compare two dates:
<constraint>
<expect test="end-date >= start-date"
message="End date must not be earlier than start date" />
</constraint>
Require unique identifiers among items:
<constraint>
<is-unique target="item" name="unique-item-id">
<key-field target="@id" />
</is-unique>
</constraint>
| Level | Meaning | Effect |
|---|---|---|
| CRITICAL | Severe error | Document unusable |
| ERROR | Constraint violation | Validation fails |
| WARNING | Potential issue | Logged, doesn't fail |
| INFORMATIONAL | Note | Logged only |
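The table implies a simple pass/fail rule: validation fails as soon as the most severe finding is ERROR or higher. The sketch below expresses that rule with a hypothetical Severity enum that mirrors the four rows of the table. It is not the library's Level type, which may define additional values.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SeveritySketch {
  // Declared from least to most severe so that compareTo() follows severity.
  enum Severity { INFORMATIONAL, WARNING, ERROR, CRITICAL }

  // With no findings at all, the lowest level is reported.
  static Severity highest(List<Severity> findings) {
    return findings.isEmpty() ? Severity.INFORMATIONAL : Collections.max(findings);
  }

  // Passing means that nothing at ERROR level or above was found.
  static boolean isPassing(List<Severity> findings) {
    return highest(findings).compareTo(Severity.ERROR) < 0;
  }

  public static void main(String[] args) {
    List<Severity> onlyWarnings = Arrays.asList(Severity.WARNING, Severity.INFORMATIONAL);
    List<Severity> withError = Arrays.asList(Severity.WARNING, Severity.ERROR);
    System.out.println("Only warnings passes: " + isPassing(onlyWarnings));
    System.out.println("With error passes: " + isPassing(withError));
  }
}
```

Comparing enum constants with compareTo() depends on declaration order, so any reordering of the levels would silently change the rule.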
- Use appropriate levels - Not every issue is an ERROR
- Provide clear messages - Include context and fix hints
- Validate early - Check on load, not later
- Handle all severities - Don't ignore warnings
- Test constraints - Validate with known-bad data
Continue learning about the Metaschema Java Tools with these related guides:
- Executing Metapath - Write constraint expressions
- Reading & Writing Data - Load data for validation
- Generating Schemas - Generate schema validators
- SARIF Output - Machine-readable validation results with timing
- Using the CLI - Command-line validation

