Deployment
FastAPI inference server
A FastAPI-based inference server is included for containerized scoring.
docker build -t hugiml-core:latest -f docker/Dockerfile .
docker run -p 8080:8080 -v /path/to/models:/models hugiml-core:latest
Example request:
curl -s -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"instances": [{"age": 35, "savings": "moderate"}]}'
Kubernetes
A starter manifest is provided in kubernetes/deployment.yaml. Review CPU/memory limits, model volume paths, network policy, and secrets before production use.
Production checklist
Pin
hugiml-coreand dependency versions.Save the trained model with
save_modeland record the model schema version.Export a model card and audit artifact.
Capture calibration metrics and drift thresholds.
Exercise prediction-time schema validation against representative production payloads.
Configure monitoring, latency budgets, and rollback procedures.
Validate any pattern-pruning actions and retain the pruning audit trail.