
Context and Challenge
Redesigned the analysis of propellant materials by replacing lengthy lab cycles with machine learning models that predict mechanical properties and particle size distribution, enabling real-time insights, faster R&D, and smarter defense innovation.
Advanced Centre for Energetic Materials
Ministry of Defence, Government of India
Intern 2023-2024


Solution Design
- Models: Random Forest selected after evaluating multiple regressors; handles non-linearities and exposes feature importance for scientist trust. (HTPB: 8→3, AP-PSD: 12→3)
- Evaluation: 80/20 split; report test-set R² and MAE with predicted-vs-actual plots to demonstrate generalization (not just fit).
- Serving: Flask app collects inputs, returns JSON predictions, and renders results with the model loaded in memory for low latency.
- Explainability: Feature importance surfaced in the UI to show key drivers behind each prediction.
- Ops readiness: Versioned model artifact, reproducible environment, and a path to CI/CD-driven retraining and redeploys.
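The modeling and evaluation steps above can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic, the feature names (`input_a`, `input_b`, `input_c`) are hypothetical placeholders, and the hyperparameters are arbitrary defaults.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): three hypothetical inputs, one target.
rng = np.random.default_rng(0)
feature_names = ["input_a", "input_b", "input_c"]  # hypothetical names
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + 0.05 * rng.normal(size=200)

# 80/20 split: the held-out 20% is used only for reporting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Test-set metrics show generalization rather than fit to the training data.
y_pred = model.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mae = mean_absolute_error(y_test, y_pred)

# Impurity-based feature importances; they sum to 1 across features.
importances = dict(zip(feature_names, model.feature_importances_))

print(f"test R2 = {test_r2:.3f}, test MAE = {test_mae:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Plotting `y_pred` against `y_test` would give the predicted-vs-actual view mentioned in the evaluation bullet.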
Agile Delivery Map

Turning fragmented lab data into a reliable ML system isn’t just about models; it’s about orchestrating cross-functional execution, sprint by sprint. From initial hypothesis framing to real-time predictions, this map traces how our team delivered a mission-critical ML system through agile execution, scientific alignment, and rapid prototyping.
This project didn’t stop at a prototype. It delivered real-time insight for scientists, measurable accuracy, and an architecture ready to scale.
The Data Pipeline
What began as scattered experimental data evolved into a streamlined pipeline in which data is cleaned, analyzed, and transformed into real-time predictions. This system reimagines how defense researchers understand materials, enabling faster decisions and smarter innovation.

Model Selection
- Evaluated multiple regressors and selected Random Forest for non-linear fit, mixed-scale inputs, and built-in feature importance.
- Chosen based on test-set R²/MAE against baselines; RF performed most consistently.
Alternatives considered
- Linear family (Linear/Ridge/Lasso): Underfit our non-linear relationships on the test set.
- SVR: Required heavy scaling and kernel tuning; results were less stable and harder to explain to scientists.
- Boosted trees (GBM/XGBoost): Considered for future iterations as data scales; we prioritized RF for stability, speed, and basic interpretability at current scale.
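A comparison of this kind could be set up roughly as follows. This is a hedged sketch on synthetic data, not the evaluation that was actually run: the dataset, the hyperparameters, and the candidate names in the `candidates` dictionary are all assumptions made for illustration. SVR is wrapped in a scaling pipeline because it is sensitive to input scale.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear data (assumption), standing in for the lab datasets.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = np.sin(4.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.05 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate regressors; hyperparameters here are illustrative, not tuned.
candidates = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.01),
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0)),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
}

# Fit each candidate on the same training split and score on the same test split.
results = {}
for name, estimator in candidates.items():
    estimator.fit(X_train, y_train)
    y_pred = estimator.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
    }

# Print candidates from highest to lowest test-set R2.
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["r2"], reverse=True):
    print(f"{name:>14}  R2={scores['r2']:.3f}  MAE={scores['mae']:.3f}")
```

The ranking on synthetic data says nothing about the real datasets; the point is only that every candidate is scored on the same held-out split with the same two metrics.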
What We Achieved
Core outcomes
- Real-time decisions: Moved material evaluation from days to seconds, enabling rapid go/no-go calls during experiment planning.
- High predictive quality: Achieved test-set performance of R² ≈ 0.95 across tensile strength, elongation, and modulus (with MAE reported per target).
- Scientist trust: Surfaced feature importance so researchers see the drivers behind each prediction, improving confidence and adoption.
How It Looks
A lightweight, intuitive interface where users input experimental data and get real-time predictions, visualized through responsive plots for quick analysis.
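Behind an interface like this, a Flask endpoint that keeps the model in memory and returns JSON might look roughly like the sketch below. Everything specific here is an assumption for illustration: the `/predict` route, the `FEATURES` names, and the `load_model` helper, which trains a tiny model on synthetic data as a stand-in for loading a saved artifact.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["input_a", "input_b", "input_c"]  # hypothetical input names


def load_model():
    """Stand-in for loading a saved artifact: fits a tiny model on synthetic data."""
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(100, len(FEATURES)))
    y = X.sum(axis=1)
    return RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)


app = Flask(__name__)
MODEL = load_model()  # loaded once at startup, so requests only pay for predict()


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    # Reject requests that omit any expected input.
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify(error=f"missing inputs: {missing}"), 400
    try:
        row = [[float(payload[name]) for name in FEATURES]]
    except (TypeError, ValueError):
        return jsonify(error="inputs must be numeric"), 400
    prediction = float(MODEL.predict(row)[0])
    importance = {
        name: float(value)
        for name, value in zip(FEATURES, MODEL.feature_importances_)
    }
    return jsonify(prediction=prediction, feature_importance=importance)
```

Saved as a module, this could be started with Flask's development server (`flask --app <module> run`) and called by posting a JSON object with the three inputs; the response carries the prediction alongside the model's global feature importances.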


User & workflow impact
- Scientist-first UX: Simple input → instant outputs; no ML expertise required.
- Fewer dead ends: Prioritizes promising formulations earlier, reducing wasted bench time and materials.
- Explainable reviews: Used model insights in design reviews to guide which parameters to vary next.
Engineering & scale
- Two predictive tracks: Deployed HTPB mechanical properties model; developed AP particle-size distribution model and designed it for integration.
- Operational readiness: Versioned model artifacts, reproducible environment, and CI/CD-ready retraining path.
- Extensible architecture: Built to add new targets and datasets without reworking the core pipeline.
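One simple way to version a model artifact is to write the serialized model next to a small manifest that records its version and checksum. The sketch below illustrates that idea with `joblib` and the standard library. The helper names `save_versioned` and `load_verified`, the file naming scheme, and the manifest fields are assumptions, not the project's actual layout.

```python
import hashlib
import json
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def save_versioned(model, directory, name, version, metadata):
    """Write <name>-v<version>.joblib plus a JSON manifest with its SHA-256."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    artifact = directory / f"{name}-v{version}.joblib"
    joblib.dump(model, artifact)
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    manifest = {"name": name, "version": version, "sha256": digest, **metadata}
    manifest_path = artifact.with_suffix(".json")
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return artifact, manifest_path


def load_verified(artifact, manifest_path):
    """Load the model only if the file still matches the recorded checksum."""
    manifest = json.loads(Path(manifest_path).read_text())
    digest = hashlib.sha256(Path(artifact).read_bytes()).hexdigest()
    if digest != manifest["sha256"]:
        raise ValueError("artifact does not match its manifest")
    return joblib.load(artifact), manifest


# Demo with a tiny model trained on synthetic data (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 3))
y = X.sum(axis=1)
model = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    artifact, manifest_path = save_versioned(
        model, tmp, "demo_model", "0.1.0", {"features": ["a", "b", "c"]}
    )
    restored, manifest = load_verified(artifact, manifest_path)
    same_predictions = bool(np.allclose(model.predict(X), restored.predict(X)))

print(manifest["version"], same_predictions)
```

A retraining job in a CI/CD pipeline could call a helper like `save_versioned` with a new version string, leaving earlier artifacts in place for rollback. Note that `joblib` files should only be loaded from trusted sources, since loading them can execute arbitrary code.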
Organizational firsts
- First ML at DRDO–ACEM: Established a repeatable blueprint for AI-assisted R&D workflows.
- Shared understanding: System architecture + docs shortened onboarding time for new contributors and stakeholders.


