← Back to Projects
Concrete Strength Dashboard

Predictive Modeling for Concrete Strength

Python Streamlit Machine Learning GenAI Ridge Regression Decision Tree LDA / QDA / KNN Plotly OpenRouter API ReportLab
📊 View Live Dashboard

Built an end-to-end interactive analytics dashboard in Python and Streamlit to analyze and predict concrete compressive strength across 1,030 mix design samples spanning 8 material variables. The dashboard spans four modules - from exploratory KPI insights to real-time mix design prediction powered by GenAI recommendations via Google Gemini 2.5 Flash - and is designed to be both explainable and actionable for engineering and product decision-making.

KPI Insights Explorer
Visualizes how each mix ingredient relates to compressive strength using quantile binning. Bins keep sample sizes balanced for fair comparison, with error bars (SEM) to show confidence per bin. Users select any KPI and the number of bins to interactively explore trends.
Model Explorer
Four modeling lenses in one view: OLS scatter + trendline, standardized coefficient bar charts for Linear and Ridge Regression, an interactive Decision Tree (depth-3) visualization, and a binary classification module (pass/fail against a user-defined MPa threshold) with a confusion matrix and interpretation.
Predict Strength
Real-time mix design prediction tool with slider inputs for all 8 ingredients. Uses a Ridge Regression (RidgeCV, 5-fold CV) model trained on the dataset and outputs a gauge showing predicted vs target strength, with a clear pass/fail indicator and margin.
GenAI Recommendations
Integrates Google Gemini 2.5 Flash via the OpenRouter API to generate structured, engineering-appropriate recommendations (executive summary, detailed adjustments, conclusion) grounded in the predicted vs target gap and input values. Togglable with a rule-based fallback.
PDF Report Generation
Generates a downloadable one-page PDF report using ReportLab, including prediction summary, input table (2-column layout), a cloud-safe gauge visualization drawn natively in the PDF, and condensed GenAI recommendations.
  • Ridge Regression (RidgeCV) - primary prediction model, trained with 5-fold cross-validation across 50 alpha values on a 70/30 train-test split. Handles multicollinearity from correlated ingredients (slag, ash, cement) with stable, shrinkage-regularized coefficients.
  • Multiple Linear Regression - full-dataset fit used for interpretable standardized coefficient comparisons in the Model Explorer. Confirms cement as the strongest positive driver and water as the strongest negative driver.
  • Decision Tree Regressor (max depth 3) - rule-based model that surfaces non-linear thresholds and ingredient interactions. Visualized as a full tree plot for explainability.
  • Classification models - Logistic Regression, LDA, QDA, and KNN applied to a binary strength threshold (pass/fail at user-defined MPa). Each includes a full confusion matrix with TP/FP/FN/TN breakdown and auto-generated interpretation of error type dominance.
  • Cement is the strongest positive driver of compressive strength across all models. Increasing cement content consistently raises strength predictions.
  • Water is the strongest negative driver. Higher water-to-cement ratio weakens the bond structure, reducing strength.
  • Age is a strong positive factor, especially in early curing stages. Strength grows as hydration progresses over time.
  • Superplasticizer shows moderate positive impact by enabling lower water content while maintaining workability - improving the water-to-cement ratio indirectly.
  • Slag and ash contribute positively but are correlated with other variables, so their individual effect is best read via Ridge coefficients rather than simple correlation.