Abstract

Construction projects are considered high-risk, largely because of the large capital they require, so their cost estimates demand close attention: overestimating a project leads to lost bids, while underestimating it leads to costs that exceed the budget and therefore to losses. However, estimators often face very tight deadlines and therefore rely primarily on their experience, disregarding some crucial factors and producing inaccurate estimates. In the steel structures industry, the fabrication phase accounts for 30 to 40% of the overall project cost. Because the industry is labor-driven, steel structure companies usually estimate the fabrication stage by estimating the required labor hours and multiplying them by crew-specific rates. The primary task in improving estimates for steel structure fabrication projects is therefore to improve the duration estimates.
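The hours-times-rate calculation described above can be sketched as follows. This is only an illustration: the crew names, hours, and hourly rates are hypothetical and are not taken from the thesis or the collaborating company.

```python
# Hypothetical crews, hours, and hourly rates; none of these values come from the thesis.
estimated_hours = {"cutting": 100.0, "fitting": 150.0, "welding": 250.0}
crew_rates = {"cutting": 20.0, "fitting": 24.0, "welding": 30.0}


def fabrication_labor_cost(hours_by_crew, rate_by_crew):
    """Sum, over all crews, of estimated hours times that crew's hourly rate."""
    return sum(hours * rate_by_crew[crew] for crew, hours in hours_by_crew.items())


base_cost = fabrication_labor_cost(estimated_hours, crew_rates)

# Underestimating every crew's hours by 10% underestimates the labor cost by 10%.
underestimated_hours = {crew: 0.9 * hours for crew, hours in estimated_hours.items()}
low_cost = fabrication_labor_cost(underestimated_hours, crew_rates)
```

Because the cost is linear in the hours, any error in the duration estimate carries over in proportion to the cost estimate, which is why the duration estimate is the quantity to improve.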

This research addresses the need of a collaborating company to improve the estimates of its fabrication stage, where most of the losses in its supply-only projects occur. The objective is to develop a machine learning model that estimates the fabrication hours for industrial steel structure projects, a relatively understudied topic. The research explores how the periodic records available at steel fabrication companies can be used to develop machine learning models that improve the accuracy of duration estimates while giving estimation engineers a time-efficient tool. To this end, combinations of six machine learning models and various pre-processing techniques were evaluated to find the model that best improves the collaborating company's fabrication-hour estimates. The models investigated are Ordinary Least Squares Multiple Linear Regression, Lasso Regression, Ridge Regression, K-Nearest Neighbors Regression, Support Vector Machine Regression, and Multi-layer Perceptron Artificial Neural Networks, used in tandem with log transformation, polynomial features, Yeo-Johnson transformation, and data splitting. The research also identifies the features that most affect model performance through sequential feature selection, inspection of the linear regression models' coefficients, and Spearman correlation.
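A minimal sketch of how such a comparison could be set up with scikit-learn is shown below. It is not the thesis's actual pipeline: the data are synthetic, the three feature columns are hypothetical stand-ins, the hyperparameters are arbitrary, and only the Yeo-Johnson pre-processing step is included.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): three hypothetical features such as
# number of attachments, plate weight, and profile weight.
rng = np.random.default_rng(0)
n_samples = 120
X = rng.uniform(1.0, 100.0, size=(n_samples, 3))
y = 50.0 + 5.0 * X[:, 0] + 2.0 * X[:, 1] + X[:, 2] + rng.normal(0.0, 10.0, n_samples)

# The six model families named in the abstract, with arbitrary settings.
candidates = {
    "ols": LinearRegression(),
    "lasso": Lasso(alpha=1.0),
    "ridge": Ridge(alpha=1.0),
    "knn": KNeighborsRegressor(n_neighbors=5),
    "svr": SVR(kernel="rbf", C=100.0),
    "mlp": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}


def cross_validated_mape(model):
    """Mean 5-fold MAPE of a Yeo-Johnson transform followed by the model."""
    pipeline = make_pipeline(PowerTransformer(method="yeo-johnson"), model)
    scores = cross_val_score(
        pipeline, X, y, cv=5, scoring="neg_mean_absolute_percentage_error"
    )
    return -scores.mean()


results = {name: cross_validated_mape(model) for name, model in candidates.items()}
best_name = min(results, key=results.get)
```

Placing the transformer inside the pipeline means it is refit on each training fold, so the held-out fold does not leak into the pre-processing step.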

The research shows that the most important features are the number of attachments and the weight of plates, followed by the weights of light, medium, heavy, and extra-heavy profiles and the type of steel (main steel, miscellaneous steel, or built-up steel). The best modeling technique was to split the dataset into three datasets by type of steel, applying support vector regression with Yeo-Johnson transformation to the main steel and miscellaneous steel datasets and ordinary least squares linear regression to the built-up dataset. The best model has a MAPE of 30% and an MAE of 291 hours, a 25% decrease in MAPE and a 52% decrease in MAE compared with the company's conventional estimation techniques. In addition, the machine learning model resulted in a 94.5% decrease in the percentage error of cost estimates compared with conventional estimation.
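For readers unfamiliar with the two error metrics, the sketch below computes the mean absolute error (MAE) and mean absolute percentage error (MAPE) for two sets of estimates, along with the relative decrease between them. The hours are invented for illustration, and reading the reported decreases as relative decreases is an assumption; the abstract does not state how they were calculated.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the inputs (here, hours)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.30 means 30%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def relative_decrease(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return 1.0 - model_error / baseline_error


# Invented fabrication hours for three projects; not data from the thesis.
actual_hours = [100.0, 200.0, 400.0]
conventional_estimate = [150.0, 150.0, 300.0]
model_estimate = [110.0, 190.0, 380.0]

mae_drop = relative_decrease(
    mae(actual_hours, conventional_estimate), mae(actual_hours, model_estimate)
)
mape_drop = relative_decrease(
    mape(actual_hours, conventional_estimate), mape(actual_hours, model_estimate)
)
```

MAE weights every hour of error equally, so large projects dominate it, while MAPE scales each error by the project's actual hours. This is one reason the two metrics can improve by different amounts on the same data.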

School

School of Sciences and Engineering

Department

Construction Engineering Department

Degree Name

MS in Construction Engineering

Graduation Date

Spring 6-12-2024

Submission Date

5-27-2024

First Advisor

Dr. Khaled Nassar

Committee Member 1

Dr. Ibrahim Abotaleb (Internal Examiner)

Committee Member 2

Dr. Hesham Osman (External Examiner)

Committee Member 3

Dr. Ossama Hosny (Moderator)

Extent

209 p.

Document Type

Master's Thesis

Institutional Review Board (IRB) Approval

Not necessary for this item

Available for download on Wednesday, May 27, 2026
