

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.


This work was supported by Petróleo Brasileiro S/A - Petrobras (nº 0050.0124520.23.9), Fundação de Apoio A Física e A Química (FAFQ), Universidade de São Paulo (USP), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), grant nº 88887.992906/2024-00, and National Council for Scientific and Technological Development (CNPq), grants nº 309201/2021-7 and 406949/2021-2.


Corresponding author: João Manoel Herrera Pinheiro (e-mail: joao.manoel.pinheiro@usp.br).

The Impact of Feature Scaling in Machine Learning: Effects on Regression and Classification Tasks

JOÃO MANOEL HERRERA PINHEIRO¹, SUZANA VILAS BOAS DE OLIVEIRA², THIAGO HENRIQUE SEGRETO SILVA¹, PEDRO ANTONIO RABELO SARAIVA¹, ENZO FERREIRA DE SOUZA¹, RICARDO V. GODOY¹, LEONARDO ANDRÉ AMBROSIO², MARCELO BECKER¹
¹Department of Mechanical Engineering, University of São Paulo, 13566-590, São Paulo, Brazil
²Department of Electrical and Computer Engineering, University of São Paulo, 13566-590, São Paulo, Brazil
Abstract

This research addresses the critical lack of comprehensive studies on feature scaling by systematically evaluating 12 scaling techniques - including several less common transformations - across 14 different Machine Learning algorithms and 16 datasets for classification and regression tasks. We meticulously analyzed impacts on predictive performance (using metrics such as accuracy, MAE, MSE, and R²) and computational costs (training time, inference time, and memory usage). Key findings reveal that while ensemble methods (such as Random Forest and gradient boosting models like XGBoost, CatBoost, and LightGBM) demonstrate robust performance largely independent of scaling, other widely used models such as Logistic Regression, SVMs, TabNet, and MLPs show significant performance variations that depend heavily on the chosen scaler. This extensive empirical analysis, with all source code, experimental results, and model parameters made publicly available to ensure complete transparency and reproducibility, offers practitioners crucial, model-specific guidance on selecting the optimal feature scaling technique.

Index Terms:
Data preprocessing, feature scaling, machine learning algorithms, normalization, standardization.

I Introduction

Machine Learning progress has been remarkable in several domains of knowledge engineering, notably driven by the rise of big data [1, 2], with applications in healthcare [3, 4, 5, 6], forecasting [7, 8, 9, 10], precision agriculture [11], wireless sensor networks [12], language tasks [13, 14, 15], and many other domains [16, 17, 18, 19, 20, 21]. These diverse applications compose the field of Machine Learning [14], which has become a major subarea of computer science and statistics due to its crucial role in the modern world [22]. Although these methods hold immense potential for the advancement of predictive modeling, their improper application has introduced significant obstacles [23, 24, 25].

One such obstacle is the indiscriminate use of preprocessing techniques, particularly feature scaling [26]. Feature scaling is a mapping applied during preprocessing that attempts to give all attributes the same weight [27, 28]. In some applications, this data transformation can improve the performance of Machine Learning models [29].

Consequently, applying a scaling method without a careful evaluation of its suitability for the specific problem and model may be inadvisable and could negatively impact results. This practice risks undermining the validity of the claims regarding model performance and may lead to a feedback loop of overconfidence in the results, known as overfitting [30].

Reproducibility is a critical problem in Machine Learning [31, 32, 33, 34]. It is often undermined by factors such as missing data or code, inconsistent standards, and sensitivity to training conditions [35]. Feature scaling, in particular, if not documented or applied correctly, can significantly affect model performance and hinder the replication of results. The absence of rigorous evaluation not only hampers reproducibility, but can also lead to the adoption of practices with poor generalizability across different datasets or domains.

As Machine Learning methods continue to shape research by their use in a wide range of applications, it is essential to critically assess and justify each step of the modeling pipeline, including feature scaling, to ensure robust and replicable findings [36].

The primary objective of this study is to evaluate the impact of different data scaling methods on the training process and performance metrics of various Machine Learning algorithms across multiple datasets. We employ 14 widely used Machine Learning models for tabular data, including Linear Regression, Logistic Regression, Support Vector Machines (SVM), K-Nearest Neighbors, Multilayer Perceptron, Random Forest, TabNet, Naive Bayes, Classification and Regression Trees (CART), Gradient Boosting Trees, AdaBoost, LightGBM, CatBoost, and XGBoost. These models were evaluated using 12 different data scaling techniques, in addition to a baseline without scaling, across 16 datasets covering both classification and regression tasks. The selected models represent the state of the art in tabular data analysis, offering a favorable balance between predictive performance and computational efficiency, often outperforming deep learning techniques in this context [37, 38, 39, 40].

In Section II we cover related work in similar studies, while Section III describes each algorithm, each feature scaling technique, the study design, the evaluation metrics, and how the models were trained. In Section IV we present the final results of this study together with a discussion. The limitations of our study are discussed in Section V. Lastly, in Section VI we draw conclusions from the current experiments and outline future work.

II Related Work

Despite its fundamental role in Machine Learning pipelines, the impact of feature scaling remains an underexplored area in the literature. Most existing studies examine only a limited number of algorithms [41] or datasets, and often provide minimal analysis of the specific effects of different scaling techniques [42] on each particular algorithm’s performance. Studies that comprehensively evaluate various scaling methods across a broad range of models and datasets, such as the approach taken in this work, are scarce. In many Machine Learning cases, preprocessing is briefly mentioned, with scaling treated as a routine step rather than a variable worthy of in-depth investigation.

For some Machine Learning models, such as K-Nearest Neighbors [43], Neural Networks [44, 45, 46], and SVM [47, 48], feature scaling is essential. In object detection, data scaling also has a crucial impact [49].

In [50], the authors compared six normalization methods in an SVM classifier to improve intrusion data, and the Min-Max Normalization showed the best performance. In [51], the authors evaluated eleven Machine Learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Classification and Regression Trees, Naive Bayes, Support Vector Machine, XGBoost, Random Forest (RF), Gradient Boosting, AdaBoost, and Extra Tree, across six different data scaling methods: Normalization, Z-score Normalization, Min-Max Normalization, Max Normalization, Robust Scaler, and Quantile Transformer. However, they focused on only one dataset, the UCI - Heart Disease [52]. Despite that, their results are interesting; models based on Decision Trees showed the best performance without any scaling method, while K-Nearest Neighbors and Support Vector Machine achieved the lowest performance.

In a work on diabetes diagnosis using models such as Random Forest, Naive Bayes, K-Nearest Neighbors, Logistic Regression and Support Vector Machine [53] the researchers compared only three preprocessing scenarios: Normalization, Z-score Normalization, and no feature scaling. Their findings suggested that Random Forest, Naive Bayes, and Logistic Regression showed little sensitivity to these specific scaling approaches, and in some cases, their performance even worsened post-scaling.

Another problem is data leakage and the reproducibility of Machine Learning models. As shown in [36], many studies used preprocessing steps, such as feature scaling, on the entire dataset before splitting the data into training and test. Some studies also applied a scaling technique without knowing if that specific Machine Learning algorithm would benefit from it.

In [54] the authors demonstrate how feature scaling methods can impact the final model performance. Rather than relying on traditional scaling techniques, they propose a Generalized Logistic algorithm. This method showed particularly strong performance on datasets with a small number of samples, consistently outperforming models that used features scaled with Min-Max Normalization or Z-score Normalization.

The convergence of stochastic gradient descent is highly sensitive to the scale of input features, with studies such as [55] demonstrating that normalization is an effective method for improving convergence.

For leaf classification, a Probabilistic Neural Network was used as the classifier, with Min-Max normalization applied to the features [56]. In an approval-prediction task, Min-Max normalization was again used with a K-Nearest Neighbors classifier [57].

The impact of scaling and normalization techniques on NMR spectroscopic metabonomic datasets was examined in [58]. In a related study, [59] explored eight different normalization methods to enhance the biological interpretation of metabolomics data.

An in-depth study on the impact of data normalization on classification performance was presented in [41]. The authors evaluated fourteen normalization methods but employed only the K-Nearest Neighbor Classifier. Their findings indicate that normalization as a data preprocessing technique is affected by various data characteristics, such as features with differing statistical properties, the dominance of certain features, and the presence of outliers.

In an unsupervised task, [44] investigates the impact of the scaling of features on K-Means, highlighting its importance for datasets with features measured in different units. The study compares five scaling methods: Z-score Normalization, Min-Max Normalization, Percentile transformation, Maximum absolute scaling, and Robust Scaler. For cluster analysis, normalization helps prevent unwanted biases in external validation indices, such as the Jaccard, Fowlkes-Mallows, and Adjusted Rand Index, that may arise due to variations in the number of clusters or imbalances in class size distributions [60].

In the context of glaucoma detection based on a combination of texture and higher-order spectral features [61], the authors demonstrate that Z-score normalization, when paired with a Random Forest classifier, achieves superior performance compared to a Support Vector Machine.

A more recent study [62] evaluates five Machine Learning models, but focuses exclusively on classification problems and applies only two feature scaling techniques. Moreover, most of these studies do not explain the rationale for choosing these techniques. In addition, some apply normalization to the entire dataset before splitting it into training and testing sets, leading to data leakage.

III Methodology

Our primary focus in this study is to ensure reproducibility. To that end, we use well-known classification and regression datasets sourced from the University of California, Irvine (UCI) Machine Learning Repository, chosen for its recognized and diverse collection of real-world datasets in standardized formats that can easily be used for benchmarking and for comparison between the models chosen in this work.

III-A Dataset

Tables I and II provide detailed information about the datasets used in this study, including the number of features, instances, and classes. All features are numeric, represented as either int64 or float64 types. The classification tasks are either binary or multi-class.

TABLE I: Datasets used for classification: Breast Cancer Wisconsin (Diagnostic) [63], Dry Bean Dataset [64], Glass Identification [65], Heart Disease [52], Iris [66], Letter Recognition [67], MAGIC Gamma Telescope [68], Rice (Cammeo and Osmancik) [69], and Wine [70].
Dataset Instances Features Classes
Breast Cancer Wisconsin (Diagnostic) 569 30 2
Dry Bean Dataset 13611 16 7
Glass Identification 214 9 6
Heart Disease 303 13 2
Iris 150 4 3
Letter Recognition 20000 16 26
MAGIC Gamma Telescope 19020 10 2
Rice (Cammeo and Osmancik) 3810 7 2
Wine 178 13 3
TABLE II: Datasets used for regression: Air Quality [71], Abalone [72], Appliances Energy Prediction [73], Concrete Compressive Strength [74], Forest Fires [75], Real Estate Valuation [76], and Wine Quality [77].
Dataset Instances Features
Abalone 4177 8
Air Quality 9358 15
Appliances Energy Prediction 19735 28
Concrete Compressive Strength 1030 8
Forest Fires 517 12
Real Estate Valuation 414 6
Wine Quality 4898 11

III-B Train and Test Sets

To preserve the integrity of our analysis and avoid data leakage, we split the dataset into training and test sets prior to applying any preprocessing steps, such as feature scaling. Following standard practice in machine learning [78, 79], 70% of the data is allocated to the training set and 30% to the test set. While determining the optimal train-test split is inherently challenging in machine learning [80], our choice reflects a balance between reproducibility and the practical constraints posed by the relatively small size of some datasets. This ratio allows for both effective model training and reliable performance evaluation.

Data leakage occurs when information from the test set inadvertently influences the training process, often leading to overly optimistic performance estimates. For instance, performing oversampling or other transformations before splitting the data can introduce overlap between the training and test sets, thereby compromising their independence. By strictly separating the data prior to any preprocessing, we maintain a clear boundary between the two sets, ensuring a robust and unbiased evaluation of the model performance [36].
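The snippet below sketches this leakage-safe ordering with scikit-learn. It is a minimal illustration, assuming an arbitrary built-in dataset and the Z-score scaler; any of the scalers studied here could be substituted:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load an example tabular dataset.
X, y = load_breast_cancer(return_X_y=True)

# Split FIRST: 70% train / 30% test, with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

# Fit the scaler on the training set only, then apply it to both sets;
# fitting on the full data would leak test-set statistics into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)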

III-C Feature Scaling Techniques

Several feature scaling techniques were investigated. For a subset of these, we leveraged the built-in implementations available in the scikit-learn library. However, other specialized or less common scaling methods required custom implementation, which we developed as classes within our Python experimental framework.
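As an illustration of how such a custom scaler can be structured, the sketch below implements Pareto scaling (Section III-C5) as a scikit-learn-compatible transformer. The class is a minimal stand-in, not the exact code from our framework:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class ParetoScaler(BaseEstimator, TransformerMixin):
    """Center each feature, then divide by the square root of its standard deviation (Eq. 5)."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        self.scale_ = np.sqrt(X.std(axis=0))  # constant features would need special handling
        return self

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.scale_

Written this way, custom scalers expose the same fit/transform interface as the built-in ones, so they can be swapped freely inside the experiment loop.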

III-C1 Min-Max Normalization (MM)

Min-Max normalization scales the data to a fixed range, typically [0, 1] [27, 81]. The transformation is given by:

X_{\text{norm}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}, \quad (1)

where $X$ is the original value, $X_{\text{min}}$ and $X_{\text{max}}$ are the minimum and maximum values of the feature, and $X_{\text{norm}}$ is the normalized value.

III-C2 Max Normalization (MA)

Max normalization scales the data by dividing each feature by its maximum absolute value [27, 82]:

X_{\text{norm}} = \frac{X}{\max(|X|)} \quad (2)

This method is advantageous when the data consists of strictly non-negative values.

III-C3 Z-score Normalization (ZSN)

Z-score normalization (also known as Standardization) transforms data to have a mean of 0 and a unit variance [27, 83]. The formula is given by:

X_{\text{norm}} = \frac{X - \mu}{\sigma}, \quad (3)

where $X$ is the original feature value, $\mu$ is the mean of the feature, $\sigma$ is its standard deviation, and $X_{\text{norm}}$ is the scaled value.

III-C4 Variable Stability Scaling (VAST)

Variable stability scaling adjusts the data based on the stability of each feature. It is particularly useful for high-dimensional datasets and can be seen as a variation of standardization that incorporates the inverse of the Coefficient of Variation (CV), $\frac{\mu}{\sigma}$, as a scaling factor [59]:

X_{\text{norm}} = \frac{X - \mu}{\sigma} \cdot \frac{\mu}{\sigma} \quad (4)

III-C5 Pareto Scaling (PS)

Pareto scaling is a normalization technique in which each feature is centered by subtracting the mean and then divided by the square root of its standard deviation [84]. It is similar to Z-score normalization, but instead of dividing by the full standard deviation, it uses its square root as the scaling factor. Pareto scaling is particularly useful when the goal is to preserve relative differences between features while reducing the impact of large variances [85, 86].

X_{\text{norm}} = \frac{X - \mu}{\sqrt{\sigma}} \quad (5)

III-C6 Mean Centered (MC)

Mean centering subtracts the mean of each feature from the data. This method is often used as a preprocessing step in Principal Component Analysis (PCA) [87].

X_{\text{norm}} = X - \mu \quad (6)

III-C7 Robust Scaler (RS)

The robust scaler uses the median and interquartile range (IQR) to scale the data:

X_{\text{norm}} = \frac{X - X_{\text{median}}}{\text{IQR}}

This method is robust to outliers [88].

III-C8 Quantile Transformation (QT)

Quantile transformation maps the data to a uniform or normal distribution. It is useful for non-linear data [88].
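An illustrative sketch using scikit-learn's QuantileTransformer; the synthetic skewed data and the parameter choices below are ours, not values taken from the experiments:

import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X_train = rng.exponential(size=(500, 3))  # strongly skewed features
X_test = rng.exponential(size=(200, 3))

# Map each feature to an approximately normal distribution; fit on train only.
qt = QuantileTransformer(output_distribution="normal", n_quantiles=500, random_state=0)
X_train_t = qt.fit_transform(X_train)
X_test_t = qt.transform(X_test)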

III-C9 Decimal Scaling Normalization (DS)

Decimal scaling performs normalization by adjusting the decimal point of the attribute values, thereby rescaling them to fit within the range (–1, 1), not including the endpoints [26, 27]:

X_{\text{norm}} = \frac{X}{10^{j}} \quad (7)

where $j$ is the smallest integer such that $\max(|X_{\text{norm}}|) < 1$.
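A small numpy sketch of this rule, assuming per-feature scaling and a nonnegative $j$ (both are implementation choices; a single global $j$ over all features is equally possible):

import numpy as np

def decimal_scale(X_train, X_test):
    # Smallest j with max(|X_train| / 10**j) < 1, computed per feature.
    max_abs = np.abs(X_train).max(axis=0)
    j = np.ceil(np.log10(max_abs + 1e-12)).astype(int)
    j = np.where(max_abs / 10.0 ** j >= 1, j + 1, j)  # exact powers of 10 need one more digit
    factor = 10.0 ** np.maximum(j, 0)  # keep j nonnegative so small values stay unchanged
    return X_train / factor, X_test / factor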

III-C10 Tanh Transformation (TT)

A variant of tanh normalization is used, in which the Hampel estimators are replaced by the mean and standard deviation of each feature [89]:

X_{\text{norm}} = \frac{1}{2}\left\{\tanh\left(0.01\,\frac{X - \mu}{\sigma}\right) + 1\right\} \quad (8)

III-C11 Logistic Sigmoid Transformation (LS)

The logistic sigmoid-based transformation applies the sigmoid function to the data [90, 91]:

X_{\text{norm}} = \frac{1}{1 + e^{-q}}, \quad \text{where} \quad q = \frac{X - \mu}{\sigma} \quad (9)

III-C12 Hyperbolic Tangent Transformation (HT)

The hyperbolic tangent transformation is similar to the tanh transformation but is applied differently in certain contexts [90, 91]:

X_{\text{norm}} = \frac{1 - e^{-q}}{1 + e^{-q}}, \quad \text{where} \quad q = \frac{X - \mu}{\sigma} \quad (10)
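Since Eqs. (8)-(10) share the standardized intermediate value $q = (X - \mu)/\sigma$, the three transformations can be sketched together in a few lines of numpy; as with every scaler in this study, $\mu$ and $\sigma$ must be estimated on the training set:

import numpy as np

def tanh_transform(X, mu, sigma):
    # Eq. (8): tanh variant with mean/standard-deviation estimators.
    return 0.5 * (np.tanh(0.01 * (X - mu) / sigma) + 1.0)

def logistic_sigmoid(X, mu, sigma):
    # Eq. (9): logistic sigmoid of the standardized value q.
    q = (X - mu) / sigma
    return 1.0 / (1.0 + np.exp(-q))

def hyperbolic_tangent(X, mu, sigma):
    # Eq. (10): algebraically equal to tanh(q / 2).
    q = (X - mu) / sigma
    return (1.0 - np.exp(-q)) / (1.0 + np.exp(-q))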

III-D Machine Learning Algorithms

The following models were used:

  • Logistic Regression (LR): A simple and effective statistical model for binary classification that estimates class probabilities using the logistic function [92].

  • Linear Regression: A fundamental model that fits a linear relationship between input features and a continuous target variable [28, 93, 94].

  • Support Vector Machine (SVM) & Support Vector Regressor (SVR): With a linear kernel, this model finds the hyperplane that maximizes the margin between classes; for regression, it finds a function within a tolerance margin using the radial basis function (RBF) kernel [95, 96, 97].

  • Multilayer Perceptron (MLP): A feedforward neural network with one or more hidden layers trained using backpropagation, for classification and regression [98, 99, 100].

  • Random Forest (RF): An ensemble of decision trees built using bootstrap aggregation (bagging), improving robustness and reducing overfitting; works for classification and regression [101, 102, 103].

  • Naive Bayes (NB): Based on Bayes’ Theorem with the assumption of conditional independence between features, only works for classification [104].

  • Classification and Regression Trees (CART): A recursive partitioning algorithm that builds binary trees for decision-making, works for classification and regression tasks [105, 106].

  • LightGBM (LGBM): A gradient boosting framework that uses histogram-based learning and leaf-wise tree growth, works for classification and regression [107].

  • AdaBoost (Ada): A boosting algorithm that iteratively focuses on misclassified examples to improve the accuracy of the model, works for classification and regression [108, 109].

  • CatBoost: A gradient boosting model with native support for categorical features, works for regression and classification [110].

  • XGBoost: An efficient implementation of gradient boosting with regularization and optimized parallel computing, works for regression and classification [111].

  • K-Nearest Neighbors (KNN): An instance-based learning algorithm that classifies samples based on the most frequent label among their nearest neighbors; its regression variant predicts the target by averaging the outputs of the $k$ nearest neighbors in the feature space [112, 113, 114].

  • Attentive Interpretable Tabular Learning (TabNet): A deep learning model for tabular data that employs sequential attention to select relevant features at each decision step, works for classification and regression [115].

III-E Metrics

III-E1 Classification Metrics

  • Accuracy: one of the most widely used metrics for evaluating classification tasks. It measures the proportion of correctly predicted instances relative to the total number of predictions:

    \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad (11)

    where $TP$, $TN$, $FP$, and $FN$ represent true positives, true negatives, false positives, and false negatives, respectively.

    Despite its popularity, accuracy can be misleading when dealing with imbalanced datasets, as highlighted by [116]. However, given the characteristics of our datasets, comprising both binary and multiclass classification problems, we chose to include accuracy in our analysis, acknowledging its limitations in the presence of class imbalance.

III-E2 Regression Metrics

  • Mean Absolute Error (MAE): measures the average absolute difference between predicted and actual values, offering an intuitive sense of the magnitude of the error.

    \text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_{i} - \hat{y}_{i}| \quad (12)
  • Mean Squared Error (MSE): calculates the average squared differences between actual and predicted values, penalizing larger errors more heavily:

    \text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2} \quad (13)
  • Coefficient of Determination ($R^{2}$): indicates the proportion of variance in the dependent variable that is predictable from the independent variables. A higher $R^{2}$ value indicates a better fit of the model to the data [117].

These regression metrics are standard for evaluating continuous output predictions [118].
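All three regression metrics are available directly in scikit-learn; a brief sketch with illustrative values:

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)  # Eq. (12)
mse = mean_squared_error(y_true, y_pred)   # Eq. (13)
r2 = r2_score(y_true, y_pred)              # coefficient of determination
print(f"MAE={mae:.3f}  MSE={mse:.3f}  R2={r2:.3f}")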

III-E3 Computational Metrics

To complement the evaluation of predictive performance, we also assess:

  • Memory Usage: measures the memory consumed during the scaling step.

  • Training Time: the time taken to train each model on a given dataset.

  • Inference Time: the time required for the trained model to make predictions on unseen data.

These metrics are essential when evaluating models for real-time or resource-constrained environments.
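One way these quantities can be measured is sketched below with Python's standard library (time.perf_counter for wall-clock time, tracemalloc for the scaling step's memory); this mirrors the kind of instrumentation used, not necessarily our exact code:

import time
import tracemalloc

def measure(model, scaler, X_train, y_train, X_test):
    # Peak memory allocated during the scaling step, in kB.
    tracemalloc.start()
    X_tr = scaler.fit_transform(X_train)
    X_te = scaler.transform(X_test)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    memory_kb = peak / 1024

    # Wall-clock training time.
    t0 = time.perf_counter()
    model.fit(X_tr, y_train)
    train_time = time.perf_counter() - t0

    # Wall-clock inference time on unseen data.
    t0 = time.perf_counter()
    y_pred = model.predict(X_te)
    infer_time = time.perf_counter() - t0

    return y_pred, train_time, infer_time, memory_kb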

III-F Experiment Workflow

The objective of this subsection is to document every step of the experimentation process to ensure full transparency and reproducibility. All datasets used in this work are publicly available and can be downloaded along with their respective train/test splits. All Machine Learning algorithms were applied using their default hyperparameters, as implemented in scikit-learn or the corresponding official libraries. For algorithms that support the random_state parameter, a fixed seed was used to ensure consistent results across runs. During each training session, a configuration file is generated to record the parameters used by each model, enabling complete traceability of the experimental setup.

The experiment begins with the import and cleaning of each dataset. Categorical target variables are encoded numerically, and column names are standardized using regular expressions. After that, each dataset is partitioned into training and testing subsets, which are saved both as .csv files and as Python dictionaries. Subsequently, for every dataset and Machine Learning model combination, various scaling techniques are applied. Each model is trained on the training set and evaluated on the test set, with performance metrics computed accordingly. In addition to the results, the model configuration and metadata - including training and inference times - are stored to ensure reproducibility and to facilitate further analysis.

Algorithm 1 Experiment Workflow
1: Input: $D$: Training Data
2: Output: Validation metrics, training time, inference time, memory usage, and model parameters.
3: Begin
4: Import necessary libraries and load dataset $D$.
5: Perform cleaning and ETL (Extract, Transform, Load) on $D$.
6: Split $D$ into $D_{train}$ (training set) and $D_{test}$ (testing set).
7: for each available dataset do
8:   for each Machine Learning model $M$ do
9:     Fit scaler on $D_{train}$ and apply the transform to $D_{train}$ and $D_{test}$.
10:    Train model $M$ using $D_{train}$.
11:    Evaluate $M$ on $D_{test}$.
12:    Save validation results, including accuracy, training time, inference time, memory usage, and model parameters.
13:  end for
14: end for
15: End
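In plain Python, the workflow of Algorithm 1 can be sketched as the nested loop below; the dataset/model/scaler registries and the two callbacks are illustrative placeholders rather than the released scripts:

class IdentityScaler:
    # No-scaling baseline: fit_transform/transform return the data unchanged.
    def fit_transform(self, X):
        return X

    def transform(self, X):
        return X

def run_experiments(datasets, models, scalers, evaluate, save_results):
    # datasets: {name: (X_train, X_test, y_train, y_test)}
    # models / scalers: {name: zero-argument factory}
    for data_name, (X_tr, X_te, y_tr, y_te) in datasets.items():
        for model_name, make_model in models.items():
            for scaler_name, make_scaler in scalers.items():
                scaler = make_scaler()                # IdentityScaler for the baseline
                X_tr_s = scaler.fit_transform(X_tr)   # fit on the training set only
                X_te_s = scaler.transform(X_te)
                model = make_model()                  # fresh model, default hyperparameters
                model.fit(X_tr_s, y_tr)
                metrics = evaluate(model, X_te_s, y_te)
                save_results(data_name, model_name, scaler_name,
                             metrics, model.get_params())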

III-G Python Script Descriptions

The following scripts were developed to automate and manage the experimental pipeline:

  • import_dataset.py: Imports datasets from the UCI repository and maps the appropriate target variable for each dataset.

  • etl_cleaning.py: Converts categorical and numerical variables as needed and cleans column names using regular expressions.

  • train_test_split.py: Splits the datasets into training and testing sets and saves them in both .csv and dictionary formats.

  • train_results.py: Trains each Machine Learning model on every dataset using the different scaling techniques, computes validation metrics, and saves both performance results and model configuration.

  • main.py: Serves as the main execution script that orchestrates all stages of the experiment.

Figure 1: Experimental design

III-H Source Code of the Experiments

For complete transparency and reproducibility, the source code, all experimental results, and detailed model parameters are publicly available on GitHub.

The experiments were conducted on a system running a 64-bit Linux distribution, equipped with an AMD Ryzen™ 9 7900 processor (capable of boosting to 5.4 GHz) and 64 GB of RAM.

IV Experiments

This section presents the empirical results of our extensive experiments. We selected five representative models and three scaling techniques, along with a baseline without scaling, as this subset already allows us to draw meaningful conclusions and observe distinctions among models. The complete results are available in Appendix A.

IV-A Impact on Validation Metrics

TABLE III: Accuracy results by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVM under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Breast Cancer Wisconsin Diagnostic KNN 0.9591 0.9766 0.9591 0.9649
LGBM 0.9474 0.9474 0.9591 0.9591
MLP 0.9649 0.9766 0.9766 0.9708
RF 0.9708 0.9708 0.9708 0.9708
SVM 0.9240 0.9825 0.9766 0.9825
Dry Bean KNN 0.7113 0.9141 0.9216 0.9190
LGBM 0.9275 0.9275 0.9263 0.9275
MLP 0.2980 0.9101 0.9327 0.9314
RF 0.9238 0.9226 0.9226 0.9226
SVM 0.5803 0.9109 0.9263 0.9268
Glass Identification KNN 0.5846 0.6615 0.6308 0.6308
LGBM 0.8154 0.8154 0.8000 0.8308
MLP 0.7077 0.7231 0.6923 0.6769
RF 0.7538 0.7538 0.7692 0.7846
SVM 0.6769 0.5077 0.6615 0.6462
Heart Disease KNN 0.4945 0.5495 0.5714 0.5275
LGBM 0.5275 0.5275 0.5275 0.5385
MLP 0.3516 0.5495 0.5055 0.5385
RF 0.5604 0.5604 0.5604 0.5604
SVM 0.5055 0.5934 0.6154 0.5934
Iris KNN 1.0000 1.0000 1.0000 0.9556
LGBM 1.0000 1.0000 1.0000 1.0000
MLP 1.0000 1.0000 1.0000 1.0000
RF 1.0000 1.0000 1.0000 1.0000
SVM 1.0000 1.0000 0.9778 0.9778
Letter Recognition KNN 0.9493 0.9480 0.9405 0.9158
LGBM 0.9640 0.9640 0.9637 0.9637
MLP 0.9367 0.9280 0.9502 0.9545
RF 0.9577 0.9577 0.9570 0.9580
SVM 0.8135 0.8208 0.8488 0.8488
Magic Gamma Telescope KNN 0.8098 0.8254 0.8340 0.8340
LGBM 0.8803 0.8803 0.8792 0.8810
MLP 0.8170 0.8677 0.8717 0.8777
RF 0.8808 0.8808 0.8808 0.8808
SVM 0.2976 0.5158 0.4341 0.3212
Rice Cammeo And Osmancik KNN 0.8775 0.9204 0.9143 0.9064
LGBM 0.9213 0.9213 0.9178 0.9160
MLP 0.5468 0.9335 0.9318 0.9309
RF 0.9265 0.9265 0.9265 0.9265
SVM 0.9248 0.9309 0.9309 0.9274
Wine KNN 0.7407 0.9444 0.9630 0.9444
LGBM 0.9815 0.9815 0.9815 0.9815
MLP 0.9815 1.0000 0.9815 0.9815
RF 1.0000 1.0000 1.0000 1.0000
SVM 0.5926 1.0000 0.9815 0.9815
TABLE IV: $R^{2}$ score by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVR under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Abalone KNN 0.5164 0.4955 0.4662 0.4552
LGBM 0.5260 0.5260 0.5256 0.5190
MLP 0.5245 0.5265 0.5578 0.5632
RF 0.5244 0.5249 0.5234 0.5241
SVR 0.5293 0.5257 0.5421 0.5398
Air Quality KNN 0.9995 0.9993 0.9994 0.9986
LGBM 0.9999 0.9999 0.9999 0.9999
MLP 0.9985 1.0000 1.0000 0.9999
RF 1.0000 1.0000 1.0000 1.0000
SVR 0.9966 0.9619 0.9269 0.9188
Appliances Energy Prediction KNN 0.1681 0.2049 0.3279 0.2929
LGBM 0.4318 0.4318 0.4334 0.4192
MLP 0.1598 0.1581 0.3144 0.2970
RF 0.5122 0.5120 0.5122 0.5125
SVR -0.1056 -0.0275 0.0154 -0.0096
Concrete Compressive Strength KNN 0.6770 0.6631 0.6714 0.7446
LGBM 0.9229 0.9229 0.9217 0.9226
MLP 0.8030 0.7468 0.8725 0.8662
RF 0.8896 0.8895 0.8891 0.8894
SVR 0.2259 0.5394 0.6093 0.6987
Forest Fires KNN -0.0115 -0.0470 -0.0447 -0.0345
LGBM -0.0246 -0.0246 -0.0188 -0.0135
MLP -0.0067 0.0057 0.0013 0.0121
RF -0.1060 -0.1101 -0.1093 -0.1058
SVR -0.0257 -0.0246 -0.0244 -0.0245
Real Estate Valuation KNN 0.6232 0.6232 0.6153 0.6348
LGBM 0.7001 0.7001 0.7120 0.7075
MLP 0.6199 0.5608 0.6436 0.6894
RF 0.7444 0.7449 0.7444 0.7444
SVR 0.4897 0.5327 0.5788 0.5872
Wine Quality KNN 0.1221 0.3283 0.3475 0.3234
LGBM 0.4557 0.4557 0.4578 0.4602
MLP 0.2500 0.3243 0.3872 0.3879
RF 0.4985 0.4991 0.4982 0.4992
SVR 0.1573 0.3185 0.3842 0.3799
TABLE V: MSE by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVR under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Abalone KNN 4.9107 5.1229 5.4200 5.5324
LGBM 4.8137 4.8137 4.8170 4.8846
MLP 4.8285 4.8087 4.4900 4.4356
RF 4.8295 4.8244 4.8392 4.8328
SVR 4.7801 4.8158 4.6502 4.6734
Air Quality KNN 0.9109 1.2492 1.0231 2.3434
LGBM 0.1129 0.1129 0.1298 0.1291
MLP 2.6090 0.0278 0.0110 0.1181
RF 0.0131 0.0131 0.0131 0.0131
SVR 5.7798 65.6306 125.8177 139.7765
Appliances Energy Prediction KNN 8570.5671 8192.0770 6924.2331 7285.3653
LGBM 5854.3508 5854.3508 5837.8269 5983.4385
MLP 8656.1894 8673.7742 7063.7610 7243.1268
RF 5025.6175 5028.0530 5025.4432 5022.6680
SVR 11390.9162 10585.9003 10144.3973 10401.7762
Concrete Compressive Strength KNN 87.4024 91.1487 88.9093 69.1034
LGBM 20.8520 20.8520 21.1791 20.9335
MLP 53.3124 68.5215 34.5083 36.1983
RF 29.8643 29.8945 30.0114 29.9231
SVR 209.4406 124.6258 105.7187 81.5215
Forest Fires KNN 8049.5193 8331.5751 8314.0348 8232.2422
LGBM 8153.9240 8153.9240 8107.4363 8065.4234
MLP 8010.9478 7912.5614 7947.6005 7861.6344
RF 8801.7409 8833.7029 8827.5833 8799.6126
SVR 8162.5768 8154.0176 8152.2006 8152.5913
Real Estate Valuation KNN 63.0027 63.0182 64.3252 61.0770
LGBM 50.1547 50.1547 48.1590 48.9211
MLP 63.5585 73.4388 59.6005 51.9328
RF 42.7505 42.6657 42.7479 42.7405
SVR 85.3424 78.1453 70.4354 69.0369
Wine Quality KNN 0.6406 0.4901 0.4761 0.4937
LGBM 0.3971 0.3971 0.3956 0.3939
MLP 0.5472 0.4930 0.4471 0.4466
RF 0.3659 0.3655 0.3661 0.3654
SVR 0.6149 0.4973 0.4493 0.4525
TABLE VI: MAE by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVR under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Abalone KNN 1.5673 1.6102 1.6555 1.6654
LGBM 1.5476 1.5476 1.5500 1.5632
MLP 1.6159 1.5748 1.5318 1.4958
RF 1.5590 1.5584 1.5617 1.5619
SVR 1.5048 1.5111 1.4964 1.4990
Air Quality KNN 0.5506 0.6755 0.5862 0.8379
LGBM 0.0677 0.0677 0.0728 0.0731
MLP 1.2505 0.0909 0.0698 0.1910
RF 0.0167 0.0167 0.0166 0.0167
SVR 0.8970 1.4802 1.9293 5.4424
Appliances Energy Prediction KNN 47.7696 45.5977 39.8189 41.5828
LGBM 39.0633 39.0633 39.0969 39.3198
MLP 53.8080 54.1215 47.7318 47.7236
RF 34.2701 34.2923 34.2876 34.2838
SVR 48.9182 45.5442 43.4164 45.1344
Concrete Compressive Strength KNN 7.2301 7.0405 7.3319 6.4237
LGBM 3.0480 3.0480 3.0473 3.0197
MLP 5.9685 6.4045 4.4758 4.5392
RF 3.7512 3.7503 3.7608 3.7550
SVR 11.6674 8.9978 8.1314 7.0217
Forest Fires KNN 21.5065 20.7837 21.0229 19.7818
LGBM 24.2821 24.2821 24.1922 23.9371
MLP 21.2084 20.6192 24.5951 23.9833
RF 24.2812 24.3738 24.3496 24.3443
SVR 14.9474 14.9755 14.9655 14.9590
Real Estate Valuation KNN 5.4626 5.4435 5.6979 5.5819
LGBM 4.8106 4.8106 4.7284 4.8254
MLP 5.3601 6.1126 5.4400 5.0138
RF 4.3971 4.3951 4.4002 4.3916
SVR 6.8824 6.2845 5.9805 5.8563
Wine Quality KNN 0.6243 0.5231 0.5259 0.5349
LGBM 0.4832 0.4832 0.4829 0.4833
MLP 0.5692 0.5499 0.5187 0.5196
RF 0.4366 0.4368 0.4370 0.4362
SVR 0.6076 0.5453 0.5111 0.5142

As anticipated, one of the key findings from our classification experiments is the differential impact of feature scaling on model performance. Table III shows the accuracy results. Ensemble methods, including Random Forest and the gradient boosting family (LightGBM, CatBoost, XGBoost), demonstrated strong robustness by consistently achieving high validation performance irrespective of the preprocessing strategy or dataset. This inherent robustness offers a significant practical advantage, particularly in resource-constrained environments, since omitting the scaling step eliminates the associated memory and computational overhead. The Naive Bayes model showed similar scaling resistance, although its overall accuracy was not competitive with these top-tier ensembles. In stark contrast, the performance of Logistic Regression (LR), Support Vector Machines (SVM), K-Nearest Neighbor (KNN), TabNet, and Multi-Layer Perceptrons (MLP) was highly dependent on the choice of scaler, revealing their pronounced sensitivity to data preprocessing through significant fluctuations in performance.

A similar pattern of scaling sensitivity was observed in regression tasks when employing the regression counterparts of these classification models, as shown in Tables IV, V, and VI. This suggests that the underlying mathematical principles governing these model families lead to consistent behavior regarding data scaling, whether applied to classification or regression problems.

In general, our findings affirm the superior performance of the ensemble methods, Random Forest, LightGBM, CatBoost, and XGBoost, which is consistent with their established reputation as state-of-the-art models for tabular data. Their ability to achieve high accuracy regardless of the scaling technique applied explains why feature scaling is often considered an optional preprocessing step for these particular models in many Machine Learning projects. This practical consideration, combined with their predictive power, underscores their utility in a wide range of applications.

IV-B Impact on Training and Inference Times

TABLE VII: Time to Train (s) in classification models by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVM under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Breast Cancer Wisconsin Diagnostic KNN 0.0002 0.0002 0.0002 0.0002
LGBM 0.0213 0.0219 0.0275 0.0256
MLP 0.1708 0.2929 0.1683 0.1824
RF 0.0706 0.0707 0.0709 0.0706
SVM 0.0016 0.0007 0.0009 0.0009
Dry Bean KNN 0.0008 0.0008 0.0008 0.0008
LGBM 0.2732 0.2759 0.2988 0.3025
MLP 0.3396 3.2012 2.4228 3.8822
RF 1.8703 1.8860 1.8872 1.8734
SVM 0.1096 0.1932 0.1560 0.1470
Glass Identification KNN 0.0002 0.0002 0.0002 0.0004
LGBM 0.0422 0.0427 0.0345 0.0410
MLP 0.2307 0.2416 0.2376 0.2372
RF 0.0453 0.0456 0.0453 0.0453
SVM 0.0010 0.0007 0.0008 0.0009
Heart Disease KNN 0.0003 0.0003 0.0003 0.0003
LGBM 0.0466 0.0450 0.0412 0.0439
MLP 0.0198 0.1105 0.3423 0.3942
RF 0.0449 0.0452 0.0451 0.0451
SVM 0.0029 0.0009 0.0017 0.0012
Iris KNN 0.0004 0.0002 0.0002 0.0003
LGBM 0.0116 0.0119 0.0130 0.0118
MLP 0.1074 0.1444 0.0947 0.1164
RF 0.0382 0.0387 0.0386 0.0385
SVM 0.0003 0.0004 0.0003 0.0003
Letter Recognition KNN 0.0010 0.0009 0.0009 0.0010
LGBM 1.1319 1.0970 1.1101 1.1380
MLP 11.1298 22.0963 11.5231 13.5357
RF 0.8526 0.8582 0.8632 0.8535
SVM 0.8484 0.7841 0.7588 0.7936
Magic Gamma Telescope KNN 0.0065 0.0065 0.0065 0.0071
LGBM 0.0469 0.0460 0.0461 0.0468
MLP 0.5592 4.8919 4.0988 3.4037
RF 2.6338 2.6361 2.6471 2.6319
SVM 0.0697 0.2742 0.2290 0.2240
Rice Cammeo And Osmancik KNN 0.0009 0.0010 0.0010 0.0010
LGBM 0.0294 0.0291 0.0309 0.0315
MLP 0.0464 0.6241 0.1954 0.2864
RF 0.2023 0.2032 0.2026 0.2034
SVM 0.0093 0.0223 0.0201 0.0199
Wine KNN 0.0003 0.0003 0.0002 0.0003
LGBM 0.0164 0.0151 0.0167 0.0154
MLP 0.2105 0.1658 0.0603 0.0696
RF 0.0406 0.0408 0.0411 0.0406
SVM 0.0008 0.0004 0.0005 0.0005
TABLE VIII: Time to Inference (s) in classification models by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVM under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Breast Cancer Wisconsin Diagnostic KNN 0.0025 0.0025 0.0025 0.0030
LGBM 0.0004 0.0004 0.0007 0.0007
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0013 0.0013 0.0013 0.0013
SVM 0.0002 0.0002 0.0001 0.0001
Dry Bean KNN 0.0529 0.0536 0.0536 0.0594
LGBM 0.0101 0.0099 0.0099 0.0100
MLP 0.0011 0.0012 0.0012 0.0012
RF 0.0180 0.0181 0.0184 0.0179
SVM 0.0255 0.1723 0.0892 0.0907
Glass Identification KNN 0.0012 0.0012 0.0012 0.0014
LGBM 0.0005 0.0006 0.0006 0.0006
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0013 0.0013 0.0013 0.0013
SVM 0.0002 0.0002 0.0002 0.0002
Heart Disease KNN 0.0016 0.0016 0.0016 0.0020
LGBM 0.0007 0.0006 0.0006 0.0006
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0014 0.0014 0.0014 0.0015
SVM 0.0003 0.0002 0.0002 0.0002
Iris KNN 0.0012 0.0010 0.0009 0.0012
LGBM 0.0004 0.0005 0.0004 0.0004
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0011 0.0011 0.0011 0.0011
SVM 0.0001 0.0001 0.0001 0.0001
Letter Recognition KNN 0.8666 0.0819 0.0821 0.0870
LGBM 0.0611 0.0606 0.0610 0.0601
MLP 0.0022 0.0022 0.0021 0.0021
RF 0.0471 0.0473 0.0472 0.0479
SVM 0.8372 1.3598 0.8540 0.8722
Magic Gamma Telescope KNN 0.1088 0.1550 0.1659 0.1635
LGBM 0.0024 0.0023 0.0024 0.0023
MLP 0.0012 0.0012 0.0012 0.0012
RF 0.0347 0.0349 0.0349 0.0341
SVM 0.0214 0.1027 0.0838 0.0856
Rice Cammeo And Osmancik KNN 0.0139 0.0148 0.0149 0.0152
LGBM 0.0007 0.0007 0.0007 0.0008
MLP 0.0003 0.0003 0.0002 0.0003
RF 0.0045 0.0044 0.0045 0.0044
SVM 0.0012 0.0083 0.0055 0.0056
Wine KNN 0.0011 0.0011 0.0011 0.0011
LGBM 0.0004 0.0004 0.0004 0.0004
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0011 0.0011 0.0011 0.0011
SVM 0.0001 0.0001 0.0001 0.0001
TABLE IX: Time to Train (s) in regression models by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVR under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Abalone KNN 0.0010 0.0010 0.0010 0.0010
LGBM 0.0286 0.0288 0.0301 0.0307
MLP 0.8414 1.1404 0.8932 1.2343
RF 0.5945 0.6035 0.6003 0.5977
SVR 0.1382 0.1383 0.1392 0.1440
Air Quality KNN 0.0032 0.0032 0.0031 0.0031
LGBM 0.0391 0.0395 0.0392 0.0396
MLP 0.5701 3.3321 2.1864 1.8880
RF 1.8482 1.8664 1.8547 1.8659
SVR 0.6315 0.5742 0.4535 0.7643
Appliances Energy Prediction KNN 0.0004 0.0004 0.0004 0.0004
LGBM 0.0522 0.0532 0.0552 0.0545
MLP 3.3408 12.6484 17.8381 16.1881
RF 18.2580 18.3987 18.3023 18.2253
SVR 3.6868 3.6063 3.6767 3.7207
Concrete Compressive Strength KNN 0.0004 0.0004 0.0004 0.0004
LGBM 0.0221 0.0232 0.0231 0.0237
MLP 0.1335 0.7281 0.8502 0.8323
RF 0.1397 0.1396 0.1390 0.1438
SVR 0.0086 0.0093 0.0087 0.0091
Forest Fires KNN 0.0003 0.0003 0.0003 0.0003
LGBM 0.0115 0.0121 0.0119 0.0118
MLP 0.2457 0.3816 0.4316 0.4330
RF 0.0997 0.1006 0.1002 0.1063
SVR 0.0025 0.0028 0.0027 0.0025
Real Estate Valuation KNN 0.0002 0.0002 0.0002 0.0002
LGBM 0.0090 0.0094 0.0094 0.0098
MLP 0.1022 0.3386 0.3599 0.3778
RF 0.0664 0.0666 0.0663 0.0689
SVR 0.0019 0.0018 0.0023 0.0017
Wine Quality KNN 0.0022 0.0022 0.0023 0.0022
LGBM 0.0314 0.0313 0.0322 0.0334
MLP 0.4713 1.3455 2.3311 1.6345
RF 1.2844 1.2891 1.2968 1.3414
SVR 0.3166 0.3181 0.3239 0.3213
TABLE X: Time to Inference (s) in regression models by dataset, model, and scaling method, showing only KNN, LGBM, MLP, RF, and SVR under the None, MA, ZSN, and RS scaling methods. Full results are provided in Appendix A.
Dataset Model NO MA ZSN RS
Abalone KNN 0.0034 0.0048 0.0042 0.0041
LGBM 0.0006 0.0007 0.0007 0.0007
MLP 0.0003 0.0003 0.0003 0.0003
RF 0.0110 0.0111 0.0110 0.0109
SVR 0.0652 0.0652 0.0656 0.0666
Air Quality KNN 0.0171 0.0317 0.0338 0.0381
LGBM 0.0010 0.0009 0.0009 0.0009
MLP 0.0005 0.0006 0.0006 0.0006
RF 0.0169 0.0171 0.0170 0.0171
SVR 0.2986 0.2676 0.1981 0.3599
Appliances Energy Prediction KNN 0.0222 0.0215 0.0225 0.0216
LGBM 0.0017 0.0017 0.0017 0.0017
MLP 0.0012 0.0012 0.0012 0.0012
RF 0.0694 0.0697 0.0700 0.0692
SVR 1.8656 1.8341 1.7995 1.8159
Concrete Compressive Strength KNN 0.0009 0.0010 0.0010 0.0010
LGBM 0.0004 0.0004 0.0004 0.0004
MLP 0.0001 0.0002 0.0001 0.0001
RF 0.0033 0.0034 0.0032 0.0034
SVR 0.0044 0.0043 0.0041 0.0041
Forest Fires KNN 0.0003 0.0006 0.0006 0.0005
LGBM 0.0003 0.0004 0.0003 0.0003
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0018 0.0018 0.0018 0.0018
SVR 0.0011 0.0012 0.0011 0.0013
Real Estate Valuation KNN 0.0003 0.0003 0.0003 0.0003
LGBM 0.0003 0.0003 0.0003 0.0003
MLP 0.0001 0.0001 0.0001 0.0001
RF 0.0017 0.0017 0.0017 0.0017
SVR 0.0007 0.0009 0.0007 0.0007
Wine Quality KNN 0.0048 0.0257 0.0369 0.0329
LGBM 0.0008 0.0008 0.0008 0.0009
MLP 0.0004 0.0004 0.0004 0.0005
RF 0.0143 0.0147 0.0145 0.0151
SVR 0.1501 0.1527 0.1419 0.1422

The application of different scaling techniques had a variable impact on inference times across the evaluated models. Notably, Classification and Regression Trees (CART) exhibited exceptionally robust behavior, remaining unaffected across all scaling methods and datasets. While certain Machine Learning algorithms — such as K-Nearest Neighbors (KNN), Random Forest, Support Vector Machine (SVM), and Support Vector Regressor (SVR) — showed more evident sensitivity to the choice of scaling technique, this was not the norm. For the majority of models, the preprocessing step introduced only a small, non-uniform computational overhead — as illustrated in Tables VIII and X — which may become more significant in the context of large-scale datasets or time-sensitive applications requiring real-time inference.

For training time, the effects of scaling largely mirrored those seen in the validation accuracy results, as shown in Tables VII and IX. Certain models, notably tree-based ensembles, did not derive a significant speed benefit from feature scaling, while others were more sensitive.

IV-C Results of Memory Usage (kB)

The analysis revealed a clear distinction in memory consumption between scaling techniques. As expected, applying no scaling incurred essentially no additional memory beyond that required to load the data. Among the actual scaling methods, the RobustScaler, StandardScaler, Tanh Transformer, and Hyperbolic Tangent were found to be the most memory-intensive. In contrast, the MaxAbsScaler, MinMaxScaler, and Decimal Scaler consistently registered the lowest memory usage, as shown in Table XI.

TABLE XI: Memory Usage (kB) per Dataset and Scaling Method
Dataset NO MA ZSN RS
Breast Cancer Wisconsin Diagnostic 0.1875 175.7594 176.1266 384.9979
Dry Bean 1704.2750 2448.3812 2599.3156 2388.1666
Glass Identification 0.1875 23.1984 26.1799 51.8070
Heart Disease 33.6609 67.2016 71.7602 122.0010
Iris 0.1875 8.7828 10.5523 20.2666
Letter Recognition 0.1875 3566.3000 3787.1906 2568.4697
Magic Gamma Telescope 0.1875 1552.0641 1552.2750 1552.7039
Rice Cammeo And Osmancik 211.2688 358.2641 378.0189 297.6197
Wine 21.0234 40.3781 43.8195 73.9416
Abalone 0.1875 294.4004 294.5879 294.9893
Air Quality 880.1270 1294.5430 1373.1387 1233.9355
Appliances Energy Prediction 4164.2959 5894.4902 6261.5137 5832.6855
Concrete Compressive Strength 65.6523 137.7617 144.9668 94.3340
Forest Fires 43.2832 87.1992 92.4219 157.8330
Real Estate Valuation 22.2676 43.1992 46.3398 79.0088
Wine Quality 0.1875 624.3691 624.5879 625.0762

V Limitation

While this study provides a broad empirical analysis of feature scaling across various models and datasets, certain limitations should be acknowledged, which also open avenues for future research.

  • Hyperparameter Optimization: The Machine Learning models analyzed in this study were evaluated using their default hyperparameters, as outlined in the methodology. A comprehensive hyperparameter tuning process for each model–scaler–dataset combination was beyond the current scope; however, such optimization could potentially uncover different optimal pairings or further improve model performance.

  • Scope and Diversity of Datasets: Although 16 datasets were used for both classification and regression tasks, the findings could be further enhanced by incorporating an even wider array of datasets, particularly those with very high dimensionality, different types of underlying data distributions, or from more specialized domains.

  • Evaluation Metrics for Classification: The primary metric for classification tasks was accuracy. Although acknowledged as potentially misleading for imbalanced datasets, future work could incorporate a broader suite of metrics, such as F1-score, precision, recall, AUC, or balanced accuracy, to provide a more nuanced understanding of performance, especially on datasets with skewed class distributions.

  • Dataset Size and Synthetic Data: For some of the smaller datasets utilized, the exploration of techniques such as synthetic data generation or data augmentation was not performed. Such methods could potentially improve the robustness and performance of certain models, representing a promising direction for further research.

  • Focus on Default Algorithm Implementations: The study relied on standard implementations of algorithms mainly from well-known libraries. Investigating variations or more recent advancements within these algorithm families could offer additional insight.

These limitations are common in empirical studies of this nature and primarily highlight areas where this already extensive work could be expanded in the future.

VI Conclusion

This comprehensive empirical study investigated the impact of 12 feature scaling techniques on 14 Machine Learning algorithms in 16 classification and regression datasets. Key findings reaffirmed the robustness of ensemble methods (e.g., Random Forest, gradient boosting family), which, along with Naive Bayes, largely maintained high performance irrespective of scaling. This offers efficiency gains by potentially avoiding preprocessing overhead. In stark contrast, models such as Logistic Regression, SVMs, MLPs, K-Nearest Neighbor, and TabNet demonstrated high sensitivity, with their performance critically dependent on scaler choice — a pattern consistent across both task types. Computational analysis also indicated that scaling choices can influence training/inference times and memory usage, with certain scalers being notably more resource-intensive.

This study contributes one of the first systematic evaluations of such an extensive array of models (including less common ones such as TabNet) and scaling techniques, among them transformations such as Tanh (TT) and Hyperbolic Tangent (HT) that are rarely benchmarked as general-purpose scalers, all within a unified Python framework. This is particularly relevant given that feature scaling is often applied in the literature without clear rationale, sometimes incorrectly before data splitting, which can lead to data leakage, or without verifying algorithm-specific benefits. By providing broad empirical evidence, our work offers clear guidance on how to mitigate these common issues, promoting informed scaling selection and more rigorous experimental design.

Future research could extend these insights by exploring extensive hyperparameter optimization, incorporating more diverse datasets, and utilizing a broader suite of evaluation metrics. Nevertheless, this study significantly contributes to a deeper, practical understanding of feature scaling’s role in Machine Learning.

References

  • [1] L. Zhou, S. Pan, J. Wang, and A. V. Vasilakos, “Machine learning on big data: Opportunities and challenges,” Neurocomputing, vol. 237, pp. 350–361, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0925231217300577
  • [2] X. Wu, V. Kumar, J. Ross Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand, and D. Steinberg, “Top 10 algorithms in data mining,” Knowl. Inf. Syst., vol. 14, no. 1, p. 1–37, dec 2007. [Online]. Available: https://doi.org/10.1007/s10115-007-0114-2
  • [3] K. Shailaja, B. Seetharamulu, and M. A. Jabbar, “Machine learning in healthcare: A review,” in 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2018, pp. 910–914.
  • [4] C. J. Haug and J. M. Drazen, “Artificial intelligence and machine learning in clinical medicine, 2023,” New England Journal of Medicine, vol. 388, no. 13, pp. 1201–1208, 2023. [Online]. Available: https://www.nejm.org/doi/full/10.1056/NEJMra2302038
  • [5] Z. Obermeyer and E. J. Emanuel, “Predicting the future — big data, machine learning, and clinical medicine,” New England Journal of Medicine, vol. 375, no. 13, pp. 1216–1219, 2016. [Online]. Available: https://www.nejm.org/doi/full/10.1056/NEJMp1606181
  • [6] Y. Wang, Y. Fan, P. Bhatt, and C. Davatzikos, “High-dimensional pattern regression using machine learning: From medical images to continuous clinical variables,” NeuroImage, vol. 50, no. 4, pp. 1519–1535, 2010. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1053811909013810
  • [7] N. K. Ahmed, A. F. Atiya, N. E. Gayar, and H. El-Shishiny, “An empirical comparison of machine learning models for time series forecasting,” Econometric Reviews, vol. 29, no. 5-6, pp. 594–621, 2010. [Online]. Available: https://doi.org/10.1080/07474938.2010.481556
  • [8] R. P. Masini, M. C. Medeiros, and E. F. Mendes, “Machine learning advances for time series forecasting,” Journal of Economic Surveys, vol. 37, no. 1, pp. 76–111, 2023. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/joes.12429
  • [9] G. Bontempi, S. Ben Taieb, and Y.-A. Le Borgne, Machine Learning Strategies for Time Series Forecasting. Springer Berlin Heidelberg, 01 2013, vol. 138, pp. 62–77.
  • [10] F. Sun, X. Meng, Y. Zhang, Y. Wang, H. Jiang, and P. Liu, “Agricultural product price forecasting methods: A review,” Agriculture, vol. 13, no. 9, 2023. [Online]. Available: https://www.mdpi.com/2077-0472/13/9/1671
  • [11] A. Sharma, A. Jain, P. Gupta, and V. Chowdary, “Machine learning applications for precision agriculture: A comprehensive review,” IEEE Access, vol. 9, pp. 4843–4873, 2021.
  • [12] M. A. Alsheikh, S. Lin, D. Niyato, and H.-P. Tan, “Machine learning in wireless sensor networks: Algorithms, strategies, and applications,” IEEE Communications Surveys & Tutorials, vol. 16, no. 4, pp. 1996–2018, 2014.
  • [13] S. Wang, J. Huang, Z. Chen, Y. Song, W. Tang, H. Mao, W. Fan, H. Liu, X. Liu, D. Yin, and Q. Li, “Graph machine learning in the era of large language models (llms),” ACM Trans. Intell. Syst. Technol., May 2025, just Accepted. [Online]. Available: https://doi.org/10.1145/3732786
  • [14] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” 2023. [Online]. Available: https://arxiv.org/abs/1706.03762
  • [15] A. Conneau, H. Schwenk, L. Barrault, and Y. Lecun, “Very deep convolutional networks for text classification,” 2017. [Online]. Available: https://arxiv.org/abs/1606.01781
  • [16] P. P. Shinde and S. Shah, “A review of machine learning and deep learning applications,” in 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), 2018, pp. 1–6.
  • [17] A. P. Singh, V. K. Mishra, and S. Akhter, “Investigating machine learning applications for fdsoi mos-based computer-aided design,” in 2023 9th International Conference on Signal Processing and Communication (ICSC), 2023, pp. 708–713.
  • [18] E. R. Hruschka, R. J. G. B. Campello, A. A. Freitas, and A. C. Ponce Leon F. de Carvalho, “A survey of evolutionary algorithms for clustering,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 39, no. 2, pp. 133–155, 2009.
  • [19] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. Alsaadi, “A survey of deep neural network architectures and their applications,” Neurocomputing, vol. 234, 12 2016.
  • [20] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
  • [21] M. Little, Machine Learning for Signal Processing: Data Science, Algorithms, and Computational Statistics. Oxford University Press, 2019. [Online]. Available: https://books.google.com.br/books?id=mDGoDwAAQBAJ
  • [22] D. Donoho, “50 years of data science,” Journal of Computational and Graphical Statistics, vol. 26, no. 4, pp. 745–766, 2017. [Online]. Available: https://doi.org/10.1080/10618600.2017.1384734
  • [23] D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, and D. Dennison, “Hidden technical debt in machine learning systems,” Advances in neural information processing systems, vol. 28, 2015.
  • [24] A. Holzinger, P. Kieseberg, E. Weippl, and A. M. Tjoa, “Current advances, trends and challenges of machine learning and knowledge extraction: From machine learning to explainable ai,” in Machine Learning and Knowledge Extraction, A. Holzinger, P. Kieseberg, A. M. Tjoa, and E. Weippl, Eds. Cham: Springer International Publishing, 2018, pp. 1–8.
  • [25] S. Kaufman, S. Rosset, and C. Perlich, “Leakage in data mining: formulation, detection, and avoidance,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’11. New York, NY, USA: Association for Computing Machinery, 2011, p. 556–563. [Online]. Available: https://doi.org/10.1145/2020408.2020496
  • [26] S. García, J. Luengo, and F. Herrera, Data Preprocessing in Data Mining, ser. Intelligent Systems Reference Library. Cham: Springer International Publishing, 2015, vol. 72. [Online]. Available: https://doi.org/10.1007/978-3-319-10247-4
  • [27] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, ser. The Morgan Kaufmann Series in Data Management Systems. Elsevier Science, 2011. [Online]. Available: https://shop.elsevier.com/books/data-mining-concepts-and-techniques/han/978-0-12-381479-1
  • [28] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed. Springer New York, NY, 2009.
  • [29] L. A. Shalabi, Z. Shaaban, and B. Kasasbeh, “Data mining: A preprocessing engine,” Journal of Computer Science, vol. 2, no. 9, pp. 735–739, Sep 2006. [Online]. Available: https://thescipub.com/abstract/jcssp.2006.735.739
  • [30] D. M. Hawkins, “The problem of overfitting,” Journal of Chemical Information and Computer Sciences, vol. 44, no. 1, pp. 1–12, 2004, pMID: 14741005. [Online]. Available: https://doi.org/10.1021/ci0342472
  • [31] O. E. Gundersen, K. Coakley, C. Kirkpatrick, and Y. Gil, “Sources of irreproducibility in machine learning: A review,” 2023. [Online]. Available: https://arxiv.org/abs/2204.07610
  • [32] M. B. A. McDermott, S. Wang, N. Marinsek, R. Ranganath, M. Ghassemi, and L. Foschini, “Reproducibility in machine learning for health,” 2019. [Online]. Available: https://arxiv.org/abs/1907.01463
  • [33] H. Semmelrock, S. Kopeinik, D. Theiler, T. Ross-Hellauer, and D. Kowald, “Reproducibility in machine learning-driven research,” 2023. [Online]. Available: https://arxiv.org/abs/2307.10320
  • [34] B. Haibe-Kains, G. A. Adam, A. Hosny, F. Khodakarami, T. Shraddha, R. Kusko, S.-A. Sansone, W. Tong, R. D. Wolfinger, C. E. Mason, W. Jones, J. Dopazo, C. Furlanello, L. Waldron, B. Wang, C. McIntosh, A. Goldenberg, A. Kundaje, C. S. Greene, T. Broderick, M. M. Hoffman, J. T. Leek, K. Korthauer, W. Huber, A. Brazma, J. Pineau, R. Tibshirani, T. Hastie, J. P. A. Ioannidis, J. Quackenbush, and H. J. W. L. Aerts, “Transparency and reproducibility in artificial intelligence,” Nature, vol. 586, no. 7829, p. E14–E16, Oct. 2020. [Online]. Available: http://dx.doi.org/10.1038/s41586-020-2766-y
  • [35] H. Semmelrock, T. Ross-Hellauer, S. Kopeinik, D. Theiler, A. Haberl, S. Thalmann, and D. Kowald, “Reproducibility in machine-learning-based research: Overview, barriers, and drivers,” AI Magazine, vol. 46, no. 2, p. e70002, 2025. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/aaai.70002
  • [36] S. Kapoor and A. Narayanan, “Leakage and the reproducibility crisis in machine-learning-based science,” Patterns, vol. 4, no. 9, p. 100804, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2666389923001599
  • [37] R. Shwartz-Ziv and A. Armon, “Tabular data: Deep learning is not all you need,” Information Fusion, vol. 81, pp. 84–90, 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1566253521002360
  • [38] Y. Gorishniy, I. Rubachev, V. Khrulkov, and A. Babenko, “Revisiting deep learning models for tabular data,” in Proceedings of the 35th International Conference on Neural Information Processing Systems, ser. NIPS ’21. Red Hook, NY, USA: Curran Associates Inc., 2021.
  • [39] R. Levin, V. Cherepanova, A. Schwarzschild, A. Bansal, C. B. Bruss, T. Goldstein, A. G. Wilson, and M. Goldblum, “Transfer learning with deep tabular models,” 2023. [Online]. Available: https://arxiv.org/abs/2206.15306
  • [40] H.-J. Ye, S.-Y. Liu, H.-R. Cai, Q.-L. Zhou, and D.-C. Zhan, “A closer look at deep learning methods on tabular datasets,” 2025. [Online]. Available: https://arxiv.org/abs/2407.00956
  • [41] D. Singh and B. Singh, “Investigating the impact of data normalization on classification performance,” Applied Soft Computing, vol. 97, p. 105524, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1568494619302947
  • [42] K. Maharana, S. Mondal, and B. Nemade, “A review: Data pre-processing and data augmentation techniques,” Global Transitions Proceedings, vol. 3, no. 1, pp. 91–99, 2022, International Conference on Intelligent Engineering Approach (ICIEA-2022). [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2666285X22000565
  • [43] S. Aksoy and R. M. Haralick, “Feature normalization and likelihood-based similarity measures for image retrieval,” Pattern Recognition Letters, vol. 22, no. 5, pp. 563–582, 2001, Image/Video Indexing and Retrieval. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167865500001124
  • [44] C. Wongoutong, “The impact of neglecting feature scaling in k-means clustering,” PLOS ONE, vol. 19, 12 2024.
  • [45] R. F. de Mello and M. A. Ponti, Machine Learning: A Practical Approach on the Statistical Learning Theory. Cham: Springer International Publishing, 2018. [Online]. Available: https://doi.org/10.1007/978-3-319-94989-5
  • [46] T. Jayalakshmi and A. Santhakumaran, “Statistical normalization and back propagation for classification,” International Journal of Computer Theory and Engineering, vol. 3, no. 1, 2011.
  • [47] C.-W. Hsu, C.-C. Chang, C.-J. Lin et al., “A practical guide to support vector classification,” 2003.
  • [48] J. Pan, Y. Zhuang, and S. Fong, “The impact of data normalization on stock market prediction: Using svm and technical indicators,” in Soft Computing in Data Science, M. W. Berry, A. Hj. Mohamed, and B. W. Yap, Eds. Singapore: Springer Singapore, 2016, pp. 72–88.
  • [49] X. Wen, L. Shao, W. Fang, and Y. Xue, “Efficient feature selection and classification for vehicle detection,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 3, pp. 508–517, 2015.
  • [50] W. Li and Z. Liu, “A method of svm with normalization in intrusion detection,” Procedia Environmental Sciences, vol. 11, pp. 256–262, 2011, 2011 2nd International Conference on Challenges in Environmental Science and Computer Engineering (CESCE 2011). [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1878029611008632
  • [51] M. M. Ahsan, M. A. P. Mahmud, P. K. Saha, K. D. Gupta, and Z. Siddique, “Effect of data scaling methods on machine learning algorithms and model performance,” Technologies, vol. 9, no. 3, 2021. [Online]. Available: https://www.mdpi.com/2227-7080/9/3/52
  • [52] A. Janosi, W. Steinbrunn, M. Pfisterer, and R. Detrano, “Heart Disease,” UCI Machine Learning Repository, 1988, DOI: https://doi.org/10.24432/C52P4X.
  • [53] D. U. Ozsahin, M. Taiwo Mustapha, A. S. Mubarak, Z. Said Ameen, and B. Uzun, “Impact of feature scaling on machine learning models for the diagnosis of diabetes,” in 2022 International Conference on Artificial Intelligence in Everything (AIE), 2022, pp. 87–94.
  • [54] X. H. Cao, I. Stojkovic, and Z. Obradovic, “A robust data scaling algorithm to improve classification accuracies in biomedical data,” BMC Bioinformatics, vol. 17, no. 1, Sep 2016.
  • [55] X. Wan, “Influence of feature scaling on convergence of gradient iterative algorithm,” Journal of Physics: Conference Series, vol. 1213, no. 3, p. 032021, jun 2019. [Online]. Available: https://dx.doi.org/10.1088/1742-6596/1213/3/032021
  • [56] A. Kadir, L. E. Nugroho, A. Susanto, and P. I. Santosa, “Leaf classification using shape, color, and texture features,” CoRR, vol. abs/1401.4447, 2014. [Online]. Available: http://arxiv.org/abs/1401.4447
  • [57] C.-M. Wang and Y.-F. Huang, “Evolutionary-based feature selection approaches with new criteria for data mining: A case study of credit approval data,” Expert Systems with Applications, vol. 36, pp. 5900–5908, 04 2009.
  • [58] A. Craig, O. Cloarec, E. Holmes, J. Nicholson, and J. Lindon, “Scaling and normalization effects in nmr spectroscopic metabonomic data sets,” Analytical Chemistry, vol. 78, pp. 2262–2267, 05 2006.
  • [59] R. van den Berg, H. Hoefsloot, J. Westerhuis, A. Smilde, and M. van der Werf, “Centering, scaling, and transformations: improving the biological information content of metabolomics data,” BMC Genomics, vol. 7, p. 142, 02 2006.
  • [60] M. Z. Rodriguez, C. H. Comin, D. Casanova, O. M. Bruno, D. R. Amancio, L. d. F. Costa, and F. A. Rodrigues, “Clustering algorithms: A comparative approach,” PLOS ONE, vol. 14, no. 1, pp. 1–34, 01 2019. [Online]. Available: https://doi.org/10.1371/journal.pone.0210236
  • [61] U. R. Acharya, S. Dua, X. Du, S. V. Sree, and C. K. Chua, “Automated diagnosis of glaucoma using texture and higher order spectra features,” IEEE Transactions on Information Technology in Biomedicine, vol. 15, no. 3, pp. 449–455, 2011.
  • [62] K. Mahmud Sujon, R. Binti Hassan, Z. Tusnia Towshi, M. A. Othman, M. Abdus Samad, and K. Choi, “When to use standardization and normalization: Empirical evidence from machine learning models and xai,” IEEE Access, vol. 12, pp. 135 300–135 314, 2024.
  • [63] W. Wolberg, O. Mangasarian, N. Street, and W. Street, “Breast Cancer Wisconsin (Diagnostic),” UCI Machine Learning Repository, 1995, DOI: https://doi.org/10.24432/C5DW2B.
  • [64] “Dry Bean Dataset,” UCI Machine Learning Repository, 2020, DOI: https://doi.org/10.24432/C50S4B.
  • [65] B. German, “Glass Identification,” UCI Machine Learning Repository, 1987, DOI: https://doi.org/10.24432/C5WW2P.
  • [66] R. A. Fisher, “Iris,” UCI Machine Learning Repository, 1988, DOI: https://doi.org/10.24432/C56C76.
  • [67] D. Slate, “Letter Recognition,” UCI Machine Learning Repository, 1991, DOI: https://doi.org/10.24432/C5ZP40.
  • [68] R. Bock, “MAGIC Gamma Telescope,” UCI Machine Learning Repository, 2007, DOI: https://doi.org/10.24432/C52C8B.
  • [69] “Rice (Cammeo and Osmancik),” UCI Machine Learning Repository, 2019, DOI: https://doi.org/10.24432/C5MW4Z.
  • [70] S. Aeberhard and M. Forina, “Wine,” UCI Machine Learning Repository, 1991, DOI: https://doi.org/10.24432/C5PC7J.
  • [71] S. Vito, “Air Quality,” UCI Machine Learning Repository, 2008, DOI: https://doi.org/10.24432/C59K5F.
  • [72] W. Nash, T. Sellers, S. Talbot, A. Cawthorn, and W. Ford, “Abalone,” UCI Machine Learning Repository, 1994, DOI: https://doi.org/10.24432/C55C7W.
  • [73] L. Candanedo, “Appliances Energy Prediction,” UCI Machine Learning Repository, 2017, DOI: https://doi.org/10.24432/C5VC8G.
  • [74] I.-C. Yeh, “Concrete Compressive Strength,” UCI Machine Learning Repository, 1998, DOI: https://doi.org/10.24432/C5PK67.
  • [75] P. Cortez and A. Morais, “Forest Fires,” UCI Machine Learning Repository, 2007, DOI: https://doi.org/10.24432/C5D88D.
  • [76] I.-C. Yeh, “Real Estate Valuation,” UCI Machine Learning Repository, 2018, DOI: https://doi.org/10.24432/C5J30W.
  • [77] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, “Wine Quality,” UCI Machine Learning Repository, 2009, DOI: https://doi.org/10.24432/C56S3T.
  • [78] S. Raschka, “Model evaluation, model selection, and algorithm selection in machine learning,” 2020. [Online]. Available: https://arxiv.org/abs/1811.12808
  • [79] D. Wilimitis and C. G. Walsh, “Practical considerations and applied examples of cross-validation for model development and evaluation in health care: Tutorial,” JMIR AI, vol. 2, p. e49023, Dec 2023. [Online]. Available: https://ai.jmir.org/2023/1/e49023
  • [80] V. R. Joseph, “Optimal ratio for data splitting,” Statistical Analysis and Data Mining: An ASA Data Science Journal, vol. 15, no. 4, pp. 531–538, 2022. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/sam.11583
  • [81] A. Jain, K. Nandakumar, and A. Ross, “Score normalization in multimodal biometric systems,” Pattern Recognition, vol. 38, no. 12, pp. 2270–2285, 2005. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0031320305000592
  • [82] K. Cabello-Solorzano, I. Ortigosa de Araujo, M. Peña, L. Correia, and A. J. Tallón-Ballesteros, “The impact of data normalization on the accuracy of machine learning algorithms: A comparative analysis,” in 18th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2023), P. García Bringas, H. Pérez García, F. J. Martínez de Pisón, F. Martínez Álvarez, A. Troncoso Lora, Á. Herrero, J. L. Calvo Rolle, H. Quintián, and E. Corchado, Eds. Cham: Springer Nature Switzerland, 2023, pp. 344–353.
  • [83] A. Reverter, W. Barris, S. McWilliam, K. A. Byrne, Y. H. Wang, S. H. Tan, N. Hudson, and B. P. Dalrymple, “Validation of alternative methods of data normalization in gene co-expression studies,” Bioinformatics, vol. 21, no. 7, pp. 1112–1120, 11 2004. [Online]. Available: https://doi.org/10.1093/bioinformatics/bti124
  • [84] I. Noda, “Scaling techniques to enhance two-dimensional correlation spectra,” Journal of Molecular Structure, vol. 883-884, pp. 216–227, 2008, Progress in Two-Dimensional Correlation Spectroscopy. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0022286007008411
  • [85] L. Eriksson, E. Johansson, S. Kettapeh-Wold, and S. Wold, Introduction to Multi- and Megavariate Data Analysis Using Projection Methods (PCA & PLS). Umetrics AB, 1999. [Online]. Available: https://books.google.com.br/books?id=3aW8GwAACAAJ
  • [86] H. Kubinyi, G. Folkers, and Y. Martin, 3D QSAR in Drug Design: Recent Advances, ser. Three-Dimensional Quantitative Structure Activity Relationships. Springer Netherlands, 2006. [Online]. Available: https://books.google.com.br/books?id=8GnrBwAAQBAJ
  • [87] D. Kim and K. You, “Pca, svd, and centering of data,” 2024. [Online]. Available: https://arxiv.org/abs/2307.15213
  • [88] V. N. G. Raju, K. P. Lakshmi, V. M. Jain, A. Kalidindi, and V. Padma, “Study the influence of normalization/transformation process on the accuracy of supervised classification,” in 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), 2020, pp. 729–735.
  • [89] R. Snelick, U. Uludag, A. Mink, M. Indovina, and A. Jain, “Large-scale evaluation of multimodal biometric authentication using state-of-the-art systems,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, pp. 450–455, 04 2005.
  • [90] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Fourth Edition, 4th ed. USA: Academic Press, Inc., 2008.
  • [91] K. Priddy and P. Keller, Artificial Neural Networks: An Introduction, ser. SPIE tutorial texts. SPIE Press, 2005. [Online]. Available: https://books.google.com.br/books?id=BrnHR7esWmkC
  • [92] D. W. Hosmer, S. Lemeshow, and R. X. Sturdivant, Applied Logistic Regression, 3rd ed. Wiley, 2013.
  • [93] G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning: With Applications in R. Springer, 2013.
  • [94] X. Su, X. Yan, and C.-L. Tsai, “Linear regression,” WIREs Computational Statistics, vol. 4, no. 3, pp. 275–294, 2012. [Online]. Available: https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/wics.1198
  • [95] C. Cortes and V. Vapnik, “Support-vector networks,” Machine learning, vol. 20, no. 3, pp. 273–297, 1995.
  • [96] A. Ben-Hur and J. Weston, “A user’s guide to support vector machines,” Methods in Molecular Biology (Clifton, N.J.), vol. 609, pp. 223–239, 01 2010.
  • [97] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik, “Support vector regression machines,” in Advances in Neural Information Processing Systems, M. Mozer, M. Jordan, and T. Petsche, Eds., vol. 9. MIT Press, 1996. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/1996/file/d38901788c533e8286cb6400b40b386d-Paper.pdf
  • [98] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 05 2015.
  • [99] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” 2015. [Online]. Available: https://arxiv.org/abs/1502.01852
  • [100] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2017. [Online]. Available: https://arxiv.org/abs/1412.6980
  • [101] L. Breiman, “Random forests,” Machine Learning, vol. 45, pp. 5–32, 10 2001.
  • [102] A. Parmar, R. Katariya, and V. Patel, “A review on random forest: An ensemble classifier,” in International Conference on Intelligent Data Communication Technologies and Internet of Things (ICICI) 2018, J. Hemanth, X. Fernando, P. Lafata, and Z. Baig, Eds. Cham: Springer International Publishing, 2019, pp. 758–763.
  • [103] C. Zhang and Y. Ma, Ensemble machine learning: Methods and applications. Springer New York, 01 2012.
  • [104] I. Rish, “An empirical study of the naïve bayes classifier,” in IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, vol. 3, 2001.
  • [105] L. Breiman, J. Friedman, C. Stone, and R. Olshen, Classification and Regression Trees. Taylor & Francis, 1984.
  • [106] J. Singh Kushwah, A. Kumar, S. Patel, R. Soni, A. Gawande, and S. Gupta, “Comparative study of regressor and classifier with decision tree using modern tools,” Materials Today: Proceedings, vol. 56, pp. 3571–3576, 2022, First International Conference on Design and Materials. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2214785321076574
  • [107] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “Lightgbm: A highly efficient gradient boosting decision tree,” in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf
  • [108] Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, 1997. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S002200009791504X
  • [109] R. Schapire, “The boosting approach to machine learning: An overview,” in Nonlinear Estimation and Classification, ser. Lecture Notes in Statistics, vol. 171, 2002, pp. 149–171.
  • [110] A. V. Dorogush, A. Gulin, G. Gusev, N. Kazeev, L. O. Prokhorenkova, and A. Vorobev, “Fighting biases with dynamic boosting,” CoRR, vol. abs/1706.09516, 2017. [Online]. Available: http://arxiv.org/abs/1706.09516
  • [111] T. Chen and C. Guestrin, “Xgboost: A scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’16. ACM, Aug. 2016, p. 785–794. [Online]. Available: http://dx.doi.org/10.1145/2939672.2939785
  • [112] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967.
  • [113] S. Zhang, X. Li, M. Zong, X. Zhu, and R. Wang, “Efficient knn classification with different numbers of nearest neighbors,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 5, pp. 1774–1785, 2018.
  • [114] S. Zhang, X. Li, M. Zong, X. Zhu, and D. Cheng, “Learning k for knn classification,” ACM Trans. Intell. Syst. Technol., vol. 8, no. 3, Jan. 2017. [Online]. Available: https://doi.org/10.1145/2990508
  • [115] S. O. Arik and T. Pfister, “Tabnet: Attentive interpretable tabular learning,” 2020. [Online]. Available: https://arxiv.org/abs/1908.07442
  • [116] D. M. Powers, “Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation,” Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63, 2011.
  • [117] N. J. D. Nagelkerke, “A note on a general definition of the coefficient of determination,” Biometrika, vol. 78, no. 3, pp. 691–692, 09 1991. [Online]. Available: https://doi.org/10.1093/biomet/78.3.691
  • [118] D. Chicco and G. Jurman, “The coefficient of determination r-squared is more informative than smape, mae, mape, mse and rmse in regression analysis evaluation,” PeerJ Computer Science, vol. 7, p. e623, 2021.
João Manoel Herrera Pinheiro received the B.Sc. degree in Mechatronics Engineering and is currently pursuing an M.Sc. degree at the University of São Paulo, with a focus on Computer Vision and Machine Learning. He is also enrolled in two specialization programs: Didactic-Pedagogical Processes for Distance Learning at UNIVESP and Software Engineering at USP. He serves as a reviewer for international journals such as Nature Scientific Data, Artificial Intelligence (IBERAMIA), and the Journal of the Brazilian Computer Society.
Suzana Vilas Boas de Oliveira is currently pursuing a Ph.D. degree in the Signal Processing and Instrumentation Program at the University of São Paulo (EESC-USP), with a research focus on the development of a non-invasive motor imagery-based brain-computer interface (MI-BCI) for interaction with 3D images and automatic wheelchairs. She received her B.Sc. degree in Electrical Engineering, with emphasis on Electronics and special studies in Biomedical Engineering, from the same institution. Her research interests include neuroscience, neuroplasticity, brain-computer interfaces, and artificial intelligence.
Thiago Henrique Segreto Silva received the B.S. degree in Mechatronics Engineering from the University of São Paulo, São Carlos, Brazil, in 2021, and the M.Sc. degree in Robotics with a specialization in Computer Vision from the same institution in 2025. He is currently pursuing a Ph.D. degree in Robotics, focusing on perception systems integrated with reinforcement learning. His research interests include robotic perception, deep reinforcement learning, computer vision, and autonomous systems, aiming at the development of intelligent, adaptive robots capable of operating effectively in complex, dynamic environments.
Pedro Antonio Rabelo Saraiva is currently pursuing a Bachelor of Engineering degree in Mechatronics Engineering at the University of São Paulo (USP). He is an undergraduate researcher within USP’s Mobile Robotics Group, where he contributes to advancements in the field of mobile robotics, with a focus on applications for oil platforms.
Enzo Ferreira de Souza is currently pursuing a Bachelor of Engineering degree in Mechatronics Engineering at the University of São Paulo (USP). He is an undergraduate researcher within USP’s Mobile Robotics Group, where he contributes to advancements in the field of mobile robotics, with a focus on applications for oil platforms.
Ricardo V. Godoy received the Bachelor of Engineering degree in Mechatronics Engineering in 2019 and the M.Sc. degree in Mechanical Engineering in 2021, both from the University of São Paulo, São Carlos, Brazil. He received his Ph.D. in Mechatronics Engineering with the New Dexterity Research Group at the University of Auckland, New Zealand, where he worked on the analysis and development of novel human-machine interfaces (HMIs) for the control of robotic and bionic devices, focusing on the challenges and limitations of HMIs for robust grasping and for decoding dexterous, in-hand manipulation tasks. He is currently a postdoctoral researcher at the University of São Paulo, São Carlos, Brazil, working on robotic frameworks for inspection and automation, with a focus on manipulation and loco-manipulation.
Leonardo André Ambrosio received the B.Sc., M.Sc., and Ph.D. degrees in Electrical Engineering from the University of Campinas, Brazil, in 2002, 2005, and 2009, respectively. Between 2009 and 2013 he was a postdoctoral fellow with the Department of Microwaves and Optics at the School of Electrical and Computer Engineering, University of Campinas, and developed part of his research at the University of Pennsylvania, Philadelphia, USA. He is currently an Associate Professor at the University of São Paulo (SEL-EESC) and coordinates the Applied Electromagnetics Group (AEG). His research interests include photonics, light-scattering problems for optical trapping and manipulation, and the modeling of non-diffracting beams for applications in biomedical optics, telecommunications, holography, volumetric displays, and atom guiding. He is also interested in brain-computer interfaces for entertainment, games, and the metaverse, envisioning mind control of three-dimensional volumetric displays.
Marcelo Becker received the B.Sc. degree in Mechanical Engineering (Mechatronics) from the University of São Paulo, Brazil, in 1993, and the M.Sc. and D.Sc. degrees in Mechanical Engineering from the University of Campinas, Brazil, in 1997 and 2000, respectively. He was a visiting researcher at ETH Zürich and did a sabbatical at EPF Lausanne, Switzerland. He is currently an Associate Professor at the University of São Paulo and coordinates the Mobile Robotics Group and the USP Center of Robotics (CRob). His research interests include mobile robotics, automation, perception systems, and mechatronic design for applications in agriculture and industrial automation.

Appendix A: Results Tables
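For context on how grids such as the tables below can be assembled, the following is a minimal, illustrative sketch, not the authors' released code. It assumes scikit-learn-style pipelines and a single hold-out split, and it abbreviates the scaler, model, and dataset lists; the mapping of the column abbreviations MM, ZSN, and RS to min-max, z-score, and robust scaling is likewise an assumption for illustration.

from sklearn.datasets import load_iris, load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Abbreviated scaler/model/dataset grids; "NO" denotes no scaling.
scalers = {"NO": None, "MM": MinMaxScaler(), "ZSN": StandardScaler(), "RS": RobustScaler()}
models = {"LR": LogisticRegression(max_iter=1000), "KNN": KNeighborsClassifier()}
datasets = {"Iris": load_iris(return_X_y=True), "Wine": load_wine(return_X_y=True)}

for ds_name, (X, y) in datasets.items():
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y)
    for m_name, model in models.items():
        for s_name, scaler in scalers.items():
            # Fitting the scaler inside the pipeline on training data only
            # keeps test-set statistics out of the preprocessing step.
            steps = [scaler, model] if scaler is not None else [model]
            pipe = make_pipeline(*steps)
            pipe.fit(X_tr, y_tr)
            print(f"{ds_name} | {m_name} | {s_name}: {pipe.score(X_te, y_te):.4f}")

Fitting the scaler within the pipeline, rather than on the full dataset, matters here: scaling the data before the split would leak test-set statistics into preprocessing, one of the pitfalls discussed in [25, 36].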

TABLE XII: Accuracy by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Breast Cancer Wisconsin Diagnostic Ada 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766
CART 0.9415 0.9415 0.9415 0.9415 0.9415 0.9415 0.9415 0.9415 0.9474 0.9415 0.9415 0.9474 0.9474
CatBoost 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766 0.9766
KNN 0.9591 0.9649 0.9766 0.9591 0.9766 0.9591 0.9591 0.9649 0.9649 0.9474 0.9591 0.9591 0.9591
LGBM 0.9474 0.9474 0.9474 0.9591 0.9591 0.9591 0.9591 0.9591 0.9474 0.9474 0.9415 0.9415 0.9591
LR 0.9708 0.9649 0.9532 0.9825 0.9766 0.9883 0.9708 0.9883 0.9942 0.9123 0.6316 0.9883 0.9942
MLP 0.9649 0.9766 0.9766 0.9766 0.9883 0.9708 0.9006 0.9708 0.9766 0.9825 0.6316 0.9825 0.9883
NB 0.9415 0.9357 0.9357 0.9357 0.9357 0.9357 0.9415 0.9357 0.9532 0.9357 0.9357 0.9532 0.9532
RF 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708 0.9708
SVM 0.9240 0.9883 0.9825 0.9766 0.9766 0.9766 0.9591 0.9825 0.9766 0.9532 0.6316 0.9883 0.9883
TabNet 0.5673 0.7310 0.6608 0.6608 0.7544 0.5906 0.7018 0.6433 0.6140 0.6023 0.3684 0.5263 0.7076
XGBoost 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649 0.9649
Dry Bean Ada 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467 0.6467
CART 0.8881 0.8881 0.8881 0.8881 0.8881 0.8881 0.8881 0.8881 0.8879 0.8881 0.8881 0.8881 0.8881
CatBoost 0.9305 0.9297 0.9300 0.9297 0.9297 0.9297 0.9297 0.9297 0.9297 0.9305 0.9319 0.9297 0.9297
KNN 0.7113 0.9197 0.9141 0.9216 0.8984 0.7507 0.7113 0.9190 0.9192 0.9158 0.9216 0.9229 0.9229
LGBM 0.9275 0.9275 0.9275 0.9263 0.9263 0.9263 0.9263 0.9275 0.9275 0.9275 0.9275 0.9275 0.9263
LR 0.7054 0.9185 0.9070 0.9229 0.9089 0.8805 0.6876 0.9234 0.9175 0.9070 0.3127 0.9246 0.9251
MLP 0.2980 0.9224 0.9101 0.9327 0.9119 0.9199 0.5992 0.9314 0.9319 0.9087 0.8834 0.9297 0.9334
NB 0.7627 0.8999 0.8999 0.8999 0.9003 0.8999 0.7627 0.8999 0.8905 0.8999 0.8999 0.8979 0.8979
RF 0.9238 0.9226 0.9226 0.9226 0.9226 0.9226 0.9238 0.9226 0.9231 0.9226 0.9226 0.9231 0.9231
SVM 0.5803 0.9246 0.9109 0.9263 0.5553 0.6060 0.1379 0.9268 0.9273 0.9111 0.7253 0.9253 0.9263
TabNet 0.8763 0.9285 0.9170 0.9273 0.9209 0.9224 0.9038 0.9243 0.9155 0.9197 0.5228 0.9246 0.9258
XGBoost 0.9261 0.9273 0.9256 0.9273 0.9273 0.9273 0.9273 0.9273 0.9265 0.9261 0.9248 0.9265 0.9273
Glass Identification Ada 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077 0.5077
CART 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462 0.6462
CatBoost 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846 0.7846
KNN 0.5846 0.5846 0.6615 0.6308 0.6308 0.5385 0.5846 0.6308 0.6615 0.6462 0.6308 0.6923 0.6923
LGBM 0.8154 0.8000 0.8154 0.8000 0.8000 0.8000 0.8000 0.8308 0.8000 0.8154 0.8154 0.8154 0.8000
LR 0.6615 0.5231 0.5077 0.6769 0.7077 0.5385 0.6769 0.6615 0.6615 0.4308 0.3538 0.6000 0.6462
MLP 0.7077 0.7231 0.7231 0.6923 0.6615 0.7385 0.6615 0.6769 0.7538 0.6462 0.3538 0.7538 0.6462
NB 0.3077 0.3077 0.3077 0.3077 0.3077 0.4923 0.3077 0.3077 0.2615 0.3077 0.3077 0.3231 0.3231
RF 0.7538 0.7538 0.7538 0.7692 0.7538 0.7692 0.7538 0.7846 0.8000 0.7846 0.7692 0.7692 0.7692
SVM 0.6769 0.6308 0.5077 0.6615 0.6615 0.3231 0.6769 0.6462 0.6769 0.4923 0.3538 0.6615 0.6462
TabNet 0.1692 0.3231 0.0923 0.2154 0.1846 0.1538 0.3231 0.1692 0.2923 0.2923 0.2923 0.2923 0.2615
XGBoost 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692 0.7692
Heart Disease Ada 0.5385 0.5385 0.5385 0.5385 0.5385 0.5385 0.5385 0.5385 0.5495 0.5385 0.5385 0.5385 0.5385
CART 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176 0.4176
CatBoost 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714 0.5714
KNN 0.4945 0.5275 0.5495 0.5714 0.5165 0.4835 0.4945 0.5275 0.5714 0.5824 0.5714 0.5495 0.5495
LGBM 0.5275 0.5275 0.5275 0.5275 0.5275 0.5275 0.5275 0.5385 0.5275 0.5275 0.5275 0.5275 0.5275
LR 0.5934 0.5824 0.5934 0.5385 0.5495 0.5604 0.5495 0.5714 0.5714 0.5604 0.5275 0.5934 0.5824
MLP 0.3516 0.5824 0.5495 0.5055 0.5055 0.4945 0.5604 0.5385 0.5275 0.5604 0.5275 0.5824 0.5385
NB 0.4396 0.3187 0.3187 0.2967 0.3846 0.3846 0.4396 0.3407 0.3187 0.3187 0.2967 0.3297 0.3297
RF 0.5604 0.5604 0.5604 0.5604 0.5604 0.5604 0.5604 0.5604 0.5385 0.5604 0.5604 0.5604 0.5604
SVM 0.5055 0.6154 0.5934 0.6154 0.5275 0.5385 0.4945 0.5934 0.5934 0.5714 0.5275 0.5934 0.5934
TabNet 0.3187 0.1099 0.0989 0.1099 0.2418 0.2198 0.1538 0.1758 0.1209 0.1099 0.1099 0.1209 0.1538
XGBoost 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165 0.5165
Iris Ada 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
CART 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9556 1.0000 1.0000 0.9556 0.9556
CatBoost 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
KNN 1.0000 1.0000 1.0000 1.0000 1.0000 0.9111 1.0000 0.9556 1.0000 1.0000 1.0000 1.0000 1.0000
LGBM 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
LR 1.0000 0.9111 0.9333 1.0000 1.0000 1.0000 1.0000 1.0000 0.9111 0.8222 0.4000 0.8667 0.9333
MLP 1.0000 0.9778 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.2889 1.0000 1.0000
NB 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778 0.9778
RF 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
SVM 1.0000 1.0000 1.0000 0.9778 1.0000 0.9778 1.0000 0.9778 1.0000 0.9778 0.4222 0.9556 1.0000
TabNet 0.4222 0.4667 0.3778 0.2667 0.3111 0.4000 0.2667 0.2222 0.4000 0.4000 0.2889 0.6222 0.3111
XGBoost 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Letter Recognition Ada 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378 0.2378
CART 0.8760 0.8745 0.8745 0.8748 0.8748 0.8750 0.8748 0.8765 0.8747 0.8752 0.8753 0.8755 0.8755
CatBoost 0.9640 0.9640 0.9640 0.9640 0.9640 0.9640 0.9640 0.9640 0.9645 0.9640 0.9640 0.9640 0.9640
KNN 0.9493 0.9480 0.9480 0.9405 0.9475 0.8853 0.9485 0.9158 0.9240 0.9463 0.9405 0.9378 0.9378
LGBM 0.9640 0.9640 0.9640 0.9637 0.9637 0.9637 0.9637 0.9637 0.9640 0.9640 0.9640 0.9640 0.9637
LR 0.7627 0.7523 0.7523 0.7762 0.7770 0.7757 0.7775 0.7765 0.7497 0.5523 0.3238 0.7547 0.7680
MLP 0.9367 0.9280 0.9280 0.9502 0.9487 0.9448 0.9527 0.9545 0.9308 0.8677 0.5888 0.9242 0.9538
NB 0.6432 0.6432 0.6432 0.6432 0.6432 0.6432 0.6432 0.6432 0.6288 0.6432 0.6432 0.6303 0.6303
RF 0.9577 0.9577 0.9577 0.9570 0.9587 0.9568 0.9585 0.9580 0.9577 0.9582 0.9585 0.9587 0.9587
SVM 0.8135 0.8208 0.8208 0.8488 0.8333 0.7587 0.8090 0.8488 0.8193 0.4952 0.0522 0.8268 0.8415
TabNet 0.8963 0.8993 0.8993 0.8987 0.8997 0.8970 0.8942 0.8968 0.8827 0.9027 0.8840 0.8888 0.8907
XGBoost 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585 0.9585
Magic Gamma Telescope Ada 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375 0.8375
CART 0.8153 0.8153 0.8151 0.8153 0.8153 0.8134 0.8151 0.8151 0.8156 0.8134 0.8146 0.8151 0.8151
CatBoost 0.8891 0.8891 0.8891 0.8891 0.8891 0.8880 0.8891 0.8891 0.8891 0.8891 0.8889 0.8891 0.8891
KNN 0.8098 0.8332 0.8254 0.8340 0.8146 0.8393 0.8098 0.8340 0.8453 0.8091 0.8340 0.8419 0.8419
LGBM 0.8803 0.8785 0.8803 0.8792 0.8792 0.8808 0.8792 0.8810 0.8812 0.8803 0.8792 0.8792 0.8792
LR 0.7834 0.7914 0.7911 0.7936 0.7927 0.7939 0.7920 0.7936 0.8328 0.7893 0.6525 0.7976 0.7971
MLP 0.8170 0.8700 0.8677 0.8717 0.8722 0.8652 0.8517 0.8777 0.8729 0.8687 0.7730 0.8712 0.8777
NB 0.7275 0.7275 0.7275 0.7275 0.7275 0.7275 0.7275 0.7275 0.7934 0.7275 0.7273 0.7296 0.7296
RF 0.8808 0.8808 0.8808 0.8808 0.8808 0.8785 0.8808 0.8808 0.8803 0.8791 0.8819 0.8812 0.8812
SVM 0.2976 0.7483 0.5158 0.4341 0.4478 0.6851 0.3312 0.3212 0.3640 0.7536 0.3481 0.3467 0.3507
TabNet 0.8633 0.8757 0.8721 0.8701 0.8724 0.8731 0.8693 0.8791 0.8759 0.8789 0.8726 0.8731 0.8757
XGBoost 0.8803 0.8803 0.8803 0.8803 0.8803 0.8792 0.8803 0.8803 0.8831 0.8803 0.8835 0.8803 0.8803
Rice Cammeo And Osmancik Ada 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283 0.9283
CART 0.8819 0.8819 0.8828 0.8828 0.8828 0.8828 0.8819 0.8819 0.8801 0.8819 0.8819 0.8819 0.8819
CatBoost 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9274 0.9265 0.9265
KNN 0.8775 0.9143 0.9204 0.9143 0.9125 0.9108 0.8775 0.9064 0.9160 0.9151 0.9143 0.9134 0.9134
LGBM 0.9213 0.9221 0.9213 0.9178 0.9178 0.9178 0.9178 0.9160 0.9221 0.9213 0.9213 0.9213 0.9178
LR 0.9318 0.9335 0.9265 0.9300 0.9318 0.9318 0.9326 0.9300 0.9265 0.9151 0.5468 0.9300 0.9300
MLP 0.5468 0.9300 0.9335 0.9318 0.9239 0.9248 0.9116 0.9309 0.9265 0.9309 0.9169 0.9309 0.9326
NB 0.9151 0.9274 0.9274 0.9274 0.9274 0.9274 0.9151 0.9274 0.9204 0.9274 0.9274 0.9239 0.9239
RF 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9265 0.9248 0.9265 0.9265 0.9256 0.9256
SVM 0.9248 0.9291 0.9309 0.9309 0.8915 0.7970 0.8469 0.9274 0.9291 0.9274 0.8784 0.9274 0.9274
TabNet 0.5766 0.9248 0.7227 0.9335 0.9318 0.9265 0.9326 0.9265 0.9274 0.5494 0.4532 0.9283 0.9274
XGBoost 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125 0.9125
Wine Ada 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259 0.9259
CART 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630
CatBoost 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815
KNN 0.7407 0.9444 0.9444 0.9630 0.9444 0.9815 0.7407 0.9444 0.9630 0.8704 0.9630 0.9630 0.9630
LGBM 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815 0.9815
LR 0.9815 1.0000 1.0000 0.9815 1.0000 0.9815 1.0000 0.9815 1.0000 0.8704 0.3889 1.0000 1.0000
MLP 0.9815 0.9815 1.0000 0.9815 0.9815 0.9815 1.0000 0.9815 0.9815 1.0000 0.3889 0.9815 0.9815
NB 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 0.9815 0.9815
RF 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
SVM 0.5926 1.0000 1.0000 0.9815 0.9815 0.9815 0.6111 0.9815 0.9815 0.8148 0.3889 0.9815 0.9815
TabNet 0.2222 0.4074 0.2778 0.0926 0.1481 0.1852 0.2778 0.2037 0.2963 0.3333 0.2593 0.3704 0.2037
XGBoost 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630 0.9630
TABLE XIII: Time to Train (s) by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Breast Cancer Wisconsin Diagnostic Ada 0.0625 0.0660 0.0623 0.0626 0.0661 0.0619 0.0618 0.0617 0.0616 0.0618 0.0621 0.0619 0.0618
CART 0.0035 0.0036 0.0039 0.0035 0.0040 0.0036 0.0035 0.0035 0.0035 0.0035 0.0036 0.0036 0.0036
CatBoost 0.9431 0.9772 0.9543 0.9512 0.9513 0.9340 0.9568 0.9348 0.9510 0.9467 0.9513 0.9505 0.9479
KNN 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0003
LGBM 0.0213 0.0306 0.0219 0.0275 0.0237 0.0233 0.0228 0.0256 0.0213 0.0210 0.0216 0.0213 0.0239
LR 0.0043 0.0047 0.0014 0.0012 0.0025 0.0024 0.0044 0.0012 0.0012 0.0011 0.0013 0.0012 0.0010
MLP 0.1708 0.2342 0.2929 0.1683 0.2549 0.1412 0.0107 0.1824 0.3734 0.3249 0.0070 0.3637 0.1990
NB 0.0003 0.0003 0.0003 0.0003 0.0004 0.0003 0.0004 0.0003 0.0003 0.0004 0.0003 0.0004 0.0004
RF 0.0706 0.0729 0.0707 0.0709 0.0711 0.0714 0.0709 0.0706 0.0704 0.0708 0.0706 0.0708 0.0708
SVM 0.0016 0.0008 0.0007 0.0009 0.0012 0.0014 0.0023 0.0009 0.0009 0.0010 0.0017 0.0008 0.0009
TabNet 0.0350 0.9545 0.0328 0.0314 0.0325 0.0339 0.0346 0.0319 0.0328 0.0344 0.0330 0.0349 0.0339
XGBoost 0.0151 0.0995 0.0155 0.0157 0.0153 0.0153 0.0151 0.0158 0.0154 0.0158 0.0152 0.0152 0.0159
Dry Bean Ada 0.7061 0.7334 0.7174 0.7231 0.7148 0.7192 0.7052 0.7145 0.7136 0.7101 0.7053 0.7134 0.7127
CART 0.1292 0.1307 0.1309 0.1312 0.1313 0.1304 0.1293 0.1302 0.1304 0.1298 0.1293 0.1304 0.1302
CatBoost 3.6706 3.6325 3.6400 3.6436 3.6461 3.6497 3.6579 3.6701 3.6637 3.6749 3.6736 3.6620 3.6708
KNN 0.0008 0.0008 0.0008 0.0008 0.0008 0.0008 0.0008 0.0008 0.0009 0.0010 0.0010 0.0009 0.0009
LGBM 0.2732 0.2741 0.2759 0.2988 0.2991 0.2989 0.2970 0.3025 0.2760 0.2743 0.2764 0.2748 0.2963
LR 0.1024 0.1288 0.0976 0.0783 0.0982 0.1048 0.1042 0.0820 0.0947 0.1013 0.0604 0.0972 0.0528
MLP 0.3396 2.8338 3.2012 2.4228 0.9144 0.5234 0.5023 3.8822 3.8263 2.6816 8.1463 2.8726 3.1601
NB 0.0012 0.0013 0.0012 0.0013 0.0013 0.0013 0.0013 0.0012 0.0012 0.0013 0.0013 0.0013 0.0013
RF 1.8703 1.9180 1.8860 1.8872 1.9258 1.8776 1.8653 1.8734 1.8702 2.0044 1.8541 1.8794 1.8811
SVM 0.1096 0.1588 0.1932 0.1560 0.1382 0.1516 0.0975 0.1470 0.1399 0.2449 1.0143 0.1336 0.1299
TabNet 11.8293 11.8382 11.7919 11.5408 11.7602 11.7540 11.3646 11.3732 11.5892 11.5038 11.7965 11.4746 11.4619
XGBoost 0.2849 0.2832 0.2870 0.2877 0.2877 0.2879 0.2882 0.2915 0.2901 0.2907 0.2895 0.2899 0.2963
Glass Identification Ada 0.0269 0.0269 0.0275 0.0291 0.0267 0.0267 0.0270 0.0265 0.0265 0.0268 0.0266 0.0266 0.0270
CART 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0006 0.0006 0.0007 0.0007 0.0007 0.0007
CatBoost 0.7934 0.7912 0.7935 0.8024 0.7970 0.8090 0.7974 0.8031 0.8038 0.7896 0.8072 0.8029 0.7971
KNN 0.0002 0.0003 0.0002 0.0002 0.0003 0.0003 0.0002 0.0004 0.0002 0.0003 0.0003 0.0003 0.0003
LGBM 0.0422 0.0433 0.0427 0.0345 0.0366 0.0348 0.0360 0.0410 0.0406 0.0400 0.0362 0.0362 0.0352
LR 0.0067 0.0023 0.0030 0.0020 0.0017 0.0072 0.0023 0.0026 0.0026 0.0021 0.0018 0.0029 0.0016
MLP 0.2307 0.2427 0.2416 0.2376 0.2341 0.2284 0.2333 0.2372 0.2360 0.2328 0.0195 0.2345 0.2357
NB 0.0003 0.0004 0.0005 0.0003 0.0004 0.0004 0.0004 0.0003 0.0003 0.0003 0.0004 0.0004 0.0005
RF 0.0453 0.0456 0.0456 0.0453 0.0482 0.0454 0.0453 0.0453 0.0453 0.0453 0.0455 0.0453 0.0462
SVM 0.0010 0.0007 0.0007 0.0008 0.0010 0.0022 0.0010 0.0009 0.0007 0.0007 0.0007 0.0007 0.0008
TabNet 0.0293 0.0338 0.0316 0.0301 0.0356 0.0317 0.0299 0.0293 0.0314 0.0297 0.0322 0.0310 0.0305
XGBoost 0.0332 0.0337 0.0341 0.0340 0.0339 0.0342 0.0334 0.0344 0.0340 0.0340 0.0339 0.0357 0.0354
Heart Disease Ada 0.0265 0.0266 0.0272 0.0272 0.0265 0.0266 0.0265 0.0264 0.0264 0.0266 0.0264 0.0265 0.0269
CART 0.0007 0.0007 0.0008 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007
CatBoost 0.6742 0.6528 0.6595 0.6623 0.6676 0.6681 0.6694 0.6754 0.6734 0.6811 0.6725 0.6683 0.6687
KNN 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
LGBM 0.0466 0.0433 0.0450 0.0412 0.0405 0.0402 0.0390 0.0439 0.0418 0.0449 0.0385 0.0388 0.0391
LR 0.0073 0.0044 0.0054 0.0018 0.0058 0.0069 0.0078 0.0031 0.0046 0.0036 0.0024 0.0045 0.0016
MLP 0.0198 0.1104 0.1105 0.3423 0.2924 0.3025 0.0822 0.3942 0.3946 0.1659 0.0234 0.2356 0.3996
NB 0.0003 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003 0.0004 0.0003 0.0004 0.0005 0.0005
RF 0.0449 0.0453 0.0452 0.0451 0.0481 0.0453 0.0453 0.0451 0.0451 0.0451 0.0451 0.0450 0.0476
SVM 0.0029 0.0009 0.0009 0.0017 0.0028 0.0031 0.0030 0.0012 0.0010 0.0010 0.0008 0.0009 0.0011
TabNet 0.0291 0.0330 0.0329 0.1327 0.0333 0.0334 0.0305 0.0305 0.0307 0.0301 0.0324 0.0306 0.0307
XGBoost 0.0379 0.0388 0.0387 0.0391 0.0387 0.0391 0.0382 0.0396 0.0388 0.0388 0.0386 0.0404 0.0407
Iris Ada 0.0238 0.0239 0.0246 0.0243 0.0236 0.0258 0.0237 0.0234 0.0235 0.0236 0.0236 0.0236 0.0239
CART 0.0003 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003 0.0003 0.0003 0.0004 0.0004 0.0003
CatBoost 0.2805 0.2828 0.2800 0.2777 0.2863 0.2898 0.2740 0.2748 0.2893 0.2820 0.2874 0.2870 0.2846
KNN 0.0004 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0003 0.0003 0.0002 0.0002 0.0002 0.0003
LGBM 0.0116 0.0126 0.0119 0.0130 0.0136 0.0127 0.0126 0.0118 0.0121 0.0116 0.0123 0.0122 0.0115
LR 0.0045 0.0014 0.0018 0.0014 0.0013 0.0044 0.0015 0.0015 0.0014 0.0015 0.0012 0.0013 0.0012
MLP 0.1074 0.1412 0.1444 0.0947 0.0969 0.1067 0.0921 0.1164 0.1309 0.1910 0.0049 0.1556 0.1173
NB 0.0003 0.0003 0.0004 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
RF 0.0382 0.0387 0.0387 0.0386 0.0399 0.0385 0.0385 0.0385 0.0384 0.0383 0.0385 0.0384 0.0399
SVM 0.0003 0.0003 0.0004 0.0003 0.0004 0.0004 0.0003 0.0003 0.0004 0.0004 0.0004 0.0004 0.0004
TabNet 0.0285 0.0310 0.0322 0.0282 0.0320 0.0306 0.0304 0.0296 0.0299 0.0290 0.0301 0.0307 0.0296
XGBoost 0.0130 0.0141 0.0136 0.0134 0.0134 0.0135 0.0132 0.0136 0.0134 0.0135 0.0135 0.0142 0.0144
Letter Recognition Ada 0.3581 0.3601 0.3589 0.3570 0.3563 0.3565 0.3555 0.3551 0.3548 0.3554 0.3550 0.3603 0.3546
CART 0.0381 0.0380 0.0382 0.0381 0.0378 0.0384 0.0381 0.0378 0.0379 0.0380 0.0380 0.0379 0.0380
CatBoost 5.8619 5.8869 5.8750 5.8861 5.8526 5.8972 5.8645 5.8823 5.8914 5.8835 5.8919 5.8794 5.8696
KNN 0.0010 0.0010 0.0009 0.0009 0.0009 0.0009 0.0009 0.0010 0.0011 0.0012 0.0011 0.0011 0.0011
LGBM 1.1319 1.1009 1.0970 1.1101 1.1199 1.1223 1.1144 1.1380 1.0928 1.1109 1.1247 1.0959 1.1216
LR 0.2923 0.3110 0.3040 0.1714 0.2318 0.2898 0.2954 0.2229 0.2969 0.0625 0.0764 0.2906 0.1379
MLP 11.1298 22.3418 22.0963 11.5231 10.0211 8.8762 8.8380 13.5357 22.1324 22.2006 18.4904 22.0306 14.9389
NB 0.0029 0.0026 0.0026 0.0025 0.0026 0.0025 0.0026 0.0025 0.0024 0.0026 0.0025 0.0025 0.0025
RF 0.8526 0.8806 0.8582 0.8632 0.8850 0.8594 0.9108 0.8535 0.8573 0.8554 0.8554 0.8547 0.8557
SVM 0.8484 0.7881 0.7841 0.7588 0.8009 0.9311 0.8348 0.7936 0.6644 2.6349 3.5273 0.6776 0.6531
TabNet 16.6080 16.2526 16.6332 16.3683 16.3003 16.4503 16.9526 16.6139 16.5979 16.4259 16.5634 16.2669 16.6276
XGBoost 0.7149 0.6430 0.6358 0.6357 0.6335 0.6344 0.6589 0.6445 0.6438 0.6458 0.7151 0.6552 0.6772
Magic Gamma Telescope Ada 0.5774 0.5811 0.5844 0.5766 0.5774 0.5807 0.5775 0.5764 0.5760 0.5751 0.5624 0.5781 0.5763
CART 0.1448 0.1449 0.1456 0.1454 0.1448 0.1450 0.1451 0.1448 0.1448 0.1443 0.1436 0.1450 0.1462
CatBoost 2.4467 2.4322 2.4213 2.4526 2.4666 2.4253 2.4503 2.4659 2.4766 2.4590 2.4641 2.4541 2.4565
KNN 0.0065 0.0065 0.0065 0.0065 0.0065 0.0065 0.0065 0.0071 0.0069 0.0075 0.0068 0.0069 0.0068
LGBM 0.0469 0.0457 0.0460 0.0461 0.0457 0.0461 0.0464 0.0468 0.0444 0.0468 0.0447 0.0447 0.0464
LR 0.0283 0.0074 0.0106 0.0063 0.0176 0.0124 0.0269 0.0067 0.0081 0.0089 0.0066 0.0068 0.0054
MLP 0.5592 4.3368 4.8919 4.0988 4.1299 4.0089 1.6349 3.4037 4.5621 7.1413 5.2403 3.0823 5.8520
NB 0.0013 0.0013 0.0015 0.0013 0.0013 0.0014 0.0013 0.0014 0.0013 0.0013 0.0016 0.0013 0.0014
RF 2.6338 2.6490 2.6361 2.6471 2.6767 2.6233 2.6346 2.6319 2.6293 2.6260 2.6381 2.6242 2.6470
SVM 0.0697 0.2825 0.2742 0.2290 0.0940 0.1469 0.0727 0.2240 0.2731 0.2712 0.2685 0.2738 0.2736
TabNet 16.3108 16.2803 16.3504 16.1869 17.0110 16.9746 16.7629 16.9795 16.5454 16.8754 16.2501 16.7170 16.5016
XGBoost 0.0501 0.0494 0.0496 0.0501 0.0495 0.0495 0.0495 0.0500 0.0491 0.0543 0.0499 0.0510 0.0496
Rice Cammeo And Osmancik Ada 0.0993 0.1006 0.1024 0.0991 0.0992 0.1018 0.0991 0.0992 0.0992 0.1008 0.0990 0.0995 0.0994
CART 0.0094 0.0104 0.0096 0.0103 0.0095 0.0094 0.0094 0.0094 0.0094 0.0095 0.0094 0.0094 0.0095
CatBoost 0.9131 0.8989 0.8965 0.9065 0.9136 0.9021 0.9150 0.9123 0.9078 0.9154 0.9173 0.9184 0.9123
KNN 0.0009 0.0010 0.0010 0.0010 0.0009 0.0009 0.0009 0.0010 0.0011 0.0012 0.0010 0.0010 0.0012
LGBM 0.0294 0.0295 0.0291 0.0309 0.0312 0.0307 0.0310 0.0315 0.0292 0.0296 0.0296 0.0298 0.0309
LR 0.0081 0.0016 0.0034 0.0018 0.0028 0.0055 0.0075 0.0017 0.0015 0.0032 0.0019 0.0018 0.0015
MLP 0.0464 0.2474 0.6241 0.1954 0.3662 0.1570 0.1310 0.2864 0.4593 1.5107 2.6258 0.2457 0.2491
NB 0.0004 0.0005 0.0005 0.0005 0.0005 0.0004 0.0004 0.0005 0.0005 0.0004 0.0004 0.0004 0.0005
RF 0.2023 0.2036 0.2032 0.2026 0.2086 0.2033 0.2024 0.2034 0.2023 0.2026 0.2022 0.2020 0.2106
SVM 0.0093 0.0173 0.0223 0.0201 0.0146 0.0104 0.0074 0.0199 0.0176 0.0361 0.0551 0.0168 0.0170
TabNet 2.5276 2.6327 2.6146 2.5252 2.5909 2.7084 2.5272 2.7104 2.5292 2.5641 2.6258 2.6799 2.5631
XGBoost 0.0262 0.0262 0.0262 0.0263 0.0263 0.0262 0.0260 0.0267 0.0263 0.0264 0.0259 0.0275 0.0260
Wine Ada 0.0271 0.0280 0.0276 0.0271 0.0272 0.0283 0.0277 0.0269 0.0270 0.0271 0.0272 0.0274 0.0274
CART 0.0005 0.0005 0.0005 0.0006 0.0006 0.0005 0.0006 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005
CatBoost 0.5668 0.5651 0.5670 0.5647 0.5677 0.5665 0.5729 0.5716 0.5675 0.5645 0.5724 0.5610 0.5577
KNN 0.0003 0.0002 0.0003 0.0002 0.0002 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
LGBM 0.0164 0.0149 0.0151 0.0167 0.0162 0.0144 0.0160 0.0154 0.0152 0.0144 0.0156 0.0175 0.0154
LR 0.0059 0.0015 0.0031 0.0014 0.0040 0.0031 0.0067 0.0013 0.0019 0.0013 0.0019 0.0019 0.0012
MLP 0.2105 0.1228 0.1658 0.0603 0.0784 0.0377 0.1086 0.0696 0.1100 0.2217 0.0076 0.1234 0.0789
NB 0.0003 0.0003 0.0003 0.0004 0.0004 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
RF 0.0406 0.0410 0.0408 0.0411 0.0411 0.0408 0.0410 0.0406 0.0406 0.0407 0.0407 0.0405 0.0410
SVM 0.0008 0.0004 0.0004 0.0005 0.0006 0.0005 0.0008 0.0005 0.0004 0.0005 0.0006 0.0005 0.0005
TabNet 0.0291 0.0298 0.0289 0.0286 0.0319 0.0315 0.0321 0.0307 0.0306 0.0304 0.0310 0.0336 0.0292
XGBoost 0.0122 0.0120 0.0121 0.0121 0.0119 0.0119 0.0119 0.0124 0.0121 0.0121 0.0118 0.0126 0.0118
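The training times above and the inference times below amount to wall-clock timing around the fit and predict calls. The snippet below is a hedged illustration of one way to take such measurements; the use of time.perf_counter and a single fixed split are assumptions, not necessarily the paper's exact protocol.

import time

from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(
    *load_wine(return_X_y=True), test_size=0.3, random_state=42)
model = RandomForestClassifier(random_state=42)

t0 = time.perf_counter()
model.fit(X_tr, y_tr)            # "Time to Train", as in Table XIII
train_s = time.perf_counter() - t0

t0 = time.perf_counter()
model.predict(X_te)              # "Time to Inference", as in Table XIV
infer_s = time.perf_counter() - t0

print(f"train: {train_s:.4f} s, inference: {infer_s:.4f} s")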
TABLE XIV: Time to Inference (s) by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Breast Cancer Wisconsin Diagnostic Ada 0.0018 0.0019 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0005 0.0005 0.0006 0.0004 0.0005 0.0006 0.0005 0.0004 0.0004 0.0004 0.0005 0.0007 0.0006
KNN 0.0025 0.0131 0.0025 0.0025 0.0025 0.0025 0.0025 0.0030 0.0025 0.0025 0.0034 0.0027 0.0028
LGBM 0.0004 0.0006 0.0004 0.0007 0.0004 0.0004 0.0004 0.0007 0.0004 0.0004 0.0004 0.0005 0.0005
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
NB 0.0001 0.0001 0.0001 0.0002 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0013 0.0014 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013
SVM 0.0002 0.0002 0.0002 0.0001 0.0001 0.0001 0.0003 0.0001 0.0002 0.0003 0.0006 0.0002 0.0002
TabNet 0.0037 0.0145 0.0042 0.0038 0.0037 0.0036 0.0040 0.0038 0.0040 0.0038 0.0037 0.0039 0.0042
XGBoost 0.0003 0.0005 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
Dry Bean Ada 0.0073 0.0074 0.0075 0.0078 0.0073 0.0073 0.0074 0.0072 0.0071 0.0072 0.0072 0.0073 0.0073
CART 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
CatBoost 0.0024 0.0027 0.0025 0.0029 0.0026 0.0022 0.0023 0.0022 0.0024 0.0025 0.0023 0.0022 0.0022
KNN 0.0529 0.0532 0.0536 0.0536 0.0527 0.0539 0.0530 0.0594 0.0553 0.0554 0.0542 0.0553 0.0565
LGBM 0.0101 0.0097 0.0099 0.0099 0.0100 0.0101 0.0099 0.0100 0.0099 0.0099 0.0100 0.0102 0.0098
LR 0.0002 0.0003 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
MLP 0.0011 0.0012 0.0012 0.0012 0.0011 0.0011 0.0012 0.0012 0.0012 0.0010 0.0011 0.0010 0.0010
NB 0.0007 0.0009 0.0006 0.0007 0.0007 0.0008 0.0007 0.0006 0.0007 0.0007 0.0007 0.0007 0.0007
RF 0.0180 0.0182 0.0181 0.0184 0.0188 0.0180 0.0182 0.0179 0.0183 0.0182 0.0187 0.0187 0.0179
SVM 0.0255 0.1381 0.1723 0.0892 0.0597 0.0384 0.0138 0.0907 0.1141 0.2042 0.4215 0.1154 0.0968
TabNet 0.0237 0.0238 0.0238 0.0238 0.0243 0.0244 0.0239 0.0239 0.0238 0.0236 0.0243 0.0238 0.0236
XGBoost 0.0035 0.0035 0.0035 0.0035 0.0035 0.0035 0.0035 0.0035 0.0036 0.0036 0.0035 0.0036 0.0036
Glass Identification Ada 0.0017 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0004 0.0006 0.0006 0.0004 0.0006 0.0005 0.0006 0.0005 0.0006 0.0008 0.0007 0.0005 0.0007
KNN 0.0012 0.0013 0.0012 0.0012 0.0012 0.0012 0.0012 0.0014 0.0014 0.0013 0.0014 0.0014 0.0014
LGBM 0.0005 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0006 0.0005 0.0005 0.0006 0.0005 0.0005
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
NB 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0013 0.0012 0.0013 0.0013 0.0013 0.0013
SVM 0.0002 0.0002 0.0002 0.0002 0.0002 0.0001 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
TabNet 0.0033 0.0037 0.0031 0.0036 0.0033 0.0034 0.0036 0.0033 0.0033 0.0034 0.0035 0.0034 0.0035
XGBoost 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0004 0.0005 0.0004
Heart Disease Ada 0.0017 0.0016 0.0016 0.0016 0.0016 0.0017 0.0017 0.0016 0.0016 0.0017 0.0017 0.0016 0.0017
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0005 0.0005 0.0005 0.0006 0.0006 0.0006 0.0004 0.0008 0.0006 0.0006 0.0007 0.0005 0.0007
KNN 0.0016 0.0016 0.0016 0.0016 0.0015 0.0016 0.0016 0.0020 0.0018 0.0017 0.0016 0.0019 0.0017
LGBM 0.0007 0.0006 0.0006 0.0006 0.0006 0.0007 0.0005 0.0006 0.0006 0.0005 0.0006 0.0006 0.0005
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
NB 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0002 0.0001 0.0001 0.0001
RF 0.0014 0.0014 0.0014 0.0014 0.0015 0.0014 0.0014 0.0015 0.0014 0.0014 0.0015 0.0014 0.0014
SVM 0.0003 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
TabNet 0.0034 0.0036 0.0035 0.0036 0.0038 0.0035 0.0036 0.0036 0.0035 0.0035 0.0033 0.0034 0.0035
XGBoost 0.0005 0.0006 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0006 0.0005
Iris Ada 0.0015 0.0015 0.0016 0.0015 0.0016 0.0015 0.0015 0.0016 0.0016 0.0015 0.0016 0.0015 0.0016
CART 0.0000 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0000 0.0000 0.0001 0.0001 0.0001
CatBoost 0.0006 0.0005 0.0005 0.0005 0.0005 0.0006 0.0004 0.0004 0.0004 0.0005 0.0005 0.0005 0.0004
KNN 0.0012 0.0010 0.0010 0.0009 0.0009 0.0009 0.0009 0.0012 0.0010 0.0010 0.0010 0.0011 0.0010
LGBM 0.0004 0.0004 0.0005 0.0004 0.0004 0.0006 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0000 0.0001 0.0001 0.0001 0.0001
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
NB 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011
SVM 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
TabNet 0.0033 0.0036 0.0034 0.0034 0.0032 0.0034 0.0032 0.0034 0.0033 0.0032 0.0035 0.0032 0.0035
XGBoost 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
Letter Recognition Ada 0.0204 0.0200 0.0200 0.0193 0.0200 0.0197 0.0195 0.0194 0.0197 0.0198 0.0197 0.0197 0.0196
CART 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006
CatBoost 0.0049 0.0049 0.0053 0.0044 0.0052 0.0044 0.0046 0.0050 0.0050 0.0045 0.0050 0.0048 0.0044
KNN 0.8666 0.0829 0.0819 0.0821 0.0823 0.0828 0.0825 0.0870 0.0841 0.0903 0.0836 0.0854 0.0854
LGBM 0.0611 0.0606 0.0606 0.0610 0.0613 0.0610 0.0612 0.0601 0.0607 0.0609 0.0606 0.0609 0.0609
LR 0.0006 0.0006 0.0005 0.0005 0.0005 0.0005 0.0006 0.0005 0.0005 0.0005 0.0004 0.0005 0.0005
MLP 0.0022 0.0022 0.0022 0.0021 0.0022 0.0021 0.0021 0.0021 0.0022 0.0022 0.0022 0.0022 0.0022
NB 0.0036 0.0032 0.0031 0.0033 0.0031 0.0031 0.0030 0.0030 0.0029 0.0030 0.0030 0.0030 0.0029
RF 0.0471 0.0480 0.0473 0.0472 0.0493 0.0481 0.0484 0.0479 0.0483 0.0482 0.0480 0.0482 0.0486
SVM 0.8372 1.3641 1.3598 0.8540 0.8260 0.8683 0.8521 0.8722 1.1584 1.8609 1.8765 1.2526 1.0256
TabNet 0.0354 0.0355 0.0357 0.0352 0.0351 0.0354 0.0354 0.0366 0.0356 0.0358 0.0357 0.0354 0.0351
XGBoost 0.0206 0.0208 0.0207 0.0207 0.0208 0.0208 0.0209 0.0208 0.0210 0.0205 0.0206 0.0207 0.0211
Magic Gamma Telescope Ada 0.0073 0.0071 0.0071 0.0068 0.0068 0.0071 0.0069 0.0067 0.0067 0.0068 0.0067 0.0068 0.0068
CART 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0004 0.0005 0.0004 0.0005 0.0005
CatBoost 0.0016 0.0019 0.0019 0.0017 0.0021 0.0019 0.0019 0.0018 0.0017 0.0017 0.0020 0.0019 0.0019
KNN 0.1088 0.1324 0.1550 0.1659 0.1254 0.1007 0.1093 0.1635 0.1617 0.1156 0.1673 0.1543 0.1536
LGBM 0.0024 0.0025 0.0023 0.0024 0.0022 0.0022 0.0023 0.0023 0.0024 0.0023 0.0023 0.0023 0.0023
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0012 0.0012 0.0012 0.0012 0.0012 0.0012 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0011
NB 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002 0.0003 0.0003 0.0002 0.0003 0.0003 0.0003
RF 0.0347 0.0344 0.0349 0.0349 0.0341 0.0346 0.0342 0.0341 0.0341 0.0341 0.0343 0.0344 0.0345
SVM 0.0214 0.1030 0.1027 0.0838 0.0291 0.0523 0.0232 0.0856 0.1028 0.1025 0.1023 0.1031 0.1024
TabNet 0.0342 0.0336 0.0340 0.0339 0.0352 0.0337 0.0342 0.0351 0.0339 0.0340 0.0341 0.0349 0.0339
XGBoost 0.0009 0.0009 0.0011 0.0010 0.0009 0.0009 0.0009 0.0009 0.0009 0.0010 0.0009 0.0009 0.0009
Rice Cammeo And Osmancik Ada 0.0026 0.0026 0.0026 0.0026 0.0026 0.0027 0.0025 0.0026 0.0026 0.0026 0.0026 0.0025 0.0026
CART 0.0001 0.0001 0.0001 0.0002 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0009 0.0011 0.0009 0.0009 0.0009 0.0008 0.0007 0.0009 0.0010 0.0009 0.0007 0.0007 0.0010
KNN 0.0139 0.0151 0.0148 0.0149 0.0141 0.0147 0.0140 0.0152 0.0153 0.0172 0.0155 0.0155 0.0154
LGBM 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0008 0.0007 0.0007 0.0007 0.0007 0.0007
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0003 0.0003 0.0003 0.0002 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
NB 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0045 0.0043 0.0044 0.0045 0.0045 0.0044 0.0044 0.0044 0.0043 0.0044 0.0044 0.0044 0.0044
SVM 0.0012 0.0061 0.0083 0.0055 0.0031 0.0017 0.0007 0.0056 0.0058 0.0140 0.0213 0.0058 0.0055
TabNet 0.0090 0.0091 0.0090 0.0091 0.0090 0.0090 0.0089 0.0091 0.0090 0.0090 0.0095 0.0099 0.0089
XGBoost 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005
Wine Ada 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0016 0.0015 0.0015
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0005 0.0005 0.0005 0.0004 0.0004 0.0007 0.0006 0.0007 0.0005 0.0005 0.0005 0.0005 0.0005
KNN 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0012 0.0011 0.0012 0.0012 0.0012
LGBM 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0007 0.0004
LR 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
NB 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0011 0.0011 0.0011 0.0011 0.0011 0.0011 0.0012 0.0011 0.0011 0.0011 0.0011 0.0012 0.0011
SVM 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
TabNet 0.0034 0.0034 0.0033 0.0033 0.0031 0.0032 0.0036 0.0035 0.0033 0.0036 0.0037 0.0035 0.0032
XGBoost 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
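For reference, the coefficient of determination reported in Table XV follows its standard definition [117, 118]: for targets yᵢ, predictions ŷᵢ, and target mean ȳ,

R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}},

so negative values (e.g., TabNet on Concrete Compressive Strength) indicate a model that predicts worse than the constant mean baseline.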
TABLE XV: R² Score by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Abalone Ada 0.1671 0.2210 0.2243 0.2503 0.1510 0.1938 0.1296 0.2301 0.1347 0.2018 0.1690 0.2112 0.2112
CART 0.1383 0.1464 0.1477 0.1560 0.1456 0.1500 0.1626 0.1417 0.1663 0.1416 0.1373 0.1686 0.1686
CatBoost 0.5226 0.5226 0.5225 0.5226 0.5225 0.5225 0.5225 0.5227 0.5224 0.5226 0.5226 0.5223 0.5223
KNN 0.5164 0.5023 0.4955 0.4662 0.5056 0.4406 0.5164 0.4552 0.4766 0.4100 0.4662 0.4699 0.4699
LGBM 0.5260 0.5278 0.5260 0.5256 0.5256 0.5256 0.5256 0.5190 0.5277 0.5260 0.5257 0.5257 0.5259
LinearRegression 0.5150 0.5150 0.5150 0.5150 0.5150 0.5150 0.5150 0.5150 0.4964 0.5150 0.5151 0.5204 0.5204
MLP 0.5245 0.5276 0.5265 0.5578 0.5580 0.5443 0.5559 0.5632 0.5434 0.5300 0.0226 0.5337 0.5560
RF 0.5244 0.5247 0.5249 0.5234 0.5244 0.5248 0.5247 0.5241 0.5230 0.5232 0.5239 0.5236 0.5236
SVR 0.5293 0.5283 0.5257 0.5421 0.5505 0.5157 0.5430 0.5398 0.5345 0.3615 0.5421 0.5437 0.5437
TabNet 0.3127 0.3609 0.2115 0.5131 0.5224 0.5554 0.3745 0.5434 0.4144 -0.0595 -0.0462 0.3457 0.5017
XGBoost 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546 0.4546
Air Quality Ada 0.9992 0.9992 0.9984 0.9984 0.9984 0.9984 0.9992 0.9992 0.9984 0.9989 0.9984 0.9984 0.9984
CART 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
CatBoost 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
KNN 0.9995 0.9993 0.9993 0.9994 0.9994 0.9996 0.9995 0.9986 0.9986 0.9993 0.9994 0.9994 0.9994
LGBM 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
LinearRegression 0.9992 0.9992 0.9992 0.9992 0.9992 0.9992 0.9992 0.9992 0.8736 0.9992 0.9992 0.9977 0.9977
MLP 0.9985 0.9990 1.0000 1.0000 0.9999 1.0000 0.9975 0.9999 0.9983 0.9992 0.8187 0.9997 0.9999
RF 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
SVR 0.9966 0.9792 0.9619 0.9269 0.9435 0.9915 0.9675 0.9188 0.7749 0.9540 0.9269 0.9297 0.9297
TabNet 0.9997 0.9984 0.9995 0.9978 0.9998 0.9981 0.9993 0.9998 0.9948 0.9869 0.0353 0.9998 0.9990
XGBoost 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
Appliances Energy Prediction Ada -2.6328 -0.7955 -3.8255 -3.4419 -3.9861 -3.6762 -1.4564 -1.4564 -1.7618 -0.7955 -1.4562 -0.7955 -0.7955
CART 0.1358 0.1348 0.1361 0.1378 0.1378 0.1367 0.1354 0.1377 0.1540 0.1406 0.1398 0.1410 0.1410
CatBoost 0.4529 0.4570 0.4529 0.4570 0.4570 0.4570 0.4570 0.4570 0.4588 0.4529 0.4568 0.4570 0.4570
KNN 0.1681 0.2875 0.2049 0.3279 0.2595 0.4720 0.1681 0.2929 0.3228 0.1568 0.3279 0.3280 0.3280
LGBM 0.4318 0.4276 0.4318 0.4334 0.4334 0.4334 0.4334 0.4192 0.4285 0.4318 0.4351 0.4351 0.4334
LinearRegression 0.1672 0.1672 0.1672 0.1672 0.1672 0.1672 0.1672 0.1672 0.1488 0.1672 0.1672 0.1644 0.1644
MLP 0.1598 0.2037 0.1581 0.3144 0.3093 0.3161 0.3017 0.2970 0.2243 0.1540 0.0003 0.2209 0.2936
RF 0.5122 0.5122 0.5120 0.5122 0.5129 0.5124 0.5127 0.5125 0.5114 0.5132 0.5120 0.5126 0.5126
SVR -0.1056 -0.0032 -0.0275 0.0154 0.0041 -0.0518 -0.0152 -0.0096 0.0101 -0.0409 0.0154 0.0110 0.0110
TabNet 0.3054 0.2955 0.3177 0.2828 0.3031 0.3333 0.3049 0.2775 0.2997 0.2838 0.3258 0.2973 0.3195
XGBoost 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4788 0.4841 0.4788 0.4788 0.4788 0.4788
Concrete Compressive Strength Ada 0.7762 0.7761 0.7717 0.7782 0.7783 0.7788 0.7829 0.7788 0.7797 0.7788 0.7814 0.7874 0.7874
CART 0.8252 0.8252 0.8252 0.8252 0.8252 0.8252 0.8251 0.8251 0.8186 0.8252 0.8251 0.8250 0.8250
CatBoost 0.9337 0.9337 0.9337 0.9337 0.9337 0.9338 0.9337 0.9338 0.9344 0.9338 0.9337 0.9338 0.9338
KNN 0.6770 0.6472 0.6631 0.6714 0.7057 0.4853 0.6773 0.7446 0.7971 0.6381 0.6714 0.7119 0.7119
LGBM 0.9229 0.9229 0.9229 0.9217 0.9217 0.9217 0.9217 0.9226 0.9234 0.9229 0.9232 0.9232 0.9219
LinearRegression 0.5944 0.5944 0.5944 0.5944 0.5944 0.5944 0.5944 0.5944 0.8056 0.5944 0.5945 0.6983 0.6983
MLP 0.8030 0.7922 0.7468 0.8725 0.8744 0.8319 0.8007 0.8662 0.8333 0.5683 -0.0024 0.7430 0.8341
RF 0.8896 0.8895 0.8895 0.8891 0.8894 0.8895 0.8893 0.8894 0.8904 0.8895 0.8895 0.8897 0.8897
SVR 0.2259 0.5593 0.5394 0.6093 0.5897 0.3754 0.5446 0.6987 0.7302 0.3923 0.6093 0.6315 0.6315
TabNet -8297.0175 -4.5391 -4.5299 -4.3532 -7.9591 -3.8492 -876.1911 -4.4619 -4.5697 -4.5216 -4.6802 -4.6035 -4.4647
XGBoost 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104 0.9104
Forest Fires Ada -0.4980 -0.4851 -0.4430 -0.0558 -0.0628 -0.4720 -0.4738 -0.4412 -0.0462 -0.4570 -0.4833 -0.0191 -0.0191
CART -0.6686 -0.6694 -0.6694 -0.6694 -0.6653 -0.6686 -0.6694 -0.6694 -0.6432 -0.6429 -0.6694 -0.6423 -0.6423
CatBoost -0.0437 -0.0437 -0.0437 -0.0437 -0.0437 -0.0437 -0.0437 -0.0437 -0.0438 -0.0437 -0.0437 -0.0437 -0.0437
KNN -0.0115 -0.0619 -0.0470 -0.0447 0.0345 -0.0530 -0.0115 -0.0345 0.0105 -0.0548 -0.0447 -0.0403 -0.0403
LGBM -0.0246 -0.0246 -0.0246 -0.0188 -0.0188 -0.0188 -0.0188 -0.0135 -0.0288 -0.0246 -0.0181 -0.0181 -0.0127
LinearRegression 0.0041 0.0041 0.0041 0.0041 0.0041 0.0041 0.0041 0.0041 0.0092 0.0041 0.0041 0.0061 0.0061
MLP -0.0067 0.0069 0.0057 0.0013 -0.0266 -0.0007 -0.0479 0.0121 0.0134 0.0062 -0.0012 0.0089 0.0144
RF -0.1060 -0.1084 -0.1101 -0.1093 -0.1093 -0.0995 -0.1094 -0.1058 -0.1004 -0.1102 -0.1060 -0.1060 -0.1060
SVR -0.0257 -0.0246 -0.0246 -0.0244 -0.0256 -0.0248 -0.0265 -0.0245 -0.0249 -0.0252 -0.0244 -0.0242 -0.0242
TabNet -1.1851 -0.0274 -0.0264 -0.0256 -0.0435 -0.0301 -1.2978 -0.0263 -0.0268 -0.0275 -0.0264 -0.0260 -0.0272
XGBoost -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684 -0.4684
Real Estate Valuation Ada 0.6579 0.6843 0.6843 0.6816 0.6843 0.6719 0.6726 0.6643 0.6842 0.6980 0.6678 0.6829 0.6829
CART 0.5608 0.5623 0.5623 0.5623 0.5607 0.5639 0.5607 0.5608 0.5567 0.6011 0.5623 0.5567 0.5567
CatBoost 0.7324 0.7325 0.7325 0.7324 0.7325 0.7325 0.7327 0.7325 0.7301 0.7325 0.7327 0.7322 0.7322
KNN 0.6232 0.6191 0.6232 0.6153 0.6488 0.4689 0.6232 0.6348 0.6924 0.5645 0.6153 0.6253 0.6253
LGBM 0.7001 0.7001 0.7001 0.7120 0.7120 0.7120 0.7120 0.7075 0.7002 0.7001 0.7000 0.7000 0.7120
LinearRegression 0.5601 0.5601 0.5601 0.5601 0.5601 0.5601 0.5601 0.5601 0.6805 0.5601 0.5601 0.6052 0.6052
MLP 0.6199 0.5762 0.5608 0.6436 0.6641 -4.7466 0.2915 0.6894 0.6388 0.5184 -0.0077 0.5586 0.6535
RF 0.7444 0.7449 0.7449 0.7444 0.7438 0.7450 0.7444 0.7444 0.7518 0.7456 0.7449 0.7430 0.7430
SVR 0.4897 0.5610 0.5327 0.5788 0.5481 0.4095 0.5202 0.5872 0.6311 0.5200 0.5788 0.5948 0.5948
TabNet -23524.2610 -8.6955 -8.8303 -8.6971 -8.4680 -2124512.8319 -1294.4512 -8.4379 -8.6292 -8.2463 -8.5425 -8.6645 -8.2867
XGBoost 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013 0.7013
Wine Quality Ada 0.3092 0.3142 0.3013 0.3017 0.2968 0.2998 0.2994 0.3158 0.2956 0.3021 0.3148 0.3148 0.3148
CART 0.0308 0.0231 0.0273 0.0287 0.0238 0.0371 0.0357 0.0322 0.0118 0.0238 0.0280 0.0202 0.0202
CatBoost 0.4707 0.4708 0.4707 0.4707 0.4707 0.4708 0.4707 0.4707 0.4708 0.4706 0.4707 0.4710 0.4710
KNN 0.1221 0.2954 0.3283 0.3475 0.2706 0.1479 0.1221 0.3234 0.3245 0.3132 0.3473 0.3396 0.3396
LGBM 0.4557 0.4557 0.4557 0.4578 0.4578 0.4578 0.4578 0.4602 0.4556 0.4557 0.4555 0.4555 0.4578
LinearRegression 0.2701 0.2701 0.2701 0.2701 0.2701 0.2701 0.2701 0.2701 0.2723 0.2701 0.2702 0.2811 0.2811
MLP 0.2500 0.3455 0.3243 0.3872 0.3713 0.2518 0.3026 0.3879 0.3553 0.2928 -0.0026 0.3344 0.3864
RF 0.4985 0.4987 0.4991 0.4982 0.4994 0.4986 0.4990 0.4992 0.4986 0.4980 0.4979 0.4993 0.4993
SVR 0.1573 0.3561 0.3185 0.3842 0.3320 0.2115 0.2286 0.3799 0.3825 0.3128 0.3842 0.3907 0.3907
TabNet 0.1478 0.3028 0.2697 0.3199 0.3066 0.3453 0.3167 0.3340 0.3274 0.0854 0.0545 0.3531 0.3270
XGBoost 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494 0.4494
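The R² values in Table XV can be computed with scikit-learn's r2_score; negative entries (e.g., several TabNet rows) indicate a model that fits worse than simply predicting the mean of the targets. A minimal illustration on toy vectors:

from sklearn.metrics import r2_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(r2_score(y_true, y_pred))  # 0.9486...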
TABLE XVI: MSE by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Abalone Ada 8.4579 7.9106 7.8773 7.6131 8.6210 8.1869 8.8383 7.8177 8.7869 8.1054 8.4388 8.0102 8.0102
CART 8.7504 8.6683 8.6547 8.5702 8.6762 8.6316 8.5032 8.7161 8.4657 8.7169 8.7600 8.4426 8.4426
CatBoost 4.8477 4.8482 4.8485 4.8478 4.8486 4.8488 4.8492 4.8470 4.8501 4.8480 4.8477 4.8505 4.8506
KNN 4.9107 5.0541 5.1229 5.4200 5.0202 5.6801 4.9107 5.5324 5.3144 5.9916 5.4200 5.3829 5.3829
LGBM 4.8137 4.7948 4.8137 4.8170 4.8175 4.8175 4.8175 4.8846 4.7960 4.8137 4.8158 4.8158 4.8146
LinearRegression 4.9249 4.9249 4.9249 4.9249 4.9249 4.9249 4.9249 4.9249 5.1136 4.9249 4.9242 4.8700 4.8700
MLP 4.8285 4.7973 4.8087 4.4900 4.4885 4.6276 4.5099 4.4356 4.6370 4.7730 9.9246 4.7351 4.5088
RF 4.8295 4.8261 4.8244 4.8392 4.8299 4.8257 4.8268 4.8328 4.8436 4.8421 4.8342 4.8376 4.8376
SVR 4.7801 4.7898 4.8158 4.6502 4.5642 4.9178 4.6401 4.6734 4.7267 6.4835 4.6502 4.6337 4.6337
TabNet 6.9789 6.4898 8.0072 4.9438 4.8501 4.5143 6.3516 4.6369 5.9460 10.7584 10.6238 6.6442 5.0601
XGBoost 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381 5.5381
Air Quality Ada 1.3054 1.3054 2.8180 2.8180 2.8150 2.7004 1.3054 1.3054 2.8118 1.8197 2.8151 2.7332 2.7332
CART 0.0264 0.0264 0.0264 0.0264 0.0264 0.0751 0.0264 0.0264 0.0264 0.0264 0.0265 0.0265 0.0265
CatBoost 0.1474 0.1475 0.1473 0.1473 0.1473 0.1768 0.1473 0.1473 0.1474 0.1473 0.1392 0.1478 0.1478
KNN 0.9109 1.1487 1.2492 1.0231 0.9524 0.6838 0.9109 2.3434 2.4706 1.2911 1.0232 1.0926 1.0926
LGBM 0.1129 0.1144 0.1129 0.1298 0.1298 0.1332 0.1298 0.1291 0.1144 0.1129 0.1156 0.1156 0.1300
LinearRegression 1.4180 1.4180 1.4180 1.4180 1.4180 1.4180 1.4180 1.4180 217.6832 1.4180 1.4189 3.9983 3.9983
MLP 2.6090 1.6711 0.0278 0.0110 0.0910 0.0229 4.2237 0.1181 2.8912 1.4552 312.0822 0.5080 0.1000
RF 0.0131 0.0133 0.0131 0.0131 0.0131 0.0126 0.0131 0.0131 0.0104 0.0132 0.0137 0.0130 0.0130
SVR 5.7798 35.8845 65.6306 125.8177 97.2254 14.6127 55.9688 139.7765 387.6257 79.1870 125.8348 121.0407 121.0407
TabNet 0.5921 2.7576 0.8953 3.7794 0.4268 3.2731 1.2790 0.3880 8.9527 22.4788 1661.0442 0.3756 1.6444
XGBoost 0.1147 0.1147 0.1147 0.1147 0.1147 0.1079 0.1147 0.1147 0.1147 0.1147 0.1364 0.1147 0.1147
Appliances Energy Prediction Ada 37427.0812 18498.3970 49715.5179 45763.4907 51370.2541 48177.6281 25307.3469 25307.3469 28453.4699 18498.3970 25305.0426 18498.3970 18498.3970
CART 8903.9689 8913.8490 8900.0338 8882.6043 8883.1110 8894.7137 8907.5494 8884.2763 8716.1797 8854.3151 8862.6921 8849.8564 8849.8564
CatBoost 5636.6122 5594.4777 5636.8339 5594.4722 5594.4462 5594.5548 5594.4195 5594.4042 5575.4147 5636.6466 5596.2357 5594.2071 5594.0726
KNN 8570.5671 7340.9715 8192.0770 6924.2331 7629.3687 5439.7190 8570.5671 7285.3653 6977.3282 8687.5352 6924.2331 6922.9272 6922.9272
LGBM 5854.3508 5897.1553 5854.3508 5837.8269 5837.8269 5837.8269 5837.8269 5983.4385 5887.6446 5854.3508 5819.6456 5819.6456 5837.9645
LinearRegression 8579.6535 8579.6535 8579.6535 8579.6535 8579.6535 8579.6535 8579.6535 8579.6535 8769.7261 8579.6535 8579.5634 8609.0530 8609.0530
MLP 8656.1894 8203.8308 8673.7742 7063.7610 7116.0726 7046.2702 7194.4738 7243.1268 7992.0693 8715.8666 10299.4889 8027.0389 7277.7891
RF 5025.6175 5025.6061 5028.0530 5025.4432 5018.0357 5023.1468 5020.2272 5022.6680 5033.6409 5015.6727 5027.8705 5021.9574 5021.2540
SVR 11390.9162 10335.1997 10585.9003 10144.3973 10259.9281 10836.6115 10459.6290 10401.7762 10198.0740 10724.4861 10144.4020 10189.3769 10189.3769
TabNet 7156.6395 7258.0613 7029.5273 7389.1339 7180.2959 6868.7284 7160.9294 7443.5996 7215.4305 7378.4224 6946.2734 7239.4565 7010.9228
XGBoost 5369.4787 5369.4787 5369.4787 5369.4787 5369.4787 5369.4787 5369.4787 5369.4787 5314.8232 5369.4787 5369.4846 5369.4787 5369.4787
Concrete Compressive Strength Ada 60.5643 60.5890 61.7613 60.0204 59.9916 59.8600 58.7378 59.8600 59.6016 59.8600 59.1379 57.5319 57.5319
CART 47.2904 47.2904 47.2990 47.2990 47.2904 47.2904 47.3263 47.3263 49.0908 47.2904 47.3263 47.3510 47.3510
CatBoost 17.9329 17.9310 17.9327 17.9331 17.9330 17.9158 17.9344 17.9199 17.7440 17.9171 17.9348 17.9007 17.9007
KNN 87.4024 95.4699 91.1487 88.9093 79.6278 139.2635 87.3197 69.1034 54.8975 97.9166 88.9093 77.9513 77.9513
LGBM 20.8520 20.8520 20.8520 21.1791 21.1791 21.1791 21.1791 20.9335 20.7375 20.8520 20.7807 20.7807 21.1283
LinearRegression 109.7508 109.7508 109.7508 109.7508 109.7508 109.7508 109.7508 109.7508 52.5905 109.7508 109.7231 81.6210 81.6210
MLP 53.3124 56.2212 68.5215 34.5083 33.9975 45.4931 53.9233 36.1983 45.1127 116.8063 271.2155 69.5416 44.8916
RF 29.8643 29.9020 29.8945 30.0114 29.9181 29.8971 29.9510 29.9231 29.6500 29.8871 29.8991 29.8406 29.8406
SVR 209.4406 119.2487 124.6258 105.7187 111.0081 168.9885 123.2218 81.5215 72.9884 164.4260 105.7134 99.7138 99.7138
TabNet 2245230.0052 1498.7334 1496.2369 1448.4442 2424.1084 1312.0591 237345.3437 1477.8487 1507.0072 1494.0005 1536.9135 1516.1655 1478.5968
XGBoost 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314 24.2314
Forest Fires Ada 11920.6614 11818.4033 11483.3932 8401.7532 8457.8291 11713.7349 11728.3039 11469.2245 8325.6368 11594.6689 11804.1047 8109.8741 8109.8741
CART 13278.1984 13284.7052 13284.7052 13284.7052 13252.0296 13278.1984 13284.7052 13284.7052 13076.4297 13073.6727 13284.7052 13068.9608 13068.9608
CatBoost 8305.6236 8305.5795 8305.5795 8305.5795 8305.8724 8305.5695 8306.0607 8305.5795 8306.2021 8305.5983 8305.5994 8305.8919 8305.8919
KNN 8049.5193 8450.3037 8331.5751 8314.0348 7683.6465 8380.0861 8049.5193 8232.2422 7874.0009 8394.1118 8314.0348 8278.8926 8278.8926
LGBM 8153.9240 8153.9240 8153.9240 8107.4363 8107.4363 8107.4363 8107.4363 8065.4234 8187.3561 8153.9240 8101.9144 8101.9144 8058.7846
LinearRegression 7925.6571 7925.6571 7925.6571 7925.6571 7925.6571 7925.6571 7925.6571 7925.6571 7884.4617 7925.6571 7925.6456 7909.6969 7909.6969
MLP 8010.9478 7903.1906 7912.5614 7947.6005 8169.4083 7963.7329 8339.4658 7861.6344 7851.0386 7908.2283 7967.3473 7886.8273 7843.6319
RF 8801.7409 8820.9222 8833.7029 8827.5833 8827.4812 8749.3543 8828.6643 8799.6126 8757.0734 8835.0585 8801.6836 8801.2509 8801.2509
SVR 8162.5768 8153.9647 8154.0176 8152.2006 8161.5094 8155.5670 8168.8605 8152.5913 8155.7694 8158.6271 8152.2076 8150.3462 8150.3462
TabNet 17388.5255 8175.7646 8167.9740 8161.3423 8304.3410 8197.3670 18285.8949 8166.9890 8171.4319 8177.1570 8167.8719 8164.7159 8174.7450
XGBoost 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576 11685.4576
Real Estate Valuation Ada 57.2065 52.7882 52.7882 53.2506 52.7882 54.8695 54.7434 56.1439 52.8036 50.5103 55.5551 53.0226 53.0226
CART 73.4509 73.1951 73.1951 73.1951 73.4537 72.9267 73.4537 73.4509 74.1349 66.7002 73.1951 74.1321 74.1321
CatBoost 44.7412 44.7336 44.7336 44.7413 44.7257 44.7387 44.7055 44.7387 45.1346 44.7321 44.7055 44.7879 44.7879
KNN 63.0027 63.7030 63.0182 64.3252 58.7216 88.8102 63.0027 61.0770 51.4310 72.8293 64.3252 62.6561 62.6561
LGBM 50.1547 50.1547 50.1547 48.1590 48.1590 48.1590 48.1590 48.9211 50.1370 50.1547 50.1674 50.1674 48.1683
LinearRegression 73.5684 73.5684 73.5684 73.5684 73.5684 73.5684 73.5684 73.5684 53.4317 73.5684 73.5625 66.0181 66.0181
MLP 63.5585 70.8757 73.4388 59.6005 56.1629 960.9785 118.4726 51.9328 60.4070 80.5388 168.5185 73.8182 57.9370
RF 42.7505 42.6567 42.6657 42.7479 42.8437 42.6366 42.7353 42.7405 41.5067 42.5372 42.6543 42.9702 42.9702
SVR 85.3424 73.4157 78.1453 70.4354 75.5672 98.7492 80.2402 69.0369 61.6845 80.2676 70.4315 67.7575 67.7575
TabNet 3934014.4970 1621.3258 1643.8788 1621.5964 1583.2938 355272072.3309 216631.9725 1578.2590 1610.2450 1546.2204 1595.7477 1616.1390 1552.9722
XGBoost 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527 49.9527
Wine Quality Ada 0.5040 0.5004 0.5098 0.5095 0.5131 0.5109 0.5112 0.4992 0.5139 0.5092 0.4999 0.4999 0.4999
CART 0.7072 0.7128 0.7097 0.7087 0.7123 0.7026 0.7036 0.7062 0.7210 0.7123 0.7092 0.7149 0.7149
CatBoost 0.3862 0.3862 0.3862 0.3862 0.3862 0.3861 0.3862 0.3862 0.3862 0.3862 0.3862 0.3860 0.3860
KNN 0.6406 0.5141 0.4901 0.4761 0.5322 0.6217 0.6406 0.4937 0.4929 0.5011 0.4762 0.4818 0.4818
LGBM 0.3971 0.3971 0.3971 0.3956 0.3956 0.3956 0.3956 0.3939 0.3972 0.3971 0.3973 0.3973 0.3956
LinearRegression 0.5326 0.5326 0.5326 0.5326 0.5326 0.5326 0.5326 0.5326 0.5310 0.5326 0.5325 0.5245 0.5245
MLP 0.5472 0.4775 0.4930 0.4471 0.4587 0.5459 0.5089 0.4466 0.4704 0.5160 0.7315 0.4856 0.4477
RF 0.3659 0.3658 0.3655 0.3661 0.3653 0.3658 0.3655 0.3654 0.3658 0.3662 0.3664 0.3654 0.3654
SVR 0.6149 0.4698 0.4973 0.4493 0.4874 0.5753 0.5629 0.4525 0.4505 0.5014 0.4493 0.4446 0.4446
TabNet 0.6218 0.5087 0.5329 0.4963 0.5059 0.4777 0.4986 0.4859 0.4908 0.6674 0.6899 0.4720 0.4910
XGBoost 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017 0.4017
TABLE XVII: MAE by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Abalone Ada 2.5042 2.4037 2.3991 2.3437 2.5275 2.4467 2.5691 2.3881 2.5624 2.4383 2.4968 2.4173 2.4173
CART 2.0805 2.0702 2.0710 2.0582 2.0766 2.0702 2.0582 2.0718 2.0542 2.0789 2.0821 2.0502 2.0502
CatBoost 1.5612 1.5614 1.5612 1.5612 1.5617 1.5611 1.5616 1.5603 1.5620 1.5611 1.5609 1.5618 1.5618
KNN 1.5673 1.5938 1.6102 1.6555 1.5943 1.6962 1.5673 1.6654 1.6140 1.7400 1.6555 1.6416 1.6416
LGBM 1.5476 1.5486 1.5476 1.5500 1.5502 1.5502 1.5502 1.5632 1.5489 1.5476 1.5484 1.5484 1.5494
LinearRegression 1.6187 1.6187 1.6187 1.6187 1.6187 1.6187 1.6187 1.6187 1.6734 1.6187 1.6186 1.6132 1.6132
MLP 1.6159 1.5740 1.5748 1.5318 1.5059 1.5392 1.5258 1.4958 1.5429 1.5844 2.3339 1.5816 1.5126
RF 1.5590 1.5595 1.5584 1.5617 1.5596 1.5601 1.5593 1.5619 1.5613 1.5615 1.5622 1.5613 1.5613
SVR 1.5048 1.5086 1.5111 1.4964 1.4830 1.5272 1.4856 1.4990 1.4972 1.6810 1.4963 1.4900 1.4900
TabNet 2.0912 1.9773 2.3117 1.5785 1.6197 1.4921 1.9571 1.4957 1.8579 2.5043 2.5471 2.0374 1.7154
XGBoost 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623 1.6623
Air Quality Ada 0.8698 0.8698 1.3721 1.3720 1.3685 1.3244 0.8698 0.8698 1.3417 1.0721 1.3685 1.3383 1.3383
CART 0.0203 0.0203 0.0202 0.0202 0.0202 0.0235 0.0203 0.0202 0.0200 0.0202 0.0203 0.0203 0.0203
CatBoost 0.1711 0.1712 0.1710 0.1711 0.1711 0.1863 0.1711 0.1711 0.1712 0.1711 0.1710 0.1714 0.1714
KNN 0.5506 0.6348 0.6755 0.5862 0.5577 0.4738 0.5506 0.8379 0.7639 0.6980 0.5864 0.5759 0.5759
LGBM 0.0677 0.0669 0.0677 0.0728 0.0728 0.0731 0.0728 0.0731 0.0669 0.0677 0.0678 0.0678 0.0730
LinearRegression 0.8282 0.8282 0.8282 0.8282 0.8282 0.8282 0.8282 0.8282 10.9540 0.8282 0.8284 1.2383 1.2383
MLP 1.2505 0.8289 0.0909 0.0698 0.2038 0.0988 1.2563 0.1910 0.8323 0.8479 13.5283 0.3256 0.1266
RF 0.0167 0.0169 0.0167 0.0166 0.0167 0.0169 0.0167 0.0167 0.0159 0.0168 0.0168 0.0166 0.0166
SVR 0.8970 1.2103 1.4802 1.9293 1.7292 0.6919 1.3269 5.4424 4.7159 1.4842 1.9294 1.9787 1.9787
TabNet 0.4924 0.9669 0.6923 0.9567 0.4297 0.5142 0.6516 0.4076 1.1264 3.1115 12.9536 0.3352 0.7997
XGBoost 0.0769 0.0769 0.0769 0.0769 0.0769 0.0764 0.0769 0.0769 0.0769 0.0769 0.0790 0.0769 0.0769
Appliances Energy Prediction Ada 183.0879 124.1897 212.6930 203.8061 216.6901 209.1780 148.9915 148.9915 157.4692 124.1897 148.9851 124.1897 124.1897
CART 40.5540 40.6029 40.5692 40.4712 40.5455 40.5404 40.5573 40.5269 40.3091 40.4087 40.3952 40.4256 40.4256
CatBoost 38.3286 38.2656 38.3309 38.2655 38.2655 38.2659 38.2642 38.2649 38.1813 38.3277 38.2543 38.2625 38.2620
KNN 47.7696 42.0912 45.5977 39.8189 43.2660 34.0929 47.7696 41.5828 41.1120 48.6708 39.8189 40.4084 40.4084
LGBM 39.0633 39.0817 39.0633 39.0969 39.0969 39.0969 39.0969 39.3198 39.2256 39.0633 39.1020 39.1020 39.0960
LinearRegression 52.9227 52.9227 52.9227 52.9227 52.9227 52.9227 52.9227 52.9227 53.9764 52.9227 52.9226 53.2947 53.2947
MLP 53.8080 51.1479 54.1215 47.7318 47.5991 47.7552 49.2947 47.7236 51.0640 53.4270 60.5754 51.3053 48.1224
RF 34.2701 34.2698 34.2923 34.2876 34.2666 34.2645 34.2602 34.2838 34.3702 34.2628 34.2888 34.3111 34.3082
SVR 48.9182 44.1754 45.5442 43.4164 44.1483 46.1472 45.1832 45.1344 43.8597 46.2897 43.4165 43.6265 43.6265
TabNet 43.6365 44.2129 43.0738 44.9472 43.2715 42.8673 43.5117 45.0579 43.5039 44.3062 42.3802 44.3074 44.7821
XGBoost 37.3973 37.3973 37.3973 37.3973 37.3973 37.3973 37.3973 37.3973 37.3315 37.3973 37.3974 37.3973 37.3973
Concrete Compressive Strength Ada 6.4633 6.4051 6.5035 6.3843 6.4167 6.3515 6.3480 6.3515 6.3690 6.3515 6.3553 6.2855 6.2855
CART 4.4604 4.4604 4.4623 4.4623 4.4604 4.4604 4.4647 4.4647 4.5501 4.4604 4.4647 4.4681 4.4681
CatBoost 2.7582 2.7570 2.7581 2.7579 2.7586 2.7574 2.7588 2.7559 2.7539 2.7575 2.7589 2.7529 2.7529
KNN 7.2301 7.6315 7.0405 7.3319 7.0126 9.5001 7.2247 6.4237 5.4080 7.3897 7.3319 6.8437 6.8437
LGBM 3.0480 3.0480 3.0480 3.0473 3.0473 3.0473 3.0473 3.0197 3.0303 3.0480 3.0353 3.0353 3.0364
LinearRegression 8.2986 8.2986 8.2986 8.2986 8.2986 8.2986 8.2986 8.2986 5.7572 8.2986 8.2973 7.0897 7.0897
MLP 5.9685 5.7620 6.4045 4.4758 4.2648 5.2214 5.3184 4.5392 5.3287 8.6028 13.3172 6.5977 5.1541
RF 3.7512 3.7531 3.7503 3.7608 3.7547 3.7539 3.7572 3.7550 3.7495 3.7511 3.7528 3.7430 3.7430
SVR 11.6674 8.6790 8.9978 8.1314 8.2733 10.7062 8.6913 7.0217 6.5147 10.4403 8.1311 7.8360 7.8360
TabNet 1149.8143 35.0352 35.0250 34.2232 39.0829 31.0733 268.8924 34.6500 35.1275 34.9598 35.5915 35.2705 34.6666
XGBoost 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739 3.1739
Forest Fires Ada 34.7652 31.7109 36.7437 29.3869 29.5476 36.1184 43.9637 33.7898 34.3625 35.6878 32.0355 28.5811 28.5811
CART 32.0911 32.2953 32.2953 32.2953 31.8315 32.0911 32.2953 32.2953 31.0349 30.9868 32.2953 30.9321 30.9321
CatBoost 21.9476 21.9470 21.9470 21.9470 21.9597 21.9468 21.9593 21.9470 21.9818 21.9501 21.9498 21.9440 21.9440
KNN 21.5065 20.9572 20.7837 21.0229 21.7229 20.3066 21.5065 19.7818 18.7445 20.2599 21.0229 20.6635 20.6635
LGBM 24.2821 24.2821 24.2821 24.1922 24.1922 24.1922 24.1922 23.9371 24.5966 24.2821 23.7689 23.7689 23.7599
LinearRegression 20.7980 20.7980 20.7980 20.7980 20.7980 20.7980 20.7980 20.7980 20.8690 20.7980 20.7983 20.9673 20.9673
MLP 21.2084 20.5521 20.6192 24.5951 24.6872 24.3192 24.9473 23.9833 20.7647 20.6168 20.1942 20.7236 23.1578
RF 24.2812 24.2711 24.3738 24.3496 24.3699 23.8711 24.3590 24.3443 24.0253 24.3326 24.4005 24.3836 24.3836
SVR 14.9474 14.9723 14.9755 14.9655 14.9625 14.9637 14.9233 14.9590 15.0261 15.0110 14.9655 14.9942 14.9942
TabNet 94.6647 14.9556 14.9436 15.1497 20.5845 17.4966 61.7420 15.0331 14.9516 14.9664 14.9649 14.9054 14.9939
XGBoost 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770 29.0770
Real Estate Valuation Ada 5.6034 5.3289 5.3289 5.4391 5.3289 5.5175 5.5500 5.5659 5.3031 5.1177 5.6168 5.4525 5.4525
CART 5.6076 5.5892 5.5892 5.5892 5.6116 5.5868 5.6116 5.6076 5.6700 5.3548 5.5892 5.6660 5.6660
CatBoost 4.4727 4.4703 4.4703 4.4735 4.4681 4.4711 4.4642 4.4711 4.5069 4.4667 4.4642 4.4764 4.4764
KNN 5.4626 5.6760 5.4435 5.6979 5.3102 6.7592 5.4626 5.5819 5.1686 5.8458 5.6979 5.5866 5.5866
LGBM 4.8106 4.8106 4.8106 4.7284 4.7284 4.7284 4.7284 4.8254 4.8065 4.8106 4.8187 4.8187 4.7326
LinearRegression 6.1848 6.1848 6.1848 6.1848 6.1848 6.1848 6.1848 6.1848 4.9362 6.1848 6.1844 5.7043 5.7043
MLP 5.3601 6.1315 6.1126 5.4400 5.0396 21.0560 8.8945 5.0138 5.4359 6.5674 10.7036 6.3407 5.3635
RF 4.3971 4.3919 4.3951 4.4002 4.4044 4.3726 4.4000 4.3916 4.3922 4.3867 4.3892 4.4019 4.4019
SVR 6.8824 6.1177 6.2845 5.9805 6.1307 7.6388 6.5489 5.8563 5.5097 6.3822 5.9803 5.9063 5.9063
TabNet 1625.5671 38.0836 38.2156 37.9038 37.8126 14245.0982 271.3981 37.6000 37.9313 37.1465 37.7939 38.0048 37.1589
XGBoost 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311 4.7311
Wine Quality Ada 0.5608 0.5751 0.5633 0.5607 0.5719 0.5696 0.5622 0.5606 0.5686 0.5634 0.5612 0.5568 0.5568
CART 0.4928 0.4944 0.4933 0.4923 0.4959 0.4913 0.4913 0.4928 0.4985 0.4949 0.4928 0.4954 0.4954
CatBoost 0.4796 0.4795 0.4796 0.4796 0.4796 0.4796 0.4796 0.4796 0.4797 0.4796 0.4796 0.4796 0.4796
KNN 0.6243 0.5406 0.5231 0.5259 0.5558 0.5967 0.6243 0.5349 0.5275 0.5350 0.5261 0.5280 0.5280
LGBM 0.4832 0.4832 0.4832 0.4829 0.4829 0.4829 0.4829 0.4833 0.4832 0.4832 0.4833 0.4833 0.4829
LinearRegression 0.5639 0.5639 0.5639 0.5639 0.5639 0.5639 0.5639 0.5639 0.5627 0.5639 0.5639 0.5617 0.5617
MLP 0.5692 0.5436 0.5499 0.5187 0.5299 0.5742 0.5563 0.5196 0.5376 0.5598 0.6657 0.5471 0.5261
RF 0.4366 0.4367 0.4368 0.4370 0.4364 0.4365 0.4364 0.4362 0.4371 0.4370 0.4369 0.4365 0.4365
SVR 0.6076 0.5254 0.5453 0.5111 0.5454 0.5921 0.5893 0.5142 0.5080 0.5480 0.5111 0.5065 0.5065
TabNet 0.6061 0.5582 0.5779 0.5453 0.5494 0.5324 0.5574 0.5428 0.5483 0.6509 0.6585 0.5386 0.5515
XGBoost 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639 0.4639
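For reference, the MSE (Table XVI) and MAE (Table XVII) metrics can be computed on the same toy vectors as above; MAE is expressed in the target's units while MSE is in squared units, which is why the two tables differ in scale. A minimal sketch:

from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(mean_squared_error(y_true, y_pred))   # 0.375
print(mean_absolute_error(y_true, y_pred))  # 0.5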
TABLE XVIII: Time to Train (s) by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Abalone Ada 0.0971 0.1051 0.0774 0.0758 0.0961 0.0966 0.0967 0.0810 0.0971 0.0979 0.0973 0.1004 0.0978
CART 0.0095 0.0093 0.0093 0.0093 0.0094 0.0094 0.0093 0.0093 0.0093 0.0095 0.0094 0.0094 0.0094
CatBoost 0.6395 0.6556 0.6344 0.6431 0.6465 0.6445 0.6430 0.6503 0.6530 0.6478 0.6459 0.6400 0.6432
KNN 0.0010 0.0011 0.0010 0.0010 0.0010 0.0010 0.0010 0.0010 0.0010 0.0010 0.0011 0.0011 0.0011
LGBM 0.0286 0.0290 0.0288 0.0301 0.0303 0.0304 0.0304 0.0307 0.0293 0.0297 0.0292 0.0283 0.0303
LinearRegression 0.0003 0.0091 0.0004 0.0003 0.0004 0.0004 0.0003 0.0003 0.0003 0.0004 0.0004 0.0004 0.0004
MLP 0.8414 1.1040 1.1404 0.8932 1.1691 0.7039 1.5363 1.2343 1.8627 1.7507 0.3577 1.4888 1.5986
RF 0.5945 0.6241 0.6035 0.6003 0.5983 0.6006 0.5963 0.5977 0.6015 0.5973 0.6022 0.5991 0.5997
SVR 0.1382 0.1348 0.1383 0.1392 0.1382 0.1383 0.1397 0.1440 0.1391 0.1367 0.1369 0.1367 0.1372
TabNet 2.8896 2.9717 2.8425 2.7805 2.7096 2.6584 2.7499 2.8266 2.6811 2.9476 2.7365 2.8925 2.7076
XGBoost 0.0287 0.0291 0.0303 0.0290 0.0296 0.0300 0.0289 0.0305 0.0284 0.0284 0.0286 0.0286 0.0285
Air Quality Ada 0.2988 0.3068 0.1755 0.1752 0.2095 0.1829 0.3000 0.2992 0.1339 0.3224 0.2055 0.1709 0.1701
CART 0.0297 0.0293 0.0296 0.0293 0.0294 0.0294 0.0292 0.0295 0.0292 0.0295 0.0289 0.0292 0.0296
CatBoost 0.9429 0.9439 0.9445 0.9462 0.9445 0.9468 0.9331 0.9437 0.9376 0.9450 0.9409 0.9413 0.9282
KNN 0.0032 0.0033 0.0032 0.0031 0.0032 0.0032 0.0033 0.0031 0.0032 0.0035 0.0032 0.0034 0.0034
LGBM 0.0391 0.0384 0.0395 0.0392 0.0388 0.0390 0.0393 0.0396 0.0391 0.0405 0.0390 0.0377 0.0391
LinearRegression 0.0006 0.0008 0.0006 0.0005 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006 0.0006
MLP 0.5701 1.8147 3.3321 2.1864 1.9174 2.4677 1.5022 1.8880 2.8593 4.4512 6.4014 5.9718 4.9511
RF 1.8482 1.8810 1.8664 1.8547 1.8550 1.8489 1.8500 1.8659 1.9096 1.8457 1.8309 1.8409 1.8465
SVR 0.6315 0.5922 0.5742 0.4535 0.4049 0.3582 0.4079 0.7643 0.5973 0.5838 0.4433 0.4812 0.4799
TabNet 8.0233 8.2011 8.0828 8.2231 8.0882 8.0729 8.3433 7.9980 7.9860 8.2220 8.2484 8.3199 8.1277
XGBoost 0.0419 0.0422 0.0438 0.0431 0.0431 0.0415 0.0421 0.0450 0.0417 0.0419 0.0412 0.0418 0.0418
Appliances Energy Prediction Ada 1.1052 0.5338 1.6459 1.4369 1.6431 1.6525 0.6962 0.6970 0.7305 0.5321 0.7032 0.5343 0.5393
CART 0.2910 0.2860 0.2859 0.2859 0.2854 0.2859 0.2864 0.2866 0.2854 0.2863 0.2861 0.2857 0.2876
CatBoost 1.3813 1.3763 1.3719 1.3633 1.3738 1.3676 1.3759 1.3649 1.3771 1.3725 1.3655 1.3775 1.3878
KNN 0.0004 0.0004 0.0004 0.0004 0.0005 0.0004 0.0004 0.0004 0.0004 0.0005 0.0004 0.0004 0.0004
LGBM 0.0522 0.0508 0.0532 0.0552 0.0551 0.0553 0.0553 0.0545 0.0520 0.0533 0.0527 0.0511 0.0580
LinearRegression 0.0013 0.0020 0.0014 0.0015 0.0015 0.0016 0.0016 0.0014 0.0014 0.0017 0.0016 0.0016 0.0016
MLP 3.3408 14.3606 12.6484 17.8381 16.0423 10.0974 12.9527 16.1881 14.7886 15.5263 0.3227 15.9089 16.6124
RF 18.2580 18.4922 18.3987 18.3023 18.3270 18.3389 18.2072 18.2253 18.5976 18.2453 18.2356 18.5935 18.3139
SVR 3.6868 3.6235 3.6063 3.6767 3.5467 3.7634 3.5353 3.7207 3.7124 3.5425 3.5711 3.5809 3.5765
TabNet 17.4699 17.6083 17.7432 17.2341 17.3906 17.3073 17.9814 17.3632 17.5384 17.8526 17.6793 17.8897 17.6663
XGBoost 0.0772 0.0779 0.0818 0.0820 0.0787 0.0774 0.0773 0.0821 0.0764 0.0772 0.0773 0.0774 0.0769
Concrete Compressive Strength Ada 0.0399 0.0397 0.0397 0.0400 0.0395 0.0396 0.0396 0.0398 0.0398 0.0397 0.0414 0.0403 0.0402
CART 0.0020 0.0020 0.0020 0.0027 0.0020 0.0020 0.0020 0.0021 0.0020 0.0020 0.0020 0.0020 0.0021
CatBoost 0.4633 0.4593 0.4706 0.4671 0.4660 0.4733 0.4634 0.4705 0.4645 0.4675 0.4638 0.4664 0.4646
KNN 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004
LGBM 0.0221 0.0227 0.0232 0.0231 0.0231 0.0264 0.0236 0.0237 0.0231 0.0230 0.0230 0.0222 0.0245
LinearRegression 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0004 0.0004
MLP 0.1335 0.7523 0.7281 0.8502 0.8581 0.8483 0.6685 0.8323 0.7186 0.7916 0.1296 0.7726 0.8178
RF 0.1397 0.1399 0.1396 0.1390 0.1389 0.1388 0.1387 0.1438 0.1392 0.1399 0.1385 0.1390 0.1388
SVR 0.0086 0.0083 0.0093 0.0087 0.0084 0.0098 0.0086 0.0091 0.0087 0.0083 0.0084 0.0084 0.0083
TabNet 0.0348 0.0342 0.0356 0.0348 0.0346 0.0355 0.0357 0.0343 0.0365 0.0357 0.0356 0.0361 0.0345
XGBoost 0.0253 0.0242 0.0261 0.0265 0.0249 0.0243 0.0243 0.0257 0.0237 0.0241 0.0239 0.0240 0.0237
Forest Fires Ada 0.0200 0.0117 0.0193 0.0121 0.0141 0.0302 0.0302 0.0153 0.0218 0.0143 0.0148 0.0139 0.0139
CART 0.0014 0.0014 0.0016 0.0014 0.0015 0.0014 0.0014 0.0014 0.0015 0.0015 0.0015 0.0014 0.0015
CatBoost 0.3892 0.3886 0.3922 0.3870 0.3903 0.3968 0.3836 0.3906 0.3869 0.3945 0.3834 0.3869 0.3951
KNN 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
LGBM 0.0115 0.0120 0.0121 0.0119 0.0119 0.0127 0.0117 0.0118 0.0122 0.0121 0.0121 0.0119 0.0116
LinearRegression 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
MLP 0.2457 0.3855 0.3816 0.4316 0.4790 0.4430 0.4089 0.4330 0.3801 0.3860 0.0644 0.3873 0.4248
RF 0.0997 0.1010 0.1006 0.1002 0.1006 0.1001 0.1002 0.1063 0.1001 0.1006 0.1007 0.1000 0.1005
SVR 0.0025 0.0026 0.0028 0.0027 0.0025 0.0026 0.0025 0.0025 0.0025 0.0024 0.0026 0.0026 0.0026
TabNet 0.0325 0.0303 0.0320 0.0361 0.0306 0.0308 0.0325 0.0308 0.0328 0.0325 0.0323 0.0340 0.0306
XGBoost 0.0215 0.0207 0.0205 0.0223 0.0206 0.0202 0.0202 0.0214 0.0197 0.0199 0.0198 0.0199 0.0198
Real Estate Valuation Ada 0.0268 0.0269 0.0267 0.0267 0.0267 0.0268 0.0269 0.0269 0.0269 0.0269 0.0282 0.0276 0.0270
CART 0.0008 0.0008 0.0009 0.0008 0.0009 0.0008 0.0009 0.0008 0.0008 0.0009 0.0009 0.0009 0.0009
CatBoost 0.3852 0.3966 0.3833 0.3865 0.3852 0.3937 0.3925 0.3936 0.3962 0.3872 0.3816 0.3950 0.3953
KNN 0.0002 0.0002 0.0002 0.0002 0.0003 0.0002 0.0002 0.0002 0.0002 0.0003 0.0002 0.0003 0.0002
LGBM 0.0090 0.0095 0.0094 0.0094 0.0097 0.0100 0.0094 0.0098 0.0095 0.0094 0.0094 0.0091 0.0096
LinearRegression 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
MLP 0.1022 0.3264 0.3386 0.3599 0.3951 0.1556 0.1754 0.3778 0.3339 0.3329 0.1273 0.3417 0.3463
RF 0.0664 0.0672 0.0666 0.0663 0.0666 0.0660 0.0665 0.0689 0.0664 0.0660 0.0667 0.0663 0.0665
SVR 0.0019 0.0017 0.0018 0.0023 0.0017 0.0017 0.0018 0.0017 0.0017 0.0017 0.0017 0.0017 0.0016
TabNet 0.0306 0.0295 0.0319 0.0342 0.0296 0.0295 0.0312 0.0311 0.0327 0.0328 0.0314 0.0328 0.0318
XGBoost 0.0205 0.0206 0.0193 0.0205 0.0200 0.0195 0.0193 0.0205 0.0190 0.0190 0.0190 0.0193 0.0190
Wine Quality Ada 0.1357 0.1484 0.1660 0.1705 0.1126 0.1520 0.1815 0.1810 0.1824 0.1809 0.1299 0.1360 0.1364
CART 0.0200 0.0199 0.0198 0.0198 0.0198 0.0198 0.0199 0.0199 0.0198 0.0200 0.0198 0.0198 0.0199
CatBoost 0.7668 0.7568 0.7679 0.7628 0.7745 0.7653 0.7652 0.7749 0.7634 0.7586 0.7743 0.7668 0.7617
KNN 0.0022 0.0022 0.0022 0.0023 0.0022 0.0021 0.0021 0.0022 0.0023 0.0022 0.0023 0.0024 0.0023
LGBM 0.0314 0.0317 0.0313 0.0322 0.0323 0.0324 0.0321 0.0334 0.0324 0.0325 0.0313 0.0310 0.0328
LinearRegression 0.0004 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005
MLP 0.4713 1.7105 1.3455 2.3311 1.5885 0.8618 0.4947 1.6345 1.5528 1.3377 0.0908 1.3502 1.9695
RF 1.2844 1.3053 1.2891 1.2968 1.2922 1.2920 1.2855 1.3414 1.2833 1.2865 1.2846 1.2828 1.2808
SVR 0.3166 0.3176 0.3181 0.3239 0.3157 0.3177 0.3194 0.3213 0.3149 0.3054 0.3211 0.3169 0.3190
TabNet 5.5797 5.4072 5.3061 5.4588 5.3787 5.3858 5.6280 5.7121 5.6199 5.4181 5.5159 5.3852 5.5306
XGBoost 0.0350 0.0351 0.0334 0.0342 0.0353 0.0335 0.0336 0.0351 0.0328 0.0332 0.0331 0.0331 0.0332
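Training times such as those in Table XVIII amount to timing the model's fit call. A self-contained sketch follows, using synthetic data via make_regression rather than the benchmark datasets; the paper's exact protocol may differ.

import time
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=1000, n_features=8, random_state=0)
start = time.perf_counter()  # time only the training step
RandomForestRegressor(random_state=0).fit(X, y)
print(f"train time: {time.perf_counter() - start:.4f} s")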
TABLE XIX: Time to Inference (s) by dataset, model, and scaling method.
Dataset Model NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Abalone Ada 0.0025 0.0026 0.0019 0.0019 0.0025 0.0025 0.0025 0.0020 0.0025 0.0025 0.0025 0.0025 0.0025
CART 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0001 0.0002 0.0001 0.0002 0.0002 0.0002 0.0002
CatBoost 0.0007 0.0008 0.0010 0.0009 0.0009 0.0010 0.0010 0.0008 0.0007 0.0009 0.0008 0.0008 0.0007
KNN 0.0034 0.0045 0.0048 0.0042 0.0045 0.0036 0.0035 0.0041 0.0042 0.0029 0.0044 0.0043 0.0044
LGBM 0.0006 0.0008 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007 0.0007
LinearRegression 0.0000 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
RF 0.0110 0.0107 0.0111 0.0110 0.0110 0.0110 0.0110 0.0109 0.0111 0.0109 0.0110 0.0111 0.0112
SVR 0.0652 0.0653 0.0652 0.0656 0.0647 0.0647 0.0654 0.0666 0.0647 0.0646 0.0642 0.0643 0.0643
TabNet 0.0094 0.0095 0.0092 0.0091 0.0092 0.0092 0.0094 0.0094 0.0092 0.0093 0.0092 0.0095 0.0092
XGBoost 0.0004 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004
Air Quality Ada 0.0028 0.0028 0.0016 0.0016 0.0019 0.0017 0.0028 0.0028 0.0012 0.0029 0.0018 0.0015 0.0016
CART 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
CatBoost 0.0011 0.0012 0.0010 0.0012 0.0011 0.0010 0.0010 0.0012 0.0013 0.0011 0.0010 0.0012 0.0011
KNN 0.0171 0.0322 0.0317 0.0338 0.0215 0.0190 0.0173 0.0381 0.0289 0.0238 0.0346 0.0282 0.0276
LGBM 0.0010 0.0010 0.0009 0.0009 0.0009 0.0010 0.0009 0.0009 0.0009 0.0009 0.0009 0.0009 0.0009
LinearRegression 0.0000 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0005 0.0007 0.0006 0.0006 0.0006 0.0006 0.0005 0.0006 0.0006 0.0006 0.0006 0.0007 0.0006
RF 0.0169 0.0167 0.0171 0.0170 0.0171 0.0169 0.0171 0.0171 0.0173 0.0170 0.0170 0.0171 0.0167
SVR 0.2986 0.2736 0.2676 0.1981 0.1717 0.1479 0.1690 0.3599 0.2825 0.2772 0.1929 0.2070 0.2069
TabNet 0.0167 0.0165 0.0165 0.0166 0.0166 0.0166 0.0167 0.0166 0.0164 0.0171 0.0165 0.0170 0.0167
XGBoost 0.0005 0.0006 0.0006 0.0007 0.0007 0.0006 0.0006 0.0007 0.0006 0.0005 0.0005 0.0006 0.0005
Appliances Energy Prediction Ada 0.0053 0.0024 0.0084 0.0072 0.0089 0.0083 0.0034 0.0033 0.0035 0.0024 0.0035 0.0024 0.0026
CART 0.0008 0.0008 0.0008 0.0008 0.0008 0.0007 0.0008 0.0008 0.0007 0.0008 0.0008 0.0008 0.0008
CatBoost 0.0017 0.0016 0.0015 0.0017 0.0017 0.0014 0.0013 0.0016 0.0015 0.0016 0.0014 0.0016 0.0017
KNN 0.0222 0.0214 0.0215 0.0225 0.0210 0.0204 0.0217 0.0216 0.0256 0.0214 0.0211 0.0213 0.0206
LGBM 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0016 0.0017 0.0017 0.0017 0.0016 0.0016 0.0020
LinearRegression 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
MLP 0.0012 0.0013 0.0012 0.0012 0.0016 0.0014 0.0013 0.0012 0.0012 0.0014 0.0013 0.0013 0.0012
RF 0.0694 0.0686 0.0697 0.0700 0.0694 0.0702 0.0693 0.0692 0.0700 0.0692 0.0701 0.0701 0.0696
SVR 1.8656 1.7988 1.8341 1.7995 1.8016 1.7985 1.7991 1.8159 1.8055 1.7986 1.7933 1.7972 1.7930
TabNet 0.0347 0.0346 0.0345 0.0343 0.0347 0.0347 0.0350 0.0346 0.0349 0.0349 0.0350 0.0347 0.0350
XGBoost 0.0008 0.0009 0.0009 0.0010 0.0010 0.0009 0.0009 0.0010 0.0008 0.0008 0.0009 0.0010 0.0009
Concrete Compressive Strength Ada 0.0015 0.0015 0.0015 0.0015 0.0015 0.0015 0.0015 0.0015 0.0015 0.0015 0.0016 0.0015 0.0015
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0006 0.0006 0.0005 0.0005 0.0005 0.0005 0.0005 0.0006 0.0005 0.0006 0.0005 0.0005 0.0007
KNN 0.0009 0.0009 0.0010 0.0010 0.0010 0.0007 0.0009 0.0010 0.0010 0.0008 0.0011 0.0010 0.0009
LGBM 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005
LinearRegression 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0001 0.0002 0.0002 0.0001 0.0002 0.0002 0.0001 0.0001 0.0001 0.0001 0.0001 0.0002 0.0001
RF 0.0033 0.0034 0.0034 0.0032 0.0034 0.0033 0.0033 0.0034 0.0034 0.0033 0.0032 0.0033 0.0033
SVR 0.0044 0.0040 0.0043 0.0041 0.0041 0.0041 0.0041 0.0041 0.0041 0.0041 0.0040 0.0040 0.0040
TabNet 0.0043 0.0041 0.0040 0.0043 0.0043 0.0044 0.0043 0.0041 0.0043 0.0043 0.0041 0.0045 0.0044
XGBoost 0.0003 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
Forest Fires Ada 0.0008 0.0005 0.0008 0.0005 0.0006 0.0013 0.0013 0.0007 0.0009 0.0006 0.0007 0.0006 0.0006
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0005 0.0004 0.0005 0.0004 0.0004 0.0005 0.0005 0.0004 0.0006 0.0004 0.0004 0.0005 0.0005
KNN 0.0003 0.0005 0.0006 0.0006 0.0004 0.0005 0.0004 0.0005 0.0006 0.0005 0.0007 0.0006 0.0006
LGBM 0.0003 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
LinearRegression 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0018 0.0019
SVR 0.0011 0.0011 0.0012 0.0011 0.0011 0.0011 0.0011 0.0013 0.0012 0.0011 0.0011 0.0011 0.0011
TabNet 0.0040 0.0036 0.0037 0.0044 0.0037 0.0037 0.0039 0.0039 0.0037 0.0039 0.0036 0.0039 0.0049
XGBoost 0.0003 0.0004 0.0003 0.0004 0.0003 0.0003 0.0003 0.0004 0.0003 0.0003 0.0003 0.0003 0.0003
Real Estate Valuation Ada 0.0012 0.0013 0.0012 0.0013 0.0012 0.0013 0.0013 0.0013 0.0012 0.0013 0.0013 0.0013 0.0012
CART 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
CatBoost 0.0005 0.0004 0.0005 0.0005 0.0005 0.0004 0.0004 0.0005 0.0005 0.0004 0.0005 0.0004 0.0004
KNN 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0004 0.0006 0.0003
LGBM 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
LinearRegression 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
RF 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017 0.0017
SVR 0.0007 0.0007 0.0009 0.0007 0.0007 0.0007 0.0007 0.0007 0.0008 0.0007 0.0007 0.0007 0.0007
TabNet 0.0033 0.0033 0.0034 0.0035 0.0034 0.0034 0.0035 0.0036 0.0034 0.0035 0.0035 0.0034 0.0041
XGBoost 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
Wine Quality Ada 0.0024 0.0026 0.0030 0.0031 0.0019 0.0027 0.0034 0.0034 0.0033 0.0034 0.0022 0.0025 0.0026
CART 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
CatBoost 0.0011 0.0012 0.0011 0.0009 0.0010 0.0009 0.0009 0.0009 0.0011 0.0010 0.0011 0.0009 0.0007
KNN 0.0048 0.0286 0.0257 0.0369 0.0086 0.0042 0.0047 0.0329 0.0321 0.0169 0.0385 0.0321 0.0310
LGBM 0.0008 0.0008 0.0008 0.0008 0.0008 0.0008 0.0008 0.0009 0.0008 0.0008 0.0008 0.0008 0.0008
LinearRegression 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
MLP 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0004 0.0005 0.0004 0.0005 0.0004 0.0004 0.0004
RF 0.0143 0.0148 0.0147 0.0145 0.0147 0.0145 0.0143 0.0151 0.0147 0.0147 0.0146 0.0146 0.0143
SVR 0.1501 0.1434 0.1527 0.1419 0.1478 0.1521 0.1517 0.1422 0.1396 0.1503 0.1413 0.1406 0.1406
TabNet 0.0116 0.0115 0.0115 0.0115 0.0115 0.0114 0.0120 0.0115 0.0118 0.0117 0.0120 0.0116 0.0117
XGBoost 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005 0.0005
TABLE XX: Memory Usage (kB) by dataset and scaling method.
Dataset NO MM MA ZSN PS VAST MC RS QT DS TT LS HT
Breast Cancer Wisconsin Diagnostic 0.1875 176.8210 175.7480 176.2767 229.9746 229.1289 228.2461 384.9801 267.6338 229.8695 232.2894 230.8722 232.3309
Dry Bean 1704.2923 2388.0706 2448.4049 2599.3600 2404.5947 2405.1895 2404.3014 2388.2383 2520.3968 2406.4144 2406.9097 2405.5592 2408.1296
Glass Identification 0.1875 23.2331 23.2064 26.1930 36.6759 36.0576 34.7992 51.7995 34.2393 36.1146 38.0356 37.9139 40.7885
Heart Disease 33.6901 49.0889 67.2025 71.7708 62.5277 62.8802 62.7256 121.9645 74.4512 63.1191 64.0603 64.0238 65.3331
Iris 0.1875 8.6296 8.7682 10.5488 16.9004 16.4834 15.6758 20.2601 22.0664 16.0218 18.4840 18.5913 20.3986
Letter Recognition 0.1875 2567.9333 3566.3210 3787.2116 3629.7077 3629.3434 3628.4954 2568.3698 3318.4468 3628.8434 3632.1798 3631.9978 3742.5224
Magic Gamma Telescope 0.1875 1552.4850 1552.0501 1552.2611 2198.9515 2199.2367 2198.2923 1552.7106 1792.9922 2198.5312 2201.3631 2201.1431 2306.3952
Rice Cammeo And Osmancik 211.2500 297.4365 358.2734 378.0465 304.2510 304.3102 303.1006 297.6322 359.7692 304.2708 305.9161 305.7961 328.2706
Wine 21.0007 31.2210 40.3809 43.8314 45.0397 44.8984 44.5498 73.9326 46.5228 45.7445 46.4517 47.1243 47.9475
Abalone 0.1875 294.7493 294.4055 294.5703 355.6029 354.9523 353.4513 295.0916 360.0691 353.7972 356.8120 356.2417 380.5150
Air Quality 880.1009 1233.9790 1294.5547 1373.1566 1247.1017 1246.0932 1245.9476 1234.2925 1335.2915 1246.6934 1247.9374 1247.9843 1248.8872
Appliances Energy Prediction 4164.2812 5832.9988 5894.4865 6261.5162 5867.4751 5867.0134 5865.2385 5832.9714 6051.0226 5868.1863 5868.2539 5868.5447 5970.7936
Concrete Compressive Strength 65.6575 94.2771 137.7692 144.9830 104.1352 103.5251 102.7008 94.4553 181.1929 103.1220 105.1453 104.8095 112.0503
Forest Fires 43.2770 62.2524 87.1918 92.4231 72.8778 72.3182 71.9400 157.8339 105.3538 72.2997 72.9565 72.5387 74.4466
Real Estate Valuation 22.2614 32.7211 43.1918 46.3411 38.4530 38.1743 36.8636 79.0026 67.4926 38.1566 40.1310 39.8900 43.6567
Wine Quality 0.1875 624.8189 624.3658 624.5845 833.4797 833.3540 831.6786 625.0661 744.7711 832.4357 834.7470 835.6562 871.7646
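Memory figures like those in Table XX can be approximated by tracing allocations during the scaling step. The sketch below uses Python's tracemalloc on a synthetic matrix shaped like the Wine Quality data; it is an illustration under these assumptions, not the paper's exact measurement protocol.

import tracemalloc
import numpy as np
from sklearn.preprocessing import QuantileTransformer

X = np.random.default_rng(0).normal(size=(4898, 11))  # Wine Quality-like shape
tracemalloc.start()
X_scaled = QuantileTransformer(n_quantiles=100).fit_transform(X)
_, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
tracemalloc.stop()
print(f"peak allocation: {peak / 1024:.1f} kB")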
\EOD