Data Normalization Tool

Normalize numerical CSV data or single columns with Min-Max scaling, Z-score standardization, Robust scaling, Log transform, and Decimal scaling. View before/after statistics and distribution histograms.


Before / After Statistics

Example: a 10-row × 3-column sample, Min-Max scaled to [0, 1]:

| Column | Before min | Before max | Before mean | Before std | After min | After max | After mean | After std |
|---|---|---|---|---|---|---|---|---|
| size | 750.0000 | 2000.0000 | 1310.0000 | 402.3680 | 0.0000 | 1.0000 | 0.4480 | 0.3219 |
| bedrooms | 1.0000 | 4.0000 | 2.7000 | 0.9000 | 0.0000 | 1.0000 | 0.5667 | 0.3000 |
| price | 160000.0000 | 380000.0000 | 257500.0000 | 68383.1120 | 0.0000 | 1.0000 | 0.4432 | 0.3108 |

Distribution — Before (gray) vs After (accent): overlaid before/after histograms are rendered for each normalized column (size, bedrooms, price).

Normalization Methods Compared

Each normalization method transforms the data differently and is suited to different algorithms and data characteristics. Choosing the wrong scaling method can hurt model performance or produce misleading analysis results.

| Method | Formula | Output range | Sensitive to outliers | Use when |
|---|---|---|---|---|
| Min-Max | (x − min) / (max − min) | [0, 1]* | Yes | Neural networks, image processing |
| Z-Score | (x − μ) / σ | ≈ [−3, 3] | Yes | Linear models, PCA, SVM |
| Robust | (x − median) / IQR | Unbounded | No | Data with outliers |
| Log | log(x + 1) | Unbounded | Moderate | Right-skewed, count data |
| Decimal | x / 10ʲ | [−1, 1] | Moderate | Simple, interpretable scaling |

* Min-Max output range is [0, 1] by default; custom target range is configurable.
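The formulas in the table can be sketched in Python with NumPy. The function names below are illustrative (not the tool's internals), and edge cases such as constant columns (zero range or zero IQR) are not handled:

```python
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    """Min-Max: scale to [lo, hi] (default [0, 1])."""
    x = np.asarray(x, dtype=float)
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

def z_score(x):
    """Z-score: zero mean, unit (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def robust(x):
    """Robust: center on the median, scale by the IQR."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return (x - np.median(x)) / (q3 - q1)

def log_transform(x):
    """Log: log(x + 1); defined only for x >= 0."""
    return np.log1p(np.asarray(x, dtype=float))

def decimal_scaling(x):
    """Decimal: divide by 10**j, the smallest j with max|x| / 10**j < 1."""
    x = np.asarray(x, dtype=float)
    j = int(np.floor(np.log10(np.abs(x).max()))) + 1
    return x / 10.0 ** j
```

scikit-learn's MinMaxScaler, StandardScaler, and RobustScaler implement the same ideas behind a fit/transform API, which is usually the better choice inside an ML pipeline.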

Before and After Statistics: What to Look For

After normalization, compare the before/after statistics to verify the scaling behaved as expected. For Min-Max scaling, the after min should be 0 (or your custom minimum) and after max should be 1. For Z-score standardization, after mean should be approximately 0 and after std should be approximately 1. For Robust scaling, the after median should be near 0. Log transformation will compress the std relative to the mean, making the data more concentrated. The histogram overlay shows how the shape of the distribution changes — or in the case of Z-score, remains similar while the scale shifts.

After Min-Max scaling:

  • Min = 0 (or custom target min)
  • Max = 1 (or custom target max)
  • Shape of distribution preserved
  • Mean shifts to reflect relative position

After Z-score standardization:

  • Mean ≈ 0
  • Std ≈ 1
  • Shape of distribution preserved
  • Outliers remain as extreme Z values
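Both checklists can be verified numerically. The column values below are illustrative (not the tool's exact sample), and the formulas are restated inline so the check is self-contained:

```python
import numpy as np

# Illustrative "size" column (hypothetical values).
size = np.array([750.0, 900.0, 1100.0, 1200.0, 1250.0,
                 1300.0, 1400.0, 1600.0, 1800.0, 2000.0])

# Min-Max: bounds are hit exactly, and rank order (shape) is preserved.
mm = (size - size.min()) / (size.max() - size.min())
assert mm.min() == 0.0 and mm.max() == 1.0
assert (np.argsort(mm) == np.argsort(size)).all()

# Z-score: mean ~ 0 and std ~ 1, with rank order preserved.
z = (size - size.mean()) / size.std()
assert abs(z.mean()) < 1e-9 and abs(z.std() - 1.0) < 1e-9
assert (np.argsort(z) == np.argsort(size)).all()

# Outliers stay extreme: the largest raw value is the largest Z value.
assert np.argmax(z) == np.argmax(size)
```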

Input Format: CSV and Single Column

The tool automatically detects whether you have pasted a single column of numbers or a multi-column CSV. For CSV data, the first row is treated as a header if it contains non-numeric values. You can then select which columns to normalize — unselected columns are passed through unchanged in the output CSV. The output CSV preserves the original structure with all values replaced by their normalized equivalents to 6 decimal places. This format is directly compatible with pandas read_csv(), NumPy loadtxt(), and most ML preprocessing pipelines.
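As a sketch, output in this shape loads directly with NumPy; the CSV fragment below is illustrative, not actual tool output:

```python
import io
import numpy as np

# Hypothetical normalized output: header row plus 6-decimal values.
csv_text = """size,bedrooms,price
0.000000,0.000000,0.000000
0.448000,0.566700,0.443200
1.000000,1.000000,1.000000
"""

# Skip the header row for loadtxt; read column names separately.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)
header = csv_text.splitlines()[0].split(",")

assert header == ["size", "bedrooms", "price"]
assert data.shape == (3, 3)
assert data.min() >= 0.0 and data.max() <= 1.0  # Min-Max output is bounded
```

With pandas, `pd.read_csv(path)` reads the same file and infers the header row automatically.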

Frequently Asked Questions

What is the difference between normalization and standardization?

Normalization typically refers to Min-Max scaling, which transforms data to a fixed range — usually [0, 1]. It preserves the relative distances between values and is suitable when the algorithm requires bounded inputs, such as neural networks with sigmoid activations. Standardization (Z-score) transforms data to have zero mean and unit standard deviation, producing values typically in the range [−3, 3]. It does not bound the output, so extreme outliers retain their scale. Standardization is generally preferred for algorithms that assume Gaussian-distributed features, such as linear regression, logistic regression, PCA, and SVMs.

When should I use Robust scaling instead of Z-score?

Use Robust scaling when your dataset contains significant outliers that you do not want to remove but also do not want to dominate the scaling. Robust scaling uses the median and IQR instead of the mean and standard deviation: x_rob = (x − median) / IQR. Since median and IQR are not sensitive to extreme values, outliers do not inflate the scale. This is especially useful for financial data (rare large transactions), sensor data (spike noise), or any domain where occasional extremes are real but should not compress the bulk of the distribution into a narrow range.
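The difference is easy to demonstrate with one spike in otherwise clustered data (illustrative values):

```python
import numpy as np

# Five clustered readings plus one large spike.
x = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 1000.0])

# Z-score: the outlier inflates the mean and std, squashing the bulk.
z = (x - x.mean()) / x.std()

# Robust: median and IQR barely notice the spike.
q1, q3 = np.percentile(x, [25, 75])
r = (x - np.median(x)) / (q3 - q1)

z_bulk = z[:5].max() - z[:5].min()  # spread of the five normal points
r_bulk = r[:5].max() - r[:5].min()
assert r_bulk > 50 * z_bulk         # Robust preserves the bulk's spread
```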

What does Log transformation do to the data distribution?

Log transformation (log(x + 1)) compresses the high end of the value range while expanding the low end, pulling right-skewed distributions toward symmetry. It is commonly applied to income data, population counts, transaction amounts, and any feature where large values appear rarely but span many orders of magnitude. The +1 offset ensures log(0) is defined (log(0+1) = 0). Log transform is not defined for negative values, so it should only be applied to non-negative features. After log transformation, features that follow a log-normal distribution become approximately Gaussian, which benefits many ML algorithms.
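A minimal sketch with NumPy, which provides `np.log1p` for an accurate log(x + 1) and `np.expm1` as its exact inverse (the count values are illustrative):

```python
import numpy as np

counts = np.array([0.0, 9.0, 99.0, 999.0, 99999.0])
logged = np.log1p(counts)

assert logged[0] == 0.0                       # log(0 + 1) = 0
assert np.allclose(np.expm1(logged), counts)  # expm1 inverts log1p

# Compression: 99999 is over 10,000x larger than 9 on the raw scale,
# but only about 5x larger after the transform.
assert logged[-1] / logged[1] < 6
```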

Why is feature scaling required before machine learning?

Many ML algorithms are sensitive to feature magnitude. Distance-based algorithms like KNN and K-Means rely on Euclidean distance — a feature with values in the range [0, 10000] will dominate a feature with values in [0, 1] purely because of the scale difference. Gradient descent converges much faster when features have similar ranges, because the loss landscape becomes more spherical. PCA and other variance-based methods are biased toward high-variance features when features are not standardized. Tree-based algorithms (Random Forest, XGBoost) are generally scale-invariant because they split on thresholds, not distances.
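The distance-dominance effect can be shown directly; the two [size, bedrooms] points below are illustrative:

```python
import numpy as np

a = np.array([750.0, 1.0])    # [size, bedrooms]
b = np.array([2000.0, 4.0])

# Raw scale: bedrooms contributes (4 - 1)^2 = 9 to a squared distance
# of 1250^2 + 9, i.e. less than 0.001% — size alone decides the distance.
raw = np.linalg.norm(a - b)
bedroom_share = (b[1] - a[1]) ** 2 / raw ** 2
assert bedroom_share < 1e-4

# After Min-Max scaling each feature to [0, 1], both features differ
# by 1 and contribute equally to the distance.
a_s = np.array([0.0, 0.0])
b_s = np.array([1.0, 1.0])
assert np.isclose(np.linalg.norm(a_s - b_s), np.sqrt(2.0))
```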

What is Decimal Scaling normalization?

Decimal scaling divides every value by a power of 10 chosen to make the maximum absolute value less than 1. Specifically, it divides by 10^j, where j is the smallest integer such that max|x| / 10^j < 1. For example, if the maximum absolute value is 4500, then j = 4 and every value is divided by 10000. The result always lies strictly within (−1, 1), and in [0, 1) for non-negative data. Decimal scaling is simple and interpretable, preserves proportionality, and is often used in older preprocessing pipelines. Unlike Min-Max scaling, it does not shift the data — it only scales it.
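The j-selection rule can be checked directly. `decimal_j` below is a hypothetical helper that literally searches for the smallest j satisfying max|x| / 10^j < 1:

```python
import numpy as np

def decimal_j(x):
    """Smallest integer j with max|x| / 10**j < 1."""
    m = np.abs(np.asarray(x, dtype=float)).max()
    j = 0
    while m / 10 ** j >= 1.0:
        j += 1
    return j

values = np.array([4500.0, -120.0, 37.5])
j = decimal_j(values)
assert j == 4                      # matches the worked example above
scaled = values / 10 ** j
assert np.abs(scaled).max() < 1.0  # every |value| now below 1
```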