How to Handle Outliers

Brain Glitch
1 min read · Feb 3, 2023

1. Imputation Techniques: Imputation techniques replace outliers with a substitute value rather than removing them, so the dataset keeps the same number of observations. The following are the most commonly used imputation techniques:

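  • Imputation (capping): Replacing each outlier with a substitute value, such as the mean, median, mode, or a static value determined by a business expert. Capping, strictly speaking, replaces each outlier with the nearest acceptable boundary value, such as a chosen percentile.
  • Mean imputation: Replacing the outliers with the mean of the dataset, ideally computed without the outliers, because the mean is itself pulled toward extreme values.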
  • Median imputation: Replacing the outliers with the median of the dataset.
  • Mode imputation: Replacing the outliers with the mode of the dataset.
  • Zero imputation: Replacing the outliers with zero.
  • Max imputation: Replacing high outliers with the maximum non-outlier value in the dataset (the overall maximum would itself be an outlier).
  • Min imputation: Replacing low outliers with the minimum non-outlier value in the dataset.
  • Static value imputation: Replacing the outliers with a value determined by a business expert.
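
The strategies above might be sketched as follows. This is a minimal illustration, not a definitive implementation: the post does not say how outliers are detected, so the 1.5 × IQR fence rule used here is an assumption, and the names iqr_fences, impute_outliers, and readings are hypothetical. Only the Python standard library is used.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences from the k * IQR rule (an assumed detection rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def impute_outliers(values, strategy="median", static_value=None):
    """Replace values outside the IQR fences according to `strategy`."""
    lower, upper = iqr_fences(values)
    inliers = [v for v in values if lower <= v <= upper]

    if strategy == "cap":
        # Capping: pull each outlier back to the nearest fence.
        return [min(max(v, lower), upper) for v in values]

    # Each statistic is computed from the inliers only, so the outliers
    # cannot distort their own replacement value.
    fill_functions = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
        "max": max,
        "min": min,
    }
    if strategy == "zero":
        fill = 0
    elif strategy == "static":
        fill = static_value  # value supplied by a domain expert
    else:
        fill = fill_functions[strategy](inliers)
    return [v if lower <= v <= upper else fill for v in values]

# Hypothetical sample: 95 lies far outside the rest of the readings.
readings = [10, 12, 11, 13, 12, 11, 95]
print(impute_outliers(readings, "median"))
print(impute_outliers(readings, "cap"))
```

Computing the fill value from the inliers is a design choice made here for illustration; computing it from the full dataset, as the bullets literally describe, would let the outliers influence their own replacement.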

2. Transformation: Transformation techniques are used to reduce the impact of outliers on the dataset. The following are the most commonly used transformation techniques:

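  • Normalization (0 to 1): Rescaling the data to a range of 0 to 1 (min-max scaling). Because the minimum and maximum define the range, an extreme outlier compresses the remaining values into a narrow band.
  • Standardization (z-score): Rescaling the data to a mean of 0 and a standard deviation of 1. The result is not bounded, but for roughly normal data almost all values fall between -3 and +3, so values outside that range are often treated as outliers.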
  • Cube root transformation: Taking the cube root of the data.
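  • Square root transformation: Taking the square root of the data (non-negative values only).
  • Reciprocal: Taking the reciprocal (1/x) of the data (non-zero values only).
  • Log reciprocal: Taking the logarithm of the reciprocal of the data, log(1/x), which equals -log(x) (positive values only).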
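
A short sketch of these transformations, using only the Python standard library, could look like the following. The function names and the sample list are hypothetical, and the population standard deviation is an assumed choice for standardization.

```python
import math
import statistics

def min_max_scale(values):
    """Normalization: map the minimum to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mu) / sigma for v in values]

def cube_root(values):
    """Cube root; copysign keeps the sign, so negative inputs are allowed."""
    return [math.copysign(abs(v) ** (1 / 3), v) for v in values]

def square_root(values):
    """Square root; every value must be non-negative."""
    return [math.sqrt(v) for v in values]

def reciprocal(values):
    """Reciprocal; every value must be non-zero."""
    return [1 / v for v in values]

def log_reciprocal(values):
    """log(1/x), which equals -log(x); every value must be positive."""
    return [math.log(1 / v) for v in values]

# Hypothetical sample with one large value.
sample = [1.0, 4.0, 9.0, 100.0]
print(min_max_scale(sample))
print(standardize(sample))
print(square_root(sample))
```

In this sample the largest value is 100 times the smallest before the square root and 10 times the smallest after it, which is the sense in which such transformations reduce the influence of extreme values.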

3. Deleting Observations: Deleting the observations that contain outliers is another way to handle them. This approach is generally not recommended as a first choice because it shrinks the dataset and can discard valuable information. However, it may be necessary when the outliers are clearly irrelevant or corrupt, such as data-entry errors.
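
As a rough sketch of deletion, the snippet below keeps the dropped values so they can be reviewed before being discarded. As before, the 1.5 × IQR rule is an assumption rather than something the post specifies, and drop_outliers and readings are hypothetical names.

```python
import statistics

def drop_outliers(values, k=1.5):
    """Split values into (kept, dropped) using the k * IQR rule (assumed)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if lower <= v <= upper]
    dropped = [v for v in values if not (lower <= v <= upper)]
    return kept, dropped

# Hypothetical sample: 95 lies far outside the rest of the readings.
readings = [10, 12, 11, 13, 12, 11, 95]
kept, dropped = drop_outliers(readings)
print(f"kept {len(kept)} of {len(readings)} values; dropped: {dropped}")
```

Returning the dropped values alongside the kept ones makes it easier to check whether they really are corrupt before the information is lost.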
