winsorize(X, limit, [inclusive=true], [nanPolicy='upper'])


X is a vector.

limit is a scalar or a vector of 2 elements specifying the fraction of data to cut on each side of X, given as floating-point values between 0 and 1 and taken relative to the number of unmasked data. If limit is a scalar, the same fraction is cut on both sides of X. If X has n elements (including NULLs), the (n * limit[0])-th smallest element and the (n * limit[1])-th largest element are masked, and the total number of unmasked data after trimming is n * (1-sum(limit)). Set an element of limit to 0 to indicate that no masking is conducted on that side.

inclusive is a Boolean scalar or a vector of 2 Boolean elements indicating whether the number of data masked on each side should be truncated (true) or rounded (false). The default value is true.
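
The counting rule described for limit and inclusive can be sketched outside the database in a few lines of Python. This is only an illustration of the arithmetic under stated assumptions, not the function's actual implementation: the helper name cut_counts is hypothetical, and how exact halves are rounded when inclusive is false is an assumption (Python's built-in round is used here).

```python
def cut_counts(n, limit, inclusive=True):
    """Hypothetical helper: how many elements are affected on each side.

    n         -- number of elements in X
    limit     -- one fraction, or a (lower, upper) pair of fractions in [0, 1]
    inclusive -- one flag, or a (lower, upper) pair; True truncates, False rounds
    """
    # A scalar applies to both sides.
    if isinstance(limit, (int, float)):
        limit = (limit, limit)
    if isinstance(inclusive, bool):
        inclusive = (inclusive, inclusive)

    counts = []
    for frac, incl in zip(limit, inclusive):
        if not 0 <= frac <= 1:
            raise ValueError("each limit must be between 0 and 1")
        raw = n * frac
        # Truncate when inclusive is true, round otherwise
        # (rounding mode for exact halves is an assumption).
        counts.append(int(raw) if incl else int(round(raw)))
    return tuple(counts)


# n = 10 with limits 0.12 and 0.17: 1.2 and 1.7 elements before adjustment.
print(cut_counts(10, (0.12, 0.17)))                   # truncated
print(cut_counts(10, (0.12, 0.17), inclusive=False))  # rounded
```

With n = 10, truncation gives one element on each side, while rounding turns 1.7 into two elements on the upper side.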

nanPolicy is a string indicating how to handle NULL values. The following options are available (default is 'upper'):

  • 'upper': allows NULL values and treats them as the largest values of X.

  • 'lower': allows NULL values and treats them as the smallest values of X.

  • 'raise': throws an error.

  • 'omit': performs the calculations without masking NULL values.
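
The effect of the four options on where NULL values fall in the sort order can be sketched in Python, with None standing in for NULL. This is a rough illustration under assumptions, not the function's internals: the helper name order_with_nulls is hypothetical, 'raise' is assumed to throw only when X actually contains NULLs, and 'omit' is modeled as simply leaving NULLs out of the ranking.

```python
def order_with_nulls(values, nan_policy="upper"):
    """Hypothetical helper: indices of `values` in ascending order.

    None plays the role of NULL. The result shows which positions would
    count as the smallest and largest elements under each policy.
    """
    if nan_policy not in ("upper", "lower", "raise", "omit"):
        raise ValueError("unknown nan_policy: %r" % nan_policy)

    indices = range(len(values))
    has_null = any(v is None for v in values)

    if nan_policy == "raise" and has_null:
        # Assumption: the error is triggered only when NULLs are present.
        raise ValueError("X contains NULL values")

    if nan_policy == "omit":
        # Assumption: NULLs are left out of the ranking entirely.
        indices = [i for i in indices if values[i] is not None]
        return sorted(indices, key=lambda i: values[i])

    # 'upper' ranks NULLs after every value, 'lower' before every value.
    nulls_last = nan_policy != "lower"

    def key(i):
        is_null = values[i] is None
        return (is_null == nulls_last, 0 if is_null else values[i])

    return sorted(indices, key=key)


data = [3, None, 1]
print(order_with_nulls(data, "upper"))  # NULL ranked as the largest
print(order_with_nulls(data, "lower"))  # NULL ranked as the smallest
print(order_with_nulls(data, "omit"))   # NULL left out of the ranking
```

Because 'upper' and 'lower' place NULLs at one end of the ordering, they decide whether NULL positions fall inside the upper or the lower cut.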


Returns a winsorized version of X.


$ x=1..10;
$ winsorize(x, 0.1);

$ winsorize(x, 0.12 0.17);

$ winsorize(x, 0.12 0.17, inclusive=false);

$ x=1..20;
$ x[19:]=NULL;
$ x;

$ winsorize(x, 0.1);

$ winsorize(x, 0.1, nanPolicy='upper');

$ winsorize(x, 0.1, nanPolicy='lower');