qcut
Syntax
qcut(X, q, [labels], [dropDuplicates=false])
Details
Assigns each element of a numeric vector to a quantile bin based on its rank. For example, given 1,000 values and q=10, the values are divided into 10 quantile bins and the function returns the bin that each element belongs to.
Parameters
X: A numeric vector.
q: An INT scalar or a FLOATING vector.
- An INT scalar specifies the number of quantile bins (e.g., 10 for deciles, 4 for quartiles).
- A FLOATING vector specifies the quantile breakpoints. It must contain at least two elements, with values in the range [0, 1].
labels (optional): A vector of labels for each quantile bin.
- It defaults to NULL, in which case the function returns an integer vector holding the zero-based bin index of each element.
- If q is a scalar, the length of labels must equal q.
- If q is a vector, the length of labels must be len(q) - 1.
dropDuplicates (optional): A Boolean value specifying whether to drop duplicate bin boundaries.
- It defaults to false, in which case an error is raised if duplicate boundaries exist.
- If it is set to true, duplicate boundaries are removed, so the result may contain fewer bins than requested.
Returns
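A vector of the same length as X indicating the quantile bin to which each element belongs: integer bin indices if labels is not specified, otherwise the corresponding labels.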
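The binning described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the function's actual implementation: the names qcut_sketch and quantile are hypothetical, and linear-interpolation quantiles with right-closed bins are assumptions chosen because they reproduce the outputs in the Examples section. The dropDuplicates=true behavior is not modeled; duplicate boundaries simply raise an error, mirroring the default.

```python
from bisect import bisect_left


def quantile(sorted_vals, p):
    """Quantile by linear interpolation (an assumption, not confirmed for qcut)."""
    pos = (len(sorted_vals) - 1) * p
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def qcut_sketch(x, q, labels=None):
    """Hypothetical model of qcut: return the bin index (or label) per element."""
    # An integer q means q equal-probability bins; a list gives breakpoints directly.
    probs = [i / q for i in range(q + 1)] if isinstance(q, int) else list(q)
    s = sorted(x)
    edges = [quantile(s, p) for p in probs]
    # Mirrors dropDuplicates=false: non-unique boundaries are an error.
    if len(set(edges)) != len(edges):
        raise ValueError("duplicate bin boundaries")
    # Right-closed bins (assumed): a value equal to an inner edge stays in the lower bin.
    inner = edges[1:-1]
    idx = [bisect_left(inner, v) for v in x]
    return idx if labels is None else [labels[i] for i in idx]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(qcut_sketch(data, 4))
print(qcut_sketch(data, [0, 0.3, 0.7, 1.0]))
print(qcut_sketch(data, 4, ["Q1", "Q2", "Q3", "Q4"]))
```

With q=4 the computed boundaries are 1, 3.25, 5.5, 7.75 and 10, which is why the values 1 to 3 share the first bin and 8 to 10 share the last.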
Examples
// Divide the data into 4 quantile bins
qcut([1,2,3,4,5,6,7,8,9,10], 4)
// Output: [0 0 0 1 1 2 2 3 3 3]
// Divide using custom quantile breakpoints: 0–30%, 30–70%, 70–100%
qcut([1,2,3,4,5,6,7,8,9,10], [0, 0.3, 0.7, 1.0])
// Output: [0 0 0 1 1 1 1 2 2 2]
// Divide the data into 4 quantile bins and use custom labels
qcut([1,2,3,4,5,6,7,8,9,10], 4, ["Q1", "Q2", "Q3", "Q4"])
// Output: [Q1 Q1 Q1 Q2 Q2 Q3 Q3 Q4 Q4 Q4]
/* Because the data contains many duplicate values,
the quantile boundaries are not unique.
With dropDuplicates=true, duplicate boundaries are removed automatically,
resulting in fewer than 4 quantile bins.
*/
qcut(X=[1, 1, 1, 1, 2, 3], q=4, dropDuplicates=true)
// Output: [0 0 0 0 2 2]
