regexFindStr

Syntax

regexFindStr(str, pattern, [onlyFirst=true], [offset=0])

Arguments

str is a STRING scalar or vector indicating the target string(s) to be searched.

pattern is a string indicating the regular expression to match. It can contain literals and metacharacters.

onlyFirst (optional) is a Boolean value indicating whether to return only the first substring that matches pattern for each string.

  • true (default): Return the first match.

  • false: Return all non-overlapping matches.

offset (optional) is a non-negative integer indicating the starting position for the search in str. The default value is 0, which is the first position of str.

Details

Unlike regexFind, which returns the positions of the matched strings, regexFindStr searches str starting from the offset position and returns the matched substrings themselves.

  • When str is a scalar:

    • If onlyFirst is set to true, return the first substring that matches pattern. If no match is found, return an empty string.

    • If onlyFirst is set to false, return a STRING vector containing all non-overlapping matches. If no match is found, return an empty STRING vector.

  • When str is a vector:

    • If onlyFirst is set to true, return a STRING vector of the same length as str, containing the first substring that matches pattern for each string of str. An element is an empty string if the corresponding string has no match.

    • If onlyFirst is set to false, return a tuple of the same length as str, where each element is a STRING vector containing all non-overlapping matches for the corresponding string of str. An element is an empty STRING vector if the corresponding string has no match.

Examples

// when str is a scalar and onlyFirst = true
regexFindStr('234AA(2)BBB S&P', '([A|B|C|+|-]*)', true)
//output: AA

// when str is a scalar and onlyFirst = false
regexFindStr('234AA(2)BBB S&P', '([A|B|C|+|-]*)', false)
//output: ["AA","BBB"]

// when str is a vector and onlyFirst = true
regexFindStr(['234AA(2)BBBS&P', '234AA(2)BBBS&P'], '([A|B|C|+|-]*)', true)
//output: ["AA","AA"]

// when str is a vector and onlyFirst = false
regexFindStr(['234AA(2)BBBS&P', '234AA(2)BBBS&P'], '([A|B|C|+|-]*)', false)
//output: (["AA","BBB"],["AA","BBB"])
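
// when offset is set (a sketch based on the Arguments description, assuming zero-based positions): the search starts at position 5, so the leading "AA" at positions 3-4 should be skipped
regexFindStr('234AA(2)BBB S&P', '([A|B|C|+|-]*)', true, 5)

// when no substring matches (per Details, an empty string is the expected result)
regexFindStr('12345', '([A|B|C|+|-]*)', true)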