MariaDB REGEXP_INSTR() Function

The MariaDB REGEXP_INSTR() function returns the starting index of a regular expression pattern in the given string. It returns 0 if there is no match. If the string or pattern is NULL, the return value will be NULL. Character indexes begin at 1.

The REGEXP_INSTR() function performs a case-insensitive match, except when used with binary strings.

Syntax

REGEXP_INSTR(string, pattern)

Parameters

`string`	`Required.` Specify the string to search.
`pattern`	`Required.` Specify the regular expression matching information. See the table below for pattern syntax.

Regular Expression Patterns

Symbol	Description
^	Matches the beginning of a string.
$	Matches the end of a string.
*	Matches zero or more occurrences.
+	Matches one or more occurrences.
?	Matches zero or one occurrence.
.	Matches any character except NULL.
\|	Used like an "OR" to specify more than one alternative.
[ ]	Matches any single character specified within [].
[^ ]	Matches any single character that is not specified within [].
-	Represents a range of characters.
( )	Used to group expressions as a subexpression.
{m}	Matches m times.
{m,}	Matches at least m times.
{m,n}	Matches at least m times, but no more than n times.
\n	n is a number between 1 and 9. Matches the nth subexpression found within ( ) before encountering \n.
\d	Matches a digit character.
\D	Matches a nondigit character.
\w	Matches a word character.
\W	Matches a nonword character.
\s	Matches a whitespace character.
\S	matches a non-whitespace character.
*?	Matches the preceding pattern zero or more occurrences.
+?	Matches the preceding pattern one or more occurrences.
??	Matches the preceding pattern zero or one occurrence.
{n}?	Matches the preceding pattern n times.
{n,}?	Matches the preceding pattern at least n times.
{n,m}?	Matches the preceding pattern at least n times, but not more than m times.
[..]	Matches one collation element that can be more than one character.
[=character_class=]	Represents an equivalence class. It matches all characters with the same collation value, including itself.
[:character_class:]	Represents a character class that matches all characters belonging to that class.

List of standard class name

Character Class Name	Meaning
alnum	Alphanumeric characters
alpha	Alphabetic characters
blank	Whitespace characters
cntrl	Control characters
digit	Digit characters
graph	Graphic characters
lower	Lowercase alphabetic characters
print	Graphic or space characters
punct	Punctuation characters
space	Space, tab, newline, and carriage return
upper	Uppercase alphabetic characters
xdigit	Hexadecimal digit characters

Example: REGEXP_INSTR() examples

The example below shows the usage of REGEXP_INSTR() function.

SELECT REGEXP_INSTR('dog cat dog', 'dog');
Result: 1

SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
Result: 1

SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
Result: 8

Example: Case-sensitive search

There are many ways in MariaDB which can be used for case-sensitive search while using this function. See the example below:

SELECT REGEXP_INSTR('ABC','b');
Result: 2

/* using COLLATE to convert string to a binary string */
SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'b');
Result: 0

/* using BINARY to convert string to a binary string */
SELECT REGEXP_INSTR(BINARY'ABC','b');
Result: 0

/* forces case-sensitive using (?-i) pattern */
SELECT REGEXP_INSTR('ABC','(?-i)b');
Result: 0

/* forces case-insensitive search */
SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?i)b');
Result: 2

/* forces case-insensitive search */
SELECT REGEXP_INSTR('ABC' COLLATE utf8_bin,'(?+i)b');
Result: 2

Example: Match on more than one alternative

Consider a database table called Employee with the following records:

EmpID	Name	City	Age	Salary
1	John	London	25	3000
2	Marry	New York	24	2750
3	Jo	Paris	27	2800
4	Smith	Amsterdam	30	3100
5	Aman	New Delhi	28	3000

The below query is used to get the location of first occurrence of vowel in the records of Name column. To specify more than one alternative in the pattern, | is used.

SELECT *, 
REGEXP_INSTR(Name, 'a|e|i|o|u') AS first_occurrence
FROM Employee

This will produce a result similar to:

EmpID	Name	City	Age	Salary	first_occurrence
1	John	London	25	3000	2
2	Marry	New York	24	2750	2
3	Jo	Paris	27	2800	2
4	Smith	Amsterdam	30	3100	3
5	Aman	New Delhi	28	3000	1

Example: Matching on multiple characters

Consider the above discussed table. The below query is used to get the location of first occurrence of "ar" or "er" (whichever comes first) in the records of City column. To specify more than one alternative in the pattern, [ ] is used

SELECT *, 
REGEXP_INSTR(City, '[ae]r') AS first_occurrence
FROM Employee

This will produce a result similar to:

EmpID	Name	City	Age	Salary	first_occurrence
1	John	London	25	3000	0
2	Marry	New York	24	2750	0
3	Jo	Paris	27	2800	2
4	Smith	Amsterdam	30	3100	5
5	Aman	New Delhi	28	3000	0

Example: Using with Multibyte character

Casting a Multibyte character set as BINARY is useful to get the offsets in bytes. See the example below:

SELECT REGEXP_INSTR('BJÖRN','N');
Result: 5

/* using BINARY to convert string to a binary string */
SELECT REGEXP_INSTR(BINARY 'BJÖRN','N');
Result: 6

❮ MariaDB - REGEXP