Regular Expressions (RegExp)
When working with strings in VBA, you may encounter complex pattern matching requirements that simple search and replace functions can’t handle. That’s where Regular Expressions (RegExp) become invaluable.
With regular expressions, you can efficiently perform advanced string operations like email format validation, phone number extraction, and pattern-based text replacement.
This article covers everything you need to know about using the RegExp object in VBA, from basics to practical applications.
When You Need Regular Expressions
Regular expressions are powerful in these real-world scenarios:
- Input validation: Check formats for email addresses, phone numbers, postal codes
- Data extraction: Extract specific patterns from HTML or text
- Complex replacement: Transform only parts matching certain patterns
- Log file analysis: Extract dates, IP addresses, or other formatted data
- Data cleansing: Remove unwanted characters or whitespace in bulk
Regular expressions enable “pattern-based matching” that’s difficult to achieve with functions like InStr or Replace.
RegExp Object Basics
Creating a RegExp Object
To use regular expressions in VBA, create a VBScript.RegExp object. There are two methods:
Method 1: Using Reference Setting (Early Binding)
In the VBA Editor, go to “Tools” → “References” and check Microsoft VBScript Regular Expressions 5.5.
Sub RegExpEarlyBinding()
' Requires reference setting
Dim re As RegExp
Set re = New RegExp
re.Pattern = "\d+"
Debug.Print re.Test("abc123") ' True
End Sub
Method 2: Using CreateObject (Late Binding)
This method works without reference settings and is more portable.
Sub RegExpLateBinding()
' No reference setting required
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d+"
Debug.Print re.Test("abc123") ' True
End Sub
The CreateObject method is recommended because it doesn’t require reference settings, making macro sharing across environments smoother.
Key RegExp Properties
| Property | Description | Default |
|---|---|---|
Pattern | The regular expression pattern | None |
Global | Search entire string | False |
IgnoreCase | Case-insensitive matching | False |
MultiLine | Enable multi-line mode | False |
Pattern Property
Specifies the regular expression pattern for searching.
Sub PatternExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
' Digit pattern
re.Pattern = "\d+"
Debug.Print re.Test("Order123") ' True
' Email pattern
re.Pattern = "[\w\.-]+@[\w\.-]+\.\w+"
Debug.Print re.Test("[email protected]") ' True
End Sub
Global Property
When set to True, searches the entire string and returns all matches. When False (default), returns only the first match.
Sub GlobalExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d+"
' Global = False (default): first match only
re.Global = False
Dim matches As Object
Set matches = re.Execute("abc123def456ghi789")
Debug.Print "Global=False: " & matches.Count & " match(es)" ' 1
' Global = True: all matches
re.Global = True
Set matches = re.Execute("abc123def456ghi789")
Debug.Print "Global=True: " & matches.Count & " match(es)" ' 3
End Sub
IgnoreCase Property
When set to True, performs case-insensitive matching.
Sub IgnoreCaseExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "hello"
' IgnoreCase = False (default): case-sensitive
re.IgnoreCase = False
Debug.Print re.Test("HELLO") ' False
' IgnoreCase = True: case-insensitive
re.IgnoreCase = True
Debug.Print re.Test("HELLO") ' True
End Sub
Key RegExp Methods
| Method | Description | Return Type |
|---|---|---|
Test | Check if pattern matches | Boolean |
Execute | Get matched strings | MatchCollection |
Replace | Replace matched strings | String |
Test Method: Check for Pattern Match
The Test method returns True or False indicating whether the string matches the pattern. Ideal for input validation.
Sub TestMethodExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
' Phone number format check (e.g., 555-123-4567)
re.Pattern = "^\d{3}-\d{3}-\d{4}$"
Debug.Print re.Test("555-123-4567") ' True
Debug.Print re.Test("555-1234-567") ' False
Debug.Print re.Test("5551234567") ' False
Debug.Print re.Test("abc-def-ghij") ' False
End Sub
Execute Method: Get Matched Strings
The Execute method returns matches as a MatchCollection object. Each match is stored as a Match object.
Sub ExecuteMethodExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d+"
re.Global = True
Dim matches As Object
Dim match As Object
Set matches = re.Execute("Order123 has 45 items totaling $678")
Debug.Print "Match count: " & matches.Count
For Each match In matches
Debug.Print "Value: " & match.Value & _
", Position: " & match.FirstIndex & _
", Length: " & match.Length
Next match
' Output:
' Value: 123, Position: 5, Length: 3
' Value: 45, Position: 13, Length: 2
' Value: 678, Position: 32, Length: 3
End Sub
Replace Method: Replace Matched Strings
The Replace method replaces pattern matches with specified text.
Sub ReplaceMethodExample()
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d+"
re.Global = True
Dim result As String
result = re.Replace("Order123 has 45 items", "X")
Debug.Print result ' OrderX has X items
End Sub
Common Pattern Syntax
| Pattern | Description | Example |
|---|---|---|
\d | Any digit (0-9) | \d+ matches “123” |
\w | Word character (a-z, A-Z, 0-9, _) | \w+ matches “abc” |
\s | Whitespace character | \s+ matches spaces |
. | Any character except newline | a.c matches “abc” |
* | Zero or more of preceding | ab*c matches “ac” |
+ | One or more of preceding | ab+c matches “abc” |
? | Zero or one of preceding | ab?c matches “ac” |
{n} | Exactly n occurrences | a{3} matches “aaa” |
{n,m} | Between n and m occurrences | a{2,4} matches “aa” to “aaaa” |
[] | Character class | [abc] matches “a”, “b”, or “c” |
[^] | Negated character class | [^abc] matches anything except “a”, “b”, “c” |
^ | Start of string | ^Hello matches “Hello World” |
$ | End of string | World$ matches “Hello World” |
| | Alternation (OR) | cat|dog matches “cat” or “dog” |
() | Grouping and capturing | (ab)+ matches “abab” |
Practical Examples
Email Address Validation
Function IsValidEmail(email As String) As Boolean
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
' Basic email pattern
re.Pattern = "^[\w\.-]+@[\w\.-]+\.\w{2,}$"
re.IgnoreCase = True
IsValidEmail = re.Test(email)
End Function
Sub TestEmailValidation()
Debug.Print IsValidEmail("[email protected]") ' True
Debug.Print IsValidEmail("[email protected]") ' True
Debug.Print IsValidEmail("invalid@") ' False
Debug.Print IsValidEmail("@domain.com") ' False
End Sub
Phone Number Formatting
Function FormatPhoneNumber(phone As String) As String
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
' Remove all non-digits
re.Pattern = "\D"
re.Global = True
Dim digits As String
digits = re.Replace(phone, "")
' Format as XXX-XXX-XXXX if 10 digits
If Len(digits) = 10 Then
FormatPhoneNumber = Left(digits, 3) & "-" & _
Mid(digits, 4, 3) & "-" & _
Right(digits, 4)
Else
FormatPhoneNumber = phone ' Return original if not 10 digits
End If
End Function
Sub TestPhoneFormatting()
Debug.Print FormatPhoneNumber("5551234567") ' 555-123-4567
Debug.Print FormatPhoneNumber("(555) 123-4567") ' 555-123-4567
Debug.Print FormatPhoneNumber("555.123.4567") ' 555-123-4567
End Sub
Extracting URLs from Text
Function ExtractURLs(text As String) As Collection
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "https?://[\w\.-]+(?:/[\w\.-]*)*"
re.Global = True
re.IgnoreCase = True
Dim matches As Object
Set matches = re.Execute(text)
Dim urls As New Collection
Dim match As Object
For Each match In matches
urls.Add match.Value
Next match
Set ExtractURLs = urls
End Function
Sub TestExtractURLs()
Dim text As String
text = "Visit https://www.example.com or http://test.org/page for more info."
Dim urls As Collection
Set urls = ExtractURLs(text)
Dim url As Variant
For Each url In urls
Debug.Print url
Next url
' Output:
' https://www.example.com
' http://test.org/page
End Sub
Data Cleansing
Function CleanText(text As String) As String
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Global = True
' Remove extra whitespace
re.Pattern = "\s+"
Dim result As String
result = re.Replace(text, " ")
' Remove special characters (keep alphanumeric and basic punctuation)
re.Pattern = "[^\w\s\.\,\!\?]"
result = re.Replace(result, "")
CleanText = Trim(result)
End Function
Sub TestCleanText()
Dim messy As String
messy = " Hello!!! World### How are you??? "
Debug.Print CleanText(messy) ' Hello! World How are you?
End Sub
Parsing CSV with Quoted Fields
Function ParseCSVLine(csvLine As String) As Collection
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
' Match quoted or unquoted fields
re.Pattern = """([^""]*)""|([^,]+)"
re.Global = True
Dim matches As Object
Set matches = re.Execute(csvLine)
Dim fields As New Collection
Dim match As Object
For Each match In matches
If match.SubMatches(0) <> "" Then
fields.Add match.SubMatches(0) ' Quoted field
Else
fields.Add match.SubMatches(1) ' Unquoted field
End If
Next match
Set ParseCSVLine = fields
End Function
Performance Considerations
Reuse RegExp Objects
Creating RegExp objects is relatively expensive. Reuse them when processing multiple strings:
Sub ProcessManyStrings()
' Create once
Dim re As Object
Set re = CreateObject("VBScript.RegExp")
re.Pattern = "\d+"
re.Global = True
Dim texts As Variant
texts = Array("abc123", "def456", "ghi789")
' Reuse for each string
Dim text As Variant
For Each text In texts
Debug.Print re.Execute(text).Count & " matches in " & text
Next text
End Sub
Summary
Regular expressions in VBA provide powerful pattern matching capabilities:
- Test method: Quick validation checks
- Execute method: Extract all matches with position information
- Replace method: Pattern-based text transformation
Master regular expressions to handle complex string processing tasks efficiently. While the initial learning curve may be steep, the productivity gains are well worth the investment.