Regular Expressions (RegExp)

Maintained on

When working with strings in VBA, you may encounter complex pattern matching requirements that simple search and replace functions can’t handle. That’s where Regular Expressions (RegExp) become invaluable.

With regular expressions, you can efficiently perform advanced string operations like email format validation, phone number extraction, and pattern-based text replacement.

This article covers everything you need to know about using the RegExp object in VBA, from basics to practical applications.

When You Need Regular Expressions

Regular expressions are powerful in these real-world scenarios:

  • Input validation: Check formats for email addresses, phone numbers, postal codes
  • Data extraction: Extract specific patterns from HTML or text
  • Complex replacement: Transform only parts matching certain patterns
  • Log file analysis: Extract dates, IP addresses, or other formatted data
  • Data cleansing: Remove unwanted characters or whitespace in bulk
チェック

Regular expressions enable “pattern-based matching” that’s difficult to achieve with functions like InStr or Replace.

RegExp Object Basics

Creating a RegExp Object

To use regular expressions in VBA, create a VBScript.RegExp object. There are two methods:

Method 1: Using Reference Setting (Early Binding)

In the VBA Editor, go to “Tools” → “References” and check Microsoft VBScript Regular Expressions 5.5.

regexp_early_binding.bas
Sub RegExpEarlyBinding()
    ' Requires reference setting
    Dim re As RegExp
    Set re = New RegExp

    re.Pattern = "\d+"
    Debug.Print re.Test("abc123")  ' True
End Sub

Method 2: Using CreateObject (Late Binding)

This method works without reference settings and is more portable.

regexp_late_binding.bas
Sub RegExpLateBinding()
    ' No reference setting required
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "\d+"
    Debug.Print re.Test("abc123")  ' True
End Sub
チェック

The CreateObject method is recommended because it doesn’t require reference settings, making macro sharing across environments smoother.

Key RegExp Properties

PropertyDescriptionDefault
PatternThe regular expression patternNone
GlobalSearch entire stringFalse
IgnoreCaseCase-insensitive matchingFalse
MultiLineEnable multi-line modeFalse

Pattern Property

Specifies the regular expression pattern for searching.

pattern_example.bas
Sub PatternExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Digit pattern
    re.Pattern = "\d+"
    Debug.Print re.Test("Order123")  ' True

    ' Email pattern
    re.Pattern = "[\w\.-]+@[\w\.-]+\.\w+"
    Debug.Print re.Test("[email protected]")  ' True
End Sub

Global Property

When set to True, searches the entire string and returns all matches. When False (default), returns only the first match.

global_example.bas
Sub GlobalExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "\d+"

    ' Global = False (default): first match only
    re.Global = False
    Dim matches As Object
    Set matches = re.Execute("abc123def456ghi789")
    Debug.Print "Global=False: " & matches.Count & " match(es)"  ' 1

    ' Global = True: all matches
    re.Global = True
    Set matches = re.Execute("abc123def456ghi789")
    Debug.Print "Global=True: " & matches.Count & " match(es)"  ' 3
End Sub

IgnoreCase Property

When set to True, performs case-insensitive matching.

ignorecase_example.bas
Sub IgnoreCaseExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "hello"

    ' IgnoreCase = False (default): case-sensitive
    re.IgnoreCase = False
    Debug.Print re.Test("HELLO")  ' False

    ' IgnoreCase = True: case-insensitive
    re.IgnoreCase = True
    Debug.Print re.Test("HELLO")  ' True
End Sub

Key RegExp Methods

MethodDescriptionReturn Type
TestCheck if pattern matchesBoolean
ExecuteGet matched stringsMatchCollection
ReplaceReplace matched stringsString

Test Method: Check for Pattern Match

The Test method returns True or False indicating whether the string matches the pattern. Ideal for input validation.

test_method.bas
Sub TestMethodExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Phone number format check (e.g., 555-123-4567)
    re.Pattern = "^\d{3}-\d{3}-\d{4}$"

    Debug.Print re.Test("555-123-4567")   ' True
    Debug.Print re.Test("555-1234-567")   ' False
    Debug.Print re.Test("5551234567")     ' False
    Debug.Print re.Test("abc-def-ghij")   ' False
End Sub

Execute Method: Get Matched Strings

The Execute method returns matches as a MatchCollection object. Each match is stored as a Match object.

execute_method.bas
Sub ExecuteMethodExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "\d+"
    re.Global = True

    Dim matches As Object
    Dim match As Object

    Set matches = re.Execute("Order123 has 45 items totaling $678")

    Debug.Print "Match count: " & matches.Count

    For Each match In matches
        Debug.Print "Value: " & match.Value & _
                    ", Position: " & match.FirstIndex & _
                    ", Length: " & match.Length
    Next match
    ' Output:
    ' Value: 123, Position: 5, Length: 3
    ' Value: 45, Position: 13, Length: 2
    ' Value: 678, Position: 32, Length: 3
End Sub

Replace Method: Replace Matched Strings

The Replace method replaces pattern matches with specified text.

replace_method.bas
Sub ReplaceMethodExample()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "\d+"
    re.Global = True

    Dim result As String
    result = re.Replace("Order123 has 45 items", "X")

    Debug.Print result  ' OrderX has X items
End Sub

Common Pattern Syntax

PatternDescriptionExample
\dAny digit (0-9)\d+ matches “123”
\wWord character (a-z, A-Z, 0-9, _)\w+ matches “abc”
\sWhitespace character\s+ matches spaces
.Any character except newlinea.c matches “abc”
*Zero or more of precedingab*c matches “ac”
+One or more of precedingab+c matches “abc”
?Zero or one of precedingab?c matches “ac”
{n}Exactly n occurrencesa{3} matches “aaa”
{n,m}Between n and m occurrencesa{2,4} matches “aa” to “aaaa”
[]Character class[abc] matches “a”, “b”, or “c”
[^]Negated character class[^abc] matches anything except “a”, “b”, “c”
^Start of string^Hello matches “Hello World”
$End of stringWorld$ matches “Hello World”
|Alternation (OR)cat|dog matches “cat” or “dog”
()Grouping and capturing(ab)+ matches “abab”

Practical Examples

Email Address Validation

email_validation.bas
Function IsValidEmail(email As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Basic email pattern
    re.Pattern = "^[\w\.-]+@[\w\.-]+\.\w{2,}$"
    re.IgnoreCase = True

    IsValidEmail = re.Test(email)
End Function

Sub TestEmailValidation()
    Debug.Print IsValidEmail("[email protected]")     ' True
    Debug.Print IsValidEmail("[email protected]")  ' True
    Debug.Print IsValidEmail("invalid@")             ' False
    Debug.Print IsValidEmail("@domain.com")          ' False
End Sub

Phone Number Formatting

phone_formatting.bas
Function FormatPhoneNumber(phone As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Remove all non-digits
    re.Pattern = "\D"
    re.Global = True

    Dim digits As String
    digits = re.Replace(phone, "")

    ' Format as XXX-XXX-XXXX if 10 digits
    If Len(digits) = 10 Then
        FormatPhoneNumber = Left(digits, 3) & "-" & _
                            Mid(digits, 4, 3) & "-" & _
                            Right(digits, 4)
    Else
        FormatPhoneNumber = phone  ' Return original if not 10 digits
    End If
End Function

Sub TestPhoneFormatting()
    Debug.Print FormatPhoneNumber("5551234567")      ' 555-123-4567
    Debug.Print FormatPhoneNumber("(555) 123-4567")  ' 555-123-4567
    Debug.Print FormatPhoneNumber("555.123.4567")    ' 555-123-4567
End Sub

Extracting URLs from Text

extract_urls.bas
Function ExtractURLs(text As String) As Collection
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Pattern = "https?://[\w\.-]+(?:/[\w\.-]*)*"
    re.Global = True
    re.IgnoreCase = True

    Dim matches As Object
    Set matches = re.Execute(text)

    Dim urls As New Collection
    Dim match As Object
    For Each match In matches
        urls.Add match.Value
    Next match

    Set ExtractURLs = urls
End Function

Sub TestExtractURLs()
    Dim text As String
    text = "Visit https://www.example.com or http://test.org/page for more info."

    Dim urls As Collection
    Set urls = ExtractURLs(text)

    Dim url As Variant
    For Each url In urls
        Debug.Print url
    Next url
    ' Output:
    ' https://www.example.com
    ' http://test.org/page
End Sub

Data Cleansing

data_cleansing.bas
Function CleanText(text As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True

    ' Remove extra whitespace
    re.Pattern = "\s+"
    Dim result As String
    result = re.Replace(text, " ")

    ' Remove special characters (keep alphanumeric and basic punctuation)
    re.Pattern = "[^\w\s\.\,\!\?]"
    result = re.Replace(result, "")

    CleanText = Trim(result)
End Function

Sub TestCleanText()
    Dim messy As String
    messy = "  Hello!!!   World###   How    are   you???  "

    Debug.Print CleanText(messy)  ' Hello! World How are you?
End Sub

Parsing CSV with Quoted Fields

parse_csv.bas
Function ParseCSVLine(csvLine As String) As Collection
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Match quoted or unquoted fields
    re.Pattern = """([^""]*)""|([^,]+)"
    re.Global = True

    Dim matches As Object
    Set matches = re.Execute(csvLine)

    Dim fields As New Collection
    Dim match As Object
    For Each match In matches
        If match.SubMatches(0) <> "" Then
            fields.Add match.SubMatches(0)  ' Quoted field
        Else
            fields.Add match.SubMatches(1)  ' Unquoted field
        End If
    Next match

    Set ParseCSVLine = fields
End Function

Performance Considerations

Reuse RegExp Objects

Creating RegExp objects is relatively expensive. Reuse them when processing multiple strings:

reuse_regexp.bas
Sub ProcessManyStrings()
    ' Create once
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\d+"
    re.Global = True

    Dim texts As Variant
    texts = Array("abc123", "def456", "ghi789")

    ' Reuse for each string
    Dim text As Variant
    For Each text In texts
        Debug.Print re.Execute(text).Count & " matches in " & text
    Next text
End Sub

Summary

Regular expressions in VBA provide powerful pattern matching capabilities:

  • Test method: Quick validation checks
  • Execute method: Extract all matches with position information
  • Replace method: Pattern-based text transformation

Master regular expressions to handle complex string processing tasks efficiently. While the initial learning curve may be steep, the productivity gains are well worth the investment.

#VBA #Regular Expression #RegExp #Pattern Matching