Regex timeout - C#

Karen Payne - Oct 27 - - Dev Community

Introduction

Learn how to set a timeout for regular expressions across an entire application. The reason: suppose a malicious user enters input for an email address that triggers excessive backtracking. In a web application this can lead to a denial of service that brings the application down; in a desktop application the same input can cause the application to become unresponsive.

Whether or not a regular expression accepts untrusted data, a developer can set a timeout for all regular expressions in an application. The samples provided show how to set a global timeout for an application, reading the timeout from appsettings.json.
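
For comparison with the global approach, a timeout can also be supplied per call through the Regex overloads that accept a TimeSpan. The sketch below is not from the article's source code; PerPatternTimeout and TryIsMatch are hypothetical names that only illustrate the overload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PerPatternTimeout
{
    // Hypothetical helper: returns true/false for a completed match attempt,
    // or null when the timeout elapsed before the engine finished.
    public static bool? TryIsMatch(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            // Static overload that takes the timeout for this call only.
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }
}
```

A call such as PerPatternTimeout.TryIsMatch(@"^\d+$", "12345", TimeSpan.FromSeconds(1)) returns true, while a null result signals that the timeout elapsed. A global default avoids having to pass the TimeSpan at every call site.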

Microsoft docs: Define a time-out value

Source code has more code than shown below.

Setting up

Read settings

Add the following to appsettings.json, which sets the default timeout for all regular expressions to one second. Feel free to change to milliseconds if one second is not acceptable.

{
  "RegularExpressions": {
    "Timeout": "00:00:01.000"
  }
}
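
To see what the Timeout string turns into, here is a minimal sketch that parses the same JSON shape with System.Text.Json and TimeSpan.Parse. It is not how the article's Configuration class works (that goes through JsonRoot(), which is not shown here); TimeoutSetting and its Parse method are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

public static class TimeoutSetting
{
    // Hypothetical helper: reads RegularExpressions:Timeout from a JSON string.
    public static TimeSpan Parse(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);

        string text = document.RootElement
            .GetProperty("RegularExpressions")
            .GetProperty("Timeout")
            .GetString() ?? throw new FormatException("Timeout is null.");

        // "hh:mm:ss.fff" is the constant TimeSpan format, so "00:00:00.250" is 250 ms.
        return TimeSpan.Parse(text, CultureInfo.InvariantCulture);
    }
}
```

With this shape, "00:00:01.000" parses to one second and "00:00:00.250" to 250 milliseconds.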

A class/model for reading the Timeout.

/// <summary>
/// Represents a model for handling regular expressions with a configurable timeout.
/// In this case <see cref="Timeout"/> is read from appsettings.json
/// </summary>
public class RegularExpressions
{
    /// <summary>
    /// Gets or sets the timeout value for regular expressions.
    /// </summary>
    /// <value>
    /// A <see cref="TimeSpan"/> representing the maximum time allowed for a regular expression match to execute.
    /// </value>
    /// <remarks>
    /// This property is decorated with a <see cref="JsonConverterAttribute"/> that specifies the use of <see cref="TimeSpanConverter"/> 
    /// for JSON serialization and deserialization.
    /// </remarks>
    [JsonConverter(typeof(TimeSpanConverter))]
    public TimeSpan Timeout { get; set; }
}

Reading the timeout uses the following class.

public static class Configuration
{
    /// <summary>
    /// Reads a configuration section and converts it to the specified type.
    /// </summary>
    /// <typeparam name="T">The type to which the configuration section will be converted.</typeparam>
    /// <param name="sectionName">The name of the configuration section to read.</param>
    /// <returns>An instance of <typeparamref name="T"/> representing the configuration section.</returns>
    public static T ReadSection<T>(string sectionName)
        => JsonRoot().GetSection(sectionName).Get<T>();
}

The following method does the retrieval.

public static TimeSpan RegexTimeOut()
{
    var timeOut = Configuration.ReadSection<RegularExpressions>("RegularExpressions");
    return timeOut.Timeout;
}

Set global timeout for an application

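The following sets the global timeout. Call it at application startup, before the first regular expression runs; the default is read once, so later changes are not picked up.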

public static string _timeout => "REGEX_DEFAULT_MATCH_TIMEOUT";

/// <summary>
/// Sets the regular expression timeout value in the application domain data.
/// </summary>
/// <remarks>
/// This method retrieves the timeout value from the configuration and sets it in the application domain data
/// using a predefined key. The timeout value is used to limit the execution time of regular expressions.
/// </remarks>
public static void SetTimeout()
{
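    // Pass the configured TimeSpan as-is; TimeSpan.Seconds is only the seconds
    // component and would drop milliseconds (and minutes) from the setting.
    AppDomain.CurrentDomain.SetData(_timeout, RegexTimeOut());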
}
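
One way to confirm the value took effect is to create a Regex without an explicit timeout and read its MatchTimeout property. This is a minimal sketch, assuming it runs before any other regular expression in the process, since the default appears to be read only once; GlobalTimeoutDemo and SetAndObserve are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GlobalTimeoutDemo
{
    // Hypothetical helper: stores a default timeout, then reports what a new Regex picks up.
    // Assumes no Regex has been used earlier in the process.
    public static TimeSpan SetAndObserve(TimeSpan timeout)
    {
        AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", timeout);

        // No timeout is passed here, so the instance falls back to the default.
        var regex = new Regex(@"\d+");
        return regex.MatchTimeout;
    }
}
```

If the returned value differs from the one passed in, a regular expression was most likely used before the default was stored.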

Get global timeout for an application

/// <summary>
/// Retrieves the regular expression timeout value from the application domain data.
/// </summary>
/// <returns>
/// A <see cref="TimeSpan"/> representing the timeout value if it is set; otherwise, <c>null</c>.
/// </returns>
public static TimeSpan? GetTimeout() 
    => (TimeSpan?)AppDomain.CurrentDomain.GetData(_timeout);

Determine if there is a timeout set

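Regex.InfiniteMatchTimeout is a constant of -1 milliseconds, so inspecting it alone always gives the same answer. To find out whether a timeout has been set, check the value stored in the application domain data: when nothing is stored, or the stored value equals Regex.InfiniteMatchTimeout, regular expressions run with the default of no timeout.

/// <summary>
/// Determines whether the default timeout for regular expression operations is set to infinite.
/// </summary>
/// <returns>
/// <c>true</c> if the default timeout is infinite; otherwise, <c>false</c>.
/// </returns>
public static bool IsDefaultTimeout()
{
    // Nothing stored means no application-wide timeout was configured.
    var configured = AppDomain.CurrentDomain.GetData(_timeout) as TimeSpan?;
    return configured is null || configured == Regex.InfiniteMatchTimeout;
}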

Samples

Here are several samples included in the source code.

Crash-and-burn sample

To keep everything clear, the timeout is set directly in code rather than read from appsettings.json.

In this case the timeout is one second, against malicious input that takes more than 30 seconds to process without a timeout. In a web application, that amounts to a denial of service.

public static void BadSample()
{
    AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", TimeSpan.FromSeconds(1));

    try
    {
        // Takes more than 30s
        var isMatch = EmailRegex().IsMatch("t@t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.c%20");
    }
    catch (RegexMatchTimeoutException ex)
    {
        AnsiConsole.MarkupLine($"[red]Regex Timeout for[/] {ex.Message} after [cyan]{ex.MatchTimeout}[/] elapsed.");
        AnsiConsole.MarkupLine("[red]Pattern[/]");
        Console.WriteLine(ex.Pattern);
        Log.Error(ex,nameof(BadSample));
    }
    catch (ArgumentOutOfRangeException ex)
    {
        AnsiConsole.MarkupLine($"[red]{ex.Message}[/]");
        Log.Error(ex, nameof(BadSample));
    }
}
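
EmailRegex() is not shown in this article, so the sample cannot be run as-is. The sketch below reproduces the same failure mode with a textbook nested-quantifier pattern, ^(a+)+$, swapped in as a stand-in; BacktrackingDemo and Run are hypothetical names, and the timeout is passed to the constructor so the sketch does not depend on the global setting.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // Hypothetical helper: runs a nested-quantifier pattern with a timeout and
    // reports whether the engine gave up, plus how long the attempt took.
    public static (bool TimedOut, bool IsMatch, TimeSpan Elapsed) Run(int length, TimeSpan timeout)
    {
        // Assumed stand-in pattern, not the article's EmailRegex().
        var regex = new Regex("^(a+)+$", RegexOptions.None, timeout);

        // A long run of 'a' followed by a character that forces the match to fail.
        string input = new string('a', length) + "!";

        var timer = Stopwatch.StartNew();
        try
        {
            bool isMatch = regex.IsMatch(input);
            return (false, isMatch, timer.Elapsed);
        }
        catch (RegexMatchTimeoutException)
        {
            return (true, false, timer.Elapsed);
        }
    }
}
```

Calling BacktrackingDemo.Run(40, TimeSpan.FromMilliseconds(250)) is expected to come back shortly after the timeout with TimedOut set, rather than running for the full backtracking search.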

Serilog dump

[2024-10-27 09:31:52.426 [Error] BadSample
System.Text.RegularExpressions.RegexMatchTimeoutException: The Regex engine has timed out while trying to match a pattern to an input string. This can occur for many reasons, including very large inputs or excessive backtracking caused by nested quantifiers, back-references and other factors.
   at System.Text.RegularExpressions.RegexRunner.<CheckTimeout>g__ThrowRegexTimeout|25_0()
   at System.Text.RegularExpressions.Generated.<RegexGenerator_g>F4CCF545FEA8210BA650F96F69065B2DEAC44F2CBB643E9156FE00073346BE310__EmailRegex_1.RunnerFactory.Runner.TryMatchAtCurrentPosition(ReadOnlySpan`1 inputSpan) in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\obj\Debug\net8.0\System.Text.RegularExpressions.Generator\System.Text.RegularExpressions.Generator.RegexGenerator\RegexGenerator.g.cs:line 1107
   at System.Text.RegularExpressions.Generated.<RegexGenerator_g>F4CCF545FEA8210BA650F96F69065B2DEAC44F2CBB643E9156FE00073346BE310__EmailRegex_1.RunnerFactory.Runner.Scan(ReadOnlySpan`1 inputSpan) in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\obj\Debug\net8.0\System.Text.RegularExpressions.Generator\System.Text.RegularExpressions.Generator.RegexGenerator\RegexGenerator.g.cs:line 264
   at System.Text.RegularExpressions.Regex.ScanInternal(RegexRunnerMode mode, Boolean reuseMatchObject, String input, Int32 beginning, RegexRunner runner, ReadOnlySpan`1 span, Boolean returnNullIfReuseMatchObject)
   at System.Text.RegularExpressions.Regex.RunSingleMatch(RegexRunnerMode mode, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
   at System.Text.RegularExpressions.Regex.IsMatch(String input)
   at RegularExpressionsTimeOutApp.Classes.Samples.BadSample() in C:\OED\DotnetLand\VS2022\LanguageFeatures\RegularExpressionsTimeOutApp\Classes\Samples.cs:line 114

Screenshot

shows same text as log dump with some formatting

Example without a timeout, timing IsMatch on the author's machine.

public static void BadSampleRaw()
{
    var timer = new Stopwatch();
    timer.Start();

    var isMatch = EmailRegex().IsMatch("t@t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.t.c%20");

    timer.Stop();

    TimeSpan timeTaken = timer.Elapsed;

    Console.WriteLine(isMatch.ToYesNo());
    Console.WriteLine($"Time taken: {timeTaken:m\\:ss\\.fff}");
}

operation results

Using timeout from appsettings.json

In this case the timeout is one second. Milliseconds would also work, but one second is used for good measure.

public static void NormalUse()
{
    string input = @"\\SomeServer\HTTP\demo1\index.html 4 KB HTML File 2/19/2019 3:48:21 PM 2/19/2019 1:05:53 PM 2/19/2019 1:05:53 PM 5";

    const string format = "M/d/yyyy h:mm:ss tt";

    MatchCollection matches = DatesRegex().Matches(input);

    foreach (Match match in matches)
    {
        var dateTime = DateTime.ParseExact(match.Value, format, CultureInfo.InvariantCulture);
        Console.WriteLine(dateTime);
    }
}
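
DatesRegex() is part of the full source code and is not shown here. As a self-contained sketch of the same idea, the following uses an assumed pattern for values such as "2/19/2019 3:48:21 PM" and passes the one-second timeout to the constructor; DateExtraction, Extract, and the pattern itself are hypothetical and may differ from the author's.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateExtraction
{
    // Assumed pattern; the article's DatesRegex() may differ.
    private static readonly Regex Dates = new Regex(
        @"\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (?:AM|PM)",
        RegexOptions.None,
        TimeSpan.FromSeconds(1));

    // Returns every date/time found in the input, in order of appearance.
    public static List<DateTime> Extract(string input)
    {
        const string format = "M/d/yyyy h:mm:ss tt";
        var results = new List<DateTime>();

        foreach (Match match in Dates.Matches(input))
        {
            results.Add(DateTime.ParseExact(match.Value, format, CultureInfo.InvariantCulture));
        }

        return results;
    }
}
```

Run against the input string from NormalUse, this should yield three DateTime values, the first being 19 February 2019 at 15:48:21.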

Screenshot

operation results

Main class

Although in the provided source code the following class is in a console project, to make the code usable in other projects simply create a class library project, which can be used for web or desktop projects.

public class RegexOperations
{

    /// <summary>
    /// Retrieves the regular expression timeout value from the configuration.
    /// </summary>
    /// <returns>
    /// A <see cref="TimeSpan"/> representing the timeout value for regular expressions.
    /// </returns>
    /// <remarks>
    /// This method reads the "RegularExpressions" section from the configuration and returns the timeout value specified.
    /// </remarks>
    public static TimeSpan RegexTimeOut()
    {
        var timeOut = Configuration.ReadSection<RegularExpressions>("RegularExpressions");
        return timeOut.Timeout;
    }

    public static string _timeout => "REGEX_DEFAULT_MATCH_TIMEOUT";

    /// <summary>
    /// Sets the regular expression timeout value in the application domain data.
    /// </summary>
    /// <remarks>
    /// This method retrieves the timeout value from the configuration and sets it in the application domain data
    /// using a predefined key. The timeout value is used to limit the execution time of regular expressions.
    /// </remarks>
    public static void SetTimeout()
    {
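        // Pass the configured TimeSpan as-is; TimeSpan.Seconds is only the seconds
        // component and would drop milliseconds (and minutes) from the setting.
        AppDomain.CurrentDomain.SetData(_timeout, RegexTimeOut());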
    }

    /// <summary>
    /// Retrieves the regular expression timeout value from the application domain data.
    /// </summary>
    /// <returns>
    /// A <see cref="TimeSpan"/> representing the timeout value if it is set; otherwise, <c>null</c>.
    /// </returns>
    public static TimeSpan? GetTimeout() 
        => (TimeSpan?)AppDomain.CurrentDomain.GetData(_timeout);
}

Summary

In all project types, regular expressions should be set up with a user-defined timeout rather than relying on the default, which is no timeout at all. Using the provided code helps protect against malicious user input and badly written regular expressions.

Please take time to study the source code to get a good understanding of the code.

NuGet packages used.

Top-level Package                                     Version
ConfigurationLibrary                                  1.0.6
ConsoleConfigurationLibrary                           1.0.0.4
ConsoleHelperLibrary                                  1.0.2
Microsoft.Extensions.Configuration.Json               8.0.1
Microsoft.Extensions.Options.ConfigurationExtensions  8.0.0
Serilog                                               3.1.1
Serilog.Extensions.Logging.File                       3.0.0
Serilog.Sinks.Console                                 5.0.1
Serilog.Sinks.File                                    5.0.0
Spectre.Console                                       0.46.0