PCRE "single-line mode" not properly represented in CTRE #282

Open
@Minty-Meeo

I want to preface this with the fact that I am quite inexperienced with regular expressions, so I may be wrong about some things.

When I created issue #281, the CTRE example I linked used ctre::multiline_starts_with. This was because it was a simplified snippet from a personal project that I am attempting to convert to CTRE. I had intended to use ctre::starts_with, as that is the direct analogue of the std::regex mode I was using before. However, ctre::starts_with consistently caused stack overflow crashes. Through trial and error, I have now discovered why.

STL: https://godbolt.org/z/vP9YqGP3v
CTRE: https://godbolt.org/z/bedTY8jxo

I do not know how to describe it, but it seems that regular expressions of various flavors (when not in multi-line mode) have special rules for the '\n' and '\r' characters that CTRE does not follow. I found a website that helps support this claim: https://regex101.com/r/Syt781/1. Notice that the regex behaves identically in ECMAScript, PCRE, and PCRE2 modes. I say it is a special rule for these characters in particular because other characters, including escape sequences like '\a', still result in the greedy capture going too far with std::regex: https://godbolt.org/z/1cj3KqMas.
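
To make the std::regex side of this concrete, here is a minimal sketch. It is not the code from the Godbolt links above. It assumes the default ECMAScript grammar of std::regex, and both the helper name greedy_dot_match and the pattern "a.*" are hypothetical, chosen only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not taken from the linked snippets): returns the text
// matched by the greedy pattern "a.*" at its first match in `input`, or an
// empty string if there is no match.
std::string greedy_dot_match(const std::string& input) {
    // The default std::regex grammar is ECMAScript; no multiline flag is set.
    static const std::regex pattern("a.*");
    std::smatch match;
    if (std::regex_search(input, match, pattern)) {
        return match.str();
    }
    return "";
}
```

Under those assumptions, greedy_dot_match("abc\ndef") and greedy_dot_match("abc\rdef") should both return "abc", because the greedy `.*` stops at the line terminator, while greedy_dot_match("abc\adef") should return the whole input, because '\a' is treated like any other character.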
