forked from agarciadom/xeger
This is a maintenance fork of Xeger, a Java library for generating
strings that match a given regular expression. The original version
of Xeger is available from the address below, but its development
seems to have stopped:
http://code.google.com/p/xeger/
The objective of this fork is simply to maintain Xeger, fixing
reported bugs as they come in. Pull requests are welcome :-).
For reference, I have copied several useful sections of the original
website below.
Introduction
------------
Think of it as the opposite of regular expression matchers. This
library allows you to generate text that is guaranteed to match a
regular expression passed in.
Let's take the regular expression [ab]{4,6}c. Using Xeger, you can
generate Strings matching this pattern like this:

    String regex = "[ab]{4,6}c";
    Xeger generator = new Xeger(regex);
    String result = generator.generate();
    assert result.matches(regex);
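
To make the idea concrete without depending on the library, here is a
minimal hand-written sketch for this one pattern only. It does not use
Xeger and says nothing about how Xeger works internally; the
FixedPatternSketch class and its generate method are hypothetical names
chosen for illustration.

```java
import java.util.Random;

// Hand-written sketch for the single pattern [ab]{4,6}c.
// It does not use Xeger; it only illustrates the idea of producing
// text that a regular expression is guaranteed to accept.
final class FixedPatternSketch {
    static final String REGEX = "[ab]{4,6}c";

    // Build one string accepted by REGEX: 4 to 6 characters drawn
    // from {a, b}, followed by a single 'c'.
    static String generate(Random random) {
        int repetitions = 4 + random.nextInt(3); // 4, 5 or 6
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repetitions; i++) {
            sb.append(random.nextBoolean() ? 'a' : 'b');
        }
        return sb.append('c').toString();
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            String s = generate(random);
            // Every generated string should be accepted by the matcher.
            System.out.println(s + " matches: " + s.matches(REGEX));
        }
    }
}
```

Xeger generalizes this: instead of one hard-coded pattern, it takes
the regular expression as input.
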
Limitations
-----------
Xeger does not support all valid Java regular expressions. The full
set of what is supported is summarized below. Future versions might
support a more complete set if there is popular demand.

    regexp       ::= unionexp
                   |
    unionexp     ::= interexp | unionexp     (union)
                   | interexp
    interexp     ::= concatexp & interexp    (intersection)  [OPTIONAL]
                   | concatexp
    concatexp    ::= repeatexp concatexp     (concatenation)
                   | repeatexp
    repeatexp    ::= repeatexp ?             (zero or one occurrence)
                   | repeatexp *             (zero or more occurrences)
                   | repeatexp +             (one or more occurrences)
                   | repeatexp {n}           (n occurrences)
                   | repeatexp {n,}          (n or more occurrences)
                   | repeatexp {n,m}         (n to m occurrences, including both)
                   | complexp
    complexp     ::= ~ complexp              (complement)  [OPTIONAL]
                   | charclassexp
    charclassexp ::= [ charclasses ]         (character class)
                   | [^ charclasses ]        (negated character class)
                   | simpleexp
    charclasses  ::= charclass charclasses
                   | charclass
    charclass    ::= charexp - charexp       (character range, including end-points)
                   | charexp
    simpleexp    ::= charexp
                   | .                       (any single character)
                   | #                       (the empty language)  [OPTIONAL]
                   | @                       (any string)  [OPTIONAL]
                   | " <Unicode string without double-quotes> "  (a string)
                   | ( )                     (the empty string)
                   | ( unionexp )            (precedence override)
                   | < <identifier> >        (named automaton)  [OPTIONAL]
                   | <n-m>                   (numerical interval)  [OPTIONAL]
    charexp      ::= <Unicode character>     (a single non-reserved character)
                   | \ <Unicode character>   (a single character)
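
Reading the grammar above literally, a backslash followed by a
character denotes just that character, so Java shorthand classes such
as \d or \w are presumably not treated as digit or word classes. One
possible workaround is to expand them into explicit ranges before
handing the expression to Xeger. The following is a minimal sketch
under that assumption; the ShorthandExpander class and its expand
method are hypothetical helpers, not part of Xeger, and only \d and \w
are handled.

```java
// Hypothetical pre-processing helper, not part of Xeger.
// Rewrites the Java shorthand classes \d and \w into explicit ranges
// that the grammar above can express. All other escapes are kept as-is.
final class ShorthandExpander {

    static String expand(String regex) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while between '[' and ']'
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                char next = regex.charAt(++i);
                String ranges = rangesFor(next);
                if (ranges == null) {
                    out.append('\\').append(next); // unknown escape: copy through
                } else if (inClass) {
                    out.append(ranges);            // already inside [...]
                } else {
                    out.append('[').append(ranges).append(']');
                }
            } else {
                if (c == '[') {
                    inClass = true;
                } else if (c == ']') {
                    inClass = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    // Explicit ranges for the shorthand classes this sketch knows about.
    private static String rangesFor(char shorthand) {
        switch (shorthand) {
            case 'd': return "0-9";
            case 'w': return "a-zA-Z_0-9";
            default:  return null;
        }
    }

    public static void main(String[] args) {
        // prints [0-9]{3}-[a-zA-Z_0-9]+
        System.out.println(expand("\\d{3}-\\w+"));
    }
}
```

The expanded expression can then be passed to the Xeger constructor in
place of the original one. The sketch ignores corner cases such as a
literal ']' placed first inside a character class, and negated
shorthands like \D are left untouched.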