Mastering Regular Expressions
Full Index -- use your browser's find function to search.
\? 139
\<...\> 21, 25, 50, 131-132, 150
\<...\>, in egrep 15
\<...\>, in Emacs 100
\<...\>, mimicking in Perl 341-342
\+ 139
\(...\) 135
`\+' history 87
\0 116-117
\1 136, 300, 303
\1, in Perl 41
\A 111, 127-128
\A, in Java 373
\A, optimization 246
\a 114-115
\b 65, 114-115, 400
\b, backspace and word boundary 44, 46
\b, in Perl 286
\b\B 240
\C 328
\D 49, 119
\d 49, 119
\d, in Perl 288
\e 79, 114-115
\E 290
\f 114-115
\f, introduced 44
\G 128-131, 212, 315-316, 362
\G, advanced example 130
\G, in Java 373
\G, in .NET 402
\G, optimization 246
\kname (see named capture)
\l 290
\L...\E 290
\L...\E, inhibiting 292
\n 49, 114-115
\n, introduced 44
\n, machine-dependency 114
\N{LATIN SMALL LETTER SHARP S} 290
\N{name} 290
\N{name}, inhibiting 292
\p{...} 119
\p{^...} 288
\p{all} 380
\p{All} 123
\p{All}, in Perl 288
\p{Any} 123
\p{Any}, in Perl 288
\p{Arrows} 122
\p{Assigned} 123-124
\p{Assigned}, in Perl 288
\p{Basic_Latin} 122
\p{Box_Drawing} 122
\p{C} 120
\p{Cc} 121
\p{Cf} 121
\p{Cherokee} 120
\p{Close_Punctuation} 121
\p{Cn} 121, 123-124, 380, 401
\p{Co} 121
\p{Connector_Punctuation} 121
\p{Control} 121
\p{Currency} 122
\p{Currency_Symbol} 121
\p{Cyrillic} 120, 122
\p{Dash_Punctuation} 121
\p{Decimal_Digit_Number} 121
\p{Dingbats} 122
\p{Enclosing_Mark} 121
\p{Final_Punctuation} 121
\p{Format} 121
\p{Gujarati} 120
\p{Han} 120
\p{Hangul_Jamo} 122
\p{Hebrew} 120, 122
\p{Hiragana} 120
\p{InArrows} 122
\p{InBasic_Latin} 122
\p{InBox_Drawing} 122
\p{InCurrency} 122
\p{InCyrillic} 122
\p{InDingbats} 122
\p{InHangul_Jamo} 122
\p{InHebrew} 122
\p{Inherited} 122
\p{Initial_Punctuation} 121
\p{InKatakana} 122
\p{InTamil} 122
\p{InTibetan} 122
\p{IsCherokee} 120
\p{IsCommon} 122
\p{IsCyrillic} 120
\p{IsGujarati} 120
\p{IsHan} 120
\p{IsHebrew} 120
\p{IsHiragana} 120
\p{IsKatakana} 120
\p{IsLatin} 120
\p{IsThai} 120
\p{IsTibetan} 122
\p{Katakana} 120, 122
\p{L} 119-120, 131, 380, 390
\p{L&} 120-121, 123
\p{L&}, in Perl 288
\p{Latin} 120
\p{Letter} 120, 288
\p{Letter_Number} 121
\p{Line_Separator} 121
\p{Ll} 121, 400
\p{Lm} 121, 400
\p{Lo} 121, 400
\p{Lowercase_Letter} 121
\p{Lt} 121, 400
\p{Lu} 121, 400
\p{M} 120, 125
\p{Mark} 120
\p{Math_Symbol} 121
\p{Mc} 121
\p{Me} 121
\p{Mn} 121
\p{Modifier_Letter} 121
\p{Modifier_Symbol} 121
\p{N} 120, 390
\p{Nd} 121, 380, 400
\p{Nl} 121
\p{No} 121
\p{Non_Spacing_Mark} 121
\p{Number} 120
\p{Open_Punctuation} 121
\p{Other} 120
\p{Other_Letter} 121
\p{Other_Number} 121
\p{Other_Punctuation} 121
\p{Other_Symbol} 121
\p{P} 120
\p{Paragraph_Separator} 121
\p{Pc} 121, 400
\p{Pd} 121
\p{Pe} 121
\p{Pf} 121, 400
\p{Pi} 121, 400
\p{Po} 121
\p{Private_Use} 121
\p{Ps} 121
\p{Punctuation} 120
\p{S} 120
\p{Sc} 121-122
\p{Separator} 120
\p{Sk} 121
\p{Sm} 121
\p{So} 121
\p{Space_Separator} 121
\p{Spacing_Combining_Mark} 121
\p{Symbol} 120
\p{Tamil} 122
\p{Thai} 120
\p{Tibetan} 122
\p{Titlecase_Letter} 121
\p{Unassigned} 121, 123
\p{Unassigned}, in Perl 288
\p{Uppercase_Letter} 121
\p{Z} 119-120, 380, 400
\p{Zl} 121
\p{Zp} 121
\p{Zs} 121
\Q...\E 290
\Q...\E, inhibiting 292
\Q...\E, in Java 373
\r 49, 114-115
\r, machine-dependency 114
\s 49, 119
\s, introduction 47
\s, in Emacs 127
\s, in Perl 288
\S 49, 56, 119
\t 49, 114-115
\t, introduced 44
\u 116, 290, 400
\U 116
\U...\E 290
\U...\E, inhibiting 292
\V 364
\v 114-115, 364
\W 49, 119
\w 49, 65, 119
\w, in Emacs 127
\w, many different interpretations 93
\w, in Perl 288
\x 116, 400
\x, in Perl 286
\X 107, 125
\z 111, 127-128, 316
\z, in Java 373
\z, optimization 246
\Z 111, 127-128
\Z, in Java 373
\Z, optimization 246
// 322
/c 129-130, 315
/e 319-321
/g 61, 130, 307, 311-312, 315, 319
/g, introduced 51
/g, with regex object 354
/i 134
/i, introduced 47
/i, with study 359
/m 134
/o 352-353
/o, with regex object 354
/osmosis 293
/s 134
/x 134, 288
/x, introduced 72
/x, history 90
-Dr 363
-i as -y 86
-y old grep 86
<> 54
<>, and $_ 79
!~ 309
$_ 79, 308, 311, 314, 318, 322, 353-354, 359
$_, in .NET 418
$& 299-300
$&, checking for 358
$&, mimicking 302, 357
$&, naughty 356
$&, in .NET 418
$&, okay for debugging 331
$&, pre-match copy 355
$$ in .NET 418
$* 362
$ 111-112, 128
$, escaping 77
$, optimization 246
$, Perl interpolation 289
$+ 300-301, 345
$+, example 202
$+, .NET 202
$+, in .NET 418
$/ 35, 78
$' 300
$', checking for 358
$', mimicking 357
$', naughty 356
$', in .NET 418
$', okay for debugging 331
$', pre-match copy 355
$` 300
$`, checking for 358
$`, mimicking 357
$`, naughty 356
$`, in .NET 418
$`, okay for debugging 331
$`, pre-match copy 355
$0 300
$1 135-136, 300, 303
$1, introduced 41
$1, in Java 388
$1, in .NET 418
$1, in other languages 136
$1, pre-match copy 355
$ARGV 79
$HostnameRegex 76, 136, 303, 351
$HttpUrl 303, 305, 345, 351
$LevelN 330, 343
$^N 300-301, 344-346
${name} 403
${name~} 418
$NestedStuffRegex 339, 346
$^R 302, 327
$^W 297
% Perl interpolation 289
(?!) 240, 333, 335, 340-341
(?#...) 99, 134, 414
(?#...), in Java 373
(?:...) (see non-capturing parentheses)
(...) (see parentheses)
(?i) (see: case-insensitive mode; mode modifier)
(?i:...) (see mode-modified span)
(?if then|else) (see conditional)
(?m:...) (see mode-modified span)
(?m) (see: enhanced line-anchor mode; mode modifier)
(?n) 402
.*, introduced 55
.*, mechanics of matching 152
.*, optimization 246
.*, warning about 56
.NET 399-432
.NET, $+ 202
.NET, flavor overview 91
.NET, after-match data 136
.NET, benchmarking 236
.NET, JIT 404
.NET, line anchors 128
.NET, literal-text mode 135
.NET, MSIL 404
.NET, object model 411
.NET, regex approach 96-97
.NET, regex flavor 401
.NET, search-and-replace 408, 417-418
.NET, URL parsing example 204
.NET, version covered 91
.NET, word boundaries 132
=~ 308-309, 318
=~, introduced 38
? (see question mark)
?...? 308
@+ 300, 302, 314
@"..." 102
@- 300, 302, 339
@ Perl interpolation 289
[=...=] 126
[:<:] 92
[:...:] 125-126
[.....] 126
\p{...} in java.util.regex 380
^ 111-112, 128
^, optimization 245-246
^Subject: example 94, 151-152, 154, 242, 244-245, 289
^Subject: example, in Java 95, 393
^Subject: example, in Perl 55
^Subject: example, in Perl debugger 361
^Subject: example, in Python 97
^Subject: example, in VB.NET 96
{min,max} 20, 140
\0 116-117
$0 300
\1 136, 300, 303
\1, in Perl 41
$1 135-136, 300, 303
$1, introduced 41
$1, in Java 388
$1, in .NET 418
$1, in other languages 136
$1, pre-match copy 355
8859-1 encoding 29, 87, 105, 107, 121
\A 111, 127-128
\A, in Java 373
\A, optimization 246
@ escaping 77
\a 114-115
issues overview encoding 105
after-match variables, in Perl 299
after-match variables, pre-match copy 355
Aho, Alfred 86, 180
\p{All} 123
\p{All}, in Perl 288
\p{all} 380
all-in-one object model 369
alternation 138
alternation, and backtracking 231
alternation, introduced 13-14
alternation, efficiency 222, 231
alternation, greedy 174-175
alternation, hand tweaking 260-261
alternation, order of 175-177, 223, 260
alternation, order of, for correctness 28, 189, 197
alternation, order of, for efficiency 224
alternation, and parentheses 13
analogy, backtracking, bread crumbs 158-159
analogy, backtracking, stacking dishes 159
analogy, ball rolling 261
analogy, building a car 31
analogy, charging batteries 179
analogy, engines 143-147
analogy, first come, first served 153
analogy, gas additive 150
analogy, learning regexes, Pascal 36
analogy, learning regexes, playing rummy 33
analogy, regex as a language 5, 27
analogy, regex as filename patterns 4
analogy, regex-directed match (see NFA)
analogy, text-directed match (see DFA)
analogy, transmission 148-149, 228
analogy, transparencies (Perl's local) 298
anchor (also see: word boundaries; enhanced line-anchor mode)
anchor, overview 127
anchor, caret 127
anchor, dollar 127
anchor, end-of-line optimization 246
anchor, exposing 255
anchor, line 87, 111-112, 150
anchored(...) 362
anchored `string' 362
AND class set operations 123-124
ANSI escape sequences 79
\p{Any} 123
\p{Any}, in Perl 288
any character (see dot)
Apache, org.apache.xerces.utils.regex 372
Apache, ORO 392-398
Apache, ORO, benchmark results 376
Apache, ORO, comparative description 374
Apache, Regexp, comparative description 375
Apache, Regexp, speed 376
appendReplacement() 388
appendTail() 389
$ARGV 79
\p{Arrows} 122
ASCII encoding 29, 105-106, 114, 121
Asian character encoding 29
AssemblyName 429
\p{Assigned} 123-124
\p{Assigned}, in Perl 288
asterisk (see star)
atomic grouping (also see possessive quantifiers)
atomic grouping, introduced 137-138
atomic grouping, details 170-172
atomic grouping, for efficiency 171-172, 259, 268-270
atomic grouping, essence 170-171
atomic grouping, example 198, 201, 213, 271, 330, 340-341, 346
AT&T Bell Labs 86
auto-lookaheadification 403
automatic possessification 251
awk, after-match data 136
awk, gensub 183
awk, history 87
awk, search-and-replace 99
awk, version covered 91
awk, word boundaries 132
\b 65, 114-115, 400
\b, backspace and word boundary 44, 46
\b, in Perl 286
<B>...</B> 165-167
<B>...</B>, unrolling 270
\b\B 240
backreferences 117, 135
backreferences, introduced with egrep 20-22
backreferences, DFA 150, 182-183
backreferences, vs. octal escape 406-407
backreferences, remembering text 21
backspace (see \b)
backtracking 163-177
backtracking, introduction 157-163
backtracking, and alternation 231
backtracking, avoiding 171-172
backtracking, computing count 227
backtracking, counting 222, 224
backtracking, detecting excessive 249-250
backtracking, efficiency 179-180
backtracking, essence 168-169
backtracking, exponential match 226
backtracking, global view 228-232
backtracking, LIFO 159
backtracking, of lookaround 173-174
backtracking, neverending match 226
backtracking, non-match example 160-161
backtracking, POSIX NFA example 229
backtracking, saved states 159
backtracking, simple example 160
backtracking, simple lazy example 161
balanced constructs 328-331, 340-341, 430
balancing regex issues 186
Balling, Derek xxii
Barwise, J. 85
base character 107, 125
Basic Regular Expressions 87-88
\p{Basic_Latin} 122
\b\B 240
beginOffset 396
benchmarking 232-239
benchmarking, comparative 248, 376-377
benchmarking, compile caching 351
benchmarking, in Java 234-236, 375-377
benchmarking, for naughty variables 358
benchmarking, in .NET 236, 404
benchmarking, with neverending match 227
benchmarking, in Perl 360
benchmarking, pre-match copy 356
benchmarking, in Python 237
benchmarking, in Ruby 238
benchmarking, in Tcl 239
Bennett, Mike xxi
Berkeley 86
Better-Late-Than-Never 234-236, 375
<B>...</B> 165-167
<B>...</B>, unrolling 270
blocks 122, 288, 380, 400
BLTN 235-236, 375
BOL 362
\p{Box_Drawing} 122
Boyer-Moore 244, 247
bracket expressions 125
BRE 87-88
bread-crumb analogy 158-159
Bulletin of Math. Biophysics 85
bump-along, introduction 148-149
bump-along, avoiding 210
bump-along, distrusting 215-218
bump-along, optimization 255
bump-along, in overall processing 241
byte matching 328
/c 129-130, 315
C#, strings 102
\p{C} 120
\C 328
¢ 122
C comments, matching 272-276
C comments, unrolling 275-276
caching (also see regex objects)
caching, benchmarking 351
caching, compile 242-244
caching, in Emacs 244
caching, integrated 242
caching, in Java 393
caching, in .NET 426
caching, object-oriented 244
caching, procedural 243
caching, in Tcl 244
caching, unconditional 350
CANON_EQ (Pattern flag) 108, 380
Capture 431
CaptureCollection 432
car analogy 83-84
caret anchor introduced 8
carriage return 109
case title 109
case folding 290, 292
case folding, inhibiting 292
CASE_INSENSITIVE (Pattern flag) 95, 109, 380, 383
case-insensitive mode 109
case-insensitive mode, introduced 14-15
case-insensitive mode, egrep 14-15
case-insensitive mode, /i 47
case-insensitive mode, Ruby 109
case-insensitive mode, with study 359
cast 294-295
\p{Cc} 121
\p{Cf} 121
character, base 125
character, classes 117
character, combining 107, 125, 288
character, combining, Inherited script 122
character, vs. combining characters 107
character, control 116
character, initial character discrimination 244-246, 249, 251-252, 257-259, 332, 361
character, machine-dependent codes 114
character, multiple code points 107
character, as opposed to byte 29
character, separating with split 322
character, shorthands 114-115
character class, introduced 9-10
character class, vs. alternation 13
character class, mechanics of matching 149
character class, negated, must match character 11-12
character class, negated, and newline 118
character class, negated, Tcl 111
character class, positive assertion 118
character class, of POSIX bracket expression 125
character class, range 9, 118
character class, as separate language 10
character equivalent 126
CharacterIterator 372
charnames pragma 290
CharSequence 372, 390
CheckNaughtiness 358
\p{Cherokee} 120
Chinese text processing 29
chr 414
chunk limit, Java ORO 395
chunk limit, java.util.regex 391
chunk limit, Perl 323
class, vs. dot 118
class, elimination optimization 249
class, initial class discrimination 244-246, 249, 251-252, 257-259, 332, 361
class, and lazy quantifiers 167
class, set operations 123-125, 375
class, subtraction 124
Clemens, Sam 375
Click, Cliff xxii
client VM 234, 236
clock clicks 239
\p{Close_Punctuation} 121
closures 339
\p{Cn} 121, 123-124, 380, 401
\p{Co} 121
code point, introduced 106
code point, beyond U+FFFF 108
code point, multiple 107
code point, unassigned in block 122
coerce 294-295
cold VM 235
collating sequences 126
combining character 107, 125, 288
combining character, Inherited script 122
com.ibm.regex, comparative description 372
com.ibm.regex, speed 377
commafying a number example 64-65
commafying a number example, introduced 59
commafying a number example, in Java 393
commafying a number example, without lookbehind 67
COMMAND.COM 7
comments 99, 134
comments, in Java 98
comments, matching of C comments 272-276
comments, matching of Pascal comments 265
comments, in .NET regex 414
COMMENTS (Pattern flag) 99, 218, 378, 380, 386
comments and free-spacing mode 110
Communications of the ACM 85
compile() 383
compile, caching 242-244
compile, once (/o) 352-353
compile, on-demand 351
compile, regex 404-405
compile() (Pattern factory) 383
Compiled (.NET) 236, 402, 404, 414, 421-422, 429
Compilers -- Principles, Techniques, and Tools 180
CompileToAssembly 427, 429
com.stevesoft.pat, comparative description 374
com.stevesoft.pat, speed 377
conditional 138-139
conditional, with embedded regex 327, 335
conditional, in Java 373
conditional, mimicking with lookaround 139
conditional, in .NET 403
Config module 290, 299
conflicting metacharacters 44-46
\p{Connector_Punctuation} 121
Constable, Robert 85
context, forcing 310
context, metacharacters 44-46
context, regex use 189
continuation lines 178, 186-187
continuation lines, unrolling 270
contorting an expression 294-295
\p{Control} 121
control characters 116
Conway, Damian 339
cooking for HTML 68, 408
correctness vs. efficiency 223-224
www.cpan.org 358
CR 109, 382
Cruise, Tom 51
crummy analogy 158-159
CSV parsing example, java.util.regex 218, 386
CSV parsing example, .NET 429
CSV parsing example, ORO 397
CSV parsing example, Perl 212-219
CSV parsing example, unrolling 271
currency, \p{Currency} 122
currency, \p{Currency_Symbol} 121
currency, \p{Sc} 121
currency, Unicode block 121-122
\p{Currency} 122
\p{Currency_Symbol} 121
currentTimeMillis() 236
\p{Cyrillic} 120, 122
\d 49, 119
\d, in Perl 288
\D 49, 119
Darth 197
dash in character class 9
\p{Dash_Punctuation} 121
DBIx::DWIW 258
debugcolor 363
debugging 361-363
debugging, with embedded code 331-332
debugging, regex objects 305-306
debugging, run-time 362
\p{Decimal_Digit_Number} 121
default regex 308
define-key 100
delegate 417-418
delimited text 196-198
delimited text, standard formula 196, 273
delimiter, with shell 7
delimiter, with substitution 319
Deterministic Finite Automaton (see DFA)
Devel::FindAmpersand 358
Devel::SawAmpersand 358
DFA, introduced 145, 155
DFA, acronym spelled out 156
DFA, backreferences 150, 182-183
DFA, boring 157
DFA, compared with NFA 224, 227
DFA, efficiency 179
DFA, implementation ease 182
DFA, lazy evaluation 181
DFA, longest-leftmost match 177-179
DFA, testing for 146-147
DFA, in theory, same as an NFA 180
dialytika 108
\p{Dingbats} 122
dish-stacking analogy 159
dollar for Perl variable 37
dollar anchor 127
dollar anchor, introduced 8
dollar value example 24-25, 51-52, 167-170, 175, 194-195
DOS 7
dot 118
dot, introduced 11-12
dot, vs. character class 118
dot, mechanics of matching 149
dot, Tcl 112
.NET 399-432
.NET, $+ 202
.NET, flavor overview 91
.NET, after-match data 136
.NET, benchmarking 236
.NET, JIT 404
.NET, line anchors 128
.NET, literal-text mode 135
.NET, MSIL 404
.NET, object model 411
.NET, regex approach 96-97
.NET, regex flavor 401
.NET, search-and-replace 408, 417-418
.NET, URL parsing example 204
.NET, version covered 91
.NET, word boundaries 132
DOTALL (Pattern flag) 380, 382
dot-matches-all mode 110-111
doubled-word example, description 1
doubled-word example, in egrep 22
doubled-word example, in Emacs 100
doubled-word example, in Java 81
doubled-word example, in Perl 35, 77-80
double-quoted string example, allowing escaped quotes 196
double-quoted string example, egrep 24
double-quoted string example, final regex 263
double-quoted string example, makudonarudo 165, 169, 228-232, 264
double-quoted string example, sobering example 222-228
double-quoted string example, unrolled 262, 268
double-word finder example, description 1
double-word finder example, in egrep 22
double-word finder example, in Emacs 100
double-word finder example, in Java 81
double-word finder example, in Perl 35, 77-80
-Dr 363
dragon book 180
DWIW (DBIx) 258
dynamic regex 327-331
dynamic regex, sanitizing 337
dynamic scope 295-299
dynamic scope, vs. lexical scope 299
/e 319-321
\e 79, 114-115
\E 290
earliest match wins 148-149
EBCDIC 29
ECMAScript (.NET) 400, 402, 406-407, 415, 421
ed 85
efficiency, and backtracking 179-180
efficiency, correctness 223-224
efficiency, Perl-specific issues 347-363
efficiency, regex objects 353-354
efficiency, unlimited lookbehind 133
egrep, flavor overview 91
egrep, introduced 6-8
egrep, metacharacter discussion 8-22
egrep, after-match data 136
egrep, backreference support 150
egrep, case-insensitive match 15
egrep, doubled-word solution 22
egrep, example use 14
egrep, flavor summary 32
egrep, history 86-87
egrep, regex implementation 182
egrep, version covered 91
egrep, word boundaries 132
electric engine analogy 143-147
Emacs, flavor overview 91
Emacs, after-match data 136
Emacs, control characters 116
Emacs, re-search-forward 100
Emacs, search 100
Emacs, strings as regexes 100
Emacs, syntax class 127
Emacs, version covered 91
Emacs, word boundaries 132
email address example 70-73, 98
email address example, in Java 98
email address example, in VB.NET 99
embedded code, local 336
embedded code, my 338-339
embedded code, regex construct 327, 331-335
embedded code, sanitizing 337
embedded string check optimization 247, 257
Embodiments of Mind 85
Empty 426
\p{Enclosing_Mark} 121
encoding, introduced 29
encoding, issues overview 105
encoding, ASCII 29, 105-106, 114, 121
encoding, Latin-1 29, 87, 105, 107, 121
encoding, UCS-2 106
encoding, UCS-4 106
encoding, UTF-16 106
encoding, UTF-8 106
end() 385
END block 358
endOffset 396
end-of-string anchor optimization 246
engine, introduced 27
engine, analogy 143-147
engine, hybrid 183, 239, 243
engine, implementation ease 182
engine, testing type 146-147
engine, testing type, with neverending match 227
engine, type comparison 156-157, 180-182
English module 357
English vs. regex 275
enhanced line-anchor mode 111-112
enhanced line-anchor mode, introduced 69
ERE 87-88
errata xxi
Escape 427
escape, introduced 22
escape, term defined 27
essence, atomic grouping 170-171
essence, greediness, laziness, and backtracking 168-169
essence, NFA (see backtracking)
eval 319
example, atomic grouping 198, 201, 213, 271, 330, 340-341, 346
example, commafying a number 64-65
example, commafying a number, introduced 59
example, commafying a number, in Java 393
example, commafying a number, without lookbehind 67
example, CSV parsing, java.util.regex 218, 386
example, CSV parsing, .NET 429
example, CSV parsing, ORO 397
example, CSV parsing, Perl 212-219
example, CSV parsing, unrolling 271
example, dollar value 24-25, 51-52, 167-170, 175, 194-195
example, double-quoted string, allowing escaped quotes 196
example, double-quoted string, egrep 24
example, double-quoted string, final regex 263
example, double-quoted string, makudonarudo 165, 169, 228-232, 264
example, double-quoted string, sobering example 222-228
example, double-quoted string, unrolled 262, 268
example, double-word finder, description 1
example, double-word finder, in egrep 22
example, double-word finder, in Emacs 100
example, double-word finder, in Java 81
example, double-word finder, in Perl 35, 77-80
example, email address 70-73, 98
example, email address, in Java 98
example, email address, in VB.NET 99
example, filename 190-192
example, five modifiers 316
example, floating-point number 194
example, form letter 50-51
example, gr[ea]y 9
example, hostname 22, 73, 76, 98-99, 136-137, 203, 260, 267, 304, 306
example, hostname, egrep 25
example, hostname, Java 209
example, hostname, plucking from text 71-73, 205-208
example, hostname, in a URL 74-77
example, hostname, validating 203-205
example, hostname, VB.NET 204
example, HTML, conversion from text 67-77
example, HTML, cooking 68, 408
example, HTML, encoding 408
example, HTML, <HR> 194
example, HTML, link 201-203
example, HTML, optional 139
example, HTML, paired tags 165
example, HTML, parsing 130, 315, 321
example, HTML, tag 9, 18-19, 26, 200-201, 326, 357
example, HTML, URL 74-77, 203, 205-208, 303
example, HTML, URL-encoding 320
example, IP 5, 187-189, 267, 311, 314, 348-349
example, Jeffs 61-64
example, lookahead 61-64
example, mail processing 53-59
example, makudonarudo 165, 169, 228-232, 264
example, pathname 190-192
example, population 59
example, possessive quantifiers 198, 201
example, postal code 208-212
example, regex overloading 341-345
example, stock pricing 51-52, 167-168
example, stock pricing, with alternation 175
example, stock pricing, with atomic grouping 170
example, stock pricing, with possessive quantifier 169
example, temperature conversion, in .NET 419
example, temperature conversion, in Java 389
example, temperature conversion, in Perl 37
example, temperature conversion, Perl one-liner 283
example, text-to-HTML 67-77
example, this|that 132, 138, 243, 245-246, 252, 255, 260-261
example, unrolling the loop 270-271
example, URL 74-77, 201-204, 208, 260, 303-304, 306, 320
example, URL, egrep 25
example, URL, Java 209
example, URL, plucking 205-208
example, username 73, 76, 98
example, username, plucking from text 71-73
example, username, in a URL 74-77
example, variable names 24
example, ZIP code 208-212
exception, IllegalArgumentException 383, 388
exception, IllegalStateException 385
exception, IndexOutOfBoundsException 384-385, 388
exception, IOException 81
exception, NullPointerException 396
exception, PatternSyntaxException 381, 383
Explicit (Option) 409
ExplicitCapture (.NET) 402, 414, 421
exponential match 222-228, 330, 340
exponential match, avoiding 264-265
exponential match, discovery 226-228
exponential match, explanation 226-228
exponential match, non-determinism 264
exponential match, short-circuiting 250
exponential match, solving with atomic grouping 268
exponential match, solving with possessive quantifiers 268
expose literal text 255
expression, context 294-295
expression, contorting 294-295
Extended Regular Expressions 87-88
\f 114-115
\f, introduced 44
Fahrenheit (see temperature conversion example)
failure, atomic grouping 171-172
failure, forcing 240, 333, 335, 340-341
FF 109
file globs 4
file-check example 2, 36
filename, example 190-192
filename, patterns (globs) 4
filename, prepending to line 79
\p{Final_Punctuation} 121
find() 384
FindAmpersand 358
five modifiers example 316
Flanagan, David xxii
flavor, Perl 286-293
flavor, superficial chart, general 91
flavor, superficial chart, Perl 285, 287
flavor, superficial chart, POSIX 88
flavor, term defined 27
flex version covered 91
floating `string' 362
floating-point number example 194
forcing failure 240, 333, 335, 340-341
foreach vs. while vs. if 320
form letter example 50-51
\p{Format} 121
freeflowing regex 277-281
Friedl, Alfred 176
Friedl, brothers 33
Friedl, Fumie xxi
Friedl, Fumie, birthday 11-12
Friedl, Liz 33
Friedl, Stephen xxii
fully qualified name 295
functions related to regexes in Perl 285
\G 128-131, 212, 315-316, 362
\G, advanced example 130
\G, in Java 373
\G, in .NET 402
\G, optimization 246
/g 61, 130, 307, 311-312, 315, 319
/g, introduced 51
/g, with regex object 354
garbage collection Java benchmarking 236
gas engine analogy 143-147
gensub 183
George, Kit xxii
GetGroupNames 421-422
GetGroupNumbers 421-422
getMatch() 397
global vs. private Perl variables 295
globs filename 4
GNU Java packages 374
GNU awk, after-match data 136
GNU awk, gensub 183
GNU awk, version covered 91
GNU awk, word boundaries 132
GNU egrep, after-match data 136
GNU egrep, backreference support 150
GNU egrep, doubled-word solution 22
GNU egrep, -i bug 21
GNU egrep, regex implementation 182
GNU egrep, word boundaries 132
GNU Emacs (see Emacs)
GNU grep, shortest-leftmost match 183
GNU grep, version covered 91
GNU sed, after-match data 136
GNU sed, version covered 91
GNU sed, word boundaries 132
gnu.regexp, comparative description 374
gnu.regexp, speed 377
gnu.rex 374
Goldberger, Ray xxii
Gosling, James 89
GPOS 362
gr[ea]y example 9
greedy, introduced 151
greedy, alternation 174-175
greedy, and backtracking 162-177
greedy, deference to an overall match 153, 274
greedy, essence 159, 168-169
greedy, favors match 167-168
greedy, first come, first served 153
greedy, global vs. local 182
greedy, in Java 373
greedy, vs. lazy 169, 256-257
greedy, localizing 225-226
greedy, quantifier 139-140
greedy, too greedy 152
green dragon 180
grep, flavor overview 91
grep, as an acronym 85
grep, history 86
grep, regex flavor 86
grep, version covered 91
grep, -y option 86
grep in Perl 324
group(), java.util.regex 385
group(), ORO 396
Group object (.NET) 412
Group object (.NET), Capture 431
Group object (.NET), creating 423
Group object (.NET), Index 424
Group object (.NET), Length 424
Group object (.NET), Success 424
Group object (.NET), ToString 424
Group object (.NET), using 424
Group object (.NET), Value 424
GroupCollection 423, 432
groupCount() 385
grouping and capturing 20-22
GroupNameFromNumber 421-422
GroupNumberFromName 421-422
groups() ORO 397
Groups Match object method 423
\p{Gujarati} 120
Gutierrez, David xxii
\p{Han} 120
hand tweaking, alternation 260-261
hand tweaking, caveats 253
\p{Hangul_Jamo} 122
HASH(0x80f60ac) 257
\p{Hebrew} 120, 122
hex escape 116-117
hex escape, in Java 373
hex escape, in Perl 286
Hietaniemi, Jarkko xxii
highlighting with ANSI escape sequences 79
\p{Hiragana} 120
history, `\+' 87
history, AT&T Bell Labs 86
history, awk 87
history, Berkeley 86
history, ed trivia 86
history, egrep 86-87
history, grep 86
history, lex 87
history, Perl 88-90, 308
history, of regexes 85-91
history, sed 87
history, underscore in \w 89
history, /x 90
hostname example 22, 73, 76, 98-99, 136-137, 203, 260, 267, 304, 306
hostname example, egrep 25
hostname example, Java 209
hostname example, plucking from text 71-73, 205-208
hostname example, in a URL 74-77
hostname example, validating 203-205
hostname example, VB.NET 204
$HostnameRegex 76, 136, 303, 351
hot VM 235, 375
HTML, cooking 68, 408
HTML, matching tag 200-201
HTML example, conversion from text 67-77
HTML example, cooking 68, 408
HTML example, encoding 408
HTML example, <HR> 194
HTML example, link 201-203
HTML example, optional 139
HTML example, paired tags 165
HTML example, parsing 130, 315, 321
HTML example, tag 9, 18-19, 26, 200-201, 326, 357
HTML example, URL 74-77, 203, 205-208, 303
HTML example, URL-encoding 320
HTTP newlines 115
HTTP URL example 25, 74-77, 201-209, 260, 303-304, 306, 320
http://regex.info/ xxi, 7, 345, 372
$HttpUrl 303, 305, 345, 351
hybrid regex engine 183, 239, 243
hyphen in character class 9
-i as -y 86
/i 134
/i, introduced 47
/i, with study 359
(?i) (see: case-insensitive mode; mode modifier)
IBM (Java package), comparative description 372
IBM (Java package), speed 377
identifier matching 24
if vs. while vs. foreach 320
IgnoreCase (.NET) 96, 99, 402, 413, 421
IgnorePatternWhitespace (.NET) 99, 402, 413, 421
IllegalArgumentException 383, 388
IllegalStateException 385
implementation of engine 182
implicit 362
implicit anchor optimization 246
Imports 407, 409, 428
\p{InArrows} 122
\p{InBasic_Latin} 122
\p{InBox_Drawing} 122
\p{InCurrency} 122
\p{InCyrillic} 122
Index, Group object method 424
Index, Match object method 423
IndexOutOfBoundsException 384-385, 388
\p{InDingbats} 122
indispensable TiVo 3
\p{InHangul_Jamo} 122
\p{InHebrew} 122
\p{Inherited} 122
initial class discrimination 244-246, 249, 251-252, 257-259, 332, 361
\p{Initial_Punctuation} 121
\p{InKatakana} 122
\p{InTamil} 122
integrated handling 94-95
integrated handling, compile caching 242
interpolation 288-289
interpolation, introduced 77
interpolation, caching 351
interpolation, mimicking 321
interpolation, in PHP 103
INTERSECTION class set operations 124
interval 140
interval, introduced 20
interval, [X{0,0}] 140
\p{InTibetan} 122
IOException 81
IP example 5, 187-189, 267, 311, 314, 348-349
Iraq 11
Is vs. In 120, 122-123
Is vs. In, with java.util.regex 380
Is vs. In, in .NET 401
Is vs. In, in Perl 288
\p{IsCherokee} 120
\p{IsCommon} 122
\p{IsCyrillic} 120
\p{IsGujarati} 120
\p{IsHan} 120
\p{IsHebrew} 120
\p{IsHiragana} 120
\p{IsKatakana} 120
\p{IsLatin} 120
IsMatch (Regex object method) 415
ISO-8859-1 encoding 29, 87, 105, 107, 121
\p{IsThai} 120
\p{IsTibetan} 122
Japanese, text processing 29
japhy 246
Java 365-398
Java, benchmarking 234-236
Java, BLTN 235-236, 375
Java, choosing a regex package 366
Java, exposed mechanics 374
Java, fastest package 377
Java, JIT 235
Java, list of packages 372
Java, matching comments 272-276
Java, object models 368-372
Java, package flavor comparison 373
Java, Perl5 flavors 375
Java, strings 102
Java, version covered 91
Java, VM 234-236, 375
java.util.regex 95-96, 378-391
java.util.regex, after-match data 136
java.util.regex, code example 383, 389
java.util.regex, comparative description 372
java.util.regex, CSV parsing 386
java.util.regex, dot modes 111
java.util.regex, doubled-word example 81
java.util.regex, line anchors 128
java.util.regex, line terminators 382
java.util.regex, match modes 380
java.util.regex, object model 381
java.util.regex, regex flavor 378-381
java.util.regex, search-and-replace 387
java.util.regex, speed 377
java.util.regex, split 390
java.util.regex, URL parsing example 209
java.util.regex, version covered 91
java.util.regex, word boundaries 132
Jeffs example 61-64
JfriedlsRegexLibrary 428-429
JIT, Java 235
JIT, .NET 404
JRE 234
jregex comparative description 374
\p{Katakana} 120, 122
keeping in sync 210-211
Keisler, H. J. 85
Kleene, Stephen 85
The Kleene Symposium 85
Korean text processing 29
Kunen, K. 85
\p{L&} 120-121, 123
\p{L&}, in Perl 288
\p{L} 119-120, 131, 380, 390
£ 122
\l 290
language, character class 10, 13
language, identifiers 24
\p{Latin} 120
Latin-1 encoding 29, 87, 105, 107, 121
lazy 166-167
lazy, essence 159, 168-169
lazy, favors match 167-168
lazy, vs. greedy 169, 256-257
lazy, in Java 373
lazy, optimization 249, 256
lazy, quantifier 140
lazy evaluation 181, 355
\L...\E 290
\L...\E, inhibiting 292
lc 290
lcfirst 290
leftmost match 177-179
Length, Group object method 424
Length, Match object method 423
length() ORO 396
length-cognizance optimization 245, 247
\p{Letter} 120, 288
\p{Letter_Number} 121
$LevelN 330, 343
lex 86
lex, $ 111
lex, dot 110
lex, history 87
lex, and trailing context 182
lexer building 130, 315
lexical scope 299
LF 109, 382
Li, Yadong xxii
LIFO backtracking 159
limit, backtracking 237
limit, recursion 249-250
line (also see string)
line, anchor optimization 246
line, vs. string 55
line anchor 111-112
line anchor, mechanics of matching 150
line anchor, variety of implementations 87
line feed 109
LINE SEPARATOR 109, 121, 382
line terminators 108-109, 111, 128, 382
line terminators, with $ and ^ 111
\p{Line_Separator} 121
link, matching 201
link, matching, Java 204, 209
list context 294, 310-311
list context, forcing 310
literal string initial string discrimination 244-246, 249, 251-252, 257-259, 332, 361
literal text, introduced 5
literal text, exposing 255
literal text, mechanics of matching 149
literal text, pre-check optimization 244-246, 249, 251-252, 257-259, 332, 361
literal-text mode 112, 134-135, 290
literal-text mode, inhibiting 292
\p{Ll} 121, 400
\p{Lm} 121, 400
\p{Lo} 121, 400
local 296, 341
local, in embedded code 336
local, vs. my 297
locale 126
locale, overview 87
locale, \w 119
localizing 296-297
localtime 294, 319, 351
locking in regex literal 352
A logical calculus of the ideas immanent in nervous activity 85
longest match finding 334-335
longest-leftmost match 148, 177-179
lookahead 132
lookahead, introduced 60
lookahead, auto 403
lookahead, example 61-64
lookahead, in Java 373
lookahead, mimic atomic grouping 174
lookahead, mimic optimizations 258-259
lookahead, negated, <B>...</B> 167
lookahead, positive vs. negative 66
lookaround, introduced 59
lookaround, backtracking 173-174
lookaround, in conditional 139
lookaround, and DFAs 182
lookaround, doesn't consume text 60
lookaround, mimicking class set operations 124
lookaround, mimicking word boundaries 132
lookaround, in Perl 288
lookbehind 132
lookbehind, in Java 373
lookbehind, in .NET 402
lookbehind, in Perl 288
lookbehind, positive vs. negative 66
lookbehind, unlimited 402
lookingAt() 385
Lord, Tom 182
\p{Lowercase_Letter} 121
LS 109, 121, 382
\p{Lt} 121, 400
\p{Lu} 121, 400
Lunde, Ken xxii, 29
\p{M} 120, 125
/m 134
m/.../ introduced 38
machine-dependent character codes 114
MacOS 114
mail processing example 53-59
makudonarudo example 165, 169, 228-232, 264
\p{Mark} 1