Commit 5de4cc5
Fix regexp performance regression for patterns starting with s/k
Commit 981ee02 ("Fix performance problem with /k/i and /s/i") was
merged for Ruby 4.0 to enable partial Boyer-Moore optimization for
patterns containing 's' or 'k' by using the prefix before those
characters.
However, when 's' or 'k' appears at the start of a pattern (no usable
prefix), set_bm_skip() returns 0 and the code returned early without
setting any optimization mode, leaving reg->optimize at
ONIG_OPTIMIZE_NONE. This caused up to 30x slowdown for patterns like
/slackware/i when matched against strings with non-ASCII characters.
This patch keeps the improvement from 981ee02 for patterns with a
usable prefix of 3 or more characters, while fixing the regression by
falling back to ONIG_OPTIMIZE_EXACT_IC with the full pattern when the
usable prefix is shorter than 3 characters.
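
The decision described above can be modelled in a few lines of Ruby. This is only a toy sketch of the rule as stated in this message, not the C code in the regexp compiler: the method name `choose_strategy`, the symbols `:bm_prefix`, `:bm_full` and `:exact_ic`, and the handling of literals without 's' or 'k' are assumptions made for illustration.

```ruby
# Toy model of the prefix rule; hypothetical names, not the real implementation.
MIN_PREFIX = 3 # threshold stated in the commit message

# Returns [strategy, literal_used_for_search].
def choose_strategy(literal)
  idx = literal.index(/[sk]/i) # position of the first 's' or 'k', if any

  # Assumption: a literal without 's' or 'k' is searched in full.
  return [:bm_full, literal] if idx.nil?

  prefix = literal[0, idx] # the part before the first 's' or 'k'
  if prefix.length >= MIN_PREFIX
    [:bm_prefix, prefix]   # partial Boyer-Moore on the prefix (981ee02)
  else
    [:exact_ic, literal]   # fallback to the full pattern (this patch)
  end
end

p choose_strategy("slackware")
p choose_strategy("blackware")
```

Under this model "slackware" has an empty usable prefix and takes the fallback, while "blackware" keeps the prefix-based search on "blac".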
Before: /\bslackware\b/i with non-ASCII string: 2.24 us/op
After: /\bslackware\b/i with non-ASCII string: 0.70 us/op (3.2x faster)
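
A measurement of this kind can be approximated with a short script. The sketch below is not the benchmark used for the numbers above, which this message does not show; the input string, the iteration count and the variable names are assumptions chosen for illustration, and absolute timings will differ by machine and Ruby build.

```ruby
# Illustrative micro-benchmark; input text and iteration count are assumptions.
text = ("日本語のテキスト " * 50) + "Slackware"
re = /\bslackware\b/i
iterations = 20_000

start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
iterations.times { re.match?(text) }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

# Convert total seconds to microseconds per match attempt.
us_per_op = elapsed * 1_000_000 / iterations
puts format("%.2f us/op", us_per_op)
```

Running the same script on builds before and after this commit should show whether the fallback applies to a given pattern.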
[Bug #21824]
1 parent 09cd131 commit 5de4cc5
1 file changed
Lines changed: 10 additions & 4 deletions