Fix inconsistent behavior with ALL-CAPS strings containing separators #1583
ManojLingala wants to merge 4 commits
Fixes Humanizr#1557: Humanize(), ApplyCase(), and Transform() methods now handle ALL-CAPS strings with underscores or hyphens more consistently.

Changes:
- Modified StringHumanizeExtensions.Humanize() to process ALL-CAPS strings that contain separators (underscore/hyphen) instead of preserving them
- Updated ToTitleCase transformer to handle words with separators
- Enhanced ToSentenceCase transformer to convert separator-containing ALL-CAPS strings to sentence case
- Added comprehensive unit tests for the new behavior

Examples of improvements:
- "LONGER_WORD".Humanize() now returns "LONGER WORD" instead of "LONGER_WORD"
- "HYPEN-SEPARATOR".Transform(To.SentenceCase) now returns "Hypen-separator"
- "LONGER_WORD".Transform(To.TitleCase) now returns "Longer_word"

Regular ALL-CAPS words without separators are still preserved as potential acronyms to maintain backward compatibility.
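The Humanize() rule described above can be sketched in isolation. This is a minimal illustration, not Humanizer's actual implementation: `AllCapsSketch` and `HumanizeAllCaps` are hypothetical names, and the sketch covers only the ALL-CAPS branch (no PascalCase splitting or other Humanize() behavior).

```csharp
using System;
using System.Linq;

public static class AllCapsSketch
{
    // Hypothetical helper illustrating the rule from the description;
    // it is NOT the code in StringHumanizeExtensions.
    public static string HumanizeAllCaps(string input)
    {
        char[] separators = { '_', '-' };
        bool hasSeparator = input.IndexOfAny(separators) >= 0;

        // "All caps" here means: at least one letter, and every letter is uppercase.
        bool isAllCaps = input.Any(char.IsLetter)
            && input.Where(char.IsLetter).All(char.IsUpper);

        // No separator: treat the word as a potential acronym and return it unchanged.
        if (isAllCaps && !hasSeparator)
        {
            return input;
        }

        // Separators become spaces; the casing of each part is left alone.
        return string.Join(" ", input.Split(separators, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Under these assumptions, `AllCapsSketch.HumanizeAllCaps("LONGER_WORD")` yields `"LONGER WORD"`, while `AllCapsSketch.HumanizeAllCaps("HELLO")` is returned unchanged.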
The LONGER_WORD test cases should expect 'LONGER WORD' output after humanization, as the ALL-CAPS words are preserved when converted to multi-word format.
@dotnet-policy-service agree
This commit resolves issue Humanizr#1557 where ALL-CAPS strings with separators (underscores/hyphens) were inconsistently handled by Humanize() and Transform() methods.

## Changes Made:

### Enhanced ToTitleCase Transformer:
- Added context-aware acronym detection via `ContainsMultipleAllCapsWords()`
- Distinguishes between single words (preserve more acronyms) vs multi-word results from separator-based input (transform more aggressively)
- Preserves genuine acronyms like "HELLO", "HTML", "ALLCAPS" while transforming words from separator-based input like "LONGER WORD" → "Longer Word"

### Improved ToSentenceCase Transformer:
- Added logic to handle both direct transformer usage and integration with Humanize()
- Preserves acronyms in mixed-case contexts but transforms ALL-CAPS words when all words are caps (indicating separator-based origin)
- Maintains backward compatibility for existing sentence case behavior

### Updated Test Cases:
- Added test case for "LONGER_WORD" → "Longer word" (sentence case)
- Updated test expectation for "LONGER_WORD" → "Longer Word" (title case)
- All existing tests continue to pass

## Behavior Fixed:
- **"HELLO" → "HELLO"** (preserves standalone acronyms)
- **"LONGER_WORD" → "Longer Word"** (title case for separator-based input)
- **"LONGER_WORD" → "Longer word"** (sentence case for separator-based input)
- **"honors UPPER case" → "Honors UPPER case"** (preserves acronyms in mixed contexts)

## Test Results:
- StringHumanizeTests: 57/57 passing
- TransformersTests: 23/23 passing
- All target frameworks (.NET 10.0, 8.0, Framework 4.8) passing

Fixes Humanizr#1557
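The casing heuristic this commit message describes (a lone ALL-CAPS word is kept as an acronym, but a string in which every word is ALL-CAPS is assumed to come from separator-based input and is re-cased) can be sketched as follows. This is an illustration under stated assumptions, not the code in ToTitleCase.cs or ToSentenceCase.cs: `CasingSketch` and its members are hypothetical names, and words are split on single spaces only.

```csharp
using System;
using System.Linq;

public static class CasingSketch
{
    // A word counts as ALL-CAPS if it has at least one letter and all its letters are uppercase.
    private static bool IsAllCaps(string word) =>
        word.Any(char.IsLetter) && word.Where(char.IsLetter).All(char.IsUpper);

    // Uppercase the first character and lowercase the rest.
    private static string Capitalize(string word) =>
        word.Length == 0
            ? word
            : char.ToUpperInvariant(word[0]) + word.Substring(1).ToLowerInvariant();

    // Assumption: two or more words that are all ALL-CAPS indicate separator-based origin.
    private static bool EveryWordAllCaps(string[] words) =>
        words.Length > 1 && words.All(IsAllCaps);

    public static string ToTitleCase(string input)
    {
        var words = input.Split(' ');
        bool recaseAcronyms = EveryWordAllCaps(words);

        // Preserve an ALL-CAPS word unless the whole input is ALL-CAPS words.
        return string.Join(" ", words.Select(w =>
            IsAllCaps(w) && !recaseAcronyms ? w : Capitalize(w)));
    }

    public static string ToSentenceCase(string input)
    {
        if (input.Length == 0)
        {
            return input;
        }

        var words = input.Split(' ');
        // Lowercase everything only when every word is ALL-CAPS; otherwise keep acronyms intact.
        string body = EveryWordAllCaps(words) ? input.ToLowerInvariant() : input;
        return char.ToUpperInvariant(body[0]) + body.Substring(1);
    }
}
```

With this sketch, `ToTitleCase("LONGER WORD")` gives `"Longer Word"`, `ToSentenceCase("LONGER WORD")` gives `"Longer word"`, `ToTitleCase("HELLO")` stays `"HELLO"`, and `ToSentenceCase("honors UPPER case")` gives `"Honors UPPER case"`, matching the behaviors listed in the commit message.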
Pull Request Overview
This PR fixes inconsistent behavior with ALL-CAPS strings containing separators (underscores or hyphens) by ensuring they are properly humanized and transformed instead of being preserved as potential acronyms.
- Updated the Humanize() method to process ALL-CAPS strings with separators instead of preserving them unchanged
- Enhanced ToTitleCase and ToSentenceCase transformers to handle separator-containing strings appropriately
- Added comprehensive test coverage for the new behavior while maintaining backward compatibility for regular ALL-CAPS words
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| StringHumanizeExtensions.cs | Modified Humanize() method to exclude strings with separators from acronym preservation |
| ToTitleCase.cs | Added logic to detect and transform multi-word strings from separator-based input while preserving short acronyms |
| ToSentenceCase.cs | Enhanced to handle separator-containing ALL-CAPS strings and multi-word transformations |
| TransformersTests.cs | Added test cases for the new transformer behavior |
| StringHumanizeTests.cs | Added test cases covering the updated Humanize() and casing methods |
    [InlineData("", "")]
    [InlineData("JeNeParlePasFrançais", "Je ne parle pas français")]
    [InlineData("LONGER_WORD", "LONGER WORD")] // Issue #1557: ALL-CAPS with separators should be humanized to separated words
    [InlineData("HELLO", "HELLO")] // ALL-CAPS words without separators should be preserved as potential acronyms
Remove trailing whitespace at the end of the comment line.
    foreach (var word in words)
    {
        if (word.Any(char.IsLetter))
        {
            totalWordCount++;
            if (word.All(char.IsUpper))
            {
                allCapsCount++;
            }
        }
    }
Code scanning / CodeQL notice: Missed opportunity to use Where
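One way to address the CodeQL notice is to move the letter check into a `Where` filter and count from the filtered sequence. The sketch below is only a suggestion: `WordCounts` and `Count` are hypothetical names, and the surrounding method in the PR is not shown, so the return shape here is an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WordCounts
{
    // Hypothetical wrapper around the reviewed loop: returns the number of
    // words containing at least one letter, and how many of those consist
    // solely of uppercase characters.
    public static (int Total, int AllCaps) Count(IEnumerable<string> words)
    {
        // Filter once with Where instead of nesting an if inside the loop.
        var lettered = words.Where(w => w.Any(char.IsLetter)).ToList();

        // Same test as the original inner condition: word.All(char.IsUpper).
        int allCaps = lettered.Count(w => w.All(char.IsUpper));

        return (lettered.Count, allCaps);
    }
}
```

For example, `WordCounts.Count(new[] { "LONGER", "word", "123", "WORD" })` returns `(3, 2)`: "123" has no letters and is skipped, and two of the remaining three words are fully uppercase.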
Still considering this, as it could change too much. Perhaps an analyzer could detect this case and warn or suggest making the string lower case first? It just needs some thought.