Improve and add exceptions for singular method #493
6c3654d to f475c2c
gnodet
left a comment
Here's a proposal:
private static final Map<String, String> PLURAL_EXCEPTIONS = new HashMap<>();

static {
    // Irregular plurals
    PLURAL_EXCEPTIONS.put("men", "man");
    PLURAL_EXCEPTIONS.put("women", "woman");
    PLURAL_EXCEPTIONS.put("children", "child");
    PLURAL_EXCEPTIONS.put("mice", "mouse");
    PLURAL_EXCEPTIONS.put("people", "person");
    PLURAL_EXCEPTIONS.put("teeth", "tooth");
    PLURAL_EXCEPTIONS.put("feet", "foot");
    PLURAL_EXCEPTIONS.put("geese", "goose");
    // Invariant plurals
    PLURAL_EXCEPTIONS.put("series", "series");
    PLURAL_EXCEPTIONS.put("species", "species");
    PLURAL_EXCEPTIONS.put("sheep", "sheep");
    PLURAL_EXCEPTIONS.put("fish", "fish");
    PLURAL_EXCEPTIONS.put("deer", "deer");
    PLURAL_EXCEPTIONS.put("aircraft", "aircraft");
    // Special "oes" exceptions
    PLURAL_EXCEPTIONS.put("heroes", "hero");
    PLURAL_EXCEPTIONS.put("potatoes", "potato");
    PLURAL_EXCEPTIONS.put("tomatoes", "tomato");
    PLURAL_EXCEPTIONS.put("echoes", "echo");
    PLURAL_EXCEPTIONS.put("vetoes", "veto");
    PLURAL_EXCEPTIONS.put("torpedoes", "torpedo");
    PLURAL_EXCEPTIONS.put("cargoes", "cargo");
    PLURAL_EXCEPTIONS.put("haloes", "halo");
    PLURAL_EXCEPTIONS.put("mosquitoes", "mosquito");
    PLURAL_EXCEPTIONS.put("buffaloes", "buffalo");
}

public static String singular(String plural) {
    if (plural == null || plural.isEmpty()) return plural;
    String lower = plural.toLowerCase();
    // Exceptions take precedence over the suffix rules
    if (PLURAL_EXCEPTIONS.containsKey(lower)) {
        return PLURAL_EXCEPTIONS.get(lower);
    }
    // Suffix-based rules
    // "babies" -> "baby"
    if (lower.endsWith("ies") && plural.length() > 3) {
        return plural.substring(0, plural.length() - 3) + "y";
    }
    // "wolves" -> "wolf"
    if (lower.endsWith("ves")) {
        return plural.substring(0, plural.length() - 3) + "f";
    }
    // "buzzes" -> "buzz"
    if (lower.endsWith("zzes")) {
        return plural.substring(0, plural.length() - 2);
    }
    // "classes" -> "class"
    if (lower.endsWith("sses")) {
        return plural.substring(0, plural.length() - 2);
    }
    // "churches" -> "church", "wishes" -> "wish"
    if (lower.endsWith("ches") || lower.endsWith("shes")) {
        return plural.substring(0, plural.length() - 2);
    }
    // "boxes" -> "box"
    if (lower.endsWith("xes")) {
        return plural.substring(0, plural.length() - 2);
    }
    // "canoes" -> "canoe" (words such as "heroes" are handled by the exceptions)
    if (lower.endsWith("oes")) {
        return plural.substring(0, plural.length() - 1);
    }
    // "cars" -> "car"
    if (lower.endsWith("s") && plural.length() > 1) {
        return plural.substring(0, plural.length() - 1);
    }
    return plural;
}
With a more complete set of test pairs:
// Known exceptions
"men", "man",
"women", "woman",
"children", "child",
"mice", "mouse",
"people", "person",
"teeth", "tooth",
"feet", "foot",
"geese", "goose",
"series", "series",
"species", "species",
"sheep", "sheep",
"fish", "fish",
"deer", "deer",
"aircraft", "aircraft",
"heroes", "hero",
"potatoes", "potato",
"tomatoes", "tomato",
"echoes", "echo",
"vetoes", "veto",
"torpedoes", "torpedo",
"cargoes", "cargo",
"haloes", "halo",
"mosquitoes", "mosquito",
"buffaloes", "buffalo",
// Regular plural forms with suffixes
"voes", "voe",
"hoes", "hoe",
"canoes", "canoe",
"toes", "toe",
"foes", "foe",
"oboes", "oboe",
"noes", "no",
"boxes", "box",
"wishes", "wish",
"dishes", "dish",
"brushes", "brush",
"classes", "class",
"buzzes", "buzz",
"cars", "car",
"dogs", "dog",
"cats", "cat",
"horses", "horse",
// Some test cases with different rules
"wolves", "wolf",
"knives", "knife",
"leaves", "leaf",
"wives", "wife",
"lives", "life",
"babies", "baby",
"parties", "party",
"cities", "city",
"buses", "bus",
"boxes", "box",
"churches", "church",
"matches", "match",
"watches", "watch",
"riches", "rich",
"dresses", "dress",
"crosses", "cross",
// More edge cases
"heroes", "hero",
"vetoes", "veto",
"torpedoes", "torpedo",
"tomatoes", "tomato",
"potatoes", "potato",
"echoes", "echo",
"mosquitoes", "mosquito",
"buffaloes", "buffalo",
"volcanoes", "volcano",
"goes", "go"
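
Running a few of these pairs through the suffix rules as proposed is a quick way to see which ones would still need an exception entry or an extra rule. The sketch below is a condensed copy of the proposed rules, not the code in the PR: the class name `SingularCheck` and the helper `failures` are hypothetical, and the exception map is trimmed to three entries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical harness; a condensed copy of the rules proposed above.
class SingularCheck {
    private static final Map<String, String> PLURAL_EXCEPTIONS = new HashMap<>();

    static {
        // Trimmed exception list, just enough for the pairs checked below.
        PLURAL_EXCEPTIONS.put("children", "child");
        PLURAL_EXCEPTIONS.put("series", "series");
        PLURAL_EXCEPTIONS.put("heroes", "hero");
    }

    static String singular(String plural) {
        if (plural == null || plural.isEmpty()) return plural;
        String lower = plural.toLowerCase();
        if (PLURAL_EXCEPTIONS.containsKey(lower)) return PLURAL_EXCEPTIONS.get(lower);
        int n = plural.length();
        if (lower.endsWith("ies") && n > 3) return plural.substring(0, n - 3) + "y";
        if (lower.endsWith("ves")) return plural.substring(0, n - 3) + "f";
        if (lower.endsWith("zzes") || lower.endsWith("sses")
                || lower.endsWith("ches") || lower.endsWith("shes")
                || lower.endsWith("xes")) {
            return plural.substring(0, n - 2);
        }
        if (lower.endsWith("oes")) return plural.substring(0, n - 1);
        if (lower.endsWith("s") && n > 1) return plural.substring(0, n - 1);
        return plural;
    }

    // Returns the plurals whose computed singular differs from the expected one.
    static List<String> failures(String... pairs) {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            if (!pairs[i + 1].equals(singular(pairs[i]))) failed.add(pairs[i]);
        }
        return failed;
    }

    public static void main(String[] args) {
        System.out.println(failures(
                "children", "child", "heroes", "hero", "canoes", "canoe",
                "wolves", "wolf", "knives", "knife", "buses", "bus",
                "noes", "no", "goes", "go", "volcanoes", "volcano"));
    }
}
```

With these rules the call prints `[knives, buses, noes, goes, volcanoes]`, which suggests that those forms (and, by the same "ves" rule, "wives" and "lives") would need either an exception entry or a more specific rule before the full list passes.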
static {
    PLURAL_EXCEPTION.put("children", "child");
    PLURAL_EXCEPTION.put("licenses", "license");
Why is that one an exception?
The plural just adds an 's'
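
One possible reason, assuming the "ses" branch in the diff excerpt further down strips the trailing "es" (its return statement is not visible in the excerpt): a word such as "licenses" would then come out as "licens" unless it is listed explicitly. A minimal sketch of that assumption, with a hypothetical helper name:

```java
// Hypothetical illustration; stripSes is not a method from the PR.
class SesRuleDemo {
    // Assumed behavior of the "ses" branch: drop the trailing "es".
    static String stripSes(String name) {
        if (name.endsWith("ses")) {
            return name.substring(0, name.length() - 2);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(stripSes("classes"));  // class
        System.out.println(stripSes("licenses")); // licens
    }
}
```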
modello-core/src/main/java/org/codehaus/modello/plugin/AbstractModelloGenerator.java
"repositories, repository",
"roles, role",
"rushes, rush",
"series, series"
We can add a few more. Here's a more extensive list:
"women", "woman", "men", "man", "children", "child", "mice", "mouse", "people", "person", "series", "series", "species", "species", "roses", "rose", "fezzes", "fez", "kisses", "kiss", "buses", "bus", "glasses", "glass", "heroes", "hero", "potatoes", "potato", "tomatoes", "tomato", "echoes", "echo", "torpedoes", "torpedo", "vetoes", "veto", "cargoes", "cargo", "haloes", "halo", "mosquitoes", "mosquito", "babies", "baby", "wolves", "wolf", "knives", "knife", "leaves", "leaf", "wives", "wife", "lives", "life", "boxes", "box", "wishes", "wish", "dishes", "dish", "churches", "church", "brushes", "brush", "classes", "class", "buzzes", "buzz", "cars", "car", "dogs", "dog", "voes", "voe", "does", "doe", "hoes", "hoe", "canoes", "canoe"
} else if (name.endsWith("xes")) {
} else if (name.endsWith("zzes")) {
    return name.substring(0, name.length() - 3);
} else if (name.endsWith("ches") || name.endsWith("xes") || name.endsWith("ses") || name.endsWith("oes")
The "oes" rule is a bit more complicated.
It usually loses only the final s, but there are exceptions such as heroes, potatoes, tomatoes, echoes, torpedoes, vetoes, cargoes, haloes, and mosquitoes.
I think the rule should be: drop the final s, with exceptions such as:
"heroes", "hero",
"potatoes", "potato",
"tomatoes", "tomato",
"echoes", "echo",
"vetoes", "veto",
"torpedoes", "torpedo",
"cargoes", "cargo",
"haloes", "halo",
"mosquitoes", "mosquito",
"buffaloes", "buffalo"
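
The rule described here could be sketched as follows: drop the final "s" by default, and drop "es" for the listed words. The class and method names are hypothetical, and the set holds only the words listed above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the "oes" rule with exceptions.
class OesRule {
    // Plurals in "oes" whose singular ends in "o" (taken from the list above).
    private static final Set<String> OES_EXCEPTIONS = new HashSet<>(Arrays.asList(
            "heroes", "potatoes", "tomatoes", "echoes", "vetoes",
            "torpedoes", "cargoes", "haloes", "mosquitoes", "buffaloes"));

    static String singularOes(String plural) {
        String lower = plural.toLowerCase(Locale.ROOT);
        if (!lower.endsWith("oes")) {
            return plural; // not covered by this rule
        }
        // Listed words lose "es"; everything else loses only the final "s".
        int drop = OES_EXCEPTIONS.contains(lower) ? 2 : 1;
        return plural.substring(0, plural.length() - drop);
    }

    public static void main(String[] args) {
        System.out.println(singularOes("heroes")); // hero
        System.out.println(singularOes("canoes")); // canoe
    }
}
```

Keeping the exceptions in a set rather than as extra suffix rules means a newly discovered word only needs a data change.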
rule: not works for:
Those are actually incorrect plurals. The correct ones are ending with
I see
@gnodet - based on your proposal I have pushed another fix. I'm afraid it will be difficult to support all cases, so I added a parameter to the Mojo that lets a project define its own special cases.
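
One way such a parameter could feed the singular lookup is to overlay the project-supplied pairs on the built-in defaults. This is a sketch under assumptions: the class `PluralExceptions`, the method `merge`, and the example entry are hypothetical and do not reflect the actual Mojo parameter.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: project-defined exceptions layered over defaults.
class PluralExceptions {
    static final Map<String, String> DEFAULTS = new HashMap<>();

    static {
        DEFAULTS.put("children", "child");
        DEFAULTS.put("series", "series");
    }

    // Returns a new map; user entries override defaults with the same key.
    static Map<String, String> merge(Map<String, String> userDefined) {
        Map<String, String> merged = new HashMap<>(DEFAULTS);
        if (userDefined != null) {
            for (Map.Entry<String, String> e : userDefined.entrySet()) {
                // Keys are lower-cased so lookups can be case-insensitive.
                merged.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> user = new HashMap<>();
        user.put("Licenses", "license");
        Map<String, String> merged = merge(user);
        System.out.println(merged.get("licenses")); // license
        System.out.println(merged.get("children")); // child
    }
}
```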
Defining what "improving" means would be useful: as it stands, it's just a vague personal judgement. But I read the content, and it's about being able to add exceptions and having a default list of classical ones.
Question: is there an official list somewhere?
When I started working on it, I hoped it would be only a simple improvement, but the problem turned out to be more complicated.
I have added a list of some irregular forms from @gnodet's comments; I don't know of a source.
Finally, I added a parameter that allows providing more special cases, as it is difficult to discover all irregular nouns.
Of course we can adjust the title to show what exactly was done.
988a6b0 to 31295ab