Allow pluralisation of words in the dictionary without explicitly adding the plurals as additional entries #4942
Comments
Let me rephrase to see if I get this right: you would like a way to reduce the number of lines needed to represent a dictionary by using expressions to represent stemming rules or explicit prefixes/suffixes.
Automatic stemming is only useful for searching, not for spelling. People make mistakes when spelling search terms, so automatically applying stemming rules to search terms isn't an issue, because the whole point is to find a matching set of results. But using automatic stemming for spelling would reduce the usefulness of the spell checker, because it would allow misspelled words to be considered correct.
Explicit Stemming Rules
Hunspell uses explicit stemming rules to define words in a dictionary. There are two files, one for the rules
Rules tend to be broken down into two types: prefix and suffix. Prefix and suffix rules can generally be combined with each other, but a prefix rule cannot be combined with another prefix rule, and the same is true for suffix rules. A single rule might have many actions. Each action has multiple parts:
Advantages
Disadvantages
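To illustrate the explicit approach, here is a minimal sketch in Python. It is loosely modeled on the Hunspell idea of flagged words plus suffix actions (strip, add, condition), but it is not Hunspell's file format and not how cspell works internally. The `RULES` table, the `S` flag, and the `expand` and `build_dictionary` helpers are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SuffixAction:
    strip: str      # characters removed from the end of the word ("" for none)
    add: str        # characters appended after stripping
    condition: str  # regex that the end of the word must match


# Hypothetical rule table (flag -> actions). Only the flag "S" is defined.
RULES = {
    "S": [
        SuffixAction("y", "ies", r"[^aeiou]y"),                 # proxy -> proxies
        SuffixAction("", "es", r"(s|x|z|ch|sh)"),               # index -> indexes
        SuffixAction("", "s", r"([aeiou]y|[^sxzhy]|[^cs]h)"),   # chromecast -> chromecasts
    ],
}


def expand(entry: str) -> set[str]:
    """Expand one 'word/FLAGS' entry into the explicit set of accepted words."""
    word, _, flags = entry.partition("/")
    forms = {word}
    for flag in flags:
        for action in RULES[flag]:
            # An action applies only when its condition matches the word ending.
            if re.search(action.condition + "$", word):
                stem = word[: len(word) - len(action.strip)]
                forms.add(stem + action.add)
    return forms


def build_dictionary(entries: list[str]) -> set[str]:
    """Union of the expansions of every entry."""
    words: set[str] = set()
    for entry in entries:
        words |= expand(entry)
    return words


# "sheep" carries no flag, so no plural is generated for it.
WORDS = build_dictionary(["chromecast/S", "proxy/S", "index/S", "sheep"])
```

With these entries, `chromecasts` and `proxies` are accepted, while `sheeps` is still reported as a misspelling because `sheep` was not flagged. That is the difference from automatic stemming: only words that were explicitly opted in gain extra forms.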
Note: @cspell/cspell-tools can be used to compile Hunspell files into a word list.
Explicit prefix/suffix logic
As you described above:
Advantages
Disadvantages
Conclusion
It is possible to add some form of explicit stemming.
Thanks for taking the time to add your thoughts; I think they are very good ideas. I think I would prefer the first approach. I'm currently doing a lot of maintenance on the spelling words at my company, and although it may have a bit of a learning curve, I would be willing to learn how it works. If it's a standard approach used by other dictionaries, perhaps it's the way to go. However, I would be willing to adopt either solution should it be implemented, as both are better than what I'm doing now. I think we would like to avoid things like
It would be interesting to see what other people's opinions are.
Explicit Stemming Rules is the way to go. I'll leave this issue as an enhancement. The likelihood of it getting completed in the next 6 months is low unless there is enough funding.
Here is a working config that will do what you ask using the explicit technique:
Thanks @Jason3S
Is your feature request related to a problem? Please describe.
For instance, if we add the word `chromecast` to the wordlist, I would expect that `chromecasts` would also not be a spelling mistake. I appreciate this may be easier in English than in other languages, but could we have an `allowPlurals` config option which would allow words with an s on the end?
Describe the solution you'd like
either an allowPlurals config flag
perhaps a pattern we can add in the dictionary word list e.g.
chromecast(s)
inde(x|cies)
or perhaps we could have a mechanism for some languages where there's a consistent pattern, for instance
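To make the pattern idea concrete, here is a minimal sketch of how such entries might be expanded into explicit words before spell checking. The syntax is an assumption for illustration only and is not something cspell supports: a group with one option, like `chromecast(s)`, is treated as optional, and a group with alternatives, like `ind(ex|ices)`, must pick exactly one. The `expand_entry` function is a hypothetical helper.

```python
import re
from itertools import product

# Matches one non-nested parenthesised group, e.g. "(s)" or "(ex|ices)".
_GROUP = re.compile(r"\(([^()]*)\)")


def expand_entry(entry: str) -> list[str]:
    """Expand one word-list entry into the explicit words it stands for."""
    parts: list[list[str]] = []  # alternatives for each position in the word
    pos = 0
    for match in _GROUP.finditer(entry):
        parts.append([entry[pos:match.start()]])  # literal text before the group
        options = match.group(1).split("|")
        if len(options) == 1:
            # Assumed rule: a single option means the suffix is optional.
            options = ["", options[0]]
        parts.append(options)
        pos = match.end()
    parts.append([entry[pos:]])  # literal text after the last group
    return ["".join(combo) for combo in product(*parts)]


for line in ["chromecast(s)", "ind(ex|ices)", "plain"]:
    print(line, "->", expand_entry(line))
```

Because every accepted form is spelled out in the entry itself, nothing is accepted that the maintainer did not write down, and the word list stays at one line per word.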
Describe alternatives you've considered
The current workaround is to just explicitly add the plural; however, we have 900 words in our word list, and the fewer there are in it, the easier the maintenance is.