std::make_unique<DataTypeCustomDesc>(std::make_unique<DataTypeCustomFixedName>("IPv6"), std::make_unique<DataTypeCustomIPv6Serialization>()));
});

/// MySQL, MariaDB
This is a bonus.
Broken in #11903
@akuzm Any ideas on how to automate this?
Not sure. The problem is that we don't want to support custom dictionaries? Maybe we can pipe all sources to clickhouse-local, build a list of the words used in the sources, and warn when we add a word that is not in the list but is trigram-similar to an existing word. That would detect typos.
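To make the trigram idea concrete, here is a minimal sketch in Python. It is not something from this PR: the helper names (`trigrams`, `similarity`, `suspicious_words`), the Jaccard-style score, and the 0.5 threshold are all assumptions chosen for illustration, and a real check would need tuning against the actual sources.

```python
def trigrams(word: str) -> set:
    """Return the set of character trigrams of a word, padded with spaces."""
    # Padding (two leading spaces, one trailing) is an assumed convention so
    # that word boundaries also contribute trigrams.
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, in the range [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def suspicious_words(new_words, known_words, threshold=0.5):
    """Return (new_word, closest_known_word) pairs that look like typos.

    A new word is reported when it is absent from the known list but its
    closest known word scores at least `threshold` (hypothetical default).
    """
    known = sorted({w.lower() for w in known_words})
    findings = []
    for word in new_words:
        lowered = word.lower()
        if lowered in known:
            continue  # already an accepted word, nothing to report
        best = max(known, key=lambda k: similarity(lowered, k), default=None)
        if best is not None and similarity(lowered, best) >= threshold:
            findings.append((word, best))
    return findings


if __name__ == "__main__":
    known = ["serialization", "dictionary", "column"]
    added = ["serialzation", "column", "zookeeper"]
    for word, candidate in suspicious_words(added, known):
        print(f"{word}: did you mean {candidate}?")
```

With these inputs only `serialzation` is reported: `column` is already known, and `zookeeper` shares no trigrams with any known word, so it is treated as a genuinely new word rather than a typo.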
Yes, I had the same idea. We can support a custom dictionary and store it directly in the repository. But it will be very fragile if we don't normalize words. We can invest in text processing functions in ClickHouse, but it's a big project.
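As a rough illustration of what "normalizing words" could mean for source code, the sketch below splits identifiers into lowercase words before they go into a dictionary. The regular expression, the `normalize` name, and the `min_length` cutoff are assumptions made for this example, not an existing ClickHouse function.

```python
import re

# An uppercase run not followed by a lowercase letter (acronyms such as HTTP),
# or an optionally capitalized lowercase run (Data, unique).
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def normalize(text: str, min_length: int = 3) -> list:
    """Split source text into lowercase words.

    Handles snake_case and CamelCase; digits and punctuation act as
    separators. Fragments shorter than `min_length` (a hypothetical cutoff)
    are dropped to reduce noise from pieces such as "I" and "Pv" in "IPv6".
    """
    words = (m.group(0).lower() for m in _WORD.finditer(text))
    return [w for w in words if len(w) >= min_length]


if __name__ == "__main__":
    print(normalize("std::make_unique<DataTypeCustomFixedName>"))
    print(normalize("HTTPServer parse_JSON"))
```

Storing the normalized form means that `DataType`, `data_type`, and `DATA_TYPE` all map to the same dictionary entries, which is one way to keep a repository-level word list from becoming fragile.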
There are false matches (but not too many), and they can be adjusted with flags / dictionaries. Check: https://github.com/codespell-project/codespell
There is also But it requires tuning (it stores and fills dictionaries incrementally, etc., something like what you describe above) check
This result looks very promising. We can incorporate this tool with a simple exception list.
It also has plenty of options / switches.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix some typos in code.
Detailed description / Documentation draft:
It's easy: