Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add spellchecker to CI #1943

Open
brillout opened this issue Nov 1, 2024 · 14 comments
Open

Add spellchecker to CI #1943

brillout opened this issue Nov 1, 2024 · 14 comments

Comments

@brillout
Copy link
Member

brillout commented Nov 1, 2024

Description

Maybe we an use the tool that was used at #1835; IIRC the tool has an npm package.

@IanWorley
Copy link

IanWorley commented Nov 20, 2024

I would like to take a shot with this issue. Does not seem to difficult at all. I look into using the suggested spell checker @brillout

@brillout
Copy link
Member Author

@IanWorley Ok! So far I think the best would be to integrate the spell check similar to how the formatting check is integrated: https://github.com/vikejs/vike/blob/main/.github/workflows/formatting.yml.

IanWorley added a commit to IanWorley/vike that referenced this issue Nov 20, 2024
@IanWorley
Copy link

IanWorley commented Nov 20, 2024

I will submit a draft pr but I still have some issue at this time for right now I have wrote a config to detect typos from mdx files in the docs directory. Do we want this running on the entire project and what is the scope? Anyway I will also reach out in the discussion of typos as it seems there examples of typos are not being pick up in the mdx file.

@brillout
Copy link
Member Author

Do we want this running on the entire project and what is the scope?

Not for now. (Eventually yes once we don't have any pending open PRs.)

For now we can run against the whole docs/ directory, include .tsx files. (No pending PR is modifying it.)

I will also reach out in the discussion of typos as it seems there examples of typos are not being pick up in the mdx file.

👍

@IanWorley
Copy link

IanWorley commented Nov 20, 2024

@IanWorley 👍 Btw. how big is their npm package? It's fine if it's a bit heavy, but it shouldn't be too heavy.
Sorry I did not mean to close the pr I just drop my comments.

To response to this I do not think typos is a npm package. I will look into using cspell at this moment my only problem is that there might be dictionary file if that is fine with you. By the way size of cspell is 203kb

@brillout
Copy link
Member Author

I found https://github.com/dalisoft/typos-rs-npm so maybe we can use that? We can also ask the typos team if there is an official npm package and if not whether it's on their radar.

I will look into using cspell at this moment my only problem is that there might be dictionary file if that is fine with you.

How would that be a problem? Can we implement this workflow with it?

By the way size of cspell is 203kb

That's totally fine.

IanWorley added a commit to IanWorley/vike that referenced this issue Nov 22, 2024
@brillout
Copy link
Member Author

Only obvious spelling issues

One issue of integrating spell checking in the CI is that it isn't clear how to handle false positives: upon a false positive the CI will be read but how can we tell the CI that it's a false positive? I guess some kind of special comment like with Prettier and Biome would be the best solution.

So far I'm inclined to go with a spell checker that has only few false positives.

Workflow

The reason I was mentioning using their an package is because it would allow us to correct spelling mistakes by simply running an npm script.

That's what we're doing with formatting:

  • pnpm run format:check => throws an error if formatting is incorrect, used by the CI
  • pnpm run format => fixes formatting issues, used by the user

I wonder if we can reproduce the same workflow for spell checking?

@IanWorley
Copy link

Yeah, at first I was going to use cspell, but the amount of manual dictionary entries seemed tedious at best. That’s why I chose typos-rs-npm instead—it seems far more maintainable than managing a massive custom dictionary with cspell.

@cr7pt0gr4ph7
Copy link
Contributor

For reference, here's the cSpell.json I used for #1983 (where unfortunately most changes didn't make it through) together with the Code Spell Checker VS Code extension.

The config should cover everything within vike/. docs/ is also covered, except for all the links containing GitHub usernames, where I gave up manually ignoring all those usernames halfway through (and a better approach would be to ignore those occurences using a general pattern anyway, which is also possible with that VS Code extension, though I didn't bother to in this case).

{
    "version": "0.2",
    "ignorePaths": [
        "**/node_modules/**"
    ],
    "dictionaryDefinitions": [],
    "dictionaries": [],
    "words": [
        "36masync",
        "36mconst",
        "36mexport",
        "36mfrom",
        "36mfunction",
        "36mimport",
        "36mreturn",
        "aaaaaba",
        "AaronBeaudoin",
        "Abramov's",
        "actiongroup",
        "Actix",
        "AFAICT",
        "aheissenberger",
        "Ahrefs",
        "alexrabin",
        "Alignable",
        "antd",
        "Aslemammad",
        "atrule",
        "AurelienLourot",
        "automagically",
        "bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq",
        "Bati",
        "batijs",
        "binedge",
        "birkskyum",
        "Blankeos",
        "brillout",
        "brunsten",
        "btih",
        "Bugsnag",
        "build-f7i251e0iwnw",
        "build-j95xb988fpln",
        "Bulma",
        "Burda",
        "burdaforward",
        "Chakra",
        "changeme",
        "chokidar",
        "chunk-DJBYDrsP",
        "cjsx",
        "crawlable",
        "ctsx",
        "cwabehiwqehqwueh",
        "cyco130",
        "daisyui",
        "dataendpointurl",
        "demonii",
        "disableautorun",
        "docpress",
        "dotenv",
        "ecosia",
        "ELIFECYCLE",
        "enableeagerstreaming",
        "esbuild",
        "evenodd",
        "fabioricali",
        "Fastify",
        "favicons",
        "flac",
        "Flightcontrol",
        "FOUC",
        "frontmatter",
        "gensync",
        "getbody",
        "getpage",
        "gonesurfing",
        "hapi",
        "hattip",
        "hemengke1997",
        "homescreens",
        "Hono",
        "httpresponse",
        "IIRC",
        "imagetools",
        "Immortalin",
        "importee",
        "importees",
        "inlang",
        "ipfs",
        "ipns",
        "jamesladd",
        "Jearce",
        "jeremypress",
        "jfif",
        "keepscrollposition",
        "Kenzo-Wada",
        "KHTML",
        "Laravel",
        "lewis-fidlers",
        "libvpx",
        "libx",
        "liefern",
        "llqijrlvr",
        "lolipop",
        "lourot",
        "luisfloat",
        "Macbook",
        "magne4000",
        "mantine",
        "marko",
        "mdast",
        "mdxeditor",
        "metafile",
        "mikew",
        "minifiers",
        "mjsx",
        "modulepreload",
        "msvideo",
        "mtsx",
        "navigations",
        "nextauth",
        "nextjs",
        "nextui",
        "nitedani",
        "nitedani's",
        "noexternal",
        "nojekyll",
        "Nprogress",
        "Nuxt",
        "onbeforeprerender",
        "onbeforerender",
        "onbeforeroute",
        "onclick",
        "onwarn",
        "openbittorrent",
        "opentrackr",
        "opral",
        "osseonews",
        "outro",
        "pagecontext",
        "pdanpdan",
        "phiberber",
        "phonzammi",
        "pinia",
        "Pixelatex",
        "pjpeg",
        "popstate",
        "Preact",
        "prerender",
        "prerendered",
        "primereact",
        "Prio",
        "pullstate",
        "quicktime",
        "quotepath",
        "Readables",
        "redirections",
        "renderpage",
        "Rollbar",
        "rollup",
        "royalswe",
        "ryanweal",
        "scaffolder",
        "shadcn",
        "signup",
        "solidjs",
        "sourcegraph",
        "stamppipe",
        "Strydom",
        "styleregistry",
        "stylesheet",
        "suppor",
        "Symfony",
        "tabspace",
        "tanstack",
        "tauri",
        "telefunc",
        "ThimoDEV",
        "transpiles",
        "transpiling",
        "trpc",
        "typesafe",
        "ultimateshadsform",
        "unpic",
        "unplugin",
        "untab",
        "Unternehmen",
        "urql",
        "useclientrouter",
        "useconfig",
        "usedata",
        "usepagecontext",
        "USEPOLLING",
        "vchirikov",
        "Vercel",
        "veryslow",
        "vike",
        "vike-react-antd",
        "vikejs",
        "Vilay",
        "vite",
        "vitejs",
        "vuetify",
        "vueuse",
        "Vuex",
        "Weltall",
        "wobsoriano",
        "yuva",
        "Zustand"
    ],
    "ignoreWords": [],
    "import": [],
    "enabled": true
}

@cr7pt0gr4ph7
Copy link
Contributor

@brillout What's your IDE of choice? It would probably make sense to use something in CI whose dictionaries/ignore comments/etc. are compatible with the IDE plugin.

@cr7pt0gr4ph7
Copy link
Contributor

cr7pt0gr4ph7 commented Nov 23, 2024

The reason I was mentioning using their an package is because it would allow us to correct spelling mistakes by simply running an npm script.

Based on the spelling errors I have seen in the source code, I don't now if a non contextual word-only spelling correction would work as intended, because it's not always the most similar "correct" word that was intended (which can often only be determined by taking context into account).

My recommendation would be an IDE extension to catch & fix errors as they appear, and a CI action as a safety net.

@brillout
Copy link
Member Author

@brillout What's your IDE of choice? It would probably make sense to use something in CI whose dictionaries/ignore comments/etc. are compatible with the IDE plugin.

Maybe I'm wrong but I feel like this would add too many restrictions on what spellchecker we can use. Especially considering that we already have a couple of restrictions.

Based on the spelling errors I have seen in the source code, I don't now if a non contextual word-only spelling correction would work as intended, because it's not always the most similar "correct" word that was intended (which can often only be determined by taking context into account).

Yea, the more typos the spellchecker catches the better. But I think we can already catch more than 90% of typos with a dumb spellcheck which I think it's good enough.

Having a good workflow is more important, the spellchecker shouldn't be too disruptive for contributors. That would be a no-go. And so far I think having to maintain a whitelist of unknown words is a no-go for that reason.

If we can find a very good spellchecker as well as a good workflow, then that'd be great but I ain't sure we'll find something perfect here.

My recommendation would be an IDE extension to catch & fix errors as they appear, and a CI action as a safety net.

Makes sense.

@IanWorley @cr7pt0gr4ph7 Btw. I was thinking one thing we could do is to implement our own npm package with typos-rs-npm as peer dependency. So we can add custom logic (e.g. for implementing // @typo-ignore) without having to maintain the spellchecking part. But let's see maybe we don't need this.

@IanWorley
Copy link

IanWorley commented Nov 26, 2024

@brillout Just a quick heads up just point out a bug with the typos package told by the maintainer should be up by sometime tomorrow

@dalisoft
Copy link

@IanWorley @brillout Try remove lockfile, node_modules and cache and re-install last version. It should work on Arch Linux (and other linux too). And all credits goes to @crate-ci for typos. I just made npm port for typos

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants