Yesterday I was updating one of the specifications I maintain as part of my day job.
I have previously blogged about configuring Markdownlint for use with a GitHub Action, migrating from Travis CI to GitHub Actions for this part, and I had for some time wanted to venture into more extensive use of GitHub Actions for this work.
The issue addressed was some misinformation in the description of a process, which was obsoleted. It was not something I could have caught using software or similar, but it got me thinking about whether there were other parts of this maintenance where GitHub Actions could be of assistance, and it struck me that checking for spelling errors could be one such area.
I did a quick search on the GitHub Marketplace and three Actions were already available.
As I commented in the mentioned blog post, often somebody has already implemented the action you need, and it can save you a lot of time to use something which is already out there instead of rolling your own.
I use VS Code as my primary editor and I have both spell checking and Markdown linting integration in the editor using extensions, but sometimes I miss the reported problems and commit anyway. This argues for implementing these as Git pre-commit hooks, but for now CI using GitHub Actions is my safety net. And for occasional PRs I cannot be sure the contributor's toolchain matches my own, even though relevant configuration files are included in the repositories, so GitHub Actions are useful.
Now let's get down to business.
I decided on "Spellcheck Action" since it had 17 stars.
I started by reading over the documentation. The documentation on the Marketplace was quite sparse. It did mention use of PySpelling and the possibility of specifying a spellcheck.yaml to override the default configuration, which all sounded very good and useful. It was only release 0.2.0, but I am not so hung up on version numbers and it sounded like it would fit my use-case.
Next up was reading the code. The implementation was based on a Docker image, which also suited me fine, since the little experience I have with GitHub Actions is with using a Docker based solution and not JavaScript, which is the other option.
Oh yeah and I got a PR created, since I fell over something which I believe to be a spelling error in a configuration example.
I added the action to a new unpublicized repository I am setting up for a new initiative and started to configure it.
Many, many attempts later, I decided to take a break and do something else.
The first problem was simply the action complaining about missing the required dictionary file: wordlist.txt
cp: cannot stat '/wordlist.txt': No such file or directory
['aspell', '--lang', 'en', '--encoding', 'utf-8', 'create', 'master', '/github/workspace/wordlist.dic']
Current wordlist: 'wordlist.txt'
Problem compiling dictionary. Check the binary path and options.
Traceback (most recent call last):
File "/usr/local/bin/pyspelling", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 34, in main
debug=args.debug
File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 59, in run
debug=debug
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 672, in spellcheck
for result in spellchecker.run_task(task, source_patterns=sources):
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 307, in run_task
personal_dict = self.setup_dictionary(task)
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 350, in setup_dictionary
output
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 380, in compile_dictionary
with open(wordlist, 'rb') as src:
FileNotFoundError: [Errno 2] No such file or directory: 'wordlist.txt'
I tried a variety of solutions, such as adding an empty wordlist.txt:
$ touch wordlist.txt
But to no avail.
I returned later in the afternoon after some pondering, still without luck, and I decided to call it quits for the day.
Later it came to me that I had ignored all best practices in problem solving. Instead I was just firing commits at the problem, thinking:
this will do it!
I decided to take a step back and examine the Dockerfile as a stand-alone container on my own machine, instead of wasting resources evaluating possible solutions via GitHub. This particular action is not particularly fast, so I needed to speed up the feedback loop. I also still felt that it was me who was missing something and that I was not using the action correctly.
On a side note, this is one of the reasons I love open source: if you have a problem, you can peek at the innards and often even poke at them to get them to behave.
Getting the Docker image to build was quite easy:
$ docker build -t github-action-spellcheck .
Running it, not so much, since the context of the Action was not really set up. In the examples I have seen, all actions work on the checked out project sort of magically.
$ docker run -it github-action-spellcheck
¯\_(ツ)_/¯
A lot of information is available on the context of the action in GitHub. However, I decided not to focus on the GitHub integration but on the basic container, and I went for a small detour to understand the action as a whole.
- It was a Docker-based solution, with a Dockerfile and an ENTRYPOINT file: entrypoint.sh
- PySpelling is an encapsulation of aspell, a widely adopted component for spelling correction
I installed aspell via Homebrew and tried it out. It worked like a charm and is good to have as a backup for doing more interactive editorial work and in the long run for building up dictionaries.
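Building up a dictionary from aspell's findings could look roughly like the sketch below. It assumes the word list is a plain text file with one word per line, the way wordlist.txt is used in this post. The helper name merge_wordlist is hypothetical, and the README.md in the usage comment is only an example input.

```shell
#!/bin/sh
# merge_wordlist: hypothetical helper, not part of aspell or PySpelling.
# Reads candidate words on stdin and merges them into a word list file,
# keeping one unique word per line in byte-wise sort order.
merge_wordlist() {
    wordlist="${1:-wordlist.txt}"
    tmp="$(mktemp)" || return 1
    # Make sure the word list exists so cat does not fail on first use
    touch "$wordlist"
    # Existing words first, then stdin; drop empty lines and duplicates
    cat "$wordlist" - | grep -v '^$' | LC_ALL=C sort -u > "$tmp"
    # Write back in place so the word list keeps its file permissions
    cat "$tmp" > "$wordlist" && rm -f "$tmp"
}

# Example usage (requires aspell with an English dictionary installed):
#   aspell --lang=en list < README.md | merge_wordlist wordlist.txt
```

Since aspell lists genuine typos as well as unknown but valid words, the merged file should be reviewed by hand before it is committed.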
I installed PySpelling; the recipe for this was extracted from the Dockerfile.
RUN pip3 install pyspelling
After repeating that step locally I could emulate the work done by the ENTRYPOINT outlined in the entrypoint.sh file:
$ pyspelling -c spellcheck.yaml
Traceback (most recent call last):
File "/usr/local/bin/pyspelling", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 34, in main
debug=args.debug
File "/usr/local/lib/python3.7/site-packages/pyspelling/__main__.py", line 59, in run
debug=debug
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 672, in spellcheck
for result in spellchecker.run_task(task, source_patterns=sources):
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 316, in run_task
for sources in self._walk_src(source_patterns, glob_flags, glob_limit, self.pipeline_steps, expect_match):
File "/usr/local/lib/python3.7/site-packages/pyspelling/__init__.py", line 216, in _walk_src
'\n'.join('- {}'.format(target) for target in targets)
RuntimeError: None of the source targets from the configuration match any files:
- **/*.py
Do note the above output was added afterwards, when retracing my steps, based on the default spellcheck.yaml.
In my commit frenzy (ref: April 4.) I had altered the file to focus on Markdown targets:
matrix:
- name: Markdown
  aspell:
    lang: en
  dictionary:
    wordlists:
    - wordlist.txt
    output: wordlist.dic
    encoding: utf-8
  pipeline:
  - pyspelling.filters.markdown:
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
  sources:
  - '**/*.md'
  default_encoding: utf-8
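The RuntimeError earlier came from a sources pattern that matched no files. A rough way to preview what a pattern like '**/*.md' is likely to pick up is a plain find, sketched below. This only approximates PySpelling's own globbing (hidden files and other glob options may be treated differently), and list_markdown_sources is a hypothetical helper name.

```shell
#!/bin/sh
# list_markdown_sources: hypothetical helper for previewing spell check targets.
# Prints every regular *.md file below the given directory (default: current
# directory), skipping anything inside a .git directory.
list_markdown_sources() {
    root="${1:-.}"
    find "$root" -type f -name '*.md' ! -path '*/.git/*' | LC_ALL=C sort
}

# Example usage, run from the repository root:
#   list_markdown_sources .
```

If the list comes back empty, the pattern or the working directory is a likely suspect before anything else.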
AND FINALLY we failed with a success:
$ pyspelling -c spellcheck.yaml
Misspelled words:
<htmlcontent> README.md: html>body>ul>li
--------------------------------------------------------------------------------
Pragma
Readonly
--------------------------------------------------------------------------------
!!!Spelling check failed!!!
NB: The above example is from another repository, but you get the picture. Again recapping exact steps is not always feasible.
I had found out that my configuration files were working and the tools were working (locally though), which left me with the Docker integration. It seemed the Docker integration did not put the files in the right place for the Python component to use the configuration files. As I wrote earlier, a lot of context was available in the action logs on GitHub, but my focus was still on getting it to work locally before attacking the action, which had already been attempted without success.
In desperation I went over the action's repository and fell over a reported issue.
Heh, I should perhaps have checked this the moment I observed my issues - well.
I had gotten around to familiarizing myself with the action's implementation, and although still a n00b in actions, I felt like I was close to a solution.
As mentioned, running the locally built Docker image was pretty useless:
$ docker run -it github-action-spellcheck
¯\_(ツ)_/¯
What if I provided the repository to the Docker image like so:
$ docker run -it -v $PWD:/ github-action-spellcheck
docker: Error response from daemon: invalid volume specification: '/Users/jonasbn/develop/github/blog-examples:/': invalid mount config for type "bind": invalid specification: destination can't be '/'.
See 'docker run --help'.
Next attempt: using a non-root directory in the container, which required a minor change to the Dockerfile, so the following line was added just above the ENTRYPOINT entry.
WORKDIR /tmp
New attempt:
$ docker run -it -v $PWD:/tmp github-action-spellcheck
And it worked and now I had working tools, configuration files and a working Docker image. Now I needed to get it to work as a proper action, which was the original goal.
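GitHub appears to run Docker-based actions with the checked out repository mounted at /github/workspace (the path visible in the log output earlier) and with the working directory set to that path. The sketch below imitates this locally using -w, which should make the WORKDIR change unnecessary for local testing. The function name run_action_locally and the SPELLCHECK_DRY_RUN switch are hypothetical and introduced for this example only.

```shell
#!/bin/sh
# run_action_locally: hypothetical helper that runs a locally built action
# image with the repository mounted the way GitHub seems to mount it.
#   $1: image tag (default: the tag used with docker build in this post)
#   $2: repository directory to check (default: current directory)
run_action_locally() {
    image="${1:-github-action-spellcheck}"
    repo="${2:-$PWD}"
    # Reuse the positional parameters to hold the command to run
    set -- docker run --rm \
        -v "$repo:/github/workspace" \
        -w /github/workspace \
        "$image"
    # With SPELLCHECK_DRY_RUN set, only print the command instead of running it
    if [ -n "${SPELLCHECK_DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
        return 0
    fi
    "$@"
}

# Example usage, run from the repository root:
#   SPELLCHECK_DRY_RUN=1 run_action_locally
#   run_action_locally
```

Mounting to a fixed path and setting the working directory on the command line keeps the image itself unchanged, so the same image can be used both locally and by GitHub.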
The good thing about open source is that it is so easily available. The Spellcheck action is available on GitHub under an MIT license, which made it possible to address the issues without fear of repercussions, so let's go over the changes.
- The already mentioned change of setting WORKDIR in the Dockerfile made it possible to test locally.
- I altered the content of the ENTRYPOINT file: entrypoint.sh. This was not crucial to make it work, but it suits my temper better.
The original:
#!/bin/bash
if [ ! -f ./spellcheck.yaml ]; then
cp /spellcheck.yaml .
fi
if [ ! -f ./wordlist.txt ]; then
cp /wordlist.txt .
fi
pyspelling -c spellcheck.yaml
- I have a hard time with defaults, so I removed the copying in of the action's own spellcheck.yaml and wordlist.txt. I would much rather have the action crash and burn if preconditions are not met than have it apply some general policy and dictionary.
- I would prefer if the user were told to include a spellcheck.yaml, and in this the user should specify a wordlist.txt if needed. The documentation could provide pointers on basic configurations for repositories with different contents, like Markdown, HTML, Python etc.
- I would prefer the spellcheck.yaml to be a hidden file, since it is basic configuration and not the primary content of a repository using the action, so the name .spellcheck.yaml should be chosen. The recommendation should be for the wordlist.txt to adhere to the same policy, using the name .wordlist.txt in the documentation and in the configuration examples.
My version:
#!/bin/sh -l
# Use whichever hidden configuration file is present (.yml wins if both exist)
SPELLCHECK_CONFIG_FILE=''
if [ -f ./.spellcheck.yaml ]; then
    SPELLCHECK_CONFIG_FILE='.spellcheck.yaml'
fi
if [ -f ./.spellcheck.yml ]; then
    SPELLCHECK_CONFIG_FILE='.spellcheck.yml'
fi
echo ""
echo "Using pyspelling on repository files outlined in $SPELLCHECK_CONFIG_FILE"
echo "----------------------------------------------------------------"
pyspelling -c $SPELLCHECK_CONFIG_FILE
# Pass the exit code from pyspelling on, so the action fails when it does
EXITCODE=$?
test $EXITCODE -eq 0 || echo "($EXITCODE) Repository contains spelling errors or spelling check failed, please check diagnostics";
exit $EXITCODE
Going over the issues I had fallen over another open issue, where somebody was naming their YAML file spellcheck.yml, using yml as the suffix, which is a more widely adopted naming convention, so my take on the entrypoint.sh also takes this into consideration.
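The file lookup and the crash-and-burn preference could be combined in a small helper, sketched below. The name find_spellcheck_config is hypothetical. Unlike the entrypoint.sh above, which ends up with .spellcheck.yml when both files exist, this sketch prefers .spellcheck.yaml, and it returns a non-zero status with a message when neither file is present.

```shell
#!/bin/sh
# find_spellcheck_config: hypothetical helper, not part of the action.
# Prints the name of the first configuration file found in the given
# directory (default: current directory), or fails if there is none.
find_spellcheck_config() {
    dir="${1:-.}"
    for candidate in .spellcheck.yaml .spellcheck.yml; do
        if [ -f "$dir/$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "No .spellcheck.yaml or .spellcheck.yml found in $dir" >&2
    return 1
}

# Example usage in an entrypoint script:
#   config="$(find_spellcheck_config)" || exit 1
#   pyspelling -c "$config"
```

Failing before pyspelling is invoked keeps the error message about the missing precondition, rather than about whatever the tool does with an empty file name.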
I have adopted my local implementation for several of my repositories, with more to come. This is the opposite of my first recommendation of using the available component if one exists, but it does demonstrate another power: roll your own if what is available is not working for you or does not match your use-case.
I can see that I could end up in a situation where I would have to maintain several of these actions, so a common action or Docker container should be the approach. Right now I need to backport the latest changes from the last repository I touched to those already having a similar action implemented locally.
I love to experiment and learn. Using, hacking and generalising is a good exercise, but I would prefer to work on the actual contents of my repositories and not the infrastructure.
I am by no means unhappy with the original Spellcheck Action or its author @rojopolis. I learned a lot and I see several points of improvement to my own toolbox.
- Can I get the repository dictionary to be shared between VS Code and the PySpelling based action, like I can with Markdownlint?
- Can I put aspell to better use during my writing process?
And finally - I could decide to roll my own action based on the work done by @rojopolis. At the same time, I would prefer to send my proposed changes upstream, since this is more the open source way and it would solve the maintenance burden for me and others. If @rojopolis is unresponsive after a period of time, setting up my own project based on a fork could be the way ahead.
Next steps:
- Backport changes to all my actions
- Create PR with proposed changes and improved documentation
- Make further use of the action, preferably based on the original as a Docker container instead of my own
There are lots of grey areas in this post and lots of uncharted territory. If you can fill in some blanks, please comment on the post. Feedback is most welcome: smarter ways, better solutions, questions and insights.
Take care and watch out for each other.