TL;DR—look at Protokolo and do exactly what it does.
This is a short article because I am lazy but do want to be helpful. The sections are the steps you should take. All code presented in this article is licensed CC0-1.0.
Use gettext
As a first step, you should use gettext. This effectively means wrapping all
string literals in _() calls. This article won’t waste a lot of time on how to
do this or how gettext works. Just make sure to get plurals right, and make
sure to provide translator comments where necessary.
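As a minimal sketch of both points, the snippet below uses ngettext for a plural and attaches a translator comment. The describe_changes helper and its message are made up for illustration. Because no catalogue exists at the placeholder path, fallback=True simply hands back the English strings.

```python
import gettext

# Hypothetical setup: with fallback=True and no catalogue on disk, this
# returns a NullTranslations object that passes the English text through.
translations = gettext.translation(
    "your-module", localedir="/nonexistent", fallback=True
)
ngettext = translations.ngettext


def describe_changes(count):
    """Return a correctly pluralised message (illustrative helper)."""
    # TRANSLATORS: {count} is the number of changed files.
    return ngettext(
        "{count} file changed", "{count} files changed", count
    ).format(count=count)


print(describe_changes(1))
print(describe_changes(3))
```

Passing the count to ngettext, instead of choosing between two _() calls yourself, lets each language apply its own plural rules.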
I recommend using the class-based API. In your module, create the following
file, i18n.py:
import gettext as _gettext_module
import os
_PACKAGE_PATH = os.path.dirname(__file__)
_LOCALE_DIR = os.path.join(_PACKAGE_PATH, "locale")
TRANSLATIONS = _gettext_module.translation(
"your-module", localedir=_LOCALE_DIR, fallback=True
)
_ = TRANSLATIONS.gettext
gettext = TRANSLATIONS.gettext
ngettext = TRANSLATIONS.ngettext
pgettext = TRANSLATIONS.pgettext
npgettext = TRANSLATIONS.npgettext
This assumes that your compiled .mo files will live in
your-module/locale/<lang>/LC_MESSAGES/your-module.mo. We’ll take care of that
later. Putting the compiled files there isn’t ideal (you want them in
/usr/share/locale), but it’s the best you can do with Python packaging.
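To see how that directory layout is resolved, here is a small sketch using gettext.find from the standard library. The temporary directory and the languages "nl" and "fr" are placeholders chosen for the example, and the .mo file is an empty stand-in, not a real catalogue.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as locale_dir:
    # Create an empty placeholder at the path gettext expects for Dutch.
    mo_path = os.path.join(locale_dir, "nl", "LC_MESSAGES", "your-module.mo")
    os.makedirs(os.path.dirname(mo_path))
    open(mo_path, "wb").close()

    # find() only checks that the file exists; it does not parse it.
    found = gettext.find("your-module", localedir=locale_dir, languages=["nl"])
    missing = gettext.find("your-module", localedir=locale_dir, languages=["fr"])

print(found)
print(missing)
```

When no catalogue matches the requested language, find() returns None, which is the case that fallback=True covers in i18n.py.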
In subsequent files, just do the following to translate strings:
from .i18n import _
# TRANSLATORS: translator comment goes here.
print(_("Hello, world!"))
However, the Click module doesn’t use our TRANSLATIONS object. To fix this, we
need to use the GNU gettext API. This is kind of dirty, because it messes with
the global state, so let’s do it in cli.py (the file which contains all your
Click groups and commands).
import gettext
from .i18n import _LOCALE_DIR

if gettext.find("your-module", localedir=_LOCALE_DIR):
    gettext.bindtextdomain("your-module", _LOCALE_DIR)
    gettext.textdomain("your-module")
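The guard matters because textdomain() changes which catalogue every module-level gettext call uses, and that includes Click's. A hedged sketch of the same logic as a reusable function follows. The name bind_if_available is hypothetical, not part of gettext or Click.

```python
import gettext


def bind_if_available(domain, locale_dir):
    """Switch the global gettext domain only if a catalogue can be found.

    Returns True when the global state was changed and False otherwise,
    in which case strings stay in their original language.
    """
    if gettext.find(domain, localedir=locale_dir):
        gettext.bindtextdomain(domain, locale_dir)
        gettext.textdomain(domain)
        return True
    return False


# No catalogue lives in this placeholder directory, so nothing is changed.
bound = bind_if_available("your-module", "/nonexistent-locale-dir")
print(bound)
```

Returning a boolean makes it easy to log whether the translations were actually picked up.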
Internationalise Click
When using Click, you have two challenges:
- You need to translate the help docstrings of your groups and commands.
- You need to translate the Click gettext strings.
Translating docstrings
Normally, you have some code like this:
@click.group(name="your-module")
def main():
    """Help text goes here."""
    ...
And when you run your-module --help, you get the following output:
$ your-module --help
Usage: your-module [OPTIONS] COMMAND [ARGS]...

  Help text goes here.

Options:
  --help  Show this message and exit.
You cannot wrap the docstring in a _() call. So by necessity, we will need to
remove the docstring and do something like this:
_MAIN_HELP = _("Help text goes here.")

@click.group(name="your-module", help=_MAIN_HELP)
def main():
    ...
For multiple paragraphs, I translate each paragraph separately, which is easier for the translators:
_HELP_TEXT = (
    _("Help text goes here.")
    + "\n\n"
    + _(
        "Longer help paragraph goes here. We use implicit string concatenation"
        " to avoid putting newlines in the translated text."
    )
)
Translate the Click gettext strings
We will create a script generate_pot.sh that generates our .pot file,
including the Click translations. My script-fu isn’t very good, but it appears
to work.
#!/usr/bin/env sh

# Set VIRTUAL_ENV if one does not exist.
if [ -z "${VIRTUAL_ENV}" ]; then
    VIRTUAL_ENV=$(poetry env info --path)
fi

# Get all the translation strings from the source. Plain sh does not expand
# `**` recursively, so use find to collect files at every depth. This assumes
# there is no whitespace in the paths.
xgettext --add-comments --from-code=utf-8 --output=po/your-module.pot $(find src -name '*.py')
xgettext --add-comments --output=po/click.pot "${VIRTUAL_ENV}"/lib/python*/*-packages/click/*.py

# Put everything in your-module.pot.
msgcat --output=po/your-module.pot po/your-module.pot po/click.pot

# Update the .po files. Ideally this should be done by Weblate, but it appears
# that it isn't.
for name in po/*.po
do
    msgmerge --output="${name}" "${name}" po/your-module.pot
done
After running this script, all strings that must be translated are in your
.pot and existing .po files.
You can use the above script for argparse as well, with minor modifications.
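One such modification is pointing xgettext at argparse's source instead of Click's. argparse imports its translation functions from the gettext module, so the same textdomain() binding applies to it. A short sketch for locating the file to extract from follows. It assumes argparse ships as a single .py file, as it does in current CPython releases.

```python
import argparse
from pathlib import Path

# argparse lives in the standard library rather than in the virtualenv, so
# ask the imported module where its source file is.
ARGPARSE_SOURCE = Path(argparse.__file__)
print(ARGPARSE_SOURCE)
```

The printed path would replace the click/*.py argument in generate_pot.sh.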
Generate .pot file automagically
You don’t want to manually run the generate_pot.sh script. Instead, you want
the CI (Forgejo Actions) to run it on your behalf whenever a gettext string is
changed or introduced.
Use the following .forgejo/workflows/gettext.yaml file.
name: Update .pot file
on:
  push:
    branches:
      - main
    # Only run this job when a Python source file is edited. Not strictly
    # needed.
    paths:
      - "src/your-module/**.py"
jobs:
  create-pot:
    runs-on: docker
    container: nikolaik/python-nodejs:python3.11-nodejs21
    steps:
      - uses: actions/checkout@v3
      - name: Install gettext and wlc
        run: |
          apt-get update
          apt-get install -y gettext wlc
      # We mostly install your-module to install the click dependency.
      - name: Install your-module
        run: poetry install --no-interaction --only main
      - name: Lock Weblate
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} lock your-project/your-module
      - name: Push changes from Weblate to upstream repository
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} push your-project/your-module
      - name: Pull Weblate translations
        run: git pull origin main
      - name: Create .pot file
        run: ./generate_pot.sh
      # Normally, POT-Creation-Date changes in two locations. Check if the diff
      # includes more than just those two lines.
      - name: Check if sufficient lines were changed
        id: diff
        run:
          echo "changed=$(git diff -U0 | grep '^[+|-][^+|-]' | grep -Ev
          '^[+-]("POT-Creation-Date|#:)' | wc -l)" >> $GITHUB_OUTPUT
      - name: Commit and push updated your-module.pot
        if: ${{ steps.diff.outputs.changed != '0' }}
        run: |
          git config --global user.name "your-module-bot"
          git config --global user.email "<>"
          git add po/your-module.pot po/*.po
          git commit -m "Update your-module.pot"
          git push origin main
      - name: Unlock Weblate
        run: |
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} pull your-project/your-module
          wlc --url https://hosted.weblate.org/api/ --key ${{ secrets.WEBLATE_KEY }} unlock your-project/your-module
The job is fairly self-explanatory. The wlc command talks with Weblate, which
we will set up soon. The job installs dependencies, gets the latest
translations from Weblate, generates the .pot, and then pushes the generated
.pot (and .po files) if there were changed strings.
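The least obvious part is the "Check if sufficient lines were changed" step. Here is a rough Python rendering of what its two grep filters do. It is meant only to explain the logic, not to replace the shell one-liner. The function name and the sample diff are invented for the example.

```python
import re

# Lines that add or remove content (this skips the ---/+++ file headers).
_CHANGED = re.compile(r"^[+|-][^+|-]")
# Changes that happen on every run: the timestamp and source references.
_NOISE = re.compile(r'^[+-]("POT-Creation-Date|#:)')


def count_meaningful_changes(diff_text):
    """Count changed diff lines that are not timestamp or location noise."""
    return sum(
        1
        for line in diff_text.splitlines()
        if _CHANGED.search(line) and not _NOISE.search(line)
    )


SAMPLE_DIFF = "\n".join(
    [
        "--- a/po/your-module.pot",
        "+++ b/po/your-module.pot",
        "@@ -10 +10 @@",
        r'-"POT-Creation-Date: 2024-01-01 00:00+0000\n"',
        r'+"POT-Creation-Date: 2024-01-02 00:00+0000\n"',
        "@@ -20 +20 @@",
        "-#: src/your-module/cli.py:10",
        "+#: src/your-module/cli.py:12",
    ]
)
print(count_meaningful_changes(SAMPLE_DIFF))
print(count_meaningful_changes(SAMPLE_DIFF + '\n+msgid "New string"'))
```

The first call prints 0, so the workflow would skip the commit. The second prints 1, because a new msgid is a real change.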
See reuse-tool for a GitHub Actions job. It is currently missing the wlc
locking.
Set up Weblate
Create your project in Weblate. In the VCS settings, set version control
system to ‘Git’. Set your source repository and branch correctly. Set the push
URL to https://<your-token>@codeberg.org/your-name/your-module.git. You get
the token from https://codeberg.org/user/settings/applications. You will need
to give the token access to ‘repository’. There should be a more granular way
of doing this, but I am not aware of it.
Set the repository browser to
https://codeberg.org/your-name/your-module/src/branch/{{branch}}/{{filename}}#{{line}}.
Turn ‘Push on commit’ on, and set merge style to ‘rebase’. Also, always lock on
error.
In your project settings on Weblate, generate a project API token. Then in your
Forgejo Actions settings, create a secret named WEBLATE_KEY with the project
API token as value.
Publishing your translations with Poetry
Now that all the translation plumbing is working, you just need to make sure
that you generate your .mo files when building/publishing with Poetry.
We add a build step to Poetry using the undocumented build script. Add the
following to your pyproject.toml:
[tool.poetry.build]
generate-setup-file = false
script = "_build.py"
Do NOT name your file build.py. It will break Arch Linux packaging.
Create the file _build.py. Here are the contents:
import glob
import logging
import os
import shutil
import subprocess
from pathlib import Path

_LOGGER = logging.getLogger(__name__)
ROOT_DIR = Path(os.path.dirname(__file__))
BUILD_DIR = ROOT_DIR / "build"
PO_DIR = ROOT_DIR / "po"


def mkdir_p(path):
    """Make directory and its parents."""
    Path(path).mkdir(parents=True, exist_ok=True)


def rm_fr(path):
    """Force-remove directory."""
    path = Path(path)
    if path.exists():
        shutil.rmtree(path)


def main():
    """Compile .mo files and move them into src directory."""
    rm_fr(BUILD_DIR)
    mkdir_p(BUILD_DIR)

    msgfmt = None
    for executable in ["msgfmt", "msgfmt.py", "msgfmt3.py"]:
        msgfmt = shutil.which(executable)
        if msgfmt:
            break

    if msgfmt:
        po_files = glob.glob(f"{PO_DIR}/*.po")
        mo_files = []

        # Compile
        for po_file in po_files:
            _LOGGER.info(f"compiling {po_file}")
            lang_dir = (
                BUILD_DIR
                / "your-module/locale"
                / Path(po_file).stem
                / "LC_MESSAGES"
            )
            mkdir_p(lang_dir)
            destination = Path(lang_dir) / "your-module.mo"
            subprocess.run(
                [
                    msgfmt,
                    "-o",
                    str(destination),
                    str(po_file),
                ],
                check=True,
            )
            mo_files.append(destination)

        # Move compiled files into src
        rm_fr(ROOT_DIR / "src/your-module/locale")
        for mo_file in mo_files:
            relative = (
                ROOT_DIR / Path("src") / os.path.relpath(mo_file, BUILD_DIR)
            )
            _LOGGER.info(f"copying {mo_file} to {relative}")
            mkdir_p(relative.parent)
            shutil.copyfile(mo_file, relative)


if __name__ == "__main__":
    main()
It is probably a little over-engineered (building into build/ and then
copying to src/your-module/locale is unnecessary), but it works. Note that if
no msgfmt executable is found, the script silently compiles nothing.
Finally, make sure to actually include *.mo files in pyproject.toml:
include = [
{ path = "src/your-module/locale/**/*.mo", format="wheel" }
]
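To confirm that the include rule took effect, you can peek inside the built wheel, which is just a zip archive. This is a small sketch using only the standard library. list_mo_files is a hypothetical helper, and the wheel filename in the usage note is a placeholder.

```python
import zipfile


def list_mo_files(wheel_path):
    """Return the names of all .mo files inside a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist() if name.endswith(".mo")
        )
```

After running poetry build, something like list_mo_files("dist/your_module-0.1.0-py3-none-any.whl") should list one .mo file per language. An empty list means the translations did not make it into the package.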
And that’s it! A rather dense and curt blog post, but it should contain helpful bits and pieces.