A translate script is provided to facilitate working with pandoc and deepl translation services.
The user manual is available: https://jodygarnett.github.io/translate/
The user manual is written as an example in sphinx reStructuredText and translated to mkdocs as a regression test:
-
This script requires pandoc be installed:
Ubuntu:
apt-get install pandoc
macOS:
brew install pandoc
References:
-
A writable python environment is required.
If you use Homebrew (popular on macOS), Python installs into user space, so it is a writable environment:
brew install python
You may also use the system python provided by:
- Linux distribution
- Microsoft App Store
- https://www.python.org/ (windows and macOS)
The system python is not used directly; it includes virtualenv, which is used to set up a writable Python environment:
virtualenv venv
source venv/bin/activate
-
Install mkdocs_translate into your writable Python environment.
To install latest release from pypi:
pip install mkdocs-translate
To install the development version (to preview and provide feedback):
pip install git+https://github.com/jodygarnett/translate.git
-
To check it is installed correctly:
mkdocs_translate --help
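The installation steps above depend on two commands being available: pandoc and mkdocs_translate. As a minimal sketch (the helper name `missing_tools` is hypothetical and not part of mkdocs_translate), the following checks that both can be found on the PATH of the active environment:

```python
import shutil


def missing_tools(tools=("pandoc", "mkdocs_translate")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("pandoc and mkdocs_translate are both available")
```

Run it inside the activated virtual environment, since that is where pip places the mkdocs_translate command.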
-
The script is intended to run from the location of your mkdocs project (with docs and mkdocs.yml files):
cd core-geonetwork/docs/manual
-
The script makes use of an existing build (or target) folder for scratch files:
mkdir build
-
Optional: Create a translate.yml, filling in the conversion parameters for your project.
This file is used to indicate the build or target directory to use for temporary files.
Additional configuration options are required for advanced sphinx-build conf.py options like substitutions and external links.
A working example is provided to be adapted for your project:
-
Create requirements.txt with mkdocs plugins required.
-
Create mkdocs.yml.
-
Optional: If your content uses the download directive to include external content, there is a mkdocs hook for processing of download.txt files.
Create download.py.
Register the hook with mkdocs.yml:
# Customizations
hooks:
- download.py
-
Use .gitignore to ignore the following:
build
target
-
The resulting directory structure is:
doc/
source/
.gitignore
requirements.txt
mkdocs.yml
download.py
GeoServer is used as an example here, which is a maven project with a convention of target for temporary files.
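The scratch folder and .gitignore steps above can also be scripted. This is a minimal sketch under stated assumptions: `prepare_project` is a hypothetical helper, not something mkdocs_translate provides, and it only creates the scratch folder and appends any missing ignore entries.

```python
from pathlib import Path


def prepare_project(project_dir=".", scratch="build", ignored=("build", "target")):
    """Create the scratch folder and make sure .gitignore lists the ignored names.

    Returns the entries that were added to .gitignore (empty if none were needed).
    """
    root = Path(project_dir)
    (root / scratch).mkdir(parents=True, exist_ok=True)

    gitignore = root / ".gitignore"
    existing = gitignore.read_text().splitlines() if gitignore.exists() else []
    additions = [name for name in ignored if name not in existing]
    if additions:
        # Rewrite the file with the original lines followed by the new entries.
        gitignore.write_text("\n".join(existing + additions) + "\n")
    return additions
```

Calling `prepare_project()` from the mkdocs project folder is safe to repeat: a second call finds the entries already present and changes nothing.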
-
Initial setup of the docs folder structure (so all the images from the source folder are present):
mkdocs_translate init
-
To scan rst files before conversion:
mkdocs_translate scan
The scan collects an index of pages and headings, and looks for any download files that have been managed by sphinx.
--scan=all: (default)
--scan=index: scan anchors and headings into target/convert/anchors.txt for doc and ref directives.
--scan=download: scan download directives for external content, into docs folder, producing download/download.txt folders.
mkdocs_translate scan
-
To migrate content from rst to md:
mkdocs_translate migrate
-
Review this content; you may find individual files to fix.
Some formatting is easier to fix in the rst files before conversion:
-
Indentation of nested lists in rst is often incorrect, resulting in restarted numbering or block quotes.
-
Random {.title-ref} snippets are a general indication to simplify the rst and re-translate.
-
Anchors or headings with trailing whitespace throw off the heading scan, resulting in broken references.
To reconvert, migrate accepts paths to a file or folder:
mkdocs_translate migrate source/introduction/license.rst
mkdocs_translate migrate source/introduction/**/*.rst
-
To generate the navigation tree:
mkdocs_translate nav
Supply path information for a file or folder:
mkdocs_translate nav source/index.rst
mkdocs_translate nav source/introduction/**/*.rst
The output is printed to standard out and may be appended to the mkdocs.yml file.
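The review step above treats stray {.title-ref} snippets in the converted markdown as a sign that the rst needs simplifying. A minimal sketch for locating them follows; `find_title_refs` is a hypothetical helper, and the `docs` default is an assumption based on the folder layout described earlier.

```python
from pathlib import Path

MARKER = "{.title-ref}"


def find_title_refs(docs_dir="docs"):
    """Return (path, line number, stripped line) for each markdown line containing the marker."""
    hits = []
    for path in sorted(Path(docs_dir).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if MARKER in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Each hit points at a converted page whose rst source is worth simplifying before running migrate on that file again.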
Some things are not supported by pandoc, which will produce WARNING: messages:
-
Substitutions used for inline images
-
Underlines: replace with bold or italic
WARNING: broken reference 'getting_involved' link:getting_involved-broken.rst
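When a conversion produces many warnings, it can help to collect the broken references into a list. The sketch below assumes every such message follows the shape of the single example above; that format is inferred from one line and is not a documented contract, and `broken_references` is a hypothetical helper.

```python
import re

# Pattern inferred from the one example message shown above (an assumption).
BROKEN_REF = re.compile(r"WARNING: broken reference '([^']+)' link:(\S+)")


def broken_references(log_text):
    """Return (reference, link) pairs for each broken-reference warning in the text."""
    return BROKEN_REF.findall(log_text)
```

Feed it the captured output of a conversion run; lines that do not match the assumed shape are ignored.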
Translations are listed alongside english markdown:
example.md
example.fr.md
Translation works by using pandoc to convert to html, and then using the Deepl REST API.
-
Provide an environment variable with your Deepl authentication key:
export DEEPL_AUTH="xxxxxxxx-xxx-...-xxxxx:fx"
-
Translate a document to french using pandoc and deepl:
mkdocs_translate french docs/help/index.md
-
To translate several documents in a folder:
mkdocs_translate french docs/overview/*.md
Deepl charges by the character, so bulk translation is not advisable.
See mkdocs_translate french --help for more options.
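Because Deepl charges by the character, it may be worth estimating the size of a batch before translating it. This is a rough sketch only: `count_characters` is a hypothetical helper, and it counts characters in the markdown source, whereas the tool sends html produced by pandoc, so the billed figure will differ.

```python
import glob
from pathlib import Path


def count_characters(pattern):
    """Return {path: character count} for each file matching a glob pattern."""
    return {
        path: len(Path(path).read_text(encoding="utf-8"))
        for path in sorted(glob.glob(pattern))
    }
```

For example, `sum(count_characters("docs/overview/*.md").values())` gives an approximate total for the folder used in the example above.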
You are welcome to use google translate, ChatGPT, or Deepl directly - keeping in mind markdown formatting may be lost.
Please see the writing guide for what mkdocs functionality is supported.
To build and test locally:
-
Clone:
git clone https://github.com/jodygarnett/translate.git translate
-
Install requirements:
cd translate
pip3 install -r mkdocs_translate/requirements.txt
-
Install locally:
pip3 install -e .
Distribution:
-
Update the version number in mkdocs_translate/__init__.py:
__version__ = '0.4.2'
-
Build wheel:
python3 -m build
-
Upload wheel:
python3 -m twine upload --repository pypi dist/*
Debugging:
-
Recommend troubleshooting a single file at a time:
mkdocs_translate rst docs/index.rst
-
Compare the temporary files staged for pandoc conversion:
bbedit docs/index.rst docs/index.md target/convert/index.tmp.html target/convert/index/tmp.md
-
To turn on logging during conversion:
mkdocs_translate --log=DEBUG translate.yml rst
Pandoc:
-
The pandoc plugin settings are in two constants:
md_extensions_to = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes-simple_tables+pipe_tables'
md_extensions_from = 'markdown+definition_lists+fenced_divs+backtick_code_blocks+fenced_code_attributes+pipe_tables'
-
The pandoc extensions are chosen to align with mkdocs use of markdown extensions, or with post-processing:
| markdown extension | pandoc extension | post processing |
| --- | --- | --- |
| tables | pipe_tables | |
| pymdownx.keys | | post processing |
| pymdownx.superfences | backtick_code_blocks | post processing |
| admonition | fenced_divs | post processing |
-
To troubleshoot just the markdown to html conversion:
mkdocs_translate internal_html manual/docs/contributing/style-guide.md
mkdocs_translate internal_markdown target/contributing/style-guide.html
diff manual/docs/contributing/style-guide.md target/contributing/style-guide.md
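The final diff step of the round trip above can also be done from Python, which is convenient when checking several pages. This is a minimal sketch using the standard library; `markdown_diff` is a hypothetical helper and assumes both files already exist from the two mkdocs_translate commands.

```python
import difflib
from pathlib import Path


def markdown_diff(original, round_tripped):
    """Return unified diff lines between the original and round-tripped markdown."""
    before = Path(original).read_text(encoding="utf-8").splitlines(keepends=True)
    after = Path(round_tripped).read_text(encoding="utf-8").splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            before, after, fromfile=str(original), tofile=str(round_tripped)
        )
    )
```

An empty list means the page survived the markdown to html to markdown round trip unchanged; otherwise `print("".join(lines))` shows what pandoc altered.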
For geoserver or core-geonetwork (or other projects following maven conventions) no configuration is required.
To override configuration on the command line add --config <file.yml> before the command:
mkdocs_translate --config translate.yml rst
The file mkdocs_translate/config.yml contains some settings (defaults are shown below):
-
deepl_base_url: "https://api-free.deepl.com"
Customize if you are a paying customer.
-
project_folder: "."
Default assumes you are running from the current directory.
-
rst_folder: "source"
-
docs_folder: "docs"
mkdocs convention.
-
build_folder: "target"
The use of "target" follows maven convention; python projects may wish to use "build".
-
anchor_file: 'anchors.txt'
-
upload_folder: "translate"
Combined with build_folder to stage html files for translation (example: build/translate).
-
convert_folder: "convert"
Combined with build_folder for rst conversion temporary files (example: build/convert). Temporary files are required for use by pandoc.
-
download_folder: "translate"
Combined with build_folder to retrieve translation results (example: build/translate). Temporary files are required for use by pandoc.
-
substitutions: dictionary of |substitutions| to use when converting conf.py rst_epilog common substitutions.
project: GeoServer
author: Open Source Geospatial Foundation
copyright: 2023, Open Source Geospatial Foundation
project_copyright: 2023, Open Source Geospatial Foundation
-
The built-in substitutions for |version| and |release| are changed to {{ version }} and {{ release }} variables for use with mkdocs-macros-plugin variable substitution.
Use mkdocs.yml to define:
extra:
  homepage: https://geoserver.org/
  version: '2.24'
  release: '2.24.2'
-
extlinks: dictionary of conf.py extlinks substitutions.
To convert sphinx-build conf.py:
extlinks = {
    'wiki': ('https://github.com/geoserver/geoserver/wiki/%s', None),
    'user': ('https://docs.geoserver.org/'+branch+'/en/user/%s', None),
    'geos': ('https://osgeo-org.atlassian.net/browse/GEOS-%s','GEOS-%s')
}
Use config.yml (note use of mkdocs-macros-plugin for variable substitution):
extlinks:
  wiki: https://github.com/geoserver/geoserver/wiki/%s
  user: https://docs.geoserver.org/{{ branch }}/en/user/%s
  geos: https://osgeo-org.atlassian.net/browse/GEOS-%s|GEOS-%s
  download_release: https://sourceforge.net/projects/geoserver/files/GeoServer/{{ release }}/geoserver-{{ release }}-%s.zip|geoserver-{{ release }}-%s.zip
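To make the url|label convention above concrete, here is a minimal sketch of how one entry could be expanded for a given target. It illustrates the pattern only and is not the implementation used by mkdocs_translate; `expand_extlink` is a hypothetical helper, and falling back to the URL when no label is given mirrors what Sphinx does for a None caption, which is an assumption about this tool.

```python
def expand_extlink(pattern, target):
    """Expand an extlinks entry of the form 'url' or 'url|label' for a target.

    Returns a (url, label) tuple. Any {{ variable }} placeholders are left
    untouched for mkdocs-macros-plugin to resolve later.
    """
    url_pattern, _, label_pattern = pattern.partition("|")
    url = url_pattern.replace("%s", target)
    # Assumption: with no label part, use the URL itself as the link text.
    label = label_pattern.replace("%s", target) if label_pattern else url
    return url, label
```

With the geos entry above, `expand_extlink("https://osgeo-org.atlassian.net/browse/GEOS-%s|GEOS-%s", "1234")` yields the issue URL ending in GEOS-1234 and the label GEOS-1234.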