Support Single-File re-rendering #926

garvinhicking · 2024-03-14T08:06:07Z

Feature request

Motivation

A large set of documentation files, here the TYPO3 Core Changelogs (https://github.com/TYPO3/typo3/tree/main/typo3/sysext/core/Documentation, 3,237 files).
Parsing takes 12 seconds, compiling about 25 seconds, and rendering the full set ~8 minutes (it can be much slower, depending on the editor's CPU).
Editors may want to write a single new changelog file. To preview how that file renders, they have to run the whole process and wait a long time, often several times while the text is still being written. As a result, editors simply commit ReST files to the project without having tested them.
The same feature could address a "localization" rendering problem, where localizations are sub-directories of a main project. They also need to be rendered separately, while keeping relations to files outside the main directory intact (Includes.rst.txt).

Current implementation

Currently, one might try to restrict rendering to a specific path, like:

    bin/guides --config=/Documentation --input-file=Documentation/Changelog/13.1/Feature-4711-document.rst Documentation/Changelog/13.1/

However:
- File references will fail, for example the include directive for /Includes.rst.txt: the documentation root has been mapped to "/Documentation/Changelog/13.1", so "/Includes.rst.txt" no longer resolves to "/Documentation/Includes.rst.txt".
- No proper menu would be rendered (because the other files in the structure will be missing)
- Links to other documents would not work (because they have not been parsed)
- The output would be saved in "/Documentation-GENERATED-temp/Feature-4711-document.html" instead of the proper directory structure
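
The first and last points can be illustrated with a minimal sketch. This is not how guides resolves paths internally; the helper names `resolve_root_relative` and `output_path` are hypothetical and only model the idea that "/" means "documentation root" and that output paths mirror the path relative to that root.

```python
import posixpath


def resolve_root_relative(doc_root: str, reference: str) -> str:
    """Resolve a root-relative reference such as "/Includes.rst.txt".

    Hypothetical helper: a leading slash is treated as "relative to the
    documentation root", not as the filesystem root.
    """
    return posixpath.normpath(posixpath.join(doc_root, reference.lstrip("/")))


def output_path(doc_root: str, source: str, output_dir: str) -> str:
    """Map a source file to its HTML file, mirroring the path below doc_root."""
    relative = posixpath.relpath(source, doc_root)
    stem, _extension = posixpath.splitext(relative)
    return posixpath.join(output_dir, stem + ".html")


source = "/Documentation/Changelog/13.1/Feature-4711-document.rst"
for root in ("/Documentation", "/Documentation/Changelog/13.1"):
    # With the narrowed root, the include points to a file that does not exist
    # and the output loses its "Changelog/13.1/" sub-directory.
    print(root)
    print("  include ->", resolve_root_relative(root, "/Includes.rst.txt"))
    print("  output  ->", output_path(root, source, "/Documentation-GENERATED-temp"))
```

A single-file mode would therefore need to keep "/Documentation" as the root and only narrow the set of files to render.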

Feature Idea

If there were either an automatic process or a specific command that renders only a single file instead of the whole documentation, that would speed up the process considerably and avoid wasting resources.
For the command line, there could be an option like --single-render=/path/to/one/File.rst.
For automation, the parse process could compare timestamps and only re-render a single file if its .rst file is newer than the corresponding generated .html file.
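
The timestamp comparison could look roughly like this. It is only a sketch of the idea in Python; `needs_render` is a hypothetical name, and treating a missing HTML file as stale is an added assumption that matches the "Bonus" idea further down.

```python
from pathlib import Path


def needs_render(source: Path, target: Path) -> bool:
    """Return True if the .rst source should be rendered again.

    Assumption: a missing HTML file always counts as stale; otherwise the
    modification times of source and target are compared.
    """
    if not target.exists():
        return True
    return source.stat().st_mtime > target.stat().st_mtime
```

Note that a pure mtime check misses cases where an included file or a linked document changed while the .rst file itself did not.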

Details about possible implementations

We talked about this in a Slack chat (https://typo3.slack.com/archives/C0638JMJVLY/p1710358832263659) and had some ideas.

The general processing is:

1. Parse (collect files)
2. Compile (index files, build cross-references)
3. Render (transform the actual RST to HTML)

The first two steps could benefit from intermediate caches (perhaps serialized PHP, or an easy-to-parse format like JSON). The "parse" and "compile" steps already write objects.inv.json, so could a single-file render re-use that generated file to restore the tree into memory? That file might need additional metadata for full menu/interlink generation.

So the execution would be:

1. Check if objects.inv.json already exists (if not: full rendering).
2. Only load the specified input file, and with the map of the objects.inv.json file, build the menu information and interlinks.
3. Bonus: If no specific input file is specified, but a directory (or "everything"), the render process could try to only re-render RST files that are newer than their HTML file (or if the HTML file is missing).
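
As a rough sketch of that decision flow (in Python, not the actual guides code): the inventory is treated as opaque JSON here because the issue does not describe its structure, and `load_inventory` and `plan_render` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def load_inventory(inventory_file: Path) -> Optional[Any]:
    """Load objects.inv.json if it exists, otherwise return None."""
    if not inventory_file.is_file():
        return None
    with inventory_file.open(encoding="utf-8") as handle:
        return json.load(handle)


def plan_render(inventory_file: Path, single_file: Optional[str]) -> Dict[str, Any]:
    """Decide between a full render and a single-file render.

    Falls back to a full render when no inventory exists yet or when no
    single file was requested.
    """
    inventory = load_inventory(inventory_file)
    if inventory is None or single_file is None:
        return {"mode": "full", "files": None, "inventory": None}
    # The cached inventory stands in for the parse+compile result.
    return {"mode": "single", "files": [single_file], "inventory": inventory}
```

Whether the existing inventory carries enough information to rebuild menus is exactly the open question raised above, so a real implementation would probably need an extended cache file.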

Watcher

Another idea was to let guides run as an ongoing daemon with a watch task. In this case the result of the parse+compile operation would be kept in memory, and a continuous loop would check for changes in the filesystem (maybe php-inotify based).

Whenever an RST file is changed, only that single file would get re-rendered. No intermediate cache processing would be needed, because the process always has this information in memory.

When a NEW RST file is created, the in-memory map would need to be expanded. When an RST file is deleted, the in-memory map would need to be reduced.
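
The three cases (changed, new, deleted) can be told apart by comparing two snapshots of the input directory. The following Python sketch uses simple polling instead of inotify; `snapshot` and `diff_snapshots` are hypothetical names and not part of guides.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def snapshot(input_dir: Path) -> Dict[str, float]:
    """Map every .rst file below input_dir to its modification time."""
    return {
        str(path.relative_to(input_dir)): path.stat().st_mtime
        for path in input_dir.rglob("*.rst")
    }


def diff_snapshots(
    old: Dict[str, float], new: Dict[str, float]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, changed, removed) file names between two snapshots."""
    added = sorted(new.keys() - old.keys())      # expand the in-memory map
    removed = sorted(old.keys() - new.keys())    # reduce the in-memory map
    changed = sorted(
        name for name in new.keys() & old.keys() if new[name] != old[name]
    )                                            # re-render only these files
    return added, changed, removed
```

A watch loop would take a snapshot, sleep briefly, take another one, and then act on the three lists: re-render `changed`, and update the map (and probably the menus) for `added` and `removed`.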

Having a watch task for the rendering process on the INPUT directory would allow a second process (maybe a NodeJS/Vite-based watcher) to watch the OUTPUT directory (HTML files). On any change in the OUTPUT directory, a browser hot reload process could reload changes made by the rendering process.

This watch task might even allow multi-threading: a separate thread could be triggered for each changed file, and the initial rendering could split up the results of the parse+compile step to render HTML files in parallel. That would get complex, though, because of race conditions when writing to the in-memory pool of parse+compile data as files are added or removed.
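
One way to sidestep the race conditions for the initial render is to hand every worker a read-only view of the parse+compile data. The sketch below shows this in Python; `render_one` is a hypothetical stand-in for the real renderer, and the index is reduced to a name-to-title map for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType
from typing import Dict, Iterable, Mapping


def render_one(name: str, index: Mapping[str, str]) -> str:
    """Hypothetical renderer: only reads from the shared index."""
    return f"<h1>{index[name]}</h1>"


def render_all(
    names: Iterable[str], index: Mapping[str, str], workers: int = 4
) -> Dict[str, str]:
    """Render all named documents in parallel against a frozen index."""
    names = list(names)
    # Copy the index and expose it read-only, so no worker can mutate it.
    frozen = MappingProxyType(dict(index))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input names.
        rendered = pool.map(lambda name: render_one(name, frozen), names)
        return dict(zip(names, rendered))
```

Adding or removing files would then build a new index and swap it in between render batches instead of mutating the one that workers are reading. In standard CPython, threads do not speed up CPU-bound rendering, so this only illustrates the data-sharing pattern, not an actual performance gain.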

For a solution like TYPO3's render-guides this would mean the docker container could get a command that spawns two processes which would run alongside each other.

TL;DR

A first step could be to allow reading the generated objects.inv.json as a "cache" to prevent the need for re-parse and re-compile, so that when a single input file (or directory) gets specified, only that is rendered but can still use the menu and interlink metadata. The pool of files to be rendered would then just be reduced to those files specified.

We could make this behavior conditional on a new command-line option (--single-render) so that it doesn't affect the current rendering negatively. When that option is set, the parse+compile steps could be bypassed, and the needed object properties / in-memory data maps populated from the objects.inv.json file instead.

garvinhicking added feature triage labels Mar 14, 2024