Some doc on a TeX bug (latex3#1493)
* some doc on a TeX bug

* David found a typo, so maybe spell-checking is a good idea (sorry lost the British English this way)

* one further UK spelling

* De-duplicate "the"

---------

Co-authored-by: Joseph Wright <joseph@texdev.net>
FrankMittelbach and josephwright authored Oct 17, 2024
1 parent c86ee03 commit e8033b5
Showing 1 changed file with 136 additions and 43 deletions.
179 changes: 136 additions & 43 deletions required/tools/array.dtx
@@ -39,7 +39,7 @@
% \begin{macrocode}
%<+package>\NeedsTeXFormat{LaTeX2e}[2024/06/01]
%<+package>\ProvidesPackage{array}
%<+package> [2024/10/12 v2.6g Tabular extension package (FMi)]
%<+package> [2024/10/17 v2.6g Tabular extension package (FMi)]
%
% \fi
%
@@ -259,7 +259,7 @@
% \begin{table}[!t]
% \begin{center}
% \setlength{\extrarowheight}{1pt}
% \begin{tabular}{|>{\tt}c|m{9cm}|}
% \begin{tabular}{|>{\ttfamily}c|m{9cm}|}
% \hline
% \multicolumn{2}{|c|}{Unchanged options}\\
% \hline
@@ -382,6 +382,20 @@
% \end{itemize}
%
%
%
% \subsection{A note on the allowed content of \texttt{>\{...\}} and
% \texttt{<\{...\}}}
%
% These specifiers are meant to hold declarations, such as
% \verb=>{\itshape}=. They cannot end in commands that take arguments
% without providing these arguments as part of the \verb={...}=. It
% would be a mistaken assumption that they pick up all or parts of the
% alignment entry data if their argument is not provided. E.g.,
% \verb=>{\textbf}= would not make the whole column bold nor would it
% make the first character bold (technically it would try to
% bolden \cs{ignorespaces}). Thus, it would not fail with an error,
% but effectively the output would be wrong and not as expected.
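%
% As a brief illustration (a sketch only, not taken from the package
% sources), compare a declaration with a command that is missing its
% argument:
% \begin{verbatim}
% \begin{tabular}{>{\bfseries}l >{\textbf}l}
%   bold & not bold \\
% \end{tabular}
% \end{verbatim}
% The first column is set in bold because \cs{bfseries} is a
% declaration; going by the explanation above, the second is not. If a
% command with an argument is wanted, the argument has to be supplied
% inside the braces, as in \verb=>{\textbf{Note:} }=.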
%
% \subsection{The behavior of the \texttt{\string\\} command}
%
% In the basic \texttt{tabular} implementation of \LaTeX{} the \cs{\bslash}
@@ -464,16 +478,16 @@
% "\newcolumntype{L}{>{$}l<{$}}" \\
% "\newcolumntype{R}{>{$}r<{$}}"
% \end{quote}
% Then we can use \texttt{C} to get centred LR-mode in an
% \texttt{array}, or centred math-mode in a \texttt{tabular}.
% Then we can use \texttt{C} to get centered LR-mode in an
% \texttt{array}, or centered math-mode in a \texttt{tabular}.
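%
% As a small usage sketch (the table content is made up for
% illustration, and the definition of \texttt{C} is assumed to follow
% the same pattern as \texttt{L} and \texttt{R} above), a
% \texttt{tabular} with a centered math-mode column could look like
% this:
% \begin{verbatim}
% \newcolumntype{C}{>{$}c<{$}}
% \begin{tabular}{lC}
%   area of a circle & \pi r^2 \\
%   circumference    & 2 \pi r \\
% \end{tabular}
% \end{verbatim}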
%
% The example given above for `centred decimal points' could be
% The example given above for `centered decimal points' could be
% assigned to a \texttt{d} specifier with the following command.
% \begin{quote}
% "\newcolumntype{d}{>{\centerdots}c<{\endcenterdots}}"
% \end{quote}
%
% The above solution always centres the dot in the
% The above solution always centers the dot in the
% column. This does not look too good if the column consists of large
% numbers, but to only a few decimal places. An alternative definition
% of a \texttt{d} column is
@@ -734,7 +748,7 @@
% user the opportunity of overriding the settings of a
% "\newcolumntype" defined using these declarations. For example,
% suppose in an \texttt{array} environment we use a \texttt{C}
% column defined as above. The \texttt{C} specifies a centred text
% column defined as above. The \texttt{C} specifies a centered text
% column, however ">{\bfseries}C", which re-writes to
% ">{\bfseries}>{$}c<{$}" would not specify a bold column as might
% be expected, as the preamble would essentially expand to
@@ -896,14 +910,14 @@ Bug reports can be opened (category \texttt{#1}) at\\%
%
% \section{A note on the updates done December 2023}
%
% We introduced support for tagged PDf and at the same time we added
% We introduced support for tagged PDF and at the same time we added
% code to determine row and column numbers for each cell in
% preparation for supporting formatting or type specifications for individual
% cells (or groups of cells) from the outside, e.g., \enquote{rows 1,
% 2, and 10 are header rows} (syntax to be decided).
%
% This new code is already written with L3 programming layer conventions
% while most of the legay code is still as it was before. This make the code
% while most of the legacy code is still as it was before. This makes the code
% currently somewhat cluttered, unfortunately. Eventually this will all move to the L3
% programming layer but this will take time.
%
@@ -1345,7 +1359,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% of \textsf{token} register $15$ instead of $1$ later on.
%
% The example above referred to an older version of =\save@decl= which
% inserted a =\relex= inside the token register. This is now moved to
% inserted a =\relax= inside the token register. This is now moved to
% the places where the actual token registers are inserted (look for
% =\the@toks=) because the old version would still make =@=
% expressions to moving arguments since after expanding the second
@@ -1386,33 +1400,112 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% \begin{macrocode}
\UseTaggingSocket{tbl/cell/begin}%
% \end{macrocode}
% Here, we assume that the \textsf{count} register
% Next we have to insert the toks register holding the content of
% \verb=>{...}=. Here, we assume that the \textsf{count} register
% =\@tempcnta= has saved the value $=\count@= - 1$.
%
% To keep \TeX{} happy if there is a look ahead in the tabular preamble
% which uses the Appendix~D trick (for example anything with a trailing
% optional argument defined by \pkg{ltcmd}), we wrap everything here in
% a protected version of \cs{@firstofone}.\footnote{The reason this
% works is not really clear: almost certainly there is a bug in \TeX{}
% here that we are simply avoiding, but as the master counter doesn't
% show up in a trace, a full understanding likely means working through
% the code that implements \cs{halign}!}
% \TeX{} otherwise can get
% To keep \TeX{} happy if there is a look ahead in the tabular
% preamble, i.e., starting in \verb=>{...}=, which uses the
% Appendix~D trick (for example, anything with a trailing optional
% argument defined by \pkg{ltcmd}), we wrap everything here in a
% protected version of \cs{@firstofone}. \TeX{} otherwise can get
% confused about the value of the master counter, and we get some
% strange errors. (Quite possibly the underlying issue is a \TeX{}
% bug, but rather than try to fix in 2024 we accept it's there and
% work-around.) As an example, without this approach, something
% like
% strange errors. We suspected that there was an underlying issue
% in the \TeX{} engine, but it turned out to be rather hard to get
% to the bottom of it, because the master counter is not accessible
% through \TeX{}'s tracing tools. Thus, all we could do was
% produce various example documents, observe the results, and
% stare at a printout of the \TeX{} program. As an example,
% without this approach, something like
% \begin{verbatim}
%\NewDocumentCommand\foo{o}{x}
%\begin{tabular}{>{\foo}l}
% Foo
%\end{tabular}
% \NewDocumentCommand\foo{o}{x}
% \begin{tabular}{>{\foo}l}
% Foo
% \end{tabular}
% \end{verbatim}
% will fail; that can be fixed by adding a \cs{relax} after the \cs{@tempcnta},
% but that then leads to issues if you are collecting whole cells (tagging code
% or \\pkg{collcell}), where you can no longer alter the meaning of \cs{cr}
% as the master counter goes wrong.
% failed. That can be fixed by adding a \cs{relax} after the
% \cs{@tempcnta}, but that then leads to issues if you are
% collecting whole cells (tagging code or \pkg{collcell}), where
% you can no longer alter the meaning of \cs{cr} as the master
% counter goes wrong due to an obscure bug (or perhaps, say, an
% undocumented feature of \TeX{}). Eventually, we were able to pin
% down the root cause and really understand why
% \cs{@protected@firstofone} solves the problem, even though it
% looks like a nonsense addition to the code that does nothing
% useful.\footnote{So it is a \TeX{} engine bug that was in there
% from day one, or if you like, it is a hidden feature that is not
% explained; neither in the \TeX{}book nor in the program code. We
% don't really expect this to change in \TeX{} after such a long
% time, other than perhaps documenting it as a feature, so this is
% a proper solution to the problem and not just a workaround.}
%
% The problem is that \TeX{} tries to conserve stack space, and
% when the last token of an existing token list is a macro, then
% this token list is \emph{first} removed from memory (reducing the
% stack) \emph{before} the macro replacement text (as a new
% token list) is given to the parser adding a new stack level. This
% is done using the routine \texttt{end\_token\_list} in the \TeX{}
% program and ending the u-part of an \cs{halign} column with this
% routine immediately sets the \emph{master counter} used by alignments to
% zero (see chapter~22 and Appendix~D of the \TeX{}book). This
% means that technically the expansion of the last token in the u-part (if it
% is a macro) is not executed in the context of the u-part, but in
% the context of the alignment entry in the document. That normally
% doesn't make any difference whatsoever --- unless you play
% around (as we sometimes have to) with tricks like those from
% Appendix~D.
%
% To illustrate the issue we show a bit of strange low-level plain
% \TeX{} code.\footnote{If all of this looks mighty strange to you,
% don't worry. You will be unlikely to need to know about it. It is
% just there so that programmers at some point in the future do not
% have to wonder too much why there is this odd
% \cs{@protected@firstofone} that apparently does nothing
% useful. It took us several nights of head scratching to come up
% with these minimal examples and then some more time to understand
% what the heck is going on inside \TeX{}---thanks to Bruno for the
% right ideas on the latter.} Below are two very special grouping
% commands that are like \cs{bgroup} and \cs{egroup} but also
% affect the alignment master counter when expanded (see
% \TeX{}book p.~385). If one of them is used as the
% last macro in the u-part of a column, then you get strange errors
% that you shouldn't get.
% \begin{verbatim}
% \def\bbgroup{{\ifnum0=`}\fi}
% \def\eegroup{\ifnum0=`{\fi}}
%
% % Fails with an error message, but there should be none:
% \halign{%
% \message{u-part^^J}%
% \bbgroup % <-- in the u-part
% \eegroup % <-- in the u-part
% #%
% \message{v-part^^J}%
% \hfill\cr
% \message{body^^J}x
% \cr
% }
%
% % Fails but should work, the v-part is never reached:
% \halign{%
% \message{u-part^^J}%
% \bbgroup % <-- in the u-part
% #%
% \message{v-part^^J}%
% \eegroup % <-- in the v-part
% \hfill\cr
% \message{body^^J}x
% \cr
% }
% \end{verbatim}
%
% So the trick we use now is making \cs{@protected@firstofone} the
% last macro in the u-part, i.e., before the \cs{@sharp}. That way
% its argument is always expanded as part of the alignment entry
% and not as part of the u-part, so we know exactly what the master
% counter value is at this point, regardless of the content of
% \verb=>{...}=.
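%
% As a rough sketch of why this works (our own plain \TeX{}
% illustration, not the package code), the first failing example
% above can be rewritten with a helper \cs{firstofone}, a
% hypothetical unprotected stand-in for \cs{@protected@firstofone},
% as the only and therefore last macro of the u-part:
% \begin{verbatim}
% \def\bbgroup{{\ifnum0=`}\fi}
% \def\eegroup{\ifnum0=`{\fi}}
% % Hypothetical stand-in for \@protected@firstofone (not protected):
% \long\def\firstofone#1{#1}
%
% \halign{%
%   \firstofone{\bbgroup\eegroup}% <-- last macro in the u-part
%   #%
%   \hfill\cr
%   x
%   \cr
% }
% \end{verbatim}
% Following the reasoning above, the master counter is set to zero
% once the argument of \cs{firstofone} has been read, and
% \cs{bbgroup} and \cs{eegroup} then change it by $+1$ and $-1$ in
% the context of the alignment entry, so the closing \cs{cr} should
% be recognized as the end of the cell again.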
%
% \changes{v2.6f}{2024/09/13}{Stop parsing for optional argument (gh/1468)}
% \changes{v2.6g}{2024/10/12}{Further work to support optional args in preamble (gh/1468)}
% \begin{macrocode}
@@ -1429,7 +1522,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% afterwards; in math mode, the latter is suppressed while the
% \cs{ignorespaces} makes no difference.
% \changes{v2.0e}{1991/02/07}{Added \{\} around \cs{@sharp} for new ftsel}
% \changes{v2.0h}{1992/06/22}{Removed \{\} again in favour of
% \changes{v2.0h}{1992/06/22}{Removed \{\} again in favor of
% \cs{d@llarbegin}}
% \changes{v2.6b}{2024/04/08}{Do not \cs{unskip} if in math mode (gh/1323)}
% \begin{macrocode}
@@ -1920,7 +2013,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% For that reason the new implementation does the centering
% manually: First we check the height of the cell and if that is
% less than or equal to =\ht\strutbox= we assume that this is a
% single line cell. In that case we don't do any vertical maneuvre
% single line cell. In that case we don't do any vertical maneuver
% and simply output the box, i.e., make it behave like a single
% line p-cell.
%
@@ -2363,7 +2456,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
\lineskip \z@
\baselineskip \z@
% \end{macrocode}
% Don't use \cs{m@th} here as that signals to the math taggingg
% Don't use \cs{m@th} here as that signals to the math tagging
% code that this is fake math that should not be tagged.
% \changes{v2.6a}{2023/12/11}{Support for tagged PDF}
% \begin{macrocode}
@@ -2489,7 +2582,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
%
% We then start a special brace which I have directly
% copied from the original definition. It is
% necessary, because the =\futurlet= in =\@ifnextchar=
% necessary, because the =\futurelet= in =\@ifnextchar=
% might
% expand a following =&= \textsf{token} in a construction like
% =\\ &=. This would otherwise end the alignment template at a
@@ -2502,7 +2595,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% that a =&= will not be considered belonging to the current
% =\halign= while we are looking for a =*= or =[=.
% For further information see
% \cite[Appendix D]{bk:knuth}.
% \cite[Appendix~D]{bk:knuth}.
% \begin{macrocode}
\iffalse{\fi\ifnum 0=`}\fi
% \end{macrocode}
@@ -2687,7 +2780,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% \section{The Environment Definitions}
%
% After these preparations we are able to define the environments. They
% only differ in the initialisations of =\d@llar...=, =\col@sep=
% only differ in the initializations of =\d@llar...=, =\col@sep=
% and =\@halignto=.
%
% \begin{macro}{\@halignto}
@@ -2702,7 +2795,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% able to overwrite the =\@halignto=
% setting of a tabular in the main text resulting in a very weird error.
% \changes{v2.4d}{2016/10/06}{\cs{@halignto} set locally (pr/4488)}
% \changes{v2.0g}{1992/06/18}{`d@llarbegin defined on toplevel.}
% \changes{v2.0g}{1992/06/18}{`d@llarbegin defined on top-level.}
% When the new font selection scheme is in force we have to
% surround all =\halign= entries
% with braces. See remarks in TUGboat 10\#2. Actually we are going
@@ -2752,7 +2845,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% \begin{macro}{\tabular}
% \begin{macro}{\tabular*}
% The environments \textsf{tabular} and \textsf{tabular$*$} differ
% only in the initialisation of the command =\@halignto=. Therefore
% only in the initialization of the command =\@halignto=. Therefore
% we define
% \changes{v2.4d}{2016/10/06}{\cs{@halignto} set locally (pr/4488)}
% \begin{macrocode}
@@ -3034,7 +3127,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% rewrites (in the token register "\NC@list") will look like
% "\NC@do *" "\NC@do C" "\NC@do L".
% So we need to define "\NC@do" as a one argument macro which
% initialises the rewriting of the specified column. Let us assume that
% initializes the rewriting of the specified column. Let us assume that
% `C' is the argument.
% \begin{macrocode}
\def\NC@do#1{%
@@ -3102,7 +3195,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% \end{macro}
%
% \subsection{The $*$--form}
% We view the $*$-form as a slight generalisation of the system
% We view the $*$-form as a slight generalization of the system
% described in the previous subsection. The idea is to define a $*$
% column by a command of the form:
% \begin{verbatim}
@@ -3540,7 +3633,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
% the cell content is too wide.
% \changes{v2.4f}{2017/11/07}{Column type added}
% \changes{v2.5a}{2020/04/06}{Use \cs{d@llarbegin} and \cs{d@llarend} so
% that cell is typeset in mathmode inside \texttt{array} (gh/297)}
% that cell is typeset in math mode inside \texttt{array} (gh/297)}
% \begin{macrocode}
\newcolumntype{W}[2]
{>{\begin{lrbox}\ar@cellbox\d@llarbegin}%