[FPU] doc

TrampolineRTOS · Feb 28, 2024 · 27d66cb
1 parent ac9c5a2
commit 27d66cb
Showing 2 changed files with 81 additions and 21 deletions.
diff --git a/documentation/manual/main.pdf b/documentation/manual/main.pdf
diff --git a/documentation/manual/ports.tex b/documentation/manual/ports.tex
@@ -1214,52 +1214,112 @@ \subsection{Cortex-M FPU support}
 
 Processors with a floating-point unit add the following registers to the context:
 \begin{itemize}
-\item 32 32-bit registers named \reg{s0} to \reg{s31} which can be seen as 16 64-bit registers named \reg{d0} to \reg{d15} for instructions operating on double-precision floating-point numbers.
+\item 32 32-bits registers named \reg{s0} to \reg{s31} which can be seen as 16 64-bit registers named \reg{d0} to \reg{d15} for instructions operating on double-precision floating-point numbers.
 \item \reg{fpscr} is the floating-point status and control register.
-\item \reg{fpexc} is the floating-point exception register.
 \end{itemize}
 
 \reg{fpsid} is the floating-point system ID register but as this register seems to be read-only, it is not part of the context.
 
-When floating point is activated, the static task descriptor has an additional member, a pointer to the floating point context structure, which is located just after the pointer to the integer context structure. Function that save and load the contexte, \cfunction{tpl_save_context}, \cfunction{tpl_load_context}, \cfunction{tpl_save_context_under_it} and \cfunction{tpl_load_context_under_it} all have a pointer to the static task descriptor in \reg{r0} register. The floating context is accessed by reading its pointer. If the pointer is \constant{NULL}, the is not saved:
+\subsubsection{Interrupts/SVC}
+
+A system call (svc or interrupt) stacks a different exception frame depending on whether the FPU is present and used by the process. If it is, registers \reg{s0} to \reg{s15} and \reg{fpscr} are stacked after the integer registers, followed by a reserved word for 8-byte alignment:
+
+\begin{lstlisting}[language=C]
+/*-----------------------------------------------------------------------*
+ * +-------------------------------+                                     *
+ * | R0                            | <- PSP                              *
+ * +-------------------------------+                                     *
+ * | R1                            | <- PSP+4                            *
+ * +-------------------------------+                                     *
+ * | R2                            | <- PSP+8                            *
+ * +-------------------------------+                                     *
+ * | R3                            | <- PSP+12                           *
+ * +-------------------------------+                                     *
+ * | R12                           | <- PSP+16                           *
+ * +-------------------------------+                                     *
+ * | LR (aka R14)                  | <- PSP+20                           *
+ * +-------------------------------+                                     *
+ * | Return Address (saved PC/R15) | <- PSP+24                           *
+ * +-------------------------------+                                     *
+ * | xPSR (bit 9 = 1)              | <- PSP+28                           *
+ * +------------------------+---------------------\                      *
+ * | s0 (FPU)               | <- PSP+32 - 0x20    |                      *
+ * +------------------------+                     |                      *
+ * | ..                     | <- PSP+.. -         |                      *
+ * +------------------------+                     |                      *
+ * | s15 (FPU)              | <- PSP+92 - 0x5C    |- only if FPU is      *
+ * +------------------------+                     | available and        *
+ * | FPSCR (FPU)            | <- PSP+96 - 0x60    | process is using FPU *
+ * +------------------------+                     | (USEFLOAT = TRUE     *
+ * | reserved (align)       | <- PSP+100- 0x64    |  in .oil)            *
+ * +------------------------+---------------------/                      */
+
+\end{lstlisting}
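The frame layout above can be cross-checked on a host machine with a small C sketch. The structure name \texttt{ext\_exception\_frame} and its member names are hypothetical and are not taken from the Trampoline sources; only the offsets come from the figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the extended exception frame drawn above.
 * Member names are illustrative only; offsets follow the figure. */
typedef struct {
  uint32_t r0;        /* PSP + 0   */
  uint32_t r1;        /* PSP + 4   */
  uint32_t r2;        /* PSP + 8   */
  uint32_t r3;        /* PSP + 12  */
  uint32_t r12;       /* PSP + 16  */
  uint32_t lr;        /* PSP + 20  */
  uint32_t pc;        /* PSP + 24, return address */
  uint32_t xpsr;      /* PSP + 28  */
  uint32_t s[16];     /* PSP + 32 .. PSP + 92, s0 to s15 */
  uint32_t fpscr;     /* PSP + 96  */
  uint32_t reserved;  /* PSP + 100, keeps the frame 8-byte aligned */
} ext_exception_frame;

/* Compile-time checks against the offsets given in the figure. */
_Static_assert(offsetof(ext_exception_frame, s) == 0x20, "s0 offset");
_Static_assert(offsetof(ext_exception_frame, fpscr) == 0x60, "fpscr offset");
_Static_assert(offsetof(ext_exception_frame, reserved) == 0x64, "pad offset");
_Static_assert(sizeof(ext_exception_frame) % 8 == 0, "8-byte alignment");
```

The basic frame is the first 8 words (32 bytes); the extended frame adds 18 words, for 104 bytes in total.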
+
+In addition, during a system call, the value placed in \reg{lr} on exception entry differs depending on whether or not the FPU is used. The two values of interest here are:
+\begin{itemize}
+  \item \texttt{0xFFFFFFFD} - Return to Thread mode, exception return uses non-floating-point state from PSP and execution uses PSP after return.
+  \item \texttt{0xFFFFFFED} - Return to Thread mode, exception return uses floating-point state from PSP and execution uses PSP after return.
+\end{itemize}
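+Note that bit 4 of \reg{lr} indicates FPU usage in all cases: it is cleared when the extended (floating-point) frame has been stacked and set when only the basic frame has been stacked.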
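The role of bit 4 can be illustrated with a minimal host-side C sketch. The macro and function names below are hypothetical, chosen for this example, and are not part of Trampoline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two exception return values discussed above. */
#define EXC_RETURN_THREAD_PSP      0xFFFFFFFDu /* basic frame    */
#define EXC_RETURN_THREAD_PSP_FPU  0xFFFFFFEDu /* extended frame */

/* Hypothetical name for bit 4 of the exception return value. */
#define EXC_RETURN_BASIC_FRAME_BIT (1u << 4)

/* Returns true when lr says an extended (FPU) frame was stacked,
 * that is, when bit 4 is cleared. */
static inline bool exc_return_has_fp_frame(uint32_t lr)
{
  return (lr & EXC_RETURN_BASIC_FRAME_BIT) == 0u;
}
```

The two values differ only by that bit, so switching a task between the FPU and non-FPU schemes amounts to clearing or setting bit 4.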
+
+\subsubsection{Data structure}
+
+
+When floating point is activated, the static task descriptor has an additional member, a pointer to the floating-point context structure, which is located just after the pointer to the integer context structure. The functions that save and load the context, \cfunction{tpl_save_context}, \cfunction{tpl_load_context}, \cfunction{tpl_save_context_under_it} and \cfunction{tpl_load_context_under_it}, all receive a pointer to the static task descriptor in register \reg{r0}. The floating-point context is accessed by reading its pointer. If the pointer is \constant{NULL}, the floating-point context is not saved:
 
 \begin{lstlisting}[language=C]
   ldr r1,[r0,#FLOAT_CONTEXT]
   cmp r1,#0
-  beq no_save:
+  beq no_save_fp
 \end{lstlisting}
 
-Saving the floating-point context is a two-part process. First, the registers \reg{s0} to \reg{s31} are saved.
-
-\warning{This is not clear. The registers must be consecutive, but I do not know how many registers can be written at once. It could look like this, though.}
+Saving the floating-point context is a two-part process: \reg{s0} to \reg{s15} and \reg{fpscr} are saved on the stack by hardware on exception entry, and \reg{s16} to \reg{s31} are saved by software.
 
 \begin{lstlisting}[language=C]
-  vstm r1!,{s0-s31}
+  vstm r1!, {s16-s31}
+no_save_fp:
 \end{lstlisting}
 
-Then:
+Loading the floating-point context is the reverse operation. However, when loading the context, \reg{lr} has to be updated so that the exception return uses the FPU or non-FPU exception frame scheme, as appropriate. The functions \lstinline|tpl_load_context| and \lstinline|tpl_load_context_under_it| have been updated so that they return the new value of \reg{lr} (either \texttt{0xFFFFFFFD} or \texttt{0xFFFFFFED}). The return value is in \reg{r0} and is built with two instructions:
 
 \begin{lstlisting}[language=C]
-  mrs r2,fpscr
-  str r2,[r1]
-  mrs r2,fpexc
-  str r2,[r1,#4]
-no_save:
+  mov r0, #0xFFED  /* low 16 bits of LR, FPU */
+  movt r0, #0xFFFF /* high 16 bits of LR */
 \end{lstlisting}
 
-Loading the floating-point context is the same reversed. Assuming \reg{r1} is loaded with a pointer to the floating-point context. Remember that if the pointer is \constant{NULL}, these instructions are skiped:
+
+The complete sequence first loads \reg{r1} with the pointer to the floating-point context. Remember that if the pointer is \constant{NULL}, the load of \reg{s16} to \reg{s31} is skipped and the non-FPU value of \reg{lr} is returned:
 
 \begin{lstlisting}[language=C]
-  vldm r1!,{s0-s31}
-  ldr  r2,[r1]
-  msr  fpsrc
-  ldr  r2,[r1,#4]
-  msr. fpexc
-no_load:
+  #if WITH_FLOAT == YES
+  /*--------------------------------------------------
+   * Get the pointer to the floating point context
+   * from the pointer to the static descriptor of the 
+   * running task
+   */
+  ldr r1,[r0,#FLOAT_CONTEXT]
+  /* r1 is NULL if there is no float context for this process */
+  cmp r1, #0
+  beq no_load_fp
+  vldm r1!, {s16-s31} /* load s[16..31] */
+  /* now update LR to use the FPU */
+  mov r0, #0xFFED  /* low 16 bits of LR, FPU */
+  b end_fp 
+no_load_fp:
+#endif // WITH_FLOAT
+  mov r0, #0xFFFD  /* low 16 bits of LR, NO FPU */
+end_fp:
+  movt r0, #0xFFFF /* high 16 bits of LR */
+  bx   lr
 \end{lstlisting}
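The control flow of the listing above can be modelled on a host machine with the following C sketch. It is not the Trampoline implementation: the type \texttt{float\_context}, the function \texttt{model\_load\_float\_context} and the array standing in for \reg{s16} to \reg{s31} are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software-saved part of the floating-point context. */
typedef struct {
  uint32_t s16_s31[16];
} float_context;

/* Host-side model of the load path: restores s16..s31 when a float
 * context exists and returns the value to put in lr. */
static uint32_t model_load_float_context(const float_context *ctx,
                                         uint32_t regs_s16_s31[16])
{
  uint32_t low = 0xFFFDu;            /* low 16 bits of LR, no FPU */
  if (ctx != NULL) {
    /* Stands in for: vldm r1!, {s16-s31} */
    memcpy(regs_s16_s31, ctx->s16_s31, sizeof ctx->s16_s31);
    low = 0xFFEDu;                   /* low 16 bits of LR, FPU */
  }
  return (0xFFFFu << 16) | low;      /* stands in for: movt r0, #0xFFFF */
}
```

As in the assembly version, both paths share the instruction that sets the high half-word, so only the low half-word depends on whether the task has a floating-point context.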
 
+\subsubsection{Lazy Context Switch mode}
+The Lazy Context Switch mode (\texttt{LSPEN} bit in \texttt{FPU->FPCCR}) is enabled by default.
 
+This means that when an interrupt occurs, space is reserved in the stack frame for the FPU registers, but these registers are not actually pushed. They are pushed only when an FPU-related instruction is executed in the handler.
 
+Thus, if a system call does not preempt the task, the registers \reg{s0}-\reg{s15} and \reg{fpscr} are never actually saved. When a context switch occurs, the instruction that saves registers \reg{s16}-\reg{s31} (\lstinline|vstm r1!, {s16-s31}|) is the first FPU-related instruction executed and therefore triggers the deferred stacking.
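The relevant \texttt{FPCCR} bits can be decoded as in the C sketch below. The register address and bit positions are those of the ARMv7-M architecture; the macro and function names are hypothetical. The functions take the register value as a parameter so that the sketch also runs on a host; on a target the value would be read from the memory-mapped register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M Floating-Point Context Control Register. On a target:
 *   uint32_t v = *(volatile uint32_t *)FPCCR_ADDR;              */
#define FPCCR_ADDR   0xE000EF34u
#define FPCCR_LSPACT (1u << 0)   /* lazy stacking pending (deferred) */
#define FPCCR_LSPEN  (1u << 30)  /* lazy stacking enabled            */
#define FPCCR_ASPEN  (1u << 31)  /* automatic FP state preservation  */

/* True when space is reserved for s0-s15/fpscr but the actual push
 * is deferred until the first FPU instruction in the handler. */
static inline bool fpccr_lazy_enabled(uint32_t fpccr)
{
  return (fpccr & (FPCCR_ASPEN | FPCCR_LSPEN))
         == (FPCCR_ASPEN | FPCCR_LSPEN);
}

/* True while a reserved frame has not been filled in yet. */
static inline bool fpccr_lazy_pending(uint32_t fpccr)
{
  return (fpccr & FPCCR_LSPACT) != 0u;
}
```

With the reset value \texttt{0xC0000000}, both \texttt{ASPEN} and \texttt{LSPEN} are set, which corresponds to the default behaviour described above.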
 
 \subsection{Interrupt handler}