Basic Fault Protection Ports and Components #2536
-
One thing we took from the lessons-learned page (https://llis.nasa.gov/lesson/772) is that fault detection and response should be as configurable from the ground as possible. In your system, this could be accomplished by having fault detection and response rely on parameters in the ParamDB that the ground can configure. We were actually looking at going a step further and writing fault detection and response entirely in command sequence files, which the ground could replace wholesale if necessary. That means we will have to implement logic in command sequence files, which is a separate issue, but one we were already looking at for other reasons.
-
I like the more centralized fault protection approach. A response can be defined as a set of ground commands (picked from the dictionary). This way, responses can leverage existing commands, and we don't have to add extra fault protection ports and handlers to components just to handle fault behavior.
-
Perhaps @EbenezerA99 or Antoine could weigh in here on the logic aspect. They presented at FSW and had a poster at SmallSat for their VISORS software, which used a parameter-table-driven system for fault logic. They may have opinions, formed while implementing this in F Prime, that would be useful here.
-
Yes, I worked with Antoine on FDIR logic for a two-spacecraft formation-flying mission called VISORS, which used F Prime for its FSW. See this paper for more detailed information on our FDIR strategy. This implementation looks great to me! IMO, having basic FDIR components in the F Prime repo would be of great benefit to users. One thing I would like to see is flexibility in how Fault Responses are mapped to Fault Announcements, instead of a strict 1-1 pairing between a Fault Announcement and a Fault Response. For example, for the VISORS mission we had to implement logic such as: if Fault1 && Fault2 -> invoke FaultResponse1 & FaultResponse2, etc.
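A rough sketch of what such a many-to-many mapping could look like, in plain C++ rather than F Prime code. The enumerators and the names `Rule`, `makeSet`, and `evaluate` are all made up for illustration, and this is not the VISORS implementation; it only shows one way to express "if all of these faults are active, invoke all of these responses" as a table.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical identifiers; a real project would define its own.
enum class Fault : std::size_t { Fault1, Fault2, Fault3, Count };
enum class Response : std::size_t { FaultResponse1, FaultResponse2, FaultResponse3, Count };

using FaultSet = std::bitset<static_cast<std::size_t>(Fault::Count)>;
using ResponseSet = std::bitset<static_cast<std::size_t>(Response::Count)>;

// Build a bitset from a list of enumerators.
template <typename Set, typename E>
Set makeSet(std::initializer_list<E> items) {
    Set s;
    for (E e : items) {
        s.set(static_cast<std::size_t>(e));
    }
    return s;
}

// One table row: when every fault in `required` is active,
// invoke every response in `responses`.
struct Rule {
    FaultSet required;
    ResponseSet responses;
};

// Return the union of the responses of all rules whose required faults are all active.
ResponseSet evaluate(const std::vector<Rule>& rules, const FaultSet& active) {
    ResponseSet out;
    for (const Rule& r : rules) {
        assert(r.required.any());  // a rule with no faults would always fire
        if ((active & r.required) == r.required) {
            out |= r.responses;
        }
    }
    return out;
}
```

Because the rules are plain data, a table like this could in principle be loaded from parameters or a file, which fits the ground-configurability point raised earlier in the thread.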
-
Added a discussion for possible FPP updates for this: #2540
-
This is a proposal to implement a basic fault protection engine for F Prime. It has the following FDIR concepts:
Concepts
1) Fault Announcement (The FD in FDIR)
Components detect faults locally since they are the "experts". They look at data for their domain, decide when a fault is present, and then announce via a port and a specific fault identifier that a fault has occurred. This is separate from the response, which may also be handled by the same component via an input port, but only when decided by the system fault protection implementation. This does not preclude local responses that may be more appropriate, but provides a standard way to have system coordination of responses. An example might be an instrument that is producing bad data that is detected by an instrument manager component.
2) Fault Monitors (The FD in FDIR)
Implementations of fault responses here at JPL have the notion of fault monitors. This is code that implements persistence counts and state to provide a level of filtering of fault symptoms so that the system is not overreactive. Each monitor is tunable to the particular item being monitored.
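As a minimal sketch of the persistence idea, assuming a monitor that trips only after a configurable number of consecutive bad samples and re-arms on the first good one. The class name `FaultMonitor` and its methods are hypothetical and not an existing F Prime API; `setThreshold` stands in for whatever parameter or programmatic update path a project would use.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: declares a fault only after `threshold`
// consecutive samples show the symptom.
class FaultMonitor {
  public:
    explicit FaultMonitor(std::uint32_t threshold) : m_threshold(threshold) {
        assert(threshold > 0);
    }

    // Feed one sample. Returns true only on the sample that trips the monitor.
    bool update(bool symptomPresent) {
        if (!symptomPresent) {
            // Assumption: a single good sample clears the count and re-arms the monitor.
            m_count = 0;
            m_tripped = false;
            return false;
        }
        if (m_count < m_threshold) {
            ++m_count;
        }
        if (m_count >= m_threshold && !m_tripped) {
            m_tripped = true;
            return true;
        }
        return false;
    }

    // Tune the persistence count, e.g. from a parameter update.
    void setThreshold(std::uint32_t threshold) {
        assert(threshold > 0);
        m_threshold = threshold;
    }

    bool tripped() const { return m_tripped; }

  private:
    std::uint32_t m_threshold;
    std::uint32_t m_count = 0;
    bool m_tripped = false;
};
```

A component would own one monitor per monitored item and announce the fault when `update` returns true, so a transient glitch shorter than the threshold never reaches the system fault protection.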
3) Fault Response (The IR in FDIR)
The fault protection implementation will look at a set of fault announcements and decide on a project-specific response. The component will have input ports for fault announcements, will map the announcements to various responses, and will invoke an output port with the response. The topology will connect the response output ports to the components implementing the response. The response can be "fanned out" by a splitter component if more than one component implements the response. The response may not be implemented by the component that does the announcement. For instance, the instrument fault example might have one component announce the fault (the instrument manager) while another handles the response (a power component turns off the instrument).
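The announce-map-respond flow described above could be sketched roughly as follows, in plain C++ with `std::function` callbacks standing in for F Prime port connections. Everything here is an assumption for illustration: the `FaultResponder` class, the enumerators, and the `None` entry used for unmapped faults are not part of F Prime.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical project-specific identifiers.
enum class FaultId : std::size_t { InstrumentBadData, Count };
enum class ResponseId : std::size_t { None, PowerOffInstrument, Count };

class FaultResponder {
  public:
    using Handler = std::function<void(ResponseId)>;

    // Table entry: which response a given announcement triggers.
    void map(FaultId fault, ResponseId response) { m_table[index(fault)] = response; }

    // Stand-in for a topology connection; several handlers model the fan-out.
    void connect(Handler handler) { m_handlers.push_back(std::move(handler)); }

    // Stand-in for the announcement input port: look up and dispatch the response.
    ResponseId announce(FaultId fault) {
        const ResponseId response = m_table[index(fault)];
        if (response != ResponseId::None) {
            for (Handler& h : m_handlers) {
                h(response);
            }
        }
        return response;
    }

  private:
    static constexpr std::size_t N = static_cast<std::size_t>(FaultId::Count);

    static std::size_t index(FaultId fault) {
        const std::size_t i = static_cast<std::size_t>(fault);
        assert(i < N);
        return i;
    }

    std::array<ResponseId, N> m_table{};  // value-initialized, so every entry starts as None
    std::vector<Handler> m_handlers;
};
```

In the instrument example, the instrument manager would call `announce(FaultId::InstrumentBadData)` and a handler owned by the power component would react to `ResponseId::PowerOffInstrument`; the announcer never needs to know who implements the response.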
4) Fault Completion
When a fault response is done, the component implementing the response announces it is done, so the fault response implementation can either declare the fault response done, or move to the next step in a response if multiple steps are needed. In our example,
Implementation
Enumerations
A set of FPP enumerations would be created to enumerate faults and responses. These enumerations would live in the project-specific config directories, but would be used by the ports and components below. That allows the names to be customized by a project.
Ports
A set of F Prime ports would be created in Fw to implement the fault interface. The ports would have arguments based on the above enumerated types.
FaultAnnounce
FaultRespond
FaultResponseComp
Components
Projects can implement their own arbitrarily complex fault component (or components) that implements the FaultAnnounce, FaultRespond and FaultResponseComp ports. A basic implementation can be provided in the F Prime repo that has a table to map announcements to responses.
Helpers
The Fault Monitors can be implemented as helper classes that can be instantiated by components to track a particular item. The persistence counts can be updated via [parameters](https://fprime-community.github.io/fpp/fpp-users-guide.html#Defining-Components_Parameters) or programmatically.
Future Direction
Fault protection is important enough, and lends itself well enough to modeling, that it could perhaps become a first-class FPP construct at some point.