Input fields for data sources and processors

Stijn Peeters edited this page Nov 20, 2023 · 8 revisions

Processors and data sources can define options that allow someone using 4CAT to tweak the data generated by that processor or data source to their liking. This can range from setting a search query for a data source to choosing what column of a CSV file a processor should operate on.

In 4CAT's code, these are set in the options property of the processor object. This value is read by the front-end to determine what kind of interface to show, and by the back-end to parse data entered by the user. Remember that data sources' workers are (a special type of) processor and use the same logic.

The options object can be added to a processor as follows:

from backend.abstract.processor import BasicProcessor
from common.lib.user_input import UserInput

class SuperCoolProcessor(BasicProcessor):
    """
    A dummy processor to show input fields.
    """
    type = "dummy-processor"
    title = "Dummy Processor"
    extension = "csv"

    options = {
        "sample-input": {
            "type": UserInput.OPTION_TEXT,
            "default": "all",
            "help": "A sample text input",
            "tooltip": "This text will show when hovering over the question mark."
        }
    }
    
    ...

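This will show a single-line text input field labelled "A sample text input" and pre-filled with "all", with a "?" widget next to it that displays the tooltip on hover.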

The value set for these options, when running the processor, may then be accessed as follows:

    def process(self):
        input_text = self.parameters.get("sample-input")

and so on. Each option can have a number of settings; type and help are the only ones that are strictly required, but more settings are available (see below).

Additionally, the same syntax/structure is used to provide configuration options for a data source or processor. Configuration options cannot be set by an ordinary user but may be changed by 4CAT administrators in the control panel. They determine, for instance, the look and feel of 4CAT, and the availability of various modules.

Options (adjustable by a user) are defined in a processor class' options property; configuration options (adjustable by an administrator) are defined in the config property. If a get_options() class method is available, it is called and its result is used instead of the options property. The signature of the get_options() method is:

@classmethod
def get_options(cls=None, parent_dataset=None, user=None):
    # return an options object (a dict structured like the options property)
    return {}
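One reason to prefer get_options() over a static options property is that the options can be built at run time, for instance from the dataset the processor will run on. The following is a minimal sketch of that pattern, not 4CAT's own code: the UserInput class is a stand-in so the example runs on its own, and FakeDataset, its columns attribute and ColumnPicker are hypothetical names used only for illustration.

```python
class UserInput:
    # stand-in for 4CAT's UserInput constants, so the sketch runs on its own
    OPTION_CHOICE = "choice"


class FakeDataset:
    # hypothetical parent dataset that exposes its column names
    def __init__(self, columns):
        self.columns = columns


class ColumnPicker:
    @classmethod
    def get_options(cls, parent_dataset=None, user=None):
        # without a parent dataset there are no columns to offer
        columns = parent_dataset.columns if parent_dataset else []
        return {
            "column": {
                "type": UserInput.OPTION_CHOICE,
                "help": "Column to process",
                "options": {name: name for name in columns},
                "default": columns[0] if columns else None,
            }
        }


options = ColumnPicker.get_options(parent_dataset=FakeDataset(["body", "author"]))
```

Here the select list offers one entry per column, and the first column is pre-selected.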

What happens with this data?

The definition is used to display input forms in the front-end interface, and to parse the data entered therein. The basic sequence is as follows:

  • User navigates to a page where the processor may be started
  • A form is rendered according to the option definitions
  • User submits the form
  • Input is parsed using the option definitions
  • Input is passed to validate_query if defined (see below)
  • Input, or output of validate_query, is saved as dataset metadata and available in a processor as self.parameters.
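The sequence above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual 4CAT implementation: parse_input and handle_submission are hypothetical helpers, the type value is a placeholder string, and real parsing does considerably more (type handling, clamping, and so on).

```python
def parse_input(definitions, form_input):
    # keep only options that are defined; fall back to each option's default
    return {
        name: form_input.get(name, settings.get("default"))
        for name, settings in definitions.items()
    }


def handle_submission(definitions, form_input, validate_query=None):
    parsed = parse_input(definitions, form_input)
    if validate_query is not None:
        # the output of validate_query replaces the parsed input
        parsed = validate_query(parsed, None, None)
    # this is what a processor would later see as self.parameters
    return parsed


definitions = {
    "sample-input": {
        "type": "text",  # placeholder for UserInput.OPTION_TEXT
        "help": "A sample text input",
        "default": "all",
    }
}

# input for undefined options is discarded, and the default fills the gap
parameters = handle_submission(definitions, {"unknown": "x"})
```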

Validating option values

A processor can define a validate_query method that will be used to validate any user input. It has the following structure:

@staticmethod
def validate_query(query, request, user):
    return query

The method is called after user input has been parsed. The parsed input is passed to the method as its query parameter. The return value of the method is what is ultimately stored in the metadata of the dataset the processor is operating on.

The validate_query method can raise the following exceptions (from common.lib.exceptions import *) to indicate issues with the input:

  • QueryNeedsFurtherInputException: If the input was valid, but further data is needed. This can be used to build 'wizard'-style processors. The exception can be passed a config argument, containing an option definition (as discussed above) describing what further input is needed. This is currently only supported for data source options.
  • QueryNeedsExplicitConfirmationException: If the input was valid, but user confirmation is required before the processor can run. The message passed with the exception is used as the confirmation question shown to the user. After confirming, there will be an additional key frontend-confirm in the query dictionary with True as its value. This is currently only supported for data source options.
  • QueryParametersException: If the input was invalid. The message passed with the exception is shown as an error message.
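As an illustration, a validate_query method that rejects empty input and tidies up the rest might look like the sketch below. The exception class is redefined here as a stand-in so the example runs without 4CAT (in a real processor it would be imported from common.lib.exceptions), and DummySearch and its query option are hypothetical.

```python
class QueryParametersException(Exception):
    # stand-in for the exception of the same name in common.lib.exceptions
    pass


class DummySearch:
    @staticmethod
    def validate_query(query, request, user):
        # reject empty search queries with a message shown to the user
        text = (query.get("query") or "").strip()
        if not text:
            raise QueryParametersException("Please provide a search query.")
        # the returned dict is what ends up in the dataset metadata
        return {**query, "query": text}


cleaned = DummySearch.validate_query({"query": "  cats  "}, None, None)
```

Because the return value is what gets stored, the method can also be used to normalise input, as is done here by stripping whitespace.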

Settings

type

Required.

Option type. This is a constant defined in the UserInput class (from common.lib.user_input import UserInput). The following types are available:

| Type | UI control | Default type of value |
|---|---|---|
| OPTION_TOGGLE | Checkbox | boolean |
| OPTION_CHOICE | Select list | string |
| OPTION_TEXT | Text input field (single line) | string |
| OPTION_MULTI | Select list (allow multiple choice) | list |
| OPTION_MULTI_SELECT | List of checkboxes | list |
| OPTION_TEXT_LARGE | Text input field (multi-line) | string |
| OPTION_TEXT_JSON | Text input field (multi-line) | dict |
| OPTION_DATE | Text input field (type=date) | int (Unix timestamp) |
| OPTION_DATERANGE | Two text input fields (type=date) | tuple (of two ints; Unix timestamps) |
| OPTION_HUE | Colour hue | int ('H' value of a HSL colour, 0-360) |
| OPTION_FILE | File upload field | N/A (uploaded files are handled separately) |
| OPTION_INFO | Text paragraph | N/A |
| OPTION_DIVIDER | Dividing line | N/A |

help

Required.

A somewhat awkwardly-named setting determining the name of the option. This is used as the input control's label in the user interface.

default

Optional.

If no value is found for the option in the input, use this value. Also determines for example which value is pre-selected for an OPTION_CHOICE control or what an OPTION_TEXT is pre-filled with. If not given, the default value is None.

tooltip

Optional.

A longer explanation of what the option does, which makes a "?" widget appear next to the control in the interface. Hovering the widget displays the information in a tooltip.

coerce_type

Optional.

Cast the value to the given type (e.g. int) while parsing. If the value cannot be cast to the type, use the default value.

min and max

Optional.

If the option has an int or float value, or is cast to it, clamp the value between min and max. It is also possible to provide only one of these two.
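The combined effect of coerce_type, default, min and max can be sketched as below. This is only an approximation of the behaviour described here, not 4CAT's parsing code; coerce_and_clamp is a hypothetical helper.

```python
def coerce_and_clamp(value, settings):
    coerce_type = settings.get("coerce_type")
    if coerce_type is not None:
        try:
            value = coerce_type(value)
        except (ValueError, TypeError):
            # the value cannot be cast: use the default value instead
            value = settings.get("default")

    # clamp numeric values only; either bound may be left out
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "min" in settings:
            value = max(value, settings["min"])
        if "max" in settings:
            value = min(value, settings["max"])
    return value


settings = {"coerce_type": int, "default": 10, "min": 1, "max": 100}

too_high = coerce_and_clamp("250", settings)    # clamped to max
not_a_number = coerce_and_clamp("abc", settings)  # falls back to default
too_low = coerce_and_clamp("-5", settings)      # clamped to min
```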

options

Optional. Ignored unless type is one of OPTION_CHOICE, OPTION_MULTI or OPTION_MULTI_SELECT.

A dict of possible values for this option, for example:

  "options": {
    "value1": "Description of value 1",
    "value2": "Description of value 2"
  }
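Put together with the other settings, a complete definition for a select list could look like the sketch below. The UserInput class is a stand-in so the fragment runs on its own, and the option name model and its help text are made up for the example.

```python
class UserInput:
    # stand-in for 4CAT's UserInput constants, so the sketch runs on its own
    OPTION_CHOICE = "choice"


options = {
    "model": {
        "type": UserInput.OPTION_CHOICE,
        "help": "Model to use",
        "options": {
            "value1": "Description of value 1",
            "value2": "Description of value 2",
        },
        # pre-selects "Description of value 1" in the select list
        "default": "value1",
    }
}
```

The keys of the options dict are the values a processor receives; the descriptions are what the user sees in the interface.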

requires

Optional.

Makes the availability of the option contingent on the value of another option. The requires setting supports a limited set of syntaxes:

  • "requires": "model": Enable if the option named model is not empty, or (in the case of a boolean/checkbox) not False. Equivalent to "requires": "model!=".
  • "requires": "model=openai": Enable if model is openai.
  • "requires": "model!=openai": Enable if model is not openai.
  • "requires": "model~=openai": Enable if model contains openai, or (in the case of a list) if openai is a value in the list.
  • "requires": "model^=openai": Enable if model starts with openai.
  • "requires": "model$=openai": Enable if model ends with openai.

If the requires condition is not met, a processor behaves as if the option does not exist at all, i.e. no value is parsed or stored for it.

Behaviour for options whose requires condition involves the value of other options with a requires setting is undefined (i.e. you shouldn't try to use nested conditionals). Consider using QueryNeedsFurtherInputException (see above) instead if you need this.
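To make the requires syntaxes concrete, the sketch below evaluates a requires string against a dict of option values. It is an illustration of the rules listed above rather than 4CAT's own implementation; requires_met is a hypothetical helper, and edge cases (such as a numeric value of 0) may be treated differently by 4CAT.

```python
import re


def requires_met(requires, values):
    # split "name<operator>expected"; the operator and value are optional
    match = re.match(r"^([^=!~^$]+)(!=|~=|\^=|\$=|=)?(.*)$", requires)
    name, operator, expected = match.groups()
    actual = values.get(name)

    if operator is None or (operator == "!=" and expected == ""):
        # bare option name: enabled if not empty and not False
        return bool(actual)
    if operator == "~=" and isinstance(actual, (list, tuple)):
        # for lists, check whether the expected value is an item
        return expected in actual

    text = "" if actual is None else str(actual)
    if operator == "=":
        return text == expected
    if operator == "!=":
        return text != expected
    if operator == "~=":
        return expected in text
    if operator == "^=":
        return text.startswith(expected)
    return text.endswith(expected)  # operator is "$="


values = {"model": "openai-gpt", "tags": ["openai", "local"], "enabled": False}
starts_with = requires_met("model^=openai", values)
in_list = requires_met("tags~=openai", values)
checkbox = requires_met("enabled", values)
```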

sensitive

Optional.

Indicates that the value is sensitive information (e.g. passwords or API keys) and should not be stored within a dataset's metadata once the dataset has started processing.

cache

Optional.

Indicates that the value should be cached client-side, i.e. that if the same input form is viewed again, the value previously entered should be automatically pre-filled. Does not have an effect on what is stored server-side (often useful in combination with sensitive).
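A typical use of sensitive together with cache is an API key: the key is remembered in the browser for convenience, but not kept in the dataset's metadata. The sketch below shows such a definition, with a hypothetical strip_sensitive helper that mimics the server-side effect. The UserInput class is a stand-in, and how and when 4CAT actually removes sensitive values is not shown here.

```python
class UserInput:
    # stand-in for 4CAT's UserInput constants, so the sketch runs on its own
    OPTION_TEXT = "string"


options = {
    "api-key": {
        "type": UserInput.OPTION_TEXT,
        "help": "API key",
        "sensitive": True,  # not kept in dataset metadata once processing starts
        "cache": True,      # remembered client-side and pre-filled next time
    },
    "query": {"type": UserInput.OPTION_TEXT, "help": "Search query"},
}


def strip_sensitive(parameters, definitions):
    # hypothetical helper: drop values of options marked as sensitive
    return {
        name: value
        for name, value in parameters.items()
        if not definitions.get(name, {}).get("sensitive", False)
    }


stored = strip_sensitive({"api-key": "secret", "query": "cats"}, options)
```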

saturation and value

Optional. Ignored unless type is OPTION_HUE.

Sets the S and V values of a HSV colour that together determine the range of available colours in the colour picker control.

indirect

Optional. Ignored unless parsing as a 4CAT configuration option.

Indicates that the 4CAT configuration option is set through some other mechanism, and that input for it should be ignored.

global

Optional. Ignored unless parsing as a 4CAT configuration option.

Indicates that the option cannot be configured per user or tag, but only globally for the whole 4CAT server.