The package contains a mixture of classic decoding methods (Wiener Filter, Wiener Cascade, Kalman Filter, Support Vector Regression) and modern machine learning methods (XGBoost, Dense Neural Network, Recurrent Neural Net, GRU, LSTM).
The decoders are currently designed to predict continuously valued output. In the future, we will modify the functions to also allow classification.
This package accompanies a manuscript that compares the performance of these methods on several datasets. We would appreciate it if you cite that manuscript when you use our code in your research.
To run the decoders based on neural networks, you need to install Keras.
To run the XGBoost Decoder, you need to install XGBoost.
To run the Wiener Filter, Wiener Cascade, or Support Vector Regression, you need to install scikit-learn.
We have included Jupyter notebooks that provide detailed examples of how to use the decoders. The file "Examples_kf_decoder" is for the Kalman filter decoder and the file "Examples_all_decoders" is for all other decoders.
Here we provide a basic example that uses an LSTM decoder.
For this example, we assume we have already loaded two matrices:
- "neural_data": a matrix of size "total number of time bins" x "number of neurons," where each entry is the firing rate of a given neuron in a given time bin.
- "y": the output variable that you are decoding (e.g. velocity), a matrix of size "total number of time bins" x "number of features you are decoding."
We have provided a Jupyter notebook, "Example_format_data", with an example of how to get Matlab data into this format.
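If you just want to check the expected shapes before loading real recordings, a minimal sketch with synthetic data could look like the following. The sizes, the Poisson spike counts, and the linear relationship between activity and output are all assumptions made for illustration; they are not taken from the package or its datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration
n_time_bins = 1000  # total number of time bins
n_neurons = 50      # number of neurons
n_outputs = 2       # number of features being decoded (e.g. x and y velocity)

# Synthetic spike counts per bin, standing in for firing rates
neural_data = rng.poisson(lam=2.0, size=(n_time_bins, n_neurons)).astype(float)

# Synthetic output: a noisy linear readout of the neural activity
weights = rng.normal(size=(n_neurons, n_outputs))
y = neural_data @ weights + rng.normal(scale=0.5, size=(n_time_bins, n_outputs))

print(neural_data.shape)  # time bins x neurons
print(y.shape)            # time bins x output features
```

The only property the rest of the example relies on is that both matrices have the same number of rows (time bins).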
First, we will import the necessary functions:
from decoders import LSTMDecoder #Import LSTM decoder
from preprocessing_funcs import get_spikes_with_history #Import function to get the covariate matrix that includes spike history from previous bins
Next, we will define the time period we are using spikes from (relative to the output we are decoding):
bins_before=13 #How many bins of neural data prior to the output are used for decoding
bins_current=1 #Whether to use concurrent time bin of neural data
bins_after=0 #How many bins of neural data after the output are used for decoding
Next, we will compute the covariate matrix that includes the spike history from previous bins:
# Function to get the covariate matrix that includes spike history from previous bins
X=get_spikes_with_history(neural_data,bins_before,bins_after,bins_current)
In this basic example, we will ignore some additional preprocessing we do in the example notebooks. Let's assume we have now divided the data into a training set (X_train, y_train) and a testing set (X_test,y_test).
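As a rough illustration of the split assumed above, the sketch below divides the data chronologically and normalizes it with statistics from the training set only. The helper name split_train_test, the 80/20 fraction, and the choice of normalization are assumptions for this sketch and may differ from what the example notebooks do.

```python
import numpy as np

def split_train_test(X, y, train_fraction=0.8):
    """Chronological split: the first part is for training, the rest for testing."""
    n_train = int(round(train_fraction * X.shape[0]))
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]

# Toy arrays with the shapes used in this example (hypothetical sizes)
X = np.random.default_rng(0).normal(size=(200, 14, 5))  # time bins x surrounding bins x neurons
y = np.random.default_rng(1).normal(size=(200, 2))      # time bins x output features

X_train, X_test, y_train, y_test = split_train_test(X, y)

# Z-score the inputs using training statistics only, to avoid leaking test data
X_mean = X_train.mean(axis=0)
X_std = X_train.std(axis=0)
X_train = (X_train - X_mean) / X_std
X_test = (X_test - X_mean) / X_std

# Zero-center the outputs, again using the training mean
y_mean = y_train.mean(axis=0)
y_train = y_train - y_mean
y_test = y_test - y_mean
```

A chronological split is used here rather than a random one because neighboring time bins share spike history, so shuffling would let near-duplicate samples appear in both sets.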
We will now finally train and test the decoder:
#Declare model and set parameters of the model
model_lstm=LSTMDecoder(units=400,num_epochs=5)
#Fit model
model_lstm.fit(X_train,y_train)
#Get predictions
y_test_predicted_lstm=model_lstm.predict(X_test)
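Once predictions are available, one simple way to judge the fit is the coefficient of determination (R^2), computed separately for each output feature. The function below is a NumPy sketch written for this example; its name, r2_per_output, is hypothetical and it is not the package's own metric code.

```python
import numpy as np

def r2_per_output(y_true, y_pred):
    """Return R^2 for each output column (1.0 means a perfect fit)."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)                 # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)    # total sum of squares
    return 1.0 - ss_res / ss_tot

# Toy check: the first output is predicted perfectly, the second with small errors
y_true = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])
y_pred = y_true.copy()
y_pred[:, 1] += np.array([0.5, -0.5, 0.5, -0.5])

scores = r2_per_output(y_true, y_pred)
print(scores)  # [1.   0.95]
```

With the variables from the example above, the call would be r2_per_output(y_test, y_test_predicted_lstm).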
There are 3 files with functions. An overview of the functions is below. More details can be found in the comments within the files.
The file "decoders.py" provides all of the decoders. Each decoder is a class with the methods "fit" and "predict".
First, we will describe the format of data that is necessary for the decoders:
- For all the decoders, you will need to decide the time period of spikes (relative to the output) that you are using for decoding.
- For all the decoders other than the Kalman filter, you can set "bins_before" (the number of bins of spikes preceding the output), "bins_current" (whether to use the bin of spikes concurrent with the output), and "bins_after" (the number of bins of spikes after the output). Let "surrounding_bins" = bins_before+bins_current+bins_after. This gives a 3d covariate matrix "X" of size "total number of time bins" x "surrounding_bins" x "number of neurons." We use this input format for the recurrent neural networks (SimpleRNN, GRU, LSTM). We can also flatten the last two dimensions, so that there is a single vector of features for every time bin. This gives "X_flat", a 2d matrix of size "total number of time bins" x "surrounding_bins * number of neurons." This input format is used for the Wiener Filter, Wiener Cascade, Support Vector Regression, XGBoost, and Dense Neural Net.
- For the Kalman filter, you can set the "lag" - what time bin of the neural data (relative to the output) is used to predict the output. The input format for the Kalman filter is simply the 2d matrix of size "total number of time bins" x "number of neurons," where each entry is the firing rate of a given neuron in a given time bin.
- The output, "y" is a 2d matrix of size "total number of time bins" x "number of output features."
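To make the 3d and flattened formats described above concrete, here is a small NumPy sketch. The function build_history_matrix is a hypothetical stand-in written for illustration, not the package's get_spikes_with_history, and filling the edge bins (where the full window is unavailable) with NaN is an assumption of this sketch.

```python
import numpy as np

def build_history_matrix(neural_data, bins_before, bins_after, bins_current=1):
    """Stack the surrounding bins of neural data for every time bin."""
    n_bins, n_neurons = neural_data.shape
    surrounding_bins = bins_before + bins_current + bins_after
    # Rows without a full window stay NaN (assumption of this sketch)
    X = np.full((n_bins, surrounding_bins, n_neurons), np.nan)
    # Offsets relative to the output bin; offset 0 is skipped when bins_current is 0
    offsets = [k for k in range(-bins_before, bins_after + 1) if k != 0 or bins_current]
    for t in range(bins_before, n_bins - bins_after):
        X[t] = neural_data[[t + k for k in offsets]]
    return X

# Toy input: 6 time bins x 2 neurons
neural_data = np.arange(12, dtype=float).reshape(6, 2)

X = build_history_matrix(neural_data, bins_before=2, bins_after=1, bins_current=1)
X_flat = X.reshape(X.shape[0], -1)  # one feature vector per time bin

print(X.shape)       # (6, 4, 2)
print(X_flat.shape)  # (6, 8)
```

Rows that contain NaN would need to be dropped (from both the inputs and "y") before fitting a decoder.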
Here are all the decoders within "decoders.py":
- WienerFilterDecoder
    - The Wiener Filter is simply multiple linear regression using X_flat as an input.
    - It has no input parameters.
- WienerCascadeDecoder
    - The Wiener Cascade (also known as a linear-nonlinear model) fits a linear regression (the Wiener filter) followed by a static nonlinearity.
    - It has parameter degree (the degree of the polynomial used for the nonlinearity).
- KalmanFilterDecoder
    - We used a Kalman filter similar to that implemented in Wu et al. 2003. In the Kalman filter, the measurement was the neural spike trains, and the hidden state was the kinematics.
    - We have one parameter C (which is not in the previous implementation). This parameter scales the noise matrix associated with the transition in kinematic states. It effectively allows changing the weight of the new neural evidence in the current update.
- SVRDecoder
    - This decoder uses support vector regression with X_flat as an input.
    - It has parameters C (the penalty of the error term) and max_iter (the maximum number of iterations).
    - It works best when the output ("y") has been normalized.
- XGBoostDecoder
    - We used the Extreme Gradient Boosting (XGBoost) algorithm to relate X_flat to the outputs. XGBoost is based on the idea of boosted trees.
    - It has parameters max_depth (the maximum depth of the trees), num_round (the number of trees that are fit), eta (the learning rate), and gpu (if you have the gpu version of XGBoost installed, you can select which gpu to use).
- DenseNNDecoder
    - Using the Keras library, we created a dense feedforward neural network that uses X_flat to predict the outputs. It can have any number of hidden layers.
    - It has parameters units (the number of units in each layer), dropout (the proportion of units that get dropped out), num_epochs (the number of epochs used for training), and verbose (whether to display progress of the fit after each epoch).
- SimpleRNNDecoder
    - Using the Keras library, we created a neural network architecture where the spiking input (from matrix X) was fed into a standard recurrent neural network (RNN) with a relu activation. The units from this recurrent layer were fully connected to the output layer.
    - It has parameters units, dropout, num_epochs, and verbose.
- GRUDecoder
    - Using the Keras library, we created a neural network architecture where the spiking input (from matrix X) was fed into a network of gated recurrent units (GRUs; a more sophisticated RNN). The units from this recurrent layer were fully connected to the output layer.
    - It has parameters units, dropout, num_epochs, and verbose.
- LSTMDecoder
    - All methods were the same as for the GRUDecoder, except Long Short Term Memory networks (LSTMs; another more sophisticated RNN) were used rather than GRUs.
    - It has parameters units, dropout, num_epochs, and verbose.
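To illustrate the two-stage idea behind the Wiener Cascade in the list above, here is a NumPy-only sketch: a least-squares linear stage followed by a polynomial fitted to each output. The function names are hypothetical, and the sketch uses np.linalg.lstsq and np.polyfit in place of scikit-learn, so it should be read as an approximation of the idea rather than the package's implementation.

```python
import numpy as np

def fit_wiener_cascade(X_flat, y, degree=3):
    """Fit a linear stage, then one polynomial nonlinearity per output."""
    A = np.column_stack([X_flat, np.ones(X_flat.shape[0])])  # add an intercept column
    W, *_ = np.linalg.lstsq(A, y, rcond=None)                # linear stage (Wiener filter)
    y_lin = A @ W
    # Static nonlinearity: polynomial mapping the linear prediction to the output
    polys = [np.polyfit(y_lin[:, j], y[:, j], degree) for j in range(y.shape[1])]
    return W, polys

def predict_wiener_cascade(X_flat, W, polys):
    """Apply the linear stage, then the fitted polynomial for each output."""
    A = np.column_stack([X_flat, np.ones(X_flat.shape[0])])
    y_lin = A @ W
    return np.column_stack([np.polyval(p, y_lin[:, j]) for j, p in enumerate(polys)])

# Toy data with a saturating (nonlinear) relationship
rng = np.random.default_rng(0)
X_flat = rng.normal(size=(300, 6))
y = np.tanh(X_flat @ rng.normal(size=(6, 1)))

W, polys = fit_wiener_cascade(X_flat, y, degree=3)
y_cascade = predict_wiener_cascade(X_flat, W, polys)
y_linear = np.column_stack([X_flat, np.ones(len(X_flat))]) @ W

mse_linear = np.mean((y - y_linear) ** 2)
mse_cascade = np.mean((y - y_cascade) ** 2)
print(mse_linear, mse_cascade)
```

On the training data the cascade can never fit worse than the linear stage alone, because the identity mapping is itself a polynomial; whether it helps on held-out data depends on how nonlinear the relationship is.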
When designing the XGBoost and neural network decoders, there were many additional parameters that could have been exposed (e.g. regularization). For ease of use, we only included parameters that were sufficient for producing good fits.
The file has functions for metrics to evaluate model fit. It currently has functions to calculate:
The file "preprocessing_funcs.py" contains functions for preprocessing data that may be useful for putting the neural activity and outputs in the correct format for our decoding functions:
- bin_spikes: converts spike times to the number of spikes within time bins
- bin_output: converts a continuous stream of outputs to the average output within time bins
- get_spikes_with_history: using binned spikes as input, this function creates a covariate matrix of neural data that incorporates spike history
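The binning steps listed above can be sketched with NumPy as follows. The function names, their arguments, and the choice to leave empty output bins as NaN are assumptions made for this illustration; the package's own bin_spikes and bin_output may take different arguments and handle edges differently.

```python
import numpy as np

def bin_spike_times(spike_times, dt, t_start, t_end):
    """Count spikes per time bin. spike_times is a list with one 1d array per neuron."""
    n_bins = int(np.floor((t_end - t_start) / dt))
    edges = t_start + dt * np.arange(n_bins + 1)
    # One histogram per neuron, stacked into a (time bins x neurons) matrix
    return np.column_stack([np.histogram(st, bins=edges)[0] for st in spike_times])

def bin_continuous_output(outputs, output_times, dt, t_start, t_end):
    """Average a sampled output (samples x features) within each time bin."""
    n_bins = int(np.floor((t_end - t_start) / dt))
    edges = t_start + dt * np.arange(n_bins + 1)
    idx = np.digitize(output_times, edges) - 1  # bin index of every sample
    binned = np.full((n_bins, outputs.shape[1]), np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            binned[b] = outputs[in_bin].mean(axis=0)
    return binned

# Toy example: two neurons and one output feature, 0.5 s bins over 2 s
spike_times = [np.array([0.1, 0.2, 0.7, 1.9]), np.array([0.6, 1.1, 1.2, 1.3])]
counts = bin_spike_times(spike_times, dt=0.5, t_start=0.0, t_end=2.0)

output_times = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75])
outputs = np.arange(8, dtype=float).reshape(8, 1)
y_binned = bin_continuous_output(outputs, output_times, dt=0.5, t_start=0.0, t_end=2.0)

print(counts)    # rows are time bins, columns are neurons
print(y_binned)  # rows are time bins, columns are output features
```

Both results have one row per time bin, which is the layout the decoders expect for "neural_data" and "y".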