Simple Keras-inspired Deep Learning Framework implemented in Python with a NumPy backend (using hand-written gradients) and Matplotlib plotting. For efficient (multithreaded) Einstein summation between tensors I use the einsum2 repo.
As with all my other repos, this is more an exercise for me to make sure I understand the main Deep Learning architectures and algorithms than useful code to fit models, as well as a way to think about (relatively) efficient implementations of them. Hope this (super) simplified "Keras" re-implementation helps you understand them too!
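To give an idea of what "hand-written gradients" means here: each layer implements both its forward pass and its backward pass explicitly (no autodiff). Below is a minimal, self-contained sketch of a ReLU-style layer; it only illustrates the idea and is not the actual code in mlp/layers.py:

import numpy as np

class ReluSketch:
    # Toy layer: explicit forward pass and hand-written gradient
    def forward(self, x):
        self.mask = x > 0           # cache what the backward pass needs
        return x * self.mask

    def backward(self, grad_out):
        # dL/dx = dL/dy * dy/dx, where dy/dx is 1 where x > 0 and 0 elsewhere
        return grad_out * self.mask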
Allows you to build, train and assess a modular Multi-Layer Perceptron Sequential architecture as you would using Keras. The model (as of now) presents the following features:
- Layers:
  - Trainable: Dense, Conv2D, VanillaRNN
  - Activation: Relu, Softmax
  - Regularization: Dropout, MaxPool2D
- Losses:
  - CrossEntropy
  - CategoricalHinge
- Optimization: Minibatch SGD BackProp Training with customizable:
  - Batch Size
  - Epochs / Iterations
  - Momentum
  - L2 Regularization Term
- Callbacks:
  - Learning Rate Scheduler: Constant, Linear, Cyclic (the cyclic policy is sketched right after this list)
  - Loss & Metrics tracker
  - Early Stopper
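For reference, the "Cyclic" learning-rate evolution mentioned above follows the usual triangular policy: the learning rate ramps linearly from lr_min up to lr_max and back down, cycle after cycle. A standalone sketch of that schedule (illustration only, not the repo's implementation; step_size is a made-up parameter name):

import numpy as np

def triangular_cyclic_lr(iteration, lr_min=1e-4, lr_max=1e-1, step_size=500):
    # One full cycle lasts 2 * step_size iterations: lr_min -> lr_max -> lr_min
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return lr_min + (lr_max - lr_min) * max(0.0, 1 - x)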
Code Example:
# Imports
from mlp.callbacks import MetricTracker, LearningRateScheduler
from mlp.layers import Conv2D, Dense, MaxPool2D, Softmax, Relu, Dropout, Flatten
from mlp.losses import CrossEntropy
from mlp.models import Sequential
from mlp.metrics import Accuracy
# Define model
model = Sequential(loss=CrossEntropy(), metric=Accuracy())
model.add(Conv2D(num_filters=32, kernel_shape=(3, 3), stride=2, input_shape=(32, 32, 3)))
model.add(Relu())
model.add(Conv2D(num_filters=64, kernel_shape=(3, 3)))
model.add(Relu())
model.add(MaxPool2D(kernel_shape=(2, 2), stride=2))
model.add(Conv2D(num_filters=128, kernel_shape=(2, 2)))
model.add(Relu())
model.add(MaxPool2D(kernel_shape=(2, 2)))
model.add(Flatten())
model.add(Dense(nodes=200))
model.add(Relu())
model.add(Dropout(0.8))
model.add(Dense(nodes=10))
model.add(Softmax())
# Define callbacks
mt = MetricTracker() # Stores training evolution info (losses and metrics)
lrs = LearningRateScheduler(evolution="cyclic", lr_min=1e-4, lr_max=1e-1)
callbacks = [mt, lrs]
# Fit model
model.fit(X=x_train, Y=y_train, X_val=x_val, Y_val=y_val,
          batch_size=100, epochs=100, l2_reg=0.01, momentum=0.8,
          callbacks=callbacks)
mt.plot_training_progress()
# Test model
test_acc, test_loss = model.get_metric_loss(x_test, y_test)
print("Test accuracy:", test_acc)
Example of metrics tracked during training:
NOTE: More architectures, layers and features (LSTM, RBF, SOM, DBF) coming soon
Metaparameter Optimization is commonly used when training these kinds of models. To ease the process I implemented a MetaParamOptimizer class with methods such as Grid Search. Additionally, together with Federico Taschin, I wrote a wrapper around scikit-optimize to perform Bayesian Optimization, which you can find here (a bare scikit-optimize sketch follows the grid-search example below). To use MetaParamOptimizer:
- Define the search space and the fixed arguments of your model in two different dictionaries.
- Define an evaluator function which trains and evaluates your model on the joined arguments. This function should return a dictionary with at least the key "value" (which MetaParamOptimizer will optimize).
Code example:
from mlp.layers import Dense, Softmax
from mlp.losses import CategoricalHinge
from mlp.models import Sequential
from mpo.metaparamoptimizer import MetaParamOptimizer
from util.misc import dict_to_string
search_space = {  # Optimization will be performed on all combinations of these
    "batch_size": [100, 200, 400],  # Batch sizes
    "lr": [0.001, 0.01, 0.1],       # Learning rates
    "l2_reg": [0.01, 0.1],          # L2 Regularization terms
}
fixed_args = {  # These will be kept constant
    "x_train": x_train,
    "y_train": y_train,
    "x_val": x_val,
    "y_val": y_val,
    "epochs": 100,
    "momentum": 0.1,
}
def evaluator(x_train, y_train, x_val, y_val, **kwargs):
    # Define model (ex: SVM)
    model = Sequential(loss=CategoricalHinge())
    model.add(Dense(nodes=10, input_dim=x_train.shape[0]))
    model.add(Softmax())
    # Fit model
    model.fit(X=x_train, Y=y_train, X_val=x_val, Y_val=y_val, **kwargs)
    model.plot_training_progress(show=False, save=True, name="figures/" + dict_to_string(kwargs))
    model.save("models/" + dict_to_string(kwargs))
    # Evaluator result (add model to retain the best one)
    value = model.get_classification_metrics(x_val, y_val)[0]  # Get accuracy
    result = {"value": value, "model": model}  # MetaParamOptimizer will maximize "value"
    return result
# Get best model and best params
mpo = MetaParamOptimizer(save_path="models/")
best_model = mpo.grid_search(evaluator=evaluator,
                             search_space=search_space,
                             fixed_args=fixed_args)
# This will run your evaluator function once per combination of search_space params: 3x3x2 = 18 times
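The Bayesian Optimization wrapper mentioned above builds on scikit-optimize. As a point of reference, a bare scikit-optimize call (without the wrapper) looks roughly like the sketch below; the objective is a dummy stand-in for "train the model and return a value to minimize", and the dimension names and bounds are made up for illustration:

import numpy as np
from skopt import gp_minimize
from skopt.space import Integer, Real

def objective(params):
    lr, l2_reg, batch_size = params
    # In practice: build, fit and evaluate a model with these hyperparameters and
    # return something to MINIMIZE (e.g. negative validation accuracy).
    # Dummy surrogate so the sketch runs standalone:
    return (np.log10(lr) + 2) ** 2 + (np.log10(l2_reg) + 2) ** 2 + 1e-4 * batch_size

space = [Real(1e-4, 1e-1, prior="log-uniform", name="lr"),
         Real(1e-3, 1e-1, prior="log-uniform", name="l2_reg"),
         Integer(50, 400, name="batch_size")]

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("Best params:", result.x, "Best value:", result.fun)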
Example of Gaussian Process Regression Optimizer hyperparameter analysis:
Clone repo and install requirements:
git clone https://github.com/OleguerCanal/Toy-DeepLearning-Framework.git
cd Toy-DeepLearning-Framework
pip install -r requirements.txt
[OPTIONAL] To parallelize Einstein summations between tensors, install einsum2; if it is not found, the single-threaded NumPy version will be used instead (SLOWER).
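That kind of optional dependency is usually handled with a guarded import, roughly like the sketch below (not necessarily the repo's exact code; the einsum2 import path and call signature are assumptions, check the einsum2 repo):

import numpy as np

try:
    from einsum2 import einsum2  # assumed import; multithreaded two-operand einsum
    _HAS_EINSUM2 = True
except ImportError:
    _HAS_EINSUM2 = False

def pairwise_einsum(subscripts, a, b):
    # Einstein summation over two tensors, multithreaded when einsum2 is available
    if _HAS_EINSUM2:
        return einsum2(subscripts, a, b)
    return np.einsum(subscripts, a, b)  # single-threaded NumPy fallback (slower)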