
Commit

Update readme
jloveric committed Dec 30, 2023
1 parent 360f013 commit 7868473
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -9,7 +9,7 @@ a link is active for each input so the network sparsity is determined by the num

I'm interested in creating larger language models from an ensemble of smaller models. This would give better flexibility in adding or removing specific sources.

-Currently working on sparse high-order transformers
+Working models for High Order MLPs, Mamba (SSM).

# Dataset

@@ -94,7 +94,7 @@ Using conv layers (not done too much here, see below for a possibly better netwo
python examples/high_order_interpolation.py data.type=sequence net=conv max_epochs=100 optimizer.lr=1e-4 batch_size=1000 data.add_channel_dimension=true
```
### mamba
Work in progress

```
python examples/high_order_interpolation.py data.type=sequence net=mamba optimizer.lr=1e-4 data.max_features=16 batch_size=1024
```