
Initial attempt at implementing mlx-nn NN modules #100

Merged — 114 commits merged into main from api/mlx-nn-impl on Oct 20, 2024

Conversation

minghuaw (Collaborator) commented Aug 2, 2024

This is my initial attempt at implementing the neural net modules and optimizers. The following are included in this PR:

  • nn/activation
  • nn/convolution
  • nn/dropout
  • nn/linear
  • nn/sequential
  • nn/losses/
  • optimizer/sgd
  • optimizer/rmsprop

@minghuaw minghuaw marked this pull request as draft August 2, 2024 19:34
@minghuaw minghuaw changed the title from "Implement neural net layers" to "Implement activation functions" Aug 11, 2024
@minghuaw minghuaw changed the title from "Implement activation functions" to "Implement mlx-nn activation functions" Aug 11, 2024
@minghuaw minghuaw marked this pull request as ready for review August 11, 2024 01:36
@minghuaw minghuaw requested a review from dcvz August 11, 2024 01:36
minghuaw (Collaborator, Author) commented Aug 11, 2024

This is just an initial attempt at implementing some of the neural net components. I think some feedback on the overall API design and ergonomics would be nice, @dcvz.

minghuaw (Collaborator, Author):
Ended up changing the Module trait to take &Array. The only cost (so far) is that an empty Sequential would end up deep-cloning the input array, but this should make overall usage more flexible, as most (if not all) ops only require a reference to the input.
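
For reference, a minimal sketch of a trait shape along those lines, with the forward pass borrowing its input; the trait in this PR may differ, and the import path and method names here are assumptions for illustration:

```rust
use mlx_rs::Array; // assuming the core crate exposes its array type at the root

// Hedged sketch of a Module trait whose forward pass takes `&Array`.
pub trait Module {
    /// Error type returned by the forward pass (the C FFI currently surfaces
    /// an mlx Exception, so custom error types are limited in practice).
    type Error;

    /// Forward pass; borrowing the input lets most ops avoid cloning it.
    fn forward(&self, x: &Array) -> Result<Array, Self::Error>;

    /// Switch training-specific behaviour (e.g. dropout) on or off.
    fn training_mode(&mut self, mode: bool);
}
```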

@minghuaw minghuaw marked this pull request as draft September 4, 2024 18:12
@@ -142,3 +145,63 @@ pub fn generate_test_cases(input: TokenStream) -> TokenStream {

TokenStream::from(tests)
}

/// Derive the `ModuleParameters` trait for a struct. Mark a field with `#[param]` attribute to
Contributor:

Do you think we'll eventually want to have a derive(Module)?

Collaborator (Author):

I don't think so, because the Module implementation would include the forward method. The associated error type was an attempt to allow the user to have more customized errors, but it is currently limited because the C FFI can only take an mlx Exception. One thing we could change in the Module trait is to provide a default no-op training_mode implementation, but I didn't do this because I felt users might forget about it if it's not mandatory.
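
To illustrate the split being described (parameter plumbing derived, forward written by hand), something roughly like the sketch below; the layer, its fields, the import path for the derive, and the stand-in forward body are all assumptions made up for the example:

```rust
use mlx_macros::ModuleParameters; // assumed export of the derive in mlx-macros
use mlx_rs::{error::Exception, Array};

// Hypothetical layer: parameters come from the derive, forward is manual.
#[derive(ModuleParameters)]
pub struct Linear {
    #[param]
    weight: Array,
    #[param]
    bias: Option<Array>,
}

impl Module for Linear {
    type Error = Exception;

    fn forward(&self, x: &Array) -> Result<Array, Self::Error> {
        // Stand-in body for the sketch; the real layer would do matmul + bias.
        Ok(x.clone())
    }

    fn training_mode(&mut self, _mode: bool) {
        // Linear has no train/eval distinction, so this is a no-op.
    }
}
```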

/// # Panics
///
/// Panics if `alpha` is negative.
pub fn with_alpha(mut self, alpha: impl Into<Option<f32>>) -> Self {
Contributor:

Should we type this so it can't be negative? I'm always a bit torn on that kind of typing.

Collaborator (Author):

Is there an existing crate that does this for f32 or should we add a new type?
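
If we go the newtype route, a tiny hand-rolled wrapper would be enough; this is a hypothetical sketch, not an existing crate:

```rust
// Hypothetical newtype: rejects negative (and NaN) values at construction,
// so `with_alpha` could take it instead of panicking.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NonNegativeF32(f32);

impl NonNegativeF32 {
    pub fn new(value: f32) -> Option<Self> {
        if value >= 0.0 { Some(Self(value)) } else { None }
    }

    pub fn get(self) -> f32 {
        self.0
    }
}
```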

Comment on lines 78 to 79
// let v = alpha * state + (1 - alpha) * square(gradient)
// return (parameter - learningRate * gradient / (sqrt(v) + eps), v)
Contributor:

We can get rid of this, right?

Collaborator (Author):

That was me trying to make sure the math is correct, but yeah, we could get rid of this.
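
For anyone following along, the commented-out pseudocode is the usual RMSprop step; a scalar Rust rendering of the same math (the real optimizer applies this element-wise to Arrays):

```rust
// Scalar illustration of the update in the comment above.
fn rmsprop_step(
    parameter: f32,
    gradient: f32,
    state: f32,
    learning_rate: f32,
    alpha: f32,
    eps: f32,
) -> (f32, f32) {
    // v = alpha * state + (1 - alpha) * gradient^2
    let v = alpha * state + (1.0 - alpha) * gradient * gradient;
    // parameter - learning_rate * gradient / (sqrt(v) + eps)
    let new_parameter = parameter - learning_rate * gradient / (v.sqrt() + eps);
    (new_parameter, v)
}
```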

/// A custom type to indicate whether a `Module` should include a bias or not.
/// Default to `Yes`.
#[derive(Debug, Clone, Copy, Default)]
pub enum WithBias {
Contributor:

It will be nice to eventually be able to define bias as Tensor<Shape>. I wonder if there's some way we might already be able to do something similar with Array.

Collaborator (Author):

This is a part that I'm not too sure about. On the one hand, this is not consistent with how we handle optional args in other Modules or functions; on the other hand, the bias would either be zeros or random::uniform. But we could probably get rid of this and just define an associated pub const for the conv layers and linear layers (these are the only Modules that use this type right now).
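
A rough sketch of that associated-const direction (the layer and const names are illustrative only, not the crate's actual API):

```rust
// Instead of a WithBias enum, each layer that takes an optional bias could
// expose its default as an associated const.
pub struct Conv1d;

impl Conv1d {
    /// Whether a bias is created when the caller doesn't specify one.
    pub const DEFAULT_WITH_BIAS: bool = true;
}
```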

dcvz (Contributor) left a comment:

Looks good overall! Let's align on the optionals so we can have one way of doing things, but I like the direction for them too.

I have a gut feeling we may have a better way to do some of the utils for the types we're using, but I'm happy to try to improve that once we start writing more examples and see where it could be more ergonomic.


/// Optional parameters for the `Conv1d` module.
#[derive(Debug, Clone, Default)]
pub struct Conv1dBuilder {
Collaborator (Author):

@dcvz What about this kind of builder pattern? We could get rid of the WithBias type this way, and we could probably apply it to all Modules that take optional args. For cross_entropy, we could make it a struct and then apply this approach?

minghuaw (Collaborator, Author) commented Oct 15, 2024:

Or we could have two different ways of handling optional args: use this builder pattern on all Module structs, and keep functions like cross_entropy as they are. I've added a builder-pattern impl of cross entropy; see that part for more details.
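
As a hedged sketch of the builder idea, where unset fields fall back to defaults at build time; the fields, default values, and the simplified Conv1d placeholder are assumptions for illustration, not the crate's actual API:

```rust
/// Simplified placeholder; the real Conv1d holds weight/bias Arrays.
pub struct Conv1d {
    pub stride: i32,
    pub padding: i32,
    pub with_bias: bool,
}

/// Optional parameters collected by a builder instead of a WithBias-style type.
#[derive(Debug, Clone, Default)]
pub struct Conv1dBuilder {
    stride: Option<i32>,
    padding: Option<i32>,
    with_bias: Option<bool>,
}

impl Conv1dBuilder {
    pub fn with_bias(mut self, with_bias: impl Into<Option<bool>>) -> Self {
        self.with_bias = with_bias.into();
        self
    }

    /// Unset fields fall back to their defaults.
    pub fn build(self) -> Conv1d {
        Conv1d {
            stride: self.stride.unwrap_or(1),
            padding: self.padding.unwrap_or(0),
            with_bias: self.with_bias.unwrap_or(true),
        }
    }
}
```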

struct TestStruct {
    #[optional(default_value = TestStruct::DEFAULT_OPT_FIELD_1)]
    opt_field_1: i32,
    #[optional(default_value = TestStruct::DEFAULT_OPT_FIELD_2)]
Contributor:

Do you think it's still useful to have consts for defaults here? The fact that we can annotate means we can quickly see what those defaults are going to be without the indirection. Or are you worried about perf?

Contributor:

Also because this is a macro, IDEs won't always be able to CMD+click into the definition.
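
To make the trade-off concrete, this is roughly what the two default styles reduce to once the macro has expanded; hand-written here with no macro involved, and the values are placeholders, not the crate's real defaults:

```rust
struct TestStruct {
    opt_field_1: i32,
    opt_field_2: i32,
}

impl TestStruct {
    const DEFAULT_OPT_FIELD_1: i32 = 1; // placeholder value
}

#[derive(Default)]
struct TestStructBuilder {
    opt_field_1: Option<i32>,
    opt_field_2: Option<i32>,
}

impl TestStructBuilder {
    fn build(self) -> TestStruct {
        TestStruct {
            // const indirection vs. a literal written at the annotation site;
            // either way the builder falls back to the default when unset.
            opt_field_1: self.opt_field_1.unwrap_or(TestStruct::DEFAULT_OPT_FIELD_1),
            opt_field_2: self.opt_field_2.unwrap_or(2),
        }
    }
}
```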

mlx-nn/src/activation.rs — outdated, resolved
mlx-macros/src/lib.rs — outdated, resolved
@minghuaw minghuaw merged commit 9279d3d into main Oct 20, 2024
3 checks passed
@minghuaw minghuaw deleted the api/mlx-nn-impl branch October 20, 2024 21:00