
Purpose of "smart_batching_collate" #2956

Closed
ShengYun-Peng opened this issue Sep 24, 2024 · 2 comments

@ShengYun-Peng

What is "smart_batching_collate" mainly used for, and how does it differ from the default collate function that only takes the tokenizer and pads the sentences in a batch?

@tomaarsen
Collaborator

Hello!

Good question. smart_batching_collate is implemented in fit_mixin.py, a file that exists to inject some old methods into the SentenceTransformer class. To clarify, training SentenceTransformer models used to be done using a fit method. In the v3 refactor we moved to primarily using a SentenceTransformerTrainer instead, but I didn't want to break backwards compatibility, so I updated the fit method to use the new Trainer.

But, because I thought there might still be some edge cases where people really want to use the original fit method, I also kept the original fit method, now named old_fit. This old_fit method will be removed in a future version, and fit will also be deprecated at some point; then we'll fully move to only the Trainer.

The smart_batching_collate method is a leftover from the original fit method, now implemented in old_fit. To be specific, it's used here:

# Use smart batching
for dataloader in dataloaders:
    dataloader.collate_fn = self.smart_batching_collate

So, this collator is used in the DataLoader instances. It is used to convert the InputExample instances that the DataLoader would return into torch tensors. The DataLoader (from torch) has no idea what to do with InputExample instances otherwise, so this collator was required.
Why was it called "smart"? No idea. It doesn't seem particularly smart to me, haha; it seems pretty standard.
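For illustration only, here is a minimal sketch of what a collator like this does. The `Example` class and the `collate` function below are hypothetical stand-ins, not the library's actual code: the real smart_batching_collate additionally runs the model's tokenizer on each text column and returns torch tensors, while this sketch stops at the transposed Python lists.

```python
from dataclasses import dataclass

# Hypothetical stand-in for sentence_transformers.InputExample:
# each example holds N parallel texts (e.g. [anchor, positive]) and a label.
@dataclass
class Example:
    texts: list
    label: float = 0.0

def collate(batch):
    """Transpose a list of examples into per-column text lists plus labels.

    A plain torch DataLoader would not know how to batch Example objects,
    which is why a custom collate_fn is needed at all.
    """
    num_texts = len(batch[0].texts)
    columns = [[ex.texts[i] for ex in batch] for i in range(num_texts)]
    labels = [ex.label for ex in batch]
    return columns, labels

batch = [
    Example(["a cat", "a feline"], 1.0),
    Example(["a dog", "a car"], 0.0),
]
columns, labels = collate(batch)
# columns -> [["a cat", "a dog"], ["a feline", "a car"]]
# labels  -> [1.0, 0.0]
```

Each column list can then be tokenized as one padded batch, which is the part the default collate function (tokenizer + padding) handles on its own.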

In short: it exists so that the old training method (now under old_fit) still works. It'll be removed in the future, perhaps with the v4 release or at some point sooner.

Hope that clears it up a bit.

  • Tom Aarsen

@ShengYun-Peng
Author

Thanks for the clear explanation! Haha, it is good to learn the development history. I'll close the issue.
