[mieb] Investigate voc2007 vlm2vec #1825

isaac-chung · 2025-01-16T13:58:08Z

LRAP scores increase as samples_per_label increases.
With samples_per_label=64, both the LoRA and full models reach about 0.72 LRAP (full slightly higher than LoRA).
With samples_per_label=64, voyage-multimodal-3 reaches 0.787 LRAP.
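
For readers unfamiliar with the metric, here is a minimal sketch of label ranking average precision (LRAP) in plain Python. The function name `lrap` and the toy data are illustrative assumptions, not code from this PR; in practice scikit-learn's `label_ranking_average_precision_score` computes the same quantity.

```python
def lrap(y_true, y_score):
    """Label ranking average precision.

    y_true:  list of multi-hot rows (1 = label applies to the sample)
    y_score: list of per-label score rows, higher = more confident
    """
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        # Degenerate rows (no true labels, or all labels true) count as 1.0,
        # the convention scikit-learn also follows.
        if not relevant or len(relevant) == len(truth):
            total += 1.0
            continue
        sample = 0.0
        for j in relevant:
            # Rank of label j among all labels (1 = highest score).
            rank = sum(1 for s in scores if s >= scores[j])
            # How many true labels are ranked at or above label j.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Toy example (made-up numbers): 3 images, 4 candidate labels.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_score = [[0.9, 0.1, 0.2, 0.8], [0.2, 0.3, 0.7, 0.1], [0.6, 0.4, 0.5, 0.1]]
score = lrap(y_true, y_score)
```

A score of 1.0 means every true label is ranked above every false label for every sample, so the jump from roughly 0.26 to 0.72 reflects true labels moving much closer to the top of each ranking.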

Checklist

Run tests locally to make sure nothing is broken using make test.
Run the formatter to format the code using make lint.

isaac-chung

@gowitheflow-1998 I've left some notes below to supplement the findings. It seems the LoRA run may have used a low samples_per_label (e.g. 8) instead of 64.

isaac-chung · 2025-01-16T15:32:28Z

mteb/abstasks/Image/AbsTaskImageMultilabelClassification.py

+        _unique_train_embeddings = normalize_embeddings_to_numpy(
+            model.get_image_embeddings(
+                unique_train_images,
+                **encode_kwargs,
+            )


Adding normalize_embeddings_to_numpy did not change the results. Will revert before merging.

isaac-chung · 2025-01-16T15:32:46Z

mteb/abstasks/Image/AbsTaskImageMultilabelClassification.py

+        X_test = normalize_embeddings_to_numpy(
+            model.get_image_embeddings(test_images, **encode_kwargs)
+        )


Same here; will revert this change.
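
As a rough illustration of why normalization can be a no-op, here is a sketch that assumes the helper L2-normalizes each embedding row (its implementation is not shown in this PR). `l2_normalize` and `cosine` are hypothetical stand-ins written for this example. Cosine similarity does not depend on vector length, so if the downstream comparison is cosine-based, or if the model already emits unit-norm vectors, scores would not move.

```python
import math


def l2_normalize(rows, eps=1e-12):
    """Scale each row to unit L2 norm (hypothetical stand-in for the helper)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / max(norm, eps) for x in row])
    return out


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


raw = [[3.0, 4.0], [1.0, 0.0]]  # made-up 2-d "embeddings"
unit = l2_normalize(raw)

# Cosine similarity is the same before and after normalization.
before = cosine(raw[0], raw[1])
after = cosine(unit[0], unit[1])
```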

isaac-chung · 2025-01-16T15:33:46Z

results/TIGER-Lab__VLM2Vec-Full/e9afa98002097ac2471827ba23ea1f2ddd229480/VOC2007.json

+        "languages": [
+          "eng-Latn"
+        ],
+        "lrap": 0.7205710375157255,


Using commits to track lrap scores for both models at different samples_per_label. Will remove before merging.

isaac-chung · 2025-01-16T15:33:58Z

mteb/tasks/Image/ImageMultilabelClassification/eng/PascalVOC2007.py

@@ -55,3 +55,5 @@ class VOC2007Classification(AbsTaskImageMultilabelClassification):

    # To be removed when we want full results
    n_experiments: int = 5
+
+    samples_per_label: int = 64


Main change.
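
To make the effect of samples_per_label concrete, here is a minimal sketch of per-label undersampling for a multilabel training set. It is an assumption about the general technique, not the code in AbsTaskImageMultilabelClassification; `undersample_indices` and the toy labels are hypothetical.

```python
import random
from collections import Counter


def undersample_indices(labels, samples_per_label, seed=0):
    """Pick training indices until every label has been seen
    samples_per_label times (or the data runs out).

    labels: list of label lists, one per training sample.
    """
    rng = random.Random(seed)
    order = list(range(len(labels)))
    rng.shuffle(order)

    counts = Counter()
    picked = []
    for i in order:
        # Keep the sample if at least one of its labels is still under quota.
        if any(counts[label] < samples_per_label for label in labels[i]):
            picked.append(i)
            counts.update(labels[i])
    return picked


# Toy data: 10 single-label "cat" samples and 10 single-label "dog" samples.
toy_labels = [["cat"]] * 10 + [["dog"]] * 10
small = undersample_indices(toy_labels, samples_per_label=3)
large = undersample_indices(toy_labels, samples_per_label=64)
```

With a quota of 3 the classifier sees only 6 of the 20 toy samples, while a quota of 64 keeps all of them. A larger quota gives the downstream classifier more training examples per class, which is consistent with the LRAP gains reported in this PR.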

isaac-chung added 5 commits January 16, 2025 12:43

baseline

0c0a923

normalize embeddings did not change results

849704d

using 16 samples_per_label improve results to 0.26-0.27

4251c67

using 32 samples_per_label improve results to 0.4

afed64d

using 64 samples_per_label improve results to 0.72

b2b1844

using samples_per_label=64 yields 0.787 lrap for voyage-multimodal-3

bd3bd2c
