**Question 1 - the padding may be assigned in an incorrect order**
During the ContextCrop process, the padding is assigned as follows:
UniDepth/unidepth/datasets/pipelines/transforms.py, lines 1275 to 1280 in 5afc0dc
which seems to be in the order [left, bottom, right, top], which does not match the following code:
UniDepth/unidepth/datasets/pipelines/transforms.py, line 1307 in 5afc0dc
I cannot tell whether this matters during training, because the transform operations are actually performed as follows, which is correct:
UniDepth/unidepth/datasets/pipelines/transforms.py, lines 1315 to 1320 in 5afc0dc
(It does matter in the validation process, which uses "paddings" and assumes the order [left, right, top, bottom]:)
UniDepth/unidepth/models/unidepthv1/unidepthv1.py, lines 157 to 165 in 5afc0dc
UniDepth/unidepth/utils/misc.py, lines 606 to 610 in 5afc0dc
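A toy reproduction of the mismatch (tensor sizes and variable names are illustrative, not the repo's code):

```python
import torch
import torch.nn.functional as F

x = torch.zeros(1, 1, 5, 7)    # (B, C, H, W)
paddings = [1, 3, 2, 0]        # produced as [left, bottom, right, top]

# Padding applied consistently with that order (F.pad takes (left, right, top, bottom)):
padded = F.pad(x, (paddings[0], paddings[2], paddings[3], paddings[1]))

# Validation-style unpadding that assumes [left, right, top, bottom]:
l, r, t, b = paddings          # wrong: r is actually bottom, t is actually right, ...
H, W = padded.shape[-2:]
unpadded = padded[..., t : H - b, l : W - r]
print(unpadded.shape[-2:] == x.shape[-2:])  # False -> the recovered crop has the wrong size
```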
**Question 2 - the "fully contained" condition may not be satisfied**
In the following code, the second branch condition "output_ratio / input_ratio * ctx > 1" can never be met, because output_ratio / input_ratio <= 1.0 and, in that branch, ctx < 1.0 (the ctx >= 1 case is handled first). Moreover, in the "fully contained" branch, the fully-contained condition may itself not be satisfied, because (using w = h * input_ratio):
new_h = new_w / output_ratio
      = w * (ctx * output_ratio / input_ratio) ** 0.5 / output_ratio
      = h * input_ratio * (ctx * output_ratio / input_ratio) ** 0.5 / output_ratio
      = h * (ctx * input_ratio / output_ratio) ** 0.5
Although ctx < 1.0, input_ratio / output_ratio is larger than 1.0, so ctx * input_ratio / output_ratio may exceed 1.0; new_h would then be larger than h, and the crop cannot be fully contained.
UniDepth/unidepth/datasets/pipelines/transforms.py, lines 1213 to 1225 in 5afc0dc
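For reference, the snippet at those lines is:

```python
if output_ratio <= input_ratio:  # out like 4:3 in like kitti
    if (
        ctx >= 1
    ):  # fully in -> use just max_length with sqrt(ctx), here max is width
        new_w = w * ctx ** 0.5
    # sporge un po in una sola dim (It.: sticks out a bit in one dimension only)
    # we know that in_width will stick out before in_height, partial overshoot (sporge)
    # new_h > old_h via area -> new_h ** 2 * ratio_new = old_h ** 2 * ratio_old * ctx
    elif output_ratio / input_ratio * ctx > 1:
        new_w = w * ctx
    else:  # fully contained -> use area
        new_w = w * (ctx * output_ratio / input_ratio) ** 0.5
    new_h = new_w / output_ratio
```

A quick numeric check of the claim (the KITTI-like shape and the value of ctx are my assumptions):

```python
h, w = 375, 1242            # KITTI-like input, input_ratio ≈ 3.31
input_ratio = w / h
output_ratio = 4 / 3        # output like 4:3, so output_ratio <= input_ratio
ctx = 0.8                   # ctx < 1; the elif is (4/3) / 3.31 * 0.8 ≈ 0.32, never > 1,
                            # so we fall into the "fully contained" branch
new_w = w * (ctx * output_ratio / input_ratio) ** 0.5
new_h = new_w / output_ratio
print(new_h)                # ≈ 528.7 > h = 375 -> the crop is NOT fully contained
```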
**Question 3 - Only half of the batched data are used for training?**
nsteps_accumulation_gradient is set to 1 in the training config, and batch_chunk is equal to batch_size, but the batch obtained from the dataloader (batches["data"]) actually has a doubled size of 2 * batch_size, since a pair of data items is loaded in ConcatDataset.__getitem__ (self.pairs is configured to 2). Thus, only the first half of batches["data"] is used for training.
UniDepth/scripts/train.py, line 258 in 5afc0dc
UniDepth/scripts/train.py, lines 423 to 430 in 5afc0dc
UniDepth/unidepth/datasets/utils.py, lines 49 to 55 in 5afc0dc
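A minimal sketch of the indexing arithmetic as I understand it (names and values are illustrative, not the exact training code):

```python
batch_size = 4
pairs = 2                                 # ConcatDataset yields a pair per index
nsteps_accumulation_gradient = 1
batch_chunk = batch_size                  # derived from batch_size, not from the actual batch

data = list(range(batch_size * pairs))    # what the dataloader actually delivers: 8 items
chunks = [
    data[i * batch_chunk : (i + 1) * batch_chunk]
    for i in range(nsteps_accumulation_gradient)
]
print(chunks)                             # [[0, 1, 2, 3]] -> items 4..7 are never consumed
```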
**Question 4 - SequenceDataset is not suitable for training because of the calculation of the SelfDistill loss**
The SelfDistill loss rests on the assumption that each pair of images in a chunk shows the same view, i.e., the two images originate from the same source image but with different crop/resize operations. However, for a given idx, super(ConcatDataset, self).__getitem__(idx) may load two different images from the underlying image sequence, because start is a random value between 0 and num_samples_sequence.
UniDepth/unidepth/ops/losses/distill.py, line 27 in 5afc0dc
UniDepth/unidepth/datasets/utils.py, lines 49 to 55 in 5afc0dc
UniDepth/unidepth/datasets/sequence_dataset.py, lines 219 to 224 in 5afc0dc
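A toy illustration of the randomness issue (function and variable names are mine, not the repo's):

```python
import random

def sequence_getitem(idx, num_samples_sequence=10):
    # Stand-in for the sampling in SequenceDataset.__getitem__: each call draws its own start.
    start = random.randint(0, num_samples_sequence - 1)
    return start  # represents which frame of the sequence is returned

idx = 3
frame_a = sequence_getitem(idx)  # first element of the pair
frame_b = sequence_getitem(idx)  # second element of the pair
print(frame_a == frame_b)        # often False -> the pair is not the same view,
                                 # breaking the SelfDistill same-view assumption
```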