Preparing batch execution for Zau for the odometry paper #986

Closed
brunofavs opened this issue Sep 20, 2024 · 12 comments
Labels
discussion Issue for discussing a topic

Comments

@brunofavs
Collaborator

This issue is a follow-up to #983, which was getting too long.

We found some issues that were preventing us from successfully calibrating Zau when using dataset splits containing only the later collections. This was fixed in #984: the initial estimate of the pattern poses was being computed before the odometry was corrected.

Regardless, to make sure the problem in #984 is now fixed, I ran calibrations on groups of successive collections and checked whether the results were similar. They were.

The command used to run the calibrations was the following:

export CSF="lambda x: int(x) in range(0,11)"
export DATASET=$ATOM_DATASETS/zau/inesc_day2_5_full/dataset_corrected_with_odometry_and_depth_and_rgb_and_pattern_poses.json && rosrun atom_calibration calibrate \
-json $DATASET -uic -v \
-csf "$CSF" -ssf "lambda x: x in ['rgb_body_left','rgb_body_right','rgbd_hand_color','rgbd_hand_depth','lidar_body']" \
 -ftol 1e-4 -xtol 1e-4 -gtol 1e-4

(The CSF function was the only thing modified between runs)
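For illustration, the per-split CSF values would look something like this; only the first line is copied verbatim from the command above, the others are reconstructed from the split headings below:

export CSF="lambda x: int(x) in range(0,11)"    # collections 0 to 10
export CSF="lambda x: int(x) in range(11,21)"   # collections 11 to 20
export CSF="lambda x: int(x) in range(21,31)"   # collections 21 to 30
export CSF="lambda x: int(x) in range(31,41)"   # collections 31 to 40
export CSF="lambda x: int(x) >= 40"             # collection 40 to the end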

And these were the results:

Collection 0 to 10

+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0109     |       7.3352       |       10.1972       |        2.9100        |        0.0078       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

Collection 11 to 20

+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0243     |       3.3670       |       27.4123       |       26.8969        |        0.0258       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

Collection 21 to 30

+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0143     |      17.8234       |       11.4869       |        3.7402        |        0.0101       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

Collection 31 to 40

+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0228     |      26.7431       |       22.5269       |        8.9211        |        0.0193       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

Collection 40 to end

+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0123     |       7.0928       |        7.5322       |        6.1527        |        0.0097       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

The results show some disparity between them, but this is expected, as each run uses only 10 collections and some of those are still bad collections.


Running with all the collections

Afterwards, I did successive runs, filtering out the bad collections, and we can see that the error is lower now:

Filtered collections : [19,24,31,33,47,0,12,34]

(screenshot of the residuals table for the run with the filtered collections)

In comparison to the last comment made in #983:

lidar_body: 0.0137 m -> 0.0107 m (3 mm better)
rgb_body_left: 5.6301 px -> 5.3352 px (improves slightly)
rgb_body_right: 8.1524 px -> 4.6795 px (improves greatly)
rgbd_hand_color: 5.2223 px -> 3.2660 px (improves)
rgbd_hand_depth: 0.0086 m -> 0.0079 m (improves ever so slightly)

Total error in px: 18.9 px -> 13.4 px

Also, we can see in the printed table that there are no significant outliers, so further filtering should not be necessary.

With this, I'm pretty sure we are ready to move to the batch executions. What do you think @miguelriemoliveira @Kazadhum?

@miguelriemoliveira
Member

> Regardless, to make sure the problem in #984 is now fixed, I ran calibrations on groups of successive collections and checked whether the results were similar. They were.

This is great news. That means #984 is working well.

> Total error in px: 18.9 px -> 13.4 px

> Also, we can see in the printed table that there are no significant outliers, so further filtering should not be necessary.

This seems ok. Please create a new dataset without these collections.

> With this, I'm pretty sure we are ready to move to the batch executions. What do you think @miguelriemoliveira @Kazadhum?

One additional request we could perhaps look at: the normalizer of the rgb errors w.r.t. the normalizer for the metric sensors.

Now we get results like 1 centimeter in distance sensors, and 5 pixels in rgb sensors.
Perhaps we would prefer results like 1 pixel and 2 centimeters.

I created a new flag in calibrate called rgb_normalizer_multiplier which will do this.
If you set the normalizer multiplier to < 1.0, it will increase the weight of the errors in pixels.
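For illustration, a hedged sketch of how this could be invoked; the short flag -rnm is the form used in the runs further down, and the rest of the command just mirrors the one above:

# Values below 1.0 scale down the rgb normalizer, increasing the weight of the pixel errors.
rosrun atom_calibration calibrate -json $DATASET -uic -v \
    -csf "$CSF" -ssf "lambda x: x in ['rgb_body_left','rgb_body_right','rgbd_hand_color','rgbd_hand_depth','lidar_body']" \
    -ftol 1e-4 -xtol 1e-4 -gtol 1e-4 -rnm 0.5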

@brunofavs can you try to see if we get (even) better results?

@miguelriemoliveira
Member

Pushed with the wrong issue number: 52af175

@Kazadhum
Collaborator

Hi @miguelriemoliveira and @brunofavs!

I think we can continue towards batch executions, as the results are indeed better than they were a week ago. But I don't think we're quite there yet.

First, if you can achieve even better results by experimenting with the normalizer multiplier, then I think that's a great idea (and let us know which value you used, because it might yield some interesting conclusions!)

But secondly, what we're looking at in these tables are residuals. These being low tells us the calibration is probably good, but we still need to evaluate the results. I'd say that, before running batch executions, you should run some experiments where you randomly split the dataset into train and test datasets and then run the evaluations. Effectively, this is what each run in the batch executions will be doing. I think it would be good to make sure the evaluations are returning good results before moving on to batch executions.

So, to summarize, I'd do this:

  1. Create a new "clean" or, rather, filtered dataset, without collections [19,24,31,33,47,0,12,34];
  2. Write down the command you will use for splitting the dataset, calibrating the train dataset and then evaluating it (this will be similar to the template file for the batch executions; see the sketch after this list);
  3. Run it, and if you get good results, then I think we can move on to batches!
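
A possible sketch of item 2, assuming the train/test split is expressed with the same -csf collection filters that calibrate already accepts, and using the filtered dataset path that appears further down the thread. The split scheme and the TRAIN_CSF/TEST_CSF variable names are only illustrative, and the evaluation step is left as a placeholder since the exact evaluation command was still being discussed below:

export DATASET=$ATOM_DATASETS/zau/filtered/dataset_corrected_with_odometry_and_depth_and_rgb_and_pattern_poses_filtered.json
# Illustrative deterministic split (a random split could be generated the same way):
export TRAIN_CSF="lambda x: int(x) % 5 != 0"   # roughly 80% of the collections for calibration
export TEST_CSF="lambda x: int(x) % 5 == 0"    # remaining collections held out for evaluation

# Calibrate on the train collections only.
rosrun atom_calibration calibrate -json $DATASET -uic -v \
    -csf "$TRAIN_CSF" \
    -ssf "lambda x: x in ['rgb_body_left','rgb_body_right','rgbd_hand_color','rgbd_hand_depth','lidar_body']" \
    -ftol 1e-4 -xtol 1e-4 -gtol 1e-4

# Then evaluate the calibrated result on the test collections (placeholder: the specific
# evaluation script and its flags depend on the sensor pair being evaluated).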

Do you agree @miguelriemoliveira @brunofavs?

@miguelriemoliveira
Member

Hey,

I think @Kazadhum gave a very good suggestion. We need to see if the evaluations are ok before moving to the batch processing.

@brunofavs
Collaborator Author

Hey, sorry for the absence over the past 2 days, I had some personal things to deal with.

> Now we get results like 1 centimeter in distance sensors, and 5 pixels in rgb sensors.
> Perhaps we would prefer results like 1 pixel and 2 centimeters.

Yeah 1 pixel and 2 centimeters looks more appealing in a paper than 5 pixels and 1 centimeter for sure.

>   • Create a new "clean" or, rather, filtered dataset, without collections [19,24,31,33,47,0,12,34];
>   • Write down the command you will use for splitting the dataset, calibrating the train dataset and then evaluating it (this will be similar to the template file for the batch executions);
>   • Run it, and if you get good results, then I think we can move on to batches!

I agree with what @Kazadhum said as well, and I will be doing the tasks he summarized today.

I will give feedback as soon as I have something to show :)

brunofavs added a commit that referenced this issue Sep 25, 2024
This script works in a counter-intuitive way. The -csf function should
select the collections the user wants to remove, not the ones they want
to keep, because the name is "remove_collections_from_atom_dataset", not
"preserve_collections_from_atom_dataset".
@brunofavs
Collaborator Author

Hey.

I filtered the dataset already.
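For reference, a hedged sketch of how the filtering could have been done with the remove_collections_from_atom_dataset script mentioned in the commit above. The -csf behaviour comes from that commit message; the exact invocation and remaining flags are assumptions:

# Per the commit above, -csf selects the collections to REMOVE, not the ones to keep.
rosrun atom_calibration remove_collections_from_atom_dataset \
    -json $ATOM_DATASETS/zau/inesc_day2_5_full/dataset_corrected_with_odometry_and_depth_and_rgb_and_pattern_poses.json \
    -csf "lambda x: int(x) in [19,24,31,33,47,0,12,34]"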

About the normalizer multiplier.

export DATASET=$ATOM_DATASETS/zau/filtered/dataset_corrected_with_odometry_and_depth_and_rgb_and_pattern_poses_filtered.json && rosrun atom_calibration calibrate \
-json $DATASET -uic -v \
-ssf "lambda x: x in ['rgb_body_left','rgb_body_right','rgbd_hand_color','rgbd_hand_depth','lidar_body']" \
 -ftol 1e-6 -xtol 1e-6 -gtol 1e-6 -rnm 1

These first results are with ftol, xtol and gtol at 1e-6 to quickly assess whether the normalizer works. Even like this, each calibration already takes around 16 minutes (running simultaneously).

-rnm 1 (in theory without any effect)

  • Normalizer for lidar3d: 0.199703513281753
  • Normalizer for depth: 0.14716268425435777
  • Normalizer for rgb: 164.97977481201676
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0107     |       4.9642       |        4.4450       |        3.2032        |        0.0079       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

-rnm 0.5

Modifiers:

  • Normalizer for depth: 0.14716268425435783
  • Normalizer for rgb: 82.48988740600838
  • Normalizer for lidar3d: 0.19970351328175295
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
| Collection | lidar_body [m] | rgb_body_left [px] | rgb_body_right [px] | rgbd_hand_color [px] | rgbd_hand_depth [m] |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+
|  Averages  |     0.0107     |       4.0851       |        3.5622       |        3.0086        |        0.0080       |
+------------+----------------+--------------------+---------------------+----------------------+---------------------+

The normalizer multiplier seems to be working: with -rnm 0.5 the rgb normalizer is halved (164.98 -> 82.49) and the pixel errors dropped accordingly. Interestingly, the metric errors didn't seem to change; I expected them to increase as a tradeoff.

About the evaluations.

I was talking with @Kazadhum about which evaluations to do, since I don't think the full evaluation script is working well, but we didn't come to any good conclusion.

With the 5 sensors we have, there are already 10 possible pairwise combinations, so I don't know if it's worth testing every single one of them. I suggested trying one evaluation between every possible modality, but @Kazadhum argued that it might not be conclusive due to the very different poses of the different sensors.

What is your opinion on this @miguelriemoliveira ?

@miguelriemoliveira
Member

miguelriemoliveira commented Sep 26, 2024 via email

@brunofavs
Collaborator Author

Hey @miguelriemoliveira @Kazadhum !

About the evaluations let's talk in person.

Are you both available to discuss this sometime Monday or Tuesday?

> How about 0.1 or 0.01? What are the results in those cases?

Will do these soon.

@Kazadhum
Collaborator

Hi @brunofavs! I can meet on either of those days!

@miguelriemoliveira
Member

Good morning. How about today Monday at 14h?

@Kazadhum
Collaborator

Good morning! Sounds good to me!

@brunofavs
Collaborator Author

Works for me as well.

Let's do it at 14h then.

@brunofavs brunofavs added the discussion Issue for discussing a topic label Oct 3, 2024
@brunofavs brunofavs mentioned this issue Oct 3, 2024