Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

As a user, I want to throw an WARNING when Time_Coordinates.start_time does not match or come before Time_Coordinates.stop_time #964

Open
jstone-psi opened this issue Aug 2, 2024 · 15 comments

Comments

@jstone-psi
Copy link

jstone-psi commented Aug 2, 2024

Checked for duplicates

Yes - I've already checked

πŸ§‘β€πŸ”¬ User Persona(s)

Data Archivist, Data Provider

πŸ’ͺ Motivation

...so that I can ensure the start time comes before the stop time.

πŸ“– Additional Details

From original bug ticket: from @jstone-psi:

I was asked by one of our archivists if validate ensures that start_date_time occurs before stop_date_time. I tested this, and found that it does not enforce it.

I expected either an error or a warning if the stop_date_time occurred before the start_date_time.

There are cases where this may be difficult due to precision mismatches in both dates. In that case I would expect a warning regarding the difference in precision, or an attempt to compare them after accounting for the difference.

Acceptance Criteria

Given a label with a start_time > stop_time
When I perform label validation
Then I expect validate to raise an WARNING

βš™οΈ Engineering Details

@matthewtiscareno
Copy link

Do you mean if the stop time attribute appears in the label before the start time attribute? Or do you mean if the stop time is later in time than the start time?

@jordanpadams
Copy link
Member

jordanpadams commented Aug 3, 2024

@jstone-psi per @matthewtiscareno comment, if it is the latter, this opens a bit of a can of worms. Per this comment:

There are cases where this may be difficult due to precision mismatches in both dates. In that case I would expect a warning regarding the difference in precision, or an attempt to compare them after accounting for the difference.

The problem here is there is nowhere in the standard that requires the precision be the same, we can add that check, but we will also need to add a flag to disable it since some people may not care about that and/or past data sets didn't take that into account.

We will need a set of test cases from the nodes in order to accurately test this (different time formats, different time precisions, invalid times, etc.). I could imagine we implement this and then get 5 bugs submitted because of corner cases we were unaware of.

@jstone-psi
Copy link
Author

@matthewtiscareno @jordanpadams I did mean the latter.

I understand this is a can of worms, and that there are a lot of corner cases that we would have to deal with. This would still be valuable to have in the general case, when everything matches up correctly. (I'd like to believe that this would happen most of the time).

What would be the next step for this? Somewhere like DDWG or SWG, perhaps?

@jordanpadams
Copy link
Member

@jstone-psi I will add this to our next SWG agenda. If you wouldn't mind speaking to it, and then we can ask the nodes for some of those corner cases.

@jstone-psi
Copy link
Author

Understood. I will do my best to be there.

@jordanpadams jordanpadams added p.could-have icebox and removed bug Something isn't working labels Aug 9, 2024
@jordanpadams jordanpadams changed the title Validate does not enforce date order in Time_Coordinates As a user, I want to throw an error/warning when Time_Coordinates.start_time does not come before Time_Coordinates.stop_time Aug 9, 2024
@jordanpadams jordanpadams changed the title As a user, I want to throw an error/warning when Time_Coordinates.start_time does not come before Time_Coordinates.stop_time As a user, I want to throw an error/warning when Time_Coordinates.start_time does not match or come before Time_Coordinates.stop_time Aug 9, 2024
@jordanpadams jordanpadams added the requirement New requirements label Aug 9, 2024
@jordanpadams
Copy link
Member

@jstone-psi FYI, I revised this ticket to be a new requirement vs. a bug fix

@jstone-psi
Copy link
Author

Here are some notes that I prepared for the discussion:
PDS4DateOrderValidation.md

@jordanpadams jordanpadams changed the title As a user, I want to throw an error/warning when Time_Coordinates.start_time does not match or come before Time_Coordinates.stop_time As a user, I want to throw an WARNING when Time_Coordinates.start_time does not match or come before Time_Coordinates.stop_time Aug 14, 2024
@jstone-psi
Copy link
Author

Here are some examples I found in our archive:

differing precisions:

https://sbnarchive.psi.edu/pds4/non_mission/compil.ast.radar.shape-models/bundle_compil.ast.radar.shape-models.xml
https://sbnarchive.psi.edu/pds4/non_mission/compil.tno-centaur.lightcurves/bundle_compil.tno-centaur.lightcurves.xml
https://sbnarchive.psi.edu/pds4/non_mission/gaskell.ast-eros.shape-model/bundle_gaskell.ast-eros.shape-model.xml
https://sbnarchive.psi.edu/pds4/non_mission/gaskell.ast-eros.shape-model_V1_1/bundle_gaskell.ast-eros.shape-model.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast-1999td10.images-lightcurves/bundle_gbo.ast-1999td10.images-lightcurves.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast-1999td10.images-lightcurves_V2_0/bundle_gbo.ast-1999td10.images-lightcurves.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast-mb.reddy.spectra/bundle_gbo.ast-mb.reddy.spectra.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast-neo.ondrejov.lightcurves/bundle_gbo.ast-neo.ondrejov.lightcurves.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast.chamberlain.sub-mm-lightcurves/bundle_gbo.ast.chamberlain.sub-mm-lightcurves.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast.fieber-beyer.spectra/bundle_gbo.ast.fieber-beyer.spectra.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast.fieber-beyer.spectra_V2_0/bundle_gbo.ast.fieber-beyer.spectra.xml
https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast.primass-l.spectra_V1_0/bundle_gbo.ast.primass-l.spectra.xml
https://sbnarchive.psi.edu/pds4/non_mission/hst.ast-ceres.images-albedo-shape_V1_0/bundle_hst.ast-ceres.images-albedo-shape.xml
https://sbnarchive.psi.edu/pds4/non_mission/ast-sat.thomas.shape-models_V1_0/bundle_ast-sat.thomas.shape-models.xml
https://sbnarchive.psi.edu/pds4/non_mission/ast_spectra_reddy_neos_marscrossers_V1_0/bundle_ast_spectra_reddy_neos_marscrossers.xml

nil start date

https://sbnarchive.psi.edu/pds4/non_mission/gbo.ast.24-color-survey/bundle_gbo.ast.24-color-survey.xml

nil stop date

https://sbnarchive.psi.edu/pds4/non_mission/ast-high-inclination.gil-hutton.families/bundle_ast-high-inclination.gil-hutton.families.xml

@jstone-psi
Copy link
Author

Here's one that incudes negative dates. It looks like this is from an earlier version of PDS that didn't support them, so the start date in the label is 0. All of the dates in this product are precise only to the year, so we won't be able to exercise comparing a negative date with different precision.

https://pdssbn.astro.umd.edu/holdings/pds4-compil-comet:phys_char-v1.0/data/

@matthewtiscareno
Copy link

@jstone-psi: I only spot-checked your examples, but I didn't find any where start_date_time was later than stop_date_time. Is there any reason to worry about the precision mismatch in these cases?

It seems to me that there should be an error (not a warning) if start_date_time is later than stop_date_time. I can't imagine a precision mismatch being an excuse for this, though I'm interested in any counter-examples that someone might offer. On the other hand, a precision issue could be a good reason for start_date_time to be equal to stop_date_time.

Your original proposal was simply to throw an error or a warning (I favor the former) if start_date_time is later than stop_date_time. Why not simply go forward with that and not worry so much about precision issues?

@matthewtiscareno
Copy link

By the way, your two examples of nil start_date_time or nil stop_date_time are troubling and perhaps should be fixed if not made into a validation error.

@jstone-psi
Copy link
Author

@matthewtiscareno: The examples I posted were only to illustrate real-world labels where the precision was different (primarily for the sake of developing test cases). If I understood the issue from our archiving team correctly, then the instances where the start_date_time was later was data that was still being developed.

I agree that situations where start_date_time is later than stop_date time should be warnings/errors. The problem is that when the precision doesn't match, it becomes difficult to say definitively whether it actually is later. The example that I used in the notes above was 2023 vs 2023-12-31. Is 2023 earlier or later? It's hard to say. And this is why the issue isn't straightforward.

I would agree about the nil start_date_time or stop_date_time alone being an issue. I think that we've been ok with it when they are both nil for derived data.

@matthewtiscareno
Copy link

@jstone-psi: I suggest that such a test be implemented by simply reducing the more precise value to the precision of the less precise value. If they are equal (as in your example), then it's no problem.

So is the conclusion that Validate does not test for start time later than stop time (and probably should), but that no actual examples of such a thing are currently known?

@matthewtiscareno
Copy link

I see from recent Software WG notes that you are also interested in discussing what to do more generally when there is a precision mismatch. That's a more nuanced topic, perhaps. Should it be a separate issue?

@jstone-psi
Copy link
Author

@matthewtiscareno

  • Agreed, reducing the precision should work. I believe that the main drawback to that issue is performance. Realistically, the impact on performance is probably small.

  • I am not aware of any data that has made it to production where the start time is later than the stop time. This was an issue because we had a close call, where data was submitted that had this error, but it was caught through manual review. If you would like an example of that, then @beatricemueller may be able to provide some.

  • I think that the precision mismatch issue is only here because edge/corner cases tend to dominate software development. I would say that it's reasonable for most observational products to have matching precisions, but there are enough cases where they can be mismatched (especially at SBN) where I wouldn't actively push for any enforcement on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ToDo
Development

No branches or pull requests

3 participants