
Off-by-one error in decompress_backfill #23

Open
drfraser opened this issue Sep 10, 2021 · 1 comment
@drfraser

Hi,

TimescaleDB version: 2.4.1
Postgres version: 12

I decided to see if decompress_backfill() would handle appending new data. We are not using TimescaleDB in the usual way: we use it to store floating point values associated with a date (times and timestamps are not involved). For various reasons, it is best if the last chunk of a hypertable is compressed like the rest of them, so something that can automatically decompress the last chunk and add new data to it would be useful.
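For context, a minimal sketch of a setup like the one described (the table and column names are reconstructed from the index names in the debug output below; the chunk interval and value column are guesses):

```sql
-- Hypothetical reconstruction of the hypertable described above:
-- partitioned on a plain date column, with compression enabled.
CREATE TABLE era5_ta_0002m_adm0_ins_01m (
    id      bigint GENERATED ALWAYS AS IDENTITY,
    geocode text NOT NULL,
    dateof  date NOT NULL,
    ta      double precision  -- the floating point value; name is a guess
);

-- Partition on the date column; the chunk interval here is illustrative.
SELECT create_hypertable('era5_ta_0002m_adm0_ins_01m', 'dateof',
                         chunk_time_interval => INTERVAL '25 years');

ALTER TABLE era5_ta_0002m_adm0_ins_01m SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'geocode'
);
```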

Everything works great in general. But when appending a new set of data to a hypertable, decompress_backfill leaves off the last subset of the new data, the rows carrying the very last date. E.g. the hypertable contains dates from 1950-01-01 to 2020-12-31, and data from 2021-01-01 to 2091-12-31 is appended. Everything is added correctly, and new chunks are created as necessary, except for the last set of rows with a date of 2091-12-31.
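The call looks roughly like this (a sketch; the named parameters follow the procedure's declaration in timescaledb-extras, but check backfill.sql for the exact signature and defaults):

```sql
-- New rows are loaded into a staging table with the same schema,
-- then moved into the compressed hypertable.
CALL decompress_backfill(
    staging_table          => 'temp_era5_ta_0002m_adm0_ins_01m',
    destination_hypertable => 'era5_ta_0002m_adm0_ins_01m'
);
```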

```sql
select count(*) from temp_era5_ta_0002m_adm0_ins_01m; -- the staging table
```

This returns 238. I would have expected to see 0 rows.

Turning on debug and notices in decompress_backfill shows:

```
DEBUG....
DEBUG: building index "pg_toast_237927_index" on table "pg_toast_237927" serially
DEBUG: building index "compress_hyper_36_73_chunk__compressed_hypertable_36_geocode__t" on table "compress_hyper_36_73_chunk" serially
DEBUG: building index "_hyper_35_69_chunk_era5_ta_0002m_adm0_ins_01m_geocode_dateof_id" on table "_hyper_35_69_chunk" serially
DEBUG: building index "compress_hyper_36_73_chunk__compressed_hypertable_36_geocode__t" on table "compress_hyper_36_73_chunk" serially
DEBUG: verifying table "_hyper_35_74_chunk"
DEBUG: EventTriggerInvoke 127623
DEBUG: building index "_hyper_35_74_chunk_era5_ta_0002m_adm0_ins_01m_geocode_dateof_id" on table "_hyper_35_74_chunk" serially
NOTICE: 65450 rows moved in range '2068-07-25' to '2091-12-31'
```

Looking at the script, r_end is obviously not quite right, but as far as I can tell without debugging the procedure, the fact that we are using dates instead of times or timestamps should not matter. Shouldn't range_end be the maximum acceptable value for the chunk, not the maximum time value in the data?
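One way to see the distinction (a hypothetical check, using the standard TimescaleDB 2.x information view):

```sql
-- The chunk's declared upper bound...
SELECT chunk_name, range_start, range_end
FROM timescaledb_information.chunks
WHERE hypertable_name = 'era5_ta_0002m_adm0_ins_01m'
ORDER BY range_end DESC
LIMIT 1;

-- ...versus the maximum value actually present in the data.
SELECT max(dateof) FROM era5_ta_0002m_adm0_ins_01m;
```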

If decompress_backfill is not supposed to handle appending data, then please close this issue. Otherwise, a new parameter to make the transferred data inclusive of the last time/date would be useful. Or fix the issue, if there is one.

If I figure this out, I will update this issue.

@drfraser (Author)

OK, I think I figured it out: r_end is translated into a date, i.e. 2091-12-31. The cleanup that catches the last set of rows (see line 275) adds 1 to r_end, but the result still gets translated to the string '2091-12-31'.
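A quick illustration of the truncation (hypothetical; it shows the effect, not the procedure's internal arithmetic):

```sql
-- Adding one microsecond to the boundary and converting back to a date
-- truncates to the same day, so "dateof < r_end" still excludes it.
SELECT ('2091-12-31'::date + INTERVAL '1 microsecond')::date;
-- => 2091-12-31
```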

The value added on that line ought to depend on the unit of the time column. For me, an addition of 86400 seconds is needed, not 1 second. Or rather the equivalent in microseconds, it seems.
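A sketch of the kind of adjustment meant here (not an actual patch; the idea is to add one unit of the column's resolution rather than one microsecond):

```sql
-- For a date column, bump the exclusive upper bound by one full day
-- (86400 seconds) so the final date is included.
SELECT ('2091-12-31'::date + INTERVAL '1 day')::date;
-- => 2092-01-01, so "dateof < r_end" now picks up rows dated 2091-12-31
```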
