
Releases: AliceO2Group/Control

v0.68.1

09 Jun 11:32
@teo teo

This patch release fixes an issue with default values for resource limits.

  • [core] Explicitly default to infinite cpu/mem resource limits
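
The defaulting behavior described above can be illustrated with a minimal Python sketch. It is not taken from the AliECS code base: the `ResourceLimits` type, its field names and the `effective_limits` helper are hypothetical, and the sketch assumes that an unspecified limit should be treated as "no limit", represented here as positive infinity.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceLimits:
    """Hypothetical per-task-class limits; None means 'not specified'."""
    cpu: Optional[float] = None     # CPU cores
    memory: Optional[float] = None  # memory, in MB


def effective_limits(limits: Optional[ResourceLimits]) -> ResourceLimits:
    """Replace every unspecified limit with +inf, i.e. 'no limit'."""
    if limits is None:
        return ResourceLimits(cpu=math.inf, memory=math.inf)
    return ResourceLimits(
        cpu=math.inf if limits.cpu is None else limits.cpu,
        memory=math.inf if limits.memory is None else limits.memory,
    )
```

Making the default explicit means a task class that specifies only one limit, or none at all, still ends up with a well-defined value for both.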

v0.68.0

07 Jun 12:20
@teo teo

This release adds cgroups-based CPU and memory limits for tasks. The feature requires a Mesos agent configuration change and ensures that a misbehaving task cannot block the whole FLP.

  • Resource limits:
    • [core] Support Mesos task resource limits specification for task classes
    • [core] Prevent crash in incomplete limits
    • [core] Print limits to IL
    • [core] Avoid triggering dead or inactive hooks on teardown
    • [core] Proceed with task kill even if some cannot be killed
    • [core] Explicit handling of executor/agent failed events
    • [core] Only perform a STOP transition for ACTIVE tasks
    • [core] Wait for 500ms for ERROR states to settle before GO_ERROR/STOP

v0.67.2

07 Jun 12:14
@teo teo

This release includes a crash fix and a build fix for FairMQ 1.5.x.

  • [build] Compile with FairMQ 1.5.x
  • [core] Fix crash caused by map contention in Bookkeeping plugin

v0.67.1

25 May 08:01
@teo teo

This patch release increases the command timeout for the CONFIGURE transition and fixes a run number acquisition issue in the CCDB plugin.

  • [core] Do not complain when CCDB plugin cannot get a run number
  • [core] Increase CONFIGURE transition timeout to 120s

v0.67.0

15 May 13:28
@teo teo

This release adds support for internal error events raised by tasks. Such an event immediately transitions the environment to the ERROR state.

  • Task error events:
    • [core] React to TASK_INTERNAL_ERROR with STOP_ACTIVITY attempt
    • [core] Build TaskInternalError event
    • [executor] Support TASK_INTERNAL_ERROR event
    • [occ] Push TASK_INTERNAL_ERROR event
    • [occ] Only emit task internal error event once
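
The event flow in this release can be sketched in Python as follows. This is an illustration under stated assumptions, not the AliECS or OCC implementation: `InternalErrorReporter`, `report` and `next_environment_state` are hypothetical names, and the sketch models only two behaviors from the notes above, namely that a task emits the event at most once and that the environment moves to ERROR when it receives it.

```python
from typing import Callable


class InternalErrorReporter:
    """Hypothetical task-side helper that reports an internal error at most once."""

    def __init__(self, push_event: Callable[[str], None]) -> None:
        self._push_event = push_event
        self._already_emitted = False

    def report(self) -> bool:
        """Push TASK_INTERNAL_ERROR the first time only; return True if pushed."""
        if self._already_emitted:
            return False
        self._already_emitted = True
        self._push_event("TASK_INTERNAL_ERROR")
        return True


def next_environment_state(current_state: str, event: str) -> str:
    """Hypothetical environment-side reaction: an internal task error means ERROR."""
    if event == "TASK_INTERNAL_ERROR":
        return "ERROR"
    return current_state
```

Guarding the emission on the task side keeps a task that fails repeatedly from flooding the core with identical events.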

v0.66.0

04 May 10:08
@teo teo

This release includes crash fixes and improvements to integration plugins.

  • Integration:
    • [apricot] PEDESTALS is now PEDESTAL
    • [core] Add pdp_epn_shm_sizes param to ODC plugin
  • Bug fixes:
    • [core] Prevent rare crash in resourceOffers tasksDeployed access
    • [core] Bail early if a critical error occurs in a transition step
    • [core] Prevent crash in Kafka plugin concurrent map access

v0.65.1

25 Apr 11:50
@teo teo

This patch release includes an improvement to the deployment sequence, adding a general task cleanup before most other operations.

  • [core] Perform cleanup right before environment deployment
  • [executor] Notify InfoLogger on task END_OF_STREAM event

v0.65.0

18 Apr 16:33
@teo teo

This release brings important changes to the behavior of the scheduler component of the AliECS core. Specifically, a new UNDEPLOYABLE status has been added for tasks that cannot be deployed due to cluster conditions, and the task scheduling algorithm has been reworked to fail early when possible.

  • AliECS core scheduler improvements:
    • [core] Treat undeployable task separately from plain inactive
    • [core] Complete product operation for UNDEPLOYABLE status
    • [core] Incoming offers preprocessing for early failure
  • Miscellaneous:
    • [executor] Log end of life
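
The fail-early idea behind the scheduler changes in v0.65.0 can be sketched in Python. This is a simplified illustration, not the AliECS scheduler: it assumes that each task is pinned to one host and that a task is undeployable when no incoming offer comes from that host. The function names and the `DEPLOYABLE` label are hypothetical; only `UNDEPLOYABLE` appears in the release notes.

```python
from typing import Dict, Set


def classify_tasks(required_host_by_task: Dict[str, str],
                   offered_hosts: Set[str]) -> Dict[str, str]:
    """Label each task by whether any offer exists for the host it needs."""
    status = {}
    for task, host in required_host_by_task.items():
        status[task] = "DEPLOYABLE" if host in offered_hosts else "UNDEPLOYABLE"
    return status


def can_attempt_deployment(status: Dict[str, str]) -> bool:
    """Fail early: give up before deploying if any task is undeployable."""
    return all(label != "UNDEPLOYABLE" for label in status.values())
```

Checking the offers before launching anything lets a deployment be rejected up front instead of failing after part of the environment has already started.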

v0.64.2

13 Apr 12:00
@teo teo

This release includes bug fixes for DCS integration, for the executor, and for a race in the repository access layer.

  • [core] Do not write back to varSpecMap coming from repos backend
  • [core] Update run types enum in Bookkeeping client
  • [core] Regenerate DCS protofile (AGD detector)
  • [core] Inform user if offer includes multiple executors
  • [executor] Do not cause executor disconnect on unprocessable MESSAGE

v0.64.1

30 Mar 12:52
@teo teo

This release includes miscellaneous bug fixes.

  • [core] Prevent crash on bad traits in configureTasks
  • [core] Set EOR time also before DESTROY
  • [doc] Update the SM diagram and document SM callbacks
  • [OCTRL-770][core] Set run end time after trg end time