
Automatic stageout of tape data

Stefano Belforte edited this page Jul 22, 2024 · 7 revisions


The idea is to allow users to access data on tape. Once the user submits a task, we put it in the TAPERECALL state and create a rule in Rucio to bring the input data to disk. When the rule state is OK, the task is set back to state NEW and goes through the data discovery step again, this time finding the data on disk and proceeding to submission.
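
The flow described above can be sketched as two small transition functions. This is only an illustration, not CRAB code: TAPERECALL and NEW are the task states named above and OK is the rule state, while the function names and the PROCEED_TO_SUBMISSION label are hypothetical.

```python
# Minimal sketch of the task state flow; not taken from the CRAB code base.
TAPERECALL = "TAPERECALL"
NEW = "NEW"
# Hypothetical label for "data found on disk, go on to submission";
# it is NOT a real CRAB task status.
PROCEED_TO_SUBMISSION = "PROCEED_TO_SUBMISSION"


def after_data_discovery(input_on_disk: bool) -> str:
    """Outcome of data discovery for a task in state NEW."""
    return PROCEED_TO_SUBMISSION if input_on_disk else TAPERECALL


def after_rule_check(rule_state: str) -> str:
    """Outcome of checking the recall rule for a task in state TAPERECALL."""
    # The task goes back to NEW only once the Rucio rule is OK.
    return NEW if rule_state == "OK" else TAPERECALL


# Walk through the life of a task whose input is initially only on tape.
state = after_data_discovery(input_on_disk=False)  # TAPERECALL
state = after_rule_check("REPLICATING")            # still TAPERECALL
state = after_rule_check("OK")                     # NEW
state = after_data_discovery(input_on_disk=True)   # PROCEED_TO_SUBMISSION
```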

Implementation

  • recall rules are created by the crab_tape_recall Rucio account
  • recall rules have activity Analysis TapeRecall and are charged to the Rucio account corresponding to the username. In this way Rucio can keep track of how much data each user is recalling and we can set limits
  • recall rules are submitted with "AutomaticApproval". Rucio only enforces a global limit on the crab_tape_recall account; limits on each task are enforced by CRAB
  • CRAB enforces policies in the executeTapeRecallPolicy method of TaskWorker/Actions/DBSDataDiscovery

Policies

  • Limit on how much data a user can have in recall at any time: maxRecallPerUserTB
  • Limit on how large a dataset can be recalled, depending on data tier:
    • if the data tier is in the list tiersToRecall: no limit
    • all other tiers: maxAnyTierRecallSizeTB
  • When a dataset is too large, users have the option to give a list of blocks to recall
    • Limit on recall size when providing a list of blocks: maxTierToBlockRecallSizeTB
  • When a user's request is above the limits, they are told to: "contact Data Transfer team via https://its.cern.ch/jira/browse/CMSTRANSF"
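
The policies above can be sketched as a single check. This is a hedged illustration, not the actual executeTapeRecallPolicy code: the RecallLimits class, the check_recall function and its arguments are hypothetical, and the default values are the July 2024 ones listed under Parameters. Two points are assumptions of this sketch: the new request is counted toward the per-user quota, and the block-list limit only applies to tiers outside tiersToRecall.

```python
from dataclasses import dataclass

CONTACT_MSG = "contact Data Transfer team via https://its.cern.ch/jira/browse/CMSTRANSF"


@dataclass(frozen=True)
class RecallLimits:
    """Hypothetical container for the four policy parameters."""
    tiersToRecall: tuple = ('AOD', 'AODSIM', 'MINIAOD', 'MINIAODSIM',
                            'NANOAOD', 'NANOAODSIM')
    maxAnyTierRecallSizeTB: float = 50
    maxTierToBlockRecallSizeTB: float = 50
    maxRecallPerUserTB: float = 100


def check_recall(tier, request_tb, user_in_recall_tb,
                 has_block_list=False, limits=RecallLimits()):
    """Return (allowed, message) for a recall request of request_tb terabytes."""
    # Assumption: the new request counts toward the per-user quota.
    if user_in_recall_tb + request_tb > limits.maxRecallPerUserTB:
        return False, f"per-user limit of {limits.maxRecallPerUserTB} TB exceeded: {CONTACT_MSG}"
    # Tiers in tiersToRecall have no limit on dataset size.
    if tier in limits.tiersToRecall:
        return True, "tier is in tiersToRecall: no size limit"
    # Other tiers: the limit depends on whether a list of blocks was given.
    if has_block_list:
        size_limit = limits.maxTierToBlockRecallSizeTB
    else:
        size_limit = limits.maxAnyTierRecallSizeTB
    if request_tb > size_limit:
        return False, f"request above {size_limit} TB: {CONTACT_MSG}"
    return True, "within limits"
```

For example, `check_recall('MINIAOD', 80, 0)` is allowed because MINIAOD is in tiersToRecall, while `check_recall('RAW', 80, 0)` is refused because 80 TB is above maxAnyTierRecallSizeTB.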

Parameters

The above parameters are set in the TaskWorkerConfig.py file and can be modified, e.g. via a Puppet template.

Parameter                     Current value (July 2024)
tiersToRecall                 ['AOD', 'AODSIM', 'MINIAOD', 'MINIAODSIM', 'NANOAOD', 'NANOAODSIM']
maxAnyTierRecallSizeTB        50
maxTierToBlockRecallSizeTB    50
maxRecallPerUserTB            100

Be aware that this table may be out of date. To find the current parameter values, look at current/TaskWorkerConfig.py in the current production TaskWorker container.
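
One way to pull the four recall parameters out of a configuration file without executing it is to parse it with Python's ast module. The sketch below is an illustration under stated assumptions: the read_recall_params helper is hypothetical, and it assumes each parameter is set by a plain assignment of a literal to an attribute, such as `config.TaskWorker.maxRecallPerUserTB = 100`. The `config.TaskWorker` prefix in the sample text is an assumption as well.

```python
import ast

RECALL_PARAMS = {"tiersToRecall", "maxAnyTierRecallSizeTB",
                 "maxTierToBlockRecallSizeTB", "maxRecallPerUserTB"}


def read_recall_params(config_source: str) -> dict:
    """Return the recall parameters assigned as literals in config_source."""
    found = {}
    for node in ast.walk(ast.parse(config_source)):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            # Match assignments like <anything>.maxRecallPerUserTB = <literal>
            if isinstance(target, ast.Attribute) and target.attr in RECALL_PARAMS:
                try:
                    found[target.attr] = ast.literal_eval(node.value)
                except (ValueError, TypeError):
                    # The value is not a plain literal; report it as unknown.
                    found[target.attr] = None
    return found


# Sample text standing in for a real configuration file (assumed layout).
sample = """
config.TaskWorker.tiersToRecall = ['AOD', 'AODSIM']
config.TaskWorker.maxRecallPerUserTB = 100
"""
params = read_recall_params(sample)
```

Inside the container the same helper could be applied to the text of current/TaskWorkerConfig.py, provided the file follows the assumed assignment style.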