Timing Measurements

Pilot Timing

The pilot sends a timing string to the server during the final job update, using the following condensed format:

pilotTiming = time_getjob | time_stagein | time_payload | time_stageout | time_total_setup

where

time_getjob: time for getJob curl operation to finish.
time_stagein: time for the entire stage-in to complete, including replica lookup. Note: the pilot cannot measure the time spent on direct I/O, as this operation is handled by the transform.
time_payload: time for payload execution. Note: this includes any pre- or post-processing.
time_stageout: time for stage-out to complete, including log transfer.
time_total_setup: the total setup time is the time measured from pilot startup to the get job operation. During this time the pilot downloads queue data, checks the proxy lifetime, etc.
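As an illustration of the format above, the following minimal sketch builds and parses such a timing string. It assumes that each field is an integer number of seconds and that the fields are joined with "|" (optional surrounding whitespace is tolerated when parsing). The helper names `format_pilot_timing` and `parse_pilot_timing` are hypothetical and are not taken from the pilot code.

```python
from typing import Dict

# Field order as listed in the pilotTiming format above.
FIELDS = (
    "time_getjob",
    "time_stagein",
    "time_payload",
    "time_stageout",
    "time_total_setup",
)


def format_pilot_timing(timings: Dict[str, int]) -> str:
    """Join the five timings into one '|'-separated string (assumed: whole seconds)."""
    return "|".join(str(int(timings.get(name, 0))) for name in FIELDS)


def parse_pilot_timing(value: str) -> Dict[str, int]:
    """Split a timing string back into a dict keyed by field name."""
    parts = [part.strip() for part in value.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return {name: int(part) for name, part in zip(FIELDS, parts)}


example = {
    "time_getjob": 2,
    "time_stagein": 35,
    "time_payload": 1200,
    "time_stageout": 48,
    "time_total_setup": 15,
}
timing_string = format_pilot_timing(example)
```

Keeping the field order in a single tuple means the formatter and the parser cannot drift apart if a field is added later.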

CPU Consumption

The pilot reports CPU timing information on every server update. While the payload is running, the CPU time (system + user time, summed over all child processes) is measured approximately once a minute using /proc/pid/stat. A final measurement is taken immediately after the payload has finished, using os.times().

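Given an initial measurement t0 (an earlier os.times() result), the user and system times are calculated as follows:

t1 = os.times()
# index 2 is children_user, index 3 is children_system
user_time = t1[2] - t0[2]
system_time = t1[3] - t0[3]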
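The calculation above can be tried out with the short, self-contained sketch below. It is only an illustration: the helper name `children_cpu_time` and the use of a small Python subprocess as a stand-in for the payload are assumptions, not part of the pilot. Note that the children fields of os.times() only account for child processes that have terminated and been waited for, which fits a measurement taken after the payload has finished.

```python
import os
import subprocess
import sys


def children_cpu_time(t0, t1):
    """Return (user, system) CPU seconds used by finished child processes
    between two os.times() results."""
    user_time = t1[2] - t0[2]    # children_user
    system_time = t1[3] - t0[3]  # children_system
    return user_time, system_time


# Take the initial measurement before starting the (stand-in) payload.
t0 = os.times()

# Hypothetical payload: a child process that burns a little CPU.
subprocess.run(
    [sys.executable, "-c", "sum(i * i for i in range(10**6))"],
    check=True,
)

# subprocess.run() waits for the child, so its CPU time is now accounted for.
t1 = os.times()
user_time, system_time = children_cpu_time(t0, t1)
print(f"user={user_time:.2f}s system={system_time:.2f}s")
```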

The instantaneous CPU timing calculation extracts the system + user time from /proc/pid/stat for a given pid and loops over the stat files of all child processes. The values in the stat file are expressed in clock ticks and are converted to seconds using os.sysconf(os.sysconf_names['SC_CLK_TCK']).
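A minimal sketch of this kind of measurement is shown below, assuming a Linux /proc file system. The function names are hypothetical, and how the pilot discovers the child pids is not described here, so the sketch simply takes a list of pids as input. In the stat file, utime and stime are fields 14 and 15. The process name (field 2) is enclosed in parentheses and may itself contain spaces, so the line is split after the last closing parenthesis.

```python
import os
from typing import Iterable


def parse_stat_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from the contents of /proc/<pid>/stat."""
    # Everything after the last ')' starts at field 3 (process state).
    fields = stat_line.rsplit(")", 1)[1].split()
    utime = int(fields[11])  # field 14: user mode ticks
    stime = int(fields[12])  # field 15: kernel mode ticks
    return utime + stime


def cpu_seconds_for_pid(pid: int) -> float:
    """Read /proc/<pid>/stat and convert user + system ticks to seconds."""
    ticks_per_second = os.sysconf(os.sysconf_names["SC_CLK_TCK"])
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_stat_cpu_ticks(stat_file.read()) / ticks_per_second


def cpu_seconds_for_pids(pids: Iterable[int]) -> float:
    """Sum user + system CPU seconds over the given pids, skipping vanished ones."""
    total = 0.0
    for pid in pids:
        try:
            total += cpu_seconds_for_pid(pid)
        except (FileNotFoundError, ProcessLookupError):
            # The process exited between discovery and reading its stat file.
            continue
    return total
```

On a Linux host, `cpu_seconds_for_pids([os.getpid()])` returns the CPU time consumed so far by the current process. Processes that disappear mid-loop are skipped rather than treated as errors, since children of a running payload can exit at any time.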

