extra image pulling for selected "compute device" (#274)
For applications that use "llama-cpp", it is not enough to just
reinstall PyTorch during "healthcheck".

In this case we allow authors to build custom images for each ComputeDevice.

---------

Signed-off-by: Alexander Piskun <bigcat88@icloud.com>
bigcat88 authored Apr 17, 2024
1 parent 7391eee commit bbb0490
Showing 3 changed files with 125 additions and 73 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
- Different compute device configuration for Daemon (NVIDIA, AMD, CPU). #267
- Ability to add optional parameters when registering a daemon, for example *OVERRIDE_APP_HOST*. #269
- Correct support of the Docker `HEALTHCHECK` instruction. #273
- Support of pulling "custom" images for the selected compute device. #274

### Fixed

22 changes: 21 additions & 1 deletion docs/tech_details/InstallationFlow.rst
@@ -3,6 +3,25 @@
App Installation Flow
=====================

Image Pulling (Docker)
----------------------

AppAPI **2.5.0+** always tries first to pull a Docker image whose name carries a suffix equal to the value of ``computeDevice``.

Recall that ``computeDevice`` can take the following values: ``cpu``, ``cuda``, ``rocm``.

The suffix will be added as follows:

.. code::

    return $imageParams['image_src'] . '/' .
        $imageParams['image_name'] . '-' . $daemonConfig['computeDevice']['id'] . ':' . $imageParams['image_tag'];

For ``cpu``, AppAPI will first try to pull the image ``ghcr.io/cloud-py-api/skeleton-cpu:latest``.
If that image is not found, ``ghcr.io/cloud-py-api/skeleton:latest`` is pulled instead.

If you, as an application developer, want a custom image for any of these values, you can push the extended image to the registry in addition to the base one.
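
The naming rule and fallback order described above can be sketched as follows. This is only an illustration: ``candidateImageNames`` is a hypothetical helper, not a function of AppAPI, and it assumes the compute device id is passed in directly as a string.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of AppAPI): lists image references in the
 * order they would be tried, the device-specific name first, then the base name.
 */
function candidateImageNames(array $imageParams, ?string $computeDevice): array
{
    $repository = $imageParams['image_src'] . '/' . $imageParams['image_name'];
    $tag = $imageParams['image_tag'];

    $candidates = [];
    if ($computeDevice !== null && $computeDevice !== '') {
        // "Extended" image: the compute device id is appended to the image name.
        $candidates[] = $repository . '-' . $computeDevice . ':' . $tag;
    }
    // The "base" image is always the last candidate.
    $candidates[] = $repository . ':' . $tag;

    return $candidates;
}

// Example parameters taken from the skeleton image mentioned above.
$params = [
    'image_src' => 'ghcr.io/cloud-py-api',
    'image_name' => 'skeleton',
    'image_tag' => 'latest',
];
foreach (candidateImageNames($params, 'cuda') as $name) {
    echo $name, PHP_EOL;
}
```

An application author who publishes only the base image needs no changes: the extended name is tried first and the base name is used when that pull fails.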

Heartbeat
---------

@@ -12,7 +31,8 @@ In the case of ``Docker``, this is:

#. 1. performing an image pull
#. 2. creating container from the docker image
#. 3. waiting until the “/heartbeat” endpoint becomes available with a ``GET`` request.
#. 3. if the container supports ``healthcheck``, AppAPI waits for the ``healthy`` status
#. 4. waiting until the “/heartbeat” endpoint becomes available with a ``GET`` request

In response to a "/heartbeat" request, the application should return JSON: ``{"status": "ok"}``.
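
As a rough illustration of the expected response, the check below accepts a body only when it decodes to JSON with ``status`` equal to ``ok``. ``isHeartbeatOk`` is a hypothetical helper written for this example; it is not taken from AppAPI, and the real check may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns true when a "/heartbeat" response body is
 * JSON with a "status" field equal to "ok".
 */
function isHeartbeatOk(string $body): bool
{
    // Decode to an associative array; invalid JSON yields null.
    $decoded = json_decode($body, true);

    return is_array($decoded) && ($decoded['status'] ?? null) === 'ok';
}

var_dump(isHeartbeatOk('{"status": "ok"}'));
var_dump(isHeartbeatOk('{"status": "init"}'));
var_dump(isHeartbeatOk('not json'));
```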

175 changes: 103 additions & 72 deletions lib/DeployActions/DockerActions.php
@@ -47,18 +47,16 @@ public function deployExApp(ExApp $exApp, DaemonConfig $daemonConfig, array $par
if (!isset($params['image_params'])) {
return 'Missing image_params.';
}
$imageParams = $params['image_params'];

if (!isset($params['container_params'])) {
return 'Missing container_params.';
}
$containerParams = $params['container_params'];

$dockerUrl = $this->buildDockerUrl($daemonConfig);
$this->initGuzzleClient($daemonConfig);

$this->exAppService->setAppDeployProgress($exApp, 0);
$result = $this->pullImage($dockerUrl, $imageParams, $exApp, 0, 94);
$imageId = '';
$result = $this->pullImage($dockerUrl, $params['image_params'], $exApp, 0, 94, $daemonConfig, $imageId);
if ($result) {
return $result;
}
@@ -72,7 +70,7 @@ public function deployExApp(ExApp $exApp, DaemonConfig $daemonConfig, array $par
}
}
$this->exAppService->setAppDeployProgress($exApp, 96);
$result = $this->createContainer($dockerUrl, $imageParams, $containerParams);
$result = $this->createContainer($dockerUrl, $imageId, $params['container_params']);
if (isset($result['error'])) {
return $result['error'];
}
@@ -93,18 +91,27 @@ public function buildApiUrl(string $dockerUrl, string $route): string {
return sprintf('%s/%s/%s', $dockerUrl, self::DOCKER_API_VERSION, $route);
}

public function buildImageName(array $imageParams): string {
return $imageParams['image_src'] . '/' . $imageParams['image_name'] . ':' . $imageParams['image_tag'];
public function buildBaseImageName(array $imageParams): string {
return $imageParams['image_src'] . '/' .
$imageParams['image_name'] . ':' . $imageParams['image_tag'];
}

public function createContainer(string $dockerUrl, array $imageParams, array $params = []): array {
public function buildExtendedImageName(array $imageParams, DaemonConfig $daemonConfig): ?string {
if (empty($daemonConfig->getDeployConfig()['computeDevice']['id'])) {
return null;
}
return $imageParams['image_src'] . '/' .
$imageParams['image_name'] . '-' . $daemonConfig->getDeployConfig()['computeDevice']['id'] . ':' . $imageParams['image_tag'];
}

public function createContainer(string $dockerUrl, string $imageId, array $params = []): array {
$createVolumeResult = $this->createVolume($dockerUrl, $this->buildExAppVolumeName($params['name']));
if (isset($createVolumeResult['error'])) {
return $createVolumeResult;
}

$containerParams = [
'Image' => $this->buildImageName($imageParams),
'Image' => $imageId,
'Hostname' => $params['hostname'],
'HostConfig' => [
'NetworkMode' => $params['net'],
@@ -200,83 +207,107 @@ public function removeContainer(string $dockerUrl, string $containerId): string
return sprintf('Failed to remove container: %s', $containerId);
}

public function pullImage(string $dockerUrl, array $params, ExApp $exApp, int $startPercent, int $maxPercent): string {
public function pullImage(
string $dockerUrl, array $params, ExApp $exApp, int $startPercent, int $maxPercent, DaemonConfig $daemonConfig, string &$imageId
): string {
$imageId = $this->buildExtendedImageName($params, $daemonConfig);
if ($imageId === null) {
$imageId = $this->buildBaseImageName($params);
$this->logger->info(sprintf('Pulling "base" image: %s', $imageId));
}
try {
$r = $this->pullImageInternal($dockerUrl, $exApp, $startPercent, $maxPercent, $imageId);
} catch (GuzzleException $e) {
$r = sprintf('Failed to pull image, GuzzleException occur: %s', $e->getMessage());
}
if (($r === '') || ($imageId === $this->buildBaseImageName($params))) {
return $r;
}
$this->logger->info(sprintf('Failed to pull "extended" image for %s: %s', $imageId, $r));
$imageId = $this->buildBaseImageName($params);
$this->logger->info(sprintf('Pulling "base" image: %s', $imageId));
try {
$r = $this->pullImageInternal($dockerUrl, $exApp, $startPercent, $maxPercent, $imageId);
} catch (GuzzleException $e) {
$r = sprintf('Failed to pull image, GuzzleException occur: %s', $e->getMessage());
}
return $r;
}

/**
* @throws GuzzleException
*/
public function pullImageInternal(
string $dockerUrl, ExApp $exApp, int $startPercent, int $maxPercent, string $imageId
): string {
# docs: https://github.com/docker/compose/blob/main/pkg/compose/pull.go
$layerInProgress = ['preparing', 'waiting', 'pulling fs layer', 'download', 'extracting', 'verifying checksum'];
$layerFinished = ['already exists', 'pull complete'];
$disableProgressTracking = false;
$imageId = $this->buildImageName($params);
$url = $this->buildApiUrl($dockerUrl, sprintf('images/create?fromImage=%s', urlencode($imageId)));
$this->logger->info(sprintf('Pulling ExApp Image: %s', $imageId));
try {
if ($this->useSocket) {
$response = $this->guzzleClient->post($url);
} else {
$response = $this->guzzleClient->post($url, ['stream' => true]);
}
if ($response->getStatusCode() !== 200) {
return sprintf('Pulling ExApp Image: %s return status code: %d', $imageId, $response->getStatusCode());
}
if ($this->useSocket) {
return '';
}
$lastPercent = $startPercent;
$layers = [];
$buffer = '';
$responseBody = $response->getBody();
while (!$responseBody->eof()) {
$buffer .= $responseBody->read(1024);
try {
while (($newlinePos = strpos($buffer, "\n")) !== false) {
$line = substr($buffer, 0, $newlinePos);
$buffer = substr($buffer, $newlinePos + 1);
$jsonLine = json_decode(trim($line));
if ($jsonLine) {
if (isset($jsonLine->id) && isset($jsonLine->status)) {
$layerId = $jsonLine->id;
$status = strtolower($jsonLine->status);
foreach ($layerInProgress as $substring) {
if (str_contains($status, $substring)) {
$layers[$layerId] = false;
break;
}
if ($this->useSocket) {
$response = $this->guzzleClient->post($url);
} else {
$response = $this->guzzleClient->post($url, ['stream' => true]);
}
if ($response->getStatusCode() !== 200) {
return sprintf('Pulling ExApp Image: %s return status code: %d', $imageId, $response->getStatusCode());
}
if ($this->useSocket) {
return '';
}
$lastPercent = $startPercent;
$layers = [];
$buffer = '';
$responseBody = $response->getBody();
while (!$responseBody->eof()) {
$buffer .= $responseBody->read(1024);
try {
while (($newlinePos = strpos($buffer, "\n")) !== false) {
$line = substr($buffer, 0, $newlinePos);
$buffer = substr($buffer, $newlinePos + 1);
$jsonLine = json_decode(trim($line));
if ($jsonLine) {
if (isset($jsonLine->id) && isset($jsonLine->status)) {
$layerId = $jsonLine->id;
$status = strtolower($jsonLine->status);
foreach ($layerInProgress as $substring) {
if (str_contains($status, $substring)) {
$layers[$layerId] = false;
break;
}
foreach ($layerFinished as $substring) {
if (str_contains($status, $substring)) {
$layers[$layerId] = true;
break;
}
}
foreach ($layerFinished as $substring) {
if (str_contains($status, $substring)) {
$layers[$layerId] = true;
break;
}
}
} else {
$this->logger->warning(
sprintf("Progress tracking of image pulling(%s) disabled, error: %d, data: %s", $exApp->getAppid(), json_last_error(), $line)
);
$disableProgressTracking = true;
}
} else {
$this->logger->warning(
sprintf("Progress tracking of image pulling(%s) disabled, error: %d, data: %s", $exApp->getAppid(), json_last_error(), $line)
);
$disableProgressTracking = true;
}
} catch (Exception $e) {
$this->logger->warning(
sprintf("Progress tracking of image pulling(%s) disabled, exception: %s", $exApp->getAppid(), $e->getMessage()), ['exception' => $e]
);
$disableProgressTracking = true;
}
if (!$disableProgressTracking) {
$completedLayers = count(array_filter($layers));
$totalLayers = count($layers);
$newLastPercent = intval($totalLayers > 0 ? ($completedLayers / $totalLayers) * ($maxPercent - $startPercent) : 0);
if ($lastPercent != $newLastPercent) {
$this->exAppService->setAppDeployProgress($exApp, $newLastPercent);
$lastPercent = $newLastPercent;
}
} catch (Exception $e) {
$this->logger->warning(
sprintf("Progress tracking of image pulling(%s) disabled, exception: %s", $exApp->getAppid(), $e->getMessage()), ['exception' => $e]
);
$disableProgressTracking = true;
}
if (!$disableProgressTracking) {
$completedLayers = count(array_filter($layers));
$totalLayers = count($layers);
$newLastPercent = intval($totalLayers > 0 ? ($completedLayers / $totalLayers) * ($maxPercent - $startPercent) : 0);
if ($lastPercent != $newLastPercent) {
$this->exAppService->setAppDeployProgress($exApp, $newLastPercent);
$lastPercent = $newLastPercent;
}
}
return '';
} catch (GuzzleException $e) {
$this->logger->error('Failed to pull image', ['exception' => $e]);
error_log($e->getMessage());
return 'Failed to pull image, GuzzleException occur.';
}
return '';
}

public function inspectContainer(string $dockerUrl, string $containerId): array {