Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prune unused artifacts from non-static builds #59

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft

Conversation

henriquesimoes
Copy link
Collaborator

Both dynamic-link and no-build targets generate unnecessarily large IOC images due to the trivial COPY done from /opt. Clean up everything from there and /usr/local directory before doing the COPY to the IOC image.

Clean up strategy proposed here is the following:

  • Remove entirely any module that does not have a shared library linked against binaries under the IOC target directories (i.e. /opt/$REPONAME or $RUNDIR);
  • Prune directories from used modules and EPICS base not containing shared libraries, EPICS databases or autosave requirement files;
  • Remove any shared library not linked against binaries under IOC target directories;
  • Remove all statically linked libraries;

For pruning module directories, another approach that could be followed is to delegate this responsibility to install_module. This way, we can have contextualized pruning for modules. On the other hand, this cannot be as aggressive as implemented in this proposal, as build-dependent files (the ones defining the build-system itself, for instance) must exist until the IOC build is complete.

I have tested this with ad-aravis-epics-ioc, and it shrinks the image size from 1.33GB down to 322MB.

This should be further tested and extensively reviewed, so that no artifact is removed unexpectedly.

@henriquesimoes
Copy link
Collaborator Author

I'm waiting for #57 to be merged to rebase it. As of now, this would mean another step is introduced which removes all IOCs installed in the base image.

@henriquesimoes henriquesimoes changed the title Prune unused artifacts from IOC images Prune unused artifacts from non-static builds Apr 16, 2024
@henriquesimoes henriquesimoes mentioned this pull request Apr 16, 2024
@henriquesimoes henriquesimoes linked an issue Apr 17, 2024 that may be closed by this pull request
Copy link
Collaborator Author

@henriquesimoes henriquesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added comments to points that I wasn't sure what is the best way to deal with, so that we can discuss if there are relevant or not and how to improve, if that's the case.

In general, I feel like I could have written helper functions that could be more easily shared among the steps. But that's not a concrete thing to propose anything. Let me know if you feel the same.

@@ -66,3 +66,5 @@ RUN ./install_area_detector.sh

COPY install_motor.sh .
RUN ./install_motor.sh

COPY lnls-prune-artifacts.sh /usr/local/bin/lnls-prune-artifacts
Copy link
Collaborator Author

@henriquesimoes henriquesimoes Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this last so that image cache is not invalidated every time for nothing. It can be moved to be side-by-side with lnls-run after the script ready to be merged.

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Comment on lines 89 to 96
for candidate in $(find $module -type d); do
[ -d $candidate ] || continue

if [[ ! $keep_paths =~ "$candidate".* ]]; then
size=$(du -hs $candidate | cut -f 1)

printf "Removing directory '$candidate' ($size)...\n"
rm -rf $candidate
fi
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this loop, I'm removing only directories as a whole if no important file (listed in keep_paths) is stored in it.

Another approach is to do this in a file basis, which could prune modules even further. I'm not sure if that's the best way to handle it, though.

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
clean_up_epics_modules $target_paths
remove_static_libs /opt
remove_unused_shared_libs $target_paths
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if remove_static_libs should receive an argument or not. Their companion functions receive directories which contain important binaries whose dependencies must be preserved. On the other hand, remove_static_libs currently receives the directories to prune the libraries off.

Maybe both of them should take /opt (and /usr/local?) from a shared constant variable instead.

Opinions on that?

Copy link
Collaborator

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first two commits are correct and can be merged separately, if you'd like to split them.

Copy link
Collaborator

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Dockerfile parts of the third commit are ok.


Commit message nits:

It's not the caching that's the issue, it's simply how container images are layered:

The clean up phase must be executed in the build stage (before the final
copy), otherwise the COPY layer would still make the image size big.
This implies `no-build` target needs to copy from a pruned version of
the base image, which is a new stage.

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Comment on lines 36 to 112
if [[ -d $module && ! $used_modules =~ "$module" ]]; then
size=$(du -hs $module | cut -f 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is -d $module required? Shouldn't they always exist? Or is this covering some case like an AreaDetector or motor submodule being in the list after the super-repository has already been cleaned? I think this should be explained somewhere.

I find =~ somewhat non obvious to parse. And it means "motor" won't be removed if "motorX" is in used modules, right? I think it might make sense to print the list into a file (in /tmp, which can be removed afterwards), so it's split by newlines, and use grep to match the whole line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is this covering some case like an AreaDetector or motor submodule being in the list after the super-repository has already been cleaned? I think this should be explained somewhere.

That's exactly the case. We might have cleaned them up already.

I find =~ somewhat non obvious to parse. And it means "motor" won't be removed if "motorX" is in used modules, right?

Yes, and it can't be removed. I agree the expression is confusing.

I think it might make sense to print the list into a file (in /tmp, which can be removed afterwards), so it's split by newlines, and use grep to match the whole line.

Okay. Let me try it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it means "motor" won't be removed if "motorX" is in used modules

Because of such issues with prefix matching, I moved to checking the each level of the path at a time, with whole line matching instead. This turned out to be handy for the following patches as well.

Still, don't feel like my implementation explores all bash-ims it could.

Copy link
Collaborator Author

@henriquesimoes henriquesimoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I'm starting to think that we should provide a way to explicitly skip a pruning step, possibly for a given target directory.

This would give users some tooling for managing corner cases, without relying on immediate fixes in lnls-prune-artifacts.

REPONAME: mca
REPONAME: epics/modules/mca
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't like very much the idea of reusing REPONAME. It seems awkward to me. I'd rather have them specified by a variable whose name makes sense at all.

I did so because it was easier to test other things first. Let me know what you think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Do you think it would lead to special casing these modules in the image Dockerfile?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would lead to special casing these modules in the image Dockerfile?

They kinda already have such special handling... all of them are no-build targets. They of course currently have almost the same (compose variables) API of other targets. I say "almost" because they diverged by not using REPONAME before, and now by not meaning "base name to where to clone the repository" but rather "location inside /opt where important artifacts live".

I'm not sure if I answered your question at all, but I do feel like we have always being dealing with a special case.

Copy link
Collaborator

@ericonr ericonr Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I am essentially wondering if we shouldn't use some other name instead of REPONAME, since these images are a special case.

Copy link
Collaborator Author

@henriquesimoes henriquesimoes Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to APP_DIRS. It should encode full paths now. It is named after a plural noun; after all, we can have multiple directories that we want to ensure are properly working. This should work as a palliative measure for spurious cases where we clean up too much. Any directory listed there will have its belonging module removed from pruning (both module-based and directory-based).

Maybe we could be even more explicit about its usage regarding "not pruning it and keeping its executable binaries working". Any opinion and suggestion about that?

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Copy link
Collaborator

@ericonr ericonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing isn't clear to me. From targets, you are obtaining unused modules, to then obtain used modules, to finally obtain unused modules again, and remove them. Is that logic necessary, can we simplify it? If you already have these answers, you can add them to the commit message.

Commit message nit:

EPICS base is assumed to be always needed,

always be needed

It is assumed we are using glibc implementation

we are using glibc's ldd

For the second commit, since we are removing stuff after building, static libraries can always be removed, notjust for non-static builds. I think it's only the commit message that's wrong.

Nit:

Static libraries are not used neither during compile time nor runtime for non-static builds.

Static libraries are not used during compile nor runtime.

Static libraries are used neither during compile nor runtime.

All AR archives are assumed to be static libraries,

All files with .a extension are assumed to be static libraries,

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
Comment on lines 46 to 57
# Final binary may be actually runnable, since rpath of another binary may
# pull those not found libraries
found="$(echo "$linked" | grep -v "not found")"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you get any binaries that actually ran despite "not found"? ldd should perform the same resolution as launching the binary 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. For libopcua.so that comes in OPCUA module, both EPICS libCom and libdbCore won't be found if we simply ldd it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EPICS libCom and libdbCore won't be found if we simply ldd it.

Oh, that's annoying. Is that the only case?

I'm wondering if we should make it clear it's due to OPCUA, and remove it once/if we move to the open source impl, or if it makes sense to keep for any other similar cases with proprietary libraries.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that the only case?

That's the only one that I spotted. Considering that EPICS modules and IOC by default relies on those custom rpath entries, I'm sure any other already-compiled library shipped into module is susceptible to this issue. This probably falls into what you mention of proprietary libraries in most cases.

If we were using system installation for EPICS base, that wouldn't have been an issue even for OPCUA.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pending towards mentioning in a comment that the "not-found" case is due to OPCUA, then.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to the commit message. Let me know if you think that's not enough.

base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
base/lnls-prune-artifacts.sh Outdated Show resolved Hide resolved
@gustavosr8
Copy link
Contributor

nit from 13072de:
s/clean-up/cleanup
s/clean up/cleanup
s/so that it contains a valid mapping of all modules/so it contains a valid mapping of all modules
s/assumes the inspected binaries/assumes that the inspected binaries
s/It is assumed we are using glibc's/It is assumed that we are using glibc's

Remove EPICS modules that do not have a dynamic library linked to any
executables in the target IOC. This shrinks the final image size by
removing useless artifacts. We introduce APP_DIRS (application
directories) variable to specify paths where relevant binaries are, when
no-build target is used, so the proper cleanup can happen in such cases
as well. For dynamic-build, it encodes any additional directories
besides the base COPY'ed directory to be considered.

Cleanup phase must be executed in the build stage (before the final
copy), otherwise the COPY layer would still make the image size big.
This implies `no-build` target needs to copy from a pruned version of
the base image, which is a new stage.

The list of modules to be removed is taken from the RELEASE file, so
that it contains a valid mapping of all modules that have been
installed. EPICS base is assumed to always be needed, even though some
binaries in it can certainly be thrown away given the IOC linkage (for
instance, PVAccess binaries when the IOC uses only Channel Access).

Linkage information is taken from ldd(1), which assumes that inspected
binaries are safe to be potentially executed. It is assumed that we are
using glibc's ldd, which provides a straightforward way to query all
binaries at once. Moreover, we ignore not-found library entries to
properly cover binaries already shipped in repositories, such as
libopcua.so, which might not contain (properly) defined rpaths.

To avoid implementing a function that returns "all EPICS modules whose
path is a prefix from the used library paths", we instead reuse
`filter_out_paths()` by exploting the fact that it filters *out*
everything we care about when given `linked_libs`. Taking its difference
to the original set gives us what we need without implementing the
variant of `filter_out_paths()`.
Static libraries are not used during runtime, and not even compile-time
for non-static builds. On the other hand, they take up a large amount of
storage: about 312MB for EPICS base and 222MB for modules. Remove them
all right before finishing non-static build stages, avoiding their copy
to IOC images. This is not needed for static-link targets, since only
the cleaned IOC directory is copied, already leaving behind all static
libraries.

All files with .a are assumed to be static libraries, which should be a
good assumption in GNU/Linux. Other static artifacts, such as object
files, are correctly removed by the build system clean target and are
not handled explictly here for simplification. A lint script for the
build system can be implemented in the future if this eventually becomes
false.
Shared libraries may be installed either in /opt or /usr/local/lib but
may not be used by the target binaries from the IOC image. Remove them
during the prune phase before copying both directories to resulting
image to shrink the final image size. Both actual versioned binary and
its symbolic links are removed to keep the filesystem consistent.

When deciding the set of "used libraries", we assume all libraries
inside any APP_DIRS are used. This allows one to specify a dlopen(3)'ed
library path in APP_DIRS, and ensure it is kept in the resulting image.
Modules may contain several artifacts, including configuration files,
graphical interface files and other repository artifacts that do not
need to be in the IOC image.

Remove them all except the ones containing EPICS database (`.db` and
`.template`) or autosave requirement (`.req`) files, besides shared
libraries. Binaries directory (`bin`) is also removed, as only $REPONAME
and $RUNDIR should contain target executables, which are filtered out
from the list.
@henriquesimoes
Copy link
Collaborator Author

Thanks, @gustavosr8. I've also made remove_unused_shared_libs respect libraries inside APP_DIRS. This allows users to specify the path where their dlopen(3)ed libraries live. Can you test this again with the pyDevSup-based IOC?

EPICS base has rather large configuration files for build, and other
repository files, which are not needed in the IOC images. Prune them
after building the IOCs, shrinking by 40MB the final image size.

Prune is performed with the same script as modules, which discards all
executables in `bin` (~15MB), as well as Perl scripts. This should be
fine considering that `static-link` target also does not preserve EPICS
binaries in the resulting image.
@gustavosr8
Copy link
Contributor

Thanks, @gustavosr8. I've also made remove_unused_shared_libs respect libraries inside APP_DIRS. This allows users to specify the path where their dlopen(3)ed libraries live. Can you test this again with the pyDevSup-based IOC?

Now it seems to work! Thanks :)
Just a reminder to document the use of APP_DIRS ;)

Copy link
Contributor

@gustavosr8 gustavosr8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add instructions to put /opt/epics/modules/pyDevSup/python3.9 under APP_DIRS (or any analogous var) when building an IOC that uses the pyDevSup module.

When pruning modules, some of them might contain special cases where it
is harder to automatically detect in the pruning algorithm. While users
could workaround this by specifying such special paths under APP_DIRS,
this can be handled here by adding extra tooling to be used by modules.

Introduce a way for modules to specify extra paths that must be kept,
besides those detected by lnls-prune-artifacts. Do this through a
.lnls-keep-paths file that specifies all relative paths (directories or
files) whose contents must be kept. Paths may be globs, so that they are
resolved based on their bash expansion, avoiding hard-coded options when
ambiguity can be automatically resolved.

These exceptions are considered for all ancestors of a candidate to
removal, so that they are not restricted to EPICS modules themselves.
This eases implementation because we won't need to detect which parent
is a module and should contain .lnls-keep-paths, and instead simply
traverse the whole ancestor list looking if any of them defines one.
Some of the dynamic libraries built by pyDevSup are not directly linked
into the IOC binary, but rather dlopen(3)'ed by Python interpreter when
a "import" statement is reached. To avoid having all users of pyDevSup
manually specify which ones to keep, define everything inside
python3.<minor-version>/linux-<architecture> to be kept by
lnls-prune-artifacts.

We use globs here (whose expansion is deferred) to avoid changing the
minor version every distro upgrade.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prune unused modules from dynamic-link
3 participants