-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prune unused artifacts from non-static builds #59
base: main
Are you sure you want to change the base?
Conversation
I'm waiting for #57 to be merged to rebase it. As of now, this would mean another step is introduced which removes all IOCs installed in the base image. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments to points that I wasn't sure what is the best way to deal with, so that we can discuss if there are relevant or not and how to improve, if that's the case.
In general, I feel like I could have written helper functions that could be more easily shared among the steps. But that's not a concrete thing to propose anything. Let me know if you feel the same.
@@ -66,3 +66,5 @@ RUN ./install_area_detector.sh | |||
|
|||
COPY install_motor.sh . | |||
RUN ./install_motor.sh | |||
|
|||
COPY lnls-prune-artifacts.sh /usr/local/bin/lnls-prune-artifacts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this last so that image cache is not invalidated every time for nothing. It can be moved to be side-by-side with lnls-run
after the script ready to be merged.
base/lnls-prune-artifacts.sh
Outdated
for candidate in $(find $module -type d); do | ||
[ -d $candidate ] || continue | ||
|
||
if [[ ! $keep_paths =~ "$candidate".* ]]; then | ||
size=$(du -hs $candidate | cut -f 1) | ||
|
||
printf "Removing directory '$candidate' ($size)...\n" | ||
rm -rf $candidate | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this loop, I'm removing only directories as a whole if no important file (listed in keep_paths
) is stored in it.
Another approach is to do this in a file basis, which could prune modules even further. I'm not sure if that's the best way to handle it, though.
base/lnls-prune-artifacts.sh
Outdated
clean_up_epics_modules $target_paths | ||
remove_static_libs /opt | ||
remove_unused_shared_libs $target_paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if remove_static_libs
should receive an argument or not. Their companion functions receive directories which contain important binaries whose dependencies must be preserved. On the other hand, remove_static_libs
currently receives the directories to prune the libraries off.
Maybe both of them should take /opt
(and /usr/local
?) from a shared constant variable instead.
Opinions on that?
4e79a7a
to
cec36cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first two commits are correct and can be merged separately, if you'd like to split them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Dockerfile parts of the third commit are ok.
Commit message nits:
It's not the caching that's the issue, it's simply how container images are layered:
The clean up phase must be executed in the build stage (before the final
copy), otherwise the COPY layer would still make the image size big.
This implies `no-build` target needs to copy from a pruned version of
the base image, which is a new stage.
base/lnls-prune-artifacts.sh
Outdated
if [[ -d $module && ! $used_modules =~ "$module" ]]; then | ||
size=$(du -hs $module | cut -f 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is -d $module
required? Shouldn't they always exist? Or is this covering some case like an AreaDetector or motor submodule being in the list after the super-repository has already been cleaned? I think this should be explained somewhere.
I find =~
somewhat non obvious to parse. And it means "motor" won't be removed if "motorX" is in used modules, right? I think it might make sense to print the list into a file (in /tmp, which can be removed afterwards), so it's split by newlines, and use grep to match the whole line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is this covering some case like an AreaDetector or motor submodule being in the list after the super-repository has already been cleaned? I think this should be explained somewhere.
That's exactly the case. We might have cleaned them up already.
I find =~ somewhat non obvious to parse. And it means "motor" won't be removed if "motorX" is in used modules, right?
Yes, and it can't be removed. I agree the expression is confusing.
I think it might make sense to print the list into a file (in /tmp, which can be removed afterwards), so it's split by newlines, and use grep to match the whole line.
Okay. Let me try it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it means "motor" won't be removed if "motorX" is in used modules
Because of such issues with prefix matching, I moved to checking the each level of the path at a time, with whole line matching instead. This turned out to be handy for the following patches as well.
Still, don't feel like my implementation explores all bash-ims it could.
0356565
to
a0fcdf2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I'm starting to think that we should provide a way to explicitly skip a pruning step, possibly for a given target directory.
This would give users some tooling for managing corner cases, without relying on immediate fixes in lnls-prune-artifacts
.
images/docker-compose-mca.yml
Outdated
REPONAME: mca | ||
REPONAME: epics/modules/mca |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I don't like very much the idea of reusing REPONAME
. It seems awkward to me. I'd rather have them specified by a variable whose name makes sense at all.
I did so because it was easier to test other things first. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. Do you think it would lead to special casing these modules in the image Dockerfile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would lead to special casing these modules in the image Dockerfile?
They kinda already have such special handling... all of them are no-build targets. They of course currently have almost the same (compose variables) API of other targets. I say "almost" because they diverged by not using REPONAME before, and now by not meaning "base name to where to clone the repository" but rather "location inside /opt where important artifacts live".
I'm not sure if I answered your question at all, but I do feel like we have always being dealing with a special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I am essentially wondering if we shouldn't use some other name instead of REPONAME, since these images are a special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to APP_DIRS
. It should encode full paths now. It is named after a plural noun; after all, we can have multiple directories that we want to ensure are properly working. This should work as a palliative measure for spurious cases where we clean up too much. Any directory listed there will have its belonging module removed from pruning (both module-based and directory-based).
Maybe we could be even more explicit about its usage regarding "not pruning it and keeping its executable binaries working". Any opinion and suggestion about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing isn't clear to me. From targets, you are obtaining unused modules, to then obtain used modules, to finally obtain unused modules again, and remove them. Is that logic necessary, can we simplify it? If you already have these answers, you can add them to the commit message.
Commit message nit:
EPICS base is assumed to be always needed,
always be needed
It is assumed we are using glibc implementation
we are using glibc's ldd
For the second commit, since we are removing stuff after building, static libraries can always be removed, notjust for non-static builds. I think it's only the commit message that's wrong.
Nit:
Static libraries are not used neither during compile time nor runtime for non-static builds.
Static libraries are not used during compile nor runtime.
Static libraries are used neither during compile nor runtime.
All AR archives are assumed to be static libraries,
All files with .a extension are assumed to be static libraries,
base/lnls-prune-artifacts.sh
Outdated
# Final binary may be actually runnable, since rpath of another binary may | ||
# pull those not found libraries | ||
found="$(echo "$linked" | grep -v "not found")" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you get any binaries that actually ran despite "not found"? ldd
should perform the same resolution as launching the binary 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. For libopcua.so
that comes in OPCUA module, both EPICS libCom and libdbCore won't be found if we simply ldd
it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EPICS libCom and libdbCore won't be found if we simply ldd it.
Oh, that's annoying. Is that the only case?
I'm wondering if we should make it clear it's due to OPCUA, and remove it once/if we move to the open source impl, or if it makes sense to keep for any other similar cases with proprietary libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the only case?
That's the only one that I spotted. Considering that EPICS modules and IOC by default relies on those custom rpath entries, I'm sure any other already-compiled library shipped into module is susceptible to this issue. This probably falls into what you mention of proprietary libraries in most cases.
If we were using system installation for EPICS base, that wouldn't have been an issue even for OPCUA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pending towards mentioning in a comment that the "not-found" case is due to OPCUA, then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the commit message. Let me know if you think that's not enough.
c34d5c2
to
d74b331
Compare
nit from 13072de: |
Remove EPICS modules that do not have a dynamic library linked to any executables in the target IOC. This shrinks the final image size by removing useless artifacts. We introduce APP_DIRS (application directories) variable to specify paths where relevant binaries are, when no-build target is used, so the proper cleanup can happen in such cases as well. For dynamic-build, it encodes any additional directories besides the base COPY'ed directory to be considered. Cleanup phase must be executed in the build stage (before the final copy), otherwise the COPY layer would still make the image size big. This implies `no-build` target needs to copy from a pruned version of the base image, which is a new stage. The list of modules to be removed is taken from the RELEASE file, so that it contains a valid mapping of all modules that have been installed. EPICS base is assumed to always be needed, even though some binaries in it can certainly be thrown away given the IOC linkage (for instance, PVAccess binaries when the IOC uses only Channel Access). Linkage information is taken from ldd(1), which assumes that inspected binaries are safe to be potentially executed. It is assumed that we are using glibc's ldd, which provides a straightforward way to query all binaries at once. Moreover, we ignore not-found library entries to properly cover binaries already shipped in repositories, such as libopcua.so, which might not contain (properly) defined rpaths. To avoid implementing a function that returns "all EPICS modules whose path is a prefix from the used library paths", we instead reuse `filter_out_paths()` by exploting the fact that it filters *out* everything we care about when given `linked_libs`. Taking its difference to the original set gives us what we need without implementing the variant of `filter_out_paths()`.
Static libraries are not used during runtime, and not even compile-time for non-static builds. On the other hand, they take up a large amount of storage: about 312MB for EPICS base and 222MB for modules. Remove them all right before finishing non-static build stages, avoiding their copy to IOC images. This is not needed for static-link targets, since only the cleaned IOC directory is copied, already leaving behind all static libraries. All files with .a are assumed to be static libraries, which should be a good assumption in GNU/Linux. Other static artifacts, such as object files, are correctly removed by the build system clean target and are not handled explictly here for simplification. A lint script for the build system can be implemented in the future if this eventually becomes false.
Shared libraries may be installed either in /opt or /usr/local/lib but may not be used by the target binaries from the IOC image. Remove them during the prune phase before copying both directories to resulting image to shrink the final image size. Both actual versioned binary and its symbolic links are removed to keep the filesystem consistent. When deciding the set of "used libraries", we assume all libraries inside any APP_DIRS are used. This allows one to specify a dlopen(3)'ed library path in APP_DIRS, and ensure it is kept in the resulting image.
Modules may contain several artifacts, including configuration files, graphical interface files and other repository artifacts that do not need to be in the IOC image. Remove them all except the ones containing EPICS database (`.db` and `.template`) or autosave requirement (`.req`) files, besides shared libraries. Binaries directory (`bin`) is also removed, as only $REPONAME and $RUNDIR should contain target executables, which are filtered out from the list.
Thanks, @gustavosr8. I've also made |
EPICS base has rather large configuration files for build, and other repository files, which are not needed in the IOC images. Prune them after building the IOCs, shrinking by 40MB the final image size. Prune is performed with the same script as modules, which discards all executables in `bin` (~15MB), as well as Perl scripts. This should be fine considering that `static-link` target also does not preserve EPICS binaries in the resulting image.
Now it seems to work! Thanks :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add instructions to put /opt/epics/modules/pyDevSup/python3.9
under APP_DIRS
(or any analogous var) when building an IOC that uses the pyDevSup module.
When pruning modules, some of them might contain special cases where it is harder to automatically detect in the pruning algorithm. While users could workaround this by specifying such special paths under APP_DIRS, this can be handled here by adding extra tooling to be used by modules. Introduce a way for modules to specify extra paths that must be kept, besides those detected by lnls-prune-artifacts. Do this through a .lnls-keep-paths file that specifies all relative paths (directories or files) whose contents must be kept. Paths may be globs, so that they are resolved based on their bash expansion, avoiding hard-coded options when ambiguity can be automatically resolved. These exceptions are considered for all ancestors of a candidate to removal, so that they are not restricted to EPICS modules themselves. This eases implementation because we won't need to detect which parent is a module and should contain .lnls-keep-paths, and instead simply traverse the whole ancestor list looking if any of them defines one.
Some of the dynamic libraries built by pyDevSup are not directly linked into the IOC binary, but rather dlopen(3)'ed by Python interpreter when a "import" statement is reached. To avoid having all users of pyDevSup manually specify which ones to keep, define everything inside python3.<minor-version>/linux-<architecture> to be kept by lnls-prune-artifacts. We use globs here (whose expansion is deferred) to avoid changing the minor version every distro upgrade.
Both
dynamic-link
andno-build
targets generate unnecessarily large IOC images due to the trivialCOPY
done from/opt
. Clean up everything from there and/usr/local
directory before doing theCOPY
to the IOC image.Clean up strategy proposed here is the following:
/opt/$REPONAME
or$RUNDIR
);For pruning module directories, another approach that could be followed is to delegate this responsibility to
install_module
. This way, we can have contextualized pruning for modules. On the other hand, this cannot be as aggressive as implemented in this proposal, as build-dependent files (the ones defining the build-system itself, for instance) must exist until the IOC build is complete.I have tested this with ad-aravis-epics-ioc, and it shrinks the image size from 1.33GB down to 322MB.
This should be further tested and extensively reviewed, so that no artifact is removed unexpectedly.