Currently the `gsutil rsync` command does not support the `-z` or `-Z`
options available in `gsutil cp` to compress files locally via gzip
before uploading
(https://github.com/GoogleCloudPlatform/gsutil/issues/579). As
https://cloud.google.com/storage/docs/gsutil/commands/cp states:

> When you specify the -z option, the data from your files is
> compressed before it is uploaded, but your actual files are left
> uncompressed on the local disk. The uploaded objects retain the
> Content-Type and name of the original files, but have their
> Content-Encoding metadata set to gzip to indicate that the object data
> stored are compressed on the Cloud Storage servers and have their
> Cache-Control metadata set to no-transform.
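
The `-z` behaviour quoted above can be illustrated locally with a small
sketch. It is not what gsutil runs internally; it only shows that a
gzip-compressed copy is produced while the local file stays
uncompressed. The temporary directory, the file name, and the bucket
`gs://example-bucket` in the commented command are made up for
illustration.

```shell
#!/bin/sh
# Create a throwaway HTML file to stand in for a built page.
workdir=$(mktemp -d)
printf '<html><body>hello</body></html>\n' > "$workdir/index.html"

# Compress a copy; `-c` writes to stdout, so the original is untouched.
gzip -c "$workdir/index.html" > "$workdir/index.html.gz"

# The compressed copy decompresses to exactly the original bytes.
roundtrip=failed
if gzip -dc "$workdir/index.html.gz" | cmp -s - "$workdir/index.html"; then
    roundtrip=ok
fi
echo "round trip: $roundtrip"

# Upload with local gzip compression for .html files (not run here;
# the bucket name is hypothetical):
# gsutil cp -z html "$workdir/index.html" gs://example-bucket/index.html

rm -rf "$workdir"
```
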

about.gitlab.com is currently serving uncompressed HTML files because
`Cache-Control: max-age=0` is set (see
https://gitlab.com/gitlab-com/www-gitlab-com/-/merge_requests/87045)
and Fastly has a custom rule that skips HTML files, so it does not
cache them.

This patches `rsync.py` using
https://github.com/GoogleCloudPlatform/gsutil/pull/1430 to support the
`-z` and `-Z` options, so that local gzip compression can be performed.

Relates to
https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14852

With https://gitlab.com/gitlab-com/www-gitlab-com/-/merge_requests/81890
the last usage of pandoc / LaTeX was removed from the www-gitlab-com
repo. This happened more than 2 months ago, so removing them from the
image should be rather safe, especially considering we use merge trains.
It reduces the image size by around 500 MB.

Every package manager should clean up after itself in order to keep
Docker layers neat and tiny:
apt (Debian/Ubuntu package manager):
- unneeded dependencies are cleared (autoremove)
- caches are cleaned (clean)
- package lists are deleted
yum (CentOS package manager),
zypper (openSUSE package manager):
- should clear caches after installing dependencies
pip (Python package manager),
apk (Alpine package manager):
- should not use a cache when installing dependencies
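
The rules above could be sketched as shell helpers like the following.
The function names are hypothetical and are not taken from the actual
Dockerfiles. Each one only wraps the standard cleanup commands of its
package manager.

```shell
#!/bin/sh
# Each helper installs the given packages and cleans up in the same step,
# so caches never end up in a committed image layer.

install_with_apt() {
    apt-get update \
        && apt-get install -y "$@" \
        && apt-get autoremove -y \
        && apt-get clean \
        && rm -rf /var/lib/apt/lists/*
}

install_with_yum() {
    # Clear all yum caches after installing.
    yum install -y "$@" && yum clean all
}

install_with_zypper() {
    # Clear all zypper caches after installing.
    zypper --non-interactive install "$@" && zypper clean --all
}

install_with_pip() {
    # Do not write a download/wheel cache at all.
    pip install --no-cache-dir "$@"
}

install_with_apk() {
    # Do not keep the package index cache at all.
    apk add --no-cache "$@"
}
```

In a Dockerfile the install and the cleanup need to run in the same
`RUN` instruction. A cleanup in a later `RUN` does not shrink the
earlier layer that already contains the caches.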