Description
Currently, the only way to load and save images in Docker is the docker save/load commands, which require creating a tarball that contains all layers and manifests of the image.
For example, when a builder loads a build result into Docker, it must create this full tarball and transfer it to Docker, which then extracts it to disk before processing it. With repeated builds, usually only a couple of layers actually change, but because of the inefficiency of this API the full tarball needs to be transferred every time, even though Docker will eventually discard most of it.
A similar case appears when you want to transfer an image from one Docker instance to another (without a registry in between). If the target Docker instance already has an older version of the image and just needs the updated version, it is again likely that only one layer actually needs to be exported from the daemon and imported on the other side. But no API exists today that would allow doing this efficiently.
With the move to the containerd image backend that is currently in progress, we have some new options to solve this issue. I propose exposing containerd's contentstore API (https://pkg.go.dev/github.com/containerd/containerd@v1.6.9/api/services/content/v1#ContentServer) so that it can be used with Docker, but in a safe manner.
These APIs allow creating individual blobs in content-addressable storage, checking whether a blob with a specific digest already exists, and pulling a blob's contents.
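To make the three operations concrete, here is a minimal in-memory sketch of a content-addressable store. It is not containerd's API: `Store`, `Digest`, `Write`, `Exists`, and `Read` are hypothetical names chosen for illustration, and it assumes sha256 digests written as `sha256:<hex>`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"sync"
)

// Errors returned by the hypothetical store.
var (
	ErrDigestMismatch = errors.New("content does not match expected digest")
	ErrNotFound       = errors.New("blob not found")
)

// Store is an illustrative in-memory content-addressable blob store.
// Blobs are keyed by the digest of their own bytes.
type Store struct {
	mu    sync.RWMutex
	blobs map[string][]byte
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{blobs: make(map[string][]byte)}
}

// Digest returns the "sha256:<hex>" digest of data.
func Digest(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// Write stores data under expected and rejects bytes that hash differently,
// so a key can never point at the wrong content.
func (s *Store) Write(expected string, data []byte) error {
	if Digest(data) != expected {
		return ErrDigestMismatch
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.blobs[expected] = append([]byte(nil), data...) // keep a private copy
	return nil
}

// Exists reports whether a blob with the given digest is already stored.
func (s *Store) Exists(digest string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.blobs[digest]
	return ok
}

// Read returns a copy of the blob's contents.
func (s *Store) Read(digest string) ([]byte, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	data, ok := s.blobs[digest]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), data...), nil
}
```

A client would call `Exists` first and only `Write` the blobs the store does not have yet, which is what avoids retransmitting unchanged layers.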
We can use the existing /grpc endpoint, which can be upgraded to a gRPC connection (currently used by BuildKit).
Containerd APIs on their own are unsafe: calling them with incorrect parameters can completely corrupt the internal storage, delete objects that are in use by other users, or leak objects forever so that their storage is never released. The Docker API should prevent these cases.
Before a client calls the content API methods, it would need to make a call that creates a lease. Leases and GC labels are how containerd controls what objects can be deleted by the GC. All subsequent content API calls need to contain a lease ID and will error otherwise. A lease has a timeout (for example, 10 min by default). If the client needs more time to finish its transactions, it can ask for the timeout to be extended. If the client crashes and terminates the connection to the daemon, all objects associated with the lease get their reference count reduced when the timeout is reached. If the reference count reaches 0, the object is deleted by the containerd GC.
With the lease, the client can check which blobs the daemon already knows about. If it needs to, it can upload the newer blobs, which remain associated with the lease. Blobs containing the image manifest and config are uploaded in the same way. Finally, the client makes a CreateImage() call that contains the digest of the root manifest (either an image manifest or a manifest index for multi-platform images). This method validates that the digest points to a valid manifest with all layers present in the contentstore and creates a new record under docker images. Now all the blobs are associated with the image itself, and the lease is no longer needed to keep the blobs from getting deleted.
For safety, the Delete() method of the contentstore API is completely disabled in Moby. The only way to manage storage is to create leases and delete them.
These APIs enable clients to efficiently create new images when the daemon already knows about some of the layers (iterative builds, deployments), as well as to export only the layers the client needs out of an image instead of downloading the full docker save tarball.