❌

Reading view

There are new articles available, click to refresh the page.

Learn How to Optimize Docker Hub Costs With Our Usage Dashboards

Effective infrastructure management is crucial for organizations using Docker Hub. Without a clear understanding of resource consumption, unexpected usage can emerge and skyrocket. This is particularly true if pulls and storage needs are not budgeted and forecasted correctly. By implementing proactive post controls and monitoring usage patterns, development teams can sustain their Docker Hub usage while keeping expenses under control.Β 

To support these goals, we’ve introduced new Docker Hub Usage dashboards, offering organizations the ability to access and analyze their usage patterns for storage and pulls.Β 

Docker Hub’s Usage dashboards put you in control, giving visibility into every pull and image your Docker systems request. Each pull and cache becomes a deliberate choice β€” not a random event β€” so you can make every byte count. With clear insights into what’s happening and why, you can design more efficient, optimized systems.

2400x1260 generic hub blog c

Reclaim control and manage technical resources by kicking bad habits

hub usage f1
Figure 1: Docker Hub Usage dashboards.

The Docker Hub Usage dashboards (Figure 1) provide valuable insights, allowing teams to track peaks and valleys, detect high usage periods, and identify the images and repositories driving the most consumption. This visibility not only aids in managing usage but also strengthens continuous improvement efforts across your software supply chain, helping teams build applications more efficiently and sustainably.Β 

This information helps development teams to stay on top of challenges, such as:Β 

  • Redundant pulls and misconfigured repositories: These can quickly and quietly drive up technical expenses while falling out of scope of the most relevant or critical use cases. Docker Hub’s Usage dashboards can help development teams identify patterns and optimize accordingly. They let you view usage trends across IPs and users as well, which helps with pinpointing high consumption areas and ensuring accountability in an organization when it comes to resource management.Β 
  • Poor caching management: Repository insights and image tagging helps customers assess internal usage patterns, such as frequently accessed images, where there might be an opportunity to improve caching. With proper governance models, organizations can also establish policies and processes that reduce the variability of resource usage as a whole. This goal goes beyond keeping track of seasonality usage patterns to help you design more predictable usage patterns so you can budget accordingly.Β 
  • Accidental automation: Accidental automated system activities can really hurt your usage. Let’s say you are using a CI/CD pipeline or automated scripts configured to pull images more often than they should. They may pull on every build instead of the actual version change, for example.Β 

Usage dashboards can help you identify these inefficiencies by showing detailed pull data associated with automated tooling. This information can help your teams quickly identify and adjust misconfigured systems, fine-tune automations to only pull when needed, and ultimately focus on the most relevant use cases for your organization, avoiding accidental overuse of resources:

Details of Docker Hub Usage Dashboard with columns for Date/hour, Username, Repository library, IPs, Version checks, pulls, and more.
Figure 2: Details from the Usage dashboards.

Docker Hub’s Usage dashboards offer a comprehensive view of your usage data, including downloadable CSV reports that include metrics such as pull counts, repository names, IP addresses, and version checks (Figure 2). This granular approach allows your organization to gain valuable insights and trend data to help optimize your team’s workflows and inform policies.Β 

Integrate robust operational principles into your development pipeline by leveraging these data-driven reports and maintain control over resource consumption and operational efficiency with Docker Hub.Β 

Learn more

Docker Best Practices: Using Tags and Labels to Manage Docker Image Sprawl

With many organizations moving to container-based workflows, keeping track of the different versions of your images can become a problem. Even smaller organizations can have hundreds of container images spanning from one-off development tests, through emergency variants to fix problems, all the way to core production images. This leads us to the question: How can we tame our image sprawl while still rapidly iterating our images?

A common misconception is that by using the β€œlatest” tag, you are guaranteeing that you are pulling the β€œlatest” version of the image. Unfortunately, this assumption is wrong β€” all latest means is β€œthe last image pushed to this registry.”

Read on to learn more about how to avoid this pitfall when using Docker and how to get a handle on your Docker images.

Docker Best Practices: Using Tags and Labels to Manage Docker Image Sprawl

Using tags

One way to address this issue is to use tags when creating an image. Adding one or more tags to an image helps you remember what it is intended for and helps others as well. One approach is always to tag images with their semantic versioning (semver), which lets you know what version you are deploying. This sounds like a great approach, and, to some extent, it is, but there is a wrinkle.

Unless you’ve configured your registry for immutable tags, tags can be changed. For example, you could tag my-great-app as v1.0.0 and push the image to the registry. However, nothing stops your colleague from pushing their updated version of the app with tag v1.0.0 as well. Now that tag points to their image, not yours. If you add in the convenience tag latest, things get a bit more murky.

Let’s look at an example:

FROM busybox:stable-glibc

# Create a script that outputs the version
RUN echo -e "#!/bin/sh\n" > /test.sh && \
    echo "echo \"This is version 1.0.0\"" >> /test.sh && \
    chmod +x /test.sh

# Set the entrypoint to run the script
ENTRYPOINT ["/bin/sh", "/test.sh"]

We build the above with docker build -t tagexample:1.0.0 . and run it.

$ docker run --rm tagexample:1.0.0
This is version 1.0.0

What if we run it without a tag specified?

$ docker run --rm tagexample
Unable to find image 'tagexample:latest' locally
docker: Error response from daemon: pull access denied for tagexample, repository does not exist or may require 'docker login'.
See 'docker run --help'.

Now we build with docker build . without specifying a tag and run it.

$ docker run --rm tagexample
This is version 1.0.0

The latest tag is always applied to the most recent push that did not specify a tag. So, in our first test, we had one image in the repository with a tag of 1.0.0, but because we did not have any pushes without a tag, the latest tag did not point to an image. However, once we push an image without a tag, the latest tag is automatically applied to it.

Although it is tempting to always pull the latest tag, it’s rarely a good idea. The logical assumption β€” that this points to the most recent version of the image β€” is flawed. For example, another developer can update the application to version 1.0.1, build it with the tag 1.0.1, and push it. This results in the following:

$ docker run --rm tagexample:1.0.1
This is version 1.0.1

$ docker run --rm tagexample:latest
This is version 1.0.0

If you made the assumption that latest pointed to the highest version, you’d now be running an out-of-date version of the image.

The other issue is that there is no mechanism in place to prevent someone from inadvertently pushing with the wrong tag. For example, we could create another update to our code bringing it up to 1.0.2. We update the code, build the image, and push it β€” but we forget to change the tag to reflect the new version. Although it’s a small oversight, this action results in the following:

$ docker run --rm tagexample:1.0.1
This is version 1.0.2

Unfortunately, this happens all too frequently.

Using labels

Because we can’t trust tags, how should we ensure that we are able to identify our images? This is where the concept of adding metadata to our images becomes important.

The first attempt at using metadata to help manage images was the MAINTAINER instruction. This instruction sets the β€œAuthor” field (org.opencontainers.image.authors) in the generated image. However, this instruction has been deprecated in favor of the more powerful LABEL instruction. Unlike MAINTAINER, the LABEL instruction allows you to set arbitrary key/value pairs that can then be read with docker inspect as well as other tooling.

Unlike with tags, labels become part of the image, and when implemented properly, can provide a much better way to determine the version of an image. To return to our example above, let’s see how the use of a label would have made a difference.

To do this, we add the LABEL instruction to the Dockerfile, along with the key version and value 1.0.2.

FROM busybox:stable-glibc

LABEL version="1.0.2"

# Create a script that outputs the version
RUN echo -e "#!/bin/sh\n" > /test.sh && \
    echo "echo \"This is version 1.0.2\"" >> /test.sh && \
    chmod +x /test.sh

# Set the entrypoint to run the script
ENTRYPOINT ["/bin/sh", "/test.sh"]

Now, even if we make the same mistake above where we mistakenly tag the image as version 1.0.1, we have a way to check that does not involve running the container to see which version we are using.

$ docker inspect --format='{{json .Config.Labels}}' tagexample:1.0.1
{"version":"1.0.2"}

Best practices

Although you can use any key/value as a LABEL, there are recommendations. The OCI provides a set of suggested labels within the org.opencontainers.image namespace, as shown in the following table:

LabelContent
org.opencontainers.image.createdThe date and time on which the image was built (string, RFC 3339 date-time).
org.opencontainers.image.authorsContact details of the people or organization responsible for the image (freeform string).
org.opencontainers.image.urlURL to find more information on the image (string).
org.opencontainers.image.documentationURL to get documentation on the image (string).
org.opencontainers.image.sourceURL to the source code for building the image (string).
org.opencontainers.image.versionVersion of the packaged software (string).
org.opencontainers.image.revisionSource control revision identifier for the image (string).
org.opencontainers.image.vendorName of the distributing entity, organization, or individual (string).
org.opencontainers.image.licensesLicense(s) under which contained software is distributed (string, SPDX License List).
org.opencontainers.image.ref.nameName of the reference for a target (string).
org.opencontainers.image.titleHuman-readable title of the image (string).
org.opencontainers.image.descriptionHuman-readable description of the software packaged in the image (string).

Because LABEL takes any key/value, it is also possible to create custom labels. For example, labels specific to a team within a company could use the com.myorg.myteam namespace. Isolating these to a specific namespace ensures that they can easily be related back to the team that created the label.

Final thoughts

Image sprawl is a real problem for organizations, and, if not addressed, it can lead to confusion, rework, and potential production problems. By using tags and labels in a consistent manner, it is possible to eliminate these issues and provide a well-documented set of images that make work easier and not harder.

Learn more

❌