Docker Desktop 4.38: New AI Agent, Multi-Node Kubernetes, and Bake in GA

By: Yiwen Xu

At Docker, we’re committed to simplifying the developer experience and empowering enterprises to scale securely and efficiently. With the Docker Desktop 4.38 release, teams can look forward to improved developer productivity and enterprise governance. 

We’re excited to announce the General Availability of Bake, a powerful feature for optimizing build performance, along with multi-node Kubernetes testing to help teams “shift left.” We’re also expanding availability for several enterprise features designed to boost operational efficiency. And last but not least, Docker AI Agent (formerly Project: Agent Gordon) is now in Beta, delivering intelligent, real-time Docker-related suggestions across the Docker CLI, Desktop, and Hub. It’s here to help developers navigate Docker concepts, fix errors, and boost productivity.


Docker’s AI Agent boosts developer productivity  

We’re thrilled to introduce Docker AI Agent (formerly known as Project: Agent Gordon) — an embedded, context-aware assistant seamlessly integrated into the Docker suite. Available within Docker Desktop and the CLI, this innovative agent delivers real-time, tailored guidance for tasks like container management and Docker-specific troubleshooting — eliminating disruptive context-switching. Docker AI Agent can be used for every Docker-related concept and technology, whether you’re getting started, optimizing an existing Dockerfile or Compose file, or understanding Docker technologies in general. By addressing challenges precisely when and where developers encounter them, Docker AI Agent ensures a smoother, more productive workflow.

The first iteration of Docker’s AI Agent is now available in Beta for all signed-in users. The agent is disabled by default, so user activation is required. Read more about Docker’s new AI Agent and how to use it to accelerate developer velocity here.


Figure 1: Asking questions to Docker AI Agent in Docker Desktop

Simplify build configurations and boost performance with Docker Bake

Docker Bake is an orchestration tool that simplifies and speeds up Docker builds. After launching as an experimental feature, we’re thrilled to make it generally available with exciting new enhancements.

While Dockerfiles are great for defining build steps, teams often juggle docker build commands with various options and arguments — a tedious and error-prone process. Bake changes the game by introducing a declarative file format that consolidates all options and image dependencies (also known as targets) in one place. No more passing flags to every build command! Plus, Bake’s ability to parallelize and deduplicate work ensures faster and more efficient builds.
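
For illustration, a minimal docker-bake.hcl might look like the following sketch (the target names and tags are hypothetical):

group "default" {
  targets = ["app", "worker"]
}

target "app" {
  context    = "."
  dockerfile = "Dockerfile"
  tags       = ["myorg/app:latest"]
}

target "worker" {
  context    = "./worker"
  tags       = ["myorg/worker:latest"]
}

With this file in place, a single docker buildx bake builds both targets in parallel, deduplicating any shared layers.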

Key benefits of Docker Bake

  • Simplicity: Abstract complex build configurations into one simple command.
  • Flexibility: Write build configurations in a declarative syntax, with support for custom functions, matrices, and more.
  • Consistency: Share and maintain build configurations effortlessly across your team.
  • Performance: Bake parallelizes multi-image workflows, enabling faster and more efficient builds.

Developers can simplify multi-service builds by integrating Bake directly into their Compose files — Bake supports Compose files natively. It enables easy, efficient building of multiple images from a single repository with shared configurations. Plus, it works seamlessly with Docker Build Cloud locally and in CI. With Bake-optimized builds as the foundation, developers can achieve more efficient Docker Build Cloud performance and faster builds.

Learn more about streamlining build configurations, boosting performance, and improving team workflows with Bake in our announcement blog.

Shift Left with Multi-Node Kubernetes testing in Docker Desktop

In today’s complex production environments, “shifting left” is more essential than ever. By addressing concerns earlier in the development cycle, teams reduce costs and simplify fixes, leading to more efficient workflows and better outcomes. That’s why we continue to bring new features and enhancements that integrate feedback directly into the developer’s inner loop.

Docker Desktop now includes Multi-Node Kubernetes integration, enabling easier and more extensive testing directly on developers’ machines. While single-node clusters allow for quick verification of app deployments, they fall short when it comes to testing resilience and handling the complex, unpredictable issues of distributed systems. To tackle this, we’re updating our Kubernetes distribution with kind — a lightweight, fast, and user-friendly solution for local testing and multi-node cluster simulation.


Figure 2: Selecting Kubernetes version and cluster number for testing

Key Benefits:

  • Multi-node cluster support: Replicate a more realistic production environment to test critical features like node affinity, failover, and networking configurations.
  • Multiple Kubernetes versions: Easily test across different Kubernetes versions, which is a must for validating migration paths.
  • Up-to-date maintenance: Since kind is an actively maintained open-source project, developers can update to the latest version on demand without waiting for the next Docker Desktop release.
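
For example, once a multi-node cluster is enabled in Docker Desktop’s Kubernetes settings, you can inspect the nodes and roughly simulate a node failure (the node container name below is illustrative; check docker ps for the real names):

# List the nodes in the cluster
kubectl get nodes

# Each kind node runs as a container, so stopping one simulates a node failure
docker stop desktop-worker-2

# Watch Kubernetes reschedule pods from the lost node
kubectl get pods --all-namespaces --watch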

Head over to our documentation to discover how to use multi-node Kubernetes clusters for local testing and simulation.

General availability of administration features for Docker Business subscription

With the Docker Desktop 4.36 release, we introduced Beta enterprise admin tools to streamline administration, improve security, and enhance operational efficiency. And the feedback from our Early Access Program customers has been overwhelmingly positive. 

For instance, enforcing sign-in with macOS configuration files and across multiple organizations makes deployment easier and more flexible for large enterprises. Also, the PKG installer simplifies managing large-scale Docker Desktop deployments on macOS by eliminating the need to convert DMG files into PKG first.

Today, these features are available to all Docker Business customers.

Looking ahead, Docker is dedicated to continuing to expand enterprise administration capabilities. Stay tuned for more announcements!

Wrapping up 

Docker Desktop 4.38 reinforces our commitment to simplifying the developer experience while equipping enterprises with robust tools. 

With Bake now in GA, developers can streamline complex build configurations into a single command. The new Docker AI Agent offers real-time, on-demand guidance within their preferred Docker tools. Plus, with Multi-node Kubernetes testing in Docker Desktop, they can replicate realistic production environments and address issues earlier in the development cycle. Finally, we made a few new admin tools available to all our Business customers, simplifying deployment, management, and monitoring. 

We look forward to how these innovations accelerate your workflows and supercharge your operations! 

Learn more

Simplify AI Development with the Model Context Protocol and Docker

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

In December, we published The Model Context Protocol: Simplifying Building AI apps with Anthropic Claude Desktop and Docker. Along with the blog post, we also created Docker versions for each of the reference servers from Anthropic and published them to a new Docker Hub mcp namespace.

This provides lots of ways for you to experiment with new AI capabilities using nothing but Docker Desktop.


For example, to extend Claude Desktop to use Puppeteer, update your claude_desktop_config.json file with the following snippet:

"puppeteer": {
    "command": "docker",
    "args": ["run", "-i", "--rm", "--init", "-e", "DOCKER_CONTAINER=true",          "mcp/puppeteer"]
  }

After restarting Claude Desktop, you can ask Claude to take a screenshot of any URL using a Headless Chromium browser running in Docker.

You can do the same thing for a Model Context Protocol (MCP) server that you’ve written. You will then be able to distribute this server to your users without requiring them to have anything besides Docker Desktop.

How to create an MCP server Docker Image

An MCP server can be written in any language. However, most of the examples, including the set of reference servers from Anthropic, are written in either Python or TypeScript and use one of the official SDKs documented on the MCP site.

For typical uv-based Python projects (projects with a pyproject.toml and uv.lock in the root), or npm TypeScript projects, it’s simple to distribute your server as a Docker image.

  1. If you don’t already have Docker Desktop, sign up for a free Docker Personal subscription so that you can push your images to others.
  2. Run docker login from your terminal.
  3. Copy either this npm Dockerfile or this Python Dockerfile template into the root of your project. The Python Dockerfile will need at least one update to the last line.
  4. Run the build with the Docker CLI (instructions below).

The two Dockerfiles shown above are just templates. If your MCP server includes other runtime dependencies, you can update the Dockerfiles to include these additions. The runtime of your MCP server should be self-contained for easy distribution.
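
For reference, a Dockerfile for a uv-based Python server might look roughly like this sketch (the entrypoint name is a placeholder; the templates linked above are the canonical starting point):

FROM python:3.12-slim
# Install uv from its official distribution image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
# Install locked dependencies first to take advantage of layer caching
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-install-project --no-dev
COPY . .
RUN uv sync --frozen --no-dev
# Replace my-mcp-server with the entrypoint defined in your pyproject.toml
ENTRYPOINT ["uv", "run", "my-mcp-server"]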

If you don’t have an MCP server ready to distribute, you can use a simple mcp-hello-world project to practice. It’s a simple Python codebase containing a server with one tool call. Get started by forking the repo, cloning it to your machine, and then following the instructions below to build the MCP server image.

Building the image

Most sample MCP servers are still designed to run locally (on the same machine as the MCP client, communicating over stdio). Over the next few months, you’ll begin to see more clients supporting remote MCP servers, but for now, you need to plan for your server running on at least two different architectures (amd64 and arm64). This means you should always distribute what we call multi-platform images when your target is local MCP servers. Fortunately, this is easy to do.

Create a multi-platform builder

The first step is to create a local builder that will be able to build both platforms. Don’t worry; this builder will use emulation to build the platforms that you don’t have. See the multi-platform documentation for more details.

docker buildx create \
  --name mcp-builder \
  --driver docker-container \
  --bootstrap

Build and push the image

In the command line below, substitute <your-docker-account> and mcp-server-name with valid values, then run a build and push it to your account.

docker buildx build \
  --builder=mcp-builder \
  --platform linux/amd64,linux/arm64 \
  -t <your-docker-account>/mcp-server-name \
  --push .

Extending Claude Desktop

Once the image is pushed, your users will be able to attach your MCP server to Claude Desktop by adding an entry to claude_desktop_config.json that looks something like:

"your-server-name": {
    "command": "docker",
    "args": ["run", "-i", "--rm", "--pull=always",
             "your-account/your-server-name"]
  }

This is a minimal set of arguments. You may want to pass in additional command-line arguments, environment variables, or volume mounts.
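
For example, a configuration that passes an API key and mounts a named volume might look like this (the variable and volume names are placeholders for whatever your server actually needs):

"your-server-name": {
    "command": "docker",
    "args": ["run", "-i", "--rm", "--pull=always",
             "-e", "API_KEY=your-key-here",
             "-v", "mcp-data:/data",
             "your-account/your-server-name"]
  }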

Next steps

The MCP protocol gives us a standard way to extend AI applications. Make sure your extension is easy to distribute by packaging it as a Docker image. Check out the Docker Hub mcp namespace for examples that you can try out in Claude Desktop today.

As always, feel free to follow along in our public repo.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more


Meet Gordon: An AI Agent for Docker

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

In previous articles, we focused on how AI-based tools can help developers streamline tasks and offered ideas for enabling agentic workflows, like reviewing branches and understanding code changes.

In this article, we’ll explore our experiments around the idea of creating a Docker AI Agent — something that could both help new users learn about our tools and products and help power users get things done faster.


During our explorations around this Docker Agent and AI-based tools, we noticed that the main pain points we encountered were often the same:

  • LLMs need good context to provide good answers (garbage in -> garbage out).
  • Using AI tools often requires context switching (moving to another app, to a different website, etc.).
  • We’d like agents to be able to suggest and perform actions on behalf of the users.
  • Direct product integrations with AI are often more satisfying to use than chat interfaces.

At first, we tried to see what’s possible using off-the-shelf services like ChatGPT or Claude. 

By using testing prompts such as “optimize the following Dockerfile, following all best practices” and providing the model with a sub-par but common Dockerfile, we could sometimes get decent answers. Often, though, the resulting Dockerfile had subtle bugs, hallucinations, or simply wasn’t optimized or didn’t use many of the best practices we would’ve hoped for. Thus, this approach was not reliable enough.

Data ended up being the main issue. Training data for LLMs is always outdated by some amount of time, and the bad Dockerfiles you can find online vastly outnumber the up-to-date Dockerfiles that follow best practices.

After doing proof-of-concept tests using a RAG approach, including some documents with lots of useful advice for creating good Dockerfiles, we realized that the AI Agent idea was definitely possible. However, setting up all the things required for a good RAG would’ve taken too much bandwidth from our small team.

Because of this, we opted to use kapa.ai for that specific part of our agent. Docker already uses them to provide the AI docs assistant on Docker docs, so most of our high-quality documentation is already available for us to reference as part of our LLM usage through their service. Using kapa.ai allowed us to experiment more, get high-quality results faster, and try different ideas around the AI agent concept.

Enter Gordon

Out of this experimentation came a new product that you can try: Gordon. With Gordon, we’d like to tackle these pain points. By integrating Gordon into Docker Desktop and the Docker CLI (Figure 1), we can:

  • Access much more context that can be used by the LLMs to best understand the user’s questions and provide better answers or even perform actions on the user’s behalf.
  • Be where the users are. If you launch a container via Docker Desktop and it fails, you can quickly debug with Gordon. If you’re in the terminal hacking away, Docker AI will be there, too.
  • Avoid being a purely chat-based agent by providing Gordon-based features directly as part of Docker Desktop UI elements. If Gordon detects certain scenarios, like a container that failed to start, a button will appear in the UI to directly get suggestions, or run actions, etc. (Figure 2).
Screenshot of Docker Desktop showing the Gordon icon next to a container name in the list of containers.
Figure 1: Gordon icon on Docker Desktop.
Screenshot of Docker Desktop showing the Ask Gordon tab next to Logs, Inspect, Files, Stats and other options.
Figure 2: Ask Gordon (beta).

What Gordon can do

We want to start with Gordon by optimizing for Docker-related tasks — not general-purpose questions — but we are not excluding expanding the scope to more development-related tasks as work on the agent continues.

Work on Gordon is at an early stage and its capabilities are constantly evolving, but it’s already really good at some things (Figure 3). Here are things to definitely try out:

  • Ask general Docker-related questions. Gordon knows Docker well and has access to all of our documentation.
  • Get help debugging container build or runtime errors.
  • Remediate policy deviations from Docker Scout.
  • Get help optimizing Docker-related files and configurations.
  • Ask it how to run specific containers (e.g., “How can I run MongoDB?”).
Screenshot of results after asking Docker AI to explain a Dockerfile.
Figure 3: Using Gordon to understand a Dockerfile.

How Gordon works

The Gordon backend lives on Docker servers, while the client is a CLI that lives on the user’s machine and is bundled with Docker Desktop. Docker Desktop uses the CLI to access the local machine’s files, asking the user for the directory each time it needs that context to answer a question. When using the CLI directly, it has access to the working directory it’s executed in. For example, if you are in a directory with a Dockerfile and you run “Docker AI, rate my Dockerfile”, it will find the one that’s present in that directory.
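
For instance, from a directory containing a Dockerfile, a CLI session might look like the following sketch (the exact command surface may evolve during the Beta):

# Ask Gordon about the Dockerfile in the current directory
docker ai "rate my Dockerfile"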

Currently, Gordon does not have write access to any files, so it will not edit any of your files. We’re hard at work on future features that will allow the agent to do the work for you, instead of only suggesting solutions. 

Figure 4 shows a rough overview of how we are thinking about things behind the scenes.

Illustration showing an overview of how Gordon works, with flow steps starting with "Understand user's input" and going to "Gather context" to "prepare final prompts" then "check results", "reply to user", and more.
Figure 4: Overview of Gordon.

The first step of this pipeline, “Understand the user’s input and figure out which action to perform”, is done using “tool calling” (also known as “function calling”) with the OpenAI API.

Although this is a popular approach, we noticed that the documentation online isn’t very good, and general best practices aren’t well defined yet. This led us to experiment a lot with the feature and try to figure out what works for us and what doesn’t.

Things we noticed:

  • Tool descriptions are important, and we should prefer more in-depth descriptions with examples.
  • Testing around tool-detection code is also important. Adding new tools to a request could confuse the LLM and cause it to no longer trigger the expected tool.
  • The LLM model used influences how the whole tool calling functionality should be implemented, as different models might prefer descriptions written in a certain way, behave better/worse under certain scenarios (e.g. when using lots of tools), etc.
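
To illustrate the first point, here is a sketch of a tool definition in the OpenAI API’s format, with a deliberately in-depth description (the tool name and parameters are hypothetical):

{
  "type": "function",
  "function": {
    "name": "rate_dockerfile",
    "description": "Analyze a Dockerfile and rate it against best practices. Use this whenever the user asks to review, rate, or optimize a Dockerfile. Example: 'rate my Dockerfile'.",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {
          "type": "string",
          "description": "Path to the Dockerfile, relative to the working directory, e.g. './Dockerfile'"
        }
      },
      "required": ["path"]
    }
  }
}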

Try Gordon for yourself

Gordon is available as an opt-in Beta feature starting with Docker Desktop version 4.37. To participate in the closed beta, all you need to do is fill out the form on the site.

Initially, Gordon will be available for use both in Docker Desktop and the Docker CLI, but our idea is to surface parts of this tech in various other parts of our products as well.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

Unlocking Efficiency with Docker for AI and Cloud-Native Development

By: Yiwen Xu

The need for secure and high-quality software becomes more critical every day as the impact of vulnerabilities increases and related costs continue to rise. For example, flawed software cost the U.S. economy $2.08 trillion in 2020 alone, according to the Consortium for Information and Software Quality (CISQ). And a software defect that might cost $100 to fix if found early in the development process can grow exponentially to $10,000 if discovered later in production.

Docker helps you deliver secure, efficient applications by providing consistent environments and fast, reliable container management, building on best practices that let you discover and resolve issues earlier in the software development life cycle (SDLC).


Shifting left to ensure fewer defects

In a previous blog post, we talked about using the right tools, including Docker’s suite of products to boost developer productivity. Besides having the right tools, you also need to implement the right processes to optimize your software development and improve team productivity. 

The software development process is typically broken into two distinct loops, the inner and the outer loops. At Docker, we believe that investing in the inner loop is crucial. This means shifting security left and identifying problems as soon as you can. This approach improves efficiency and reduces costs by helping teams find and fix software issues earlier.

Using Docker tools to adopt best practices

Docker’s products help you adopt these best practices — we are focused on enhancing the software development lifecycle, especially around refining the inner loop. Products like Docker Desktop allow your dev team in the inner loop to run, test, code, and build everything fast and consistently. This consistency eliminates the “it works on my machine” issue, meaning applications behave the same in both development and production.  

Shifting left lets your dev team identify problems earlier in your software project lifecycle. When you detect issues sooner, you increase efficiency and help ensure secure builds and compliance. By shifting security left with Docker Scout, your dev teams can identify vulnerabilities sooner and help avoid issues down the road. 

Another example of shifting left involves testing — doing testing earlier in the process leads to more robust software and faster release cycles. This is where Testcontainers Cloud comes in handy, because it enables developers to run reliable integration tests with real dependencies defined in code.

Accelerate development within the hybrid inner loop

We see more and more companies adopting the so-called hybrid inner loop, which combines the best of two worlds — local and cloud. The results provide greater flexibility for your dev teams and encourage better collaboration. For example, Docker Build Cloud uses the power of the cloud to speed up build time without sacrificing the local development experience that developers love. 

By using these Docker products across the software development life cycle, teams get quick feedback loops and faster issue resolution, ensuring a smooth development flow from inception to deployment. 

Simplifying AI application development

When you’re using the right tools and processes to accelerate your application delivery and maximize efficiency throughout your SDLC, processes that were once cumbersome become your new baseline, freeing up time for true innovation. 

Docker also helps accelerate innovation by simplifying AI/ML development. We are continually investing in AI to help your developers deliver AI-backed applications that differentiate your business and enhance competitiveness.

Docker AI tools

Docker’s GenAI Stack accelerates the incorporation of large language models (LLMs) and AI/ML into your code, enabling the delivery of AI-backed applications. All containers work harmoniously and are managed directly from Docker Desktop, allowing your team to monitor and adjust components without leaving their development environment. Deploying the GenAI Stack is quick and easy, and leveraging Docker’s containerization technology helps speed setup and simplify scaling as applications grow.

Earlier this year, we announced the preview of Docker Extension for GitHub Copilot. By standardizing best practices and enabling integrations with tools like GitHub Copilot, Docker empowers developers to focus on innovation, closing the gap from the first line of code to production.

And, more recently, we launched the Docker AI Catalog in Docker Hub. This new feature simplifies the process of integrating AI into applications by providing trusted and ready-to-use content supported by comprehensive documentation. Your dev team will benefit from shorter development cycles, improved productivity, and a more streamlined path to integrating AI into both new and existing applications.

Wrapping up

Docker products help you establish sound processes and practices related to shifting left and discovering issues earlier to avoid headaches down the road. This approach ultimately unlocks developer productivity, giving your dev team more time to code and innovate. Docker also allows you to quickly use AI to close knowledge gaps and offers trusted tools to build AI/ML applications and accelerate time to market. 

To see how Docker continues to empower developers with the latest innovations and tools, check out our Docker 2024 Highlights.

Learn about Docker’s updated subscriptions and find the ideal plan for your team’s needs.

Learn more

Docker 2024 Highlights: Innovations in AI, Security, and Empowering Development Teams

In 2024, as developers and engineering teams focused on delivering high-quality, secure software faster, Docker continued to evolve with impactful updates and a streamlined user experience. This commitment to empowering developers was recognized in the annual Stack Overflow Developer Survey, where Docker ranked as one of the most loved and widely used tools for yet another year. Here’s a look back at Docker’s 2024 milestones and how we helped teams build, test, and deploy with greater ease, security, and control than ever.


Streamlining the developer experience

Docker focused heavily on streamlining workflows, creating efficiencies, and reducing the complexities often associated with managing multiple tools. One big announcement in 2024 was our upgraded Docker plans. With the launch of updated Docker subscriptions, developers now have access to the entire suite of Docker products under their existing subscription.

The all-in-one subscription model enables seamless integration of Docker Desktop, Docker Hub, Docker Build Cloud, Docker Scout, and Testcontainers Cloud, giving developers everything they need to build efficiently. By providing easy access to the suite of products and flexibility to scale, Docker allows developers to focus on what matters most — building and innovating without unnecessary distractions.

For more details on Docker’s all-in-one subscription approach, check out our Docker plans announcement.

Build up to 39x faster with Docker Build Cloud

Docker Build Cloud, introduced in 2024, brings the best of both worlds — local development and the cloud — to developers and engineering teams worldwide. It offloads resource-intensive build processes to the cloud, ensuring faster, more consistent builds while freeing up local machines for other tasks.

A standout feature is shared build caches, which dramatically improve efficiency for engineering teams working on large-scale projects. Shared caches allow teams to avoid redundant rebuilds by reusing intermediate layers of images across builds, accelerating iteration cycles and reducing resource consumption. This approach is especially valuable for collaborative teams working on shared codebases, as it minimizes duplicated effort and enhances productivity.

Docker Build Cloud also offers native support for multi-architecture builds, eliminating the need for setting up and maintaining multiple native builders. This support removes the challenges associated with emulation, further improving build efficiency.

We’ve designed Docker Build Cloud to be easy to set up wherever you run your builds, without requiring a massive lift-and-shift effort. Docker Build Cloud also works well with Docker Compose, GitHub Actions, and other CI solutions. This means you can seamlessly incorporate Docker Build Cloud into your existing development tools and services and immediately start reaping the benefits of enhanced speed and efficiency.
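
Setup follows the familiar buildx flow. A rough sketch, substituting your own organization and builder name:

# Create a cloud builder backed by Docker Build Cloud
docker buildx create --driver cloud myorg/default

# Run a build on it
docker buildx build --builder cloud-myorg-default -t myorg/app .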

Check out our build time savings calculator to estimate your potential savings in hours and dollars. 

Optimizing development workflows with performance enhancements

In 2024, Docker Desktop introduced a series of enterprise-grade performance enhancements designed to streamline development workflows at scale. These updates cater to the unique needs of development teams operating in diverse, high-performance environments.

One notable feature is the Virtual Machine Manager (VMM) in Docker Desktop for Mac, which provides a robust alternative to the Apple Virtualization Framework. Available since Docker Desktop 4.35, VMM significantly boosts performance for native Arm-based images, delivering faster and more efficient workflows for M1 and M2 Mac users. For development teams relying on Apple’s latest hardware, this enhancement translates into reduced build times and a smoother experience when working with containerized applications.

Additionally, Docker Desktop expanded its platform support to include Red Hat Enterprise Linux (RHEL) and Windows on Arm architectures, enabling organizations to maintain a consistent Docker Desktop experience across a wide array of operating systems. This flexibility ensures that development teams can optimize their workflows regardless of the underlying platform, leveraging platform-specific optimizations while maintaining uniformity in their tooling.

These advancements reflect Docker’s unwavering commitment to speed, reliability, and cross-platform support, ensuring that development teams can scale their operations without bottlenecks. By minimizing downtime and enhancing performance, Docker Desktop empowers developers to focus on innovation, improving productivity across even the most demanding enterprise environments.

More options to improve file operations for large projects

We enhanced Docker Desktop with synchronized file shares (Figure 1), a feature that can significantly improve file operation speeds by 2-10x. This enhancement brings fast and flexible host-to-VM file sharing, offering a performance boost for developers dealing with extensive codebases.

Synchronized file sharing is ideal for developers who:

  • Develop on projects that consist of a significant number of files (such as PHP or Node projects).
  • Develop using large repositories or monorepos with more than 100,000 files, totaling significant storage.
  • Utilize virtual file systems (such as VirtioFS, gRPC FUSE, or osxfs) and face scalability issues with their workflows.
  • Encounter performance limitations and want a seamless file-sharing solution without worrying about ownership conflicts.

This integration streamlines workflows, allowing developers to focus more on coding and less on managing file synchronization issues and slow file read times. 

Screenshot of Docker Desktop showing Synchronized file shares within Resources.
Figure 1: Synchronized file shares.

Enhancing developer productivity with Docker Debug 

Docker Debug enhances the ability of developer teams to debug any container, especially those without a shell (that is, distroless or scratch images). The ability to peek into “secure” images significantly improves the debugging experience for both local and remote containerized applications. 

Docker Debug does this by attaching a dedicated debugging toolkit to any image and allows developers to easily install additional tools for quick issue identification and resolution. Docker Debug not only streamlines debugging for both running and stopped containers but also is accessible directly from both the Docker Desktop CLI and GUI (Figure 2). 

Screenshot of Docker Desktop showing Docker Debug.
Figure 2: Docker Debug.

Being able to troubleshoot images without modifying them is crucial for maintaining the security and performance of containerized applications, especially those images that traditionally have been hard to debug. Docker Debug offers:

  • Streamlined debugging process: Easily debug local and remote containerized applications, even those not running, directly from Docker Desktop.
  • Cross-device and cloud compatibility: Initiate debugging effortlessly from any device, whether local or in the cloud, enhancing flexibility and productivity.

Docker Debug improves productivity through seamless integration. The docker debug command simplifies attaching a shell to any container or image. This capability reduces the cognitive load on developers, allowing them to focus on solving problems rather than configuring their environment.
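
For example (the container name here is illustrative):

# Get a debug shell inside a running container, even a distroless one
docker debug my-app-container

# The same works for an image that was never started
docker debug nginx:latest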

Ensuring reliable image builds with Docker Build checks

Docker Desktop 4.33 was a big release because, in addition to including the GA release of Docker Debug, it included the GA release of Docker Build checks, a new feature that ensures smoother and more reliable image builds. Build checks automatically validate common issues in your Dockerfiles before the build process begins, catching errors like invalid syntax, unsupported instructions, or missing dependencies. By surfacing these issues upfront, Docker Build checks help developers save time and avoid costly build failures.

You can access Docker Build checks in the CLI and in the Docker Desktop Builds view. The feature also works seamlessly with Docker Build Cloud, both locally and through CI. Whether you’re optimizing your Dockerfiles or troubleshooting build errors, Docker Build checks let you create efficient, high-quality container images with confidence — streamlining your development workflow from start to finish.
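
You can also run the checks on their own without executing a full build; for example:

# Validate the Dockerfile and build options without building
docker build --check .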

Onboarding and learning resources for developer success  

To further reduce friction, Docker revamped its learning resources and integrated new tools to enhance developer onboarding. By adding beginner-friendly tutorials, Docker’s learning center makes it easier for developers to ramp up and quickly learn to use Docker tools, helping them spend more time coding and less time troubleshooting. 

As Docker continues to rank as a top developer tool globally, we’re dedicated to empowering our community with continuous learning support.

Built-in container security from code to production

In an era where software supply chain security is essential, Docker has raised the bar on container security. With integrated security measures across every phase of the development lifecycle, Docker helps teams build, test, and deploy confidently.

Proactive security insights with Docker Scout Health Scores

Docker Scout, launched in 2023, has become a cornerstone of Docker’s security ecosystem, empowering developer teams to identify and address vulnerabilities in container images early in the development lifecycle. By integrating with Docker Hub, Docker Desktop, and CI/CD workflows, Scout ensures that security is seamlessly embedded into every build.

Addressing vulnerabilities during the inner loop — the development phase — is estimated to be up to 100 times less costly than fixing them in production. This underscores the critical importance of early risk visibility and remediation for engineering teams striving to deliver secure, production-ready software efficiently.

In 2024, we announced Docker Scout Health Scores (Figure 3), a feature designed to better communicate the security posture of container images development teams use every day. Docker Scout Health Scores provide a clear letter-grade system (A to F) that evaluates common vulnerabilities and exposures (CVEs) for software components within Docker Hub. This feature allows developers to quickly assess and wisely choose trusted content for a secure software supply chain.

Screenshot of Docker Scout health score page showing checks for high-profile vulnerabilities, supply chain attestations, unapproved images, outdated images, and more.
Figure 3: Docker Scout health score.

For a deeper dive, check out our blog post on enhancing container security with Docker Scout and secure repositories.

Air-gapped containers: Enhanced security for isolated environments

Docker introduced support for air-gapped containers in Docker Desktop 4.31, addressing the unique needs of highly secure, offline environments. Air-gapped containers enable developers to build, run, and test containerized applications without requiring an active internet connection. 

This feature is crucial for organizations operating in industries with stringent compliance and security requirements, such as government, healthcare, and finance. By allowing developers to securely transfer container images and dependencies to air-gapped systems, Docker simplifies workflows and ensures that even isolated environments benefit from the power of containerization.

Strengthening trust with SOC 2 Type 2 and ISO 27001 certifications

Docker also achieved two major milestones in its commitment to security and reliability: SOC 2 Type 2 attestation and ISO 27001 certification. These globally recognized standards validate Docker’s dedication to safeguarding customer data, maintaining robust operational controls, and adhering to stringent security practices. SOC 2 Type 2 attestation focuses on the effective implementation of security, availability, and confidentiality controls, while ISO 27001 certification ensures compliance with best practices for managing information security systems.

These certifications provide developers and organizations with increased confidence in Docker’s ability to support secure software supply chains and protect sensitive information. They also demonstrate Docker’s focus on aligning its services with the needs of modern enterprises.

Accelerating success for development teams and organizations

In 2024, Docker introduced a range of features and enhancements designed to empower development teams and streamline operations across organizations. From harnessing the potential of AI to simplifying deployment workflows and improving security, Docker’s advancements are focused on enabling teams to work smarter and build with confidence. By addressing key challenges in development, management, and security, Docker continues to drive meaningful outcomes for developers and businesses alike.

Docker Home: A central hub to access and manage Docker products

Docker introduced Docker Home (Figure 4), a central hub for users to access Docker products, manage subscriptions, adjust settings, and find resources — all in one place. This approach simplifies navigation for developers and admins. Docker Home allows admins to manage organizations, users, and onboarding processes, with access to dashboards for monitoring Docker usage.

Future updates will add personalized features for different roles, and business subscribers will gain access to tools like the Docker Support portal and organization-wide notifications.

Screenshot of Docker Home showing options to explore Docker products, Admin console, and more.
Figure 4: Docker Home.

Empowering AI innovation  

Docker’s ecosystem supports AI/ML workflows, helping developers work with these cutting-edge technologies while staying cloud-native and agile. Read the Docker Labs GenAI series to see how we’re innovating and experimenting in the open.

Through partnerships like those with NVIDIA and GitHub, Docker ensures seamless integration of AI tools, allowing teams to rapidly experiment, deploy, and iterate. This emphasis on enabling advanced tech aligns Docker with organizations looking to leverage AI and ML in containerized environments.

Optimizing AI application development with Docker Desktop and NVIDIA AI Workbench

Docker and NVIDIA partnered to integrate Docker Desktop with NVIDIA AI Workbench, streamlining AI development workflows. This collaboration simplifies setup by automatically installing Docker Desktop when selected as the container runtime in AI Workbench, allowing developers to focus on creating, testing, and deploying AI models without configuration hassles. By combining Docker’s containerization capabilities with NVIDIA’s advanced AI tools, this integration provides a seamless platform for model training and deployment, enhancing productivity and accelerating innovation in AI application development. 

Docker + GitHub Copilot: AI-powered developer productivity

We announced that Docker joined GitHub’s Partner Program and unveiled the Docker extension for GitHub Copilot (@docker). This extension is designed to assist developers in working with Docker directly within their GitHub workflows. This integration extends GitHub Copilot’s technology, enabling developers to generate Docker assets, learn about containerization, and analyze project vulnerabilities using Docker Scout, all from within the GitHub environment.

Accelerating AI development with the Docker AI catalog

Docker launched the AI Catalog, a curated collection of generative AI images and tools designed to simplify and accelerate AI application development. This catalog offers developers access to powerful models like IBM Granite, Llama, Mistral, Phi 2, and SolarLLM, as well as applications such as JupyterHub and H2O.ai. By providing essential tools for machine learning, model deployment, inference optimization, orchestration, ML frameworks, and databases, the AI Catalog enables developers to build and deploy AI solutions more efficiently. 

The Docker AI Catalog addresses common challenges in AI development, such as decision overload from the vast array of tools and frameworks, steep learning curves, and complex configurations. By offering a curated list of trusted content and container images, Docker simplifies the decision-making process, allowing developers to focus on innovation rather than setup. This initiative underscores Docker’s commitment to empowering developers and publishers in the AI space, fostering a more streamlined and productive development environment. 

Streamlining enterprise administration 

Simplified deployment and management with Docker’s MSI and PKG installers

Docker simplifies deploying and managing Docker Desktop with the new MSI Installer for Windows and PKG Installer for macOS. The MSI Installer enables silent installations, automated updates, and login enforcement, streamlining workflows for IT admins. Similarly, the PKG Installer offers macOS users easy deployment and management with standard tools. These installers enhance efficiency, making it easier for organizations to equip teams and maintain secure, compliant environments.

These new installers also align with Docker’s commitment to simplifying the developer experience and improving organizational management. Whether you’re setting up a few machines or deploying Docker Desktop across an entire enterprise, these tools provide a reliable and efficient way to keep teams equipped and ready to build.

New sign-in enforcement options enhance security and help streamline IT administration 

Docker simplifies IT administration and strengthens organizational security with new sign-in enforcement options for Docker Desktop. These features allow organizations to ensure users are signed in while using Docker, aligning local software with modern security standards. With flexible deployment options — including macOS Config Profiles, Windows Registry Keys, and the cross-platform registry.json file — IT administrators can easily enforce policies that prevent tampering and enhance security. These tools empower organizations to manage development environments more effectively, providing a secure foundation for teams to build confidently.
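
As one example, the registry.json method amounts to placing a small file on each machine that lists the allowed organizations ("myorg" below stands in for your own Docker organization):

{
  "allowedOrgs": ["myorg"]
}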

Desktop Insights: Unlocking performance and usage analytics

Docker introduced Desktop Insights, a powerful feature that provides developers and teams with actionable analytics to optimize their use of Docker Desktop. Accessible through the Docker Dashboard, Desktop Insights offers a detailed view of resource usage, build times, and performance metrics, helping users identify inefficiencies and fine-tune their workflows (Figure 5).

Whether you’re tracking the speed of container builds or understanding how resources like CPU and memory are being utilized, Desktop Insights empowers developers to make data-driven decisions. By bringing transparency to local development environments, this feature aligns with Docker’s mission to streamline container workflows and ensure developers have the tools to build faster and more effectively.

Screenshot of Docker Insights within Admin console, showing data for Total active users, Users with license, Total Builds, Total Containers run, and more
Figure 5: Desktop Insights dashboard.

New usage dashboards in Docker Hub

Docker introduced Usage dashboards in Docker Hub, giving organizations greater visibility into how they consume resources. These dashboards provide detailed insights into storage and image pull activity, helping teams understand their usage patterns at a granular level (Figure 6). 

By breaking down data by repository, tag, and even IP address, the dashboards make it easy to identify high-traffic images or repositories that might require optimization. With this added transparency, teams can better manage their storage, avoid unnecessary image pulls, and optimize workflows to control costs.

Usage dashboards enhance accountability and empower organizations to fine-tune their Docker Hub usage, ensuring resources are used efficiently and effectively across all projects.

Screenshot of Docker Usage dashboard showing a graph of daily pulls over time.
Figure 6: Usage dashboard.

Enhancing security with organization access tokens

Docker introduced organization access tokens, which let teams manage access to Docker Hub repositories at an organizational level. Unlike personal access tokens tied to individual users, these tokens are associated with the organization itself, allowing for centralized control and reducing reliance on individual accounts. This approach enhances security by enabling fine-grained permissions and simplifying the management of automated processes and CI/CD pipelines. 

Organization access tokens offer several advantages, including the ability to set specific access permissions for each token, such as read or write access to selected repositories. They also support expiration dates, aligning with compliance requirements and bolstering security. By providing visibility into token usage and centralizing management within the Admin Console, these tokens streamline operations and improve governance for organizations of all sizes. 

Docker’s vision for 2025

Docker’s journey doesn’t end here. In 2025, Docker remains committed to expanding its support for cloud-native and AI/ML development, reinforcing its position as the go-to container platform. New integrations and expanded multi-cloud capabilities are on the horizon, promising a more connected and versatile Docker ecosystem.

As Docker continues to build for the future, we’re committed to empowering developers, supporting the open source community, and driving efficiency in software development at scale. 

2024 was a year of transformation for Docker and the developer community. With major advances in our product suite, continued focus on security, and streamlined experiences that deliver value, Docker is ready to help developer teams and organizations succeed in an evolving tech landscape. As we head into 2025, we invite you to explore Docker’s suite of tools and see how Docker can help your team build, innovate, and secure software faster than ever.

Learn more

How to Create and Use an AI Git Agent

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

In our past experiments, we started our work from the assumption that we had a project ready to work on. That means someone like a UI tech writer would need to understand Git operations in order to use the tools we built for them. Naturally, because we have been touching on Git so frequently, we wanted to try getting a Git agent started. Then, we want to use this Git agent to understand PR branches for a variety of user personas — without anyone needing to know the ins and outs of Git.

2400x1260 docker labs genai

Git as an agent

We are exploring the idea that tools are agents. So, what would a Git agent do? 

Let’s tackle our UI use case prompt. 

Previously:

You are at $PWD of /project, which is a git repo.
Force checkout {{branch}}
Run a three-dot diff of the files changed in {{branch}} compared to main using --name-only.

A drawback that isn’t shown here is that there is no authentication. So, if you haven’t fetched that branch or pulled commits already, this prompt at best will be unreliable and more than likely will fail (Figure 1):

Screenshot of Logs showing failure to authenticate.
Figure 1: No authentication occurs.

Now:

You are a helpful assistant that checks a PR for user-facing changes.
1. Fetch everything and get on latest main.
2. Checkout the PR branch and pull latest.
3. Run a three-dot git diff against main for just files. Write the output to /thread/diff.txt.

This time around, you can see that we are being less explicit about the Git operations; we have the ability to export outputs to the conversation thread; and, most importantly, we have authentication with a new prompt!

Preparing GitHub authentication

Note: These prompts should be easily adaptable to other Git providers, but we use GitHub at Docker.

Before we can do anything with GitHub, we have to authenticate. There are several ways to do this, but for this post we’ll focus on SSH-based auth rather than using HTTPS through the CLI. Without getting too deep into the Git world, we will be authenticating with keys on our machine that are associated with our account. These keys and configurations are commonly located at ~/.ssh on Linux/Mac. Furthermore, users commonly maintain Git config at ~/.gitconfig.

The .gitconfig file is particularly useful because it lets us specify carriage return rules — something that can easily cause Git to fail when running in a Linux container. We will also need to modify our SSH config to remove UseKeychain. We found these changes are enough to authenticate using SSH in the alpine/git container. But we, of course, don’t want to modify any host configuration.

We came up with a fairly simple flow that lets us prepare to use Git in a container without messing with any host SSH configs.

  1. Readonly mounts: Git config and SSH keys are stored on specific folders on the host machine. We need to mount those in.
    a. Mount ~/.ssh into a container as /root/.ssh-base readonly.
    b. Mount ~/.gitconfig into the same container as /root/.gitconfig.
  2. Copy /root/.ssh-base to /root/.ssh and make the new copy read-write.
  3. Make necessary changes to config.
  4. For the LLM, we also need it to verify the config is in the thread and the changes were made to it. In the event that it fails to make the right changes, the LLM can self-correct.
  5. Copy the .ssh directory and .gitconfig to /thread.
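
Condensed into a single docker run invocation, the flow above looks roughly like this sketch (assuming an ~/.ssh/config exists and a volume named thread is used for /thread):

docker run --rm \
  -v ~/.ssh:/root/.ssh-base:ro \
  -v ~/.gitconfig:/root/.gitconfig:ro \
  -v thread:/thread \
  --entrypoint /bin/sh \
  alpine/git -c '
    # copy the read-only keys to a writable location
    cp -r /root/.ssh-base /root/.ssh &&
    chmod -R u+rw /root/.ssh &&
    # drop the macOS-only UseKeychain option
    sed -i "/UseKeychain/d" /root/.ssh/config &&
    # persist the prepared config for later Git tools
    cp -r /root/.ssh /root/.gitconfig /thread/
  '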

All of this is baked in a prompt you can find and run on GitHub. Simply replace <username> with your system username (Figure 2).

Screenshot of mounts, showing storage of Git config and SSH keys.
Figure 2: Readonly mounts.

If you’re using the default behavior in our extension for VSCode, then the /thread/ volume is ephemeral, so you would need to make sure your agent runs it before any Git tools. However, if you use the Set prompt thread ID command in VSCode, you can keep the /thread volume around under your own tag (Figure 3). 

Tagging the thread persists everything after the prompt completes, so you would only need to run the prompt once before going on to develop your Git tools.

Screenshot of setting the prompt thread ID.
Figure 3: Set the prompt thread ID.

If you opt to persist the thread, then this prompt would only need to be run once and the Git credentials will stick around (Figure 4).

Screenshot showing the stored configuration info.
Figure 4: Stored credential data.

Now that we have our credentials ready, we need our Git tool to pick up on them.

container:
   image: alpine/git
   entrypoint:
     - /bin/sh
     - -c
   command:
     - "cp -r /thread/.ssh /root/.ssh && cp -r /thread/.gitconfig /root/.gitconfig && git --no-pager {{args|safe}}" 

Our container definition copies the SSH credentials and .gitconfig files to the correct directory before running Git commands.

Unpacking objects:  90% (9/10)
Unpacking objects: 100% (10/10)
Unpacking objects: 100% (10/10), 4.14 KiB | 141.00 KiB/s, done.
From github.com:docker/repo
   abcd123..abcd123  main           -> origin/main
 * [new tag]           branch/v2982.0.0 -> branch/v2982.0.0
# ROLE assistant
I have fetched all origin branches. The latest updates have been applied to your local repository, and any new tags have been downloaded.

Now, we can see our Git tool is authenticated as if it were us. Running the agent now, it can:

## ROLE assistant
I have run the git fetch --all command, which retrieves updates from all remotes. If you need further assistance with your Git repository, feel free to ask!

# ROLE assistant
```json
{"request":"checkout main"}
```

## ROLE assistant
```json
{"args":"checkout main"}
```

## ROLE assistant
You are already on the 'main' branch, and it is up to date with 'origin/main'.

# ROLE assistant
```json
{"request":"checkout the branch branch and pull latest changes."}
```

...
## ROLE assistant
The branch branch has been successfully checked out, and it's already up to date with the latest changes.
...

# ROLE assistant
```json
{"request":"Run a three-dot git diff against main and write the output for just files to /thread/diff.txt."}
```

...

## ROLE assistant
```json
{"args":"diff main... --name-only","outfile":"/thread/diff.txt"}
```

*entering tool*

## ROLE tool (git)
file1.tsx
file2.tsx
...
## ROLE assistant

The three-dot Git diff against the main branch has been executed, and the output has been saved to /thread/diff.txt. This includes the names of files that have differences.

With the new Git agent, we have enabled our assistant to fetch new information from upstream, switch branches, and pull in new code to review. If we provide our agent with credentials that are limited to read-only access, this agent could be used by someone who does not have Git installed locally.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

How AI Assistants Can Decode GitHub Repos for UI Writers

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

Can an AI-powered assistant understand a GitHub repo enough to answer questions for UI writers?

2400x1260 docker labs genai

Across many projects, user-facing content is rendered based on some sort of client-side code. Whether a website, a game, or a mobile app, it’s critical to nail the text copy displayed to the user.

So let’s take a sample question: Do any open PRs in this project need to be reviewed for UI copy? In other words, we want to scan a GitHub repo’s PRs and gain intelligence about the changes included.

Disclaimer: The best practice to accomplish this at a mature organization would be to implement internationalization (i18n), which would facilitate centralized user-facing text. However, in a world of AI-powered tools, we believe our assistants will help minimize friction for all projects, not just ones that have adopted i18n.

So, let’s start off by seeing what options we already have.

The first instinct might be to open the new Copilot assistant in the GitHub nav.

Figure 1: Type / to search.

We tried to get it to answer basic questions first: “How many PRs are open?”

Figure 2: “How many PRs are open?” The answer doesn’t give a number.

Despite having access to the GitHub repo, the Copilot agent provides less helpful information than we might expect.

Figure 3: Copilot is powered by AI, so mistakes are possible.

We don’t even get a number like we asked for, despite GitHub surfacing that information on the repository’s main page. Following up our first query with the main query we want to ask effectively just gives us the same answer.

Figure 4: The third PR is filesharing: add some missing contexts.

And, after inspecting the third PR in the list, it doesn’t contain user-facing changes. One great indicator for this web project is the lack of any client-side code being modified. This was a backend change, so we didn’t want to see this one.

Figure 5: The PR doesn’t contain user-facing changes.

So let’s try to improve this:

First prompt file

---
functions:
  - name: bash
    description: Run a bash script in the utilities container.
    parameters:
      type: object
      properties:
        command:
          type: string
          description: The command to send to bash
    container:
      image: wbitt/network-multitool
      command:
        - "bash"
        - "-c"
        - "{{command|safe}}"
  - name: git
    description: Run a git command.
    parameters:
      type: object
      properties:
        command:
          type: string
          description: The git command to run, excluding the `git` command itself
    container:
      image: alpine/git
      entrypoint:
        - "/bin/sh"
      command:
        - "-c"
        - "git --no-pager {{command|safe}}"
---

# prompt system

You are a helpful assistant that helps the user to check if a PR contains any user-facing changes.

You are given a container to run bash in with the following tools:

  curl, wget, jq, and the default Alpine Linux tools.

# prompt user
You are at $PWD of /project, which is a git repo.

Checkout branch `{{branch}}`.

Diff the changes and report any containing user-facing changes.

This prompt was promising, but it turned out to have a few blocking flaws, because using git to compare files is quite tricky for an LLM:

  • git diff uses a pager, and therefore needs the --no-pager arg to send stdout to the conversation.
  • The total number of files affected via git diff can be quite large.
  • Given each file, the raw diff output can be massive and difficult to parse.
  • The important files changed in a PR might be buried with many extra files in the diff output.
  • The container has many more tools than necessary, allowing the LLM to hallucinate.

The agent needs some understanding of the repo to determine the sorts of files that contain user-facing changes, and it needs to be capable of seeing just the important pieces of information.
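
One way to picture “seeing just the important pieces”: filter the changed paths down to the file types that can actually carry user-facing copy. Here’s a minimal sketch; the extension list is our assumption for this particular web project, not part of the original prompt:

```python
# Keep only the paths whose extension suggests rendered UI content.
# The extension list is an assumption for this web project, not a rule.
UI_EXTENSIONS = (".tsx", ".jsx", ".html")

def user_facing_candidates(paths):
    """Filter a changed-files list down to likely user-facing files."""
    return [p for p in paths if p.endswith(UI_EXTENSIONS)]

print(user_facing_candidates(["api/server.go", "src/Login.tsx", "README.md"]))
# -> ['src/Login.tsx']
```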

Our next pass involves a few tweaks:

  • Switch to the alpine/git image and a file writer as the only tools necessary.
  • Use the --name-only and --no-pager args.

With these tweaks in place, the agent’s output looked much better:
# ROLE assistant


The following files are likely to contain user-facing changes as they mainly consist of UI components, hooks, and API functionalities.

```
file1.ts
file2.tsx
file3.tsx
...
```
Remember that this isn’t a guarantee of whether there are user-facing changes, but just an indication of where they might be if there are any.

Giving the agent the run-javascript-sandbox tool allowed it to write a script that saves the output for later.

Figure 6: Folder called user-changes with files.txt.

To check out the final prompt, use our Gist.

Expert knowledge

This is a great start; however, we now need to inspect the files themselves for user-facing changes. When we started this, we realized that user-facing changes could manifest in a diverse set of diffs, so we needed to include expert knowledge. We synced up with Mark Higson, a staff SWE currently working on the frontend platform here at Docker. Mark provided some key advice about what “user-facing” changes look like in many repos at Docker, and I baked those tips into the prompt.

Straightforward approaches

Looking for changes in text nodes found in a JSX tree is the easiest example.

JSX node with interpolation

<div>{functionReturningString()}</div>

If the function returns a string, the result is probably user-facing, but the code that creates the string could live elsewhere, so look for:

Nuanced indicators

  • Standard user-facing components. Example: notifications. If a notification’s props change, we can likely infer that it is a user-facing change.
  • Constructors for commonly used components. Example: errors. If an Error() is constructed with a different argument, we know that error could show up differently.

The key for UI reviewers is how much of the text changed, rather than the layout.

So, despite not being able to catch everything that could be rendered, we will be able to deliver value by focusing on these areas.

Diffing the files

There are a few approaches to finding changes in the files. To start, we’ll use git again.

Before we included the expert advice, the LLM was hallucinating changes that weren’t actually represented in the diff output. With the advice in place, we see much more focused outputs from the LLM.

Our second prompt just needs to read the user-facing files that we already saved, and then compare them to main. After a bit of trial and error with git, we found that a combination of --no-color and --minimal was able to give the LLM enough context without consuming too many tokens for each change.
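
Concretely, the resulting invocation looks something like the sketch below, wrapped in Python’s subprocess for illustration; the file list comes from the files.txt that the first prompt saved:

```python
import subprocess

# Read the user-facing candidate files saved by the first prompt.
with open("/thread/user-changes/files.txt") as f:
    files = f.read().split()

# --no-pager keeps stdout in the conversation; --no-color and --minimal
# keep the diff small enough for the LLM to read without wasting tokens.
diff = subprocess.run(
    ["git", "--no-pager", "diff", "main", "--no-color", "--minimal", *files],
    capture_output=True, text=True, check=True,
).stdout
print(diff)
```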

There’s a lot of buzz around generating and reading diffs with LLMs. The teams at Aider and Cursor are both focusing on fine-tuning LLMs using speculative decoding.

For our use cases, however, we found that the average “big” PR at Docker was not a problem for traditional LLMs when given good advice and the right tools.

So here’s the second prompt:

## User-facing content

The following pieces of code are what we want to focus on for user-facing content:

Text Nodes in any JSX or TSX files.

Text Nodes in any HTML files.

Standard user-facing components. Example: Definitions/Constructors for notifications, toasts, and errors.

# prompt user

Can you tell me if there are any user-facing changes in {{branch}}?

To check out the full prompt, use our Gist.

Output

Note: Some full paths, branches, and diffs are trimmed or replaced with placeholders for this article.

# ROLE assistant

```json
{
  "args": ["checkout", "branch"]
}
```

# ROLE tool (git)


Already on 'branch'
Your branch is up to date with 'origin/branch'.

# ROLE assistant

```json
{
  "path": "/thread/user-changes/files.txt"
}
```

*entering tool*

# ROLE tool (read-file)

path1
path2
path3
...rest of paths

# ROLE assistant

```json
{
  "args": ["diff", "main", "--no-color", "--minimal", path1 path2 path3...]
}
```

*entering tool*

# ROLE tool (git)

<Diff output truncated>
# ROLE assistant

The following user-facing changes were detected between the "main" branch and the "branch" branch:

1. In `File1.tsx` file, the text "Sign In" was modified to "Sign in".

2. In `File2.ts` file, titles "Databases & Storage", "Monitoring & Observability", "Developer Tools", "New Extensions", were changed to "Databases & storage", "Monitoring & observability", "Developer tools", "New extensions" respectively.

3. In `File3.rtl.test.tsx.snap` and `ImageHeader/index.tsx files`, the text "Compressed Size " was changed to "Compressed size".

4. In `File4.rtl.test.tsx` and `ImageDetail/index.tsx files`, a test-case previously checking for the existence of "Compressed Size" now looks for "Compressed size".

5. In `File5.tsx file`, the button text "Add Collaborator" has been changed to "Add collaborator".

6. In `route.tsx file`, the collaborators table title "Current Collaborators" was changed to "Current collaborators".

7. In `File6.tsx` file, the description was updated.

8. In `File7.tsx`, `RepositoryOverview.tsx`, `RepositoryReadMe.tsx`, `TagsAndScans.tsx` files, the description when there is no category or description was updated.

9. In `File8.rtl.test.tsx`, `DeleteRepo.tsx`, the text for confirming repo deletion and the delete button text was adjusted.

10. In `route.tsx file`, the filter label "Trusted Content" was changed to "Trusted content".

11. In `File9.tsx`, the text "Your Organizations" has been changed to "Your organizations".

12. In `Component/index.tsx`, the search group label "Trusted Content" has been changed to "Trusted content".

13. In `file10.test.ts` file, a test-case was updated to check for the new text "Sign in".

Overall, this was a positive result:

Figure 7: Mark’s response “Damn, that’s nice output (assuming it’s accurate).”

Follow up:

Figure 8: Mark’s response, “I checked, it’s accurate.”

Another example from a different PR:

The changes in the `branch` branch compared to `main` include several modifications that affect user-facing content, particularly in JSX and TSX files. Here is a summary of the key changes:

...

These changes primarily involve simplification and refactoring of the existing UI logic, possibly to streamline the user interaction by removing complexities related to feature flags and reducing the use of modals or conditional rendering for specific purchasing flows.

Try it yourself

Here is a markdown file that you can paste into VSCode to try these prompts on your own branch. In the last line, update my-branch to one of your local branches that you’d like to review: https://gist.github.com/ColinMcNeil/2e8f25e2d4092f3c7a0ce8992d2e197c#file-readme-md

Next steps

This is already a promising flow. For example, a tech writer could clone the git repo and run this prompt to inspect a branch for user-facing changes. From here, we might extend the functionality:

  • Allow the user to specify a PR to review, without needing to know the branch or use git.
  • Automatic git clone & pull with auth.
  • Support larger PRs (more than 15 changed files) by allowing agents to automate their tasks.
  • “Baking” the final flow into CI/CD so that it can automatically assign reviewers to relevant PRs.

If you’re interested in running this prompt on your own repo or just want to follow along with the code, watch our new public repo and reach out. We also appreciate your GitHub Stars.

Everything we’ve discussed in this blog post is available for you to try out on your own projects. 

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

Extending the Interaction Between AI Agents and Editors

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

We recently met up with developers at GitHub Universe at the Docker booth. In addition to demonstrating the upcoming Docker agent extension for GitHub Copilot, we also hosted a “Hack with Docker Labs” session. 


To facilitate these sessions, we created a VSCode extension to explore the relationship between agents and tools. We encouraged attendees to think about how agents can change how we interact with tools by mixing tool definitions (anything you can package in a Docker container) with prompts using a simple Markdown-based canvas.

Many of these sessions followed a simple pattern.

  • Choose a tool and describe what you want it to do.
  • Let the agent interact with that tool.
  • Ask the agent to explain what it did or adjust the strategy and try again.

It was great to facilitate these discussions and learn more about how agents are challenging us to interact with tools in new ways.  

Figure 1 shows a short example of a session where we generated a QR code (qrencode). We start by defining both a tool and a prompt in the Markdown file. Then, we pass control over to the agent and let it interact with that tool (the output from the agent pops up on the right-hand side).

Animated gif showing generation of QR code using tool and prompt in Markdown file.
Figure 1: Generating a QR code.

Feel free to create an issue in our repo if you want to learn more.

Editors

This year’s trip to GitHub Universe also felt like an opportunity to reflect on how developer workflows are changing with the introduction of coding assistants. Developers have had language services in the editor for a long time now, but coding assistants that can predict the next most likely tokens have taught us something new: we were all writing more or less the same programs (Figure 2).

Diagram showing process flow to and from Editor, Language Service, and LLM.
Figure 2: Language service interaction.

Other agents

Tools like Cursor, GitHub Copilot Chat, and others are also teaching us new ways in which coding assistants are expanding beyond simple predictions. In this series, we’ve been highlighting tools that typically work in the background. Agents armed with these tools will track other kinds of issues, such as build problems, outdated dependencies, fixable linting violations, and security remediations.

Extending the previous diagram, we can imagine an updated picture where agents send diagnostics and propose code actions, while still offering chat interfaces for other kinds of user input (Figure 3). If the ecosystem has felt closed, get ready for it to open up to new kinds of custom agents.

Diagram showing language service interaction on the left, with extension to Agent and Chat on the right.
Figure 3: Agent extension.

More to come

In the next few posts, we’ll take this series in a new direction and look at how agents are able to use LSPs to interact with developers in new ways. An agent that represents background tasks, such as updating dependencies, or fixing linting violations, can now start to use language services and editors as tools! We think this will be a great way for agents to start helping developers better understand the changes they’re making and open up these platforms to input from new kinds of tools.

GitHub Universe was a great opportunity to check in with developers, and we were excited to learn how many more tools developers wanted to bring to their workflows. As always, to follow along with this effort, check out the GitHub repository for this project.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

Accelerating AI Development with the Docker AI Catalog

Developers are increasingly expected to integrate AI capabilities into their applications, but they also face many challenges. Namely, the steep learning curve, coupled with an overwhelming array of tools and frameworks, makes this process tedious. Docker aims to bridge this gap with the Docker AI Catalog, a curated experience designed to simplify AI development and empower both developers and publishers.


Why Docker for AI?

Docker and container technology have been key tools for developers at the forefront of AI applications for the past few years. Now, Docker is doubling down on that effort with our AI Catalog. Developers using Docker’s suite of products are often responsible for building, deploying, and managing complex applications — and, now, they must also navigate generative AI (GenAI) technologies, such as large language models (LLMs), vector databases, and GPU support.

For developers, the AI Catalog simplifies the process of integrating AI into applications by providing trusted and ready-to-use content supported by comprehensive documentation. This approach removes the hassle of evaluating numerous tools and configurations, allowing developers to focus on building innovative AI applications.

Key benefits for development teams

The Docker AI Catalog is tailored to help users overcome common hurdles in the evolving AI application development landscape, such as:

  • Decision overload: The GenAI ecosystem is crowded with new tools and frameworks. The Docker AI Catalog simplifies the decision-making process by offering a curated list of trusted content and container images, so developers don’t have to wade through endless options.
  • Steep learning curve: With the rise of new technologies like LLMs and retrieval-augmented generation (RAG), the learning curve can be overwhelming. Docker provides an all-in-one resource to help developers quickly get up to speed.
  • Complex configurations preventing production readiness: Running AI applications often requires specialized hardware configurations, especially with GPUs. Docker’s AI stacks make this process more accessible, ensuring that developers can harness the full power of these resources without extensive setup.

The result? Shorter development cycles, improved productivity, and a more streamlined path to integrating AI into both new and existing applications.

Empowering publishers

For Docker verified publishers, the AI Catalog provides a platform to differentiate themselves in a crowded market. Independent software vendors (ISVs) and open source contributors can promote their content, gain insights into adoption, and improve visibility to a growing community of AI developers.

Key features for publishers include:

  • Increased discoverability: Publishers can highlight their AI content within a trusted ecosystem used by millions of developers worldwide.
  • Metrics and insights: Verified publishers gain valuable insights into the performance of their content, helping them optimize strategies and drive engagement.

Unified experience for AI application development

The AI Catalog is more than just a repository of AI tools. It’s a unified ecosystem designed to foster collaboration between developers and publishers, creating a path forward for more innovative approaches to building applications supported by AI capabilities. Developers get easy access to essential AI tools and content, while publishers gain the visibility and feedback they need to thrive in a competitive marketplace.

With Docker’s trusted platform, development teams can build AI applications confidently, knowing they have access to the most relevant and reliable tools available.

The road ahead: What’s next?

Docker will launch the AI Catalog in preview on November 12, 2024, alongside a joint webinar with MongoDB. This initiative will further Docker’s role as a leader in AI application development, ensuring that developers and publishers alike can take full advantage of the opportunities presented by AI tools.

Stay tuned for more updates and prepare to dive into a world of possibilities with the Docker AI Catalog. Whether you’re an AI developer seeking to streamline your workflows or a publisher looking to grow your audience, Docker has the tools and support you need to succeed.

Ready to simplify your AI development process? Explore the AI Catalog and get access to trusted content that will accelerate your development journey. Start building smarter, faster, and more efficiently.

For publishers, now is the perfect time to join the AI Catalog and gain visibility for your content. Become a trusted source in the AI development space and connect with millions of developers looking for the right tools to power their next breakthrough.

Learn more

Using AI Tools to Convert a PDF into Images

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

How do you turn a PDF into a set of images? A modern suggestion could be to get Claude or GPT to write you a CLI command for something like ImageMagick. Let’s follow that train of thought.


If we enter a prompt like “How do I turn a PDF into a bunch of images using ImageMagick,” we will indeed get something that looks like it might be a solution.

Imagemagick includes a command-line tool called convert which you can use to convert PDF files into images. Here's how you can do it:

1. Open your terminal

2. Change directory to where your PDF is located:

```
cd /path/to/your/pdf

```
3. Use the following command to convert the PDF into images (in this case JPEG):

```
convert -density 300 -quality 100 your_file.pdf output_name.jpg

``` 

The agent will then explain how convert works and finish with a suggestion to “please make sure that you have installed ImageMagick before running these commands.”

Instead of worrying about whether ImageMagick is installed, however, let’s use Docker as our tool provider and prompt it to go ahead and run the command. If you’re following along in this series, you’ll have seen that we are using Markdown files to mix together tools and prompts.  Here’s the first prompt we tried:

---
tools:
  - name: imagemagick
---
# prompt user

Use Imagemagick to convert the family.pdf file into a bunch of jpg images.

After executing this prompt, the LLM generated a tool call, which we executed in the Docker runtime, and it successfully converted family.pdf into nine .jpg files (my family.pdf file had nine pages). 

Figure 1 shows the flow from our VSCode Extension.

Animated VSCode workflow showing the process of converting PDFs to images.
Figure 1: Workflow from VSCode Extension.

We have given enough context to the LLM that it is able to plan a call to this ImageMagick binary. And, because this tool is available on Docker Hub, we don’t have to “make sure that ImageMagick is installed.” This would be the equivalent command if you were to use docker run directly:

# family.pdf must be located in your $PWD

docker run --rm -v $PWD:/project --workdir /project vonwig/imagemagick:latest convert -density 300 -quality 100 family.pdf family.jpg

The tool ecosystem

How did this work? The process relied on two things:

  • Tool distribution and discovery (pulling tools into Docker Hub for distribution to our Docker Desktop runtime).
  • Automatic generation of Agent Tool interfaces.

When we first started this project, we expected that we’d begin with a small set of tools because the interface for each tool would take time to design. We thought we were going to need to bootstrap an ecosystem of tools that had been prepared to be used in these agent workflows. 

However, we learned that we can use a much more generic approach. Most tools already come with documentation, such as command-line help, examples, and man pages. Instead of treating each tool as something special, we are using an architecture where an agent responds to failures by reading documentation and trying again (Figure 2).

Illustration of circular process showing "Run tool" leading to "Capture errors" leading to "Read docs" in a continuous loop.
Figure 2: Agent process.
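
In code, that loop might look like the sketch below. The `llm` callable is a hypothetical stand-in for whatever chat-completion interface you use; nothing here is a real API from this project:

```python
import subprocess

def run_with_doc_feedback(llm, task, max_attempts=3):
    """Run a tool, capture errors, read the docs, and try again (Figure 2)."""
    command = llm(f"Propose a shell command for: {task}")
    for _ in range(max_attempts):
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        # Capture the failure plus the tool's own documentation...
        docs = subprocess.run(f"{command.split()[0]} --help", shell=True,
                              capture_output=True, text=True)
        # ...and feed both back to the model for another attempt.
        command = llm(
            f"{command!r} failed with: {result.stderr}\n"
            f"Documentation:\n{docs.stdout or docs.stderr}\n"
            f"Propose a corrected command for: {task}"
        )
    raise RuntimeError("tool call did not converge")
```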

We see a process of experimenting with tools that is not unlike what we, as developers, do on the command line. Try a command line, read a doc, adjust the command line, and try again.

The value of this kind of looping has changed our expectations. Step one is simply pulling the tool into Docker Hub and seeing whether the agent can use it with nothing more than its out-of-the-box documentation. We are also pulling open source software (OSS)  tools directly from nixpkgs, which gives us access to tens of thousands of different tools to experiment with. 

Docker keeps our runtimes isolated from the host operating system, while the nixpkgs ecosystem and maintainers provide a rich source of OSS tools.

As expected, agents still run into packaging issues that force us to re-plan how tools are packaged. For example, the prompt we showed above might have generated the correct tool call on the first try, but the ImageMagick container failed on the first run with this terrible-looking error message:

function call failed call exited with non-zero code (1): Error: sh: 1: gs: not found  

Fortunately, feeding that error back into the LLM resulted in the suggestion that convert needs another tool, called Ghostscript, to run successfully. Our agent was not able to fix this automatically today. However, we adjusted the image build slightly, and the “latest” version of vonwig/imagemagick no longer has this issue. This is an example of something we only need to learn once.

The LLM figured out convert on its own. But its agency came from the addition of a tool.

Read the Docker Labs GenAI series to see more of what we’ve been working on.

Learn more

Using Docker AI Tools for Devs to Provide Context for Better Code Fixes

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real-time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

At Docker Labs, we’ve been exploring how LLMs can connect different parts of the developer workflow, bridging gaps between tools and processes. A key insight is that LLMs excel at fixing code issues when they have the right context. To provide this context, we’ve developed a process that maps out the codebase using linting violations and the structure of top-level code blocks. 

By combining these elements, we teach the LLM to construct a comprehensive view of the code, enabling it to fix issues more effectively. By leveraging containerization, integrating these tools becomes much simpler.


Previously, my linting process felt a bit disjointed. I’d introduce an error, run Pylint, and receive a message that was sometimes cryptic, forcing me to consult Pylint’s manual to understand the issue. When OpenAI released ChatGPT, the process improved slightly. I could run Pylint, and if I didn’t grasp an error message, I’d copy the code and the violation into GPT to get a better explanation. Sometimes, I’d ask it to fix the code and then manually paste the solution back into my editor.

However, this approach still required several manual steps: copying code, switching between applications, and integrating fixes. How might we improve this process?

Docker’s AI Tools for Devs prompt runner is an architecture that allows us to integrate tools like Pylint directly into the LLM’s workflow through containerization. By containerizing Pylint and creating prompts that the LLM can use to interact with it, we’ve developed a system where the LLM can access the necessary tools and context to help fix code issues effectively.

Understanding the cognitive architecture

For the LLM to assist effectively, it needs a structured way of accessing and processing information. In our setup, the LLM uses the Docker prompt runner to interact with containerized tools and the codebase. The project context is extracted using tools such as Pylint and Tree-sitter that run against the project. This context is then stored and managed, allowing the LLM to access it when needed.

By having access to the codebase, linting tools, and the context of previous prompts, the LLM can understand where problems are, what they are, and have the right code fragments to fix them. This setup replaces the manual process of finding issues and feeding them to the LLM with something automatic and more engaging.

Streamlining the workflow

Now, within my workflow, I can ask the assistant about code quality and violations directly. The assistant, powered by an LLM, has immediate access to a containerized Pylint tool and a database of my code through the Docker prompt runner. This integration allows the LLM to use tools to assist me directly during development, making the programming experience more efficient.

This approach helps us rethink how we interact with our tools. By enabling a conversational interface with tools that map code to issues, we’re exploring possibilities for a more intuitive development experience. Instead of manually finding problems and feeding them to an AI, we can turn the tools themselves into conversational partners that automatically detect issues, understand the context, and provide solutions.

Walking through the prompts

Our project is structured around a series of prompts that guide the LLM through the tasks it needs to perform. The prompts are stored in a Git repository, where they can be versioned, tracked, and shared, and they form the backbone of the project, allowing the LLM to interact with tools and the codebase effectively. Each prompt corresponds to a specific task in the workflow, and Docker containers ensure a consistent environment for running the tools and scripts.

Workflow steps

An immediate and existential challenge we encountered was that this class of problem has a lot of opportunities to overwhelm the context of the LLM. Want to read a source code file? It has to be small enough to read. Need to work on more than one file? Your realistic limit is three to four files at once. To solve this, we can instruct the LLM to automate its own workflow with tools, where each step runs in a Docker container.

Each step in this workflow runs in a Docker container, which ensures a consistent and isolated environment for running tools and scripts. The first four steps prepare the agent to extract the right context for fixing violations; once the agent has that context, the LLM can effectively fix the code issues in step 5.

1. Generate violations report using Pylint:

Run Pylint to produce a violation report.

2. Create a SQLite database:

Set up the database schema to store violation data and code snippets (a runnable sketch of steps 2-4 appears after this list).

3. Generate and run INSERT statements:

  • Decouple violations from the range they represent.
  • Use a script to convert every violation and range from the report into SQL insert statements.
  • Run the statements against the database to populate it with the necessary data.

4. Index code in the database:

  • Generate an abstract syntax tree (AST) of the project with Tree-sitter (Figure 1).
Screenshot of syntax tree, showing files, with detailed look at Example .py.parsed.
Figure 1: Generating an abstract syntax tree.
  • Find all second-level nodes (Figure 2). In Python’s grammar, second-level nodes are statements inside of a module.
Expanded look at Example .py.parsed with highlighted statements.
Figure 2: Extracting content for the database.
  • Index these top-level ranges into the database.
  • Populate a new table to store the source code at these top-level ranges.

5. Fix violations based on context:

Once the agent has gathered and indexed the necessary context, use prompts to instruct the LLM to query the database and fix the code issues (Figure 3).

Illustration of instructions, for example, to "fix the violation "some violation" which occurs in file.py on line 1" with information on the function it occurs in.
Figure 3: Instructions for fixing violations.

Each step from 1 to 4 builds the foundation for step 5, where the LLM, with the proper context, can effectively fix violations. The structured preparation ensures that the LLM has all the information it needs to address code issues with precision.
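
To make steps 2 through 4 concrete, here is a minimal sketch. It assumes Pylint was run with `--output-format=json` into a `report.json` file; the table and column names are our own, and the Tree-sitter calls assume the modern `tree-sitter` and `tree-sitter-python` Python bindings:

```python
import json
import sqlite3

import tree_sitter_python
from tree_sitter import Language, Parser

db = sqlite3.connect("context.db")

# Step 2: a schema for violations and for top-level source ranges.
db.executescript("""
CREATE TABLE IF NOT EXISTS violations (
    path TEXT, line INTEGER, end_line INTEGER, symbol TEXT, message TEXT);
CREATE TABLE IF NOT EXISTS ranges (
    path TEXT, start_line INTEGER, end_line INTEGER, source TEXT);
""")

# Step 3: one INSERT per violation in the Pylint JSON report.
with open("report.json") as f:
    for v in json.load(f):
        db.execute(
            "INSERT INTO violations VALUES (?, ?, ?, ?, ?)",
            (v["path"], v["line"], v.get("endLine") or v["line"],
             v["symbol"], v["message"]),
        )

# Step 4: index top-level ranges. The root node is the module, so its
# children are exactly the second-level nodes described above.
parser = Parser(Language(tree_sitter_python.language()))
path = "cloned_repo/naming_conventions/block_listed_name.py"
with open(path, "rb") as f:
    src = f.read()
for node in parser.parse(src).root_node.children:
    db.execute(
        "INSERT INTO ranges VALUES (?, ?, ?, ?)",
        (path, node.start_point[0] + 1, node.end_point[0] + 1,
         src[node.start_byte:node.end_byte].decode()),
    )
db.commit()
```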

Refining the context for LLM fixes

To understand how our system improves code fixes, let’s consider a specific violation flagged by Pylint. Say we receive a message that there’s a violation on line 60 of our code file block_listed_name.py:

{
  "type": "convention",
  "module": "block_listed_name",
  "obj": "do_front",
  "line": 60,
  "column": 4,
  "endLine": 60,
  "endColumn": 7,
  "path": "cloned_repo/naming_conventions/block_listed_name.py",
  "symbol": "disallowed-name",
  "message": "Disallowed name \"foo\"",
  "message-id": "C0104"
}

From this Pylint violation, we know that the variable foo is a disallowed name. However, if we tried to ask the LLM to fix this issue based solely on this snippet of information, the response wouldn’t be as effective. Why? The LLM lacks the surrounding context — the full picture of the function in which this violation occurs.

This is where indexing the codebase becomes essential

Because we’ve mapped out the codebase, we can now ask the LLM to query the index and retrieve the surrounding code that includes the do_front function. The LLM can even generate the SQL query for us, thanks to its knowledge of the database schema. Once we’ve retrieved the full function definition, the LLM can work with a more complete view of the problem:

def do_front(front_filename, back_filename):
    """
    Front strategy: loop over front image,
    detect blue pixels there,
    substitute in pixels from back.
    Return changed front image.
    """
    foo = SimpleImage(front_filename)
    back = SimpleImage(back_filename)
    for y in range(foo.height):
        for x in range(foo.width):
            pixel = foo.get_pixel(x, y)
            # Detect blue pixels in front and replace with back pixels
            if pixel[2] > 2 * max(pixel[0], pixel[1]):
                back_pixel = back.get_pixel(x, y)
                foo.set_pixel(x, y, back_pixel)
    return foo
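
Retrieving that definition is a single lookup against the index, and the LLM can generate the SQL itself. A hand-written equivalent, assuming the schema sketched earlier, would be:

```python
import sqlite3

db = sqlite3.connect("context.db")
row = db.execute(
    """SELECT source FROM ranges
       WHERE path = ? AND start_line <= ? AND end_line >= ?""",
    ("cloned_repo/naming_conventions/block_listed_name.py", 60, 60),
).fetchone()
print(row[0])  # the full do_front definition, ready to hand back to the LLM
```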

Now that the LLM can see the whole function, it’s able to propose a more meaningful fix. Here’s what it suggests after querying the indexed codebase and running the fix:

def do_front(front_filename, back_filename):
    """
    Front strategy: loop over front image,
    detect blue pixels there,
    substitute in pixels from back.
    Return changed front image.
    """
    front_image = SimpleImage(front_filename)
    back_image = SimpleImage(back_filename)
    for y in range(front_image.height):
        for x in range(front_image.width):
            pixel = front_image.get_pixel(x, y)
            # Detect blue pixels in front and replace with back pixels
            if pixel[2] > 2 * max(pixel[0], pixel[1]):
                back_pixel = back_image.get_pixel(x, y)
                front_image.set_pixel(x, y, back_pixel)
    return front_image

Here, the variable foo has been replaced with the more descriptive front_image, making the code more readable and understandable. The key step was providing the LLM with the correct level of detail — the top-level range — instead of just a single line or violation message. With the right context, the LLM’s ability to fix code becomes much more effective, which ultimately streamlines the development process.

Remember, all of this information is retrieved and indexed by the LLM itself through the prompts we’ve set up. Through this series of prompts, we’ve reached a point where the assistant has a comprehensive understanding of the codebase. 

At this stage, not only can I ask for a fix, but I can even ask questions like “what’s the violation at line 60 in naming_conventions/block_listed_name.py?” and the assistant responds with:

On line 60 of naming_conventions/block_listed_name.py, there's a violation: Disallowed name 'foo'. The variable name 'foo' is discouraged because it doesn't convey meaningful information about its purpose.

Although Pylint has been our focus here, this approach points to a new conversational way to interact with many tools that map code to issues. By integrating LLMs with containerized tools through architectures like the Docker prompt runner, we can enhance various aspects of the development workflow.

We’ve learned that combining tool integration, cognitive preparation of the LLM, and a seamless workflow can significantly improve the development experience, allowing an LLM to use tools to help directly while we develop.

To follow along with this effort, check out the GitHub repository for this project.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more
