Normal view

There are new articles available, click to refresh the page.
Today — 19 September 2024Main stream

10 Docker Myths Debunked

19 September 2024 at 20:59

Containers might seem like a relatively recent technological breakthrough, but their origins trace back to the 1970s when Unix systems first used container-like concepts to isolate applications. Fast-forward to 2013, and Docker revolutionized this idea by introducing a portable, user-friendly container platform, sparking widespread adoption. In 2015, Docker was instrumental in creating the Open Container Initiative (OCI) to promote open standards within the container ecosystem. With the stability provided by the OCI, container technology spread throughout the tech world.

Although Docker Desktop is the leading tool for creating containerized applications, Docker remains surrounded by numerous misconceptions. In this article, we’ll debunk the top Docker myths and explain the capabilities and benefits of this transformative technology.

2400x1260 evergreen docker blog e

Myth #1: Docker is no longer open source

Docker consists of multiple components, most of which are open source. The core Docker Engine is open source and licensed under the Apache 2.0 license, so developers can continue to use and contribute to it freely. Other vital parts of the Docker ecosystem, like the Docker CLI and Docker Compose, also remain open source. This allows the community to maintain transparency, contribute improvements, and customize their container solutions.

Docker’s commitment to open source is best illustrated by the Moby Project. In 2017, Moby was spun out of the then-monolithic Docker codebase to provide a set of “building blocks” to create containerized solutions and platforms. Docker uses the Moby project for the free Docker Engine project and our commercial Docker Desktop.

Users can also find Trusted Open Source Content on Docker Hub. These Docker-Sponsored Open Source and Docker Official Images offer trusted versions of open source projects and reliable building blocks for better development.

Docker is a founder and remains a crucial contributor to the OCI, which defines container standards. This initiative ensures that Docker and other container technologies remain interoperable and maintain a commitment to open source principles.

Myth #2: Docker containers are virtual machines 

Docker containers are often mistaken for virtual machines (VMs), but the technologies operate quite differently. Unlike VMs, Docker containers don’t include an entire operating system (OS). Instead, they share the host operating system kernel, making them more lightweight and efficient. VMs require a hypervisor to create virtual hardware for the guest OS, which introduces significant overhead. Docker only packages the application and its dependencies, allowing for faster startup times and minimal performance overhead.

By utilizing the host operating system’s resources efficiently, Docker containers use fewer resources overall than VMs, which need substantial resources to run multiple operating systems concurrently. Docker’s architecture efficiently runs numerous isolated applications on a single host, optimizing infrastructure and development workflows. Understanding this distinction is crucial for maximizing Docker’s lightweight and scalable potential.

However, when running on non-Linux systems, Docker needs to emulate a Linux environment. For example, Docker Desktop uses a fully managed VM to provide a consistent experience across Windows, Mac, and Linux by running its Linux components inside this VM.

Myth #3: Docker Engine vs. Docker Desktop vs. Docker Enterprise Edition — They’re all the same

Considerable confusion surrounds the different Docker options that are available, which include:

  • Mirantis Container Runtime: Docker Enterprise Edition (Docker EE) was sold to Mirantis in 2019 and rebranded as Mirantis Container Runtime. This software, which is managed and sold by Mirantis, is designed for production container deployments and offers a lightweight alternative to existing orchestration tools.
  • Docker Engine: Docker Engine is the fully open source version built from the Moby Project, providing the Docker Engine and CLI.
  • Docker Desktop: Docker Desktop is a commercial offering sold by Docker that combines Docker Engine with additional features to enhance developer productivity. The Docker Business subscription includes advanced security and governance features for enterprises.

All of these variants are OCI-compliant, differing mainly in features and experiences. Docker Engine caters to the open source community, Docker Desktop elevates developer workflows with a comprehensive suite of tools for building and scaling applications, and Mirantis Container Runtime provides a specialized solution for enterprise production environments with advanced management and support. Understanding these distinctions is crucial for selecting the appropriate Docker variant to meet specific project requirements and organizational goals.

Myth #4: Docker is the same thing as Kubernetes

This myth arises from the fact that both Docker and Kubernetes are associated with containerized environments. Although they are both key players in the container ecosystem, they serve different roles.

Kubernetes (K8s) is an orchestration system for managing container instances at scale. This container orchestration tool automates the deployment, scaling, and operations of multiple containers across clusters of hosts. Other orchestration technologies include Nomad, serverless frameworks, Docker’s Swarm mode, and Apache Mesos. Each offers different features for managing containerized workloads.

Docker is primarily a platform for developing, shipping, and running containerized applications. It focuses on packaging applications and their dependencies in a portable container and is often used for local development where scaling is not required. Docker Desktop includes Docker Compose, which is designed to orchestrate multi-container deployments locally

In many organizations, Docker is used to develop applications, and the resulting Docker images are then deployed to Kubernetes for production. To support this workflow, Docker Desktop includes an embedded Kubernetes installation and the Compose Bridge tool for translating Compose format into Kubernetes-friendly code.

Myth #5: Docker is not secure

The belief that Docker is not secure is often a result of misunderstandings around how security is implemented within Docker. To help reduce security vulnerabilities and minimize the attack surface, Docker offers the following measures:

Opt-in security configuration 

Except for a few components, Docker operates on an opt-in basis for security. This approach removes friction for new users, but means Docker can still be configured to be more secure for enterprise considerations and for security-conscious users with sensitive data.

“Rootless” mode capabilities 

Docker Engine can run in rootless mode, where the Docker daemon runs without root permissions. This capability reduces the potential blast radius of malicious code escaping a container and gaining root permissions on the host. Docker Desktop takes security further by offering Enhanced Container Isolation (ECI), which provides advanced isolation features beyond what rootless mode can offer.

Built-in security features

Additionally, Docker security includes built-in features such as namespaces, control groups (cgroups), and seccomp profiles that provide isolation and limit the capabilities of containers.

SOC 2 Type 2 Attestation and ISO 27001 Certification

It’s important to note that, as an open source tool, Docker Engine is not in scope for SOC 2 Type 2 Attestation or ISO 27001 Certification. These certifications pertain to Docker, Inc.’s paid products, which offer additional enterprise-grade security and compliance features. These paid features, outlined in a Docker security blog post, focus on enhancing security and simplifying compliance for SOC 2, ISO 27001, FedRAMP, and other standards.  

Along with these security measures, Docker also provides best practices in the Docker documentation and training materials to help users learn how to secure their containers effectively. Recognizing and implementing these features reduces security risks and ensures that Docker can be a secure platform for containerized applications.

Myth #6: Docker is dead

This myth stems from the rapid growth and changes within the container ecosystem over the past decade. To keep pace with these changes, Docker is actively developed and is also widely adopted. In fact, the Stack Overflow community chose Docker as the most-used and most-desired developer tool in the 2024 Developer Survey for the second year in a row and recognized it as the most-admired developer tool. 

Docker Hub is one of the world’s largest repositories of container images. According to the 2024 Docker State of Application Development Report, tools like Docker Desktop, Docker Scout, Docker Build Cloud, and Docker Debug are integral to more than two-thirds of container development workflows. And, as a founding member of the OCI and steward of the Moby project, Docker continues to play a guiding role in containerization.

In the automation space, Docker is crucial for building OCI images and creating lightweight runners for build queues. With the rise of data science and AI/ML, Docker images facilitate the exchange of models, notebooks, and applications, supported by GPU workload capabilities in Docker Desktop. Additionally, Docker is widely used for quickly and cost-effectively mocking up test scenarios as an alternative to deploying actual hardware or VMs.

Myth #7: Docker is hard to learn

The belief that Docker is difficult to learn often comes from the perceived complexity of container concepts and Docker’s many features. However, Docker is a foundational technology used by more than 20 million developers worldwide, and countless resources are available to make learning Docker accessible.

Docker, Inc. is committed to the developer experience, creating intuitive and user-friendly product design for Docker Desktop and supporting products. Documentation, workshops, training, and examples are accessible through Docker Desktop, the Docker website and blog, and the Docker Navigator newsletter. Additionally, the Docker documentation site offers comprehensive guides and learning paths, and Udemy courses co-produced with Docker help new users understand containerization and Docker usage.

The thriving Docker community also contributes a wealth of content and resources, including video tutorials, how-tos, and in-person talks.

Myth #8: Docker and container technology are only for developers

The idea that Docker is only for developers is a common misconception. Docker and containers are used across various fields beyond development. Docker Desktop’s ability to run containerized workloads on Windows, macOS, or Linux requires minimal technical knowledge from users. Its integration features — synchronized host filesystems, network proxy support, air-gapped containers, and resource controls — ensure administrators can enforce governance and security.

  • Data science: Docker provides consistent environments, enabling data scientists to share models, datasets, and development setups seamlessly.
  • Healthcare: Docker deploys scalable applications for managing patient data and running simulations, such as medical imaging software across different hospital systems.
  • Education: Educators and students use Docker to create reproducible research environments, which facilitate collaboration and simplify coding project setups.

Docker’s versatility extends beyond development, providing consistent, scalable, and secure environments for various applications.

Myth #9: Docker Desktop is just a GUI

The myth that Docker Desktop is merely a graphical user interface (GUI) overlooks its extensive features designed to enhance developer experience, streamline container management, and accelerate productivity, such as:

Cross-platform support

Docker is Linux-based, but most developer workstations run Windows or macOS. Docker Desktop enables these platforms to run Docker tooling inside a fully managed VM integrated with the host system’s networking, filesystem, and resources.

Developer tools

Docker Desktop includes built-in Kubernetes, Docker Scout for supply chain management, Docker Build Cloud for faster builds, and Docker Debug for container debugging.

Security and governance

For administrators, Docker Desktop offers Registry Access Management and Image Access Management, Enhanced Container Isolation, single sign-on (SSO) for authorization, and Settings Management, making it an essential tool for enterprise deployment and management.

Myth #10: Docker containers are for microservices only

Although Docker containers are popular for microservices architectures, they can be used for any type of application. For example, monolithic applications can be containerized, allowing them and their dependencies to be isolated into a versioned image that can run across different environments. This approach enables gradual refactoring into microservices if desired.

Additionally, Docker is excellent for rapid prototyping, allowing quick deployment of minimum viable products (MVPs). Containerized prototypes are easier to manage and refactor compared to those deployed on VMs or bare metal.

Now you know

Now that you have the facts, it’s clear that adopting Docker can significantly enhance productivity, scalability, and security for a variety of use cases. Docker’s versatility, combined with extensive learning resources and robust security features, makes it an indispensable tool in modern software development and deployment. Adopting Docker and its true capabilities can significantly enhance productivity, scalability, and security for your use case.

For more detailed insights, refer to the 2024 Docker State of Application Development Report or dive into Docker Desktop now to start your Docker journey today

Learn more

Before yesterdayMain stream

Secure by Design for AI: Building Resilient Systems from the Ground Up

16 September 2024 at 21:23

As artificial intelligence (AI) has erupted, Secure by Design for AI has emerged as a critical paradigm. AI is integrating into every aspect of our lives — from healthcare and finance to developers to autonomous vehicles and smart cities — and its integration into critical infrastructure has necessitated that we move quickly to understand and combat threats. 

Necessity of Secure by Design for AI

AI’s rapid integration into critical infrastructure has accelerated the need to understand and combat potential threats. Security measures must be embedded into AI products from the beginning and evolve as the model evolves. This proactive approach ensures that AI systems are resilient against emerging threats and can adapt to new challenges as they arise. In this article, we will explore two polarizing examples — the developer industry and the healthcare industry.

Black padlock on light blue digital background

Complexities of threat modeling in AI

AI brings forth new challenges and conundrums when working on an accurate threat model. Before reaching a state in which the data has simple edit and validation checks that can be programmed systematically, AI validation checks need to learn with the system and focus on data manipulation, corruption, and extraction. 

  • Data poisoning: Data poisoning is a significant risk in AI, where the integrity of the data used by the system can be compromised. This can happen intentionally or unintentionally and can lead to severe consequences. For example, bias and discrimination in AI systems have already led to issues, such as the wrongful arrest of a man in Detroit due to a false facial recognition match. Such incidents highlight the importance of unbiased models and diverse data sets. Testing for bias and involving a diverse workforce in the development process are critical steps in mitigating these risks.

In healthcare, for example, bias may be simpler to detect. You can examine data fields based on areas such as gender, race, etc. 

In development tools, bias is less clear-cut. Bias could result from the underrepresentation of certain development languages, such as Clojure. Bias may even result from code samples based on regional differences in coding preferences and teachings. In developer tools, you likely won’t have the information available to detect this bias. IP addresses may give you information about where a person is living currently, but not about where they grew up or learned to code. Therefore, detecting bias will be more difficult. 

  • Data manipulation: Attackers can manipulate data sets with malicious intent, altering how AI systems behave. 
  • Privacy violations: Without proper data controls, personal or sensitive information could unintentionally be introduced into the system, potentially leading to privacy violations. Establishing strong data management practices to prevent such scenarios is crucial.
  • Evasion and abuse: Malicious actors may attempt to alter inputs to manipulate how an AI system responds, thereby compromising its integrity. There’s also the potential for AI systems to be abused in ways developers did not anticipate. For example, AI-driven impersonation scams have led to significant financial losses, such as the case where an employee transferred $26 million to scammers impersonating the company’s CFO.

These examples underscore the need for controls at various points in the AI data lifecycle to identify and mitigate “bad data” and ensure the security and reliability of AI systems.

Key areas for implementing Secure by Design in AI

To effectively secure AI systems, implementing controls in three major areas is essential (Figure 1):

Illustration showing flow of data from Users to Data Management to Model Tuning to Model Maintenance.
Figure 1: Key areas for implementing security controls.

1. Data management

The key to data management is to understand what data needs to be collected to train the model, to identify the sensitive data fields, and to prevent the collection of unnecessary data. Data management also involves ensuring you have the correct checks and balances to prevent the collection of unneeded data or bad data.

In healthcare, sensitive data fields are easy to identify. Doctors offices often collect national identifiers, such as drivers licenses, passports, and social security numbers. They also collect date of birth, race, and many other sensitive data fields. If the tool is aimed at helping doctors identify potential conditions faster based on symptoms, you would need anonymized data but would still need to collect certain factors such as age and race. You would not need to collect national identifiers.

In developer tools, sensitive data may not be as clearly defined. For example, an environment variable may be used to pass secrets or pass confidential information, such as the image name from the developer to the AI tool. There may be secrets in fields you would not suspect. Data management in this scenario involves blocking the collection of fields where sensitive data could exist and/or ensuring there are mechanisms to scrub sensitive data built into the tool so that data does not make it to the model. 

Data management should include the following:

  • Implementing checks for unexpected data: In healthcare, this process may involve “allow” lists for certain data fields to prevent collecting irrelevant or harmful information. In developer tools, it’s about ensuring the model isn’t trained on malicious code, such as unsanitized inputs that could introduce vulnerabilities.
  • Evaluating the legitimacy of users and their activities: In healthcare tools, this step could mean verifying that users are licensed professionals, while in developer tools, it might involve detecting and mitigating the impact of bot accounts or spam users.
  • Continuous data auditing: This process ensures that unexpected data is not collected and that the data checks are updated as needed. 

2. Alerting and monitoring 

With AI, alerting and monitoring is imperative to ensuring the health of the data model. Controls must be both adaptive and configurable to detect anomalous and malicious activities. As AI systems grow and adapt, so too must the controls. Establish thresholds for data, automate adjustments where possible, and conduct manual reviews where necessary.

In a healthcare AI tool, you might set a threshold before new data is surfaced to ensure its accuracy. For example, if patients begin reporting a new symptom that is believed to be associated with diabetes, you may not report this to doctors until it is reported by a certain percentage (15%) of total patients. 

In a developer tool, this might involve determining when new code should be incorporated into the model as a prompt for other users. The model would need to be able to log and analyze user queries and feedback, track unhandled or poorly handled requests, and detect new patterns in usage. Data should be analyzed for high frequencies of unhandled prompts, and alerts should be generated to ensure that additional data sets are reviewed and added to the model.

3. Model tuning and maintenance

Producers of AI tools should regularly review and adjust AI models to ensure they remain secure. This includes monitoring for unexpected data, adjusting algorithms as needed, and ensuring that sensitive data is scrubbed or redacted appropriately.

For healthcare, model tuning may be more intensive. Results may be compared to published medical studies to ensure that patient conditions are in line with other baselines established across the world. Audits should also be conducted to ensure that doctors with reported malpractice claims or doctors whose medical license has been revoked are scrubbed from the system to ensure that potentially compromised data sets are not influencing the model. 

In a developer tool, model tuning will look very different. You may look at hyperparameter optimization using techniques such as grid search, random search, and Bayesian search. You may study subsets of data; for example, you may perform regular reviews of the most recent data looking for new programming languages, frameworks, or coding practices. 

Model tuning and maintenance should include the following:

  • Perform data audits to ensure data integrity and that unnecessary data is not inadvertently being collected. 
  • Review whether “allow” lists and “deny” lists need to be updated.
  • Regularly audit and monitor alerts for algorithms to determine if adjustments need to be made; consider the population of your user base and how the model is being trained when adjusting these parameters.
  • Ensure you have the controls in place to isolate data sets for removal if a source has become compromised; consider unique identifiers that allow you to identify a source without providing unnecessary sensitive data.
  • Regularly back up data models so you can return to a previous version without heavy loss of data if the source becomes compromised.

AI security begins with design

Security must be a foundational aspect of AI development, not an afterthought. By identifying data fields upfront, conducting thorough AI threat modeling, implementing robust data management controls, and continuously tuning and maintaining models, organizations can build AI systems that are secure by design. 

This approach protects against potential threats and ensures that AI systems remain reliable, trustworthy, and compliant with regulatory requirements as they evolve alongside their user base.

Learn more

❌
❌