Doing more with less: LLM quantization (part 2)
What if you could get similar results from your large language model (LLM) with 75% less GPU memory? In my previous article, we discussed the benefits of smaller LLMs and some of the techniques for shrinking them. In this article, we'll put this to the test by comparing the results of the smaller and larger versions of the same LLM.

As you'll recall, quantization is one of the techniques for reducing the size of an LLM. Quantization achieves this by representing the LLM's parameters (e.g. weights) in lower-precision formats: from 32-bit floating point (FP32) down to 8-bit integer (INT8) or 4-bit integer (INT4).
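To make the idea concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy. The function names and the per-tensor scheme are illustrative assumptions for this sketch, not the API of any particular quantization library:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Map FP32 weights onto the INT8 range [-127, 127] (symmetric, per-tensor)."""
    # One FP32 scale for the whole tensor; assumes the tensor is not all zeros.
    scale = np.max(np.abs(weights)) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, float(scale)

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# A toy "weight matrix": storage drops from 4 bytes to 1 byte per parameter.
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
print(f"FP32: {w.nbytes / 1e6:.1f} MB -> INT8: {q.nbytes / 1e6:.1f} MB")
print(f"max rounding error: {np.max(np.abs(w - dequantize_int8(q, scale))):.5f}")
```

The 4x drop in bytes per parameter is where the "75% less GPU memory" figure comes from; INT4 halves storage again, at the cost of coarser rounding.

The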