For SpaceTech, managing buildings and communities at scale means leveraging innovative solutions in public safety, automation, and sustainability. As a subsidiary of China Vanke Co., Ltd., overseeing more than 8 million residential units and 2,000 commercial buildings, SpaceTech integrates Edge AI to boost safety, sustainability, and operational efficiency while enhancing the resident experience.
AI-driven management and energy optimization for smarter cities
SpaceTech is strengthening its commitment to sustainable, innovative building management by integrating smart technologies across its billion-square-meter portfolio. With a blend of property management expertise, service quality, and advanced tech solutions, it has become China’s leading space management provider.
SpaceTech needed a high-performance edge AI server to handle the increasing data traffic from smart devices, prioritizing performance while being constrained by power limits. A key challenge is waste management, where AI categorizes waste and triggers service tickets when bins are full.
Additionally, managing legacy building systems, such as lighting, air conditioning, and access control, presents a challenge because these systems often function in isolation. This lack of interoperability further fragments the ecosystem, as devices from different manufacturers often cannot communicate with each other.
Moving from x86 edge servers to Arm-based servers
To improve performance, energy efficiency, and sustainability, SpaceTech partnered with Ampere Computing to explore a transition from x86 edge servers to Ampere Altra family Arm-based servers. Switching to Ampere could deliver significant advantages over x86, including 2.5x better performance per rack, 2.8x lower power consumption, and a 3x smaller footprint. This transition aligns with the broader industry trend of moving x86 code to Arm processors for high-performance, power-efficient computing in cloud-native and open-source workloads like PyTorch, MongoDB, and Nginx.
A proof of concept compared the performance of Alibaba’s Qwen-VL Vision Language Model (VLM) on Ampere’s Arm-based servers to x86 servers, revealing significant energy savings without sacrificing AI capabilities. Ampere also provides dedicated AI libraries, including two optimized quantization methods to improve performance by 1.5–2x while maintaining model size and perplexity. For example, Q4_K_4 quantization increases prompt processing speed by 1.6x compared to Q4_K_M with the Qwen2-7b-instruct model. Since the code ran smoothly on Arm with minimal adjustments, ecosystem compatibility was not an issue.
The two most critical components are video decoding, often the primary data source, and AI inference, which generates actionable results. With 1080p video streams running at 25 frames per second and multiple inference models operating in parallel, the Arm-based Ampere solution was 2.6 times faster, far outperforming the x86-based system.
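To put the video workload in perspective, the following minimal sketch works out the per-frame time budget and the raw pixel rate for 1080p streams at 25 frames per second. It assumes that "1080p" means 1920x1080 pixels, and the stream count is purely hypothetical, since the number of streams handled per server is not stated here.

```python
# Back-of-the-envelope numbers for a 1080p, 25 fps video workload.
# Assumptions: "1080p" means 1920x1080 pixels; the stream count is hypothetical.

def frame_budget_ms(fps: float) -> float:
    """Time available to decode and analyse one frame of one stream, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: int, streams: int = 1) -> int:
    """Raw pixels that must be decoded per second across all streams."""
    return width * height * fps * streams


if __name__ == "__main__":
    print(frame_budget_ms(25))                     # 40.0 ms per frame
    print(pixels_per_second(1920, 1080, 25))       # 51840000 pixels/s for one stream
    print(pixels_per_second(1920, 1080, 25, 16))   # 829440000 for 16 hypothetical streams
```

Every decoded frame also has to pass through each inference model running in parallel, so the 40 ms budget is shared between decoding and inference.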
The prospect of using Arm-based sensors and edge devices would allow SpaceTech to automate building operations, improving sustainability and enhancing the resident experience. For example, motion sensors detect room occupancy, allowing lights and HVAC systems to turn off when not in use.
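The occupancy rule described above can be sketched as a simple timeout: if no motion has been seen for a while, the room is switched off. This is an illustrative sketch only. The class name, the 10-minute timeout, and the single on/off state are assumptions, not SpaceTech's actual control logic.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical occupancy-based controller: names and the timeout are assumptions.

@dataclass
class RoomController:
    idle_timeout_s: float = 600.0          # switch off after 10 minutes without motion
    last_motion_s: Optional[float] = None  # timestamp of the most recent motion event
    powered: bool = False                  # whether lights/HVAC are currently on

    def on_motion(self, now_s: float) -> None:
        """A motion sensor fired: remember when, and make sure the room is on."""
        self.last_motion_s = now_s
        self.powered = True

    def tick(self, now_s: float) -> bool:
        """Periodic check: power down once the room has been idle long enough."""
        if self.powered and self.last_motion_s is not None:
            if now_s - self.last_motion_s >= self.idle_timeout_s:
                self.powered = False
        return self.powered
```

A caller would invoke `on_motion` from the sensor callback and `tick` on a timer; a learned occupancy model could then adjust `idle_timeout_s` per room instead of using a fixed value.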
AI further refines this process by learning occupancy patterns, preemptively adjusting systems to ensure spaces are comfortable when needed. Surveillance systems powered by edge AI monitor elevators to prevent unauthorized scooter use and littering in public areas, automatically generating service requests when issues arise. Integrating facial recognition with delivery services like Meituan streamlines building access from cloud and internet services all the way to the edge and community access endpoint, giving delivery personnel seamless entry to gated communities while ensuring security.
Arm-based servers to power SpaceTech’s smart city infrastructure
Ampere’s Arm-based servers are poised to play a crucial role in SpaceTech’s smart city infrastructure. These servers enhance energy efficiency, reduce the need for cooling, and offer a sustainable alternative to traditional x86 systems.
Cameras, sensors, and other devices across SpaceTech’s network could soon rely on Arm processors to efficiently handle endpoint AI workloads. SpaceTech has also streamlined various building systems by running multiple software applications in virtual containers on an edge server cluster, which it plans to transition to Arm. This approach integrates previously siloed systems, such as lighting, HVAC, air quality, and access control, into a unified data lake, enabling seamless operation and optimization.
Through these initiatives, SpaceTech is transforming how they manage urban spaces. In addition to reducing energy consumption, they’re enhancing sustainability across their portfolio of residential and commercial buildings.
Powering smarter urban management with an Edge Cloud Native platform
To help manage 1 billion square meters of urban space with smart technology, SpaceTech developed an “Edge Cloud Native” platform where every device—be it a camera, sensor, or gateway—functions like a micro-cloud. This infrastructure enables real-time AI decision-making and efficient building management, reducing operational costs, power demands, and environmental impact.
As SpaceTech continues working with Ampere Computing, it aims to replace legacy x86 infrastructure with Arm-powered edge servers. This transition will further improve compute efficiency, reduce noise and cooling needs in edge rooms, and scale AI capabilities without requiring GPU-heavy hardware.
The future of SpaceTech’s urban management with Arm
With China’s extensive developer ecosystem trained on Arm architecture, SpaceTech will be well-positioned to extend its smart city solutions, setting new standards for intelligent urban management across China, and potentially beyond. By adopting Arm technology and Ampere Arm-based edge servers, SpaceTech is revolutionizing urban management by seamlessly integrating AI, sustainability, and efficiency into its operations.
This partnership addresses the complexities of large-scale property management while setting a new standard for connected smart cities and offering a model for other developers to harness the potential of intelligent buildings with the help of Edge AI.
Explore how Edge AI is transforming IoT, enhancing smart cities, industries, and retail while unlocking new possibilities as deployments scale.
Check out more Ampere solutions helping to revolutionize AI inferencing workloads.
At Arm, we recognize the importance of the latest research findings from academia and how they can help to shape future technology roadmaps. As a global semiconductor ambassador, we play a key role in academic engagements and refining the commercial relevance of their respective research.
Our approach is multifaceted, involving broad engagements with entire departments and focused collaborations with individual researchers. This ensures that we are not only advancing the field of computing but also fostering the talent that will lead the industry in the future.
Pioneering research and strategic investments in academia
A prime example of our broad engagement is our long-standing relationship with the University of Cambridge’s Department of Computer Science and Technology. We’ve announced critical investment into the Department’s new CASCADE (Computer Architecture and Semiconductor Design) Centre. To realize the potential of AI through next-gen processor designs, this initiative will fund 15 PhD students over the next five years who will undertake groundbreaking work in intent-based programming.
Meanwhile, our work with the Morello program continues to push the boundaries of secure computing. This is a research initiative aimed at creating a more secure hardware architecture for future Arm devices. The program is based on the CHERI (Capability Hardware Enhanced RISC Instructions) model, which has been developed in collaboration with the University of Cambridge since 2015. By implementing CHERI architectural extensions, Morello aims to mitigate memory safety vulnerabilities and enhance the overall security of devices.
In the United States, our membership in the SRC JUMP2.0 program, a public-private partnership alongside DARPA (Defense Advanced Research Projects Agency) and other noted semiconductor companies, enables us to support pathfinding research across new and emerging technical challenges. One notable investment is the PRISM Centre (Processing with Intelligent Storage and Memory), which is led by the University of California San Diego, where we are deeply engaged in advancing the computing field.
Fuelling innovation through strategic PhD investments
Arm’s broad academic engagements are complemented by specific investments in emerging research areas, where the commercial impact is still being defined. PhD studentships are ideal for these exploratory studies, providing the necessary timeframe to progress ideas and early-stage concepts toward potential commercial viability. Examples include:
A PhD studentship at the University of Utah exploring security and verification topics.
Shaping the future through research and technology
In areas where challenges are just being identified, Arm convenes workshops with academic thought leaders to scope future use cases and the fundamental experimental work needed to advance the field. Moreover, our white papers on Ambient Intelligence and the Metaverse are helping the academic community develop future research programs, acting as a springboard for further innovation.
Given our position in the ecosystem, we are often invited to provide thought leadership at academic conferences. Highlights from this year include:
A keynote by Rob Dimond, a System Architect and Fellow at Arm, at the DATE conference in Valencia, a major event for our industry and academia.
Reinforcing academic engagements by investing in future talent
Investing in PhDs is not just about research; it’s about nurturing the future talent pipeline for our industry. We also engage with governments and funding agencies to ensure that university research funding is targeted appropriately.
Expanding global collaborations to drive technological marvels
Arm’s commitment to academic engagements spans the globe, reflecting our dedication to fostering innovation worldwide. In Asia for instance, we have initiated collaborations with leading institutions to explore new frontiers in semiconductor technology. Our partnership with the National University of Singapore focuses on developing power-efficient computing solutions, which are crucial for the next generation of mobile and IoT devices.
In Europe, beyond our engagements in the U.K. and Spain, we are also working with the Technical University of Munich on advanced research in quantum computing. This collaboration aims to address some of the most challenging problems in computing today, paving the way for breakthroughs that could revolutionize the industry.
Bridging academics and the industry for a brighter future
We are committed to driving innovation and supporting the next generation of technology leaders. Our investments in academic engagements not only advance the field of semiconductor technology but also ensure that we remain at the forefront of technological progress.
As we continue to nurture upcoming talent, support groundbreaking research, and foster global collaborations, we are shaping the future of computing.
For more information
For more details about Arm’s academic engagements and partnerships, contact Andrea Kells, Arm’s Research Ecosystem Director, at Andrea.Kells@arm.com.
Once again, CES 2025 in Las Vegas proved to be a premier event for showcasing the latest and greatest in technology innovation. This year’s event featured a host of groundbreaking products and announcements that will help to transform a broad range of industries.
Whether it was the significant advancements in autonomous driving, leading-edge technologies for consumer tech markets, or new partnerships, Arm’s presence at CES 2025 highlighted our dedication to driving technology innovation in the age of AI.
Arm’s landmark partnership with the Aston Martin Aramco Formula One Team
The big Arm announcement at CES 2025 was the new landmark multiyear partnership with the Aston Martin Aramco Formula One® Team, with Arm named as the team’s ‘Official AI Compute Platform Partner.’ In Formula One, Arm’s compute platform will drive advancements in AI and computing, helping Aston Martin Aramco push the boundaries of performance on and off the track.
Moreover, through this unique partnership, Arm and Aston Martin Aramco aim to:
Accelerate equity and inclusivity for the future of STEM and motorsport as described in this blog;
Encourage opportunities for women in STEM and motorsport, with Jessica Hawkins, Head of F1 Academy at Aston Martin Aramco, representing Arm as an Official Ambassador; and
Empower the next generation of engineers, racers, and innovators.
As part of a CES 2025 panel live stream, Ami Badani, Arm’s Chief Marketing Officer, and Dipti Vachani, SVP and GM of Arm’s Automotive Line of Business, sat down with Jessica and Charlie Blackwall, Head of Electronics at Aston Martin Aramco Formula One Team. They spoke about the new partnership and its commitment to driving technology innovation alongside greater equity and inclusivity in tech and motorsport.
Pioneering innovations with NVIDIA
On the Monday before the start of CES 2025, NVIDIA showcased its latest AI-based technology innovations in Jensen Huang’s keynote. Arm’s technology is playing a pivotal role in NVIDIA’s solutions for the next generation of consumer and commercial vehicles. Arm CPU cores are also central to NVIDIA’s new personal AI supercomputer that will deliver accessible high-performance AI compute to developers.
New AI capabilities for next-generation vehicles
During the NVIDIA keynote, Jensen Huang announced that the NVIDIA DRIVE AGX Thor, a centralized compute system that delivers advanced AI capabilities for a range of automotive applications, will be available for production vehicles later this year. This is the first solution to use Arm Neoverse V3AE, our first ever Neoverse CPU enhanced for automotive applications, with many leading automakers already making plans to adopt NVIDIA DRIVE AGX Thor for the next generation of software-defined vehicles (SDVs). These include Jaguar Land Rover, Mercedes-Benz, and Volvo Cars.
For more details on this collaboration, you can read Dipti Vachani’s news blog here.
High-performance AI at developers’ desks
NVIDIA also introduced Project DIGITS, a new personal AI supercomputer that makes it possible for every AI developer to have a high-performance AI system on their desk. This will help to democratize access to high-performance AI computing, enabling a new wave of innovation and research.
Project DIGITS is powered by the NVIDIA GB10 Grace Blackwell Superchip, which features 20 of Arm’s leading-edge CPU cores. Working with NVIDIA and our leading software ecosystem, we cannot wait to see how this new device brings the next generation of highly innovative AI applications to market.
For more insights into Project DIGITS and the GB10 features, you can read a blog from Parag Beeraka, Senior Director, Consumer Computing in the Client Line of Business, here.
The future of automotive is built on Arm
As the big screen at the entrance to the West Hall in the Las Vegas Convention Center said: “the future of automotive is built on Arm”, with 94 percent of global automakers using Arm-based technology in their newest vehicle models.
On the CES 2025 showfloor, this was apparent with a range of new vehicles featuring Arm-based technology, including Honda’s new family concept car called the SUV Zero. This caught the attention of Ami Badani and Will Abbey, Arm’s Chief Commercial Officer, during their CES 2025 show walkthrough, as shown in the video below.
Honda was represented in the Arm-sponsored session “Revolutionizing the Future of Driving – Unleashing the Power of AI”, which also featured representatives from the BMW Group, Nuro and Rivian. During the session, Dipti Vachani outlined how AI is helping to revolutionize the automotive industry across three key trends:
Electrification;
Autonomy; and
The driver experience.
Hearing from the leading automotive companies represented on the panel, it was clear that scalable, consistent, power-efficient compute platforms are needed to deliver the next generation of AI-enabled SDVs.
Dipti Vachani also participated in a “Six Five On The Road at CES 2025” discussion about how Arm aims to shape innovation in the automotive industry. This covered a range of topics, from the biggest technology trends for 2025 to Arm’s role across the automotive ecosystem.
Meanwhile, Arm technology was shown to be accelerating software across leading automotive applications throughout CES 2025. Mapbox, a leading platform for powering location experiences, demoed its new virtual platform, the Virtual Head Unit (VHU), which it developed in partnership with Arm and Corellium.
This creates virtual prototypes of the Arm-based in-vehicle hardware before seamlessly integrating these with Mapbox’s navigation stack. Automotive OEMs can use the new VHU to build maps how they want, and then test and render it their way at a quicker rate before deployment.
Elsewhere, AWS Automotive showed how Arm optimizations supported the development of its prototype chatbot-based application for the next generation of SDVs.
New, innovative consumer tech solutions
As with every CES, the event in Las Vegas highlighted a broad range of the latest consumer technology innovations. On the showfloor, it was difficult to escape the broad range of new TV products, including the latest AI TVs – many of which would be powered by Arm technology. This also included brand-new smart displays that provide a range of information for the smart home or even images and video, like the fireplace in the video below.
However, one notable highlight for the TV market away from the showfloor was Eclipsa Audio. Developed by Google, Samsung, Arm, and the Alliance for Open Media, this new open-source technology delivers a three-dimensional (3D) audio experience, revolutionizing the way people experience sound. Through leveraging the Immersive Audio Model and Formats (IAMF), Eclipsa Audio produces immersive soundscapes, spreading audio vertically and horizontally to closely mimic natural settings.
Arm played a crucial role in the technology through optimizing the Opus codec and IAMF library to enable better performance on Arm CPUs. These enhancements ensure that Eclipsa Audio can deliver unparalleled performance across a variety of consumer devices, from high-end cinema systems to entry-level mobile devices and TVs.
For more information on Eclipsa Audio, you can read the blog here.
Two new Arm-based XR products garnered significant attention at CES 2025. ThinkAR showcased its AiLens product series, which are lightweight AR smart glasses that offer intuitive experiences enhanced by powerful edge AI capabilities. The devices are powered by Ambiq’s ultra-efficient Apollo4 SoC, which features Arm technology.
Working in conjunction with SoftBank, the AiLens will cover a variety of applications and use cases, including healthcare, workplace productivity and training, retail, navigation and travel, education and skill development, and entertainment.
Moreover, XREAL displayed its new XREAL One Series AR smart glasses. Built on Arm Cortex-A CPU technology, these AR wearables offer impressive display capabilities on a very lightweight form factor, with users able to generate 3D objects through speaking to the devices.
Elsewhere at CES 2025, MediaTek highlighted the capabilities of its Arm-based Kompanio 838 for gaming and education on Chromebook devices. For gaming, all Android games can be played on Chromebook devices built on the Kompanio 838 processor, which provides a smooth and responsive experience for players. Meanwhile, its AI capabilities enhance the camera for high-quality image capture and “text-to-image” translation, supporting education use cases for students.
Leading OEM ASUS also demonstrated its Chromebook CZ12, which is designed as a “rugged, student-centric study mate.” The device, which is powered by the Arm-based MediaTek 520 processor, aims to provide enriched educational experiences through a robust design that is easy for students to use.
Bringing advanced AI capabilities to edge and endpoint devices
Alif Semiconductor made waves at CES 2025 by announcing the integration of Arm’s Ethos-U85 NPU into its second generation of Ensemble microcontrollers (MCUs). These new Ensemble MCUs and fusion processors are designed to support generative AI workloads, enabling advanced AI capabilities at the edge and endpoint devices. This is particularly valuable for edge AI applications focused on vision, voice, text, and sensor fusion, providing instant, accurate, and satisfying user experiences without relying on the cloud.
Arm’s standardized Ethos NPU IP was chosen for its superior performance and efficiency, as well as its broad ecosystem support.
Sustainable AI for the future
On the last day of CES 2025, Ami Badani hosted a fascinating panel discussion with representatives from Meta and NVIDIA on “the key to powering a sustainable AI revolution.” All agreed that the next frontier of AI compute will require unprecedented compute power, with Arm, Meta and NVIDIA committed to power-efficient AI technologies and software from cloud to edge. The panel also discussed how the future of AI will see different models for different levels of performance and use cases, with AI resources being delivered more efficiently as part of this sustainable AI future.
Arm technology across every corner of CES 2025
With Arm technology touching 100 percent of the connected global population, AI innovations from our global partner ecosystem were across every corner of CES 2025. Alongside some incredibly exciting announcements, Arm’s presence at CES 2025 is setting the scene for the year ahead, with the Arm compute platform at the heart of all AI experiences.
Imagine sitting in your living room and watching a movie. As the helicopter flies overhead, you can hear it moving seamlessly above you, and then from one side of the room to the other, creating a truly immersive experience. This is now possible because of Eclipsa Audio, based on Immersive Audio Model and Formats (IAMF), a new open-source audio technology that uses fast and efficient processing for a variety of common consumer products – from high-end cinema systems featuring premium TVs to entry-level mobile devices and TVs.
The technology, which was launched at the Consumer Electronics Show (CES) 2025, was developed by Google, Samsung, Arm, and the Alliance for Open Media (the organization behind the popular AV1 video format).
What is Eclipsa Audio?
Eclipsa Audio is a multi-channel audio surround format that leverages IAMF to produce an immersive listening experience. It will revolutionize the way we experience sound by spreading audio vertically as well as horizontally. This creates a three-dimensional soundscape that closely mimics natural settings, bringing movies, TV shows, and music to life.
Eclipsa Audio dynamically adjusts audio levels for different scenes, ensuring optimal sound quality. Additionally, it offers customization features that allow listeners to tweak the sound to their preferences, helping to ensure that every listening experience is personalized and unique.
An Eclipsa Audio bitstream can contain up to 28 input channels, which are rendered to a set of output speakers or headphones. These input channels can be fixed, like a microphone in an orchestra, or dynamic, like a helicopter moving through a sound field in an action movie.
Eclipsa Audio also features binaural rendering, which is essential for mobile applications when delivering immersive audio through headphones. Finally, the new audio technology supports content creation across consumer devices, enabling users to create their own immersive audio experiences.
How Arm worked with Google during IAMF development
Arm has been a strategic partner throughout the development of IAMF, working closely with Google’s team to optimize the technology’s performance. Our contributions focused on enhancing the efficiency of the Opus codec and the IAMF library (libiamf), ensuring that it delivers the best possible performance on Arm CPUs that are pervasive across today’s mobile devices and TVs.
Arm CPUs have included the NEON SIMD extension since 2005 and have evolved significantly since then, providing remarkable performance boosts for DSP tasks like audio and video processing. For IAMF specifically, Arm’s engineers have focused on optimizations that allow real-time decoding of complex bitstreams with minimal CPU usage, ensuring reliable performance even when CPUs are busy processing other elements of the experience. This is particularly important for mobile applications where power efficiency is crucial.
Performance Enhancements
Arm has been upstreaming patches to the opus-codec and libiamf, focusing on floating point implementations for optimal performance. These enhancements include:
NEON Intrinsic Optimizations: Supporting various Arm architectures (armv7+neon, armv8-a32, armv8-a64, and armv9), these optimizations speed up float to int16 conversion and soft clipping, and provide CPU-specific optimizations for matrix multiplication and channel unzipping in multi-channel speaker output.
Performance Improvements: Significant performance uplifts were observed across different speaker configurations (Stereo, 5.1, 9.1.6) on devices like Quartz64 and Google Pixel 7. For instance, 9.1.6 output showed over 160% improvement on the Arm Cortex-A55 CPU cores in the Pixel 7.
Decoding Efficiency: After optimizations, all test files decode in less than 16% of real time on AArch64 and less than 23% on Arm32, making decoding highly efficient on the Cortex-A55.
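To make two of the items above concrete, here is a scalar reference sketch of what a float-to-int16 conversion with soft clipping computes, along with the arithmetic behind "percent of real time". It is not the Opus or libiamf code. The tanh-based soft clip is a stand-in chosen for simplicity, the function names are hypothetical, and a NEON version would process several samples per instruction rather than one at a time.

```python
import math

# Scalar reference sketch (assumption: NOT the Opus/libiamf implementation).

def soft_clip(sample: float) -> float:
    """Squash a sample smoothly into (-1.0, 1.0), using tanh as a stand-in soft clipper."""
    return math.tanh(sample)


def float_to_int16(sample: float) -> int:
    """Scale a float sample to the int16 range, saturating out-of-range values."""
    scaled = round(sample * 32768.0)
    return max(-32768, min(32767, scaled))


def realtime_fraction(decode_seconds: float, audio_seconds: float) -> float:
    """Decode time as a fraction of playback time; 0.16 means 16% of real time."""
    return decode_seconds / audio_seconds
```

For example, decoding a 10-second clip in 1.6 seconds corresponds to 16% of real time, which leaves most of the CPU free for video and the rest of the application.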
Core Technologies
IAMF supports several codecs, including LPCM, AAC, FLAC, and Opus. Opus, being the most modern codec, is likely to be the preferred choice. We have further optimized Opus for the Arm architecture, ensuring it performs efficiently within the IAMF framework.
The IAMF library (libiamf) decodes IAMF bitstreams and produces speaker output for various sound systems. Arm introduced a framework for CPU specializations, using compile-time feature detection to ensure the library is optimized for the platform it runs on.
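The idea of selecting a CPU-specific routine can be sketched as follows. This is an analogy rather than libiamf's mechanism: Python has no compile step, so the choice is made once at import time from the reported machine type, and every function name here is hypothetical. In C, the same selection would typically be made by the preprocessor, using predefined macros such as `__ARM_NEON`.

```python
import platform

# Hypothetical dispatch sketch: all function names are illustrative assumptions.

def mix_generic(samples, gain):
    """Portable fallback that works on any CPU."""
    return [s * gain for s in samples]


def mix_arm(samples, gain):
    """Stand-in for an Arm-specialised routine; here it computes the same result."""
    return [s * gain for s in samples]


def select_mix(machine: str):
    """Pick an implementation once, based on the machine string."""
    name = machine.lower()
    if name in ("aarch64", "arm64") or name.startswith("armv"):
        return mix_arm
    return mix_generic


# Chosen once at import time; callers just use `mix`.
mix = select_mix(platform.machine())
```

The design point is that callers never branch on the CPU type themselves: they call one name, and the specialised or generic routine behind it is fixed before any audio is processed.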
By optimizing key components like the Opus codec and libiamf library, our engineers ensured that IAMF delivers unparalleled performance on Arm CPUs. This not only enhances the user experience but also demonstrates the value of our technology in cutting-edge applications.
IAMF’s open standard approach, supported by the Alliance for Open Media, aligns with Arm’s vision of broad accessibility and innovation. This partnership highlights our role in driving the future of immersive audio, making high-quality sound experiences available across a wide range of existing and future devices, from high-end home cinema systems to entry-level mobile devices and TVs.
The future of immersive audio is here
IAMF represents a significant leap forward in immersive audio technology, offering a versatile and high-quality audio experience using AI and deep-learning techniques coupled with robust performance optimizations that are supported by Arm.
The future of immersive audio is here and is now more accessible than ever. Whether you’re a casual listener or an audiophile, IAMF promises to transform audio experiences, bringing you closer to the action than ever before.
Equity and inclusivity are core pillars in the Arm DEI (Diversity, Equity, and Inclusion) strategy that helps to drive business and cultural impact within the company and beyond. That overarching commitment is also a vital part of the landmark multiyear partnership between Aston Martin Aramco Formula One® Team and Arm, announced at the Consumer Electronics Show (CES).
The Arm and Aston Martin Aramco commitment to equity and inclusivity
Both tech and motorsport have made substantial strides towards attracting more diverse talent and advancing equity and inclusivity. The partnership aims to accelerate this even further across STEM (Science, Technology, Engineering and Mathematics) and motorsport, and the synergy between these two industries provides fantastic opportunities for Arm and Aston Martin Aramco to work together to nurture diverse talent and leadership. The dual commitment of both organizations to driving equity and inclusivity while collaborating to deliver leading-edge technology platforms makes the partnership truly unique.
Ami Badani, Arm’s Chief Marketing Officer, is excited about the partnership and the benefits it will bring to technology and motorsport. She says: “We believe that if we’re going to change industries like technology and sport for good, then equity and inclusivity needs to be at the heart of everything we do. It was clear from the beginning of the partnership that both Arm and Aston Martin Aramco have shared values around equity and inclusivity, and a similar passion for technology innovation. Through working together in this new partnership, we will be making a meaningful impact in both of these areas.”
Inspiring and empowering future talent
One of the core aims of the partnership is educating and empowering the next generation of engineers, racers and innovators by equipping them with the engineering and technical skills and knowledge needed for a future career in either technology or motorsport. As part of the partnership, Jessica Hawkins, Head of the F1 Academy at Aston Martin Aramco, will represent Arm as an Official Ambassador. In addition, Arm will be actively investing in Jessica’s own career, supporting her growth as a racer and female leader in motorsport, including her ambitions to race at the 24 Hours of Le Mans in an Aston Martin car and to be the first female Formula One driver.
Arm’s support as part of the new partnership also extends to the Aston Martin Aramco F1 Academy program led by Jessica, which was created to cultivate women in motorsport. It aims to change the dynamics of an industry where the vast majority of racers are men, yet women are the fastest growing fan base.
A “game-changing” new partnership
Dipti Vachani, SVP and GM of Arm’s Automotive Line of Business, is passionate about the new partnership and what it will bring to representation in STEM and motorsport, particularly Jessica’s own career.
She says: “I realized this new partnership would be truly game-changing when I had a conversation with Jessica for the first time. I felt inspired by her story and achievements and believe that with the right support, Jessica can go as far as she can in her career. This partnership is all about showing women that they deserve a seat at the table in tech and sport, with Jessica doing just this for motorsport.”
Ami and Dipti both featured alongside Jessica and Charlie Blackwall, Head of Electronics at Aston Martin Aramco Formula One Team, at CES 2025 to speak about the new partnership, including equity and inclusivity across both industries. They highlighted their own experiences navigating the technology industry and passions to drive greater inclusion into the future of technology and motorsport.
The start of a unique new partnership
The Arm and Aston Martin Aramco partnership represents a truly exciting moment across technology and motorsport. Both Arm and Aston Martin Aramco will be working hard to accelerate equity and inclusivity in STEM and motorsport through driving greater opportunities for the next generation of diverse talent, all while building transformative leading-edge technologies. This is just the start of a truly unique partnership, so watch this space for future developments!
One of the most exciting trends we see today is the rapid expansion and availability of AI-based applications and features across a variety of edge devices. As AI continues to grow and advance, it is crucial that AI researchers, data scientists, developers and students have access to high performance compute that can be used to develop or run the latest models, whether they are language, vision or multi-modal. With the pace of AI innovation moving faster than ever, we need to enable access to this performance beyond the cloud at the edge, bringing new capabilities directly to developers.
Putting game-changing AI performance at every developer’s fingertips
A big step towards the vision of developing and deploying AI everywhere is NVIDIA Project DIGITS, a personal AI supercomputer announced during NVIDIA founder and CEO Jensen Huang’s keynote at the Consumer Electronics Show (CES) 2025 today (Monday 6th January 2025). The Project DIGITS Linux-based system featuring Arm-powered CPU cores makes it possible for every AI developer to have a high performance AI system on their desk.
NVIDIA Project DIGITS is powered by the NVIDIA GB10 Grace Blackwell Superchip, bringing together the NVIDIA Grace CPU and NVIDIA Blackwell GPU with the latest-generation CUDA cores and fifth-generation Tensor Cores connected via NVLink®-C2C chip-to-chip interconnect and 128GB of unified memory. The NVIDIA Grace CPU features our leading-edge, highest performance Arm Cortex-X and Cortex-A technology, with 10 Arm Cortex-X925 and 10 Cortex-A725 CPU cores. The NVIDIA GB10 delivers up to one petaflop¹ (1000 TFLOPs) of AI computing performance at FP4 precision, enabling developers to prototype, fine-tune and run inferencing with large AI models and work in conjunction with the cloud or data center.
The value of the Arm compute platform
Leveraging the ubiquitous Arm compute platform allows new AI models and applications to run more efficiently and faster at the edge. In consumer technology markets, our CPU technologies are found at the heart of today’s edge devices and designed to target the most performant devices entering the market, whether that’s the latest Arm CPUs such as those used in Project DIGITS, or as part of Arm Compute Subsystems (CSS) for Client. All of these technologies are optimized for maximum performance throughout and peak efficiency across real-world applications and workloads.
The Arm compute platform also offers the flexibility to use different computational engines for different AI use cases. In NVIDIA Project DIGITS, the Arm-based NVIDIA Grace CPU and NVIDIA Blackwell GPU serve complementary roles, enabling developers to use these components for a variety of workloads. This heterogeneous computing approach is essential to achieving maximum AI performance, while managing memory utilization and power consumption.
“Our collaboration with Arm on the GB10 Superchip will fuel the next generation of innovation in AI, combining NVIDIA’s AI expertise with Arm’s scalable compute platform to deliver exceptional performance and efficiency,” said Ashish Karandikar, VP of SoC Products at NVIDIA. “Now, with the introduction of Project DIGITS, every AI developer and researcher can have a powerful supercomputer at their fingertips.”
Unlocking software innovation
For developers, it is critical to have a fully integrated hardware and software AI platform. Project DIGITS uses the open-source Linux operating system, and users can access an extensive library of NVIDIA AI software, including software development tools, libraries, frameworks and AI models available in the NVIDIA NGC catalog and the NVIDIA Developer portal to accelerate their generative AI workflows.
Arm’s presence in datacenters with NVIDIA Grace Hopper and Grace Blackwell provides a consistent platform architecture across both datacenter and edge environments, allowing developers to seamlessly use the same set of tools for AI application development. Moreover, Arm has been supporting and driving critical work in open-source developer communities to enable the software needed to deploy AI everywhere. As a result, over 20 million software developers worldwide are building their applications on the Arm compute platform, enabling a growing open-source community that is innovating at a rapid scale.
The ideal platform for high performance AI compute
Arm is the world’s leading, most pervasive compute platform for AI now and in the future, making it an ideal platform for the GB10 Superchip used in Project DIGITS, a powerful PC desktop platform that can run large AI models of up to 200B parameters, which has not been possible until now. This influence across the AI ecosystem delivers flexible, performant and power-efficient AI capabilities to millions of developers worldwide. Working with NVIDIA and our leading software ecosystem, we cannot wait to see the next generation of highly innovative AI applications deployed.
¹ This petaflop figure is based on FP4 precision.
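To see why FP4 precision matters for the 200B-parameter claim, here is a rough back-of-the-envelope sketch in Python. It only counts the memory needed for model weights and compares it with the 128GB of unified memory quoted above. The function name and the choice of precisions are illustrative assumptions, and the estimate ignores activations, the KV cache and operating system overhead, so real headroom will be smaller.

```python
def weights_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


UNIFIED_MEMORY_GB = 128   # figure quoted in the article
NUM_PARAMS = 200e9        # 200B parameters, as quoted in the article

# Illustrative precisions; only FP4 is mentioned in the article.
for name, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4)]:
    size = weights_size_gb(NUM_PARAMS, bits)
    verdict = "fits" if size <= UNIFIED_MEMORY_GB else "does not fit"
    print(f"{name}: ~{size:.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

Under these assumptions, only the 4-bit representation of a 200B-parameter model comes in under 128GB, which is consistent with the article pairing the 200B figure with FP4.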
At Arm, we are constantly thinking about the future of computing. From the latest architectural features to new technologies for silicon solutions, everything we create and design is geared towards how technology will be used and experienced in the future.
This is supported by our unique position in the technology ecosystem where we have a vast understanding of the global, interconnected and highly specialized semiconductor supply chain covering all markets, from IoT to datacenters and everything in between. This means we have a broad range of insights into the future direction of technology and key trends that are likely to emerge in the years ahead.
With this in mind, we have the following technology predictions for 2025 and beyond, covering all aspects of technology, from the future growth of AI to silicon designs to key trends across different technology markets. Read on to see what Arm thinks the year ahead will bring…
Silicon
A rethinking of silicon design with chiplets being part of this solution
From a cost and physics perspective, it is getting increasingly difficult to do traditional silicon tape-outs. The industry will need to rethink silicon designs and go beyond these traditional approaches. For example, there is a growing realization that not everything needs to be integrated on a single, monolithic chip, with new approaches like chiplets beginning to emerge as foundry and packaging companies find new ways to push the boundaries of Moore’s law, but under new dimensions.
The different implementation techniques of chiplets are getting more attention and having a deep impact on core architecture and micro-architecture. For chiplets, architects will need to be increasingly aware of what different implementations offer, whether it’s the manufacturing process node or the packaging technology, and then take advantage of the features for performance and efficiency benefits.
Chiplets are already addressing specific market needs and challenges, with this likely to advance in the years ahead. In the automotive market, chiplets can help companies achieve auto-grade qualifications during the silicon development process. They can also help to scale and differentiate silicon solutions through utilizing different computing components. For example, chiplets focused on compute have a different number of cores, whereas memory-focused chiplets have different sizes and types of memories, so by combining and packaging these various chiplets at a system integrator level, companies can develop a greater number of highly differentiated products.
The Moore’s Law “recalibration”
Moore’s Law has put billions of transistors on a chip, doubled performance and halved power every year. However, this continuous push for more transistors, more performance and less power on a single, monolithic chip is not sustainable. The semiconductor industry will need to rethink and recalibrate Moore’s Law and what it means to them.
Part of this means moving away from solely focusing on performance as the key metric and instead valuing performance per watt, performance per area, performance per power and total cost of ownership as core metrics during silicon design. There are also new metrics that focus on the implementation aspect of the system – which present the most challenges for development teams – and making sure performance is not degraded once the IP is integrated into a system-on-chip (SoC) and then the overall system. Therefore, this will involve continuous performance optimizations during silicon development and deployment. These metrics are more relevant to where the wider tech industry is heading, as it pushes for more efficient computing for AI workloads.
The growth of specialized silicon
The industry-wide push for specialized silicon will continue to grow. The rise of AI has put power consumption in focus, emphasizing that the datacenter can no longer be built around off-the-shelf computing solutions. Instead, the compute must be built and designed around specific datacenters and workloads. Across the industry, we have witnessed a movement towards custom silicon, particularly with leading cloud hyperscalers, including Amazon Web Services (AWS), Google Cloud and Microsoft Azure. In 2025, we expect this movement to continue with significant investment in technologies that allow leading technology companies to more rapidly design and deploy custom silicon, from ASIC services to chiplets.
True commercial differentiation in silicon solutions
The push for more specialized silicon will be part of the ongoing move for companies to deliver true commercial differentiation with their silicon solutions. A big part of this is the growing adoption of compute subsystems (CSS), which are core computing components that enable companies – big and small – to differentiate and customize their solutions, with each configured to perform or contribute towards specific computing functions or specialized functionalities.
The growing importance of standardization
Standardized platforms and frameworks are vital to ensure the ecosystem can differentiate their products and services, adding true commercial value while saving time and costs. With the emergence of chiplets that integrate different computing components, standards will be more important than ever, as they will enable different hardware from different vendors to work together seamlessly. Arm is already working with more than 50 technology partners on the Arm Chiplet System Architecture (CSA) and expects more to join as part of this growing push towards standardization in the chiplet marketplace. In the automotive industry, this will be combined with SOAFEE, which aims to de-couple hardware and software in software-defined vehicles (SDVs), leading to greater flexibility and interoperability between computing components and faster development cycles.
Ecosystem collaboration on silicon and software like we have never seen before
As the complexities of silicon and software continue to grow, no single company will be able to cover every level of silicon and software design, development and integration alone, with deep levels of ecosystem collaboration needed. This provides unique opportunities for different companies – big and small – to deliver different computing components and solutions based on their core competencies. This is especially relevant to the automotive industry, which needs to bring together the entire supply chain – from silicon vendors and Tier 1s to OEMs and software vendors – to share their expertise, technologies and products to define the future of AI-enabled SDVs and fulfil their true potential for the end-user.
The rise of AI-enhanced hardware design
The semiconductor industry will see increased adoption of AI-assisted chip design tools, where AI helps optimize floor plans, power distribution, and timing closure. This approach will not only improve performance results, but also accelerate the development cycle of optimized silicon solutions and enable smaller companies to enter the market with specialized chips. While AI will not replace human engineers, it will become an essential tool in handling the growing complexity of modern chip design, particularly for power-efficient AI accelerators and edge devices.
AI
Significant investments into performant, power-efficient AI
A topic at the forefront of everyone’s minds – governments, industry and society in general – is how to manage the increasing power and compute demands in the age of AI. Finding ways to limit the power consumption of large datacenters without compromising performance is paramount, especially with the world’s datacenters requiring 460 terawatt-hours (TWh) of electricity annually, which is equivalent to the annual electricity consumption of the entire country of Germany.
Performant, power efficient AI will be achieved through co-design of the entire system, with investments in both hardware and software. From a hardware perspective, further advancements in underlying processor technologies and CPU architecture will ensure that AI is processed as efficiently as possible. Specialized hardware will leverage these advancements to efficiently handle intensive AI workloads across all aspects of the datacenter, including network, storage, security, and data management. Meanwhile, new innovative software will optimize AI workloads, so they can operate with fewer resources while maintaining or improving performance.
The continuous growth of AI inference
In the year ahead, AI inference workloads will continue to grow, helping to ensure that AI can be deployed widely and sustainably everywhere. This growth is being driven by the increasing number of AI-enabled devices and services. In fact, the majority of AI inference, which covers everyday AI use cases like text generation and summarization, can take place on smartphones and laptops, which provides more responsive and secure AI-based experiences for the end-user. In order to see this growth, devices must be built upon a bedrock which enables faster processing, lower latency and efficient power management. Two key Armv9 architecture features here are SVE2 and SME2, which combine to enable fast and efficient AI workloads on the Arm CPU.
The power of heterogeneous computing in the AI era
It is clear that no individual piece of hardware or computing component will be the answer for all workloads. This is especially important as AI inference continues to permeate all aspects of compute across all types of devices, from smart thermostats to the datacenter. We have seen significant growth of AI accelerators in 2024, including from leading hyperscalers, but leveraging those accelerators for AI workloads requires a CPU platform. Arm Neoverse is providing a level of flexible computing that makes the coupling of an Arm CPU with an accelerator seamless, enabling new types of engineering creativity and development like we have seen with the Grace Blackwell superchip that pairs NVIDIA’s Blackwell GPU architecture with the Arm Neoverse-based Grace CPU. The Arm CPU is the most ubiquitous compute platform in the world, and we expect to see more of these heterogeneous compute collaborations in 2025.
AI at the edge will grow in prominence
In 2024, we have seen an increasing number of AI workloads running at the edge – on device – rather than being processed in large datacenters. This means power and cost savings, as well as privacy and security benefits for consumers and businesses.
2025 will also likely see the emergence of sophisticated hybrid AI architectures that separate AI tasks between edge devices and the cloud. These systems will use AI algorithms in edge devices to detect events of interest before employing cloud models to provide additional information. Determining where to run AI workloads, locally or in the cloud, will be based on factors like available power, latency requirements, privacy concerns, and computational complexity.
Edge AI workloads represent a shift towards decentralized AI, enabling smarter, faster, and more secure processing on the devices closest to the data source, which will be particularly beneficial in markets that require higher performance and localized decision-making, such as industrial IoT and smart cities.
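The placement decision described above can be sketched as a small rule-based function in Python. This is a minimal illustration, not a description of any real product: the class names, fields, rule ordering and the 20 percent battery threshold are all hypothetical assumptions, chosen only to show how available power, latency requirements, privacy concerns and computational complexity might each feed into the choice.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    latency_budget_ms: float   # how quickly a result is needed
    privacy_sensitive: bool    # data should not leave the device
    compute_gflops: float      # rough computational complexity


@dataclass
class EdgeDevice:
    battery_pct: float          # available power
    max_gflops: float           # what the device can reasonably handle
    cloud_round_trip_ms: float  # measured network latency to the cloud


def choose_target(w: Workload, d: EdgeDevice, min_battery_pct: float = 20.0) -> str:
    """Return "edge" or "cloud" using simple, hypothetical priority rules."""
    if w.privacy_sensitive:
        return "edge"    # privacy concerns keep the data local
    if d.cloud_round_trip_ms > w.latency_budget_ms:
        return "edge"    # the cloud cannot answer within the latency budget
    if w.compute_gflops > d.max_gflops:
        return "cloud"   # too complex for the device
    if d.battery_pct < min_battery_pct:
        return "cloud"   # conserve the remaining power
    return "edge"


camera = EdgeDevice(battery_pct=80, max_gflops=10, cloud_round_trip_ms=120)
print(choose_target(Workload(50, False, 5), camera))      # tight latency budget
print(choose_target(Workload(2000, False, 500), camera))  # heavy model, relaxed latency
```

A real system would likely weigh these factors with measured costs rather than fixed rules, but the ordering here reflects one plausible priority: privacy first, then latency, then capability and power.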
The acceleration of smaller language models (SLMs)
Smaller, compact models that use greater compression and quantization and have fewer parameters are evolving at a rapid rate. Examples include Llama, Gemma and Phi-3, which are more cost-effective, more efficient, and easier to deploy on devices with limited computational resources, and we expect the number to grow in 2025. These models can run directly on edge devices, bringing enhanced performance and privacy.
We expect to see a rise in SLMs used for on-device tasks for language and device interactions, as well as vision-based tasks, such as interpreting and scanning for events. In the future, learnings will be distilled from larger models to develop local expert systems.
Multimodal AI models that can hear, see and understand more
Right now, large language models (LLMs), like GPT-4, are trained on human text. If these models are asked to describe a scene, they respond with a text-based description. However, we are beginning to see the advent of multimodal AI models that include text, images, audio, sensor data and many more. These multimodal models will deliver more advanced AI-based tasks through audio models that can hear, vision models that can see, and behavior models that can understand relationships between people and objects. These will give AI the ability to sense the world just as humans do – being able to hear, see and experience.
The growing use of AI agents
If users interact with AI today, they are likely to be interacting with a single AI that will do its best to complete the task requested on its own. With AI agents, a person still tells one AI the task required, but it would delegate this to a network of AI agents or bots. Examples of industries that have started using AI agents today include customer service support and coding assistants. We expect this to grow substantially in the year ahead across more industries, as AI becomes even more connected and intelligent. This will help to set the next stage for the AI revolution, leading to an even bigger impact on productivity for both our personal and work lives.
More powerful, intuitive, intelligent applications
Fueled by the rise of AI, there will be more powerful and personalized applications on devices. These will include more intelligent and intuitive personal assistants, and even personal physicians, with applications moving away from simply reacting to user requests, to making proactive suggestions based on the user and the environment they find themselves in. The move towards the hyper-personalization of AI through such applications will lead to an exponential rise in data usage, processing and storage, which supports the need for greater security measures and regulatory guidance from industry and governments.
Healthcare will be the key AI use case
Healthcare appears to be one of the leading use cases for AI, with this set to accelerate in 2025. Examples of AI for health use cases include predictive healthcare, digital record storage, digital pathology, and the development of vaccines and gene therapy to help cure diseases. In 2024, Demis Hassabis and John Jumper of Google DeepMind were awarded a share of the Nobel Prize in Chemistry, as they worked with scientists to use AI to predict complex protein structures with 90 percent accuracy. Meanwhile, the use of AI has been shown to shorten the R&D cycle during the drug research process by 50 percent. The benefits of such AI innovations to society are significant, accelerating the research and creation of life-saving medicines. Furthermore, the combination of mobile devices, sensors and AI will allow users to have access to far better health data, so they can make more informed decisions about their own personal health.
A push towards “greener AI”
The integration of sustainable practices within AI is set to accelerate. Alongside the use of power-efficient technologies, there could be an increasing focus on “greener AI” approaches. For example, in response to increasing energy demands, training AI models in lower-emission geographies and during periods of lower grid demand could evolve to become standard practice. By balancing energy loads on the grid, this approach will help mitigate peak demand stress and reduce overall emissions. Therefore, expect to see more cloud providers offering scheduling options for energy-efficient model training.
Other approaches could include optimizing existing AI models for efficiency, reusing or repurposing pre-trained AI models, and an uptake in “green coding” to minimize energy usage. We may also start to see the introduction of voluntary, then formal, standards for sustainable AI development as part of this wider move towards “greener AI.”
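As a rough illustration of the scheduling idea above, the following Python sketch picks the start hour for a training job so that it runs during the contiguous window with the lowest average grid carbon intensity. The function name and the hourly forecast values are made up for illustration; a real scheduler would pull forecasts from a grid operator or cloud provider, which is outside the scope of this sketch.

```python
def greenest_window(intensity: list[float], job_hours: int) -> tuple[int, float]:
    """Return (start_hour, average_intensity) of the lowest-carbon contiguous window.

    `intensity` is an hourly forecast (for example in gCO2/kWh) and `job_hours`
    is how long the training job needs to run without interruption.
    """
    if not 0 < job_hours <= len(intensity):
        raise ValueError("job_hours must be between 1 and the forecast length")

    # Sliding window: update the running sum instead of re-adding every hour.
    window_sum = sum(intensity[:job_hours])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(intensity) - job_hours + 1):
        window_sum += intensity[start + job_hours - 1] - intensity[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / job_hours


# Hypothetical 8-hour forecast, not real grid data.
forecast = [420, 390, 310, 180, 150, 170, 260, 400]
start, average = greenest_window(forecast, job_hours=3)
print(f"Start at hour {start}, average intensity {average:.1f}")
```

The same approach could be extended to compare regions as well as hours, which would cover the "lower-emission geographies" part of the idea.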
Advancements in renewable energy paired with AI
The combination of renewable energy and AI is expected to increase innovation across the energy sector. The variability of renewable sources and the lack of flexibility to balance peak loads currently limit energy grid decarbonization. However, we expect AI to help predict energy demand with greater accuracy, optimize grid operations in real-time, and enhance the efficiency of renewable energy sources to help tackle this. Energy storage solutions are also benefiting from AI, which optimizes battery performance and longevity, both of which are crucial for balancing the intermittent nature of renewable energy sources.
AI integration not only offers solutions to the challenge of predicting and balancing peak demand, but also helps with predicting maintenance needs. This reduces disruptions, while smart grids leverage AI for real-time energy flow management and reduced wastage. Advancements in AI paired with renewable energy promise significant improvements in the efficiency and sustainability of energy systems.
Markets
Heterogeneous computing for all types of AI in IoT
The broad range of AI applications, particularly in IoT, will need to use different computational engines for different AI demands. To maximize the deployment of AI workloads, CPUs will remain a key focus for deployment on existing devices. New IoT devices will offer enhanced performance for AI with increased memory sizes and higher performance Cortex-A CPUs. Embedded accelerators, such as the latest Ethos-U NPUs, will be used to accelerate low power machine learning (ML) tasks, and bring power-efficient edge inference to a broader range of use cases such as industrial machine vision and consumer robotics.
Essentially, in the short term, we will see multiple compute elements used to serve the AI needs of specific applications. This trend will continue to emphasize the need for common tools, libraries and frameworks to enable application developers to make the most of the capabilities in the underlying hardware. There is no “one-size-fits-all” solution for edge AI workloads, highlighting the importance of a flexible compute platform for the ecosystem.
Increasing adoption of virtual prototypes to transform the silicon and software development process (Automotive)
Virtual prototypes are accelerating silicon and software development cycles, with companies able to develop and test software before the physical silicon is ready. The benefits are particularly relevant to the automotive industry where the availability of virtual platforms is accelerating automotive development cycles by up to two years.
In 2025, we are expecting more companies to launch their own virtual platforms as part of this ongoing transformation of the silicon and software development process. These virtual platforms will work seamlessly, with Arm architecture offering ISA parity, ensuring uniformity in architecture in the cloud and at the edge. With ISA parity, the ecosystem can build their virtual prototype in the cloud and then seamlessly deploy at the edge.
This saves significant time and costs, while giving developers more time to extract even greater performance from their software solutions. Following the introduction of the Armv9 architecture to automotive markets for the very first time in 2024, we expect to see more developers take advantage of this ISA parity in automotive and leverage virtual prototyping to build and deploy automotive solutions quicker.
End-to-end AI will enhance automated driving systems
Generative AI technology is rapidly being adopted in end-to-end models that promise to address scalability barriers faced by traditional automated driving (AD) software architectures. With end-to-end self-supervised learning, AD systems are more capable of generalizing to cope with previously unseen scenarios. This novel approach promises an effective way of enabling faster scaling of operational design domains, making it quicker and cheaper to deploy AD technology from the highway to urban areas.
More hands-off driving, but more driver monitoring too
Progress in the harmonization of vehicle regulations for L2+ hands-off DCAS and L3 ALKS will accelerate wide deployment of these premium features worldwide. Leading automakers are already investing in equipping vehicles with the hardware necessary to up-sell these features through subscriptions throughout the lifetime of the vehicle.
In order to prevent driver misuse of driving automation systems, regulations and New Car Assessment Programs are focusing on increasingly sophisticated in-cabin monitoring systems like driver monitoring (DMS). In Europe, for example, EuroNCAP 2026’s new rating scheme will incentivize deeper integration of direct-sensing (e.g. camera-based) DMS with advanced driver assistance systems (ADAS) and AD features to provide adequate vehicle responses to different levels of driver disengagement.
The smartphone will remain the primary consumer device for decades, not just years
The smartphone’s crown as the world’s primary consumer device will not be going away anytime soon. In fact, it is likely to be the go-to device for consumers for decades, not just years, with no device likely to offer a realistic challenge. With adoption of Armv9 growing across leading smartphones, this will mean more computing capabilities and better application experiences from new flagship smartphones in 2025 that will only reinforce this number one position. However, it is clear that consumers use different devices for different purposes, with the smartphone being primarily used for app use, web browsing and communication, whereas PCs and laptops are still seen as the “go-to” devices for productivity and work-based tasks.
It will also be interesting to see the emergence of AR wearables, like smart glasses, as the ideal companion device for the smartphone. A key reason behind the staying power of the smartphone is its ability to evolve, from apps to cameras to gaming, and now the industry is seeing the emergence of new usage models for AR, with the smartphone beginning to support AR-based experiences from wearable devices.
The ongoing miniaturization of technology
Across the tech industry, we are seeing smaller, sleeker devices, like AR smart glasses and smaller wearable tech. This is being made possible through a combination of factors. Firstly, the adoption of power-efficient technologies that offer the performance required to support key device features and experiences. Secondly, in the case of AR smart glasses, we are now seeing the adoption of ultra-thin silicon carbide technologies that not only enable high-definition displays, but also dramatically reduce the thickness and weight of the devices. Finally, new compact language models are transforming AI-based experiences across these smaller devices, making them more immersive and interactive. The powerful combination of power-efficiency, lightweight hardware and smaller AI models will be a driver for the growth of smaller, more capable consumer devices in the next year.
The continuous rise of Windows on Arm
In 2024, the Windows on Arm (WoA) ecosystem saw significant progress, with the most widely used applications now providing Arm-native versions. In fact, 90 percent of the time spent by the average Windows user will be with applications that are now Arm-native. One recent example is Google Drive, which released an Arm-native version at the end of 2024. We expect this momentum to continue throughout 2025 and lead to WoA becoming even more appealing for both developers and consumers, as demonstrated by the impressive performance enhancements for Arm-native applications, like Google Chrome, that are essential to everyday user experiences.
2024 was the year of the Arm compute platform for AI, with significant announcements, products and initiatives that made this a reality. Here’s the Arm A to Z of what we achieved in the past year.
A is for Automotive Enhanced (AE), with a brand-new suite of technologies announced for the automotive market, including the first time we introduced Neoverse beyond server markets and Armv9 CPUs to the automotive market. This was alongside SOAFEE reaching 150 members from across the entire automotive supply chain.
B is for Bears, with Arm-based AI being used to identify bears as part of the BearID wildlife conservation project, which was discussed in two Arm Viewpoints podcast episodes here and here.
C is for Compute Subsystems (CSS), with Arm launching two new Neoverse CSS built on brand-new third-generation Neoverse IP and CSS for Client with verified implementations of new Arm CPUs and GPUs, while a new CSS for Automotive was announced for 2025.
D is for Developers, with over 20 million software developers worldwide building their applications on the Arm compute platform. Developer engagement continued throughout 2024 via leading industry events, like GitHub Universe, Microsoft Build and Ignite, KubeCon NA, PyTorch Conference, and AWS re:Invent. The ever-growing Arm Developer Program also hit 13,000 members in 2024.
E is for Edge AI, with the new Arm Ethos-U85 NPU offering support for transformer networks, which, when combined with Arm CPUs, accelerates edge AI for IoT applications that demand higher performance and localized real-time decision making.
F is for Frederique Olivier, a wildlife photographer and explorer who uses Arm-based tech to capture stunning visuals and footage. Her story was featured as part of the ‘Purpose on Arm’ collaboration with National Geographic.
G is for GitHub, with a combination of the new Arm extension for GitHub Copilot, GitHub Actions, native GitHub runners and AI frameworks all supported on Arm, which unlocks simplified AI workflows and deployments from cloud to edge for millions of developers worldwide.
H is for Heterogeneous computing, with the Arm compute platform providing the flexibility for partners to use and integrate different computational engines, including accelerator technologies, in specialized silicon solutions for different AI use cases and demands.
I is for Influence, with Arm’s role as a global semiconductor ambassador facilitating vital discussions on security and AI at a G7 Semiconductor Experts Working Group hosted at Arm’s global HQ in Cambridge, UK.
J is for Jensen Huang who featured as the first guest on the new Tech Unheard podcast, which is hosted by Arm CEO Rene Haas. Expect more podcast episodes with tech industry leaders, with the second episode featuring Mike Gallagher, Head of Defense at Palantir Technologies, and a former Congressman and Marine.
K is for Kleidi, Arm’s new CPU acceleration software, which seamlessly integrates with leading frameworks to ensure AI workloads run best on the Arm CPU, with developers then able to access this accelerated performance. Kleidi has already been integrated into Meta’s ExecuTorch and PyTorch, Google’s MediaPipe and Tencent’s Hunyuan.
L is for Llama, with the Arm and Meta collaboration enabling the new Llama 3.2 Large Language Models (LLMs) on optimized Arm CPUs when running a range of AI inference workloads.
M is for Mobile Generative AI, with a range of use cases, including virtual chatbots, group chat and voice note summarization and real-time assistant, possible on Arm-powered mobile devices, like the latest AI flagship smartphones, via the Arm CPU.
N is for the Nasdaq-100 Index, which represents a collection of the top 100 largest and most actively traded companies on the Nasdaq stock exchange, with Arm among the fastest companies to be included in the index post-IPO.
S is for SOX (the PHLX Semiconductor Index), which Arm was added to in September 2024 in recognition of our rapid growth as a company, with SOX representing 30 of the largest eligible semiconductor companies listed in the U.S. ranked by market capitalization.
T is for Total Design, with the number of Arm Total Design members doubling since the ecosystem was formed in 2023. This has led to powerful ecosystem collaborations that are accelerating the development and deployment of specialized silicon solutions, including chiplets, for the cloud and datacenter.
U is for Unlocking new possibilities in the cloud, with leading partners including AWS, Google Cloud and Microsoft announcing a range of hardware and software innovations to support developers’ cloud workloads.
V is for Virtual Platforms, with a range of automotive partners launching their own Arm-based virtual prototyping solutions to accelerate silicon and software development.
Z is for Net Zero, with the 2024 Arm Sustainable Business Report providing comprehensive information about how Arm positively impacts the environment, our people and society as a whole.
We cannot wait to see what 2025 brings, with Arm continuing to be the prominent force that is driving all AI experiences.
Each year kicks off with the Consumer Electronics Show (CES), which showcases the latest and greatest tech innovations from the world’s technology companies, big and small. At CES 2024, AI took center stage, with attendees demoing their latest AI-based tech solutions, including many of Arm’s partners from automotive, consumer technology and IoT markets.
At CES 2025, we anticipate that AI will remain front and center at the event, as it continues to expand and grow at a rapid rate. In fact, Ami Badani, Arm’s Chief Marketing Officer, will be talking with industry leaders, including Meta and NVIDIA, on how to power a sustainable AI revolution. However, we also expect to see more specific tech trends emerging that will set the tone for innovation for the rest of the year.
This blog outlines these trends, and how Arm and our partners are playing leading roles across each one. These include:
Autonomous driving innovations;
More AI coming to the car;
Accelerating automotive software development;
AI-powered smart home devices, including the TV;
Momentum around Arm-based PCs and laptops;
Driving XR tech adoption; and
The rise of high-performance edge AI.
Autonomous driving innovations
2024 saw various technology innovations that are set to take us closer to fully-fledged autonomous vehicles on the roads. The collaboration between Arm and Nuro is helping to accelerate this future, with the Nuro Driver integrating Arm’s Automotive Enhanced (AE) solutions for more intelligent and advanced autonomous experiences in cars. NVIDIA will bring Arm Neoverse-V3AE to its upcoming DRIVE Thor for next-generation software-defined vehicles (SDVs). Several leading OEMs have already announced plans to adopt the chipset for their automotive solutions, including BYD, Nuro, XPENG, Volvo and Zeekr. We expect CES 2025 to highlight the latest technology solutions and collaborations that will define the future of autonomous driving in the year ahead and beyond.
More AI coming to the car to enhance the driver experience
Across in-vehicle infotainment (IVI) and advanced driver assistance systems (ADAS), there have been various OEM innovations in the past year, with AI models being integrated into these systems. For example, Mercedes-Benz is using ChatGPT for intelligent virtual assistants within its vehicles. It will be fascinating to see the broad range of OEM innovations on display at CES 2025, with 94 percent of global automakers using Arm-based technology for automotive applications. This is alongside the top 15 automotive semiconductor suppliers in the world adopting Arm technologies in their silicon solutions.
As a technology thought leader in the automotive industry, Dipti Vachani, Arm’s SVP and GM for the Automotive Line of Business, will be participating in a CES 2025 panel with leading OEMs, including BMW, Honda and Rivian, as well as Nuro, on revolutionizing the future of driving through unleashing the power of AI. The panel will discuss the technological impacts of AI on future vehicle designs.
However, hardware innovation is only as strong as the software that runs on it, which is why we are looking forward to AWS, Elektrobit, LeddarTech, and Plus.AI highlighting their latest AI-enabled solutions at CES 2025. AWS will be showcasing its new generative AI-powered voice-based user guide for inside the vehicle, which runs on virtual hardware in the cloud before running on the physical instance in the car. The chatbot-based solution allows users to interact with the car on its features and dashboard information, with the AI small language model (SLM) being continuously kept up-to-date via software.
Other AI-based in-vehicle demos include the US debut of Elektrobit’s first functional safety compliant Linux operating system (OS) for automotive applications, the EB corbos Linux, which has been announced as a 2025 Honoree in Vehicle Tech and Advanced Mobility in the CES 2025 Innovation Awards. LeddarTech, which is already optimizing its ADAS perception and fusion algorithms by using the latest Arm AE solutions and virtual platforms, will display its latest LeddarVision software for SDVs. Meanwhile, Plus.AI will be highlighting its latest AI-based autonomous driving software solutions, demonstrating how its autonomous driving technology stack can scale across all levels of autonomy for passenger cars and commercial vehicles, with this running on any Arm-based hardware.
Accelerating automotive software development
As the automotive industry evolves to introduce more SDVs on the road, accelerating software development is becoming critical. In 2024, as part of our launch of new automotive technologies, we announced a range of new virtual platforms from our partners. These are transforming the silicon design, development and deployment process, as the virtual platforms allow our partners to develop and test their software before physical hardware is ready. This accelerates development times and leads to a faster time-to-market.
Tata Technologies will be presenting its cloud-to-car demo, which showcases technologies from all four members of the SDV Alliance – which was launched at CES 2024 – that run both on physical Arm-based hardware and on virtual platforms running in an AWS Graviton-powered cloud instance. Meanwhile, AWS will also be showcasing its Graviton G4 hosted Arm RD-1AE reference implementation running on a Corellium virtual platform. Finally, QNX is using CES 2025 to show how developers can create their own innovative cross-platform solutions through its highly accessible software.
The value of ecosystem collaborations in automotive
At CES 2025, we expect to see a range of automotive partners highlighting the value of ecosystem collaborations to support the development and deployment of software in vehicles. This includes Mapbox, a leading platform for powering location experiences for automakers such as BMW, General Motors, Rivian and Toyota, which recently launched its own virtual platform solution, the Virtual Head Unit (VHU), in partnership with Arm and Corellium. The solution empowers leading automakers to expedite the integration, testing, and validation of their navigation systems.
There will also be a range of SOAFEE members highlighting their latest Blueprints at the event. LG will be introducing the LG PICCOLO, which enhances its Battery Management System (BMS) from a solution that has limited update capabilities to one that can be continuously updated and customized with new scenarios at any time. We have been working with LG to integrate BMS and LG PICCOLO into the cloud virtual platform for the Arm RD1-AE, allowing for virtual validation, lower costs and a quicker time-to-market before deployment to the vehicles. In addition, Tier IV and Denso will showcase their SOAFEE Open AD kit Blueprints for autonomous driving, and Red Hat will highlight its mixed-critical demo to improve security and safety in SDVs.
AI-powered smart home devices, including the TV
Previous CES events have demonstrated the possibilities of true integration across smart home devices and applications, like heating, lighting and security, with the TV at the center of these experiences. This is likely to continue at CES 2025, as the smart home effectively becomes a “smart assistant” that adjusts settings in the home based on user preferences, from temperature and light settings to playing music.
It will also be interesting to see the range of new AI-powered features and applications in next-generation TVs on display at CES. This started with picture quality enhancements and content recommendations, but AI in the TV is now powering a range of new use cases, including health and fitness through body tracking via the smart camera. CES 2025 is likely to unearth yet more fascinating AI use cases for the TV, including new immersive experiences.
Moreover, as with previous CES events, the latest premium TVs will be on full display. These include new leading-edge Arm-powered TVs from LG, Hisense, Samsung and TCL. CES 2024’s “showstopper” TV product was LG’s transparent TV, so it will be interesting to see what will take the crown in 2025.
Momentum around Arm-based PCs and laptops
In 2024, there was significant progress with the Windows on Arm (WoA) ecosystem, with the most widely used applications on PC and laptop now providing Arm-native versions. Most recently, Google released an Arm-native version of Google Drive for WoA. This continuous momentum means WoA is an increasingly attractive area of tech for the wider ecosystem. We also expect a range of hardware for AI PCs to be highlighted at the event. This includes MediaTek’s Kompanio SoCs for Chromebook devices that are increasingly adopting new AI-based features.
Driving XR tech adoption
2024 saw significant XR tech innovation, with new AR smart glasses, like Snap’s fifth-generation Spectacles, Meta’s next-generation Ray-Ban, and Meta’s Orion smart glasses, being launched and announced. Hardware advancements, including touch screens and camera miniaturization, as well as software improvements in applications and operating systems, have created opportunities for XR wearable devices to become more mainstream.
CES 2025 will provide the perfect platform to highlight further innovation in the XR space, whether this is new wearable devices or supporting tech and apps. For example, SoftBank-backed ThinkAR will be showcasing its range of wearable devices, including AI smart glasses and wearable AI assistants. Meanwhile, there will be AI updates to current generation XR wearable products, like Meta’s Ray-Ban AR smart glasses.
The rise of high-performance edge AI
CES 2024 saw a range of low power IoT products from Arm partners showcasing edge AI capabilities, enabling use cases like presence, face and gesture detection, and natural language processing. At CES 2025, we expect a step-up in edge AI through higher performance use cases on IoT devices, like localized decision-making, real-time data processing and responses, and autonomous navigation. These are particularly beneficial for applications servicing primary industries, as well as smart cities, industrial IoT and robotics, where quick responses to environments are crucial for functionality and safety.
Looking at the shortlist for the CES 2025 Innovation Awards, there is a range of innovative Arm-powered tech products across IoT industries that are showcasing advanced edge AI use cases. For industrial IoT and robotics, R2C2 ARIII is a robot brain that enhances autonomous industrial inspection, while DeepRobotics is demoing its Lynx four-foot robotic dog for diverse terrains. Elsewhere, SoftBank-backed Aizip is highlighting its on-device edge AI application for high-accuracy fish counting in underwater environments.
CES runs on Arm
With unmatched scale that touches 100 percent of the connected global population, we fully expect the Arm compute platform to feature heavily across many of the technologies on display at CES 2025. We will be kicking off the new year through showing the world that Arm is at the heart of AI experiences, with CES running on Arm-powered technology.
To get the latest Arm CES 2025 updates visit here.
Llama is an open and accessible collection of large language models (LLMs) tailored for developers, researchers, and businesses to innovate, experiment, and responsibly scale their generative AI ideas. The Llama 3.1 405B model stands out as the top-performing model in the Llama collection. However, deploying and utilizing such a large-scale model presents significant challenges, especially for individuals or organizations lacking extensive computational resources.
To address those challenges, Meta is introducing the Llama 3.3 70B model, which retains the same architecture as the Llama 3.1 70B model but incorporates the latest advancements in post-training techniques for greater model evaluation performance, while delivering notable improvements in reasoning, mathematics, general knowledge, instruction following, and tool use. Compared to the Llama 3.1 405B model, it offers similar performance, while being significantly smaller in size.
In close partnership with Meta, Arm’s engineering teams evaluated the inferencing performance of the Llama 3.3 70B model on Google Axion, a family of custom Arm64-based processors built on Arm’s Neoverse V2 technology, which are available through the Google Cloud. Google Axion is designed for higher performance, lower power consumption and greater scalability than legacy, off-the-shelf processors, which better prepares its data centers for the age of AI.
Our benchmarking shows that C4A virtual machines (VMs) based on Axion processors deliver seamless AI-based experiences when running the Llama 3.3 70B model and achieve human readability levels across multiple user batch sizes. Human readability refers to the average speed at which a human can read text. This gives developers the flexibility to attain high-quality performance in text-based applications that is comparable to results produced with the Llama 3.1 405B model, while no longer requiring large computational resources.
CPU inferencing performance with Llama 3.3 70B on Google Axion processors
Google Cloud offers Axion-based C4A VMs with up to 72 vCPUs and 576 GB of RAM. For these tests, we used the mid-range, cost-effective c4a-standard-32 machine type to deploy the Llama 3.3 70B model with 4-bit quantization. To run our performance testing, we used the popular llama.cpp framework, which as of version b4265 has been optimized with Arm Kleidi. The Kleidi integration provides optimized kernels to ensure AI frameworks can by default unlock the AI capabilities and performance of Arm CPUs.
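To see why 4-bit quantization matters for fitting a 70B model on a CPU-only VM, here is a rough back-of-the-envelope sketch. It assumes a nominal 70 billion parameters and counts only the weights; the function name and the 4.5 bits-per-weight variant (which accounts for per-block scale factors used by some 4-bit formats) are illustrative assumptions, not figures from the benchmark.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores the KV cache, activations, and runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 70e9  # nominal count implied by the "70B" name (assumption)

weights_fp16 = quantized_weight_gib(N_PARAMS, 16)   # unquantized 16-bit weights
weights_4bit = quantized_weight_gib(N_PARAMS, 4)    # idealized 4-bit weights
weights_q4 = quantized_weight_gib(N_PARAMS, 4.5)    # 4-bit plus per-block scales (assumption)

print(f"16-bit:  {weights_fp16:6.1f} GiB")
print(f"4-bit:   {weights_4bit:6.1f} GiB")
print(f"4.5-bit: {weights_q4:6.1f} GiB")
```

The point of the sketch is the ratio: 4-bit weights need roughly a quarter of the memory of 16-bit weights, which is what brings a 70B model within reach of a mid-range machine type.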
Now let’s get a closer look at the results.
Prompt encoding speed refers to how quickly user inputs are processed and interpreted by the language model. As prompt encoding is done in parallel and leverages multiple cores, as shown in Figure 1, performance remains consistent at around 50 tokens per second across various batch sizes, and the speed is comparable for the different prompt sizes tested.
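As a rough illustration of what this rate means in practice, the sketch below converts a prompt length into an approximate encoding time. It assumes the full rate of about 50 tokens per second is available to a single prompt; the prompt lengths and the function name are hypothetical.

```python
PROMPT_TOKENS_PER_SEC = 50.0  # approximate prompt encoding rate quoted above


def prompt_encoding_seconds(prompt_tokens: int,
                            tokens_per_sec: float = PROMPT_TOKENS_PER_SEC) -> float:
    """Estimated time to encode a prompt at a constant encoding rate."""
    return prompt_tokens / tokens_per_sec


# Hypothetical prompt lengths, chosen only for illustration.
for n_tokens in (128, 512, 2048):
    print(f"{n_tokens:5d} tokens -> {prompt_encoding_seconds(n_tokens):6.2f} s")
```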
Token generation speed measures the rate at which the model generates responses when running the Llama 3.3 70B model. Arm Neoverse CPUs optimize machine learning workflows with advanced SIMD instructions, such as Neon and SVE, that are designed to accelerate General Matrix Multiplication (GEMM). To further boost throughput, especially for larger batch sizes, Arm has introduced specialized optimization instructions like SDOT (Signed Dot Product) and MMLA (Matrix Multiply Accumulate).
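To make the SDOT idea concrete, here is a minimal pure-Python model of what a signed dot product instruction computes: each 32-bit accumulator lane adds the dot product of four signed 8-bit values from each input. This is a sketch of the arithmetic only, not how Kleidi or llama.cpp implement their kernels; the function names are hypothetical and 32-bit wraparound is ignored.

```python
def sdot_lane(acc, a, b):
    """One 32-bit lane: acc plus the sum of four signed 8-bit products."""
    assert len(a) == len(b) == 4
    assert all(-128 <= x <= 127 for x in list(a) + list(b))
    return acc + sum(x * y for x, y in zip(a, b))


def sdot(acc, a, b):
    """Apply sdot_lane to every lane; a and b hold four int8 values per lane."""
    assert len(a) == len(b) == 4 * len(acc)
    return [
        sdot_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(len(acc))
    ]


# Two lanes, eight int8 inputs per operand.
acc = [0, 10]
a = [1, 2, 3, 4, -1, -2, -3, -4]
b = [5, 6, 7, 8, 5, 6, 7, 8]
result = sdot(acc, a, b)
print(result)
```

Because quantized models store weights as small integers, a single instruction of this kind replaces four multiplies and four adds per lane, which is why it suits the inner loops of quantized GEMM.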
As shown in Figure 2, the token generation speed increases with larger user batch sizes, while remaining relatively consistent across different token generation sizes tested. This capability to achieve higher throughput with larger batch sizes is essential for building scalable systems capable of serving multiple users effectively.
To evaluate the performance perceived by each user when multiple users are interacting with the model at the same time, we measured the token generation speed per batch. Token generation speed per batch is critical, as it directly influences the real-time experience during user interactions with the model.
As shown in Figure 3, the token generation speed reached an average human readability level for batch sizes up to 4, indicating that the performance remains stable as the system scales to accommodate multiple users. To accommodate larger numbers of concurrent users, leveraging serving frameworks like vLLM is beneficial, as these frameworks optimize KV cache management to enhance scalability.
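One plausible reading of this metric can be sketched as follows: divide the aggregate token generation rate by the number of concurrent users and compare the result with a readability threshold. The 5 tokens-per-second threshold and the measurement values below are placeholders chosen for illustration; they are not results from the benchmark, and the function names are hypothetical.

```python
HUMAN_READABILITY_TPS = 5.0  # assumed threshold; the article gives no number


def per_user_tps(aggregate_tps: float, batch_size: int) -> float:
    """Token generation speed seen by each user in a batch."""
    return aggregate_tps / batch_size


def max_readable_batch(measurements: dict, threshold: float = HUMAN_READABILITY_TPS) -> int:
    """Largest batch size whose per-user speed meets the threshold (0 if none).

    measurements maps batch size -> aggregate tokens per second.
    """
    readable = [b for b, tps in measurements.items()
                if per_user_tps(tps, b) >= threshold]
    return max(readable) if readable else 0


# Placeholder numbers for illustration only, NOT benchmark results.
example = {1: 6.0, 2: 11.5, 4: 21.0, 8: 30.0}
for batch, tps in example.items():
    print(f"batch {batch}: {per_user_tps(tps, batch):.2f} tokens/s per user")
print("largest readable batch:", max_readable_batch(example))
```

The sketch shows why aggregate throughput can keep rising with batch size while the per-user experience degrades: the per-user rate is what has to stay above reading speed.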
A game-changer for generative AI
The new Llama 3.3 70B model is a potential game-changer for the accessibility and efficiency of large-scale AI. The smaller model size makes generative AI processing more accessible to the ecosystem, with large computational resources no longer required. Meanwhile, the Llama 3.3 70B model helps to deliver more efficient AI processing that is vital for datacenter and cloud workloads, while delivering comparable performance to the Llama 3.1 405B model on model evaluation benchmarks.
Through our benchmarking work, we have demonstrated how Google Axion processors, powered by Arm Neoverse, provide a smooth and efficient experience when running the Llama 3.3 70B model, delivering text generation with human-level readability across multiple user batch sizes tested.
We’re proud to continue our close partnership with Meta to enable open-source AI innovation on the Arm compute platform, helping to ensure that Llama LLMs operate seamlessly and efficiently across hardware platforms.
This blog also had contributions from Milos Puzovic, Technical Director, Arm, and Nobel Chowdary Mandepudi, Graduate Software Engineer, Arm.
The Arm Meta partnership
Learn more about how Arm and Meta are unlocking AI technologies together.
The cloud computing landscape is undergoing a dramatic transformation, driven by the explosive growth of AI. As AI applications become increasingly sophisticated and demanding, the need for powerful, efficient, and cost-effective computing solutions has never been greater. Customers deploying their workloads in the cloud are rethinking what infrastructure they need to meet the requirements of these modern workloads. Their requirements range from achieving better performance and reduced costs to achieving new benchmarks in energy efficiency for regulatory or sustainability goals.
Arm and AWS have a long-standing collaboration aimed at providing specialized silicon and compute, paving the way for a more efficient, sustainable, and powerful cloud. This week at AWS re:Invent 2024, you’ll see more evidence for how Graviton4 marks a significant leap forward, empowering developers and businesses to unlock the full potential of their cloud workloads.
Exceptional Performance Benefits
The latest Arm Neoverse V2 based AWS Graviton4 processors provide up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than previous generation Graviton3 processors. Thanks to these advantages, we are now seeing a significant adoption of AWS Graviton processors in the ecosystem and by customers.
The Arm Neoverse V2 platform includes new capabilities of the Armv9 architecture, such as high-performance floating-point and vector instruction support, with features like SVE/SVE2, Bfloat16, and Int8 MatMul delivering strong performance for AI/ML and HPC workloads.
AI/ML Workloads
To further drive adoption of AI workloads, Arm launched Arm Kleidi earlier this year, collaborating with leading AI frameworks and the software ecosystem to ensure the full ML stack can benefit from out-of-the-box inference performance optimizations on Arm, allowing developers to build their workloads without needing extra Arm-specific expertise. We’ve showcased how these optimizations in PyTorch enable running LLMs such as Llama 3 70B and Llama 3.1 8B on AWS Graviton4 with significantly improved tokens/sec and time-to-first-token metrics.
For HPC workloads, Graviton4 marks a significant leap forward in capability compared to Graviton3E, providing 16% more main-memory bandwidth per core and double the L2 cache per vCPU. These are significant for HPC application performance, which is often memory-bandwidth bound, and AWS has managed to achieve benefits across these areas as shown below.
For EDA workloads, Graviton4 delivers up to 37% higher performance over Graviton3 for RTL simulation workloads as measured by production runs conducted by Arm’s engineering teams.
Ecosystem Adoption
Over the last few years, we have seen a continual ramp in adoption across the software ecosystem with end customers deploying a wide range of cloud workloads on AWS Graviton processors. Customers are saving money, seeing better performance, and improving their carbon and sustainability footprints. Here are a few examples:
Upcoming AWS re:Invent 2024
If you are visiting AWS re:Invent 2024, you can check out the following key sessions on a wide range of topics related to AWS Graviton processors. For a full list of more than 60 sessions on AWS Graviton, check out the event’s official agenda.
Get ready to harness the power of Graviton
We believe the future of cloud computing is undoubtedly Arm-powered, and are proud to support AWS in placing Graviton at the forefront of this revolution. Arm continues to invest in further strengthening our software ecosystem and removing any friction for developers to build on Arm – and to access all the performance and efficiency benefits the Arm compute platform delivers.
Developer resources
Here are some key resources and avenues to engage directly with us and AWS Graviton teams:
Learn.arm.com: Explore in-depth technical resources on Arm architecture.
As we move into the era of advanced computing, Arm is leading the charge with groundbreaking tech innovations. November 2024 has been a month of significant strides in technology innovation, particularly in AI, machine learning (ML), Arm Neoverse-based Kubernetes clusters, and system-on-chip (SoC) architecture.
The Arm Editorial Team has highlighted the cutting-edge tech innovations that happened at Arm in November 2024 – all of which are set to shape the next generation of intelligent, secure, and high-performing computing systems.
Harnessing SystemReady to drive software interoperability on Arm-based hardware
Arm’s SystemReady program ensures interoperability and standardization across Arm-based devices. Dong Wei, Standards Architect and Fellow at Arm, talks about how the program benefits the industry by reducing software fragmentation, lowering development costs, and enabling faster deployment of applications on a wide range of Arm hardware. Meanwhile, Pere Garcia, Technical Director at Arm, discusses the certification benefits of SystemReady, including how it simplifies software deployment across different Arm devices, reduces fragmentation, and enhances the reliability of embedded systems.
Building safe, secure, and versatile software with Rust on Arm
The Rust programming language enhances safety and security in software development for Arm-based systems by eliminating common programming errors at compile time. In the first blog of this three-part Arm Community series, Jonathan Pallant, Senior Engineer and Trainer at Ferrous Systems, explains why Rust’s unique blend of safety, performance, and productivity has gained attention from government security agencies and the White House.
This blend delivers a robust toolchain for mission-critical applications, reduces vulnerabilities, and improves overall software reliability. Part 2 and Part 3 from Jonathan Pallant provide more insights.
Boosting efficiency with Arm Performance Libraries 24.10
Arm Performance Libraries 24.10 enhance math libraries for 64-bit Arm processors, boosting efficiency in numerical applications with improved matrix-matrix multiplication and Fast Fourier Transforms. Chris Goodyer, Director, Technology Management at Arm, highlights significant speed improvements, including a 100x increase in the Mersenne Twister random number generation. This offers faster, more accurate computations for engineering, scientific, and ML applications across Linux, macOS, and Windows.
Enabling real-time sentiment analysis on Arm Neoverse-based Kubernetes clusters
Real-time sentiment analysis on Arm Neoverse-based Kubernetes clusters enables businesses to efficiently process social media data for actionable insights. Na Li, ML Solutions Architect at Arm, demonstrates how AWS Graviton instances with tools like Apache Spark, Elasticsearch, and Kibana provide a scalable, cost-effective framework adaptable across cloud providers to enhance analytics performance and energy efficiency.
Enhancing control flow integrity with PAC and BTI on AArch64
Harnessing Arm’s Scalable Vector Extension (SVE) in C#
.NET 9’s support for Arm Scalable Vector Extension (SVE) allows developers to write more efficient vectorization code. Alan Hayward, Staff Software Engineer, highlights how using SVE in C# improves performance and ease of use, enabling more optimized applications on Arm-based systems.
Expanding Arm on Arm with the NVIDIA Grace CPU
The NVIDIA Grace CPU, built on Arm Neoverse V2 cores, enhances performance, reduces costs, and improves energy efficiency for high-performance computing (HPC) and data center workloads. Tim Thornton, Director Arm on Arm, discusses how the NVIDIA Grace CPU Superchip-based servers enable Arm to deploy high-performance Arm compute in their own data centers, providing access to the same Neoverse V2 cores used in AWS and Google Cloud.
Accelerating ML development with Corstone-320 FVP: A guide for Arm Ethos-U85 and Cortex-M85
The benefits of the new Arm Corstone-320 Fixed Virtual Platform (FVP) include developing and testing ML applications without physical hardware. Zineb Labrut, Software Product Owner in Arm’s IoT Line of Business, highlights how this platform accelerates development, reduces costs, and mitigates risks associated with hardware dependencies, making it a valuable tool for developers in the embedded and IoT space.
Real-time Twitter/X sentiment analysis on the Arm Neoverse CPU: A KubeCon NA 2024 demo
Pranay Bakre, a Principal Solutions Engineer at Arm, demonstrates the power of an Arm Neoverse-based CPU as it runs a real-time Twitter/X sentiment analysis program using a StanfordNLP model during KubeCon North America 2024. More information can also be found in this blog.
Learning all about the Fragment Prepass for Arm Immortalis and Mali GPUs
The latest Arm GPUs for consumer devices – the Immortalis-G925, Mali-G725 and Mali-G625 – all adopt a new feature called the Fragment Prepass. Tord Øygard, a Principal GPU Architect at Arm, provides more details about this Hidden Surface Removal technique in this Arm Community blog, which leads to improved performance and power efficiency when processing geometry workloads for graphics and gaming.
Elsewhere with graphics and gaming at Arm, Ian Bolton, Staff Developer Relations Manager, summarized Arm’s involvement in the inaugural AI and Games Conference in this Arm Community blog.
Exploring .NET 9 and Arm’s SVE with Microsoft and VectorCamp’s simd.info
In the latest “Arm Innovation Coffee”, Kunal Pathak from Microsoft dives into .NET 9 and Arm’s SVE, while Konstantinos Margaritis and Georgios Mermigkis from VectorCamp showcase simd.info, a cutting-edge online reference tool for C intrinsics across major SIMD engines.
Built on Arm partner stories: AI, automotive, cloud and Windows on Arm
Insyde Software CTO Tim Lewis highlights their groundbreaking AI BIOS product, which leverages AI to simplify firmware settings, and showcases their expertise in developing firmware for everything from laptops to servers. Lewis also explains how collaborating with Arm is driving innovation in power management and security.
Tilo Schwarz, VP and Head of Autonomy at Nuro, discusses the innovative Nuro Driver, a state-of-the-art software and hardware solution powering autonomous driving technology, and explains how its partnership with Arm is shaping the future of modern transportation.
Bruce Zhang, Computing Product Architect at Alibaba, discusses how Arm and Alibaba are accelerating AI workloads in the cloud and enabling robust applications that are transforming industries.
Francis Chow, VP and GM of the Edge Business at Red Hat, explains how Red Hat’s solutions, which run on Arm technologies, incorporate leading-edge performance, power efficiency and real-time data processing features as part of the industry-wide move to modern software-defined vehicles.
Aidan Fitzpatrick, CEO and Founder of Reincubate, a London-based software company, talks about the benefits of Windows on Arm to their flagship application Camo, which enables users to create high-quality video content. Main benefits include advanced AI-powered features and performance without sacrificing battery life.
Exploring the latest trends in the automotive and connected vehicle space
As part of a panel at Reuters Event Automotive USA, Dipti Vachani, SVP and GM of the Automotive Line of Business at Arm, highlights the latest trends and challenges in the automotive and connected vehicle space.
The Arm Tech Symposia 2024 events in China, Japan, South Korea and Taiwan were some of the biggest and best attended events ever held by Arm in Asia. The size of all the events was matched by the enormity of the occasion that is being faced by the technology industry.
As Chris Bergey, SVP and GM of Arm’s Client Line of Business, said in the Tech Symposia keynote presentation in Taiwan: “This is the most important moment in the history of technology.”
There are significant opportunities for AI to transform billions of lives around the world, but only if the ecosystem works together like never before.
A re-thinking of silicon
At the heart of these ecosystem collaborations is a broad re-think of how the industry approaches the development and deployment of technologies. This is particularly applicable to the semiconductor industry, with silicon no longer a series of unrelated components but instead becoming “the new motherboard” to meet the demands of AI.
This means multiple components co-existing within the same package, providing better latency, increased bandwidth and more power efficiency.
Silicon technologies are already transforming the everyday lives of people worldwide, enabling innovative AI features on smartphones, like the real-time translation of languages and text summarization, to name a few.
As James McNiven, VP of Product Management for Arm’s Client Line of Business, stated in the South Korea Tech Symposia keynote: “AI is about making our future better. The potential impact of AI is transformative.”
The importance of the Arm Compute Platform
The Arm Compute Platform is playing a significant role in the growth of AI. This combines hardware and software for best-in-class technology solutions for a wide range of markets, whether that’s AI smartphones, software-defined vehicles or data centers.
This is supported by the world’s largest software ecosystem, with more than 20 million software developers writing software for Arm, on Arm. In fact, all the Tech Symposia keynotes made the following statement: “We know that hardware is nothing without software.”
How software “drives the technology flywheel”
Software has always been an integral part of the Arm Compute Platform, with Arm delivering the ideal platform for developers to “make their dreams (applications) a reality” through three key ways.
Firstly, Arm’s consistent compute platform touches 100 percent of the world’s connected population. This means developers can “write once and deploy everywhere.”
The foundation of the platform is the Arm architecture and its continuous evolution through the regular introduction of new features and instruction-sets that accelerate key workloads to benefit developers and the end-user.
SVE2 is one feature that is present across AI-enabled flagship smartphones built on the new MediaTek Dimensity 9400 chipset. It incorporates vector instructions to improve video and image processing capabilities, leading to better quality photos and longer-lasting video.
Secondly, the platform provides acceleration capabilities that deliver optimized performance for developers’ applications. This is not just about high-end accelerator chips, but about having access to AI-enabled software to unlock performance.
One example of this is Arm Kleidi, which seamlessly integrates with leading frameworks to ensure AI workloads run best on the Arm CPU. Developers can then unlock this accelerated performance with no additional work required.
At the Arm Tech Symposia Japan event, Dipti Vachani, SVP and GM of Arm’s Automotive Line of Business, said: “We are committed to abstracting away the hardware from the developer, so they can focus on creating world changing applications without having to worry about any technical complexities around performance or integration.”
This means that when the new version of Meta’s Llama, Google AI Edge’s MediaPipe and Tencent’s Hunyuan come online, developers can be confident that no performance is being left on the table with the Arm CPU.
Kleidi integrations are set to accelerate billions of AI workloads on the Arm Compute Platform, with the recent PyTorch integration leading to 2.5x faster time-to-first token on Arm-based AWS Graviton processors when running the Llama 3 large language model (LLM).
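For readers unfamiliar with the metric, time-to-first-token is the delay between submitting a prompt and receiving the first generated token. The sketch below shows one way to measure it for any streaming generator. It is illustrative only: `fake_model` is a hypothetical stand-in with simulated delays, not a Kleidi, PyTorch or Llama API, and the timings it produces say nothing about real hardware.

```python
import time
from typing import Callable, Iterable, Tuple


def time_to_first_token(generate: Callable[[], Iterable[str]]) -> Tuple[float, float, int]:
    """Return (seconds until first token, total seconds, token count)."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in generate():
        if first is None:
            # The first token has arrived: record the elapsed time once.
            first = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    if first is None:
        raise ValueError("generator produced no tokens")
    return first, total, count


def fake_model():
    # Hypothetical stand-in for a real streaming LLM call.
    time.sleep(0.05)  # simulated prompt processing before the first token
    for token in ["Once", " upon", " a", " time"]:
        yield token
        time.sleep(0.01)  # simulated per-token generation time


if __name__ == "__main__":
    ttft, total, n = time_to_first_token(fake_model)
    print(f"first token after {ttft:.3f}s, {n} tokens in {total:.3f}s")
```

Running the same measurement before and after an optimization, with the same prompt and model, is what makes a ratio such as "2.5x faster" meaningful.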
Finally, developers need a platform that is easy to access and use. Arm has made this a reality through significant software investments that ensure developing on the Arm Compute Platform is a simplified, seamless experience that “just works.”
As each Arm Tech Symposia keynote speaker summarized: “The power of Arm and our ecosystem is that we deliver what developers need to simplify the process, accelerate time-to-market, save costs and optimize performance.”
The role of the Arm ecosystem
The importance of the Arm ecosystem in making new technologies a reality was highlighted throughout the keynote presentations. This is especially true for new silicon designs that require a combination of core expertise across many different areas.
As Dermot O’Driscoll, VP, Product Management for Arm’s Infrastructure Line of Business, said at the Arm Tech Symposia event in Shanghai, China: “No one company will be able to cover every single level of design and integration alone.”
Empowering these powerful ecosystem collaborations is a core aim of Arm Total Design, which enables the ecosystem to accelerate the development and deployment of silicon solutions that are more effective, efficient and performant. The program is growing worldwide, with the number of members doubling since the program was launched in late 2023. Each Arm Total Design partner offers something unique that accelerates future silicon designs, particularly those that are built on Arm Neoverse Compute Subsystems (CSS).
One company that exemplifies the spirit and value of Arm Total Design is South Korea-based Rebellions. Recently, it announced the development of a new large-scale AI platform, the REBEL AI platform, to drive power efficiency for AI workloads. Built on Arm Neoverse V3 CSS, the platform uses a 2nm process node and packaging from Samsung Foundry and leverages design services from ADtechnology. This demonstrates true ecosystem collaboration, with different companies offering different types of highly valuable expertise.
Dermot O’Driscoll said: “The AI era requires custom silicon, and it’s only made possible because everyone in this ecosystem is working together, lifting each other up and making it possible to quickly and efficiently meet the rising demands of AI.”
Arm Total Design is also helping to enable a new thriving chiplet ecosystem that already involves over 50 leading technology partners who are working with Arm on the Chiplet System Architecture (CSA). This is creating the framework for standards that will enable a thriving chiplet market, which is key to meeting ongoing silicon design and compute challenges in the age of AI.
The journey to 100 billion Arm-based devices running AI
All the keynote speakers closed their Arm Tech Symposia keynotes by reinforcing the commitment that Arm CEO Rene Haas made at COMPUTEX in June 2024: 100 billion Arm-based devices running AI by the end of 2025.
However, this goal is only possible if ecosystem partners from every corner of the technology industry work together like never before. Fortunately, as explained in all the keynotes, there are already many examples of this work in action.
The Arm Compute Platform sits at the center of these ecosystem collaborations, providing the technology foundation for AI that will help to transform billions of lives around the world.
As developers and platform engineers seek greater performance, efficiency, and scalability for their workloads, Arm-based cloud services provide a powerful and trusted solution. At KubeCon NA 2024, we had the pleasure of meeting many of these developers face-to-face to showcase Arm solutions as they migrate to Arm.
Today, all major hyperscalers, including Amazon Web Services (AWS), Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure (OCI), offer Arm-based servers optimized for modern cloud-native applications. This shift offers a significant opportunity for organizations to improve price-performance ratios, deliver a lower total cost of ownership (TCO), and meet sustainability goals, while gaining access to a robust ecosystem of tools and support.
At KubeCon NA, it was amazing to hear from those in the Arm software ecosystem share their migration stories and the new possibilities they’ve unlocked.
Arm from cloud to edge at KubeCon
Building on Arm unlocks a wide range of options from cloud to edge. It enables developers to run their applications seamlessly in the cloud, while tapping into the entire Arm software and embedded ecosystem and respective workflows.
Arm-based servers are now integrated across leading cloud providers, making them a preferred choice for many organizations looking to enhance their infrastructure. At KubeCon NA 2024, attendees learned about the latest custom Arm compute offerings available from major cloud service providers including:
AWS Graviton series for enhanced performance and energy efficiency;
Microsoft Azure Arm-based VMs for scalable, cost-effective solutions;
Google Cloud’s Tau T2A instances for price-performance optimization; and
OCI Ampere A1 Compute for flexible and powerful cloud-native services.
Ampere showcased their Arm-based hardware in multiple form factors across different partner booths at the show to demonstrate how the Arm compute platform is enabling server workloads both in the cloud and on premises.
System76’s Thelio Astra, an Arm64 developer desktop featuring Ampere Altra processors, was also prominently displayed in booths across the KubeCon NA show floor. The workstation is streamlining developer workflows for Linux development and deployment across various markets, including automotive and IoT.
During the show, the Thelio Astra showcased its IoT capabilities by aggregating and processing audio sensor data from Arduino devices to assess booth traffic. This demonstrated cloud-connected IoT workloads in action.
Migrating to Arm has never been easier
Migrating workloads to Arm-based servers is more straightforward than ever. Today, 95% of graduated CNCF (Cloud Native Computing Foundation) projects are optimized for Arm, ensuring seamless, efficient, and high-performance execution.
Companies of all sizes visited the Arm booth at KubeCon NA to tell us about their migration journey and learn how to take advantage of the latest developer technologies. They included leading financial institutions, global telecommunications providers and large retail brands.
For developers ready to add multi-architecture support to their deployments, we demonstrated a new tool, kubearchinspect, that can be deployed on a Kubernetes cluster to scan container images and check them for Arm compatibility. Check out our GitHub repo to get started and learn how to validate Arm support for your container images.
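To give a feel for what "Arm compatibility" means for a container image, the sketch below inspects an OCI image index (the multi-architecture manifest list published by registries) and reports whether an `arm64` variant is listed. This is not kubearchinspect's implementation, which we have not reproduced here. The function names and the trimmed `EXAMPLE_INDEX` document are hypothetical, and fetching the index from a registry is left out.

```python
import json
from typing import List


def supported_architectures(image_index: dict) -> List[str]:
    """List the CPU architectures advertised by a multi-arch image index."""
    archs: List[str] = []
    for manifest in image_index.get("manifests", []):
        arch = manifest.get("platform", {}).get("architecture")
        # Some build tools add entries with an "unknown" platform
        # (for example attestations); these are not runnable images.
        if arch and arch != "unknown" and arch not in archs:
            archs.append(arch)
    return archs


def supports_arm64(image_index: dict) -> bool:
    """True if the index lists at least one arm64 image."""
    return "arm64" in supported_architectures(image_index)


# Hypothetical, trimmed image index used only for illustration.
EXAMPLE_INDEX = json.loads("""
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}
""")

if __name__ == "__main__":
    print(supported_architectures(EXAMPLE_INDEX))
    print(supports_arm64(EXAMPLE_INDEX))
```

An image whose index lists only `amd64` would need to be rebuilt for, or replaced by, a multi-architecture image before it can be scheduled on Arm nodes.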
Hundreds of independent software vendors (ISVs) are enabling their applications and services on Arm, with developers easily monitoring application performance and managing their workloads via the Arm Software Dashboard.
For developers, the integration of GitHub Actions, GitHub Runners, and the soon to be available Arm extension for GitHub Copilot, means a seamless cloud-native CI/CD workflow is now fully supported on Arm. Graduated projects can scale using cost-effective Arm runners, while incubating projects benefit from lower pricing and improved support from open-source Arm runners.
Extensive Arm ecosystem and Kubernetes support
As Kubernetes continues to grow, with 5.6 million developers worldwide, expanding the contributor base is essential to sustaining the cloud-native community and supporting its adoption in technology stacks. Whether developers are using AWS EKS, Azure AKS, or OCI’s Kubernetes service, Arm is integrated to provide native support. This enables the smooth deployment and management of containerized applications.
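As a minimal sketch of what targeting Arm nodes can look like, the snippet below builds a Deployment manifest that uses the well-known `kubernetes.io/arch` node label to schedule pods onto `arm64` nodes. The helper name, application name and image reference are hypothetical, and the manifest is deliberately stripped down. `kubectl apply -f` accepts JSON as well as YAML, so the printed output can be saved to a file and applied as-is.

```python
import json


def arm64_deployment(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal Deployment that is scheduled only onto arm64 nodes."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Well-known node label set by the kubelet on every node.
                    "nodeSelector": {"kubernetes.io/arch": "arm64"},
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical application name and image reference.
    manifest = arm64_deployment("sentiment-api", "registry.example.com/sentiment-api:1.0")
    print(json.dumps(manifest, indent=2))
```

In a mixed-architecture cluster, a multi-architecture image usually makes the node selector unnecessary, because each node pulls the variant that matches its own architecture.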
Scaling AI workloads and optimizing complex inference pipelines can be challenging across different architectures. Developers can deploy their AI models across distributed infrastructure, seamlessly integrating with the latest AI frameworks to enhance processing efficiency.
Through a demonstration at the Arm booth, Pranay Bhakre, a Principal Solutions Engineer at Arm, showcased AI over Kubernetes. This brought together Kubernetes, Prometheus and Grafana open-source projects into a power-efficient real-time, scalable, sentiment analysis application. More information about how to enable real-time sentiment analysis on Arm Neoverse-based Kubernetes clusters can be found in this Arm Community blog.
Additionally, at KubeCon NA 2024, we launched a pilot expansion of our “Works on Arm” program into the CNCF community. This offers comprehensive resources to help scale and optimize cloud-native projects on the Arm architecture. Developers can take a short survey to request inclusion in this new initiative.
Switch to Arm for smarter deployment and scalable performance
As demonstrated at KubeCon 2024, Arm is transforming cloud-native deployment and accelerating the developer migration to Arm.
In fact, now is the perfect time to harness Arm-based cloud services for better performance, lower costs, and scalable flexibility. Developers can start building or migrating today to deploy smarter, optimized cloud-native applications on Arm, for Arm.
Developers are welcome to join us at KubeCon Europe in April 2025 to learn more about our latest advancements in platform engineering and cloud-native technologies.
We’re living in a generation of compute that is being defined by AI – a transformation that is happening at a pace unlike anything we’ve seen before. Arm remains on the critical path to enabling this AI-accelerated future in a sustainable and scalable way, providing new engineering innovation and developments to make it happen. It’s clear to me that this vision is shared across our ecosystem, including at this week’s Microsoft Ignite event.
Across the many AI advancements announced by Microsoft, it’s evident they are on the path to building a sustainable, scalable, and secure platform for AI and that they’re dedicated to changing the way developers build, deploy, and scale their applications in the cloud. Arm’s collaboration with Microsoft on Azure Cobalt 100 has already shifted the landscape of cloud data centers and the services offered by Microsoft in just one year since its launch in 2023. By leveraging the flexibility and power-efficiency of Arm Neoverse Compute Subsystems (CSS), Microsoft is pushing the boundaries of compute with Cobalt 100, establishing a capable and flexible infrastructure supporting a wide variety of mission critical modern applications — from media servers and open-source databases to CI/CD pipelines.
AI has not only opened the world’s eyes to the power challenge in the datacenter, but it has unlocked a greater emphasis on the need for more specialized silicon. Every watt counts, and for change-makers like Microsoft, this means taking greater control over the entire infrastructure stack from silicon to cloud service deployment with sustainability in focus.
As mentioned in the Microsoft keynote, 100% of Microsoft Teams’ media processing capabilities now run on Cobalt 100, which is a testament to purpose-built compute delivering the required performance as efficiently as possible. This is the mission that Neoverse CSS was built for. Through tailored solutions like Cobalt 100, Microsoft is setting the stage for a future-ready cloud, capable of handling the growing demands of AI-enabled workloads without pushing energy consumption to unsustainable levels. To dig in on the impressive performance gains delivered by Cobalt 100-powered VMs to date, I encourage you to check out this week’s Arm Viewpoints podcast with Arpita Chatterjee, Senior Product Manager for Azure Platforms. And if you happen to tune into the Microsoft Ignite digital event, check out Arm’s virtual booth.
In addition to the impressive Cobalt 100 momentum to date, Microsoft announced they will be the first cloud vendor to make instances based on Nvidia’s Grace Blackwell platform available. Consisting of 72 Arm Neoverse V2 cores connected through a high-bandwidth coherent link to Nvidia’s latest Blackwell accelerator, Grace Blackwell is a great example of the kind of specialized silicon the Arm platform enables our partners to build, in this case targeting the most demanding AI training and inference workloads.
The groundwork for an AI-powered future
Arm’s longstanding partnership with Microsoft has been instrumental in our mission to enable a modern AI-enabled data center with specialized silicon, but silicon is not the limit of our work together. We’re partnering to make it as easy as possible for developers to transition their workloads to optimized, Arm-based platforms. With tools like the Arm Software Ecosystem Dashboard and a robust library of Azure-specific tutorials and resources, developers are getting access to a comprehensive view of software packages supported on Arm and hands-on instructions to seamlessly migrate and run their applications on Arm-based Microsoft Azure instances. One example I’m particularly excited about is the new Arm extension for GitHub Copilot, which will offer specialized tools for AI and standard code development, such as code migration, containerization, CI/CD workflows, and performance optimization. We’ll be releasing it in the GitHub Marketplace this year, so watch this space for more updates on availability!
Cobalt 100 is only one example of a movement toward Arm-based purpose-built computing solutions that is happening across the broader data center landscape. The Arm architecture is becoming the foundation for specialized silicon needed to achieve the performance and efficiency required to succeed in the AI era. Alongside decades of investment in a robust software ecosystem to help developers bring their AI innovations to life, this is the groundwork for an AI-powered future that brings innovative advances in sciences, commerce, productivity and more.
At AI Expo Africa 2024, Arm brought together AI developers, enthusiasts, and industry leaders through immersive workshops, insightful talks, exclusive networking opportunities, and an engaging booth experience. The event is Africa’s largest AI conference and trade show, with over 2,000 delegates from all over the African continent.
Arm has been attending AI Expo Africa for the past three years, and this year we noted a significant uptick in AI applications running on Arm and a definite thirst for knowledge about how best to deploy and accelerate AI on Arm. Held at the Sandton Convention Centre in Johannesburg, South Africa, Arm’s presence at the event left a strong impact on the AI developer ecosystem, fostering connections and sparking innovation, with a range of expert insights from Arm tech leaders and Ambassadors from the Arm Developer Program.
Arm Ambassadors are a group of experts and community leaders developing on Arm who support and help lead the Developer Program through a host of Arm-endorsed activities like the various talks, workshops and engagements at AI Expo Africa. At the event, there were Arm Ambassadors from Ghana, Kenya, Switzerland and, of course, South Africa in attendance.
Day 1: Workshops and live demos
Arm kicked off with a high-energy workshop that saw an incredible turnout. Shola Akinrolie, Senior Manager for the Arm Developer Program, opened the session with a keynote introduction, setting the stage for a deep dive into Arm’s AI technology and its community-driven initiatives.
Distinguished Arm Ambassador Peter Ing then took the spotlight, showing how to run AI models at the edge on the Arm Compute Platform. He demonstrated the Llama 3.2 1B model running on a Samsung mobile device, showcasing real-time AI inference capabilities and illustrating how Arm is creating new opportunities for running small language models on the edge. The live demo left the audience captivated by the performance and efficiency of the Arm Compute Platform.
Another standout session was led by Distinguished Arm Ambassador Dominica Abena Oforiwaa Amanfo, who shared her expertise on the Grove Vision AI V2 microcontroller (MCU), which is powered by a dual-core Arm Cortex-M55 CPU and an Ethos-U55 NPU. Dominica highlighted the device’s TinyML capabilities, as well as its compatibility with PyTorch and ExecuTorch. This showcased the reach and versatility of low-power, high-impact AI innovations that are powered by Arm.
The Arm booth: A hub of innovation
At AI Expo Africa, the Arm booth was bustling with energy, drawing hundreds of developers eager to experience Arm’s technology first-hand. The team engaged with visitors in discussions and hands-on demos. The booth was packed with excitement, from insightful tech exchanges to exclusive SWAG giveaways, including a highly sought-after Raspberry Pi MCU!
To end the day, Arm hosted an exclusive Arm Developer Networking Dinner. The evening was filled with lively discussions led by Arm’s Director of Software Technologies Rod Crawford and Arm Developer Program Ambassadors, as they shared their insights on AI’s future and the impact of edge computing across various industries.
Day 2: Inspiring talks and networking
On day two of the event, Arm’s Rod Crawford captivated the audience with a powerful talk on “Empowering AI from Cloud to Edge.” Rod shared how Arm supports developers in harnessing the full potential of AI, from efficient cloud computing to high-performance, edge-based AI solutions. This means developers can create more powerful applications that work better and faster.
The talk demonstrated how both generative AI and classic AI workloads could run across the entire spectrum of computing on Arm, from powerful cloud services to mobile and IoT devices. Through Arm Kleidi, Arm is engaging with leading AI frameworks, like MediaPipe, ExecuTorch and PyTorch, to ensure developers can seamlessly take advantage of AI acceleration on Arm CPUs without any changes to their code. Rod’s insights were met with enthusiasm as developers learned how Arm’s technologies accelerate AI deployment, even for the most demanding applications.
The final day wrapped up with a high-spirited “Innovation Coffee” session, offering attendees a relaxed environment to connect and reflect on Arm’s advancements. Stay tuned for highlights of this session on the Arm Software Developers YouTube channel.
A heartfelt thanks
Arm extends its deepest gratitude to everyone who contributed to and joined us at AI Expo Africa. Special thanks to the Arm team (Rod Crawford, Gemma Platt, and Stephen Ozoigbo) as well as the incredible Arm Developer Program Ambassadors Peter Ing, Dominica Amanfo, Derrick Sosoo, Brenda Mboya, and Tshega Mampshika for their hard work and passion. We also appreciate Marvin Rotermund, Nomalungelo Maphanga, Stephania Obaa Yaa Bempomaa, and Mia Muylaert for their energy and support at the booth.
Here is what some of the Arm Developer Program Ambassadors had to say about the event:
Brenda Mboya: “One of my favorite moments at the event was seeing the lightbulb go off for attendees who visited the Arm booth and realized how integral Arm has been in their lives. It was an honor to engage with young people interested in utilizing Arm-based technology in their school initiatives and I am glad that I was able to direct them to sign-up to be part of the Arm Developer Program.”
Derrick Sosoo: “Arm’s presence at AI Expo Africa 2024 marked a significant shift towards building strong connections with developers through immersive experiences. Our engaging workshops, insightful talks, Arm Developer meetup, and interactive booth showcase left an indelible mark on attendees.”
Dominica Amanfo: “We witnessed overwhelming interest from visitors eager to learn about AI on Arm and our Developer Program. I’m particularly grateful for the opportunity to collaborate with fellow Arm Ambassadors alongside our dedicated support team at the booth, which included students from the DUT Arm (E³)NGAGE Student Club.”
The future of AI is built on Arm
By uniting innovators, developers, and enthusiasts, Arm is leading the charge in shaping the future of AI. Together, we’re building a community that will drive the future of AI on Arm, empowering developers worldwide to innovate and bring cutting-edge technology to life.
Learn more about Arm’s developer initiatives and join the journey at Arm Developer Program.
Artificial intelligence (AI) in the automotive industry is no longer a future-looking buzzword. From smart navigation that learns from every journey to intelligent interactions between the driver and car, AI has been consistently revolutionizing the driving experience.
Moreover, AI is helping to save lives. It’s making roads safer with predictive safety features and driver assistance systems that feel like having a co-pilot with superhuman reflexes. But, contrary to popular belief, AI is not a recent phenomenon in the automotive sector and has been integrated into automotive applications for over two decades.
As Masashige Mizuyama, Representative Director, Vice President and CTO at Panasonic Automotive Systems, highlights in the recent Arm Viewpoints podcast: “AI has been integrated into automotive applications for over 20 years, evolving from simple voice commands to advanced deep learning models that understand natural language.”
This evolution goes beyond just adding new features; it’s about fundamentally transforming the driving experience. Advanced driver assistance systems (ADAS), Human Machine Interface (HMI), and in-vehicle infotainment (IVI) are prime examples of how AI enhances vehicle safety and user interaction. Moreover, the fusion of sensor data using AI improves safety and provides meaningful insights to both drivers and passengers.
AI in ADAS
One of the most prominent applications of AI in cars is ADAS. These systems enhance vehicle safety by providing real-time data processing and decision-making capabilities. According to a report by the Partnership for Analytics Research in Traffic Safety (PARTS), five ADAS features had achieved market penetration rates higher than 90% in new vehicles by 2023: forward collision warning, automatic emergency braking, pedestrian detection warning, pedestrian automatic emergency braking (AEB), and lane departure warning.
AI in HMI
Another significant advancement is the HMI. AI-powered voice recognition systems allow drivers to keep their eyes on the road and hands on the wheel while interacting with their vehicles. This technology is rapidly evolving, making in-car interactions more seamless and enhancing overall driving safety.
AI enables cars to perceive and infer the intentions of drivers and passengers, allowing for smarter and more autonomous responses. For instance, if a driver expresses a desire for coffee, the AI can recommend a nearby coffee shop, set the navigation route, and even place an order—all while minimizing distractions.
Moreover, AI’s ability to process vast amounts of data from various sensors is essential for ensuring safety and enhancing the overall driving experience. By leveraging AI, vehicles can provide a more comfortable and creative environment, allowing occupants to make the most of their time on the road.
Voice interaction is one of the most visible examples of this shift, and the technology is rapidly evolving to make in-car interactions more seamless and intuitive. In fact, according to GlobalData’s report, the automotive industry has seen over 720,000 patents filed and approved in the past three years. This pace of innovation points to the industry’s growing investment in technologies such as voice for in-car interactions.
AI-based IVI systems
AI-powered IVI systems are set to transform the driving experience by integrating multiple advanced technologies that continuously adapt to the habits of drivers. Voice recognition will be used for seamless, hands-free interaction, enhancing safety and convenience. Natural language processing will enable intuitive communication, making interactions feel more human-like.
AI-based data analytics, meanwhile, will provide real-time, relevant updates for drivers to create a more enjoyable, efficient, and personalized driving environment and experience. In fact, according to ABI Research, by 2030 consumers are expected to spend over 500 million hours annually using in-car video-on-demand apps.
The fusion of sensor data
Integrating various sensor data using AI is a critical advancement in automotive technology. This process, known as sensor fusion, combines data from multiple sensors to create a comprehensive understanding of the vehicle’s environment. This fusion not only improves the efficiency of data processing, but also enhances safety by providing meaningful insights to both drivers and passengers.
Sensor fusion technology allows autonomous vehicles to build a detailed model of their surroundings using data from RADAR, LiDAR, cameras, and ultrasonic sensors.
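To illustrate the basic idea, the sketch below fuses several independent distance estimates of the same object using inverse-variance weighting, so that more precise sensors count for more. This is a textbook technique chosen for illustration, not a description of any production ADAS stack, and the sensor values and variances in the example are made up.

```python
from typing import Iterable, Tuple


def fuse(measurements: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse (value, variance) pairs into a single (value, variance) estimate.

    Each measurement is weighted by 1 / variance, which is the optimal
    linear combination for independent, unbiased measurements.
    """
    weight_sum = 0.0
    weighted_value = 0.0
    for value, variance in measurements:
        if variance <= 0:
            raise ValueError("variance must be positive")
        weight = 1.0 / variance
        weight_sum += weight
        weighted_value += weight * value
    if weight_sum == 0.0:
        raise ValueError("at least one measurement is required")
    return weighted_value / weight_sum, 1.0 / weight_sum


if __name__ == "__main__":
    # Hypothetical distance readings in metres: (estimate, variance).
    readings = [
        (25.0, 0.25),  # RADAR
        (24.6, 0.04),  # LiDAR
        (26.0, 1.00),  # camera
    ]
    distance, variance = fuse(readings)
    print(f"fused distance: {distance:.2f} m (variance {variance:.4f})")
```

The fused variance is always smaller than that of the best individual sensor, which is the statistical reason combining sensors gives a more reliable picture than any one of them alone.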
Meanwhile, AI-driven sensor fusion also enables personalized in-car experiences. By analyzing data from various sensors, vehicles can adjust settings, such as seat position, climate control, and infotainment preferences based on the driver’s habits and preferences.
Simon Teng, Senior Director of Automotive Partnerships at Arm, emphasizes the importance of integrating various sensor data using AI: “This fusion of information not only improves the efficiency of data processing but also enhances safety by providing meaningful insights to both drivers and passengers. The ability to process complex instructions and deliver personalized experiences marks a significant leap in automotive technology.”
By leveraging AI, vehicles can offer a more intuitive and seamless driving experience. For instance, AI can analyze data from in-cabin cameras to detect driver drowsiness and issue alerts or adjust the vehicle’s settings to keep the driver alert. This proactive approach to safety and comfort is a testament to the transformative potential of AI in the automotive industry.
The future of AI in cars
Looking ahead, AI advancements promise to significantly improve the in-vehicle experience. Mizuyama-san envisions a future where AI enhances comfort and hospitality, allowing cars to proactively offer suggestions and controls based on the inferred needs of drivers and passengers.
Vehicles will transform into versatile spaces that can adapt to various needs, such as becoming a mobile office or a relaxing environment. By leveraging AI, cars can create personalized experiences that make time spent in the vehicle more enjoyable and productive.
How Arm and Panasonic Automotive Systems are pioneering innovations
Both Arm and Panasonic Automotive Systems are at the forefront of automotive innovation, working together to push the boundaries of what is possible. Mizuyama-san shared Panasonic Automotive Systems’ vision to become the best “Joy in Motion” design company, focusing on eliminating the pains of mobility and enhancing the overall user experience.
Overall, the integration of AI in the automotive industry is not just about adding new features; it’s about transforming the entire driving experience. As AI technology continues to advance, we can expect even greater innovations that will redefine how we interact with our vehicles.
As artificial intelligence evolves, there is increasing excitement about executing AI workloads on embedded devices using small language models (SLM).
Arm’s recent demo, inspired by Microsoft’s “Tiny Stories” paper and Andrej Karpathy’s TinyLlama2 project, where a small language model trained on 21 million stories generates text, showcases endpoint AI’s potential for IoT and edge computing. In the demo, a user inputs a sentence, and the system generates an extended children’s story based on it.
Our demo featured Arm’s Ethos-U85 NPU (Neural Processing Unit) running a small language model on embedded hardware. While large language models (LLMs) are more widely known, there is growing interest in small language models due to their ability to deliver solid performance with significantly fewer resources and lower costs, making them easier and cheaper to train.
Implementing A Transformer-based Small Language Model on Embedded Hardware
Our demo showcased the Arm Ethos-U85 as a small, low-power platform capable of running generative AI, highlighting that small language models can perform well within narrow domains. Although TinyLlama2 models are simpler than the larger models from companies like Meta, they are ideal for showcasing the U85’s AI capabilities. This makes them a great fit for endpoint AI workloads.
Developing the demo involved significant modeling efforts, including the creation of a fully integer int8 (and int8x16) Tiny Llama2 model, which was converted to a fixed-shape TensorFlow Lite format suitable for the Ethos-U85’s constraints.
Our quantization approach has shown that fully integer language models can maintain strong accuracy and output quality while remaining efficient. By quantizing activations, normalization functions, and matrix multiplications, we eliminated the need for floating-point computations, which are more costly in terms of silicon area and energy, both key concerns for constrained embedded devices.
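As a rough illustration of why integer arithmetic can stand in for floating point, the sketch below applies symmetric per-tensor int8 quantization to two small vectors, computes their dot product using integers only, and rescales the result. It is a simplified teaching example, not the scheme used in the demo: the function names are hypothetical, and the final rescale uses a floating-point multiply for readability, whereas a fully integer pipeline would use a fixed-point multiplier and shift instead.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: real ~= scale * q, with q in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int_dot(qa: List[int], qb: List[int]) -> int:
    """Integer-only dot product; hardware would accumulate this in int32."""
    return sum(a * b for a, b in zip(qa, qb))


def dequantize(accumulator: int, scale_a: float, scale_b: float) -> float:
    """Map the integer accumulator back to the real-valued domain."""
    return accumulator * scale_a * scale_b


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25]
    activations = [2.0, 1.0, -4.0]

    q_w, s_w = quantize_int8(weights)
    q_a, s_a = quantize_int8(activations)

    exact = sum(w * a for w, a in zip(weights, activations))
    approx = dequantize(int_dot(q_w, q_a), s_w, s_a)
    print(f"float result: {exact:.4f}, int8 result: {approx:.4f}")
```

The multiply-accumulate loop, which dominates the cost of a matrix multiplication, touches only small integers. The scale factors are applied once per output rather than once per multiplication.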
The Ethos-U85 ran a language model on an FPGA platform at only 32 MHz, achieving text generation speeds of 7.5 to 8 tokens per second—matching human reading speed—while using just a quarter of its compute capacity. In a real system-on-chip (SoC), performance could be up to ten times faster, significantly enhancing speed and energy efficiency for AI processing at the edge.
The children’s story-generation feature used an open-source version of Llama2, running the demo on TFLite Micro with an Ethos-U NPU back-end. Most of the inference logic was written in C++ at the application level. Adjusting the context window enhanced narrative coherence, ensuring smooth, AI-driven storytelling.
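The article does not describe how the demo managed its context window, so the following is only a sketch of one common approach for fixed-shape models: keep the most recent tokens, pad to the exact input length the model expects, and feed each generated token back in. It is written in Python for brevity even though the demo's application logic was C++. `PAD_ID`, `toy_model` and the helper names are hypothetical.

```python
from typing import Callable, List

PAD_ID = 0  # hypothetical padding token id


def fixed_window(tokens: List[int], context_len: int) -> List[int]:
    """Keep the most recent tokens and left-pad to exactly context_len entries."""
    if context_len < 1:
        raise ValueError("context_len must be at least 1")
    recent = tokens[-context_len:]
    return [PAD_ID] * (context_len - len(recent)) + recent


def generate(next_token: Callable[[List[int]], int], prompt: List[int],
             context_len: int, max_new_tokens: int) -> List[int]:
    """Autoregressive loop: each new token is appended and fed back in."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        window = fixed_window(tokens, context_len)
        tokens.append(next_token(window))
    return tokens


def toy_model(window: List[int]) -> int:
    # Hypothetical stand-in for the real model: returns last token + 1.
    return window[-1] + 1


if __name__ == "__main__":
    print(generate(toy_model, prompt=[5], context_len=4, max_new_tokens=3))
```

A larger window lets the model see more of the story so far, at the cost of more computation per generated token, which is the tradeoff behind tuning the window for coherence.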
The team’s adaptation of the Llama2 model to run efficiently on the Ethos-U85 NPU required careful consideration of performance and accuracy due to the hardware limitations. Using mixed int8 and int16 quantization demonstrates the potential of fully integer models, encouraging the AI community to optimize generative models for edge devices and expand neural network accessibility on power-efficient platforms like the Ethos-U85.
Showcasing the Power of the Arm Ethos-U85
Scalable from 128 to 2048 MAC units (multiply-accumulate units), the Ethos-U85 achieves a 20% power efficiency improvement over its predecessor, the Ethos-U65. A standout feature of the Ethos-U85 is its native support for transformer networks, which earlier versions could not support.
The Ethos-U85 enables seamless migration for partners using previous Ethos-U NPUs, allowing them to capitalize on existing investments in Arm-based machine learning tools. Developers are increasingly adopting the Ethos-U85 for its power efficiency and high performance.
The Ethos-U85 can reach 4 TOPS (trillions of operations per second) with a 2048 MAC configuration in silicon. In the demo, however, a smaller configuration of 512 MACs on an FPGA was used to run the Tiny Llama2 small language model with 15 million parameters at just 32 MHz.
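A back-of-the-envelope calculation shows how the two configurations compare. The sketch assumes the usual convention that one multiply-accumulate counts as two operations, and it assumes a 1 GHz clock for the 2048 MAC silicon configuration, which the text above does not state. These are peak figures only; sustained throughput depends on memory bandwidth and how well the model maps to the hardware.

```python
OPS_PER_MAC = 2  # one multiply plus one add, a common counting convention


def peak_ops_per_second(mac_units: int, clock_hz: int) -> int:
    """Theoretical peak operations per second for a MAC array."""
    return mac_units * OPS_PER_MAC * clock_hz


if __name__ == "__main__":
    silicon = peak_ops_per_second(2048, 1_000_000_000)  # assumed 1 GHz clock
    fpga = peak_ops_per_second(512, 32_000_000)         # demo: 512 MACs at 32 MHz
    print("silicon peak:", silicon / 1e12, "TOPS")
    print("FPGA demo peak:", fpga / 1e9, "GOPS")
    print("ratio:", silicon / fpga)
```

Under these assumptions the FPGA demo had roughly 1/125 of the peak compute of the largest silicon configuration.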
This capability highlights the potential for embedding AI directly into devices. The Ethos-U85 effectively handles such workloads even with limited memory (320 KB of SRAM for caching and 32 MB for storage), paving the way for small language models and other AI applications to thrive in deeply embedded systems.
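The memory figures can be sanity-checked with simple arithmetic. The sketch below assumes the weights are stored at 8 bits each and interprets 32 MB as decimal megabytes and 320 KB as 320 × 1024 bytes; it ignores activations, the tokenizer, and runtime buffers, so it is a lower bound on what the system actually needs.

```python
PARAMETERS = 15_000_000            # model size quoted for the demo
STORAGE_BYTES = 32 * 1000 * 1000   # 32 MB of storage (decimal MB assumed)
SRAM_BYTES = 320 * 1024            # 320 KB of SRAM used for caching


def weight_bytes(num_parameters: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights at a given precision."""
    return num_parameters * bits_per_weight // 8


if __name__ == "__main__":
    int8_size = weight_bytes(PARAMETERS, 8)
    print("int8 weights:", int8_size, "bytes")
    print("fits in storage:", int8_size <= STORAGE_BYTES)
    print("fits in SRAM:", int8_size <= SRAM_BYTES)
```

The int8 weights fit comfortably in storage but are far larger than the SRAM, which is why the SRAM is described as a cache: weights have to be streamed in from the larger memory as each layer runs.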
Bringing Generative AI to Embedded Devices
Developers need better tools to navigate the complexities of AI at the edge, and Arm is addressing this with the Ethos-U85 and its support for transformer-based models. As edge AI becomes more prominent in embedded applications, the Ethos-U85 is enabling new use cases, from language models to advanced vision tasks.
The Ethos-U85 NPU delivers the performance and power efficiency required for innovative, cutting-edge solutions. Like the “Tiny Stories” paper, our demo represents a significant advancement in bringing generative AI to embedded devices, demonstrating the ease of deploying small language models on the Arm platform.
Arm is opening new possibilities for Edge AI across a wide range of applications, positioning the Ethos-U85 to power the next generation of intelligent, low-power devices.
Read how Arm is accelerating real-time processing for edge AI applications in IoT with ExecuTorch.
When you’re driving hard to disrupt quantum computing paradigms, sometimes it’s smart to chill out.
That’s Equal1’s philosophy. The Ireland-based company has notched another milestone on its journey deeper into the rapidly evolving field of quantum computing. Building on its success as winners of the “Silicon Startups Contest” in 2023, Equal1 has successfully tested the first chip incorporating an Arm Cortex processor at an astonishing temperature of 3.3 Kelvin (-269.85°C). That’s just a few degrees warmer than absolute zero, the theoretical lowest possible temperature where atomic motion nearly stops.
Equal1’s achievement is a crucial step in integrating classical computing components within the extremely power-constrained environment of a quantum cryo chamber. This brings the world closer to practical, scalable quantum computing systems. Cold temperatures reduce thermal noise that can cause errors in quantum computations and preserve quantum “coherence” – the ability of qubits to exist in multiple states simultaneously.
The Importance of Cryogenic Temperatures in Quantum Computing
What sets Equal1 apart in the quantum computing landscape is its pragmatic approach to quantum integration. Rather than creating entirely new infrastructure, Equal1’s vision was to build upon the foundation of the well-established semiconductor industry. This strategy became viable with the emergence of fully depleted silicon-on-insulator (FDSOI) processes, which the company’s founders recognized as having the potential to support quantum operations.
“Our thesis is that rather than tear up everything we’ve done and start anew, let’s try to build on top of what we’ve already built,” said Jason Lynch, CEO of Equal1. This philosophy has led to partnerships with industry leaders like Arm and NVIDIA, leveraging existing semiconductor expertise while pushing into quantum territory.
Cryo-Temperature Breakthrough
What makes this accomplishment particularly remarkable is the extensive engineering required to make it possible.
“There is no such thing as a Spice Kit that works, that predicts what silicon is going to do at 3 Kelvin,” said Brendan Barry, Equal1’s CTO. “In fact, there’s no such thing as a methodology, no libraries you can get to make it happen.”
Over five years, Equal1, which is part of the Arm Flexible Access program, developed its own internal Process Design Kit (PDK) and methodologies to predict and optimize logic behavior at cryogenic temperatures.
Equal1’s approach uses electrons or holes (the absence of electrons) as qubits, making their technology uniquely compatible with standard CMOS manufacturing processes. This choice wasn’t accidental; it’s fundamental to the company’s vision of creating practical, manufacturable quantum computers.
Working with commercial CMOS fabs, Equal1 uses a standard process with proprietary design techniques developed over six years of research. These techniques enable operation at cryogenic temperatures while maintaining manufacturability.
“We’re not changing anything in the process itself, but we are certainly pushing the limits of what the process can do,” Barry said.
Integrating the Arm Cortex-A55 Processor
Building on this success, Equal1 is now setting its sights even higher. The company plans to incorporate the more powerful Arm Cortex-A55 processor into its next-generation Quantum System-on-Chip (QSoC). This ambitious project aims to have silicon available by mid-2025, the company said.
The integration of Arm technology is crucial not just for processing power, but for power efficiency. At cryogenic temperatures, power management becomes critical as any heat generated can affect the quantum states. Arm’s advanced power-management features make it an ideal choice for this challenging environment.
Equal1’s technology targets three primary application areas:
Chemistry and drug discovery, potentially reducing the current 15-year, $1.3 billion average cost of bringing new drugs to market.
Optimization problems in finance, logistics, and other fields requiring complex variable management.
Quantum AI applications, where quantum computing could dramatically improve efficiency.
Perhaps most revolutionary is Equal1’s approach to deployment. Unlike traditional quantum computers that require specialized facilities, Equal1 envisions rack-mounted quantum computers that can be installed in standard data centers at a fraction of the cost of current solutions.
“They just rack in like any other standard high-performance compute,” said Patrick McNally, Equal1’s marketing lead.
The Road Ahead for Quantum Computing and Equal1
Equal1’s progress brings the world closer to the reality of compact, powerful quantum computers that can be deployed in standard high-performance computing environments. The company’s integration of Arm technology at cryogenic temperatures opens new possibilities for quantum-classical hybrid systems, potentially creating increased demand for Arm adoption across the quantum computing industry.
As quantum computing continues to evolve, Equal1’s practical approach to integration with existing semiconductor technology and infrastructure could prove to be a game-changer. With applications ranging from drug discovery to financial modeling and beyond, the future of quantum computing is looking increasingly accessible and practical.
The rising adoption of advanced driver-assistance systems (ADAS), autonomous driving (AD) features, and software capabilities in software-defined vehicles (SDVs) is leading to growing computing complexities, particularly for software and developers. This has created a demand for more efficient, reliable, and powerful tools that streamline and strengthen the automotive development experience.
System76 and Ampere have responded to this need with Thelio Astra, an Arm64 developer desktop designed to revolutionize the Arm Linux development process for automotive applications. This innovative desktop offers developers the performance, compatibility, and reliability to push the boundaries of new and advancing automotive technologies.
Unlocking the potential of automotive software with Thelio Astra
Designed to meet the rigorous demands of ADAS, AD, and SDVs, the Thelio Astra uses the same architecture as Arm-based automotive electronic control units (ECUs). The architectural consistency ensures that the software developed for automotive applications runs efficiently on Arm-based systems without additional modifications.
This native-development environment provides faster, more cost-effective, and more power-efficient software testing, promoting safer roads with smarter prototypes. Moreover, by leveraging the same architecture in build and deployment environments, developers can streamline their processes by avoiding cross-compilation, which simplifies the build, test, and deployment environments.
Key benefits of Thelio Astra
Access to native performance: Developers can execute build and test cycles directly on Arm Neoverse processors, eliminating the performance overhead and complexities associated with instruction emulation and cross-compilation.
Improved virtualization: Familiar virtualization and container tools on Arm simplify the development and test process.
Better cost-effectiveness: Developers benefit from the ease of use and cost savings of having a local computer with a high core count, large memory, and plenty of storage.
Enhanced compatibility: Out-of-the-box support for Arm64 and NVIDIA GPUs eliminates the need for Arm emulation, which simplifies the developer process and overall experience.
Built for power efficiency: The system is engineered to prevent thermal throttling, ensuring reliable, sustained performance during the most intensive workloads, like AI-based AD and ADAS.
Advanced AI: Developers can build AI-based applications using frameworks such as PyTorch on Arm, enabling powerful AI capabilities for automotive.
Optimized developer process: The development process can be optimized by enabling developers to run large software stacks on their local machine, making it easier to fix issues and improve performance.
Unrivaled ecosystem support: The robust and dynamic Arm software ecosystem for automotive offers a comprehensive range of tools, libraries, and frameworks to support the development of high-performance, secure, and reliable automotive software.
Accelerated time-to-market: Developers can create advanced software solutions without waiting for physical silicon, accelerating innovation and reducing development cycles.
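The native-development benefit in the list above comes down to whether the build host and the deployment target share an architecture. As a minimal sketch, a build script could make that decision as follows. The function names and the alias table are hypothetical; only `platform.machine()` is a standard Python call.

```python
import platform

# Common spellings reported by different operating systems (illustrative list).
ARCH_ALIASES = {
    "arm64": "aarch64",
    "aarch64": "aarch64",
    "amd64": "x86_64",
    "x86_64": "x86_64",
}


def normalize_arch(machine: str) -> str:
    """Map an architecture string to a canonical name where one is known."""
    name = machine.strip().lower()
    return ARCH_ALIASES.get(name, name)


def build_mode(host_machine: str, target_machine: str = "aarch64") -> str:
    """Return 'native' when host and target match, else 'cross-compile'."""
    if normalize_arch(host_machine) == normalize_arch(target_machine):
        return "native"
    return "cross-compile"


if __name__ == "__main__":
    host = platform.machine()
    print(f"host={host} target=aarch64 -> {build_mode(host)}")
```

On an Arm64 desktop this reports a native build for an Arm64 target, so no cross-compilation toolchain or instruction emulation is needed; on an x86 workstation the same script reports that a cross-compile is required.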
Cutting-edge configuration for efficient automotive workloads
Thelio Astra is designed to handle intensive workloads. This is achieved through an advanced configuration with up to a 128-core Ampere® Altra® processor (3.0 GHz), 512GB of 8-channel DDR4 ECC memory (3200 MHz), an NVIDIA RTX 6000 Ada GPU, 8TB of PCIe 4.0 M.2 NVMe storage, and dual 25 Gigabit Ethernet SFP28. This setup lets developers tackle the most demanding tasks with ease, providing the performance and reliability that are essential for cutting-edge automotive development.
Driving Innovation with SOAFEE and Arm Neoverse V3AE
Thelio Astra will play a crucial role in the Scalable Open Architecture for Embedded Edge (SOAFEE) initiative, which aims to standardize automotive software development. By providing a native Arm64 development environment, Thelio Astra supports the SOAFEE reference stack, EWAOL, alongside other automotive software frameworks, helping to accelerate innovation and shorten development cycles.
Thelio Astra also capitalizes on the momentum from the introduction of the Arm Neoverse V3AE, the first server-class CPU designed for the automotive market. The Neoverse V3AE delivers robust performance and reliability, making it essential for AI-accelerated AD and ADAS workloads.
Pioneering the future of automotive software development
Thelio Astra represents a significant leap forward in Arm Linux development for the automotive industry. By addressing the growing complexities of ADAS, AD, and SDVs, System76 and Ampere have created an indispensable tool. Thelio Astra provides the compatibility needed for automotive target hardware while delivering the performance developers expect from a developer desktop.
As the automotive landscape continues to evolve, tools like Thelio Astra will be essential in ensuring that developers have the resources they need to create the next generation of automotive applications and software.
Access the new learning path
Looking for more information? Here’s an introductory learning path for automotive developers interested in local development using the System76 Thelio Astra Linux desktop computer.