Deploying Accelerated Llama 3.2 from the Edge to the Cloud

By: Anjali Shah

26 September 2024 at 01:39

Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an...

Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an updated Llama Guard model with support for vision. When paired with the NVIDIA accelerated computing platform, Llama 3.2 offers developers, researchers, and enterprises valuable new capabilities and optimizations to realize their…

Source

Using Generative AI to Enable Robots to Reason and Act with ReMEmbR

Jetson – NVIDIA

By: Abrar Anwar

24 September 2024 at 03:01

Vision-language models (VLMs) combine the powerful language understanding of foundational LLMs with the vision capabilities of vision transformers (ViTs) by... Photo of robot moving down a path.

Vision-language models (VLMs) combine the powerful language understanding of foundational LLMs with the vision capabilities of vision transformers (ViTs) by projecting text and images into the same embedding space. They can take unstructured multimodal data, reason over it, and return the output in a structured format. Building on a broad base of pretraining, they can be easily adapted for…

Source

The SenseCAP Watcher is a voice-controlled, physical AI agent for LLM-based space monitoring (Crowdfunding)

CNX Software

By: Tomisin Olujinmi

19 September 2024 at 00:01

Seeed Studio has launched a Kickstarter campaign for the SenseCAP Watcher, a physical AI agent capable of monitoring a space and taking actions based on events within that area. Described as the “world’s first Physical LLM Agent for Smarter Spaces,” the SenseCAP Watcher leverages onboard and cloud-based technologies to “bridge the gap between digital intelligence and physical applications.” The SenseCAP Watcher is powered by an ESP32-S3 microcontroller coupled with a Himax WiseEye2 HX6538 chip (Cortex-M55 and Ethos-U55 microNPU) for image and vector data processing. It builds on the Grove Vision AI V2 module and comes in a form factor about one-third the size of an iPhone. Onboard features include a camera, touchscreen, microphone, and speaker, supporting voice command recognition and multimodal sensor expansion. It runs the SenseCraft software suite which integrates on-device tinyML models with powerful large language models, either running on a remote cloud server or a local computer [...]

The post The SenseCAP Watcher is a voice-controlled, physical AI agent for LLM-based space monitoring (Crowdfunding) appeared first on CNX Software - Embedded Systems News.

Reading view