Object tracking with YOLO-NAS and how to use DeciLM to create diffusion prompts
Plus The Sherry Code, Deci Digest, and a webinar on speeding up LLM inference
What’s up, Community!
If you're interested in freelance technical writing or blogging opportunities, feel free to email me with examples of your work. This quarter, I have a lot of content planned but not enough time to create it all.
So, I'm looking for talented, creative, and knowledgeable writers to help bring these ideas to life.
If that’s you, let’s get in touch!
🧐 What’s in this edition?
🕵🏻 Object Tracking with DeepSORT and YOLO-NAS
🗞️ The Sherry Code (News headlines)
🏎️ How to Speed up LLM Inference (webinar)
📰 The Deci Digest (Research and repositories)
A poll - Let me know how I did this week
🕵🏻 Object Tracking with DeepSORT and YOLO-NAS: A Practitioner’s Guide
Object tracking follows detected objects from frame to frame using their spatial and temporal features.
In this blog post, we will implement two of the most popular tracking algorithms, SORT and DeepSORT, with YOLO-NAS. In addition, we will see how to use YOLO-NAS for object tracking on a custom dataset (i.e., ship detection).
We will also create an application for vehicle counting (entering and leaving) by applying YOLO-NAS and SORT object tracking.
This tutorial is super hands-on. I hope you’re ready to code!
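To give a feel for what tracking-by-detection involves, here is a minimal sketch in plain Python. It is not the blog post's implementation: real SORT pairs a Kalman filter with Hungarian matching, while this sketch swaps in a simpler greedy IoU match. The names `GreedyIouTracker` and `crossing_direction`, the box coordinates, the counting line, and the convention that moving down the frame means "entering" are all assumptions made for illustration.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Hypothetical, simplified tracker: greedy IoU matching, no motion model."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}       # track id -> box seen in the previous frame
        self._ids = count(1)   # source of fresh track ids

    def update(self, detections):
        """Match this frame's boxes to existing tracks; return {track_id: box}."""
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(box, det), tid, i)
             for tid, box in self.tracks.items()
             for i, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used_tracks, used_dets = {}, set(), set()
        for score, tid, i in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to be the same object
            if tid in used_tracks or i in used_dets:
                continue
            assigned[tid] = detections[i]
            used_tracks.add(tid)
            used_dets.add(i)
        # Unmatched detections start new tracks; unmatched tracks are dropped.
        for i, det in enumerate(detections):
            if i not in used_dets:
                assigned[next(self._ids)] = det
        self.tracks = assigned
        return assigned


def crossing_direction(prev_box, box, line_y):
    """Report whether a box centre crossed the horizontal line y = line_y."""
    prev_cy = (prev_box[1] + prev_box[3]) / 2
    cy = (box[1] + box[3]) / 2
    if prev_cy < line_y <= cy:
        return "entering"  # assumption: moving down the frame means entering
    if cy < line_y <= prev_cy:
        return "leaving"
    return None


# Two made-up frames of detector output (in practice these come from YOLO-NAS).
tracker = GreedyIouTracker()
frame1 = tracker.update([(10, 10, 50, 50), (200, 30, 240, 70)])
frame2 = tracker.update([(12, 25, 52, 65), (198, 15, 238, 55)])

counts = {"entering": 0, "leaving": 0}
for tid, box in frame2.items():
    if tid in frame1:
        direction = crossing_direction(frame1[tid], box, line_y=40)
        if direction:
            counts[direction] += 1
print(counts)
```

Because the ids persist across frames, each vehicle is counted once when its centre crosses the line, rather than once per frame in which it is detected.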
🗞️ The Sherry Code: Your Weekly AI Bulletin
Shout out to Sherry for sharing her top picks for the AI news headlines you need to know about!
She’s just launched her newsletter, AI Snippets.
Show some support and follow her on Instagram, Twitter, and Threads.
Introducing Magic Studio: the power of AI, all in one place. With Magic Studio, there’s no need to toggle between multiple AI tools or learn lots of different software – all the best of AI is at your fingertips.
LinkedIn: Reimagining Hiring and Learning with the Power of AI. LinkedIn is enhancing its Talent tools with AI, introducing "Recruiter 2024" for smarter hiring, AI coaching in Learning, and integrating with Candidate Management Systems to support evolving HR roles.
OpenAI's ChatGPT Now Searches the Web in Real Time—Again - Decrypt. OpenAI reintroduced the web search feature to ChatGPT, enabling it to generate answers to prompts by searching the web for the latest information. ChatGPT Plus subscribers can enable the feature within the account settings under the "Beta features" tab.
Mistral 7B. Mistral 7B, a 7.3B parameter model, excels in various benchmarks with efficient inference mechanisms. It's easily fine-tunable, released under Apache 2.0, and showcased superior chat task performance when fine-tuned, indicating robust generalization capabilities.
🏎️ How to Speed Up LLM Inference
Join us for a webinar on Wednesday, October 11th @ 11:00 am PST for an in-depth exploration into the forefront of model design and optimization techniques. Discover strategies to accelerate LLM inference speed without sacrificing quality or escalating operational expenses.
The size and autoregressive nature of today's large language models (LLMs) pose significant challenges for fast LLM inference. Substantial computational and memory demands profoundly affect latency and cost. Achieving rapid, cost-efficient inference requires the development of smaller, memory-efficient models and the implementation of advanced runtime optimization techniques.
What you'll learn:
Explore efficient modelling techniques: Dive into techniques that enhance LLM efficiency while maintaining quality, including grouped query attention (GQA) and variable GQA.
Understand recent LLMs: Discover why recent LLMs, such as Llama 2 7B and DeciLM 6B, outperform older and significantly larger LLMs.
Uncover advanced optimization techniques: Learn about advanced runtime optimization strategies like selective quantization, CUDA kernels, optimized batch search, and dynamic batching.
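As a rough illustration of why grouped query attention helps with memory, the sketch below estimates the key/value cache size for one sequence. It assumes "variable GQA" means the number of key/value heads can differ from layer to layer. The layer count, head counts, and per-layer layout are hypothetical numbers chosen for the arithmetic, not the published configuration of any model named above.

```python
def kv_cache_bytes(kv_heads_per_layer, head_dim, seq_len, bytes_per_value=2):
    """KV cache size in bytes for one sequence.

    Each layer stores one key and one value vector (hence the factor of 2)
    per KV head per token. bytes_per_value=2 corresponds to 16-bit floats.
    """
    return sum(
        2 * kv_heads * head_dim * seq_len * bytes_per_value
        for kv_heads in kv_heads_per_layer
    )


def kv_head_for_query_head(q_head, num_q_heads, num_kv_heads):
    """In GQA, consecutive query heads share one KV head."""
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


# Hypothetical model shape, for illustration only.
layers, q_heads, head_dim, seq_len = 32, 32, 128, 4096

mha = kv_cache_bytes([q_heads] * layers, head_dim, seq_len)   # one KV head per query head
gqa = kv_cache_bytes([4] * layers, head_dim, seq_len)         # 4 KV heads in every layer
variable = kv_cache_bytes([4] * 8 + [2] * 16 + [1] * 8, head_dim, seq_len)  # varies by layer

mib = 1024 ** 2
print(f"multi-head attention: {mha / mib:.0f} MiB")
print(f"GQA, 4 KV heads:      {gqa / mib:.0f} MiB")
print(f"variable GQA:         {variable / mib:.0f} MiB")
```

With these made-up numbers the cache shrinks from 2048 MiB to 256 MiB with uniform GQA and to 144 MiB with the per-layer layout, which leaves more memory for larger batches or longer contexts.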
📰 The Deci Digest
📸 Reka AI Labs, the AI startup founded by researchers from DeepMind, Google, Baidu, and Meta, unveiled Yasa-1. This multimodal #AI assistant understands text, images, videos, and audio.
🚀 The challenge: Running Stable Diffusion 1.5, which includes a transformer model of roughly 1B parameters, on a Raspberry Pi Zero 2, with no extra swap and no disk offloading. The solution: OnnxStream, a light inference library that separates the inference engine from model weight handling.
🎥 In a recent study, a group of researchers introduced VideoDirectorGPT. This innovative framework capitalizes on the capabilities of LLMs to effectively address the challenge of creating coherent multi-scene videos.
✨ Introduced by Google and Cornell researchers, RealFill is an image completion model that fills missing parts of an image with contextually accurate content. Personalized with a few reference images, it creates compelling, faithful scene completion.
📙 Meta AI released AnyMAL, a multimodal language model that takes in diverse modality signals, including text, image, video, audio, and IMU motion sensor data, and generates textual responses.
🖼️ How to Use YOLO-NAS and DeciLM to Generate Diffusion Prompts for DeciDiffusion
At Deci AI, we’ve been on a roll this year, unveiling a series of groundbreaking models that have not only pushed the boundaries of what’s possible in AI but have also been generously made available to the community.
As open models with permissive licenses, they stand as a testament to our commitment to fostering innovation and creativity in the AI space.
As someone who takes immense pride in being a part of the Deci AI family, I felt inspired to celebrate our achievements uniquely.
Rather than just showcasing each model in isolation, I embarked on a creative exploration to weave three of our most popular models into a cohesive project.
The idea?
To use our object detection model’s output as a springboard for our Language Model, DeciLM-6B-Instruct, to craft a captivating one-sentence story.
This story would then prompt our diffusion model, culminating in a visually stunning image generation.
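The hand-off between the first two stages can be sketched as plain string building. This is not the code from the post: `describe_detections`, `build_story_instruction`, the example labels, and the instruction wording are all hypothetical, and the model calls themselves are left out.

```python
from collections import Counter


def describe_detections(labels):
    """Summarise detected class labels, e.g. '1 person, 2 dogs'."""
    counts = Counter(labels)  # keeps first-seen order of labels
    parts = []
    for label, n in counts.items():
        # Naive pluralisation; irregular plurals would need special handling.
        parts.append(f"{n} {label}s" if n > 1 else f"1 {label}")
    return ", ".join(parts)


def build_story_instruction(labels):
    """Build the text instruction that would be sent to the language model."""
    return (
        "Write a one-sentence story that features the following objects: "
        f"{describe_detections(labels)}."
    )


# Made-up labels standing in for the object detector's output on one image.
detected = ["person", "dog", "dog", "frisbee"]
instruction = build_story_instruction(detected)
print(instruction)
# The language model's one-sentence reply would then serve as the
# text prompt for the diffusion model.
```

Keeping this step as a pure function makes it easy to inspect and adjust the instruction before any model is loaded.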
That’s it for this week!
Let me know how I’m doing.
Cheers,