NVIDIA’s custom chatbot runs locally on RTX AI PCs

NVIDIA has released Chat with RTX as a tech demo of how AI chatbots could be run locally on Windows PCs using its RTX GPUs.

The standard way to use an AI chatbot is through a web platform like ChatGPT or by sending queries to an API, with inference happening on cloud servers. The drawbacks are cost, latency, and privacy concerns as personal or corporate data travels back and forth.

NVIDIA’s RTX range of GPUs now makes it possible to run an LLM locally on your Windows PC, even if you’re not connected to the internet.

Chat with RTX lets users create a personalised chatbot using either Mistral or Llama 2. It uses retrieval-augmented generation (RAG) and NVIDIA’s inference-optimizing TensorRT-LLM software.

You can point Chat with RTX to a folder on your PC and then ask it questions about the files in that folder. It supports various file formats, including .txt, .pdf, .doc/.docx and .xml.

Because the LLM analyzes locally stored files, with inference happening on your machine, responses are fast and none of your data is shared over potentially unsecured networks.
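
To give a rough idea of what the retrieval step of RAG over a local folder involves, here is a minimal Python sketch. It is not how Chat with RTX is implemented: the function names (`load_chunks`, `retrieve`, `build_prompt`) are hypothetical, it only reads .txt files, and it ranks text by simple word-count similarity, where a real system would likely use learned embeddings and then pass the prompt to a local LLM.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def load_chunks(folder, chunk_words=80):
    """Read every .txt file in the folder and split it into word chunks."""
    chunks = []
    for path in sorted(Path(folder).glob("*.txt")):
        words = path.read_text(encoding="utf-8", errors="ignore").split()
        for i in range(0, len(words), chunk_words):
            chunks.append((path.name, " ".join(words[i:i + chunk_words])))
    return chunks


def retrieve(question, chunks, top_k=2):
    """Return up to top_k (file name, text) chunks most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(text))), name, text)
              for name, text in chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, text) for score, name, text in scored[:top_k] if score > 0]


def build_prompt(question, hits):
    """Combine the retrieved chunks and the question into one LLM prompt."""
    context = "\n".join(f"[{name}] {text}" for name, text in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `load_chunks` on a folder, then `retrieve` with a question, then `build_prompt` yields the text that would be handed to the model. Only the retrieved chunks are sent to the LLM, which is why the whole folder does not need to fit in the model’s context.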

You can also prompt it with a YouTube video URL and ask it questions about the video. That requires internet access, but it’s a great way to get answers without having to watch a long video.

You can download Chat with RTX for free, but you’ll need to be running Windows 10 or 11 on a PC with a GeForce RTX 30 Series GPU or higher, with a minimum of 8GB of VRAM.
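
The requirements above can be written down as a simple check. This is only an illustrative Python sketch: the `PCSpec` class and `meets_requirements` function are hypothetical, and the values are entered by hand rather than detected from the hardware.

```python
from dataclasses import dataclass


@dataclass
class PCSpec:
    windows_version: int  # e.g. 10 or 11
    rtx_series: int       # e.g. 30 or 40; use 0 if there is no RTX GPU
    vram_gb: float        # GPU memory in GB


def meets_requirements(spec):
    """Check a spec against the requirements stated in the article."""
    return (
        spec.windows_version in (10, 11)
        and spec.rtx_series >= 30
        and spec.vram_gb >= 8
    )
```

For example, `meets_requirements(PCSpec(11, 40, 12))` returns `True`, while a 20 Series card or a GPU with 6GB of VRAM would fail the check.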

Chat with RTX is a demo rather than a finished product. It’s a little buggy and doesn’t remember context, so you can’t ask it follow-up questions. But it’s a nice example of the way we’ll use LLMs in the future.

Using an AI chatbot locally with zero API call costs and very little latency is likely the way most users will eventually interact with LLMs. The open-source approach that companies like Meta have taken will see on-device AI drive the adoption of their free models rather than proprietary ones like GPT.

That said, mobile and laptop users will have to wait a while yet before the computing power of an RTX GPU can fit into smaller devices.

