Discover - Install - Start Doing: The Fascination of Local (gen)AI

A hands-on tour through local AI tools like LM Studio, GPT4All and Forge - and why trying things yourself is still the surest way to learn.

In the recent past, I have familiarised colleagues with a wide variety of (gen)AI tools as part of internal Deutsche Telekom training events. These training events are usually designed, organised and carried out by employees of the organisation themselves. Yes, of course we also bring in expertise from “outside”. However, the approach of internal colleagues passing on their own knowledge and allowing everyone to benefit from it is far more motivating and intrinsically fulfilling - at least that’s what I say, having been able to use these formats on several occasions. In addition, internally organised sessions benefit from the specific expertise of colleagues that external contributions would not be able to match. Shakil Awan, for example, organises regular “LEX Sessions” for internal training - and not just on AI topics. Deutsche Telekom’s service division organises regular “Magenta Curriculum” events, mostly online, also not exclusively on the topic of (gen)AI. And then there is the “AI4Coding” area, our “AI Guild”, promptathons, AI Insights and so much more.

With such busy AI activities, I thought it would be interesting to describe a few of the tools that I present in such sessions in more detail here in a slightly larger group. For the professionals among you, there is probably nothing new here. But if you’re looking for an introduction to local AI, you might find a few useful tips.

So let’s take a look at how artificial intelligence can be utilised directly on our own devices. Why is that? Because hands-on learning is still the surest way to success. Not me, but Xunzi (Xun Kuang), a Confucian philosopher from the third century BC:

Not having heard of something is not as good as having heard of it. To have heard of it is not as good as having seen it. To have seen it is not as good as knowing it. Knowing it is not as good as putting it into practice.

Local (gen)AI enables you to do things yourself without worrying about data loss. So let’s see how this looks and works in detail.

Attention: It’s All About the Right Hardware!

Before we install our own AI applications on our computers, one thing should be said: you need a computer with at least average performance to be able to load and test AI models - unfortunately, the Medion 230-Euro laptop won’t cut it. Personally, I would recommend at least the following core equipment:

up-to-date motherboard
2 TB SSD as C: disk
more than 10 TB normal hard drive for AI-generated data
very important: at least one NVIDIA GTX 1070 graphics card with 8 GB VRAM

You should therefore be well equipped to implement mediocre AI tasks on your own computer. Of course, you can also go bigger, better and, above all, more expensive. With an NVIDIA RTX 4090 graphics card, for example. This also has 24 GB of VRAM instead of 8 GB. The larger the VRAM memory, the larger the models that can be loaded and executed directly in its memory. You can also use your own CPU, but it is extremely slow compared to the graphics chips from NVIDIA.

Local AI Models with LM Studio and GPT4All

The development of large language models (LLMs) has moved in two directions in recent years. On the one hand, the models are becoming ever more compact and powerful thanks to newer methods. On the other hand, the hardware on which these models run is becoming ever more powerful (and cheaper). Only the really big AI models are still located in the cloud. However, the options for running powerful AI models locally on your own computer are becoming increasingly numerous and relevant. Two particular tools in this area are “LM Studio” and “GPT4All”.

LM Studio is a cross-platform desktop application that allows users to explore and use LLMs directly on their computer. With an intuitive user interface, LM Studio makes the use of advanced language models accessible to users without extensive technical experience. A particular advantage is the ability to download and manage compatible models directly from Hugging Face. Hugging Face is THE platform for open source AI models as well as data sets and much more.

An outstanding feature of LM Studio is the support of multimodal models such as LLaVA, which can process not only text but also images. This opens up completely new possibilities for analysing and describing visual content.

GPT4All follows a similar approach and makes it possible to run language models on standard hardware. With support for Mac M-Series chips, AMD and NVIDIA GPUs, GPT4All offers broad compatibility. A particular focus here is on data protection and security - all calculations take place locally without sensitive data leaving the device.

Both tools, LM Studio and GPT4All, democratise access to AI technologies and enable researchers, developers and enthusiasts to experiment with state-of-the-art language models without having to rely on external cloud services.

AI-Powered Image Generation with Forge

While we have focussed on language models so far, there are also constantly new developments in the area of image generation for local applications. One notable tool in this context is Forge, an interface for creating images with Stable Diffusion. In addition to ComfyUI, Forge also offers the integration of the currently best free image model called FLUX.

Forge is based on the well-known Automatic1111 interface and offers numerous improvements and enhancements. Particularly noteworthy are the significant speed increases in image generation compared to Automatic1111. For advanced users, Forge offers a range of extended functions, such as:

IP Adapter with Masking: Enables the combination and masking of two input images.
New samplers: Including DDPM, DPM++ 2M Turbo and LCM Karras for improved image quality and speed.
Stable Video Diffusion (SVD): Support for the generation of short AI-generated videos.

However, it is important to note that Forge was recently declared as an experimental interface. Despite a surprise update in August 2024, users should be cautious and use more stable alternatives when in doubt. For example, the beginner-friendly approach of Fooocus.

Text Generation with Oobabooga’s Text Generation WebUI

Another exciting tool in the field of local AI applications is Oobabooga’s Text Generation WebUI. This web-based user interface makes it possible to use various backends for text generation in a single UI and API. Supported backends include Transformers, llama.cpp, ExLlamaV2 and AutoGPTQ.

The versatility and customisability make Oobabooga a valuable tool for developers and researchers who want to experiment with different text generation models. You can also use extensions to talk to your chatbots with a microphone and loudspeaker. Very exciting!

RVC Voice Cloning: The Future of Voice Synthesis

Speaking of “speaking”! Voice cloning is a fascinating area of AI applications. RVC (Retrieval-based Voice Conversion) is a technology that makes it possible to change or imitate voices.

RVC uses advanced voice analysis technologies to analyse voices and generate a voice model. This model can then be used for various applications, from text-to-speech to voice modification and AI-supported vocal covers. Training such a model on a computer with the above-mentioned GTX 1070 graphics card takes about 2 days once, after which the conversion of a recorded voice into the trained voice takes a few seconds. A computer with an RTX 4090 graphics card, on the other hand, trains a new voice clone in about 2 hours instead of 2 days. Bigger is better.

The possibilities are many and varied, ranging from the creation of character voices for video games to personalised voice assistants. However, it is important to consider the ethical implications of this technology and use it responsibly!

Tips and Tricks for Getting Started with Local AI Tools

Start with simple models: Start with smaller models that require fewer resources to familiarise yourself with how they work.
Pay attention to the hardware: Check the system requirements of the tools and make sure that your computer has enough power. See my examples above.
Experimentation is essential! Try out different settings and models to get a feel for the possibilities and limitations of AI hands-on.
Stay up to date: The AI landscape is evolving fast. Follow forums and developer communities to stay informed about new features and updates.
Pay attention to data protection: Even if the processing takes place locally, handle your sensitive data responsibly.

Finally, I would like to leave you with a quote from the futurist Alvin Toffler:

“The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn and relearn.”

With this in mind, I would like to encourage you to continue learning and experimenting with the fascinating possibilities of local AI tools. Stay curious and keen to experiment!