
Making GenAI fit for business

GenAI is the business fad du jour – but a different approach to its potential may yet yield genuine results

Generative AI tends to split opinions. Some view it with the same awe they would gold dust in an alchemist’s laboratory; others dismiss it as a probability engine, or autocomplete on steroids.

 

Meanwhile, pragmatists are trying to find realistic, real-world applications for it by managing the risks posed by its inaccuracy, its tendency towards “hallucination” (presenting nonsense as genuine information) and the bias and disinformation it absorbs from untrustworthy sources.

 

But while the business world tries to figure out the practicalities GenAI can be applied to, scientific research into large language models is heading down the rabbit hole. Science, it would seem, is having trouble explaining why GenAI works the way it does.

 

A recent article in MIT Technology Review describes how ChatGPT was developed through trial and error: not precision-engineered to solve a particular problem but evolving almost naturally. The models that have sprung up thanks to this methodology are now so complex that, the article explains, “researchers are studying them as if they were strange natural phenomena, carrying out experiments and trying to explain the results.”

 

Boaz Barak, a computer scientist at Harvard University currently on secondment at OpenAI, notes in the article that GenAI models are often compared to physics at the beginning of the 20th century. “We have a lot of experimental results that we don’t completely understand, and often when you do an experiment, it surprises you,” he explains.

 

Harnessing the genius of GenAI

 

Understandably, business is keen to jump at the opportunity of putting GenAI’s superpowers to work in real-life scenarios. However, harnessing the unbridled genius of GPT-4 and other competing models for practical use is a challenging risk-management exercise.

 

The most astonishing feature of GPT-4 is its unprecedented capability to generalise, guided by the vast information repository that is the internet – at least up to a point. Having been trained on whatever data was available on the internet, it can generalise across languages: if it is trained to complete arithmetic exercises in English, it can do the same in French when instructed to, without any additional language training.

 

But its whims and hallucinations, as well as its lack of access to real-time data updates (until recently, ChatGPT was only trained on data created up to September 2021), risk wreaking havoc on the reputations of businesses brazen enough to deploy it without a human in the loop.

 

While the unpredictable nature of LLMs makes for fascinating material for researchers, real-life applications require reliability, accuracy and transparency. The key to making LLMs more accurate and relevant lies in grounding them with instructions, further training or specific context to prevent brain farts and random associations.

 

Grounding GPTs

 

There are several ways of making general-purpose, so-called foundation models more applicable in business, depending on the use case.

 

One option is to use smaller, more specialised models. These considerably underperform GPT-4 on general tasks. Some require more programming expertise and upfront investment; others need a human in the loop full-time. But, though they may be simpler, they’re also far cheaper to train than the larger LLMs researchers experiment with, which can cost from a few million to tens of millions of pounds.

 

In short, they’re good enough. Because they’re designed with a far narrower remit, smaller, more specialised models with fewer parameters can perform on a par with huge, more general ones that require four times as much computing power, provided they are trained on additional, domain-specific data or prompts, or are linked to corporate databases.

 

Fine-tuning and RAG

 

One way of optimising these “lower-density” models for a particular task is fine-tuning, where the pre-trained (the P in GPT) model is trained further on a smaller, task-specific, labelled dataset. This means updating the core parameters of the model: a delicate process that requires careful work and a lot of computational power and, therefore, cash.
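
To make the idea concrete, here is a minimal sketch of what fine-tuning can look like in practice, assuming the Hugging Face transformers and datasets libraries and a small open pre-trained model; the support_tickets.csv file (with “text” and “label” columns) and the two-label task are hypothetical stand-ins for a real task-specific dataset:

```python
# A minimal fine-tuning sketch, assuming the Hugging Face "transformers"
# and "datasets" libraries. The file support_tickets.csv (with "text" and
# "label" columns) and the two-label task are hypothetical stand-ins.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # a small pre-trained model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Load the smaller, task-specific, labelled dataset and tokenise it.
dataset = load_dataset("csv", data_files="support_tickets.csv")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

# Further training updates the model's core parameters on the new data.
args = TrainingArguments(output_dir="finetuned-model",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=dataset).train()
```

The train() call is the expensive step: it is updating the model’s core parameters, which is where the computational power and cash described above get spent.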

 

Another method, known as retrieval augmented generation (RAG), leaves the parameters of the original foundation model intact. Instead, it links the model to a corporate database and teaches it to prioritise the relevant business context rather than random snippets of information it has found on the internet. This way, responses are more accurate and relevant, and hallucination can be kept to a minimum.
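
As an illustration, the sketch below shows the RAG pattern in miniature, using scikit-learn for a toy keyword-based retriever; the sample documents, the query and the call_llm() stub are hypothetical placeholders for a real corporate knowledge base and LLM client:

```python
# A minimal sketch of retrieval augmented generation (RAG). The documents,
# the query and call_llm() are hypothetical placeholders: a real deployment
# would retrieve from a corporate database and call an actual LLM API.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Refunds are processed within 14 days of the returned item arriving.",
    "Premium support customers can raise tickets 24/7 via the partner portal.",
    "Warranty on solar inverters is extended to ten years for 2024 contracts.",
]

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM client; swap in your provider's API call here.
    return f"[model response to a {len(prompt)}-character grounded prompt]"

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query (TF-IDF + cosine)."""
    vectoriser = TfidfVectorizer().fit(docs + [query])
    scores = cosine_similarity(vectoriser.transform([query]),
                               vectoriser.transform(docs))[0]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:k]]

def answer(query: str) -> str:
    """Ground the model: retrieved passages become the context it must use."""
    context = "\n".join(retrieve(query, documents))
    prompt = ("Answer using only the context below. If the answer is not in "
              f"the context, say you don't know.\n\nContext:\n{context}\n\n"
              f"Question: {query}")
    return call_llm(prompt)

print(answer("How long is the inverter warranty on 2024 contracts?"))
```

The design choice that matters is that retrieval happens before generation and the prompt instructs the model to answer only from the supplied context, which is what keeps hallucination to a minimum.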

 

The prompt engineer: today’s most sought-after expert

 

Prompt engineering doesn’t involve interfering with an LLM’s parameters or connecting it to external resources; rather, it is about crafting and optimising the prompts given to an LLM to achieve the best outcomes. This is where humans play a central role, and the ability to talk to LLMs effectively is a highly valuable competence.

 

To excel at this job, prompt engineers need a rather complex skillset. First, it takes writing skill to design and modify prompts that lead to better results, as well as a good understanding of context and how to refine it.

 

But the job requires more than just language skills. Prompt engineers also need a good understanding of the given LLM model and its datasets: the “engineering” side of the equation. Prompt engineers will also need knowledge of the terminology and structure of whatever the GPT is being applied to. If, for example, the prompts are used to generate content for a photovoltaics company, the prompt engineer must be conversant with the technical language of solar panels. Or if the model is to provide medical information for people with diabetes, prompt engineers will need to be able to communicate about the condition in accurate medical terms. And, although coding experience isn’t a requirement, familiarity with some basic programming languages doesn’t hurt.
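
To illustrate what that craft looks like in practice, here is a minimal, hypothetical prompt template for the photovoltaics example above, written in Python; the role, audience, constraints and product data shown are illustrative, not drawn from any real deployment:

```python
# A minimal, hypothetical prompt template for the photovoltaics scenario
# mentioned above. The role, audience, terminology and constraints shown
# here are illustrative, not drawn from any real deployment.
PROMPT_TEMPLATE = """You are a technical copywriter for a photovoltaics company.
Audience: commercial facilities managers evaluating rooftop solar.
Task: write a 120-word product summary of the module described below.
Use correct solar terminology (kWp rating, module efficiency, degradation rate).
Do not invent figures: if a specification is missing, leave it out.

Product data:
{product_data}
"""

def build_prompt(product_data: str) -> str:
    """Fill the template; iterating on its wording is the engineer's craft."""
    return PROMPT_TEMPLATE.format(product_data=product_data)

print(build_prompt("450 Wp bifacial module, 22.3% efficiency, 0.4%/year degradation"))
```

Iterating on wording, constraints and context like this, rather than touching the model itself, is the day-to-day work of the prompt engineer.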

 

Directories of the best prompts are already available. Job markets are seeing a rise in vacancies for prompt engineers, and companies have emerged offering prompt engineers for hire and prompt engineering as a service.

 

Prompt engineering is touted as a job that will be in high demand in the age of GPTs. However, it is also a job that perhaps contains within it the seeds of its own obsolescence: in today’s age of breakneck technological advancement, automated prompt-engineering solutions are already appearing on the horizon…
