Generative AI is a fascinating and exciting application of artificial intelligence. Many applications are now accessible to the public. Its related technologies not only streamline work but also provide fun and companionship to their users.
However, as many AI enthusiasts and users are probably aware, there are some shortcomings in generative AI. We can quite easily spot these limitations across various model types, whether they are image or text generators. These errors can be funny, but they can be problematic sometimes to the point they can take down a service.
In this article, we’ll look over some limitations in generative AI models together with real-life cases, and how we can move forward.
What are some main limitations of generative AI models?
Generative AI models face several limitations users and developers should consider. Here are a few:
- Bias
- AI hallucination
- Bugs
- Security
A complex combination of multiple factors cause these limitations, so there are rarely any quick solutions. In some cases, it can be challenging to even identify the root causes of these problems.
Real-life examples of generative AI limitations
Let’s have a look at how the above-mentioned limitations in generative AI models manifest in real-life situations.
AI Bias
Biases in generative AI models pertain to the model’s tendencies to over-rely on certain knowledge or information. As a result, they can cause imbalance and partiality in the model’s outputs. This limitation is regularly mentioned as one of the most problematic aspects to address in AIs.
For instance, AI image generator Midjourney contains bias when it comes to colors. It tends to overuse teal and orange, especially in outputs with abstract or broad prompts.
Biases like these happen due to an imbalance in the training data. To a certain extent, these minor biases can be a characteristic or a quirk that defines an AI model.
Nonetheless, a redundancy of the same bias can tire out users. For instance, this happens when text generation models overuse a particular set of words or phrases. We can commonly find lists of these AI-favorite words that give out AI-written texts. On the other hand, it’s also not too hard to sniff out these signs ourselves too.
On a little more serious note, generative AI biases can bring societal issues into light. Some models stick to a certain physical or cultural trait when asked to generate images of a person. Some other AI models can produce hate speech or politically motivated statements. On the other hand, certain generative AI models have even become a subject of backlash for attempting to mediate these kinds of issue.
In a different and more common case of biases in AI, some speech recognition engines cannot grasp certain accents at all. We can alleviate these limitations to a certain extent by reducing the imbalance present in the data.
AI hallucination
The infamous AI hallucination refers to when AIs get too creative and start spewing nonsense due to a lack of information. This generative AI limitation is perhaps one of the most well-known ones. It’s easy to spot AIs hallucinating, and it is as hard to prevent it.
The common factor present among hallucination cases is that it generates wrong information with no credible sources. By now, many AI service users are well-aware of this trait. We commonly get to see cases where these hallucinations are mentioned in fun, teasing spirit.
There are cases where AI text models generate humorously absurd and rather obvious misinformation like this. But in some cases, they are less detectable. This can be a serious issue in cases where users use LLM-based chatbots to look for solid information. As many services (including ChatGPT) disclaim, it’s important to always fact-check the information provided by AI chatbots. Hallucinated responses referenced in academic or research papers, or in mass media can lead to serious consequences.
Generative AI limitations like hallucinations are often caused by a complex combination of factors, which means there’s no quick fix. It’s possible to train the AI model to acknowledge that it does not know the answers to a specific set of questionnd often done so, However, this approach is insufficient to a certain extent, as there are just way too many variations to the questions that AI cannot answer.
Unexpected bugs and errors
Not too long ago on February 20, 2024, several users reported a bizarre phenomenon in their interactions with ChatGPT, one of the world’s most used AI systems. These reports claimed that the chatbot was full of nonsense, unlike its usual, seemingly clever self.
In the first image, the chatbot is seen producing garbled sentences that are neither English nor Spanish. According to the thread, the user often used to ask ChatGPT-4 to mix in some Spanish within the responses to practice their Spanish skills. They were generally happy with their AI experience until this incident happened.
And in the second image, the bot’s answer seems far from its usual high-quality outputs.
According to the incident report for OpenAI, there was an error in the way the base model assigned how many context tokens to sample. They were able to roll out a fix for this error immediately.
This incident was able to wrap up neatly. However, it’s also a reminder that even a large language model with massive amount of information and potential can encounter unexpected errors. If we can’t control these sudden issues, it could be detrimental.
Security issues and AI hacks
It’s possible to unexpectedly encounter AI limitations, but there are also ways to actively find these rooms for improvements. This technique is called prompt injection.
Text generation models, like large language models, function on broken down bits of language (also known as tokens) and their statistical correlations. In other words, they technically cannot process language like humans do. This means that they cannot fully detect the shift in user intents between neutral prompt A and malicious prompt B. The prompt injection technique exploits this limitation of AI to induce certain responses, much like how phishing works on humans.
The term “prompt injection” was possibly most popularized during the test period of Microsoft Bing, when a Stanford student Kevin Liu coerced the chatbot to spill sensitive internal information and succeeded. This is when Bing chatbot revealed its internal alias “Sydney,” and read out the first few prompts that serve as its internal mechanism.
Another popular case is when a user jokingly tricked Chevrolet’s GPT-powered chatbot to sell him a 2024 Chevy Tahoe at 1 US dollar, as a “legally binding offer – no takesies backsies.”
The case above did not pose direct threats or security problems to people, but these techniques are potentially useful in harmful scenarios too. It’s a reminder that protecting generative AI models from malicious use cases or users is also a part of the responsibilities for companies that develop powerful AI systems.
Humans can control how to build and structure an AI model and its data, but their internal mechanism is still an enigma to us. This factor also contributes to the challenges in addressing these generative AI limitations.
What causes limitations in generative AI?
One of the main root causes of generative AI limitations and biases often lies in the AI model’s data. Many large AI models are trained with data that is available and scraped from the internet. Because of the large data requirement, it can be extremely time-consuming to vet every dataset. As a result, the model’s dataset can comprise of both high and poor quality data.
Some companies are opting to exclusively use copyright-issue free, vetted data to train their generative AI models. One of the most notable examples is Anthropic.
However, aside from these unique cases, most commercial AI models today make use of datasets that are not fully fact-checked or balanced.
Researchers are working on a technique called machine unlearning to address this issue. This technique involves making a model forget certain information after the training is over. However, it is a highly challenging assignment.
A different solution that generative AIs can take is to undergo enough fine-tuning, and have human evaluators rate their outputs. The goal of these tasks is to supplement data insufficiency and provide most user-appropriate responses.
Moving forward with addressing generative AI limitations
These challenges are the reason Flitto has recently revamped our RLHF, or reinforcement learning from human feedback, platform. Now, our platform can benefit various LLM tasks with different groups of professional evaluators best suited for each task.
As one of the leading AI data companies partnered with tech enterprises from over the world, Flitto is dedicated to help AI services provide the best experiences to their users. In doing this, we leverage our 14 million user platform that supports 173 languages from all around the world. Your AI model can anticipate massive AI data scalability with our platform that can supercharge data at a rate of 500,000 strings of data per day.
Here at Flitto, we look most forward to help AI services become safer and better through a comprehensive data-as-a-service (DaaS) cycle. Until the day more people can use generative AI services with less limitations, we will be back with more AI and data information.
Interested in our RLHF and LLM task dataset solutions? Visit here.