The following article is meant for those who work in and around AI but don't actually work with it yet. I say "yet" because the day is coming when we'll all be working with (or for) AI. Regardless, if you're curious about it, have been wondering what all the fuss is about, and, most importantly, want to understand what Generative AI is vs. non-generative AI (also known as Analytical AI or just AI), then this article will get you started.
We're all familiar with Artificial Intelligence (AI). The buzz is everywhere. We've been watching it and waiting for it. Hollywood has prepared us for the worst and the best, right? Bicentennial Man shows an AI grow from a helpful tool into a sentient being capable of love and sacrifice, and even a longing for a peaceful end to life. The artificial life-form "Data" from the Star Trek TV and movie franchise presented us with an AI striving for human emotion. The Matrix depicted an "evil" side of AI, one that saw humankind as its worst enemy. Of course, WarGames is the movie that set the wheels turning for me and my generation. "Shall we play a game?"
But what is AI? More specifically, what is Generative AI? That's where computers are creating new content and ideas. That's where tremendous opportunity exists and, consequently, a great deal of risk. Obviously, there are ethical and regulatory risks on which we need to focus from the outset of this journey. However, the greater risk to a company is the loss of competitiveness if it chooses not to get on board with becoming AI-enabled.
What's all the Hullabaloo?
In 2014, Stephen Hawking, a renowned physicist and general genius, warned the world of the dangers of Artificial Intelligence in an interview with the BBC. He believed AI could surpass human intelligence and ultimately take over the world. He was concerned that if we allow AI-enabled machines to create other, more advanced, AI-enabled machines, "Humans, bound by the slow pace of biological evolution, would be tragically outwitted." (BBC, 2014) Hawking was not alone in his fears. Many others have expressed similar concerns, ranging from dystopian, Matrix-like futures to less fantastic, though equally troubling, worries about human rights, privacy, job security, and other social impacts.
Hawking, being the scientist he was, also openly acknowledged the many benefits of AI. After all, he used an AI-based system to overcome communication challenges caused by his amyotrophic lateral sclerosis (ALS), which left him paralyzed and unable to speak. In fact, his speech system used Natural Language Processing (NLP), a critical technology used in both generative and non-generative AI. (Don't worry, I'm getting to a clarification of these two concepts.)
In summary, AI is our future. However, the jury is still out on how long that future will be (just kidding, I hope). More importantly, the term "AI" has become a buzzword, and it's used by so many people to mean so many things, many of which are NOT AI. Further, there are many flavors and nuances to AI. For the remainder of this article, I'll attempt to provide a primer on two main "categories" of AI. I hope my MIT genius friends will excuse the simplification of such a complicated subject.
OK, so what are Generative and Non-Generative AI?
For those who've worked with me, you'll know I find the best first step to understanding a subject, any subject, is to start with the dictionary definition(s) of the term or terms associated with the subject. Let's compare some definitions of "Artificial Intelligence":
Merriam-Webster - a branch of computer science dealing with the simulation of intelligent behavior in computers; also, the capability of a machine to imitate intelligent human behavior.
Dictionary.com - the capacity of a computer, robot, or other programmed mechanical device to perform operations and tasks analogous to learning and decision-making in humans, as speech recognition or question answering; a computer, robot, or other programmed mechanical device having this humanlike capacity; the branch of computer science involved with the design of computers or other programmed mechanical devices having the capacity to imitate human intelligence and thought. Abbreviations: AI, A.I.
Britannica - the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings.
Not bad. It's a bit dry, but we get the point. AI is technology, and it's capable of doing human stuff. I like Britannica's definition the best. It's short, sweet, and to the point. Of course, there's much more to it, but that's the gist of it. So, let's talk about Generative and Non-Generative AI. After all, the distinction is at the crux of all the hullabaloo.
Non-generative AI refers to software algorithms that can learn and become better at carrying out one (or a few) specific task(s) as they are exposed to more data. These are systems trained to analyze, understand, and reason about existing data. They do not create entirely new artifacts or content. Their defining characteristics include:
Analytical focus - Non-generative AI focuses on classification, prediction, recommendation, clustering, optimization, etc., rather than synthesizing original output.
Pattern recognition - These systems identify statistical patterns, correlations, and insights in data like images, text, sensor readings, databases, etc.
Predefined options - Rather than open-ended creation, they select or suggest options from a predefined set of possible classifications, predictions, recommendations, etc.
Training data dependency - The quantity and quality of the training data limits performance. The AI cannot go beyond patterns in the data.
Interpretability - Behavior is generally more explainable by analyzing training data and recognition capabilities.
Example applications - Image recognition, predictive analytics, automated reasoning, adaptive control systems, molecular modeling, and more leverage non-generative techniques.
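To make the "predefined options" idea concrete, here is a minimal sketch of a non-generative model. It is a toy nearest-centroid classifier of my own invention, not any particular product's method. The feature names, the data, and the two labels ("healthy" and "needs_service") are all hypothetical. Notice that whatever input you give it, the only thing it can ever return is one of the labels it saw during training.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: (hours_of_daily_use, error_count) -> label
TRAINING = [
    ((1.0, 0.0), "healthy"),
    ((2.0, 1.0), "healthy"),
    ((8.0, 9.0), "needs_service"),
    ((9.0, 7.0), "needs_service"),
]

def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(centroids, features):
    """Return the predefined label whose centroid is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(TRAINING)
print(classify(centroids, (1.5, 0.5)))  # a lightly used device
print(classify(centroids, (7.0, 8.0)))  # a heavily used, error-prone device
```

The model analyzes and sorts; it never invents a third label or writes a maintenance report. That is the analytical side of the fence.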
Non-Generative AI differs from Generative AI, which goes beyond analyzing existing data: these systems can generate original content and artifacts. Examples of artifacts Generative AI systems can produce include text, images, synthetic media, and candidate molecules for drug or materials discovery.
The ability to create original content distinguishes Generative AI from Non-generative AI (a.k.a. analytical AI), which, as described above, primarily focuses on tasks like classification, prediction, and recommendation. In summary, generative AI leverages learning from data to produce novel, original, and potentially creative output - rather than solely analyzing what already exists. However, it requires careful design and training to generate high-quality artifacts.
Teaching AI Models to "Think"
All AI models are trained using data. Generative models are first trained on large datasets like text corpora or image collections to recognize patterns and learn a representation of that domain. The AI then uses its training to create new combinations and variations outside its original dataset by recombining learned patterns and features. Once trained, Generative AI models can be used for automatic text and image generation, synthetic media, conversational agents, drug/materials discovery through molecular generation, and more.
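The train-then-generate loop described above can be sketched in a few lines. This is a deliberately tiny stand-in, assuming a three-sentence corpus I made up and a simple "which word follows which" table; real generative models learn far richer representations, but the two phases are the same. The names `train_bigrams` and `generate` are mine, for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical miniature "training corpus"
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def train_bigrams(sentences):
    """Record, for every word, which words followed it in the training text."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers

def generate(followers, rng, max_words=12):
    """Sample one word at a time, each conditioned on the previous word."""
    word, output = "<s>", []
    while len(output) < max_words:
        word = rng.choice(followers[word])
        if word == "</s>":  # the model chose to end the sentence
            break
        output.append(word)
    return " ".join(output)

model = train_bigrams(CORPUS)
rng = random.Random(0)  # fixed seed so reruns produce the same samples
for _ in range(3):
    print(generate(model, rng))
```

Because the model recombines learned word-to-word patterns, it can produce sentences such as "the dog sat on the mat" that never appeared in the corpus. It can also produce nonsense, which is a preview of the limitations discussed later.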
Useful terms and concepts:
Clearly, this post is not meant to be the be-all and end-all of the discussion on Generative AI. The day has yet to come for the AI Book of Knowledge, but we're working on it. However, a brief overview of some terms and techniques is always helpful. Here are some definitions of common Generative AI techniques:
Generative Adversarial Networks (GANs) - Two neural networks that train against each other to generate new data. A generator network creates candidates, while a discriminator evaluates realism.
Autoregressive Models - Models that predict the next token/data point based on the previous ones. Allows generating data sequentially.
Variational Autoencoders (VAEs) - Neural networks that learn compressed latent representations of data, allowing new samples to be decoded by manipulating the latent space.
Diffusion Models - Models that learn to reverse a noisy diffusion process, allowing high-quality data generation from noise through repeated denoising.
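Reinforcement Learning - Agents that learn by taking actions that maximize rewards in an environment. In generative AI, it is typically used to steer a model's outputs toward preferred results, for example by rewarding responses that human reviewers rate highly.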
Generative Grammars - Grammars (formal rule systems) that define probabilistic rules for sequentially generating well-formed strings or tree structures, like sentences in a natural language.
Future posts will describe each of these in more detail and include more context.
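In the meantime, here is a small taste of one technique from the list, the generative grammar. This is only a sketch under my own assumptions: the grammar, its vocabulary, and its probabilities are invented for illustration, and the helper name `expand` is hypothetical. Each symbol is rewritten according to weighted rules until only words remain.

```python
import random

# Hypothetical toy grammar: each symbol maps to (probability, expansion) pairs.
GRAMMAR = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["Det", "N"]), (0.3, ["Det", "Adj", "N"])],
    "VP": [(0.6, ["V", "NP"]), (0.4, ["V"])],
    "Det": [(1.0, ["the"])],
    "Adj": [(0.5, ["curious"]), (0.5, ["patient"])],
    "N": [(0.5, ["robot"]), (0.5, ["analyst"])],
    "V": [(0.5, ["learns"]), (0.5, ["watches"])],
}

def expand(symbol, rng):
    """Recursively rewrite a symbol until only words (terminals) remain."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain word: nothing left to rewrite
    weights = [p for p, _ in GRAMMAR[symbol]]
    options = [rhs for _, rhs in GRAMMAR[symbol]]
    chosen = rng.choices(options, weights=weights, k=1)[0]
    return [word for part in chosen for word in expand(part, rng)]

rng = random.Random(42)  # fixed seed so reruns produce the same sentences
for _ in range(3):
    print(" ".join(expand("S", rng)))
```

Every sentence it produces is well-formed by construction, because the rules allow nothing else. The trade-off is that someone has to write the rules by hand, whereas the neural techniques above learn their patterns from data.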
Limitations of Generative AI
Generated content can lack overall coherence or factual accuracy. Hallucinations and bias issues can emerge from imperfect training data. The following are some of the known problems with which we must be concerned when working with Generative AI:
Coherence - Because generative models create new combinations, the outputs do not always maintain topical or narrative coherence, especially for longer generated text. The lack of a coherent theme or storyline is a common limitation. Having said that, I have personally experienced some amazingly coherent interactions.
Factual accuracy - Without concrete knowledge or a robust real-world model, generated text can state false or nonsensical facts. The AI may mix up names, places, and events in unrealistic ways.
Hallucination - A tendency for generative AI to "hallucinate" or generate content not properly grounded in its training data. For text, this includes fabricating names, events, and places.
Bias - Generative models can perpetuate and amplify societal biases and stereotypes that are present in training data sets. This leads to outputs that reflect prejudices around gender, race, and other characteristics.
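The difference between perpetuating a bias and amplifying it is easier to see with numbers. The sketch below is a toy illustration with data I made up: imagine that, in the training text, a given phrase was followed by "he" seven times and "she" three times. A model that always emits its single most likely continuation (often called greedy decoding) turns that 70/30 skew into 100/0.

```python
from collections import Counter

# Hypothetical training observations: the word seen after some fixed phrase.
OBSERVED_NEXT_WORDS = ["he"] * 7 + ["she"] * 3

counts = Counter(OBSERVED_NEXT_WORDS)
total = sum(counts.values())

# Share of "he" in the training data itself.
data_share = counts["he"] / total

# A model that always picks its most likely continuation, asked ten times.
greedy_outputs = [counts.most_common(1)[0][0] for _ in range(10)]
greedy_share = greedy_outputs.count("he") / len(greedy_outputs)

print(f"share of 'he' in training data: {data_share:.0%}")
print(f"share of 'he' in greedy outputs: {greedy_share:.0%}")
```

The first line prints 70% and the second prints 100%. Real systems are far more complicated than a word counter, so treat this as an intuition pump rather than a description of any specific model.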
Researchers are exploring techniques to promote coherence, penalize inaccuracy, align AI with ethics and values, and mitigate limitations through a combination of improved models, training strategies, human guidance, and responsible deployment. However, reducing harmful errors and biases remains an ongoing challenge. As a data governance professional, I find bias, ethics, and the imprinting of values (good or bad) on outputs to be areas of interest and, quite frankly, concern. Successfully transforming organizations into AI-enabled autonomous organizations capable of next-generation competitive advantage will require moving governance of data usage to the table-stakes position, making it a non-issue for consumers and regulators alike.
Welcome to the future. If you're not AI-centric, you're a dinosaur watching the ice age like a passing phase. There has never been a more disruptive innovation for business than AI. It will impact every business - small and large - and it will do it at scale. It's like the Data Ice-Age™.