This module includes a graded quiz to test and reinforce your understanding of the concepts covered in the course, a glossary to enhance comprehension of generative AI-related terms, and a final project that provides hands-on experience with those concepts. Finally, the module guides you through the next steps in your learning journey.
Learning Objectives
- Demonstrate understanding of the course concepts through the graded quiz and project.
- Prepare for the next steps of your learning journey.
Glossary and Final Project
Glossary: Generative AI: Foundation Models and Platforms
Term | Definition
---|---
Artificial neural networks (ANNs) | A collection of smaller computing units called neurons, modeled in a manner similar to how a human brain processes information.
Bidirectional autoregressive transformer model (BART) | A sequence-to-sequence (seq2seq) model developed by Facebook AI that combines a bidirectional encoder representation, like BERT, with a left-to-right decoder, like GPT.
Bidirectional encoder representations from transformers (BERT) | A family of language models by Google that uses pre-training and fine-tuning to create models that can accomplish several tasks. |
Chatbot | A computer program that simulates human conversation with an end user. Though not all chatbots are equipped with artificial intelligence (AI), modern chatbots increasingly use conversational AI techniques like natural language processing (NLP) to make sense of the user’s questions and automate their responses. |
Clustering | An application of unsupervised learning wherein the algorithms group similar instances together based on their inherent properties. |
Code2Seq | A seq2seq model for code, trained on a substantial data set of text and code, that leverages the syntactic structure of programming languages to encode source code.
CodeT5 | A text-to-code seq2seq model developed by Salesforce Research, trained on a large data set of text and code. CodeT5 is the first pre-trained programming language model that is code-aware and encoder-decoder based.
Convolutional neural networks (CNNs) | A deep learning architecture that contains a series of layers, each applying a convolution, a mathematical operation, to the output of the previous layer.
DALL-E | A text-to-image generation model developed by OpenAI that is trained on a large data set of text and images and can be used to generate realistic images from various text descriptions. |
Deep learning | A type of machine learning focused on training computers to perform tasks through learning from data. It uses artificial neural networks. |
Diffusion model | A type of generative model that is popularly used for generating high-quality samples and performing various tasks, including image synthesis. It is trained by gradually adding noise to an image and then learning to remove the noise. This process is called diffusion. |
Dimensionality reduction | An application of unsupervised learning wherein the algorithms capture the most essential data features while discarding redundant or less informative ones. |
Falcon | A large language model developed by the Technology Innovation Institute (TII). Its variant Falcon-7B-Instruct is a 7-billion-parameter model built on a decoder-only architecture.
Foundation models | AI models with broad capabilities that can be adapted to create more specialized models or tools for specific use cases.
Generative adversarial network (GAN) | A type of generative model that includes two neural networks: a generator and a discriminator. The generator is trained on vast data sets to create samples, such as text and images, while the discriminator tries to distinguish whether a sample is real or fake.
Generative AI models | Models that can understand the context of input content to generate new content. In general, they are used for automated content creation and interactive communication. |
Generative pre-trained transformer (GPT) | A series of large language models developed by OpenAI, designed to understand language by leveraging a combination of two concepts: training and transformers.
Google Flan | An encoder-decoder foundation model based on the T5 architecture.
Google JAX | A machine learning framework for transforming numerical functions that combines autograd (automatically obtaining the gradient function through differentiation) with TensorFlow’s XLA (Accelerated Linear Algebra).
Hugging Face | An AI platform that allows open-source scientists, entrepreneurs, developers, and individuals to collaborate and build personalized machine learning tools and models. |
IBM Granite | Multi-size foundation models that are specially designed for businesses. These models use a decoder architecture to apply generative AI to both language and code. |
IBM watsonx | An integrated AI and data platform with a set of AI assistants designed to scale and accelerate the impact of AI with trusted data across businesses. |
Imagen | A text-to-image generation model developed by Google AI trained on a large data set of text and images. Imagen is used to generate realistic images from various text descriptions. |
Large language models (LLMs) | Deep learning models trained on substantial text data to learn the patterns and structures of language. They can perform language-related tasks, including text generation, translation, summarization, sentiment analysis, and more.
Llama | A large language model from Meta AI. |
Natural language processing (NLP) | A subset of artificial intelligence that enables computers to understand, manipulate, and generate human language (natural language). |
Neural code generation | A process that generates code using artificial neural networks, which are modeled on how neurons work in the human brain.
Neural network model | A type of text-to-text generation model that uses artificial neural networks to generate text. |
Neural networks | Computational models inspired by the human brain’s structure and functioning. They are a fundamental component of deep learning and artificial intelligence. |
Open lakehouse architecture | A data lakehouse architecture that combines elements of data lakes and data warehouses. |
PanGu-Coder | A text-to-code transformer model developed by Huawei. It is a pre-trained decoder-only language model that generates code from natural language descriptions.
Pre-trained models | Machine learning models trained on an extensive data set before being fine-tuned or adapted for a specific task or application. Using pre-trained models is a form of transfer learning, in which the knowledge gained from one task (the pre-training task) is leveraged to perform another task (the fine-tuning task).
Pre-training | A technique in which unsupervised algorithms repeatedly make connections between diverse pieces of information.
Prompt | An instruction or question given to a generative AI model to generate new content. |
PyTorch | An open-source machine learning framework based on the Torch library. This framework is used for applications such as computer vision and natural language processing. |
Recurrent neural networks (RNNs) | A deep learning architecture designed to handle sequences of data by maintaining hidden states that capture information from previous steps in the sequence.
Seq2seq model | A text-to-text generation model that first encodes the input text into a sequence of numbers and then decodes this sequence into a new one, representing the generated text. |
Statistical model | A type of text-to-text generation model that uses statistical techniques to generate text. |
Supervised learning | A subset of AI and machine learning that uses labeled data sets to train algorithms to classify data or predict outcomes accurately. |
T5 | A text-to-text transfer transformer model developed by Google AI, trained on a substantial text data set. It can be used for various tasks, including summarization, translation, and question answering.
TensorFlow | A free and open-source software library used for machine learning and artificial intelligence. |
Text-to-code generation model | A type of machine learning model used to generate code from natural language descriptions. It uses generative AI to write code through neural code generation. |
Text-to-image generation model | A type of machine learning model used to generate images from text descriptions. It uses generative AI to make meaning out of words and turn them into unique images. |
Text-to-text generation model | A type of machine learning model used to generate text from a given input. It is trained on a large text corpus, learning patterns, grammar, and contextual information, and it uses the given input to generate new text.
Training data | Data, generally large data sets of examples, used to teach a machine learning model.
Transformers | A deep learning architecture that uses an encoder-decoder mechanism. Transformers can generate coherent and contextually relevant text. |
Unsupervised learning | A subset of machine learning and artificial intelligence that uses algorithms to analyze and cluster unlabeled data sets. These algorithms can discover hidden patterns or data groupings without human intervention.
Variational autoencoder (VAE) | A generative neural network model designed to learn an efficient representation of input data by encoding it into a smaller space and decoding it back to the original space.
watsonx.ai | A studio of integrated tools for working with generative AI capabilities powered by foundation models and for building machine learning models.
watsonx.data | A massive, curated data repository that can be used to train and fine-tune models with a state-of-the-art data management system. |
watsonx.governance | A powerful toolkit to direct, manage, and monitor your organization’s AI activities. |
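The diffusion model entry above describes training as gradually adding noise to an image and then learning to remove it. Below is a minimal, illustrative sketch of that objective in PyTorch; the tiny MLP denoiser, the toy 16-dimensional samples, and the linear noise schedule are simplifying assumptions for the sketch, not any particular library's API or a production setup.

```python
import torch
import torch.nn as nn

# Stand-in denoising network: predicts the noise that was mixed into a sample.
# Real diffusion models use U-Nets or transformers; a tiny MLP keeps the sketch short.
denoiser = nn.Sequential(nn.Linear(16 + 1, 64), nn.ReLU(), nn.Linear(64, 16))
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

# A toy linear noise schedule over T steps (an assumption, not a tuned schedule).
T = 100
alpha_bar = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, T), dim=0)

def training_step(x):  # x: batch of clean samples, shape (batch, 16)
    t = torch.randint(0, T, (x.size(0),))            # random diffusion step per sample
    noise = torch.randn_like(x)                      # Gaussian noise to mix in
    a = alpha_bar[t].unsqueeze(1)
    noisy_x = a.sqrt() * x + (1 - a).sqrt() * noise  # "gradually add noise"
    # Condition the denoiser on the step index and train it to recover the noise.
    pred = denoiser(torch.cat([noisy_x, t.float().unsqueeze(1) / T], dim=1))
    loss = nn.functional.mse_loss(pred, noise)       # "learn to remove the noise"
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

training_step(torch.randn(8, 16))
```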
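Likewise, the Google JAX entry mentions autograd and XLA. A short sketch of what those look like in practice, assuming only the standard jax package:

```python
import jax
import jax.numpy as jnp

# Any plain numerical Python function can be differentiated automatically.
def loss(w):
    return jnp.sum((w * 2.0 - 1.0) ** 2)

grad_loss = jax.grad(loss)     # autograd: returns a new function computing d(loss)/dw
print(grad_loss(jnp.ones(3)))  # [4. 4. 4.]

# jax.jit compiles the function with XLA, the "accelerated linear algebra"
# compiler the glossary entry refers to.
fast_loss = jax.jit(loss)
print(fast_loss(jnp.ones(3)))
```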
Graded Quiz and Wrap-up
Graded Quiz: Generative AI: Foundation Models and Platforms
Lina is part of a group of developers tasked with creating a sample large language model (LLM). Which deep generative network should they use to accomplish this task?
A transformer-based model
Correct! A transformer-based model can process an exceptionally high number of parameters and, therefore, is best suited to create LLMs.
Gemini
The best choice for Lina’s team is a transformer-based model. Here’s why:
- LLMs & Transformers: Transformer architectures are the foundation of modern, powerful LLMs like GPT-3. Transformers have revolutionized LLMs due to their ability to:
- Process long sequences of text data effectively.
- Handle complex relationships within text (dependencies, context).
- Parallelize training, allowing for massively scaled models.
Why other options aren’t the best fit:
- Recurrent Neural Networks (RNNs): Older RNNs were used for language tasks, but they struggle with long-range dependencies and vanishing gradients, making them less suitable for large, complex LLMs.
- Supervised Learning Algorithms: Supervised learning refers to a category of algorithms, not a specific architecture. Transformers can be used in supervised settings, but it’s not what defines them.
- Convolutional Neural Networks (CNNs): CNNs are excellent for image processing, but their structure isn’t ideal for the sequential nature of language data.
Key Point: If Lina’s team aims to create a sample LLM, using a transformer-based model will align them with the technology behind state-of-the-art language models.
ChatGPT
Lina and her team should opt for a transformer-based model to create the sample large language model (LLM). Transformers have demonstrated superior performance in natural language processing tasks, especially for large-scale language modeling, thanks to their self-attention mechanism that allows them to capture long-range dependencies efficiently.
Llama3
The correct answer is:
D) A transformer-based model
Transformer-based models have been widely used for natural language processing tasks, including language modeling, and have achieved state-of-the-art results in many benchmarks. They are particularly well-suited for tasks that involve sequential data, such as text, and are capable of modeling long-range dependencies and complex relationships between words.
Recurrent neural networks (RNNs) are also commonly used for language modeling, but they can be less effective for very long sequences and may suffer from the vanishing gradient problem.
Supervised learning algorithms are not typically used for language modeling, as the goal is to generate text rather than classify or predict specific labels.
Convolutional neural networks (CNNs) are typically used for image and signal processing tasks, and are not well-suited for natural language processing tasks.
Therefore, a transformer-based model is the most suitable choice for creating a large language model.
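All three model responses agree on a transformer-based architecture. As a hands-on aside, a small pre-trained transformer LLM can be loaded in a couple of lines with the Hugging Face Transformers library (which comes up again in the last question); the gpt2 checkpoint here is just a convenient small example, not a recommendation for Lina's team.

```python
from transformers import pipeline

# Load a small pre-trained transformer LLM; "gpt2" is an illustrative choice.
generator = pipeline("text-generation", model="gpt2")

result = generator("Large language models are", max_new_tokens=20)
print(result[0]["generated_text"])
```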
While working with a recurrent neural network, Nitin struggles to process long sequences of text. Which type of generative AI model can he use to model long-term dependencies in text?
Transformer-based models
Correct! Transformer-based models are built with attention mechanisms that allow them to model long-term dependencies in text and generate contextually relevant content.
Gemini
The best answer is Transformer-based models. Here’s why:
- Transformers excel at long-term dependencies: Their attention mechanisms allow them to understand relationships between words or tokens even across long sequences of text. This is crucial for tasks like language translation, text summarization, and text generation, which require the model to maintain context over extended passages.
- RNNs and their limitations: While recurrent neural networks (RNNs) were used for sequence processing in the past, they suffer from vanishing/exploding gradient problems during training. This makes it difficult for them to learn long-term dependencies effectively.
The other options are less suitable:
- Variational autoencoders (VAEs): are primarily used for data encoding/decoding and image generation tasks. While they can handle sequences, they aren’t specifically designed for long-term dependency modeling.
- Generative adversarial networks (GANs): are powerful for image generation and style transfer. They are less commonly used for text-based tasks with a focus on long-term context.
- Diffusion models: are a newer class of generative models showing promise in image and audio generation, but their use in text modeling with long-term dependencies is still being researched.
ChatGPT
Nitin can use transformer-based models to model long-term dependencies in text efficiently. Transformers are known for their ability to capture long-range dependencies in sequences, making them well-suited for tasks involving processing long sequences of text.
Llama3
The correct answer is:
D) Transformer-based models
Transformer-based models are particularly well-suited for processing long sequences of text because they can model long-term dependencies and complex relationships between words. They use self-attention mechanisms to weigh the importance of different parts of the input sequence, allowing them to focus on relevant information and ignore irrelevant information.
Variational autoencoders (VAEs) are typically used for dimensionality reduction and generative modeling, but are not well-suited for processing long sequences of text.
Diffusion models are a type of generative model that can be used for image and audio generation, but are not typically used for natural language processing tasks.
Generative adversarial networks (GANs) are also not typically used for natural language processing tasks, and are more commonly used for image and audio generation.
Transformer-based models are particularly well-suited for natural language processing tasks, including language modeling, machine translation, and text summarization, and are well-equipped to handle long sequences of text.
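The attention mechanism credited in these answers can be sketched directly. The toy scaled dot-product self-attention below (random vectors standing in for token embeddings, and no learned projections, both simplifications) shows why sequence distance doesn't matter: every token attends to every other token in a single step.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, d = 6, 8            # toy sequence of 6 tokens, 8-dim embeddings
x = torch.randn(seq_len, d)  # stand-ins for token embeddings

# Scaled dot-product self-attention (single head, no learned projections for brevity).
q, k, v = x, x, x
scores = q @ k.T / d ** 0.5          # (6, 6): every token scores every other token
weights = F.softmax(scores, dim=-1)  # attention weights sum to 1 per row
out = weights @ v                    # each output mixes the whole sequence

# Token 0 attends to token 5 as directly as to token 1 -- there is no recurrence,
# so no signal vanishes over long ranges the way it can in an RNN.
print(weights[0])
```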
Huma is searching the internet for information on the twelve ways to stay alert on night watch. Which foundation model will likely give her the best result for such a query?
OpenAI’s GPT-4
Correct! OpenAI’s GPT-4 is among the largest general-purpose models available to users, making it well suited to open-ended informational queries like this one.
Gemini
Here’s the best option for Huma’s search, and why the others aren’t as suitable:
- OpenAI’s GPT-4 (or a similar large language model): These models are designed to understand and process natural language. They can search through vast amounts of information and provide answers in a comprehensive and easy-to-understand manner.
Why others aren’t ideal:
- Meta’s Galactica: While also a language model, Galactica was focused on scientific papers and was withdrawn due to issues with accuracy and bias. It’s not the best choice for general web searches.
- OpenAI’s DALL-E: This is an image generation model. It wouldn’t be helpful for finding textual information about staying alert.
- AlexNet: AlexNet is a convolutional neural network (CNN) that revolutionized image classification. It’s not designed for natural language processing or web searches.
Key point: For a factual search query like Huma’s, a large language model trained on a massive dataset of text is the most likely to provide relevant and accurate results.
ChatGPT
For Huma’s query on “twelve ways to stay alert on night watch,” OpenAI’s GPT-4 would likely provide the best results. GPT-4 is specifically designed for natural language understanding and generation tasks, making it well-suited for processing textual queries and providing relevant information.
Llama3
The correct answer is:
B) OpenAI’s GPT-4
GPT-4 is a large language model developed by OpenAI that is designed to generate human-like text and answer questions. It is a transformer-based model that has been trained on a massive dataset of text from the internet and can generate text on a wide range of topics.
Given Huma’s query about the twelve ways to stay alert on night watch, GPT-4 is likely to provide the best results because it is a language model that is designed to generate text and answer questions. It can provide a detailed and informative response to Huma’s query, including relevant information and tips on how to stay alert during night watch.
Meta’s Galactica is not a well-known language model, and it’s unlikely that it would provide relevant results for Huma’s query.
OpenAI’s DALL-E is a text-to-image model that generates images from text prompts, but it’s not designed to answer questions or provide information on specific topics.
AlexNet is a deep learning-based image recognition system that is not designed to answer questions or provide information on specific topics.
Neil is developing an application that lets users type in text descriptions to create unique images. Which generative AI model will be most suitable for implementing this feature?
A generative adversarial network
Correct! A generative adversarial network uses a generator-discriminator structure to provide users with an image closest to their description.
Gemini
The most suitable option for Neil’s application is a generative adversarial network (GAN). Here’s why:
- GANs for Image Generation: GANs excel at generating realistic and diverse images from simple inputs. Their structure, where a generator and discriminator compete, is ideal for learning to create images matching a given description.
- Other Options and Their Limitations:
- Markov Chain: Markov chains are simple models for generating sequences, but they lack the sophistication for complex text-to-image translation.
- Variational autoencoder (VAE): VAEs are good for image generation but can produce blurry results. They are not as strong as GANs in handling detailed descriptions.
- T5: T5 is a large language model from Google AI, excellent for text-based tasks but not the best choice for image generation.
Key Point: If Neil wants to let users create unique and realistic images based on text descriptions, GANs offer the most effective approach within current generative AI models.
ChatGPT
For Neil’s application that generates unique images based on text descriptions, a generative adversarial network (GAN) would be the most suitable choice. GANs are adept at generating realistic images from textual descriptions by learning the mapping between text and images and producing visually coherent results.
Llama3
The correct answer is:
D) A generative adversarial network (GAN)
Generative Adversarial Networks (GANs) are a type of deep learning model that can generate new, synthetic data that resembles existing data. In this case, a GAN can be trained to generate images based on text descriptions. The GAN consists of two neural networks: a generator network that produces images, and a discriminator network that tries to distinguish between real and generated images. Through this adversarial process, the generator network learns to produce more realistic images that match the text descriptions.
A Markov chain is a mathematical system that can be used for modeling random processes, but it’s not suitable for generating images from text descriptions.
A variational autoencoder (VAE) is a type of neural network that can be used for dimensionality reduction and generative modeling, but it’s not typically used for generating images from text descriptions.
T5 is a text-to-text transformer model that can be used for a wide range of natural language processing tasks, but it’s not designed for generating images from text descriptions.
Therefore, a GAN is the most suitable choice for implementing the feature of generating images from text descriptions.
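Llama3's description of the generator-discriminator interplay maps directly onto code. Below is a minimal sketch of one GAN training step, using toy 16-dimensional vectors rather than images and omitting the text conditioning a real text-to-image GAN would add on top.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 8, 16
generator = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real):  # real: (batch, 16) samples from the training data
    batch = real.size(0)
    fake = generator(torch.randn(batch, noise_dim))

    # Discriminator: label real samples 1 and generated samples 0.
    d_loss = bce(discriminator(real), torch.ones(batch, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: try to make the discriminator call its samples real.
    g_loss = bce(discriminator(fake), torch.ones(batch, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

gan_step(torch.randn(8, data_dim))
```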
Anita from the IT department is trying to develop an application that can automatically generate code from a text prompt. Which generative AI model will be most suitable for implementing this feature?
PanGu-Coder
Correct! PanGu-Coder is a transformer model developed by Huawei that generates code from natural language descriptions.
Gemini
The most suitable choice for Anita is PanGu-Coder or a similar code-generation model. Here’s why:
- Specialized Code Generators: Models like PanGu-Coder are specifically trained on vast amounts of code. They understand programming languages’ syntax, patterns, and logic, allowing them to generate code that matches the functionality described in a text prompt.
- Why Other Options Aren’t Ideal:
- Bidirectional Encoder Representations from Transformers (BERT): BERT is a powerful language model, but it’s primarily designed for natural language understanding, not code generation.
- Style-generative adversarial network (StyleGAN): StyleGAN specializes in image generation, making it irrelevant to code-related tasks.
- Variational autoencoder (VAE): While VAEs can generate data, they are not as adept at understanding the strict structural requirements of code as specialized code-generation models.
Key Point: Using a model specifically trained on code will give Anita the best chances of success for her application that automatically generates code from text prompts.
ChatGPT
For Anita’s application of automatically generating code from a text prompt, PanGu-Coder would be the most suitable generative AI model. PanGu-Coder is specifically designed for code generation tasks and is based on transformer architecture, making it adept at understanding and generating code from textual prompts.
Llama3
The correct answer is:
D) PanGu-Coder
PanGu-Coder is a neural network-based code generator that can generate code from natural language prompts. It uses a combination of natural language processing (NLP) and programming language processing (PLP) techniques to generate code that is syntactically and semantically correct.
BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model that can be fine-tuned for a wide range of NLP tasks, but it’s not specifically designed for generating code.
StyleGAN is a type of generative adversarial network (GAN) that is designed for generating images, not code.
A variational autoencoder (VAE) is a type of neural network that can be used for dimensionality reduction and generative modeling, but it’s not typically used for generating code.
PanGu-Coder is specifically designed for generating code from natural language prompts, making it the most suitable choice for implementing this feature.
Therefore, Anita from the IT department should consider using PanGu-Coder to develop the application that can automatically generate code from a text prompt.
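Whether PanGu-Coder itself is openly downloadable is a separate question, but the text-to-code pattern the answers describe can be tried with any open code-generation checkpoint via the Transformers library; Salesforce/codegen-350M-mono below is one illustrative choice, not the model from the quiz.

```python
from transformers import pipeline

# Any open text-to-code checkpoint works for this pattern; this one is illustrative.
coder = pipeline("text-generation", model="Salesforce/codegen-350M-mono")

prompt = "# Python function that returns the factorial of n\ndef factorial(n):"
print(coder(prompt, max_new_tokens=40)[0]["generated_text"])
```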
Veronique has just signed onto the IBM watsonx.ai platform. Which of these tasks can she perform on this platform?
Customize models based on her company’s business use cases
Correct! IBM watsonx.ai lets you customize models based on your data and business use cases.
Gemini
The most accurate option from the choices is:
- Customize models based on her company’s business use cases
Here’s why the other options are less accurate or relevant to IBM watsonx.ai:
- Use multiple query engines…: While watsonx.ai integrates with various data sources, the focus isn’t on multiple query engines specifically.
- Perform dimensional reporting visibility…: This language suggests OLAP-type analysis, not a core purpose of watsonx.ai.
- Use dashboards to analyze other AI models…: watsonx.ai has model monitoring components, but analyzing external models isn’t a primary feature.
What is IBM watsonx.ai?
watsonx.ai is a platform for building, managing, and deploying AI models. Its emphasis is on:
- Customizable AI Solutions: Providing tools and pre-trained models to develop tailored solutions.
- Responsible AI: Includes features for fairness, explainability, and lineage tracking of AI models.
- Enterprise-ready: Designed with security and scalability in mind.
ChatGPT
Veronique can customize models based on her company’s business use cases on the IBM watsonx.ai platform.
Llama3
The correct answer is:
C) Use dashboards to analyze other AI models on the platform
IBM watsonx.ai is a cloud-based platform that provides a range of AI and machine learning tools and services. One of the key features of the platform is the ability to analyze and visualize AI models using dashboards.
Veronique can use the dashboards to analyze and understand the performance of other AI models on the platform, including those developed by her company. This can help her to identify trends, patterns, and insights that can inform business decisions.
The other options are not correct because:
- Using multiple query engines to process data efficiently is not a feature of the IBM watsonx.ai platform.
- Performing dimensional reporting to gain visibility into AI models is not a feature of the platform.
- Customizing models based on a company’s business use cases is not a feature of the platform, although it may be possible to do this using other tools and services on the platform.
Therefore, option C is the correct answer.
Chandra is searching the Hugging Face website for a tool to help her develop her transformer-based model. Which Hugging Face tool is most likely to help her do that?
Transformers library
Correct! Hugging Face offers developers an open-source transformer library for PyTorch, TensorFlow, and Google JAX.
Gemini
The most likely Hugging Face tool to help Chandra is the Transformers library. Here’s why:
- Purpose of the Transformers Library: This open-source library provides state-of-the-art pre-trained transformer-based models and the building blocks to create custom ones. It’s the core resource for transformer-related development.
- Why Other Options Are Less Relevant:
- Expert Acceleration Program: This program provides consulting and support services, not direct development tools.
- Spaces Tab: This section hosts demos and applications built using Hugging Face tools, but it’s not the primary development toolkit itself.
- HuggingChat: This tool is geared toward building chatbots with Hugging Face models, but not the general development of transformer models.
Chandra’s Development Needs: If she wants to work with transformer-based models, the Transformers library offers the most comprehensive set of features, including:
- Pre-trained models for various tasks (text classification, translation, etc.)
- Tokenizers for text preprocessing
- Pipelines for simplified model usage
- Training and fine-tuning capabilities
ChatGPT
Chandra is most likely to find help in developing her transformer-based model by using the “Transformers library” on the Hugging Face website. This library provides various pre-trained transformer models, along with tools and utilities for fine-tuning and deploying custom transformer models.
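To make the features listed in Gemini's answer concrete, here is a short sketch using the Transformers library's tokenizers and pipelines; the checkpoint names are illustrative defaults, not the only options.

```python
from transformers import AutoTokenizer, pipeline

# Tokenizers: text preprocessing into the token IDs a transformer consumes.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer("Hugging Face makes transformers easy.")["input_ids"])

# Pipelines: simplified use of a pre-trained model for a task.
classifier = pipeline("sentiment-analysis")  # downloads a default fine-tuned model
print(classifier("This library saved me weeks of work."))
```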