Introduction to Large Language Models

This module explores what large language models (LLMs) are, where they can be used, and how you can use prompt tuning to enhance LLM performance. It also covers Google tools to help you develop your own Gen AI apps.

Learning Objectives

  • Define Large Language Models (LLMs)
  • Describe LLM Use Cases
  • Explain Prompt Tuning
  • Describe Google’s Gen AI Development tools

Video: Introduction to Large Language Models

This video introduces Large Language Models (LLMs), which are a subset of Deep Learning. LLMs are large, general-purpose models that are pre-trained on large datasets and can then be fine-tuned for specific purposes. They can be used for tasks such as text classification, question answering, and text generation.

The benefits of using LLMs include:

  1. A single model can be used for different tasks
  2. Minimal field (domain-specific) training data is required
  3. Performance keeps improving with more data and parameters

Pre-training and fine-tuning are compared to training a dog: the model is first trained for general purposes and then fine-tuned for specific tasks.

Prompt design is the process of creating a prompt that is clear, concise, and informative. It is closely related to prompt engineering but not identical: prompt design tailors the prompt to the task the system is asked to perform, while prompt engineering refines the prompt to improve performance.
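
As a minimal sketch of that distinction, the snippet below builds two prompts for the English-to-French translation task mentioned in the video. The function names `design_prompt` and `engineer_prompt` and the exact prompt wording are illustrative assumptions, not part of any Google API, and sending the prompt to a model is left out.

```python
def design_prompt(text, source_lang="English", target_lang="French"):
    """Prompt design: state the task and the target language clearly."""
    return (
        f"Translate the following {source_lang} text into {target_lang}.\n\n"
        f"Text: {text}\nTranslation:"
    )


def engineer_prompt(text, examples, source_lang="English", target_lang="French"):
    """Prompt engineering: add examples of the desired output to the same task."""
    lines = [f"Translate the following {source_lang} text into {target_lang}."]
    for source, target in examples:
        lines.append(f"\nText: {source}\nTranslation: {target}")
    # The final entry is left open for the model to complete.
    lines.append(f"\nText: {text}\nTranslation:")
    return "\n".join(lines)


basic = design_prompt("Good morning")
few_shot = engineer_prompt("Good morning", [("Thank you", "Merci")])
print(basic)
print("---")
print(few_shot)
```

Both prompts ask for the same task; the second simply adds a worked example, which is one of the techniques the video lists under prompt engineering.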

There are three types of LLMs: generic language models, instruction-tuned models, and dialog-tuned models. Chain-of-thought reasoning is the observation that models are better at getting the right answer when they first output text that explains the reason for the answer.
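
The chain-of-thought idea can be sketched with the tennis-ball question used in the video. The helper names and the "step by step" wording below are assumptions for illustration; the code only builds the two prompt variants and checks the arithmetic that a correct explanation should arrive at, without calling a model.

```python
QUESTION = (
    "Roger has five tennis balls. He buys two more cans of tennis balls. "
    "Each can has three tennis balls. How many tennis balls does he have now?"
)


def direct_prompt(question):
    """Ask for the answer only (hypothetical wording)."""
    return f"Q: {question}\nA: The answer is"


def chain_of_thought_prompt(question):
    """Ask the model to explain its reasoning before answering (hypothetical wording)."""
    return f"Q: {question}\nA: Let's work through this step by step."


def tennis_balls(start, cans, balls_per_can):
    """The reasoning a good explanation should follow: start + cans * balls_per_can."""
    return start + cans * balls_per_can


total = tennis_balls(start=5, cans=2, balls_per_can=3)
print(chain_of_thought_prompt(QUESTION))
print(f"Expected final answer: {total}")  # 5 + 2 * 3 = 11
```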

Tuning is the process of adapting a model to a new domain or a set of custom use cases. Tuning methods include fine-tuning, which retrains every weight in the model, and parameter-efficient tuning methods, which leave the base model unchanged and tune only a small number of add-on layers.
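
To make the contrast concrete, here is a toy sketch of the parameter-efficient idea: a frozen base weight plus a small trainable add-on that can be swapped per domain. It uses a one-parameter linear model in plain Python, which is an assumption chosen for brevity; the names `BASE_WEIGHT`, `tune_adapter`, and the made-up "legal" and "medical" data are hypothetical and do not reflect how Vertex AI or any real LLM implements tuning.

```python
BASE_WEIGHT = 2.0  # stands in for the frozen pre-trained model; never updated


def predict(x, adapter=0.0):
    """Base model output plus the contribution of a swappable add-on."""
    return (BASE_WEIGHT + adapter) * x


def tune_adapter(data, steps=200, lr=0.01):
    """Fit only the add-on value with gradient descent on mean squared error."""
    adapter = 0.0
    for _ in range(steps):
        grad = sum(2 * (predict(x, adapter) - y) * x for x, y in data) / len(data)
        adapter -= lr * grad
    return adapter


# Tiny made-up datasets: each domain wants a slightly different slope.
legal_data = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]    # target slope 2.5
medical_data = [(1.0, 1.5), (2.0, 3.0), (3.0, 4.5)]  # target slope 1.5

adapters = {
    "legal": tune_adapter(legal_data),
    "medical": tune_adapter(medical_data),
}

# Swap add-ons at inference time; the base weight is shared and unchanged.
for domain, adapter in adapters.items():
    print(domain, round(predict(2.0, adapter), 3))
```

Full fine-tuning would instead update `BASE_WEIGHT` itself, producing a separate copy of the whole model for each domain.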

Google Cloud services that can help developers get more out of their LLMs include Generative AI Studio, Vertex AI, and PaLM API.

Reading: Introduction to Large Language Models: Reading

Quiz: Introduction to Large Language Models: Quiz

What are large language models (LLMs)?

An LLM is a type of artificial intelligence (AI) that can generate human-quality text. LLMs are trained on massive datasets of text and code, and they can be used for many tasks, such as writing, translating, and coding.

An LLM is an advanced natural language processing framework that uses linguistic algorithms to generate sophisticated conversational agents.

An LLM is an artificial neural network architecture optimized for training large-scale reinforcement learning agents capable of mastering complex tasks in robotics.

An LLM is a state-of-the-art computer vision system that excels in recognizing and analyzing intricate patterns and features in images and videos.

What is a benefit of using large language models (LLMs)?
They can generate inaccurate or misleading content if their training data is incomplete or biased.
They can generate human-quality text for tasks such as content creation, writing assistance, and automatic summarization.
They can be trained using only a tiny dataset of text and code.
They can only provide output in the English language.

What are some of the applications of LLMs?
LLMs can be used for many tasks, including writing, translating, and coding.
LLMs can be used for many tasks, including personalized advice and therapy.
LLMs can be used for many tasks, including original creative expression and ethical decision making.
LLMs can be used for many tasks such as making real-time decisions in emergency situations and generating content based on physical perceptions.

What are some of the challenges of using LLMs? Select three options.


  • How’s it going? I’m Megha. Today, I’m going to be talking about large language models. Don’t know what those are? Me either. Just kidding, I actually know what I’m talking about. I’m a customer engineer here at Google Cloud and today I’m going to teach you everything you need to know about LLMs. That’s short for Large Language Models. In this course, you’re going to learn to define large language models, describe LLM use cases, explain prompt tuning, and describe Google’s Generative AI development tools. Let’s get into it.

    Large Language Models, or LLMs, are a subset of Deep Learning. To find out more about Deep Learning, check out our introduction to generative AI course video. LLMs and generative AI intersect and they’re both a part of Deep Learning. Another area of AI you may be hearing a lot about is generative AI. This is a type of artificial intelligence that can produce new content including text, images, audio, and synthetic data.

    All right, back to LLMs. So what are large language models? Large language models refer to large general-purpose language models that can be pre-trained and then fine-tuned for specific purposes. What do pre-trained and fine-tuned mean? Great questions, let’s dive in.

    Imagine training a dog. Often, you train your dog
    basic commands such as sit, come, down, and stay. These commands are normally sufficient for everyday life and help your dog become a good canine citizen. Good boy. But if you need special service dogs, such as a police dog, a guide dog, or a hunting dog, you add special training, right?

    A similar idea applies to large language models. These models are trained for general purposes to solve common language problems such as text classification, question answering, document summarization, and text generation across industries. The models can then be tailored to solve specific problems in different fields, such as retail, finance, and entertainment, using relatively small field datasets.

    So now that you’ve got that down, let’s further break down the concept into three major features of large language models. We’ll start with the word “large.” Large indicates two meanings. First is the enormous size
    of the training dataset, sometimes at the petabyte scale. Second, it refers to the parameter count. In machine learning, parameters should not be confused with hyperparameters: parameters are learned during training, while hyperparameters are settings chosen before training begins. Parameters are basically the memories and the knowledge the machine learned from the model training. Parameters define the skill of a model in solving a problem
    such as predicting text. So that’s why we use the word large.

    What about general purpose? General purpose is when the models are sufficient to solve common problems. Two reasons led to this idea. First is the commonality of human language regardless of the specific tasks. And second is the resource restriction. Only certain organizations have the capability to train such large language models with huge datasets and a tremendous number of parameters. How about letting them create fundamental language models for others to use?

    So this leaves us with our last terms, pre-trained and fine-tuned, which mean to pre-train a large language model for a general purpose with a large dataset, and then fine-tune it for specific aims with a much smaller dataset.

    So now that we’ve nailed
    down the definition of what large language models, LLMs, are, we can move on to describing LLM use cases.

    The benefits of using large language models are straightforward. First, a single model can be used for different tasks. This is a dream come true. These large language models that are trained with petabytes of data and have billions of parameters are smart enough to solve different tasks, including language translation, sentence completion, text classification, question answering, and more.

    Second, large language models require minimal field training data when you tailor them to solve your specific problem. Large language models obtain decent performance even with little domain training data. In other words, they can be used for few-shot or even zero-shot scenarios. In machine learning, few-shot refers to training a model with minimal data and zero-shot implies that a model can recognize things that have not explicitly been taught in the training before.

    Third, the performance
    of large language models is continuously growing when you add more data and parameters. Let’s take PaLM as an example. In April 2022, Google released PaLM, short for Pathways Language Model, a 540 billion parameter model that achieves state-of-the-art performance across multiple language tasks. PaLM is a dense decoder-only transformer model. It leverages the new Pathways system, which enabled Google to efficiently train a single model across multiple TPU v4 Pods. Pathways is a new AI architecture that will handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world. The system enables PaLM to orchestrate distributed computation for accelerators, but I’m getting ahead of myself.

    I previously mentioned that PaLM is a transformer model. Let me explain what that means. A transformer model consists of an encoder and a decoder. The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representations for a relevant task.

    We’ve come a long way from
    traditional programming to neural networks to generative models. In traditional programming, we used to have to hard code the rules for distinguishing a cat. Type: animal. Legs: four. Ears: two. Fur: yes. Likes: yarn and catnip. In the wave of neural networks, we could give the network pictures of cats and dogs and ask, “Is this a cat?” And it would predict a cat. What’s really cool is that in the generative wave, we as users can generate our own content, whether it be text, images, audio, video, or more. For example, models like PaLM, or Pathways Language Model, and LaMDA, or Language Model for Dialogue Applications, ingest very, very large data from multiple sources across the internet and build foundation language models we can use simply by asking a question, whether typing it into a prompt or verbally talking into the prompt itself. So when you ask it, “What’s a cat?”, it can give you everything it has learned about a cat.

    Let’s compare LLM development
    using pre-trained models with traditional ML development. First, with LLM development, you don’t need to be an expert. You don’t need training examples, and there is no need to train a model. All you need to do is think about prompt design, which is the process of creating a prompt that is clear, concise, and informative. It is an important part of natural language processing, or NLP for short. In traditional machine learning, you need expertise, training examples, compute time, and hardware. That’s a lot more requirements than LLM development.

    Let’s take a look at an example of a text generation use case to really drive the point home. Question answering, or QA, is a subfield of natural language processing that deals with the task of automatically answering questions posed in natural language. QA systems are typically trained on a large amount of text and code and they’re able to answer a wide range of questions, including factual, definitional, and opinion-based questions. The key here is that you needed domain knowledge to develop these question answering models. Let’s make this clear with a real world example. Domain knowledge is required to develop a question answering model for customer IT support or healthcare or supply chain, but using generative QA, the model generates free text directly based on the context. There’s no need for domain knowledge.

    Let me show you a few more
    examples of how cool this is. Let’s look at three questions given to Gemini, a large language model chat bot developed by Google AI.

    Question one, “This year’s sales are $100,000. Expenses are $60,000. How much is net profit?” Gemini first shares how net profit is calculated, then performs the calculation. Then Gemini provides the definition of net profit.

    Here’s another question. “Inventory on hand is 6,000 units. A new order requires 8,000 units. How many units do I need to fill to complete the order?” Again, Gemini answers the question by performing the calculation.

    And our last example, “We have 1,000 sensors in 10 geographic regions. How many sensors do we have on average in each region?” Gemini answers the question with an example on how to solve the problem and some additional context.

    So how is it that in each of our questions, a desired response was obtained? This is due to prompt design. Fancy.

    Prompt design and prompt engineering are two closely related concepts in natural language processing. Both involve the process of
    creating a prompt that is clear, concise, and informative. But there are some key differences between the two. Prompt design is the process of creating a prompt that is tailored to the specific task the system is being asked to perform. For example, if the system is being asked to translate a text from English to French, the prompt should be written in English and should specify that the translation should be in French. Prompt engineering is the process of creating a prompt that is designed to improve performance. This may involve using domain-specific knowledge, providing examples of desired output, or using keywords that are known to be effective for the specific system. In general, prompt design is a more general concept, while prompt engineering is a more specialized concept. Prompt design is essential, while prompt engineering is only necessary for systems that require a high degree of accuracy or performance.

    There are three kinds of
    large language models: generic language models, instruction-tuned, and dialog-tuned. Each needs prompting in a different way.

    Let’s start with generic language models. Generic language models predict the next word based on the language in the training data. Here is a generic language model. In this example, “The cat sat on,” the next word should be “the,” and you can see that “the” is most likely the next word. Think of this model type as an autocomplete in search.

    Next, we have instruction-tuned models. This type of model is trained to predict a response to the instructions given in the input. For example, “Summarize a text of X.” “Generate a poem in the style of X.” “Give me a list of keywords based on semantic similarity for X.” In this example, classify text into neutral, negative, or positive.

    And finally, we have dialog-tuned models. This model is trained to have a dialog by predicting the next response. Dialog-tuned models are a special case of instruction-tuned where requests are typically framed as questions to a chat bot. Dialog tuning is expected to be in the context of a longer back-and-forth conversation and typically works better with natural question-like phrasings.

    Chain-of-thought reasoning
    is the observation that models are better at getting the right answer when they first output text that explains the reason for the answer. Let’s look at the question. “Roger has five tennis balls. He buys two more cans of tennis balls. Each can has three tennis balls. How many tennis balls does he have now?” This question is posed initially with no response. The model is less likely to get the correct answer directly. However, by the time the second question is asked, the output is more likely to end with the correct answer.

    But there is a catch, there’s always a catch. A model that can do everything has practical limitations, but task-specific tuning can make LLMs more reliable. Vertex AI provides
    task-specific foundation models. Let’s get into how you can tune with some real world examples. Let’s say you have a use case where you need to gather how your customers are feeling about your product or service. You can use a sentiment analysis task model. The same goes for vision tasks: if you need to perform occupancy analytics, there is a task-specific model for your use case.

    Tuning a model enables you to
    customize the model response based on examples of the task that you want the model to perform. It is essentially the process of adapting a model to a new domain or a set of custom use cases by training the model on new data. For example, we may collect training data and tune the model specifically for the legal or medical domain. You can also further tune the model by fine-tuning, where you bring your own dataset and retrain the model by tuning every weight in the LLM. This requires a big training job and hosting your own fine-tuned model. Here’s an example of a medical foundation model trained on healthcare data. The tasks include question answering, image analysis, finding similar patients, et cetera.

    Fine-tuning is expensive and not realistic in many cases. So are there more efficient methods of tuning? Yes. Parameter-efficient tuning methods, PETM, are methods for tuning a large language model on your own custom data without duplicating the model. The base model itself is not altered. Instead, a small number of add-on layers are tuned, which can be swapped in and out at inference time.

    I’m going to tell you
    about three other ways Google Cloud can help you get more out of your LLMs.

    The first is Generative AI Studio. Generative AI Studio lets you quickly explore and customize generative AI models that you can leverage in your applications on Google Cloud. Generative AI Studio helps developers create and deploy generative AI models by providing a variety of tools and resources that make it easy to get started. For example, there is a library of pre-trained models, a tool for fine-tuning models, a tool for deploying models to production, and a community forum for developers to share ideas and collaborate.

    Next, we have Vertex AI, which is particularly helpful for those of you who don’t have much coding experience. You can build generative AI search and conversations for customers and employees with Vertex AI Search and Conversation, formerly GenAI App Builder. Build with little or no coding and no prior machine learning experience. Vertex AI can help you create your own chat bots, digital assistants, custom search engines, knowledge bases, training applications, and more.

    And lastly, we have PaLM API. PaLM API lets you test and experiment with Google’s large language models and GenAI tools. To make prototyping quick and more accessible, developers can integrate PaLM API with MakerSuite and use it to access the API using a graphical user interface. The suite includes a number of different tools such as a model training tool, a model deployment tool, and a model monitoring tool. And what do these tools do? I’m so glad you asked. The model training tool helps developers train machine learning models on their data using different algorithms. The model deployment tool helps developers deploy machine learning models to production with a number of different deployment options. The model monitoring tool helps developers monitor the performance of
    their machine learning models in production using a dashboard and a number of different metrics.

    Gemini is a multimodal AI model. Unlike traditional language models, it’s not limited to understanding text alone. It can analyze images, understand the nuances of audio, and even interpret programming code. This allows Gemini to perform complex tasks that were previously impossible for AI. Due to its advanced architecture, Gemini is incredibly adaptable and scalable, making it suitable for diverse applications. Model Garden is continuously updated to include new models.

    See, I told you way back in the beginning of this video that I knew what I was talking about when it came to large language models, and now you do too. Thank you for watching our course and make sure to check out our other videos if you want to learn more about how you can use AI.
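
The transcript describes a generic language model as predicting the next word, much like autocomplete. A toy illustration of that idea, assuming a tiny made-up corpus and simple bigram counts (real LLMs use neural networks trained on vastly more data), might look like this:

```python
from collections import Counter, defaultdict

# Hypothetical three-sentence corpus, chosen only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a chair",
]


def train_bigrams(sentences):
    """Count which word follows each word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prompt):
    """Return the most frequent follower of the prompt's last word, if any."""
    last = prompt.lower().split()[-1]
    if last not in counts:
        return None
    return counts[last].most_common(1)[0][0]


model = train_bigrams(corpus)
print(predict_next(model, "The cat sat on"))  # "the" follows "on" most often here
```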