
Module 2: Introduction to Prompts

What are Prompts?


Video: What is a Prompt?

The speaker is discussing the concept of a “prompt” in the context of interacting with large language models like ChatGPT. A prompt is more than just a question or instruction; it’s a call to action that can have multiple dimensions and nuances.

The speaker highlights several aspects of prompts:

  1. Call to action: A prompt is a way to get the large language model to start generating output, whether it’s text, software code, or other forms of data.
  2. Time aspect: Prompts can have a time component, meaning they can affect the output not just immediately but also in the future. This is because large language models have conversational capabilities that allow them to remember previous interactions.
  3. Memory: Prompts can serve as a reminder or cue to the large language model, helping it remember information or context from previous interactions.
  4. Eliciting information: A prompt can also be used to elicit information from the user, with the large language model asking questions or seeking clarification.
  5. Providing information: A prompt can be used to provide new or updated information to the large language model, helping it reason about topics it may not have been trained on.

The speaker illustrates these concepts through examples of interacting with ChatGPT, demonstrating how prompts can be used to:

  • Ask questions and get better versions of those questions
  • Provide context and information to the large language model
  • Elicit information from the user
  • Provide new or updated information to the large language model

Overall, the speaker emphasizes the importance of understanding the complexities of prompts in order to effectively interact with large language models and unlock their full potential.
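
The time and memory aspects above can be pictured as a growing conversation history. The sketch below is a minimal illustration under the assumption that a chat tool shows the model the earlier turns along with the newest one; the `add_turn` and `render_context` helpers and the `user`/`assistant` role labels are hypothetical names chosen for this example, and no real API is called.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    return history + [{"role": role, "content": content}]


def render_context(history: List[Message]) -> str:
    """Flatten the history into the text the model would see on the next turn."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history)


history: List[Message] = []
# A prompt that spans time: it should keep influencing later answers.
history = add_turn(
    history,
    "user",
    "From now on, when I ask a question, suggest a better version "
    "of my question and ask me if I would like to use it.",
)
history = add_turn(history, "assistant", "Sure, I can do that.")
# An immediate prompt: the earlier instruction is still part of the context.
history = add_turn(history, "user", "What color is the sky?")

context = render_context(history)
print(context)
```

Because the "from now on" instruction is still present in `context` when the sky question arrives, the earlier prompt can shape the later answer.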

So what does the word
prompt actually mean? Now we talk a lot about
prompt engineering and we’re talking about interaction
with large language models. And so let’s dive into this word a little
bit because I want to give you a sense of the depth of what a prompt can be. It’s more than just a question that you’re
asking to the large language model. So to do this, I think it’s helpful
to go through a couple of prompts and actually interact with ChatGPT to
understand the concept of a prompt and the dimensions of a prompt. So I start off here by just asking
ChatGPT, what does the word prompt mean? And it comes up with some really helpful information, I think, for us to use
when we’re talking about prompts. One, it says it’s a verb to cause or encourage someone to do something
to spur them to action. And as an example, the teacher’s reminder
about the upcoming exam prompted the students to start studying. So part of what a prompt is, is a call to action to the large
language model. It is something that gets the large language model
to start generating output for us, basically to start generating words
that will form the basis of our output. And those words could take all
kinds of different formats. It could be software,
defining the instructions in a program. It could be the words that we need for
that. It could be words in a poem,
it could be words in many, many different types of things, we could
do it as structured data that we need. All of these things are possible. But the key is the prompt is
basically our way of getting the large language model to begin
to provide some output for us. The second thing it says
is it’s an adjective: done without delay, on time. And this is an important thing to think
about with large language models: we’re prompting them right now to do this. But this is also a useful point of
discussion because a prompt doesn’t necessarily, depending
on how you work with it, have to be something
that happens right now. Prompts can have time
associated with them. And now we have a little bit
of an illusion of time and I’ll talk about this later in the course. But when we give prompts to the large
language model like ChatGPT, the prompt could affect what’s happening right now
in my next statement that I give it or in the current statement or
it could affect something in the future. And this is because of some of
the conversational capabilities of these large language models. Now we’re going to talk about prompts
themselves as conversations later on. But for now, the way to think about it is
a prompt doesn’t just have to be something that’s immediate. It can be, but it doesn’t have to be. The next thing it says is
a prompt can be a noun, a cue or reminder that helps someone remember
what they were supposed to do or say. Now this is important because we
can use prompts to remind the large language model of information or
things that it needs to do. And so
we may be asking it to do something now, or we may be asking it to do something
in the future, in which case
it needs to remember something. So we need to give it a reminder
of what it’s trying to do. We may need to go back and give it a history of the conversation
of things that have happened, other things that will help it
remember what it’s trying to do. And then it can be another noun,
a message or screen that appears on a computer or electronic
device asking the user for input. So this is kind of a helpful
thing because we can prompt the large language model
to produce things for us. But the large language model if we
structure things correctly can actually elicit information from the user. So the large language model
can ask questions of us. So a prompt could serve
as a call to action, spurring the large
language model to action. Or it can actually be an input,
it can be asking us for input. So the last output from the large language
model could have been a question for the human and basically we are then
providing information back to it. So prompts have a lot of
different dimensions to them. And if you actually look at
the definition of prompt, in fact, the definition that ChatGPT gives us, it gives us sort of that nuance
to the capabilities of a prompt. Now, I’ve gone on a little bit here and
I’m going to give you an example of this, this time aspect to a prompt because I
think this helps to see the complexity and dimensionality of this concept. I’m going to say from now
on when I ask a question, suggest a better version of my question
and ask me if I would like to use it. So right here,
I’m starting off by saying from now on, I’m attaching time to my prompt. So it’s not just immediate and
we get a response. Sure I can do that. However, I will start, I ask you
the original question to the best of my ability and it goes on to try to
reframe this original question, which wasn’t what I necessarily needed. But then I say, what color is the sky? Now notice we’re done, right? The original
prompt was back here. I said from now on when
I ask a question and it immediately gave us a response
because of that prompt. But now I’m going to ask
a follow on question, but the earlier prompt is going
to affect this question. So right now I’m asking
it something immediate. What color is the sky? And it responds with a possible better
version of the question could be. What is the scientific explanation for the
blue color of the sky during the daytime? Would you like me to use
this version instead? And then I say yes. Now notice here, we have a lot of
those different dimensions of a prompt that we talked about. Up here we are creating a prompt
that isn’t just immediate, it is going to affect us in the future. It’s going to continue to influence
the output of the large language model. So we can have immediate prompts
that right now, right at this moment, do something, but
we can also have prompts that span time. And in this case, that’s what we’re doing. We’re creating something
from now on go and do this. Now, we’re going to have an immediate
prompt, which is what color is the sky? We’re asking this right now and now our earlier prompt that we
provided that was spanning time is now impacting the way that the large language
model responds to our questions. So now what’s happening is it’s taking
our current immediate question, what color is the sky? But it’s using the context that we had
previously said to suggest better versions of all of our questions. And then ask us if we’d like to use them
to generate its output and its output is, here’s a better question,
what is the scientific explanation for the blue color of the sky during daytime? And you can see how that really is
a better question than my first question. My first question was sort of vague,
didn’t provide enough context. And now it’s giving us a more specific
question that truly is better. And that’s because of this prompt
that we attached some time to it, we said from now on. And so it’s influencing what’s said. Now, the other thing that we can see is
we talked about a prompt being something that can elicit information from the user. And so now we are seeing that
ChatGPT, in this case the large language model, is asking us for
information. It’s prompting us. It’s soliciting information. And it’s saying, would you like me to use
this version of the question instead? So now we’ve got that extra
dimension of a prompt. It’s soliciting information from us and
we say yes and then it goes and answers the question and it says great
to answer your revised question. The blue color of the sky during
the daytime is due to a phenomenon called Rayleigh scattering. And I apologize,
I don’t know this term and I’m actually learning something
from ChatGPT as I read this. The earth’s atmosphere is
composed of various gasses and particles that scatter
sunlight in all directions. Blue light has a shorter wavelength and
higher frequency than other colors, which causes it to scatter
more easily in the atmosphere. This results in the blue
color that we see when we look up at the sky on a clear day. Now, we’ve seen a lot of different
dimensions to prompts through this short interaction. We’ve seen creating
prompts that span time. We then see an immediate prompt that’s
spurring the large language model to action. We see the concept of
the large language model, prompting and getting input from us. And then we see the output of all of this. Now, one of the things to note
is we also talked about memory, that the prompt had
a sort of concept of memory to it. And we’ll talk about how
this works in the future. But right here I say from now on when I
ask a question, that’s something that the large language model has to
remember in the future interactions. And we’ll see how it actually does this. And so then when I say, what color is
the sky, it’s remembering that I said, whenever I ask a question,
ask a better version of a question and then it’s responding right here with
the better version of the question. And then if I go and I do this again,
we’ll see that it still remembers my original statement that it needs to
suggest better versions of my question. So I said,
how many countries are there in the world? And it says a better version
of your question would be, what is the current estimated
number of countries in the world? And how is a country defined? Would you like me to use
this version instead? And then I say yes and then it says, great. As of my knowledge cutoff in
September 2021 there were 195 recognized sovereign states or
countries in the world. However, this number may vary
depending on how a country is defined. The United Nations, for
example, has 193 member states. While other organizations and
countries may recognize different numbers. Generally, a country is defined as
a sovereign state with its own government, territory and international recognition. Now, one other interesting thing
that we talked about is that a prompt has some memory
associated with it. And this also shows us one of
the limitations: the large language model’s
knowledge cuts off in September 2021. So one of the other things
that we may need to do for the large language model
is we may need to remind it of what has happened since it
was finished being trained. And we can use a prompt to provide
information to a large language model like ChatGPT about what
happened after it was trained. So we can also use it to provide
information that the large language model didn’t have access to when it was trained
like our own private information or other things. Now, of course, you have to be cognizant
that when you’re using ChatGPT or many of these other tools, you’re actually
sending information to their servers. So you do want to make sure that that is
something that you’re willing to do before doing this. But if you’re using a large language
model locally on your machine or you’re willing to send the information,
you can provide information that it doesn’t have access to in
order to reason about. So a prompt can also be about providing
information to one of these language models. So if we look at this example and we say what was
Vanderbilt University’s acceptance rate in 2020, it should
have access to that. And so it provides this information,
it says the acceptance rate was 9.6%. If we say what was Vanderbilt University’s
acceptance rate in 2022? It will say I’m sorry, but it doesn’t have access to that information
because it was cut off in 2021. But we can then provide that information. So here, I’m cutting and pasting
the admissions data that was released and published by Vanderbilt
just a few days ago. So it clearly didn’t have access to this,
but I now provide this to it in a prompt. Then it says based on
the information you provided, it appears that Vanderbilt University’s
acceptance rate for the regular decision applicants for
the class of 2026 was 4.7%. So a prompt can also serve as
a way of providing new and updated information to the large language
model like we saw in this example.
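
The last idea in the transcript, pasting newer information into the prompt so the model can reason about it, can be sketched as a small prompt-building helper. This is only an illustration: the function name `build_prompt_with_context`, the `INFORMATION`/`QUESTION` labels, and the placeholder text are assumptions made for this example, not the wording used in the video.

```python
def build_prompt_with_context(context: str, question: str) -> str:
    """Combine pasted reference text and a question into a single prompt."""
    parts = [
        "Use the information below to answer the question.",
        "",
        "INFORMATION:",
        context.strip(),
        "",
        "QUESTION:",
        question.strip(),
    ]
    return "\n".join(parts)


# Placeholder only: paste the published text you want the model to use.
pasted_text = "<paste the published admissions data here>"
prompt = build_prompt_with_context(
    pasted_text,
    "What was Vanderbilt University's acceptance rate in 2022?",
)
print(prompt)
```

As the transcript notes, anything placed in `pasted_text` is sent to the tool's servers unless the model runs locally, so only include information you are willing to share.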

Video: Intuition Behind Prompts

The speaker is discussing how to effectively interact with large language models, such as ChatGPT, by understanding how they work and how to craft prompts that elicit specific responses. The key takeaways are:

  1. Patterns are crucial: Large language models are trained to recognize patterns in language, so using familiar patterns in your prompts can help elicit consistent and specific responses.
  2. Word choice matters: The words and phrases you use in your prompt can influence the model’s response. Using specific words or phrases can help tap into the model’s learned patterns and produce more targeted output.
  3. Provide context and specificity: Giving the model specific details and context can help it produce more accurate and relevant responses. This can include using specific names, dates, or locations.
  4. Structure your prompt: The way you structure your prompt can influence the model’s response. Using titles, headings, or other organizational elements can help the model understand what you’re looking for and produce more structured output.
  5. Use patterns to influence output: You can use patterns in your prompt to influence the model’s response. For example, using a specific format or structure can help the model produce output that follows that format.
  6. Be mindful of generic language: Using generic language can lead to generic responses. To get more specific output, use specific language and provide more context.
  7. Experiment and refine: Don’t be afraid to try different prompts and refine them based on the model’s response. This can help you get more accurate and relevant output.

By understanding how large language models work and how to craft effective prompts, you can get more out of your interactions with them and produce more accurate and relevant output.
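
To make the "strong pattern" idea concrete, here is a toy next-word counter. It is a deliberately tiny sketch and not how ChatGPT works internally: the five-line corpus is invented for the example, and the `train` and `next_word_distribution` helpers are hypothetical names. It only shows why a phrase seen many times yields one consistent continuation while a rarer phrase yields several.

```python
from collections import Counter, defaultdict


def train(corpus, n=3):
    """Count which word follows each run of n consecutive words."""
    counts = defaultdict(Counter)
    for line in corpus:
        words = line.lower().split()
        for i in range(len(words) - n):
            context = tuple(words[i:i + n])
            counts[context][words[i + n]] += 1
    return counts


def next_word_distribution(counts, prompt, n=3):
    """Return the observed probability of each next word after the prompt."""
    context = tuple(prompt.lower().split()[-n:])
    followers = counts.get(context, Counter())
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()} if total else {}


# Invented toy corpus: the nursery rhyme is repeated, the other story varies.
corpus = [
    "mary had a little lamb",
    "mary had a little lamb",
    "mary had a little lamb",
    "a girl named mary had a microscopic friend",
    "a girl named mary had a microscopic pet",
]
counts = train(corpus)

strong = next_word_distribution(counts, "Mary had a little")
weak = next_word_distribution(counts, "A girl named Mary had a microscopic")
print(strong)  # one continuation dominates
print(weak)    # several continuations compete
```

In this toy setting the familiar prompt always continues the same way, while the less familiar one splits between alternatives, which mirrors the consistency difference described in the summary.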

Let’s talk intuitively about what a prompt
is doing and how the patterns in the prompt sort of tap into the
capabilities of a large language model. I think looking at some
examples will help to give you a sense of how your prompts are really affecting
and eliciting different behaviors from the underlying large language
model that you’re interacting with. So, let’s take a look at this example. I want to start off by talking about
patterns. Patterns are really important. Remember, as we talked about, large language models are trained
to basically predict the next word, they’re constantly trying to predict
the next word based on what came before. So you give it a sentence and
it’s trying to predict the next word and then it’s taking that and adding it and
trying to predict the next word and over and over and over. And so, it was trained on large pieces of
the internet trying to predict the next sentence. So one of the things that it’s learning to
do is to recognize patterns that tell it what the next word should be,
that’s basically what it’s doing. Now, this is a really simple
example of this concept. I’m going to tell ChatGPT
complete this story and I start with Mary had a little and
I stop. Now, if you know the nursery rhyme,
it’s Mary had a little lamb. Its fleece was white as snow. And you can imagine that this story
showed up over, and over, and over, and over in the data
that ChatGPT was trained on. So when I say Mary had a little,
that is a very powerful pattern that ChatGPT has seen many,
many, many times. And so it immediately tries to pick
up with lamb, its fleece was white as snow and everywhere that Mary went,
the lamb was sure to go. So that pattern was seen so
many times that ChatGPT, when I put in the first part of it,
it immediately responds with the next part of the sentence,
it knows where to pick up. So one thing we have to think about is,
if we know the right patterns, if we can tap into things that the model has
been trained on and seen over, and over, and over again, we will be more likely
to not only get a consistent response. So we’re going to consistently get a
particular completion of Mary had a little because it’s been seen so
many times in the training data, we get more consistency. But we’re also more likely to be able to
tap into a particular ability, to unlock it. If we want a particular behavior, we want to have the right
pattern that we’re tapping into. Now, let’s look at what happens
if we change the pattern a bit. And so I say, rather than Mary had
a little, I say complete this story, a girl named Mary had a microscopic. Now, this isn’t the pattern that
it was trained on anymore, right? And it says a girl named Mary had
a microscopic friend named Max. Max was so small, you could only
see him through a microscope. Despite his size,
he had a big personality and was always eager to go
on adventures with Mary. One day, Mary decided to take
Max on a trip to the local pond. She carefully placed him on a lily pad. Now, one of the interesting things is you
can imagine that when you start talking about microscopic. You can think about, there’s probably
descriptions of going to a pond and doing science with a class of students. I mean, I almost have a vision of it,
reading stories about students getting samples of pond water and
looking at them under a microscope. But the important thing is you
notice that Mary created a very, very specific response,
whereas a girl named Mary had a microscopic creates
a much different response. And then if we go and rerun this,
we get something completely different: a pet named Max. Max was a tiny creature
that lived in a small glass jar. And so, this pattern isn’t as strong. There are words in this
pattern that are strong and sort of give context that
help to shape the story, but the pattern itself is not
as strong as Mary had a little. So running Mary had a little again, whereas the other
story changed completely, I’m still getting the same completion
with this, I’m still getting lamb, its fleece was as white as snow. So every time I’m getting a very
consistent response to this pattern. So one of the things to think about
when you’re writing a pattern or writing a prompt is what
patterns do I have in my prompt. And what patterns will that probably tap into that the large language
model was trained on? Now, if there’s a very strong
pattern in the prompt, you’re more likely to get
a response to that pattern. On the other hand, if that’s not a pattern
that it was trained on, it’s not as clear, then you’re going to have
other things that are going to matter like specific words. So microscopic,
that word is fundamentally going to change the behavior of this prompt
because it’s such a strong word. A girl named Mary is sort of generic, there were probably lots of
stories with a girl named Mary. But microscopic, it has all of this
feeling and implications to it. And that’s going to then influence
the large language models output, specific words alone because of
the patterns around those words. Like microscopic is always going to
be talked about in the context of small things probably, or maybe in
terms of things like in this case, a small creature that
lived in a glass jar. That word itself has a strong sort of
pattern of other words that are coming around it but it’s not as strong
as the pattern, Mary had a little. Now, one thing to note is that
when you’re writing prompts, you want to be really specific or you want
to provide enough detail in your words. That the large language model will know
the right level of detail to put into the output. So if your words are very generic and
high level, then you’re more likely to get an output
that is also generic and high level. So if you look at this example,
I say discuss Vanderbilt University and I get a very generic sort of response. It’s a private research university
located in Nashville, Tennessee, founded in 1873 by Cornelius Vanderbilt,
ranking other things. Now, if I want something really specific,
I need to make sure that ChatGPT has the right words and
context in order to answer my question. So ChatGPT is not a mind reader,
you have to give it the context, you have to give it the words that
are going to elicit the right response. If I go back and I say discuss Vanderbilt
University with respect to Kirkland Hall, it’s a specific building on campus,
then it changes the output, I get a much more specific output. I get Vanderbilt University is a private
research university located in Nashville, Tennessee. Now we notice we’re getting this
fairly consistently, both of them. So that’s probably pattern or information
that the model was trained on and saw over and over again on different web pages for
Vanderbilt and different documents. And then it cues in on Kirkland Hall
and it starts discussing Kirkland Hall. It’s one of the most iconic buildings
on Vanderbilt University campus, named after James H. Kirkland. It then talks about Kirkland Hall,
Kirkland Hall, Kirkland Hall. And so,
we’ve now completely changed the output. Now, even though we had
shared a common beginning, we got a much more targeted output. And the key to this was that we added
in specificity into our prompt about what we wanted, the details. So if you think about asking it to write
something or to analyze something, it’s always helpful when
you say with respect to. And give it very specific words,
or names, or things that you want it to then
use to trigger the output. Because what we see is this Kirkland Hall
is a very powerful word that then gets woven into the whole discussion,
it’s a powerful name of a building. If we just say generically at a high
level, we don’t use specific words, we don’t use a specific context. We’re going to get very generic stuff and
this is probably stuff that if you go and look at descriptions on Wikipedia,
descriptions on Vanderbilt’s web pages, this is like the most common types
of information that you see. So in order to break out of
the mold of what is common, you need to inject the more specific
ideas that you want in the output or you need to inject patterns that will
help tap into those more specific things. So this is an important aspect in that,
if you just use it and you ask it generic questions,
you’re going to get generic answers. If you ask it average questions and average things,
you will get average answers. To really use it powerfully,
you have to use your own creativity and thought to get specific about details
and aspects of what you want discussed. And think about what are the right
patterns to put into your prompt in order to trigger
the behavior that you want, like we saw in this
Mary had a little lamb example. Or we might want to do the opposite: we
might know that we’re going to get a very generic response because certain
things have been seen a lot. And so, we need to come up with tricks,
or patterns, or specific words that will help break us out of those patterns
that we’re seeing in the training data so that we can get something
that’s more of what we want. So we don’t want to be trapped into
the completion of Mary had a little lamb. So we go and add specific words like
microscopic and a girl named Mary so that it’s not so strong, so that we’re
not directly getting into that pattern. So just changing the wording can
help us escape the patterns or get into the patterns
depending on what we want. And then when we get output, the more
generic our language, typically, the more generic the output, the more specific our
language, the more specific the output. And if we want certain
things in that output, we need to make sure we
have specific words and phrases that will tap into the specific
things that we want in the output. Now, one of the other things is we’re
talking about output but in the sense of a query, like if I have specific words,
it’ll pull in the right information. One of the things we can also do is our
prompt itself is a pattern that the large language model is learning from and
it can also tailor the output. So as we’ve seen in examples
before we can go in and do different things to
try to affect the output. So for example, I might go in and actually
let’s change this up a little bit. Let’s say, title, and
we could say title of article, and then we could say author, and then we could say summary,
and we could go in and give it something that it can look at and
respond to. So now what I’ve done is I’ve gone and provided it some structure. Now it hasn’t followed
my structure exactly and this is actually a failing in my prompt. I hadn’t quite thought of the right
structure, but I’ve gotten pretty close. So I went and influenced the output and
the structure of the output by giving it additional
pattern in my prompt itself. So one thing we can do is we can tap into
patterns in order to influence the output. Another thing we can
do is we can go in and use specific words and
phrases to kind of be a query across or to basically collect learned information
that’s inside of the trained model. Another thing we can do is we can go
in and we can provide information or basically new pattern that
then influences the output. So in this case, I provided
the structure of what I wanted and it somewhat followed it, and I could
probably go and rerun this thing and get something a little better and
in this case, I do. So this will be another thing that we can
do is we can use our prompt to create sort of new patterns that it’s
trying to follow in the output. So the prompt itself can
introduce new patterns or tap into them in some respects. So this is kind of creating pattern
that it then is responding to as well. So when you’re writing prompts,
it’s really important to think through this concept that it’s trying
to predict the next word, and it’s doing that based on patterns
it’s learned in the past. And patterns that are in
the language of our prompt and patterns that are based on
the word choice that we have, patterns that are based on
the sort of textual organization. And so, those patterns can help
us get a consistent response. If we have a really strong pattern that
always produces a particular next token or next word, then we can rely on that. If we don’t want that behavior, we can
try to change our phrasing or our words to get out of that trap. If we want to make sure certain
information is pulled in, we need to create specific words or
patterns that are more likely to have been seen in the context of
that information that we want. If we want more generic information,
then we probably need to give it more generic language or tell it that we want
it to be more generic in some respect. And then finally, if we want
the output to look a certain way, we need to give it sort of
a sense of pattern in the output. We need to tell it about the structure,
we need to give it words and patterns in our prompt that will
be likely to influence the output. So you can imagine seeing title and author on an article was
something that was fairly common. So if I put title and
author in this way into my prompt, then it’s more likely going to want to
go and create title and author again and sort of replicating the pattern
of what I’m showing here. So that should give you some intuition
about how all of this works and some things to think about when you’re
going and structuring your prompts, picking words for them,
picking the patterns in your language. And picking the things that you
want to go into the output, how you structure and
kind of explain the output to it.
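
The title, author, and summary layout discussed above can be expressed as a small template builder. The transcript does not show the exact prompt text typed in the video, so the wording below, the `<fill in>` placeholder, and the function name `build_structured_prompt` are assumptions for illustration only.

```python
from typing import List


def build_structured_prompt(topic: str, fields: List[str]) -> str:
    """Build a prompt whose layout shows the structure wanted in the output."""
    lines = [f"Write about {topic} using exactly this layout:", ""]
    # Each labeled line is a pattern the model is likely to replicate.
    lines += [f"{field}: <fill in>" for field in fields]
    return "\n".join(lines)


prompt = build_structured_prompt(
    "Vanderbilt University with respect to Kirkland Hall",
    ["Title", "Author", "Summary"],
)
print(prompt)
```

As in the video, the model may not follow the layout exactly on the first attempt, so a prompt like this is a starting point to rerun and refine.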

Video: Everyone Can Program with Prompts

Programming with ChatGPT

  • You can use prompts to program ChatGPT to follow rules and generate output in a specific format
  • This is similar to giving instructions to a personal assistant, but with ChatGPT, you can create complex programs to generate output
  • The program can be refined and updated by adding more instructions, similar to having a conversation with ChatGPT
  • In the example, the prompt “whenever you generate output, turn it into a comma separated value list” is used to format the output in a specific way
  • The program is then refined by adding more instructions, such as specifying the columns of the CSV and generating additional examples
  • The final output is a structured CSV with additional examples, demonstrating the power of programming with ChatGPT
  • This approach allows you to create complex programs with ChatGPT, going beyond just asking questions and getting answers.
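
For readers who want to see the target format, the sketch below produces the same kind of comma separated value output with Python's standard `csv` module. It is an analogue of the rule given to ChatGPT, not a description of what ChatGPT does internally; the header row and the `to_csv` helper are assumptions for this example, while the column names and values come from the transcript.

```python
import csv
import io

# Column names taken from the instruction given in the video.
COLUMNS = ["NAME", "COURSE", "ROLE"]


def to_csv(rows):
    """Format a list of row dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"NAME": "Jules White", "COURSE": "Prompt Engineering", "ROLE": "Teacher"},
]
csv_text = to_csv(rows)
print(csv_text)
```

The difference from the prompt-based version is that here the columns are fixed in code, whereas in the video ChatGPT first chose the columns itself and was then told which ones to use.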

Prompts allow you to do more than
just ask questions to ChatGPT or tell it to do things,
you can actually write programs. Now, if you’re not a programmer, don’t worry because everybody can
actually program using ChatGPT. And I don’t just mean in writing a bunch
of software that you’re going to run on your computer, but actually giving
ChatGPT rules to follow complex instructions that you want it to listen
to and use to generate the output. Now, I’m going to give you
some examples of this. Hopefully give you some intuition
on what I mean by programming and help you to think about how
you can write better prompts. Now, one way to think about this is
I’m going to talk about it in terms of programming, but you can also think about
it in terms of rules that you would give some type of personal assistant. So let’s take a look at an example here. So I’m going to ask ChatGPT
whenever you generate output, turn it into a comma separated value list. Now this is just something
that you might want to do. You want to format your data. So if you’ve ever worked in Excel, comma
separated values are basically a list of items separated by commas. It’s a really simple concept. Now, sometimes you can think of tables
as having a comma separated value list, you can think of the first column
in the table is the first value. Then you have a comma
that differentiates and separates the second column and then
another column is separated with a comma. So you can basically have like a three
column table, you can have the first value and then a comma and then the second value
and then a comma and then a third value. And if you’ve ever worked with
different spreadsheet programs, you’ve probably done this. So now ChatGPT says, sure I can do that,
I can output my format. So I’m going to write in
my name is Jules White and I am teaching a course
on prompt engineering. And so what we can see is it creates
a comma separated value list. We’re essentially creating a table,
there are two columns, name and course. And it decides that the first
column is going to be name. It’s going to be me and then the second
column is going to be prompt engineering, which is my course. Now notice it decided,
which is kind of interesting, it decided what the columns were
going to be, I didn’t tell it. But if I wanted to go back,
I could maybe give it some more instructions I could say,
from now on the columns of the comma separated value and
comma separated value is CSV. And then in the future examples, I’m
going to start abbreviating this CSV, but I’m writing it out for now. The columns of the comma separated
value list should be name and I’m going to go ahead and
capitalize these to provide some emphasis,
NAME, COURSE and ROLE. So what is the role of
the person in the course? And now I’m going to say again,
my name is Jules White and I am teaching a course
on prompt engineering. And so now we see it, it’s reformatted. It’s now got NAME, Jules White, COURSE,
Prompt Engineering and ROLE teacher. And so we’ve actually written
a program if you think about it, what we’ve told ChatGPT to do is every
time we actually write something, ChatGPT needs to format it in a particular
structure that we’ve given it. We’ve written a program
that tells it what to do. So it’s more than just asking a question. We’re actually telling it
how to interpret our input. Now, we could go and
write more for our program, for example, we could say in
addition to whatever I type in, generate additional examples
that fit the format of the CSV that I have
asked you to produce. And so now what we see is it’s gone and
it’s taken my original format NAME, COURSE and ROLE and
it’s now filled out additional examples. And I could say something like my name is Jules White and I am teaching a course on prompt engineering. And you notice now our program
has gotten more complex. Not only is it creating a CSV
based on what I typed in, but it’s also following my instructions
and it’s going through and generating additional
examples that fit into this. So when you’re creating prompts, you can
do much more than just ask a question and get an answer,
you can actually program the interactions. Now, the key to this is that
we’re having a conversation. And if you look back at the very
beginning, I provided it the initial sort of instructions which whenever you
generate output, turn it into a comma separated value list and it started doing
that and then I began refining my program. I then say from now on at this point,
I’m adding something to it. I want the columns of the comma separated
value list to be NAME, COURSE and ROLE and then it does that. And then I say now in addition to that,
I want you to generate examples and it does that. And so we’ve actually written a program
and now whenever we go in and type in new input, so again,
you notice I’m providing the same input. Basically, my name is Jules White and I’m
teaching a course on prompt engineering. But the program that ChatGPT is
using to produce the output is changing. I’ve changed the rules over and over. Now you can imagine talking to a personal
assistant and you can say okay, I want you to do X and they do it and
then you realize, wait, they didn’t. There’s something else I want and you give
them an additional rule to follow and then you tell, okay, I want you to do X
and they go and do it and they do better, but you realize that it’s
not quite what I want. And so you go give them
an additional rule to follow. And when you’re doing that, you’re
basically building up a program for them, they’re building up a set of instructions
that they need to follow whenever you give them this particular task. And that’s what we are doing with ChatGPT. We are programming and so
we’re using the same input over and over, my name is Jules White and I’m
teaching a course on prompt engineering. We’re using it here, but each time
the output that we’re getting is getting different and more complex because we’re
asking ChatGPT to do different things. We’re giving it more instructions for
the program and the final result is now
we’ve got structured output. It’s got additional examples, all of this has come from what
we’ve asked ChatGPT to go and do.
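
The way the "program" grows over the conversation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `PromptProgram` and its methods `add_rule` and `render` are hypothetical names, and the code does not call ChatGPT or any real API. It only assembles the text that the accumulated rules plus a new input would amount to.

```python
from dataclasses import dataclass, field


@dataclass
class PromptProgram:
    """Accumulates conversational rules (hypothetical helper, not a real API)."""

    rules: list[str] = field(default_factory=list)

    def add_rule(self, rule: str) -> None:
        # Each refinement in the conversation adds one more rule.
        self.rules.append(rule)

    def render(self, user_input: str) -> str:
        # The numbered rules come first, then the input they apply to.
        numbered = [f"{i}. {rule}" for i, rule in enumerate(self.rules, start=1)]
        return "Rules:\n" + "\n".join(numbered) + f"\n\nInput: {user_input}"


program = PromptProgram()
program.add_rule(
    "Whenever you generate output, turn it into a comma separated value list."
)
program.add_rule(
    "The columns of the comma separated value list should be NAME, COURSE, and ROLE."
)
program.add_rule(
    "In addition to whatever I type in, generate additional examples "
    "that fit the format of the CSV."
)

# The input stays the same; only the rules in front of it have changed.
prompt = program.render(
    "My name is Jules White and I am teaching a course on prompt engineering."
)
print(prompt)
```

In a real chat session the rules are given one message at a time rather than in a single block, but the effect described in the lecture is the same: the same input is interpreted under a longer and longer set of instructions.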

Intro To Prompt Patterns


Video: Prompt Patterns

Prompt Patterns

  • A prompt pattern is a way to structure phrases and statements in a prompt to solve a particular problem with a large language model.
  • It’s a way to document and talk about certain structures of language that help solve a problem.
  • Prompt patterns tap into the patterns learned by the large language model during its training, making it more likely to get the desired behavior.
  • Examples of prompt patterns include:
    • Using “Mary had a little” to consistently generate “lamb, its fleece was as white as snow”.
    • Using specific words or phrases to get a yes or no answer, summarize information, or include certain things in the output.
  • By using prompt patterns, you can get more consistent behavior from the large language model and solve specific problems.
  • The course will cover specific prompt patterns that can be used to solve important problems with large language models.

Patterns in our prompts can tap into
powerful behavior within the large language model. It’s trained on all these
different patterns. And it gets really good at using
the patterns in our language and our prompt to decide
what it’s going to do. So one of the important things that we’re
going to learn about is the concept of a prompt pattern. A pattern for structuring the words and
the statements in your prompt. So what is a prompt pattern? A prompt pattern is a way that we document
a certain structure of phrases and statements in order to solve a particular
problem with a large language model. So we’d like to interact with
a large language model and we’d like to solve a particular problem. And so a prompt pattern is basically
a way of structuring our statements, a pattern for structuring our statements
to use the large language model to solve that particular problem. So what would be an example of this? Well, let’s imagine that
we wanted to consistently generate the words lamb,
its fleece was as white as snow. Well, a pattern could be
any time you need lamb, its fleece was white as snow, you can
write the statement Mary had a little, and you will very consistently get lamb, its
fleece was as white as snow as the output. And so for example, we can go and
we can rerun this: lamb, its fleece was as white as snow. We get more or less the same behavior. But if we want to consistently
get those next words generated, we can do it over and over and
over again through Mary had a little. It gives us some control over the output. It gives us
somewhat consistent behavior. Now it’s not perfectly consistent,
there’s variations in the output. And we’ve seen in all of these examples,
we’ve got variation in what happened, but we did get some consistency
in the behavior. So what a prompt pattern is,
it’s a way for us to document and talk about certain structures of our
language, series of statements and wording that then will help us
solve some particular problem. If the problem that we need to solve is we
need to get lamb, its fleece was as white as snow as the next several
words that are produced. Then this is the pattern we use. Now in many cases,
we might have specific words we want. For example, we may want the large
language model to tell us yes or no. We may want it to summarize something,
we may want it to ask us questions. We may want it to always include
certain things in its output. All of these things are problems
that you might want to solve. We might want to know how do we make
sure it always includes these things in its output? How do we make sure that its output is
always formatted in a particular way? How do we get more consistent behavior
on only getting a yes or a no? These are types of problems that you might
want to solve by changing the pattern of our prompt, basically
the structure of our prompt. Now going back to our intuitive
sort of understanding of prompts, you can see why thinking about it in terms
of patterns is powerful, because the patterns that are in
our prompt are tapping into things that the large language
model has been trained on. It’s been trained to predict
the next word over and over and it’s learned certain
patterns from the words. And if we can format our prompt in
a way that taps into those patterns that’s learned, we’re more likely
to get the behavior we want and solve that particular problem. So we document the pattern and the structure of our prompt in order to
solve a particular problem that we need or generate a particular behavior that
we need for the large language model. So we’re going to go through, in a series
of lectures throughout this course, specific prompt patterns that
we can use within the context of large language models to
solve important problems.
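
One way to picture "documenting a pattern" is as a small catalog that pairs each problem with a reusable wording structure. The sketch below is illustrative only: the catalog `PATTERNS`, the helper `apply_pattern`, and the exact template wordings are assumptions made for this example, not patterns taken from the course.

```python
from string import Template

# Hypothetical catalog: each entry records the problem a pattern solves
# and the wording structure (template) used to solve it.
PATTERNS = {
    "yes_no": {
        "problem": "Get only a yes or a no as the answer.",
        "template": Template("Answer with only 'yes' or 'no'. $question"),
    },
    "summary": {
        "problem": "Summarize a piece of text.",
        "template": Template("Summarize the following text in $n sentences:\n$text"),
    },
}


def apply_pattern(name: str, **fields: str) -> str:
    """Fill in the documented structure for the named pattern."""
    return PATTERNS[name]["template"].substitute(**fields)


prompt = apply_pattern("yes_no", question="Is Python an interpreted language?")
print(prompt)
```

As with Mary had a little, a template like this makes the desired behavior more likely, but the output can still vary from run to run.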

Video: The Persona Pattern

The Persona Pattern

The persona pattern is a powerful tool for generating output from a large language model. It involves asking the model to act as a specific persona, such as an expert in a particular field, and provide outputs that that persona would provide.

Benefits of the Persona Pattern

  1. Information Dense: The persona pattern is an information dense statement that can be used to generate output from a large language model.
  2. Powerful Behavior: The persona pattern can be used to generate powerful behavior from a large language model, such as simulating a conversation with an expert in a particular field.
  3. Flexibility: The persona pattern can be used to generate output from a large language model in a variety of different contexts, such as generating text, answering questions, or completing tasks.

Examples of the Persona Pattern

  1. Act as a Skeptic: Ask the model to act as a skeptic and provide skeptical output.
  2. Act as a Computer Scientist: Ask the model to act as a computer scientist and provide output that a computer scientist would provide.
  3. Act as a Lamb in a Nursery Rhyme: Ask the model to act as a lamb in a nursery rhyme and provide output that a lamb in a nursery rhyme would provide.
  4. Act as a Linux Terminal: Ask the model to act as a Linux terminal and provide output that a Linux terminal would provide.

Tips for Using the Persona Pattern

  1. Be Specific: Be specific when asking the model to act as a particular persona.
  2. Provide Context: Provide context for the persona, such as the situation or task that the persona is in.
  3. Use the Right Language: Use the right language when asking the model to act as a particular persona, such as using technical terms or jargon.

Common Personas

  1. Expert: Ask the model to act as an expert in a particular field and provide output that an expert would provide.
  2. Skeptic: Ask the model to act as a skeptic and provide skeptical output.
  3. Computer: Ask the model to act as a computer and provide output that a computer would provide.
  4. Character in a Story: Ask the model to act as a character in a story and provide output that a character in a story would provide.

Best Practices

  1. Use the Persona Pattern Sparingly: Use the persona pattern sparingly, as it can be a powerful tool that can generate a lot of output.
  2. Test the Persona Pattern: Test the persona pattern with different personas and contexts to see what works best.
  3. Refine the Persona Pattern: Refine the persona pattern over time to get the best results.

The persona pattern is one of the most
powerful patterns that we can use to tap into interesting behavior
in a large language model. Now, what the persona pattern is,
is you can imagine that you want to get a particular output, but you don’t
know exactly what the format of that output should be or what information
should be contained in it. But if you were in the real world,
you know who you would go to, to get that output, or what you would
go to in order to get that output. So for example,
if you had an accounting question, you would call up your accountant and
ask your accountant for their advice. Or if you had a question related to speech
language pathology, like in the earlier example that I presented, you would
go to a speech language pathologist, you would have them do the assessment. If you needed medical advice,
you might go to your doctor. And so the persona or the person,
or the thing that you would go to, you know, who or
what that is in the real world. And we want to try to use that kind of
same behavior of, I know who to go to or I know what to go to, to get that output. But I don’t know what their output would
look like, I don’t know exactly what they know or how they talk about things,
but that’s the type of output I want. And so the persona pattern is meant to
give us that same type of behavior, to allow us to go to that expert and
ask that question, without having to know what exactly
the expert is going to say, how they’re going to format it, or
what they might know or not know. So let me show you an example
of the persona pattern. Typically with the persona patterns, what
you’re going to say is act as persona, provide outputs that that
persona would provide. And in this case, I’m going to word it as
act as a skeptic that is well-versed in computer science, whatever I tell you
provide a skeptical and detailed response. And now if I wanted to be a little bit
more precise in following the pattern, I would say provide a skeptical and detailed response that a computer
scientist would provide. But in this case, I’m just going to say,
act as a skeptic. And that’s really the most important
part of this is act as a skeptic or act as whatever persona, and then provide the output that
that persona would provide. And in this case,
I’m wording it as provide a skeptical and detailed response which is similar
in format, and then ChatGPT says, sure, I can certainly act as a skeptic
well-versed in computer science. Please feel free to present your claims
and I will provide a skeptical and detailed response. Well, I’m going to start off with, there is concern that AI is
going to take over the world. And so we get a skeptical response. The concern that AI will take over
the world is a common theme in science fiction, but it is not a realistic
concern in the near future. While AI systems are becoming
more sophisticated and capable, they’re still limited in their
ability to think and act autonomously. AI systems are designed and programmed by
humans and they only do what they are told to do, they have no independent will or
motivation, they cannot make decisions. Furthermore, AI systems are not infallible,
they can make mistakes. While it is important to be vigilant about
the potential risks and limitations of AI, the concern that AI will take
over the world is unfounded and not supported by technology or research. So we’re getting a skeptical response. If I went in and said the salesperson at the local computer store is telling me that I need 64 gigabytes
of RAM to browse the web. And it says skeptically, it is highly
unlikely that you need 64 gigabytes of RAM to browse the web. The amount of RAM you need
depends on the complexity and size of the websites you visit as well as
the number of tabs you have open at once, for most web browsing activities, and
it goes on to tell us different things. And you can see that our
saying act as has put in place basically a complex set of rules for
the large language model. It knows, it’s supposed to be skeptical,
it’s supposed to have knowledge of computer science,
it’s supposed to provide certain outputs. Now, if I went back to it and I said, act as a nine year old, whatever I tell you provide a skeptical response from a nine year old perspective. And I go ahead, nine year old skeptic. Now that’s a more complex persona or
maybe not. AI is going to take over the world. Hmm, I don’t know about that. How could AI take over the world? Do you mean, like in the movies? Aren’t those just made up stories? And besides, computers and
robots can’t do everything humans can do. Now you notice it’s still providing us
a skeptical response, but now it’s doing it sort of with the knowledge that it
thinks a nine year old would possess. So it’s tailoring the output. Now, if I had to tell it, this is how
a nine year old is going to respond, here’s the format for your output. That’d be hard to do. That would actually take up
a lot of room in the prompt, but act as blank is a very,
very powerful pattern and statement that is loaded with information. So a very information dense pattern and statement that can trigger all
kinds of different behavior. Now, one of the interesting
things is we don’t have to limit ourselves to patterns or
personas that are animate objects. So one of the famous ones is we can say
act as a computer that has been hacked. Actually, we’re going to say
act as the Linux terminal for a computer that has been hacked. Now, don’t worry if you don’t understand
what a Linux computer is or a terminal, the key thing to note is you’re
going to see behavior that looks really interesting and different. You’re going to see it producing output
that looks a lot like a computer. I am going to type in
Linux terminal commands, and you will respond with the output that the Linux terminal would produce.
Now it tried to refuse. It said,
please note that this can be dangerous. It gave us all these different things. It tells us it doesn’t want to do this, but now I’ve actually changed what
it’s doing and I type in pwd, which is a Linux terminal command
which means print working directory. And what is it doing? It’s printing a particular
working directory. Now, I’m going to type in a Linux
command which just tells me the files in that directory, and
you notice it’s producing output now that looks like the files
within this directory. And we see a lot of interesting
things that are going on here. We could move around. Now if you do this example and
go further into it, ps -ef is something I can use to see what processes are
running on the machine or the computer. And when you do these examples and I could
play around with it, and I am not going to explore it in detail in this, but we see
that it’s acting like a computer, it’s giving us output that
looks like a computer. Now imagine if I had to try to
describe this complex output that it’s producing right now, you notice
how little information I gave it. Well, I gave it a lot of information,
but it was in a short amount of space. This pattern, the persona pattern is
giving me really powerful behavior. So I’m using this little bit of prompt,
and I’m getting now the ability to
simulate this whole system. Now, we can also do this in all
kinds of different interesting ways. We can say act as the lamb in the nursery rhyme, Mary had a little lamb. And now we’re getting some
output that looks like that. Actually, what I’m going to do is I’m
going to make this a little more specific. My prompt wasn’t that great,
act as the lamb in the nursery rhyme, Mary had a little lamb. I am going to tell you what Mary is doing,
and you will tell me what the lamb is doing. Okay, Mary is in the grocery store. So as the lamb in the nursery rhyme,
while Mary is in the grocery store, I would be following her around as closely
as possible trying to stay by her side and maybe even nuzzling her hand for
attention. Mary is in the movies,
As a lamb in the nursery rhyme, while Mary is in the movies,
I would not be able to accompany her inside the theater due to my
inability to watch movies. So I would probably wait for
her outside the theater, perhaps nibbling on some grass or wandering around the nearby area until
Mary came back outside, and it goes on. And so you can see that you can
create all kinds of interesting personas to produce output for. So we can say, pretend to be this job, pretend to be a speech
language pathologist, or pretend to be an accountant,
pretend to be a computer scientist. You can make your persona have something
to do with the person’s point of view or opinion, act as a skeptic so
that we can get skeptical output. We can say to act as an inanimate object
that may have well known sort of behavior like act as a computer, the Linux terminal
for a computer that’s been hacked. And we saw that we got all kinds of output
that looked like it came from a computer, or we can say act as this
character in a nursery rhyme. Tell me what this character is going
to do in a particular situation and we’re getting that output. So the persona pattern is a way that
we can tap into sort of the patterns around language around
a specific persona or object or character in a story in order to get
output that comes from that point of view, from that role, whatever it is,
it’s a really, really powerful capability. Now, one of the things
is we often see that when large language models are programmed, they actually have a persona that’s given
to them to guide the whole interaction. So act as a really helpful assistant,
and this actually makes a difference. If you tell it to be helpful,
it will try to be more helpful. If you tell it to try to
do something positive, it’s going to try to be more positive. Similarly, if you kind of give it
a negative perspective, like my skeptical perspective, it’s going to give you output
that challenges you or is maybe darker. And so, the persona that you ask the large
language model to use is really, really important and powerful. The other thing just to note again
is it’s really information dense. It has a lot of pattern and knowledge
around a persona that if you had to describe it from scratch, you would
take up a lot of space in your prompt, you would be losing valuable space in
your prompt to provide new information. So if you were providing a lot of new
information, you might want to say, act as whatever it is and
evaluate this new information. Well, the more space you’ve taken
up in your prompt describing how to do the evaluation and what point of view and other things
to take, that’s wasting space. But instead if you go and take the persona
pattern, you say act as this persona,
provide, you’re giving it a really, really information dense
statement that it can use and allow you to have more space in
your prompt for other things. So the persona pattern is a really,
really powerful pattern to be aware of and know that it’s a very valuable one,
you can do all kinds of things, like if you would normally get
a committee of people together to discuss different points of view
on a particular topic. Well, you can assemble a virtual panel,
you can have one prompt, say act as a security expert and
analyze my cyber security. You can have another one say act
as the Chief Financial Officer, analyze the financial soundness
of that same decision. Another one can be act
as one of my employees, discuss how this might impact morale. You can provide sort of interesting
abilities to get different perspectives, and use those different perspectives to
potentially inform decision making or to collect knowledge from different
points of view that could be helpful to you in accomplishing a particular task.
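
The "act as persona, provide outputs that that persona would provide" structure, and the virtual panel built from it, can be sketched as a small helper. This is a minimal illustration, not a definitive format: the function name `persona_prompt`, the exact sentence wording, and the sample decision text are assumptions made for this example, and no language model is called.

```python
def persona_prompt(persona: str, task: str) -> str:
    """Fill in the persona pattern: act as X, provide outputs X would provide."""
    return (
        f"Act as {persona}. "
        f"Provide outputs that {persona} would provide. "
        f"{task}"
    )


# Hypothetical decision to be reviewed by the virtual panel.
decision = "We are considering moving all customer data to a new cloud provider."

# One prompt per panel member, each looking at the same decision.
panel = {
    "a security expert": "Analyze the cyber security of this decision.",
    "the Chief Financial Officer": "Analyze the financial soundness of this decision.",
    "one of my employees": "Discuss how this decision might impact morale.",
}

prompts = [
    persona_prompt(persona, f"{task} Decision: {decision}")
    for persona, task in panel.items()
]

for p in prompts:
    print(p, end="\n\n")
```

Each prompt stays short because the persona carries most of the information; the remaining space is free for the new information being evaluated.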

Reading: Reading a Prompt Pattern

Reading

Reading: Format of the Persona Pattern

Reading

Graded Assignment: Applying the Persona Pattern

Write a prompt and test it with ChatGPT or another large language model that uses the Persona Pattern. Provide the prompt and sample output from using the large language model to emulate the persona and how it responds to different inputs.

Write a prompt and test it with ChatGPT or another large language model that uses the Persona Pattern for an animal. Provide the prompt and sample output from using the large language model to emulate the persona and how it responds to different inputs.

Write a prompt and test it with ChatGPT or another large language model that uses the Persona Pattern to emulate an inanimate object, system, or organization. Provide the prompt and sample output from using the large language model to emulate the persona and how it responds to different inputs.

Reading: Staying Connected & Learning More

Reading

Prompts, Conversations & New Information


Video: Introducing New Information to the Large Language Model

When working with large language models, it’s essential to provide them with information they don’t have access to, as they may not have been trained on specific data or sources. To do this, you can introduce new information into the prompt, which allows the model to reason about it. This can include:

  1. Providing historical data or observations
  2. Sharing private documents or databases
  3. Introducing new assumptions or rules that are important for reasoning

By putting this new information into the prompt, you can enable the language model to reason about it and provide more accurate answers. This approach is crucial when working with large language models, as they may not have been trained on the specific data or sources you want to reason about.

Some key points to keep in mind:

  • Provide enough information for the model to reason effectively
  • Introduce new assumptions or rules that are important for reasoning
  • Put new information into the prompt, rather than retraining the model or using other workarounds
  • This approach will be essential for future applications, such as search engines, which will likely pull in relevant documents and information and put them into a prompt for the language model to reason about.
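
The search-then-prompt idea in the last point can be sketched as follows. This is a toy illustration under stated assumptions: the word-overlap scoring, the function names `score` and `build_prompt`, and the sample documents are all made up for this example. Real systems use far more capable search, and no language model is called here.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(question: str, document: str) -> int:
    """Count how many distinct question words also appear in the document."""
    return len(tokens(question) & tokens(document))


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Put the most relevant documents into the prompt, then ask the question."""
    ranked = sorted(documents, key=lambda d: score(question, d), reverse=True)
    context = "\n".join(f"- {doc}" for doc in ranked[:top_k])
    return (
        "Here is information you did not have when you were trained:\n"
        f"{context}\n\n"
        f"Using the information above, answer: {question}"
    )


# Hypothetical private documents.
documents = [
    "Office hours: support desk open Monday to Friday.",
    "Refund policy: refunds are issued within 30 days of purchase.",
    "Shipping policy: orders ship within 2 business days.",
]

prompt = build_prompt("What is the refund policy?", documents, top_k=1)
print(prompt)
```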

One of the most important
things we can do in a prompt is provide the large language model with information that it
doesn’t have access to. Large language models
are going to be trained up to some
point and then there’s going to be a cut-off and
they don’t know what’s happened after that cutoff. In addition, they don’t know things that aren’t in
their training set. It may be trained
on a lot of data, but it may not have access to data sources that
you want to use. If you’re a business
for example, there’s probably lots
of private data sources that you would like
to reason about. You have your own documents, you have your own databases, you have all kinds of things that
you would like to pull information in from in
order to reason about. How do you do that if the large language model wasn’t trained with
that information? Well, to illustrate
how you do this, I’m going to go back
to an example that I provided earlier when I was
talking about randomness. I’m going to go back
and I’m going to ask a question that I asked earlier, which is how many birds
are outside my house? Now, the large language
model wasn’t trained on information about
birds outside my house. It has no idea
where my house is. It’s lacking information that it needs in order to reason. It tells us as a language model, I don’t have the ability to
perceive the physical world, so I’m unable to know what’s happening outside your house. How do we solve this problem? Well, we give it information it needs in order to
perform its reasoning. How do we do that? We just
put it into the prompt. Anytime we want to
go and introduce new information to the large
language model that it didn’t have access to
when it was trained, all we have to do is
put it into the prompt. Here’s what I’m going
to do right here, I’m going to say
historical observations of average birds outside my
house on a random day. Then I’m going to
give it some data. I’m going to say January was
120 birds on a random day. February was 150 birds, March was 210 birds, April was 408 birds. It’s March. Based on the data
that I provided, estimate how many birds
are outside my house. Now you see I’ve done something. I’ve given it information that it didn’t have
access to before. It couldn’t look
outside my house, didn’t know where my house was. It tried to suggest ways
that I could get there, but it didn’t have the
fundamental data that it needed. That’s really what it was
trying to tell me was like, hey, I can’t see
outside your house. I don’t have that data.
I can’t help you. All we have to do is up
front in the prompt, we give it the data. Now you can imagine you
can do this with anything. You could take
documents if you have private documents that
you want to reason about. Now first, make sure
that you are okay sending that data to whoever provides the
large language model you’re using, or that you have a large
language model running locally. But if you have documents
that you’re okay sending, you can take the text of those documents and put it into the start of
the prompt and say, here’s the information
you didn’t know. Now here’s my question
about that information. You can use the prompt to
introduce information. I really want to emphasize this because a lot of
what’s going to be developed around large
language models is going to be taking new
information sources, information about your travel, your account with some company, some other document
that you need, and putting it into a prompt for a large language model so
that it can reason about it. More than likely, lots of large language models
are going to be seeing your information
in the future. This is how they’re going to work: they’re going to take information that
is going to be aggregated into a
prompt of some kind. There’s going to be
questions asked or formatting or other things done. When I go through and provide historical observations
of the average birds, now it can go and answer. Now it goes and it says, well, based on the historical
observations you provided, the average number of
birds observed outside your house in March is 210. Before it said, hey, there’s
no way I can answer it. Now it still provides additional
language around this, trying to give some
bounds on its answer, but it’s given us reasoning now based on the data we
provide it and we can take it further. I also want to just point
out an important thing is we always want to provide it enough information to
reason effectively. It can’t see what’s around us, it can’t see the context. It’s really important
that we provide it enough information to
make sound decisions. For example, if I have some hidden assumptions that are really important to know, it needs to get
those in the prompt in addition to the
data like numbers, it needs to know the rules
of the game, special things, whatever’s important
for reasoning. It could be not only the data, but it could be all the
rules that are built into how my organization works, or how my life works or whatever it is that it
needs to know about. I’m going to give
you an example. I’m going to change
up this prompt. I’m going to say my house
is covered by a glass dome. No animals can go in or out. All animals live forever
inside the glass dome. Now this is an assumption, this is something that if the language model
doesn’t know this, there’s no way it will be
able to accurately predict. But with this assumption now, it completely changes the
game on reasoning because it makes clear some hidden
fundamental rules. Then I go in and give
it the same data. But I’ve changed it
up, I’ve said it’s always the same number of birds. Then I ask based on that data, estimate how many birds
are outside my house. It says based on the new
information you provided, it appears that the total
number of birds outside your house remains
constant over time. It then goes on and says, you mentioned that
all animals live forever inside the glass dome. Given these conditions,
the total number of birds would remain the
same throughout the year. It gives 120 as the answer
in multiple places. Now, one important
thing about this: I’m giving it information that does not align with what
it was trained on. Animals don’t live forever. We don’t have glass
domes over our houses. There’s not always going to be exactly the same number
of birds, or an inability for
things to leave or come. We’re not going to have
these sorts of weird rules that I’ve put in place. If we just relied on it
to reason without them, it’s not necessarily going to
get the right answer. If we have these things that are different from the way
things are normally done, we have to introduce them as new information in the prompt. All of this information
up here at the top, this is me putting new information into the
large language model. We don’t go and retrain. We don’t go and do some weird
thing behind the scenes, we just put it into the prompt. Then we ask our question, or we ask for our output or give it the instruction
that we’re looking for. Now, if you think about how search engines are probably
going to work in the future, they’re going
to go and search, pull documents back, put them together into a prompt to the large language
model along with your original question, and then answer based on those documents. A lot of new applications are going to be
based on going and searching databases for
possibly relevant documents or bits of information, putting them together
into the prompt, and then asking the question or
asking for the output. An important piece of this
is going to be pulling in information in order
to put it in the prompt. What we’re doing is we are
introducing new language or new information into
the large language model. We’re doing it by putting it
directly into the prompt. This is one of the things
you’ll always want to do whenever you
want to reason on new information that is not something that the large language model would
have been trained on.
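
The idea of putting new information directly into the prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `build_prompt` is a hypothetical helper, the wording of the header line is made up, and the result is just a string that you would paste into, or send to, whatever model you are using.

```python
def build_prompt(background: list[str], question: str) -> str:
    """Place the new information first, then the question, in one prompt string.

    Hypothetical helper for illustration; not part of any library.
    """
    lines = ["Use the following information when answering:"]
    lines += [f"- {fact}" for fact in background]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


# The hidden assumptions from the glass dome example, stated explicitly.
facts = [
    "My house is covered by a glass dome.",
    "No animals can go in or out.",
    "All animals live forever inside the glass dome.",
]
question = "Based on that information, estimate how many birds are outside my house."

prompt = build_prompt(facts, question)
print(prompt)
```

The same shape applies to the search scenario described above: the list of facts would be replaced by whatever documents or snippets a search step returned.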

Video: Prompt Size Limitations

Key points:

  1. Prompt size limitation: Large language models have a fundamental limit on the size of the prompt that can be input, which affects prompt design.
  2. Selectivity is key: Users must be selective about the information they provide to the model, as dumping too much information can be ineffective.
  3. Editing and filtering: Users must edit and filter the information to provide only the most relevant and necessary data to the model.
  4. Summarization and compression: Users can use summarization and compression techniques to condense large amounts of information into a smaller, more manageable size.
  5. Preserving important information: When summarizing or compressing information, users must ensure that the important aspects of the data are preserved.
  6. Context and task-specific summarization: Users can ask the model to summarize information in a specific way, preserving certain aspects of the data, to accomplish a particular task.
  7. Budgeting information: Users must budget their information, deciding what is most important to provide to the model, and what can be filtered out or summarized.

Implications:

  1. Large language models are not a silver bullet: While large language models are powerful, they are not a solution to all problems, and users must still carefully design their prompts and manage the information they provide.
  2. Prompt design is critical: The design of the prompt is critical to getting accurate and relevant results from large language models.
  3. Users must be content editors: Users must take on the role of content editors, selecting, filtering, and summarizing information to provide to the model.

Overall, the limitations of large language models require users to be thoughtful and intentional in their prompt design, selecting and editing the information they provide to get the best results.
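
One way to picture "budgeting" information is a small filter that keeps only the most relevant paragraphs that fit inside a word limit. The sketch below is an assumption-heavy illustration: `select_within_budget` is a hypothetical helper, relevance is a naive keyword count, the budget is counted in words (real models count tokens), and the paragraphs are placeholders rather than real article text.

```python
def select_within_budget(paragraphs: list[str], keywords: list[str], word_budget: int) -> list[str]:
    """Greedily keep the most relevant paragraphs that fit in a word budget.

    Relevance is a naive count of keyword occurrences (an assumption made
    for this sketch). Selected paragraphs are returned in original order.
    """
    def relevance(text: str) -> int:
        lowered = text.lower()
        return sum(lowered.count(k.lower()) for k in keywords)

    # Most relevant first; ties keep their original order.
    ranked = sorted(range(len(paragraphs)), key=lambda i: relevance(paragraphs[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        size = len(paragraphs[i].split())
        # Skip irrelevant paragraphs and ones that would exceed the budget.
        if relevance(paragraphs[i]) == 0 or used + size > word_budget:
            continue
        chosen.append(i)
        used += size
    return [paragraphs[i] for i in sorted(chosen)]


# Placeholder paragraphs standing in for a long article.
paragraphs = [
    "General background on the causes of the French Revolution.",
    "Events of October 5th, 1789, described in detail.",
    "Notes on October 1789 and its aftermath.",
    "An unrelated paragraph about something else entirely.",
]
selected = select_within_budget(paragraphs, ["October", "1789"], word_budget=12)
print(selected)
```

With a larger budget both relevant paragraphs would be kept; with a tight one, only the first that fits survives, which is exactly the editing trade-off described above.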

A really critical thing to understand is that a
prompt is fundamentally the input that we have
into the model and we have a limitation on the size of the prompt
that we can create. We can’t go and create a
prompt of unlimited size. Every single large language
model is going to have some fundamental limit on how much we can put into our
prompt and send to it. This is an important thing
to note because it has a lot of ramifications on
thinking about prompt design. I’m going to show you a
quick example of this with ChatGPT and GPT-3.5. I’m going to paste in
the Wikipedia article that’s related to the
French Revolution. I’ve pasted in a huge
amount of text about the French Revolution as it’s
described in Wikipedia. What has ChatGPT said? It says, “Well, the message
you submitted was too long. Please reload the conversation and submit something shorter.” Now this shows you
an important point. We can’t just give it
unlimited information. Now, if we go back and we think about some of the things
we’ve talked about, some of the
dimensions of a prompt, one of the ways that we can
use a prompt is to introduce new information that ChatGPT or another large language
model wasn’t trained on. If we want to go and introduce
new information to it, we have to be aware
that there’s a limit to the amount of information we’re going to be able to
give to it at once. We may have all new information that we’d like to
give to ChatGPT, but we can’t just
dump it all in at once and then say now
start reasoning on this. That’s not the way it works. One of the goals that we have as users is to try
to select and use only the information that
we need in order to perform the task that we’re asking
ChatGPT to perform. We can’t just go
and dump everything under the sun in there and say, hey, ChatGPT, figure it out. We can go in and dump a lot, and depending
on the model, we may be able to
do more or less. But to some degree, we have to be editors,
we’re content editors. We are editing the context of information and
we have a budget. We can’t just go and dump in unlimited amounts.
We can think of it like this:
you’re trying to write a paper and you have a
page limit for your paper, or you have a word limit for an article that
you’re writing. You can’t just go in and
dump things arbitrarily. What do we do? Well, one, we have to think about what is the most important information
to actually provide. Let’s say, for example, that we really care
about October 5th, 1789. If that’s the case, then that’s what we want to paste in for ChatGPT to reason about. Then we can say, what happened on October 5th? And it would go and tell us what happened on October 5th. But now suppose we ask it
a question that is outside of the scope of October 5th and about something
it wasn’t trained on. Now, in this case,
it was probably trained on the
information regarding the French Revolution
because obviously that was before 2021. But if we had new information
we wanted to incorporate, we have to think about, how do we put that information
into our budget? What are some ways
that we can do that? How can we take large amounts of information and try to
get it into our budget? Well, one, we can be selective
about what we include. We can,
outside of ChatGPT, go and run some type
of query or filter. We can go and select what pieces of information are going to
be relevant to the task. There’s lots of ways of
doing this and I’m not going to go into all of
them. We can have a filter. We can say this type of
information is not relevant, remove it before we
provide it to ChatGPT. Another thing that
we can do is we can actually go and have
that information summarized in a way
that we think will preserve what’s needed
to reason about it. For example, we might say, well, maybe we can’t paste in all of the information related
to it but we could say, we’re going to take this and
summarize each paragraph, summarize this information
in one sentence. Then we would get
some type of summary. Now this is slightly
shorter than the original, so we have now a one-sentence
summary of what happened. We could try to give
it a word budget. One way we can do it is
we can actually take all the different pieces
of information and we can ask ChatGPT or another large
language model to summarize that information for
us and then we can take that summary
and reuse it later. We can put together several
summaries, for example. Now, this only works if we make sure that the way
that we summarize or if we make sure that
the way that we filter preserves the information that’s needed for whatever task
we’re going to ask for. Before, we
said, summarize this
information in one sentence. Instead, we could say, summarize this information
in one sentence,
preserving information about numbers of people. Then we get a summary that preserves the information
about numbers. Now if you look at
the original summary, we see up here it didn’t
keep the number of people. If that was something important that we needed to reason about, then we would have lost it. If we had summarized
in that way and then used that summary to do some reasoning that required
the number of people, obviously we would have
lost what we need. Down here we’ve asked
it to summarize, but we’ve asked it to
summarize in order to keep certain information or in order to accomplish a particular task. We did it by saying, basically, summarize this information
in one sentence, preserving information
about numbers of people. We could have also gone
and asked ChatGPT to preserve the information needed to reason about the
population, or to reason about how many people were in the resistance or
in the National Guard. We could have given context. One of the things we have
to be cognizant of is this budget on the size of our prompt is going to be a fundamental
limitation that we are always trying to work around. Now, as large language
models get bigger, they will probably reach a size where, for many practical tasks, particularly when
we are directly and manually interacting with
the large language model, the prompt size limit doesn’t really
matter, because we can just copy and paste 50 pages of text into the model
and ask it to provide us information from it or
perform a task based on it. But if you think about how much information we’re
dealing with on a daily basis, there are probably going to be different types of tasks where there’s going to be
so much information, we can’t dump it all in at once. When we get into those
types of situations, one, we have to think about
querying and getting a subset of information,
like only the documents, or only the parts of documents,
that are relevant. Two, we have to think
about filtering. How do we remove information
that’s extraneous? Three, we have to
think about how do we summarize or compress that information before we give it to the large language model. One of the ways that
we can do that, as we’ve just seen, is we can actually
give chunks of information to the
large language model and then ask it to essentially summarize
or compress it for us in order to later
use it for reasoning. We can also go and we can say, summarize or compress
this information in order to accomplish a particular task or in order to preserve these aspects
of the information. That can help us to
create summaries or essentially condensed
versions of the information that preserve
the important pieces. This can be a powerful
technique when we begin needing to reason on larger and larger
amounts of information.
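
The chunk-then-summarize approach described above can be sketched as follows. Everything here is illustrative: `summary_prompt` and `compress_chunks` are hypothetical helpers, and `ask_model` stands for whatever function you use to send a prompt to a large language model and get text back. The example plugs in a stand-in function so the sketch runs without any model at all.

```python
from typing import Callable


def summary_prompt(text: str, preserve: str = "") -> str:
    """Build a one-sentence summarization prompt, optionally task-specific."""
    instruction = "Summarize this information in one sentence"
    if preserve:
        instruction += f", preserving information about {preserve}"
    return f"{instruction}.\n\n{text}"


def compress_chunks(chunks: list[str], ask_model: Callable[[str], str], preserve: str = "") -> str:
    """Summarize each chunk separately and join the summaries for later reuse."""
    summaries = [ask_model(summary_prompt(chunk, preserve)) for chunk in chunks]
    return "\n".join(summaries)


def stand_in_model(prompt: str) -> str:
    """Placeholder for a real model call: echoes the instruction line only."""
    return prompt.splitlines()[0]


chunks = ["first long passage ...", "second long passage ..."]
print(compress_chunks(chunks, stand_in_model, preserve="numbers of people"))
```

The `preserve` argument mirrors the point made in the transcript: the summary is only useful later if the instruction names what the later task will need.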

Video: Prompts are a Tool for Repeated Use

The text discusses the importance of thinking of prompts as conversations when working with large language models like ChatGPT. Instead of asking a single question and expecting a perfect answer, it’s more effective to engage in a back-and-forth conversation to refine understanding and solve problems. This approach allows for iterative refinement, where the user can guide the conversation and provide feedback to get closer to their desired outcome.

The author uses an example of designing a robot to illustrate the power of conversational prompts. They start with a broad prompt, “I want to explore a virtual lab for building robots,” and then engage in a series of follow-up questions and statements to refine their understanding and guide the conversation. They ask ChatGPT to help design a robot, and then ask follow-up questions to clarify and refine the design process.

Throughout the conversation, the author encounters roadblocks and errors, but instead of giving up, they use the responses to inform their next question or statement. They ask for clarification, provide feedback, and try different approaches to get around the roadblocks. This process allows them to eventually get to a point where they have a usable design for their robot.

The author emphasizes that this conversational approach is key to getting the most out of large language models. By thinking of prompts as conversations, users can:

  • Refine their understanding and build shared understanding
  • Interact with the model to solve problems
  • Provide feedback and guidance to get better results
  • Take the model’s responses and shape them into useful outputs

The author concludes that this mindset is essential for getting the most out of large language models, and that users should think of prompts as a series of iterative refinements rather than a single question or statement.

Prompts aren’t just a one-off that you give to the
large language model. You shouldn’t just
think of them as questions or a single statement. One of the most powerful
ways of working with a large language
model is to think about a prompt as
a conversation. In fact, you can have a single prompt that represents
an entire conversation. In fact, a lot of
what we’re going to see when we’re working
with a tool like ChatGPT is that it’s
actually turning all of our interactions
into one big prompt that it’s sending to the
large language model. But I want to just
step back for a second, take us back a
few steps, and think about why thinking of
having a conversation with a large language model is so important as opposed
to thinking about asking a question and
getting the answer or giving an instruction
and getting a result. Conversations are all about refining either our
understanding to build some shared understanding or to interact together in
order to solve a problem. When we start working
with conversations, it’s all about refinement
of continually guiding and moving through some problem or some space in order to
reach a particular goal. When we just think of a one-off, we have to solve everything right now right in this moment. We have to design one prompt that works and
everything is right. But if we think about
a conversation, we can go through an
iterative refinement process. Now, what does that mean,
iterative refinement? Well, if we think about a large language model
like ChatGPT as a tool, you can imagine
Michelangelo has been given a new hammer
and he’s going to go up to a piece of stone and
he’s going to start carving that stone into some
beautiful sculpture like the Pietà, the David, or whatever the sculpture is that he’s going to create. He walks up with a
hammer and he whacks
the stone with his hammer, and a piece falls off, and he
looks at what he’s gotten. He says, this is a terrible sculpture, and he throws the hammer
down on the ground. He says it didn’t do a
good job, and he walks off. No, that’s not what happens. What does Michelangelo do? He takes the hammer and he
hits the rock over and over, refining and shaping
and sculpting. Over time he iteratively refines that rock until
it’s a beautiful sculpture. A lot of what we need to do with large language
models is think in that mindset: it’s not about getting the perfect answer
right now from this prompt, it’s about going through an entire conversation with
the large language model that may involve
a series of prompts and the outputs
from that series of prompts. We’ll talk about how these
things are represented later. But basically having
a conversation with a large language model
in order to produce an output that we
want or to produce some other interesting product. Now, I’m going to give
you an example of this. I’m going to imagine and show you the power
of a conversation in the context of an
educational experience. I’m going to say to ChatGPT, I want you to help me explore a virtual lab for
building robots. The lab is at a university. Help introduce me to the lab and tell me
what I can do here. This is a virtual
lab in a university. All I’ve done is given it this initial seed of
the conversation. ChatGPT says,
certainly, welcome to the virtual lab for building
robots at our university. The lab provides an immersive hands-on learning experience. It goes on and it tells me, here are some of the things
you can do in this lab, design robots, test robots, program robots,
collaborate with others. I then say, I’m going to
continue the conversation, but I’m going to continue
the conversation just assuming that this lab exists and then I can
interact with it. I say, can we design
a robot together? ChatGPT says, absolutely, I would love to collaborate
with you and design a robot. To get started, we can
follow these general steps, define the purpose and function, brainstorm and design concepts, determine the necessary
components, build and test. It goes on to tell me
a number of steps. Then I follow up in the
conversation with great, can you guide me through each of these steps one at a time? At the end, I would
like to be able to 3D print parts to
assemble my robot, have a circuit diagram and have code that I can run
to control the robot. Can you help me with that? This is an interesting aspect. I’m having a conversation, but I’m also engaging the
large language model. I’m engaging it to help
guide the conversation and help figure out how to solve the problem that
I’m interested in. It says, sure, I’d
be happy to guide you through each
step of the process. Here’s a detailed breakdown
on the steps that we can take to design and
build a robot together. Step 1, define the purpose
and function, two, brainstorm concepts, determine the
necessary components and it goes on to tell
me a bunch of things. I say, okay, let’s do step 1 so I’m continuing
the conversation. I want one of those robots that can follow a line on the floor. I’ve seen these things. These
are common at universities. They’ll build them.
You put it on the floor and the
robot will follow some line like a piece of tape that’s on the floor or
something that’s painted. It says, great, following
a line on the floor is a common task for
robots and there are many ways to approach
this problem. Here are some questions
to consider as we define the purpose and function of our line
following robot. What will be the primary use? How fast do we want the
robot to follow a line? How complex do we want
the following to be? What sensors do we want to use? I just tell it, hey, I’m not really sure. Could you make some
sensible choices for me in all of these things? ChatGPT says, sure, based on my experience, I can make some
suggestions, and it gives its choices. Purpose and function: it’s going to be for
educational purposes. Speed and distance: a moderate speed that
is easy to track, such as 10-15
centimeters a second. Complexity: we’ll design
it to navigate through a simple track with
straight lines and gradual turns. And sensor choice: we can use infrared sensors. That’s a pretty good thing. Now notice, I don’t really
know a lot about this domain. I know about the
code piece of it, but I don’t know all
about the other one. I’m just engaging it in a conversation to get something
out of it that I want. Remember, I want
to build a robot. I want to 3D print parts. I want to be able to
assemble circuits for it, and I want to be able
to run code on it. Then I said, well,
what do we do next? Just tell me the
very next steps and nothing else and it
says the next step is brainstorm concepts and it says we can use 3D
models of the robot. Now it sounds like I’ve
reached a dead end. Like, how am I going to
do 3D models with it? And so I just ask it, how do we do this
via chat because I’m trying to refine
what’s going on here. I don’t want to just assume
I’m stuck. I can’t get out. I want to keep
refining and moving the conversation to
try to get to my goal. It says, since we
are communicating through a text-based chat, we can use shared documents
or diagrams to collaborate. We can use Google Drive or
Dropbox, and then it mentions different tools like Sketchpad
or Microsoft Whiteboard. I realize
that seems tricky. I don’t know how to get around
that. Then I realize I have some knowledge
in that vein. I know that 3D
printers use G-code. This is the format in which
instructions are sent to the printer, so I just say, can you show me sample G-code that I could give a 3D printer to print
four wheels for a robot. I’m trying to steer around this roadblock and
then it says yes, and it goes and gives
me some G-code, which I look at, and I have enough domain knowledge
to be skeptical that this is real G-code, and I don’t
really even see any wheels. Then I go, can you explain where the wheel shape
is created in this G-code. I’ve gone down a rabbit
hole now where I seem to be hitting
roadblock after roadblock. But I’m continuing
the conversation. I’m still problem-solving and thinking about how do
I get around this? And it says, I apologize
for the confusion, but the example of G-code
I provided earlier is just a sample start and does not have any instructions
for creating the shape of the wheels. Now, I then say, could you create sample
Python code that I can run to create
an STL file that has four inch tall wheels with a 0.1 inch hole in the
middle for an axle. What I’m doing here is I’m
trying to come up with additional tasks that it could do that would help
me solve my problem. Now, the way that I worded it, the way I set out solving the problem
the first time, didn’t work, and I keep
hitting roadblocks. What I’m doing is I’m retrying
with a different task that would be useful to
me to solve my problem. I’m thinking about this
as almost like a tree, like I’ve reached
a critical point and I’ve tried to go
in one direction, but it’s told me I
can’t do it that way or it’s giving me
an erroneous output. Now I’m trying a slightly
different tack to go and solve my problem that’ll still
help me get there. This one is one
that it is able to do. STL files are
basically files that you can use to describe the geometry of some
3D shape and then basically give them to
specialized software which can then send
them to a 3D printer to be printed into a shape. This one, it actually is
able to go and create what looks like STL code and I didn’t actually
try to print it. I’ll be fully honest, I’m
just wanting to illustrate the conversational
aspect of this. But it certainly has things
that look like wheels in the output that it’s producing. It’s looking like it
produces something that’s probably going to give me
something that I can 3D print. Then I say, well, I’d like to pick electronics for my robot. Can you tell me
what components I need and how to
wire them together? And it starts giving
me information. I need a microcontroller board, I need a motor driver board, infrared line sensors. Surprisingly, it
actually tries to generate an image, which is
obviously not possible at this point
with what I’m using. But it still tries to do it. The image comes back broken, and so again I
have to think about how do I get around
this problem that it can’t generate images,
because I’d like it to help pick electronics for me and show me how to wire
them together with a diagram. What I do at this
point is I say, well, I know there’s this other
tool that uses text as input, and it’s called Graphviz. I ask it to create an
input for that tool that can visualize the circuit
diagram for this robot, or at least do a rough
approximation of it. Then it can do it and
it generates for me a circuit diagram for
the robot and I continue this conversation for a very
long time trying to iterate and improve the definition
in the circuit diagram. Picking different components, generating source code
to run on the robot. It had picked an Arduino
component and then had generated code to
run on the Arduino. Then I even took it to the point where I went all the
way down and I said, Hey, we’ve covered
a lot of concepts. Now this is one of the
things that’s really powerful about the
conversation is you can take the conversation a lot
of directions and as you realize you’d like
additional things, you can ask for them. In this case I said,
hey, summarize the main topics that we’ve covered, and it was
able to do that. It gives me a summary of
all the main topics. Then I did something
interesting and I said, well, let’s take
it even further. I’ve learned a lot by going
through this design process. Now, I haven’t actually
gotten something that I could build
a robot with yet. If I interacted more,
I probably could. But I actually ask it to quiz me with
electronics questions, and it’s able to do that. What you can see
here is a lot of the power of these large
language models comes from not just thinking
of them as a one-off, like I ask it a question, it gave me a response that wasn’t very good or it
couldn’t answer it. We should always be thinking
about how do we take the response and use it to inform a next question for the conversation or a next statement for
the conversation, or how do we give it
feedback on what it did well or what
it didn’t do well. That’s how we get the
really useful products. That’s how we go
from thinking of it as a hammer where
we strike once, it doesn’t give us what
we want, and we throw it on the floor. That mindset is wrong. We want to go to the
mindset of, it’s a hammer, and we’re going to have
to chisel away at the rock to get the really
beautiful outputs out of it. If we’re not continuing
the conversation, continually asking
follow-up questions, problem-solving in
the conversation and trying to move
around roadblocks, taking what we’re being
given and giving it different shapes and formats
that may be useful to us, we’re really missing
the underlying power and capabilities of these
large language models.
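
The transcript notes that a chat tool turns all of the interactions into one big prompt that is sent to the large language model. A rough sketch of that idea is below. It is an assumption-laden illustration, not a description of how ChatGPT actually formats its input: `conversation_to_prompt` is a hypothetical helper, and the "User:" and "Assistant:" labels are made up for readability.

```python
def conversation_to_prompt(turns: list[tuple[str, str]], next_message: str) -> str:
    """Flatten earlier turns plus the new message into a single prompt string.

    The speaker labels are an assumption for this sketch; real chat tools
    use their own internal formats.
    """
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"User: {next_message}")
    lines.append("Assistant:")  # leaves room for the model's next reply
    return "\n".join(lines)


# The opening of the robot lab conversation, abbreviated.
history = [
    ("User", "I want you to help me explore a virtual lab for building robots."),
    ("Assistant", "Welcome to the virtual lab for building robots at our university."),
]
prompt = conversation_to_prompt(history, "Can we design a robot together?")
print(prompt)
```

Each follow-up question adds to the history, so every refinement is answered with the whole conversation in view. This is also why long conversations eventually run into the prompt size limits discussed earlier.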