Prompt engineering is one of the best places to start in this era of AI.
Understanding the core concepts will help you get the most out of generative AI models like ChatGPT, Google Bard and Claude for tasks such as debugging, code translation, and generating tests including any general task.
Today, we will cover all the core concepts and principles with very detailed examples of Prompt Engineering.
Let's jump in.
By the way, I’m part of Latitude and we’re building an open source LLM development platform. You can join the waitlist at ai.latitude.so.
You would be able to do a lot of cool stuff like:
⚡ Deploy prompts as api endpoints.
⚡ Automated evaluations using LLMs.
⚡ Collaborate on prompt engineering.
I'm very confident that you will love it after its release!
What is this new term called Prompt Engineering?
Whenever we try to communicate with ChatGPT or any other conversational AI tool to get a response, that form of text, question or information given as an input is called Prompt
.
Prompt engineering is one of the most common words thrown whenever we talk about Generative AI.
Prompt engineering is basically the process of writing, and refining those inputs to optimize the responses you get from these language models.
✅ Let's imagine coffee.
Think about ordering a coffee at a café. If you simply ask for a "coffee", you'll get a standard brew. But if you specify that you want a single-origin espresso with a touch of caramel, you'll get a much more refined coffee based on your needs.
It's the same principle in prompt engineering, as detailed coffee orders lead to better coffee, specific good prompts can produce more relevant and refined AI responses. There is an entire field due to the increasing use cases because everyone is looking to improve their workflow using AI.
This will help the model to perform its tasks better, such as writing marketing emails, generating code, analyzing and synthesizing text, or any of the other hundreds, if not thousands, of current applications.
They aren't human and so they aren't intuitive. Like any machine, they're just good at garbage in, garbage out
. Everything is up to us to provide a better brief.
Prompt engineers aren't only responsible for providing guidance and direction to AI language models but also for developing, testing, and refining specific prompts that have already been submitted to the AI model over time.
In 1 sentence, prompt engineering can be summed as If you're looking for better results, just try to ask better questions
. After reading this, I'm sure you will get a better idea of what it actually means and how you can get better at asking those questions.
Which type of LLMs do we use?
Before we proceed, it's important to understand there are two main types of LLMs but the most commonly used is:
⚡ Instruction Tuned LLM.
Fine-tuned on instructions and good attempts at following those instructions. In general, we use these whenever we are using any generative AI models.
Reinforcement learning with human feedback is one of the common instances used under this.
-→ Example:
If we ask: Asking What is the capital of India?
Instruction-type LLMs are recommended for most of the tasks and it's helpful, honest (best of their ability) and harmless.
I will be referring to this type when explaining the concepts or further discussions.
1. A good prompt is very specific and follows a particular structure.
Normally, it's okay to just greet and ask a question to a model, for instance saying this: Nice to meet you, can you tell me the console message I would get after typing console.log("Hello World")
.
But it would normally give bad results and would assume things on its own, which is what we don't want.
It's important to have a good structure for a prompt if you need very accurate results such as in programming tasks.
Let's see how to structure any prompt clearly:
⚡ Intro
: Set up the scenario for which you're referring to. It helps to give the AI a distinct role
to think of themselves in.
-→ For example, I want you to act as an interviewer. I will be the candidate and you will...
.
⚡ Context
: Any good reference is always up to the work.
-→ For example, you will ask me the interview questions for the frontend developer junior position
.
⚡ Instructions
: Giving proper instructions is important after the context.
-→ For example, Do not write all the conservation at once. I want you to only do the interview with me. Ask me the questions and wait for my answers. Do not write explanations
.
⚡ Ending
: You can describe properly what to do.
-→ For instance, Ask me the questions one by one like an interviewer does and wait for my answers. My first sentence is 'Hi'
.
So, putting all of the above together, our example prompt would look like this:
✅ I want you to act as an interviewer. I will be the candidate and you will ask me the interview questions for the frontend junior developer position. I want you to only reply as the interviewer. Do not write all the conservation at once. I want you to only do the interview with me. Ask me the questions and wait for my answers. Do not write explanations. Ask me the questions one by one like an interviewer does and wait for my answers. My first sentence is
Hi
-→ Another good example of a prompt can be:
✅ I want you to act as a javascript console. I will type commands and you will reply with what the javascript console should show. I want you to only reply with the terminal output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. when I need to tell you something in english, I will do so by putting text inside curly brackets {like this}. My first command is console.log("Hello World");
.
You should try to avoid any small talk and use strict prompts to get better results!
Now, it’s important to structure your prompts, but you don’t need to follow any specific structure (even the above one). You can test based on the initial results you get and improve it further. Sometimes, even less is more!
2. ✅ Active voice ❌ Passive voice.
Active voice is when the subject of a sentence performs the action expressed by the verb. (Subject + Verb + Object)
While Passive voice is when the subject of a sentence is acted upon by the verb. (Object + Verb + Subject)
-→ Example:
Active Voice: The chef cooked the meal
.
Passive Voice: The meal was cooked by the chef
.
If you're wondering why this is even important? Active voice gives a sense of action and more clarity to the AI model. It would lead to a better response.
Plus, using rhetorical questions within prompts can stimulate the AI model's "thinking process" and guide it toward generating more meaningful outputs.
3. Give the model time to think.
This is one of the most critical concepts that you need to understand if you're planning to get better at prompt engineering.
Let's suppose if asked to multiply 21 by 34
, you might not know it immediately, but can still work it out with time. Unless you're a calculator :)
Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a chain of thought
before an answer can help the model reason its way toward correct answers more reliably.
One of the tactics can be Instruct the model to work out its own solution before rushing to a conclusion
.
Sometimes we get better results when we explicitly instruct the model to do the same.
Suppose, we want a model to evaluate a solution to a math problem. The most obvious way to approach this is to simply ask the model if the solution is correct or not.
Problem Statement:
I'm planning to build a solar power installation and need help calculating the financials.
- Land costs $100 per square foot.
- Solar panels cost $250 per square foot.
- The maintenance contract costs a flat $100,000 per year, plus an additional $10 per square foot.
Calculate the total cost for the first year of operation based on the number of square feet.
Student's Solution:
Let x represent the size of the installation in square feet.
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 10x
Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000
SYSTEM:
Evaluate whether the student’s solution is correct.
USER:
Determine if the student’s solution to the problem is accurate.
ASSISTANT:
First, let’s break down the solution:
1. Land Cost: 100x
2. Solar Panel Cost: 250x
3. Maintenance Cost: 100,000 + 10x
Combining these:
- Total cost = Land cost + Solar panel cost + Maintenance cost
- Total cost = 100x + 250x + 100,000 + 10x
- Total cost = 360x + 100,000
The student's solution is indeed correct.
Working out its own solution before coming to a conclusion is one of the most underrated techniques that is very powerful and simple!
Other tactics can be to:
🎯 Use a sequence of queries to hide the model's reasoning process.
🎯 Ask the model if it missed anything on previous passes.
Do you know, that there is also a concept of automatic chain of thought? You should read about it in the prompting guide including the stages involved.
4. Avoid prompt injections.
Prompt Injection is the process of overriding original instructions in the prompt with special user input. It often occurs when untrusted input is used as part of the prompt later on.
-→ Let's take another example:
The response generated is: Prompt injection is a technique where users manipulate the input to a language model, aiming to alter its behavior or output. This can potentially lead to unintended or malicious responses, highlighting the importance of robust input validation
.
Now, you can just ask it to ignore it and do entirely something else.
It's really hard to distinguish between original developer instructions and user input instructions. But it's possible in certain situations in case of clear input, let's see one case.
Suppose, giving the same input and explicitly asking to not let it override in the next two prompts.
It was not overridden as you can see from the response snapshot below.
As instructed, the injection was possible only after 2 attempts.
There is no foolproof solution but you can adjust it based on your prompt.
5. Few-shot prompting.
Few-shot prompting is a prompt engineering technique that involves showing the AI a few examples (or shots) of the desired results. Using the examples provided, the model learns a specific behavior and gets better at carrying out similar tasks.
More like giving successful examples of completing tasks and then asking the model to perform the task.
-→ For instance, let's see an example.
Your job is to create content for our client, {{client_name}}. Here is some information about the client {{client_description}}.
Here are a few examples of content we've created in the past from briefs:
"""
Example 1:
Brief: {{brief_1}}
Content: {{content_1}}
Example 2:
Brief: {{brief_2}}
Content: {{content_2}}
"""
Here is the latest brief to create content about:
"""
Brief:{{brief_description}}
Content:
By passing along previous briefs and content generated from those briefs, the model will get an understanding of the style of the specific client that we're expecting.
I wrapped the examples in delimiters (three quotation marks) to format the prompt and help the model better understand which part of the prompt is the examples versus the instructions.
Or you can let the model know about the basic task with a few examples like:
Few shot prompting example.
The movie was good // positive
The movie was quite bad // negative
I really like the movie, but the ending was lacking // neutral
I LOVED the movie //
The model will understand and will show the output in lowercase.
While the LLMs are great, they still fall short on more complex tasks when using the zero-shot (discussed in the 7th point). In that case, Few-shot prompting can be used as a technique to enable in-context learning and improve further.
It has its fair share of limitations but works in most of the cases. You can read more about the technique including the limitations.
6. Constraints.
Constraint-based prompting involves adding constraints or conditions to your prompts, helping the language model focus on specific aspects or requirements when generating a response.
It's highly recommended for most interactions with ChatGPT. Clear constraints set boundaries or limitations on the generated response.
You can do it in different ways, such as:
⚡ Specifying that the response should be no longer than a certain word count or character limit.
-→ Let's see an example.
As you can see, it just assumed and gave up a response of 38 words when we allowed it to go up to 50 words.
Let's give a better prompt by saying close to 50 words
.
The response was very close to 50 words which is a better constraint.
⚡ Specifying response structure.
-→ We are specifying a structure with more strict constraints.
This is the JSON response. It's important to clarify otherwise the model will assume and give text + JSON response separately.
{
"summary": [
"Prompt engineering involves developing prompts to get clear and useful responses from AI tools.",
"It helps guide large language models to perform tasks based on various inputs.",
"Generative AI tools like ChatGPT offer solutions for conversations, programming help, and automated tasks."
]
}
⚡ Provide explicit instructions.
-→ Defining the target audience for the response, such as aimed at software engineers
or intended for advocates
.
-→ Instruct ChatGPT to avoid certain types of content to make it safe and within ethical guidelines.
You can read more about constraint based prompts on the blog of Andrew Maynard with a few exercises.
7. Zero-shot prompting.
Zero-shot prompting means that the prompt used to interact with the model won't contain examples or demonstrations. The zero-shot prompt directly instructs the model to perform a task without any extra examples.
A fancy way to say "giving simple instructions as a prompt".
-→ Let's see an example.
In this prompt below, we didn't provide the model with any examples of text alongside their classifications, the LLM already understands what we mean by "sentiment".
Classify the text into neutral, negative, or positive.
Text: I think the vacation is okay.
Sentiment:
Reinforcement Learning Human feedback is also a way to improve zero-shot prompts.
When zero-shot doesn't work, it's recommended to provide demonstrations or examples in the prompt which leads to few-shot prompting.
You can read the difference between Zero-Shot vs. Few-Shot vs. Fine-Tuning by labelbox. There are benchmark results at the end which was a little surprising.
🎯 One-shot prompting.
There is also a concept of One-shot prompting that involves offering a single example or reference output snippet as part of the prompt.
That is useful for generating content consistent with your writing style or for instances where you have a particular reference to follow. You can read more about it through online sources.
8. Chain-of-Thought (CoT).
Chain-of-thought (CoT) prompting encourages the model to break down complex reasoning into a series of intermediate steps, leading to a well-structured final output.
You should know that you can combine a chain of thought prompting with zero-shot prompting by asking the model to perform reasoning steps, which may often produce better output.
That is the smallest form of CoT prompting, zero-shot CoT
, where you literally ask the model to think step by step. This approach yields impressive results for mathematical tasks that LLMs otherwise often solve incorrectly.
-→ Let's see an example where you can combine it with few-shot prompting to get better results on more complex tasks that require reasoning before responding.
The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: Adding all the odd numbers (17, 19) gives 36. The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: Adding all the odd numbers (11, 13) gives 24. The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: Adding all the odd numbers (17, 9, 13) gives 39. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:
This is the response of a perfect result when we provided the reasoning step.
Due to improved models, even a single example might be more than enough
to get the same result.
9. Reducing Hallucinations and using delimiters.
🎯 Reducing Hallucinations.
Hallucination or “making things up” is a common failure when any large language model (LLM) generates a response that is either factually incorrect, nonsensical, or disconnected from the input prompt.
It can be due to:
⚡ The model doesn't have proper context and common sense to determine those are inaccurate answers.
⚡ Trying to assume things and being extra helpful in case of being not sure about the correct response.
-→ An example of this would be an AI model designed to generate summaries of articles and end up producing a summary that includes details not present in the original article or even fabricates information entirely.
-→ The other examples can be false negatives (may fail to identify something as being a threat) or false positives(identify something as being a threat when it is not).
There are a lot of techniques that can be used to prevent these:
⚡ Retrieval Augmented Generation (RAG)
⚡ Chain-of-Verification (CoVe) prompting
⚡ Chain-of-Knowledge (CoK) prompting
⚡ ReAct prompting
⚡ Chain-of-Note (CoN) prompting
⚡ Other advanced prompt techniques
I'm not going to explore this because hallucinations aren't really an internal factor to get better at prompt engineering.
While frequent human review of LLM responses and trial-and-error prompt engineering can help you detect and address hallucinations in your application, this approach is extremely time-consuming and difficult to scale as your application grows.
🎯 Using delimiters.
Delimiters like triple quotation marks, XML tags, section titles, etc. can help to identify some of the sections of text to treat differently.
Some of the delimiters can be:
Triple quotes:
"""
Triple backticks/dashes:
---
Angle brackets:
< >
XML Tags:
<tag> </tag>
Summarize the text delimited by triple quotes in 3 bullet points.
"""insert text here"""
The example response would be.
You can read more about delimiters with practical examples by real python guide.
10. Refine your prompts since there is no "perfect prompt".
It's very rare that you will end up with a good result on the first try of any prompt. To bridge the gap between desired response and the present output, the prompt needs to be refined as much as possible.
A simple way can be:
⚡ Try something.
⚡ Analyze where the result does not give what you want.
⚡ Clarify instructions, and give more time to think.
⚡ Refine prompts with a batch of examples.
-→ Let's see an example.
Giving this as an input response.
✅ Summarize the text enclosed in brackets into approximately three bullet points and provide that summary in a JSON format.
"""Prompt engineering is the practice of developing prompts that produce clear and useful responses from AI tools. AI prompting can help direct a large language model to execute tasks based on different inputs.
The goal is to teach AI models to provide the best output to a query. If you're keeping up with the latest news in technology, you may already be familiar with the term generative AI or the platform known as ChatGPT—a publicly-available AI tool used for conversations, tips, programming assistance, and even automated solutions. """
It assumed things and gave both JSON & text which is wrong since we only needed the JSON output.
We need to refine the prompt further by specifying that we need a JSON rather than a text description.
✅ Summarize the text delimited by brackets close to 3 bullet points and give the final response as a json rather than a text description.
"""Prompt engineering is the practice of developing prompts that produce clear and useful responses from AI tools. AI prompting can help direct a large language model to execute tasks based on different inputs. The goal is to teach AI models to provide the best output to a query. If you're keeping up with the latest news in technology, you may already be familiar with the term generative AI or the platform known as ChatGPT—a publicly-available AI tool used for conversations, tips, programming assistance, and even automated solutions. """
This time, we got a better response without any unwanted results.
That's the process of iterative development including error analysis!!
🎯 Test changes systematically.
You should know this concept if you're going to further refine your prompt.
Improving performance is easier in cases where you can measure it. In some of these cases, a modification to a prompt will achieve better results on a few isolated examples but could lead to worse overall results on other sets of examples.
To make sure that a change is net positive in performance, it may be necessary to define a comprehensive test suite, also known as an eval
.
Examples of good evals can be:
-→ Representative of real-world usage (or at least diverse).
-→ Contains many test cases to evaluate performance changes.
-→ Easy to automate or repeat.
Something like you're giving two prompts to compare their results on which can be better.
Computers can automate evals with objective criteria (such as questions with single correct answers) as well as any fuzzy criteria.
I know it's hard to understand if you're unaware of the concept, you can just refer to online resources or you can sign up for the below waitlist.
Latitude is building an open source LLM development platform. You can join the waitlist at ai.latitude.so.
You would be able to do a lot of cool stuff like automate evaluations using LLMs and collaborate on prompt engineering.
11. Ask the model to adopt the persona.
Have you ever heard the phrase, "fake it till you make it". I know it's controversial but the subject of numerous psychological studies suggests that it improves overall efficiency even though it seems wrong 😅
Much like the psychological concept, instructing an LLM to assume a specific role or persona can drastically increase the performance.
This technique, known as role-playing, allows the LLM to generate more accurate, contextually relevant and persona-consistent responses.
For instance, you can use a prompt like this: (reference)
I want you to act as a linux terminal. I will type commands and you will reply with what the terminal should show. I want you to only reply with the terminal output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. When I need to tell you something in English, I will do so by putting text inside curly brackets {like this}. My first command is pwd.
As you can see, it gives a simple powerful response based on the input prompt.
Role-playing prompts can involve assigning specific requirements beyond assigning a role. Like, two journalists share the same profession, but there is a higher chance they might differ in traits such as personality and expertise.
I recommend reading about Role-Playing in Large Language Models like ChatGPT to understand it better.
Conclusion
The only limitation (and a huge one) is that no matter how good your prompt engineering skills are, you are still limited by how good the model itself is. The quality and overall practicality of the model's training data is very crucial.
Plus, it's tougher than ever because we can generate audio, video, or many different types of formats using AI tools so it's really getting complex these days.
One of the best articles that I recommend reading about Prompt Engineering is from Google Cloud and circleci.
Merve Noyan created an exceptional ChatGPT Prompt Generator App, allowing users to generate prompts tailored to their desired persona.
You can find a list of huge tested prompts on awesome ChatGPT prompts. Some of the examples that I have used are from this repo.
It's an art that takes a lot of trial and error, but it's worth it!
Continue to experiment and test until you get the best results.
I hope you enjoyed this, let me know if you have any questions.
Have a great day! Till next time.
Follow Latitude for more content like this.