How to use ChatGPT other than asking questions

📌 Text Embedding using ChatGPT and LangChain

Who is this article for?

This article is for those who possess a basic understanding of ChatGPT and are interested in further application and customization of the language model. As an example, we will show how to use OpenAI's latest model for Text Embedding, i.e., creating a context vector from the text. Using those vectors, it is possible to do many things, such as text-based similarity calculations and searches.

Those who read this article are supposed to:

How to use ChatGPT for tasks other than conversation

There are two approaches to using ChatGPT for tasks beyond conversation. The first modifies the underlying model itself through additional training tailored to a specific task.

The second crafts the input given to the model and leaves the model itself unchanged. This approach is known as prompt engineering.

Extending ChatGPT itself would require ChatGPT Plugins, which were not yet publicly available at the time of writing, so in this article we try prompt engineering.

Prompt engineering and LangChain

Prompt engineering is a technique used in natural language processing to create effective prompts for machine learning models like GPT-3. It involves crafting specific instructions or questions to elicit desired responses from the model.
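
As a minimal sketch of what this looks like in practice, the snippet below builds a prompt from a fixed instruction and a variable input using plain Python string formatting. The template wording and the `build_prompt` helper are hypothetical and exist only for illustration; no model is called here.

```python
# Hypothetical prompt template: a fixed instruction plus a slot for the input text.
TEMPLATE = (
    "You are a helpful assistant.\n"
    "Summarize the following text in one sentence.\n\n"
    "Text: {text}\n"
    "Summary:"
)


def build_prompt(text: str) -> str:
    """Fill the template with the given text (illustrative helper, not a library API)."""
    return TEMPLATE.format(text=text.strip())


prompt = build_prompt("  LangChain connects language models to external data.  ")
print(prompt)
```

Changing only the instruction lines of the template changes the task the model is asked to perform, while the model itself stays untouched.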

This time, we tried a library called LangChain as a tool for prompt engineering. LangChain is a framework for building language model applications that connect to various sources of data and allow models to interact with their environment.

The framework provides modular components that are easy to use and implement, as well as use-case specific chains that allow for customization. The documentation is divided into two sections for components and use cases, with language-specific sections available for further guidance.

Preparing the Python environment

# install numpy, openai, and langchain
$ pip install numpy 
$ pip install openai
$ pip install langchain

Text Embedding using LangChain

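import os

# Import path used by the LangChain release current when this article was written;
# newer releases provide this class in the separate langchain-openai package.
from langchain.embeddings import OpenAIEmbeddings

# The OpenAI client reads the API key from this environment variable.
os.environ["OPENAI_API_KEY"] = "Put your OpenAI API key here."

embedder = OpenAIEmbeddings()

# embed_query returns the embedding as a list of floats.
text1 = "It is a sample sentence for Text Embedding."
vec1 = embedder.embed_query(text1)

print(vec1)
print(len(vec1))  # dimensionality of the embedding vector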

Calculating cosine similarity between embedded sentences

import os

import numpy as np
from numpy.linalg import norm
from langchain.embeddings import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = "Put your OpenAI API key here."

embedder = OpenAIEmbeddings()

text1 = "It is a sample sentence for Text Embedding."
text2 = "It is another sample sentence for Text Embedding."

vec1 = embedder.embed_query(text1)
vec2 = embedder.embed_query(text2)

# Cosine similarity: dot product divided by the product of the vector norms.
cosine = np.dot(vec1, vec2) / (norm(vec1) * norm(vec2))
print(cosine)
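
The cosine similarity formula can be checked without calling the API. The sketch below uses small hand-made vectors in place of real embeddings and ranks candidates against a query, which is the basic idea behind embedding-based search. The vector values and the `cosine_similarity` and `rank_by_similarity` helpers are assumptions made for illustration, not part of LangChain or OpenAI.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

print(rank_by_similarity(query, candidates))  # [1, 2, 0]
```

Because cosine similarity depends only on direction, the second candidate scores 1.0 even though it is twice as long as the query. The helper does not guard against zero-length vectors, which would cause a division by zero.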

Summary

In this article, we looked at how ChatGPT, a Large Language Model (LLM), can be used for tasks other than conversation.

One method we covered is prompt engineering, which shapes the input given to the model without modifying the model itself. We also used OpenAI's embedding model through LangChain to convert arbitrary text into a numeric vector, a process known as Text Embedding.

Text Embedding has a wide range of applications, including search and similarity calculations. In our experience, these embeddings appear to retain more context than those of other language models such as BERT, although this article does not include a quantitative comparison.

So, if you're looking to take your language processing applications to the next level, ChatGPT's powerful Text Embedding capabilities may just be the solution you've been searching for!
