Everything you need to know about GPT-3

Mirko Vaars
Writing emails for you based on a few bullet points, translating complex legal documents into plain English, or generating a mockup website that has the look and feel of another one: OpenAI's newest language model, GPT-3 (short for Generative Pre-trained Transformer 3), really is a quick learner. But why has GPT-3 been so popular lately?

OpenAI

The organisation that developed GPT-3 is called OpenAI. Its mission is to create an Artificial General Intelligence (AGI) that benefits humanity. An AGI can learn any intellectual task that a human being can. OpenAI was founded in 2015 by a group that included Elon Musk, Sam Altman, Greg Brockman and Ilya Sutskever. Their combined knowledge from their various backgrounds (Tesla/SpaceX, Y Combinator, Stripe and Google Brain respectively) and a pledged launch investment of $1 billion from Peter Thiel, Reid Hoffman, Infosys, AWS and YC Research enabled them to start as a non-profit organisation that has proved capable of producing state-of-the-art developments in AI. Microsoft’s investment of another $1 billion in July 2019, together with access to Azure, enabled them to train the largest language model yet, called GPT-3.

GPT-3

GPT-3 is a so-called Generative Pre-trained Transformer. In normal English, this means:

  • It is a certain type of language model (the “transformer”), an architecture introduced in 2017
  • It has been trained on a massive dataset, so that the users don’t have to
  • It is trained to generate text based on the input that the user gives

So, what does it actually do?

The main task of GPT-3 is to predict which word should come next, given an input text. It looks at the (partial) sentence that it has been given and analyses the relations between the words to build up context within that sentence. Based on this, it chooses the next word that is statistically most likely. In the end, GPT-3 is a very large statistical model.
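
To make the idea of "the statistically most likely next word" concrete, here is a minimal sketch in Python. It is not how GPT-3 works internally: it only counts which word follows which in a tiny sample text, whereas GPT-3 uses a transformer that looks at the whole context. The function names and the sample text are made up for illustration.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy text; a real model is trained on billions of words.
sample_text = "the cat sat on the mat and the cat slept"
counts = build_bigram_counts(sample_text)
print(most_likely_next(counts, "the"))  # prints "cat"
```

In the sample text, "the" is followed by "cat" twice and by "mat" once, so "cat" is chosen. GPT-3 makes the same kind of choice, but bases it on far more context than a single preceding word.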

Courtesy of twitter @SamanyouGarg

Courtesy of twitter @jsngr

How did it learn to do this?

It was trained on roughly 300 billion tokens (words and pieces of words). These come from a curated and filtered set of articles and webpages, such as news websites and Wikipedia. To give a sense of the scale, Wikipedia made up only about 3% of the training data. During training, it executes the following steps millions of times:

  1. Select a random piece of text from the dataset
  2. Hide the words that follow a chosen point in that text
  3. Try to generate the hidden part
  4. Evaluate whether it was correct or not
  5. If it is wrong, adjust the parameters in such a way that the next time it gets this piece of text, it is more likely to output the right answer.
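
The steps above can be sketched in a few lines of Python. This is a toy under strong assumptions, not OpenAI's training code: the "model" is just a table of scores per preceding word, the text is a made-up sample, and the parameters are nudged after every guess in proportion to how wrong the guess was. All names (train, predict, the learning rate lr) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(tokens, steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    # The "parameters": one row of adjustable scores per preceding word.
    weights = [[0.0] * len(vocab) for _ in vocab]
    for _ in range(steps):
        # Step 1: select a random position in the text.
        pos = rng.randrange(len(tokens) - 1)
        prev, target = index[tokens[pos]], index[tokens[pos + 1]]
        # Steps 2-3: the next word is hidden; predict a probability for each candidate.
        probs = softmax(weights[prev])
        # Steps 4-5: compare with the real next word and adjust the scores
        # so that the right answer becomes more likely next time.
        for j in range(len(vocab)):
            error = probs[j] - (1.0 if j == target else 0.0)
            weights[prev][j] -= lr * error
    return vocab, index, weights


def predict(word, vocab, index, weights):
    """Return the word with the highest score after `word`."""
    row = weights[index[word]]
    return vocab[row.index(max(row))]


# Hypothetical toy text standing in for the real dataset.
tokens = "the cat sat on the mat . the dog sat on the rug .".split()
vocab, index, weights = train(tokens)
print(predict("sat", vocab, index, weights))  # prints "on"
```

GPT-3 follows the same predict, compare, adjust loop, only with 175 billion parameters instead of a small table.
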
What can it do?

The researchers at OpenAI had the intuition that GPT-3 would need to learn a variety of skills in order to generate human-like text. It turns out the researchers were right. The following skills were learned during training without supervision (that is, without being explicitly taught):

  • Story completion using common-sense reasoning
  • Answering trivia questions
  • Translation
  • Arithmetic

One skill it learned was not anticipated by the researchers:

  • Code generation

They were quite baffled that it could learn to generate simple applications.

How to teach GPT-3 a new task?

GPT-3 excels at so-called few-shot learning. This means we give it a “few” examples of an input and the correct output. With fewer than five examples, most tasks are already taught and GPT-3 can complete them with a high success rate. These tasks are of course limited to the generative kind, especially tasks based on the skills listed above.
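
In practice, "teaching" a task this way means nothing more than putting the examples into the text that GPT-3 is asked to continue. The sketch below shows one possible way to assemble such a prompt in Python. The "Input:"/"Output:" labels and the function name are assumptions chosen for illustration, not a format prescribed by OpenAI, and the sketch deliberately stops before sending anything to a model.

```python
def build_few_shot_prompt(examples, new_input, instruction=""):
    """Assemble a prompt from (input, output) example pairs plus a new input."""
    lines = []
    if instruction:
        lines.append(instruction)
        lines.append("")
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final output is left open for the model to complete.
    lines.append(f"Input: {new_input}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical translation task taught with two examples.
examples = [("cheese", "fromage"), ("house", "maison")]
prompt = build_few_shot_prompt(
    examples, "book", instruction="Translate English to French."
)
print(prompt)
```

Because the model simply continues the text, the pattern set by the examples steers it towards answering the last, open "Output:" in the same way.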

Application areas

GPT-3 is a breakthrough in terms of quality and versatility. But what application areas does it touch?

Search Engines

With the trivia answering skills, GPT-3 can be used as a search engine for facts or trivia-like questions. As seen in the video below, it can answer many questions without the need to actually search the internet for the answers.

Fact search engine: https://twitter.com/paraschopra/status/1284801028676653060

Making a layman an expert

It enables people without design, programming or finance skills to interact with systems from these fields of expertise. As can be seen in the examples below, a conversational request to GPT-3 can be translated into SQL (code to query databases), a working app, a website design or an edit to a balance sheet based on your expenses.

Making SQL accessible: https://twitter.com/_bhaveshbhatt/status/1286294242579513351

Balance sheet: https://twitter.com/itsyashdani/status/1285695850300219392

Layout coder: https://twitter.com/sharifshameem/status/1282676454690451457

Creating apps: https://twitter.com/sharifshameem?lang=nl

Assisted writing

Besides actually writing your emails for you, it is also able to correct grammar mistakes.

Helping people

There are many people in our world who need more human-like conversation than they currently get. Lonely, old, depressed, traumatised or mentally impaired people could benefit from a proper human-like conversation. Current applications that provide such AI interactions lack common-sense reasoning, which breaks the immersion. Ask such a tool “How many eyes does a human have?” and it may well answer “eight”. The improved common-sense reasoning that GPT-3 brings can make the experience more immersive for these people and help them out.

So... GPT-3

All in all, GPT-3 is pretty good at a lot of tasks. It has made huge steps towards solving the most difficult natural language challenges and is very versatile. We believe it will be used in settings such as support centres, to help people make mockups easily, to give access to systems without requiring expert knowledge, and to help people who need extra human-like conversations.

OpenAI offers a commercial API for people or organisations who want to use GPT-3 in their business or experiment with it.
