1A3ORN

GPT-3 Can Solve Some of ARC

Created: 2023-01-03
Wordcount: 0.2k

I have found that the code-davinci-002 version of GPT-3 can solve roughly 4.5% (evaluation dataset) to 9.5% (training dataset) of Chollet's Abstraction and Reasoning Corpus (ARC).

ARC presents a series of image-based abstraction and reasoning challenges. In each challenge, there are 2-4 examples of a mapping from an input image to an output image; the algorithm must then map a new input image to its unseen output image. The small number of examples is meant to make the task difficult for deep learning, as is the all-or-nothing scoring: a task counts as wrong unless every pixel of the output is correct.
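To make the all-or-nothing scoring concrete, here is a minimal sketch. It assumes a task is stored as a dictionary with "train" and "test" lists of input/output grids, each grid being a list of rows of integer color codes. The toy task and the name `is_solved` are made up for illustration; this is not the official scoring code.

```python
from typing import List

Grid = List[List[int]]


def is_solved(predicted: Grid, expected: Grid) -> bool:
    """All-or-nothing check: the shape and every cell must match."""
    return predicted == expected


# A toy task invented for illustration: flip each grid left to right.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0]], "output": [[0, 3, 3]]},
    ],
    "test": [{"input": [[0, 5], [0, 0]], "output": [[5, 0], [0, 0]]}],
}

expected = toy_task["test"][0]["output"]
almost = [[5, 0], [0, 1]]  # a single wrong cell

print(is_solved(expected, expected))  # exact match counts as solved
print(is_solved(almost, expected))    # one wrong cell fails the whole task
```

There is no partial credit here: a prediction that is off by one cell scores the same as one that is entirely wrong.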

GPT-3 is an autoregressive sequence model. As such, it was never meant to view images.

Nevertheless, using a very naive conversion technique -- mapping each image to a series of rows like 0 1 0 1 0 1 \n -- you can get 38 of the examples in the 400-case training set right, and 18 of the examples in the (harder) 400-case evaluation set right.
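The conversion described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact code behind the numbers: the "Input:"/"Output:" labels, the function names, and the exact whitespace between cells are hypothetical choices. No API call is shown; the prompt is just a string to hand to a completion model.

```python
from typing import Dict, List

Grid = List[List[int]]


def grid_to_text(grid: Grid) -> str:
    """Serialize a grid as one line per row, cells separated by spaces."""
    return "".join(" ".join(str(cell) for cell in row) + "\n" for row in grid)


def text_to_grid(text: str) -> Grid:
    """Parse a completion back into a grid, stopping at the first blank line.

    Raises ValueError if a token is not an integer.
    """
    rows: Grid = []
    for line in text.strip().splitlines():
        if not line.strip():
            break
        rows.append([int(token) for token in line.split()])
    return rows


def build_prompt(task: Dict[str, List[Dict[str, Grid]]]) -> str:
    """Concatenate the worked examples, then the test input to be completed."""
    parts = []
    for pair in task["train"]:
        parts.append("Input:\n" + grid_to_text(pair["input"]))
        parts.append("Output:\n" + grid_to_text(pair["output"]) + "\n")
    # Hypothetical labels: the prompt ends where the model should continue.
    parts.append("Input:\n" + grid_to_text(task["test"][0]["input"]))
    parts.append("Output:\n")
    return "".join(parts)


demo_task = {
    "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
    "test": [{"input": [[1, 1], [0, 0]]}],
}
prompt = build_prompt(demo_task)
print(prompt)
```

The model's completion would then be parsed with `text_to_grid` and compared cell by cell against the expected output grid.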

It appears that code-davinci-002 is not available for fine-tuning, but I'd be interested to see the gain on one dataset from fine-tuning on the other.