Date: 2023-07-01
Thanks to ChatGPT and similar platforms, the rise of artificial intelligence has been one of the most headline-grabbing subjects of 2023. Not a day goes by without a new article coming out about some way AI tech spells either doom or salvation for the creative fields, your job, or humanity.
And if you’ve been reading these articles, you might have noticed one particular word being thrown around by tech executives recently: “corpus.” Reddit’s CEO has mentioned it; so has Wikipedia’s founder Jimmy Wales; and so has Microsoft co-founder Bill Gates.
Here’s what it means, and why it’s critical to understanding how artificial intelligence platforms like ChatGPT and Midjourney operate.
What is an AI corpus?
Those who studied Latin in school will immediately know that corpus means “body.” (The modern word for a dead body—“corpse”—is derived from corpus.) Others might recognize the word corpus because of its use in a legal mechanism still in place today: habeas corpus. This phrase literally means “you should have the body” and it ensures that anyone arrested has the right to appear before a judge (thus, the judge “has the body” of the person arrested) to determine if that arrest is lawful.
But when used in the artificial intelligence realm, the term “corpus” doesn’t refer to a physical body at all. Instead, it refers to the metaphorical “body,” or collection, of data that was used to train the AI. This corpus is the material the AI learns from in order to become proficient at whatever it was designed to do.
Every AI’s corpus will be different, because it is humans who decide what kind of data to train an AI on, and that choice depends on what they want the AI to be proficient in.
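To make the idea concrete, here is a minimal Python sketch of a "corpus" and a program that learns from it. Everything in it is an illustrative assumption: the three sentences are made up, and the function names `train_bigram_model` and `predict_next` are hypothetical. It is a toy word-counting model, not how ChatGPT or Midjourney actually work, but it shows the basic point that what the program "knows" comes entirely from the corpus it was given.

```python
from collections import Counter

# Hypothetical toy corpus: three short, made-up documents.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]


def train_bigram_model(documents):
    """Count, for each word, which words follow it anywhere in the corpus."""
    model = {}
    for doc in documents:
        words = doc.split()
        for current, following in zip(words, words[1:]):
            model.setdefault(current, Counter())[following] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigram_model(corpus)
print(predict_next(model, "sat"))    # on
print(predict_next(model, "piano"))  # None
```

The word "piano" never appears in the corpus, so the model has nothing to say about it. Change the sentences in `corpus` and the same code will make different predictions, which is why the choice of corpus matters so much.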
Types of corpora
There is no limit to the types of corpora (the plural of corpus) that can exist. What makes up an AI’s corpus simply depends upon what the human creator of the AI intends for it to do.