Credit: Morgan, Flickr
Artificial intelligence and machine learning are all the buzz in marketing.
These fields draw from statistics and computer science, and they're dense with technical language.
We've worked on AI-related content for a handful of clients, and we've spent a lot of time learning the terminology. This quick-reference glossary, which is based on some of our internal cheat sheets, is part of our effort to share what we've learned, along with our articles on how to talk to your data scientist and how machine learning will turn the marketing world upside down.
We’ll update this glossary from time to time, so please send us your requests.
Algorithm
An algorithm is a set of specific mathematical or operational steps used to solve a problem or accomplish a task.
Algorithms play a central role in AI, transforming or analyzing data. That could mean performing regression analysis, classifying customers, or finding relationships between SKUs. Each of these analytical tasks would require a different algorithm.
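To make "a set of specific steps" concrete, here is a minimal sketch of an algorithm that classifies customers. The segment names and spending thresholds are hypothetical, chosen only for illustration; a real classification algorithm would usually be more involved.

```python
def classify_customer(total_spend, orders):
    """Assign a customer to a segment using fixed, explicit steps."""
    # Step 1: a customer with no orders is treated as a prospect.
    if orders == 0:
        return "prospect"
    # Step 2: compute the average order value.
    average_order = total_spend / orders
    # Step 3: compare against hypothetical thresholds.
    if average_order >= 100:
        return "high value"
    if average_order >= 25:
        return "mid value"
    return "low value"


# Hypothetical customers: name -> (total spend, number of orders)
customers = {"ana": (450.0, 3), "ben": (60.0, 4), "cy": (0.0, 0)}
segments = {
    name: classify_customer(spend, orders)
    for name, (spend, orders) in customers.items()
}
print(segments)
```

The same inputs always produce the same segment, which is what separates a plain algorithm like this one from a system that learns from data.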
Artificial intelligence (AI)
- Artificial narrow intelligence (ANI): ANI covers all of the public applications of AI currently available. It's called "narrow" because an ANI system's knowledge is specific to a limited set of topics. The AI that navigates a self-driving car can't also hold a conversation with the driver.
- Artificial general intelligence (AGI): AGI means the system has intelligence across a broad range of subjects. It is sometimes confused with superintelligence, which describes a system that would far exceed human ability. True AGI does not currently exist, but it is what many people think of as AI because of pop-culture icons like Skynet, from the Terminator movie series. If AGI does come online, it will likely change our current frameworks.
Artificial-intelligence marketing (AIM)
AIM is a loose categorization of products, technologies, and services that use AI to enhance or automate marketing. This term has been around for a while, but it remains to be seen if it will catch on or be replaced.
- Google TensorFlow (TF): The Google Brain team made its secret sauce, TensorFlow, freely available to the public in 2015. TensorFlow is a machine-learning library that powers RankBrain, which handles a good portion of Google searches. Because it's open source, developers can use and modify its algorithms to build and train their own neural networks.
- IBM Watson: Watson may be the most recognizable brand name in commercial AI, having famously won the quiz show Jeopardy! in 2011 against past champions. Under Armour's UA Record app uses Watson to power what it calls a cognitive-coaching system, from recommending run paths and race preparation methods to helping shape an athlete's meal plan.
- Salesforce Einstein: Salesforce introduced Einstein in 2016; it is already built into its CRM product. Einstein, it says, is designed to take over mundane data entry tasks and enable sales representatives to better focus on their customers.
Artificially generated content (destined to inherit the AGC acronym)
Artificially generated content is a burgeoning field that relies on natural language generation (NLG) technology. The Associated Press uses an NLG product called Wordsmith to automatically write simple news stories, and Credit Suisse uses Narrative Science to generate commentary on research reports.
Fast-maturing martech applications of artificially generated content include the mass customization of communications based on specific details about each customer.
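The simplest form of this mass customization is filling a message template with each customer's details. The sketch below uses Python's standard `string.Template`; the message wording, field names, and customer records are hypothetical. Commercial NLG products go well beyond template filling, so treat this only as an illustration of the idea.

```python
from string import Template

# Hypothetical message template; $-prefixed names are filled per customer.
MESSAGE = Template(
    "Hi $first_name, your $product is due for renewal on $renewal_date."
)


def personalize(customer):
    """Return the template text with this customer's details filled in."""
    return MESSAGE.substitute(customer)


# Hypothetical customer records.
customers = [
    {"first_name": "Ana", "product": "Pro plan", "renewal_date": "March 1"},
    {"first_name": "Ben", "product": "Basic plan", "renewal_date": "April 15"},
]
messages = [personalize(customer) for customer in customers]
for message in messages:
    print(message)
```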
Bots
Bots are programs that interact directly with customers via natural language processing (NLP, see below). Many companies are already using chatbots for customer support, but the category is loosely defined, and new applications are cropping up. Marketers predict that they will embrace chatbots as enthusiastically as they did apps circa 2013.
However, Microsoft's experimental Twitter-based chatbot, Tay, serves as a cautionary tale. Tay made news in early 2016 when Twitter users quickly trained it to spew hate speech.
Computer vision
Computer vision is a field of technology that lets computers understand what they are seeing. With computer vision, apps can recognize and modify faces, as seen in Snapchat filters. It helps drones avoid obstacles. It helps self-driving cars navigate traffic.
Manufacturers have used computer vision for many years for quality control. Now, as the technology is getting smarter, retailers are using it for merchandising purposes, such as prompting a retail kiosk to generate a coupon for an item a customer is examining on a nearby shelf. Facial recognition also has lots of potential for customer service applications.
Corpus (plural: corpora)
A corpus is the body of text, images, or sounds used to "train" a neural network (giving it labeled examples from which to learn). From the perspective of the brand marketer, this may be the most important input we contribute to the neural network, as it shapes what the AI platform learns.
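Data efficiency
Data efficiency refers to how much data a machine-learning system needs in order to learn a task well. A data-efficient system can reach useful results from relatively few examples.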
This is an important concept for marketers because many systems based on machine learning require vast amounts of data. If you cannot supply the necessary volume of data, the conclusions drawn from the data are unlikely to be correct. A common example of this is in health care, where systems trained with machine learning are unlikely to be able to diagnose rare illnesses for which large amounts of data are unavailable.
Deep learning
Deep learning is a type of machine learning. Deep-learning systems use multiple layers of calculation. The first layers look at very simple features (lines in an image, for example) while the later layers abstract more complex features (such as faces).
Compared to a classical computer program, this is somewhat more like the way the human brain works. You will often see deep learning associated with neural networks, a term for a combination of hardware and software that can perform brain-style calculation.
It’s most logical to use deep learning on very large, complex problems.
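The idea of "multiple layers of calculation" can be sketched in a few lines. In this toy example, each layer takes the previous layer's output as its input. The weights are made up for illustration rather than learned from data, and real deep-learning systems use far larger layers and specialized libraries.

```python
def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Hypothetical, hand-picked weights (a trained network would learn these).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [-0.5]),                   # layer 2: 2 units -> 1 output
]
output = forward([2.0, 1.0], layers)
print(output)
```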
Feature
Wikipedia’s definition of a feature is good: “an individual measurable property of a phenomenon being observed. Choosing informative, discriminating, and independent features is a crucial step for effective algorithms.” So features are elements or dimensions of your data set.
For example, features in a set of customer data might include demographics such as age, location, job status, or title, and behaviors such as previous purchases, email newsletter subscriptions, or various dimensions of website engagement.
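In practice, features usually end up as a list of numbers per customer. Here is a rough sketch of that conversion; the record layout and field names are hypothetical.

```python
def extract_features(customer):
    """Turn one customer record into a list of numeric features."""
    return [
        float(customer["age"]),                             # demographic
        float(len(customer["previous_purchases"])),         # purchase count
        1.0 if customer["newsletter_subscriber"] else 0.0,  # yes/no as 1/0
        float(customer["site_visits_last_30_days"]),        # engagement
    ]


# Hypothetical customer record.
customer = {
    "age": 34,
    "previous_purchases": ["tent", "stove"],
    "newsletter_subscriber": True,
    "site_visits_last_30_days": 7,
}
features = extract_features(customer)
print(features)
```

Which properties to include is a judgment call, and that choice is the "crucial step" the definition above refers to.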
Graph
In the machine-learning context, a graph can represent a big network of interconnected objects, people, places, or organizations. A graph, for example, could represent Facebook as a giant web of people and relationships.
Graphs can be huge. Advanced computing technology such as neural networking can help find patterns in this tangled mass of data. Product recommendation engines like those used by Netflix and Amazon.com typically rely on graph analysis.
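As a toy illustration of graph analysis, the sketch below treats customers and products as a small graph and suggests products bought by customers with overlapping purchases. The data is hypothetical, and this is not a description of how any particular company's recommendation engine works.

```python
from collections import Counter

# A tiny graph: each customer is linked to the products they bought.
purchases = {
    "ana": {"tent", "stove"},
    "ben": {"tent", "lantern"},
    "cy": {"stove", "lantern", "boots"},
}


def recommend(customer, purchases):
    """Suggest products owned by customers who share purchases with this one."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(owned & items)  # number of products in common
        if overlap == 0:
            continue
        for item in items - owned:
            scores[item] += overlap
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked]


suggestions = recommend("ana", purchases)
print(suggestions)
```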
Machine learning
Machine learning is the process through which a computer learns with experience rather than additional programming.
Let’s say you use a program to determine which customers receive which discount offers. If it’s a machine-learning program, it will make better recommendations as it gets more data about how customers respond. The system gets better at its task by seeing more data.
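Here is a deliberately simple sketch of the discount-offer example. The program's code never changes, but its recommendation shifts as it records how customers respond. The class name, offer names, and responses are hypothetical, and real systems use more careful methods than picking the best observed rate.

```python
class OfferLearner:
    """Track how customers respond and recommend the best-performing offer."""

    def __init__(self, offers):
        self.shown = {offer: 0 for offer in offers}
        self.accepted = {offer: 0 for offer in offers}

    def record(self, offer, accepted):
        """Store one observed customer response."""
        self.shown[offer] += 1
        if accepted:
            self.accepted[offer] += 1

    def best_offer(self):
        """Return the offer with the highest acceptance rate so far."""
        def rate(offer):
            if self.shown[offer] == 0:
                return 0.0
            return self.accepted[offer] / self.shown[offer]
        return max(self.shown, key=rate)


learner = OfferLearner(["10% off", "free shipping"])
learner.record("10% off", True)
learner.record("free shipping", False)
early_choice = learner.best_offer()

# More responses arrive, and the recommendation changes with them.
for accepted in (True, True, True):
    learner.record("free shipping", accepted)
for accepted in (False, False):
    learner.record("10% off", accepted)
later_choice = learner.best_offer()
print(early_choice, "->", later_choice)
```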
Model
The simplest definition of a model is a mathematical representation of relationships in a data set. A slightly expanded definition: “a simplified, mathematically formalized way to approximate reality (i.e. what generates your data) and optionally to make predictions from this approximation.”
Models get complicated. A simple model might be illustrated with a two-axis graph, but if your data is more complex, the predictive model will be more complex. When you speak to your smartphone, for example, it turns your speech into data and runs that data through a model in order to recognize it. That’s right, Siri uses a complex speech recognition model to determine meaning.
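The two-axis case can be sketched as a straight line fitted to data points by ordinary least squares. The ad-spend and sales figures below are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: monthly ad spend vs. sales, in thousands of dollars.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
# Use the model to predict sales for a spend level not in the data.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

The fitted slope and intercept are the model: a compact approximation of the relationship in the data that can also be used to make predictions.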
Natural language generation (NLG)
NLG technologies transform data into written or spoken language that we understand. As experienced through services like Apple's Siri and seen in movies like 2001: A Space Odyssey and Her, this technology is what most people think of first when they imagine interacting with an AI platform.
However, NLG is also behind marketing content such as push notifications, emails, and even short-form articles that you may not be aware are artificially generated.
Natural language processing (NLP)
NLP technology lets computers understand human languages. It underlies language translation services such as Google Translate, speech recognition services such as Apple's Siri, and text analysis tools for social-media sentiment analysis, such as Sysomos.
Chatbots (definition above) from H&M and Chase Bank show the potential of NLP in marketing and customer service.
Neural networks
Neural networks are computing systems that mimic the structure of the biological brain. Most modern AI products are built on neural networks to enable different forms of machine learning. Different neural-network architectures have different strengths. We highlight a few common architectures below; you can find more technical explanations at the Asimov Institute.
- Convolutional networks (CNN): CNNs—not to be confused with cable news networks!—are often used for image recognition. They power Google Images "leaping cat" searches, Facebook face recognition, and Snapchat face swapping.
- Deconvolutional networks (DN): DNs are reversed CNNs. They let you enter words, say "leaping cat," to generate an image of a leaping cat.
- Recurrent neural networks (RNN): RNNs are good at speech recognition and handwriting recognition. (This is not the same as a recursive neural network, also confusingly called an RNN. We'll post an update when we've clarified what these are best used for.)
Supervised vs. unsupervised learning
Machine learning can take two fundamental approaches.
Supervised learning is a way of teaching an algorithm how to do its job when you already have a set of data for which you know “the answer.”
Classic example: To create a model that can recognize cat pictures via a supervised learning process, you would show the system millions of pictures already labeled “cat” or “not cat.”
Unsupervised learning is how an algorithm or system analyzes data that isn’t labeled with an answer, then identifies patterns or correlations.
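The difference can be sketched with one small data set. The supervised half learns from examples that already carry an answer; the unsupervised half receives the same numbers without labels and looks for groups on its own. The labels and visit counts are hypothetical, and both methods (a nearest-average classifier and a one-dimensional two-group clustering) are simple stand-ins for what production systems use.

```python
def train_centroids(examples):
    """Supervised: average the values seen for each known label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Pick the label whose average is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, rounds=10):
    """Unsupervised: split unlabeled values into two groups.

    Assumes the list contains at least two distinct values.
    """
    low, high = min(values), max(values)
    for _ in range(rounds):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Supervised: monthly store visits, each already labeled with "the answer".
labeled = [(1.0, "occasional"), (2.0, "occasional"),
           (9.0, "frequent"), (11.0, "frequent")]
centroids = train_centroids(labeled)
label = predict(centroids, 8.0)

# Unsupervised: the same numbers with no labels attached.
groups = two_means([1.0, 2.0, 9.0, 11.0])
print(label, groups)
```

Note that the unsupervised half finds two groups but cannot name them; deciding what the groups mean is left to a person.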