This is an interesting question to consider if only because I think all marketers that lean heavily on Facebook (which is essentially all marketers) should have a basic understanding of the mechanics of FB's algorithm. I don't think this is any different from how a Formula 1 driver would have a fundamental understanding of a combustion engine.

But there are two reasons why answering this question is difficult:

1) Facebook is uncharacteristically guarded in describing how its algorithms work. Given the volume of content that Facebook publishes (both public-facing, public-consuming marketing content and academic papers / whitepapers from its staff), it's surprising that so little information about ad-serving logic has been made available by Facebook;

2) There is a tremendous amount of really terrible noise online from marketing vendors about Facebook's algorithm. Do a Google search for "How does Facebook's algorithm work?" and you'll have to sort through pages and pages of awful content marketing from "Facebook Gurus" and marketing vendors that is often obviously incorrect, and at best directionally so. In general I don't trust content marketing and try to always find first-party resources about how things work, but the content around Facebook marketing from vendors is specifically and exceptionally bad.

I spent some time digging around and found a resource that provides genuine first-party insight into Facebook's ads platform: a YouTube video of a presentation given by Eric Sodomka, a research scientist at Facebook who specializes in auctions. Eric was actually filling in for two different people in giving the talk: Joaquin Quiñonero Candela, the Director of Applied Machine Learning at Facebook Research, and Chinmay Karande, an Engineering Director in Facebook's ads delivery product group. The video is here:

In the first part of the presentation, the presenter discusses the logic behind how Facebook imputes value onto content and sorts it in the feed. He provides some background on different approaches that the company has taken in the past before explaining how it currently (at the time, at least -- the video is from 2015) evaluates content for a given user:

[Image: diagram of the hybrid ranking model: boosted decision trees feeding a logistic regression to produce P( Click | User ID = 4, Ad ID = 1 )]

The left side of this diagram represents probabilities that this user will click on any ad, and the right side represents probabilities that this ad will be clicked. (The text at the bottom, P( Click | User ID = 4, Ad ID = 1 ), is a conditional probability statement that reads: the probability that a click will happen given that the User ID is 4 and the Ad ID is 1.) This probability is arrived at via two different predictive techniques: at the top (the set of hierarchical circles that form a triangle) is a boosted decision tree, and at the bottom (the row of circles marked with Wx) is a logistic regression. Explaining these techniques fully goes way beyond the scope of this answer, but roughly: a boosted decision tree is a technique for combining (boosting) lots of weak sets of decision logic across many different features (eg. user gender, age, average number of FB sessions per day, etc.) into a weight that correlates with the likelihood of some action being taken (eg. men over the age of 30 who log into Facebook every day are likely to "Like" content about horses), and a logistic regression is a way to quantify the impact of various features on the likelihood of something happening or being true (eg. age has more of an impact on the probability that a man clicks content about horses than frequency of Facebook sessions does).
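To make the two-stage structure concrete, here is a minimal Python sketch of the general pattern: tree-derived leaf indicators fed into a logistic regression. All features, thresholds, and weights are invented for illustration; they are not Facebook's.

```python
import math

def stump(feature_value, threshold):
    """A one-split 'tree': returns a leaf index (0 or 1)."""
    return 1 if feature_value > threshold else 0

def predict_click(user, ad):
    # Stage 1: each tree maps raw features to a leaf index.
    leaves = [
        stump(user["age"], 30),               # user-side feature
        stump(user["sessions_per_day"], 1),   # user-side feature
        stump(ad["past_ctr"], 0.02),          # ad-side feature
    ]
    # One-hot encode the leaf indices to use them as regression inputs.
    one_hot = []
    for leaf in leaves:
        one_hot += [1, 0] if leaf == 0 else [0, 1]

    # Stage 2: logistic regression over the leaf indicators.
    weights = [-0.4, 0.6, -0.2, 0.3, -1.0, 1.2]  # illustrative only
    bias = -1.5
    z = bias + sum(w * x for w, x in zip(weights, one_hot))
    return 1 / (1 + math.exp(-z))  # P(click | user, ad)

p = predict_click({"age": 34, "sessions_per_day": 3}, {"past_ctr": 0.05})
```

The real system uses many boosted trees with many leaves each, but the shape is the same: tree outputs become sparse features for a final logistic layer.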

The innovation in this particular model, as the presenter points out in the video, is that nothing in the existing academic literature on classification systems indicated that combining these two approaches would produce improved results. The presenter explains that Facebook's internal analytics architecture is flexible and robust enough to allow them to try very many models quickly, and that dynamic testing environment is what led to the discovery of this model (this model is described in more detail in this research paper by Candela, one of the people who was supposed to present in the video).

The presenter first talks about what exactly this model tries to predict in the context of feed content, and there are a number of different "conversion" events that can happen with those pieces of content in the feed. A "Like" is one, but so is a click, a long video watch, a comment, a share, and a hide. Each of these events is given a "value" to Facebook: a hide has a large negative value because it means that the wrong content was shown to that user, but the positive events are also valued, with a share being the most valuable. These values are used as inputs into an expected value calculation against the probability produced by the model above; the presenter used these values in the presentation, but he didn't specify whether they were the actual values used internally at the company:

[Image: slide listing example point values for feed events (click, like, comment, share, hide)]

So to put this into context: if a user logs in and there's horse content ready to be served in their feed, then Facebook would first calculate the probability of each of these actions being taken by the given user on that content and then multiply those probabilities by the actions' points to get an expected value for that action. The idea here is that Facebook wants to optimize the expected value of that placement: if there's a high probability of the content being shared and a low probability of it being hidden, the points help to referee whether the content gets shown or not (since hiding has -2.5x the points of sharing).
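The arithmetic above can be sketched in a few lines. The point values here are hypothetical and preserve only the relationship stated above (a hide is worth -2.5x a share); the probabilities are likewise made up:

```python
# Hypothetical point values; only the hide = -2.5 x share relationship
# from the presentation is preserved here.
POINTS = {"click": 1.0, "like": 2.0, "comment": 4.0,
          "share": 10.0, "hide": -25.0}

def expected_value(probs):
    """Sum over events of P(event) * points(event) for one feed story."""
    return sum(probs[event] * POINTS[event] for event in POINTS)

# Predicted interaction probabilities for the horse content, for one user.
horse_post = {"click": 0.10, "like": 0.05, "comment": 0.01,
              "share": 0.02, "hide": 0.005}
ev = expected_value(horse_post)
```

A small hide probability drags the score down hard, which is exactly the refereeing role the points play.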

Note that these points, while likely informed by much experimentation, are somewhat arbitrary and subjective. For ads, the value determination is a lot more straightforward: advertisers bid on actions. If I bid $100 for a 100% D1 ROAS user via a VO (value optimization) bidding strategy, or if I bid $3 for an install via a CPI (cost per install) bidding strategy, Facebook knows exactly what the value of each of those events is to them and to me. That leads the presentation into the second section, which is about the auction that Facebook uses to combine explicit (not imputed) value with the probability model from the diagram above.

[Image: slide introducing ads and "LTV content" alongside organic content in the page configuration]

The presenter introduces Facebook's ads platform by talking about how ads fit into the broader content mix on the site (he alluded a bit to mobile, but 2015 was just before Facebook's mobile ads revenue hit its stride, so the presentation is largely about the desktop feed). As background, he explains that sponsored content (ads) and "LTV content" (informational content from Facebook that provides Long Term Value to users and is inserted at the top of the Right Hand Side content well) are served from a different source than organic content (actual user content that goes into the feed), and that organic content is already ranked and sorted when it is sent to "the auction." That is to say, the auction that chooses the configuration of the page is just deciding whether and where to insert ads into already-sorted organic content.

Delving into the meat of the auction mechanic, the presenter notes that Facebook uses a Vickrey-Clarke-Groves (VCG) auction -- the deck he is presenting insists that it is a necessity -- because of its flexibility in handling different types of content to slot into different places in the page configuration.

[Image: slide describing the VCG auction mechanic]

A bit about the VCG mechanic: a VCG auction is a generalization of what's known as a 2nd-price, sealed-bid (closed) auction. By sealed, we mean that each person's bid is known only to them -- bids are hidden from other auction participants. And by 2nd-price, we mean that the winner is the person who submitted the highest bid, but the winner actually pays the price of the 2nd-highest bid (eg. if two bidders submitted bids, Person 1 for $10 and Person 2 for $8, Person 1 would win the auction and pay $8). The effect of the 2nd-price payment approach is that the winner effectively pays for their externality to others -- that is, the utility they take away from the bidder who would have won the auction if they hadn't participated. Because the auction is sealed and 2nd-price, bidders are incentivized to bid the actual underlying value of the object to them. For more background on VCG auctions, this paper and this lecture (starting at 19:30) do a good job of explaining the formal mechanics, and this video does a good job of contextualizing the VCG auction for digital advertising.
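The single-item 2nd-price case from the example above can be sketched in a few lines of Python (bidder names and amounts are from the example, not from Facebook's system):

```python
def second_price_winner(bids):
    """bids: dict of bidder -> sealed bid. Returns (winner, price paid):
    the highest bidder wins but pays the 2nd-highest bid."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price

# The example from above: two sealed bids of $10 and $8.
winner, price = second_price_winner({"Person 1": 10.0, "Person 2": 8.0})
```

Under this rule, shading your bid below your true value can only cost you auctions you would have profitably won, which is why truthful bidding is the dominant strategy.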

Bringing this back to Facebook, there are some manipulations that must be made to the auction process in order to fit the business context of the Facebook ads platform. The presenter explains those with this slide:

[Image: slide showing the value function for ad serving and its constraints]

Putting aside the notation, what this slide is saying is that the value function (which tries to maximize the amount of money generated by serving the ad impressions that are being bid on) is constrained by two things: 1) Facebook's desire not to overload each user's feed with ads and thus diminish the user experience, and 2) the overall bid and budget limitations of each advertiser's campaign.
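One way to picture those two constraints is a greedy selection sketch. This is my own illustrative simplification, not Facebook's actual optimization: pick the highest-expected-value ads subject to an ad-load cap and per-advertiser budgets.

```python
def choose_ads(candidates, max_ads, budgets):
    """Pick the highest-expected-value ads, subject to a per-feed ad
    cap (user-experience constraint) and each advertiser's remaining
    budget. candidates: list of (advertiser, expected_value, cost)."""
    chosen = []
    for advertiser, ev, cost in sorted(candidates, key=lambda c: c[1],
                                       reverse=True):
        if len(chosen) >= max_ads:
            break  # don't overload the feed with ads
        if budgets.get(advertiser, 0.0) >= cost:
            chosen.append(advertiser)
            budgets[advertiser] -= cost
    return chosen

# Hypothetical candidates: advertiser B's remaining budget is too small.
picked = choose_ads([("A", 5.0, 2.0), ("B", 4.0, 3.0), ("C", 3.0, 1.0)],
                    max_ads=2, budgets={"A": 2.0, "B": 1.0, "C": 5.0})
```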

The presenter then brings the discussion back to the same format as was introduced in the discussion of the model for serving organic content: conversion probabilities are applied to value estimates (or, in the case of ads, bids) and the winner is chosen based on what produces the most expected value. In this case, the conversion events are not the possible interactions that users can have with organic content (like, comment, etc.) but rather the specific actions that the advertisers are bidding against for their ads: clicks, app installs, page likes, on-page events reported through pixel firing, etc. This notion is captured in this slide:

[Image: slide applying conversion probabilities to advertiser bids to rank ads by expected value]
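A minimal sketch of that selection rule, with hypothetical bids and probabilities: the winner is the candidate with the highest bid times conversion probability, so a lower bid can beat a higher one if its predicted conversion rate is better.

```python
def auction_winner(candidates):
    """candidates: list of (advertiser, bid, p_conversion).
    Rank by expected value = bid * P(conversion event)."""
    return max(candidates, key=lambda c: c[1] * c[2])[0]

# B's $2 bid beats A's $5 bid: 2 * 0.04 = 0.08 > 5 * 0.01 = 0.05.
winner = auction_winner([("A", 5.00, 0.01), ("B", 2.00, 0.04)])
```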

And this is how the classification model from earlier in the post is introduced into ad targeting: the ads shown to people are selected based on expected value, as determined by the conversion probabilities produced from those feature weights. This brings a few things into perspective:

1) Bid levels impact ad delivery, but so does historical performance with various groups. If an ad isn't likely to be clicked on by some group of users based on historical performance, then its delivery will suffer: Facebook only gets paid when the events that advertisers are bidding on happen. Put another way, creative optimization, not just bid increases, can improve ad delivery;

2) Broad targeting for a niche product can tank delivery because of the point made above: when users are shown irrelevant ads and don't interact with them, the click probability part of the expected value equation changes to the detriment of delivery;

3) Advertisers should bid as close to the true value of a conversion event as possible since the VCG auction uses the 2nd-price mechanic to collect payment. 

This is a really fantastic answer and the level of depth that best-in-class UA/DR teams should strive for. For anyone else who enjoys technical content on FB, I also find Smartly.io's blog interesting.

FWIW, there are (good, experienced) advertisers on FB that find bidding above their goal (by 20-30%) is what works for delivery & performance at scale. Test, iterate, trade notes with other similar advertisers -- find what works for you.
I agree that Smartly probably has the best content marketing of any ad tech vendor