Gale-Shapley Algorithm | Hemanth's Internet Home

Today's TIL was going to be so off the course, that if you had a map, you'd prolly need a new one. I got myself into understanding the T-Matic Clutch system in a bike called TVS Jive (on which I learnt to ride a bike). That rabbitholed into quickshifters and blipping and ECUs, etc. So while I totally decided on writing about that, this algorithm pops up out of nowhere, and well, the rest is history. (post edit notes: well that's still ending up as my til now xD)

Gale-Shapley Algorithm is also known as the Deferred Acceptance Algorithm, and is used to find a stable matching between two sets of equal size. Infact, this is a solution to the Stable Matching Problem. Let's understand what Stable Matching Problem is, and then get to the solution.

The stable matching problem takes 2 sets of participants of equal number, say $A = \{a_1, a_2, \dots, a_n\}$ and $B = \{b_1, b_2, \dots, b_n\}$, along with a list of preferences in being matched up with the other set for each participant in each set. A match must be made between a participant from Set A and another participant from set B only. And it is considered stable if and only if there is no pair $(a_i, b_j)$ such that both $a_i$ and $b_j$ prefer each other over their current partners.

In 1962, David Gale and Lloyd Shapley proved that for any equal number of participants in two sets with their respective preference lists, a stable matching always exists. And in order to find the stable matching, they put forward an algorithm (which we now know as Gale Shapley algorithm).

The algorithm moves in rounds, where the groups are called proposers and reviewers. In each round, a proposer proposes to the highest ranked reviewer in their preference list who they haven't proposed to yet. Then each reviewer considers all the proposals they received in this round and stays with the one highest on their list. The matched proposer need not go on the next round, and unless the reviewer gets a better proposal, they wouldn't leave their current proposal. This process is repeated until no more proposals can be made, at which point a stable matching is found.

Let's take an example to understand this better. Suppose we have two sets of participants, A and B, each of size 3. Let $A = \{a_1, a_2, a_3\}$ and $B = \{b_1, b_2, b_3\}$. Let the preference lists be:

$$ \begin{aligned} a_1 &: [b_1, b_2, b_3] \\ a_2 &: [b_2, b_1, b_3] \\ a_3 &: [b_1, b_2, b_3] \\[1em] b_1 &: [a_1, a_2, a_3] \\ b_2 &: [a_1, a_2, a_3] \\ b_3 &: [a_1, a_2, a_3] \end{aligned} $$

Here, $a_1$ proposes to $b_1$, $a_2$ proposes to $b_2$, and $a_3$ proposes to $b_1$.

Now, $b_1$ receives two proposals, one from $a_1$ and one from $a_3$. $b_1$ prefers $a_1$ over $a_3$, so $b_1$ accepts the proposal from $a_1$ and rejects the proposal from $a_3$.

$b_2$ receives only one proposal from $a_2$ and so $b_2$ accepts the proposal from $a_2$.

Thus, in the next round, $a_3$ is the only one without a partner. So $a_3$ proposes to $b_2$ (as $b_2$ is the next best option in $a_3$'s preference list). $b_2$ is currently matched with $a_2$. $b_2$ prefers $a_2$ over $a_3$. So $b_2$ rejects the proposal from $a_3$.

$a_3$ now proposes to $b_3$ (as $b_3$ is the next best option in $a_3$'s preference list). $b_3$ is currently unmatched. So $b_3$ accepts the proposal from $a_3$.

Thus, the matches are made, but are they stable? If you check for any pair $(a_i, b_j)$, you'll find that there is no pair $(a_i, b_j)$ such that both $a_i$ and $b_j$ prefer each other over their current partners.

Some More Details...¶

Intuitive Proof of Stability

This algorithm has an intuitive reason for always having a stable matching. It goes like:-

Any proposer $a_i$ will remain unmatched if and only if they have exhausted all of their preference list, and any reviewer $b_j$ would not leave their initial proposal if the offer was better than any subsequent ones. Thus, in a situation where the number of proposers and reviewers is the same, every proposer will end up with some reviewer, because every reviewer will remain matched!

It's an algorithm! What's the time complexity?¶

Let's assume we try to implement this. The data structures that we could pick are:-

A set of proposers with no matches.
A one-dimensional array indexed by proposers having preference index of the next reviewer in their preference list.
A one-dimensional array indexed by reviewers, specifying their current proposer.
A two-dimensional array indexed by a reviewer and a proposer, specifying the position of that proposer in the reviewer's preference list.
A two-dimensional array indexed by a reviewer and a number $i$ from 1 to $n$, naming the applicant who is each reviewer's $i^{th}$ preference.

This set of data structures can let us find a match for a proposer in $O(1)$ time, but setting up these data structures itself would take $O(n^2)$ time. Since each proposer proposes to each reviewer at most once, and there are $n$ proposers and $n$ reviewers, the overall time complexity of the Gale-Shapley algorithm is $O(n^2)$.

What's interesting is this algorithm is inherently parallelisable. This can be done by having each proposer be parallel functions, proposing to reviewers. Since each proposer need not wait for the other proposers to finish their proposals, but just the answer from the reviewer, the only synchronisation to be done is when multiple proposers propose to the same reviewer in the same round. In this case, however, we're dropping the set of unmatched proposers, because that will add overhead in terms of synchronisation.

(Algorithmic) Game Theory¶

There are subfields of Game Theory that study about designing systems/algorithms. Mechanism Design and Algortihmic Game Theory focus on designing rules for such systems. One such group of mechanisms are known as Strategyproof Mechanisms (or Truthful mechanisms). These mechanisms are designed such that any algorithm or system that implements them achieves the best possible outcome when all the participants in the system act truthfully.

Gale-Shapley has 2 types of participants, and hence it could classify differently for each group.

For the proposers, it is a truthful mechanism, because no proposer is gaining from misrepresenting their preferences. The same, however, cannot be said about the reviewers. The reviewers could, in fact misrepresent their preferences and get a better match! This comes with a massive caveat; the reviewer trying to game the algorithm must know the preference lists of the other participants as well. If this weren't the case, the reviewer would almost certainly always end up with a worse match, and even after the final match up gets released, a "better" misrepresentation in hindsight cannot be deduced. Hence, Gale-Shapley is classified as a regret-free truth-telling mechanism.

This algorithm has been seeing widespread use in dating apps, school choice programs, national resident matching programs, etc. for its fairness-stability trade-off. Hence why you would likely find a host lot of videos explaining this algorithm from the view of dating and marriage, where the proposers are men and the reviewers are women.

In 1984, Alvin Roth had noticed that this algorithm had already been seeing practical use since the 1950s as the Boston Pool Algorithm used by the Nation Resident Matching Program (NRMP). Using this, he, along with David Gale devised a simplified version of this algorithm for the same purpose, which was implemented by the NRMP in 1998. David Gale had unfortunately passed away in 2008. Alvin Roth and Lloyd Shapley were awarded the Nobel Memorial Prize in Economic Sciences in 2012 for their work on this algorithm and for solving the stable matching problem, highlighting just how important and impactful this algorithm has been, and continues to be.