The Payoff Matrix
October 5, 2013 (Revised November 25, 2013)
Anyone can play all day and all evening and collect a few masterpoints along the way. It’s much more enlightening to examine average partnership performance in a regular game. The ACBL can’t do this because they don’t have all the data. However, ACBLscore game files contain all the necessary data. I have written software called the Payoff Matrix that generates statistics for players, partnerships, and partnership interactions by directly reading ACBLscore game files. The output of the software is a set of tab delimited text files that can easily be viewed in a spreadsheet.
It is most useful to look at regular partnerships that play in enough sessions to generate decent statistics. For the nearly two year period from the start of January 2012 through the end of November 2013, a 10 session minimum cut reduces the number of partnerships to a manageable 37. Dave Oakley and Dave Walters lead the pack with a 56.96% average, followed by Mary and Ron Huffaker (56.15%), Suzanne Lebendig and Roger Doughman (55.73%), and Steve Johnson and Mac Busby (55.49%). However, one cannot state with certainty that any partnership is better than another, only a degree of confidence that this is the case, based on the difference in average percentage and the error on each partnership’s average percentage. For example, Dave and Dave lead the Huffakers 56.96 ± 1.56% to 56.15 ± 0.97%. We must ask how confident we are that (56.96 − 56.15) = 0.81 ± 1.84% is greater than zero, where the errors from each partnership have been added in quadrature. 0.81% is 0.44 standard deviations (0.81 / 1.84) away from zero. We can calculate the confidence using the error function, for example via 1 − erfc(0.44/sqrt(2)) / 2 in Matlab, which yields 0.67. From this we can say with about 67% confidence that Dave and Dave are a stronger partnership than the Huffakers. Similarly we can say with about 75% confidence that Dave and Dave are stronger than Suzanne and Roger.
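The same calculation can be sketched in Python; it is the direct equivalent of the Matlab expression above, with the errors added in quadrature first:

```python
from math import erfc, sqrt

def confidence_stronger(avg_a, err_a, avg_b, err_b):
    """Confidence that partnership A is stronger than partnership B,
    given each partnership's average percentage and its error."""
    diff = avg_a - avg_b
    err = sqrt(err_a**2 + err_b**2)   # errors added in quadrature
    z = diff / err                    # standard deviations from zero
    return 1 - erfc(z / sqrt(2)) / 2  # normal CDF, as in the Matlab expression

# Dave and Dave (56.96 +/- 1.56%) versus the Huffakers (56.15 +/- 0.97%)
print(round(confidence_stronger(56.96, 1.56, 56.15, 0.97), 2))  # 0.67
```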
Average partnership percentages are certainly interesting but bridge is also a social game. If you play against the same partnerships long enough, you are bound to wonder how well you do against each of the other regular partnerships. Perhaps you suspect you are routinely beating up another partnership, even one that seems to do roughly as well as your partnership against the field. Or conversely, maybe you always seem to get fixed by another partnership. It is impossible to draw any firm conclusions from a round of two or three boards against another partnership. But after a year or two of play you might have played as many boards against certain partnerships as, or more than, the 24-27 total that you play in a typical single session. That gives you enough data, if you can keep track of it, to draw some conclusions about the “payoff matrix”, i.e. which partnerships pay off to other partnerships and to what extent. The Payoff Matrix software calculates the payoff matrix in addition to the player and partnership statistics.
Visualizing the Payoff Matrix
There is a lot of payoff matrix data, an entry for every partnership-partnership interaction, easily several thousand interactions over the course of a couple of years. But many of the interactions only involve a few boards. The payoff matrix can be simplified by limiting it to the interactions between regular partnerships. However, this reduction still leaves us with several hundred interactions. It helps to see the reduced results visually, as shown below.
Both axes list the partnerships in order of decreasing strength. The 50% cutoff is about two thirds of the way down the list instead of half way because established partnerships tend to perform better than infrequent or one-off partnerships; many weak partnerships have been removed by the requirement that a partnership has played in 10 or more sessions. Each square shows how well the partnership listed on the corresponding row does against the partnership in the corresponding column. The color scale runs from solid blue (20% or lower) to solid yellow (80% or better) with grey at 50%. White squares indicate partnerships that have never played any boards against each other. Pink squares indicate partnerships that cannot interact because they have one or more players in common. The diagonal is always pink.
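For illustration, such a blue-grey-yellow scale amounts to a linear interpolation between endpoint colors; the RGB values below are an assumption, not the figures’ exact palette:

```python
def square_color(pct):
    """Map a percentage to a blue (<=20%) -> grey (50%) -> yellow (>=80%)
    color, by linear RGB interpolation (illustrative palette only)."""
    t = max(0.0, min(1.0, (pct - 20) / 60))  # clamp to [0, 1]
    blue, grey, yellow = (0, 0, 255), (128, 128, 128), (255, 255, 0)
    # Interpolate blue->grey on the lower half, grey->yellow on the upper half.
    a, b, frac = (blue, grey, t * 2) if t < 0.5 else (grey, yellow, t * 2 - 1)
    return tuple(round(c0 + (c1 - c0) * frac) for c0, c1 in zip(a, b))

print(square_color(50))  # (128, 128, 128) -- grey at 50%
```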
Move the mouse over the image to view the details for each square (matrix element). The hovering tooltip will show the names of the two partnerships, each partnership’s average against the field, the number of boards the partnerships have played against each other, and in bold the average results of those boards including an error. The average result shows how well the partnership on the row did against the partnership on the column. The matrix is anti-symmetric about the diagonal. Squares with a faint red X denote statistics based on relatively few boards, fewer than 10 in this case.
The upper right corner of the matrix is mostly yellow. This is because stronger partnerships usually have an advantage against weaker partnerships. Conversely the lower left corner is mostly blue. However, there are interesting exceptions. For example, both Emily and Barry Berkov and Hanan Deeby and Bernard Figueiredo seem to cross up Suzanne Lebendig and Roger Doughman. Freda Anderson and Gail Dunham seem to cross up Elaine Chan and Mike Mezin as well as Mary and Roy Greene but are troubled by Carolyn Casey and Lena Jelusich. Matthew Kidd and Trish Lane cross up Steve Johnson and Mac Busby as well as John Lagodimos and Kathie Angione only to lose to Ray and Alan Rowen. Fumie Graves and Joel Hoersch vex Rex and Sheila Latus. Mary and Roy Greene have it out for Marty and Leila Bloomberg. Stephen and Jill Seagren are beating up Lynne Anderson and Ursula Kantor.
Even a weak partnership might cause another partnership a lot of trouble by doing far better against that partnership than the difference in skill against the field would suggest should be the case. We could say that the weak partnership is overperforming against the stronger partnership.
How should we calculate this? What is the expected payoff when two partnerships meet? Consider two partnerships that average 54% and 50% respectively against the field. The weaker partnership is doing just as well as the field, so we expect the stronger partnership to maintain their 54% and the weaker partnership to be pushed down to 46%. If the stronger partnership only achieves 53%, then we can say they are underperforming by 1% in the interaction and conversely that the weaker partnership is overperforming by 1%. More generally, if the pairs achieve X and Y average percent against the field, we expect them to achieve 50 + (X−Y) and 50 − (X−Y) percent against each other.
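The model is simple enough to state in a couple of lines of Python, a minimal sketch:

```python
def expected_payoff(x, y):
    """Expected head-to-head percentages for partnerships averaging
    x and y percent against the field (simple model, slope of 1)."""
    return 50 + (x - y), 50 - (x - y)

# A 54% partnership meeting a 50% partnership:
print(expected_payoff(54, 50))  # (54, 46)
```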
But does it really work this way? Let’s look at some data.
Each point represents a partnership interaction where the two partnerships have played at least 10 boards against each other; that is, it represents one of the bluish-yellow squares from the preceding figure which is not flagged by a little red X. There is a lot of scatter due to limited statistics and also the peculiarities of how one partnership interacts with another. Nonetheless, the slope for the best fit is close to 1. This means, for example, that a partnership that does 5% better than another partnership against the field will on average achieve a 55% result against the weaker partnership. More generally it supports our expectation of how partnership-partnership performance should work.
An intercept of 50.45% instead of exactly 50% might seem concerning. This is an artifact of restricting the dataset to the interactions of the most established partnerships instead of all partnerships, and also of failing to weight the linear fit by the error on each data point. Folding in all the low confidence partnership interactions would fix this issue, but at the cost of a very messy plot. The slope of 1.08, slightly above 1, is more interesting. Such a deviation is permissible, though it too may be an artifact. For the under/overperformance visual matrices below and the downloadable dataset, we will stick with our simple initial model, which assumes the true slope is 1.
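The effect of weighting a linear fit by each point’s error can be sketched with NumPy. The data below is synthetic, generated solely to illustrate the weighted fit; it is not the article’s dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the scatter plot: skill difference (x) versus
# achieved head-to-head percentage (y), with true slope 1 and intercept 50.
x = rng.uniform(-8, 8, 200)
sigma = rng.uniform(1, 5, 200)             # per-point error bar
y = 50 + x + rng.normal(0, sigma)

slope_u, intercept_u = np.polyfit(x, y, 1)             # unweighted fit
slope_w, intercept_w = np.polyfit(x, y, 1, w=1/sigma)  # weighted by 1/error
print(round(slope_w, 2), round(intercept_w, 2))
```

With well-behaved data both fits recover a slope near 1 and an intercept near 50; the weighting matters most when, as in the real dataset, the error bars vary a lot from point to point.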
The matrix below corrects for the difference in partnership skill, showing how much each partnership overperforms or underperforms against another partnership. The blue to yellow scale runs from -30% to 30% with grey at 0%. Move the mouse over the image to view the details for each square (matrix element). As before, the partnership names and average percentages are shown. The over-perform / under-perform percentage is shown in bold. The ordinary percentage from the previous figure is also shown. Note: the previous figure also shows the over/under-performance statistics after ‘OU:’ as in ‘OU: 9.7%’.
A look back at 2010-2011
I have also created the payoff matrix for 2010-2011. This dataset includes the full 24 months instead of the 21 months in the 2012-2013 dataset. A 12 session minimum cut reduces the field to 42 partnerships. The red X flagging was raised to 12 or fewer boards. During 2010-2011, Debbie and Alan Gailfus, who were La Jolla Unit members back then, came out on top with a stunning 59.09% average. Their 59.09 ± 1.89% versus Dave and Dave’s 57.05 ± 0.89% means we can state with 85% confidence that they were the stronger partnership during this time period. The Gailfus partnership variability was also quite high at 7.57%, higher than any of the other 41 partnerships, and significantly higher than the 6% average variability.
A few off diagonal pink squares appear because some players had two regular partnerships during this time period. Mac Busby did well with both Suzanne Lebendig and Greg Chaffee. Fumie Graves played with both Murray Goldman and Joel Hoersch. Yoko Davis, now Yoko Jordan, played with both Sumiko Inagaki and Sam Jordan. The Payoff Matrix software is not thrown off by the name change because it uses the ACBL player number whenever possible and converts the life master letter to the corresponding digit to avoid duplicating an individual when they become a life master.
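A sketch of that normalization, assuming the commonly documented ACBL convention that a leading Life Master letter J through R stands for the first digit 1 through 9 (the function name is illustrative, not the software’s):

```python
def normalize_player_number(num):
    """Map an ACBL player number's leading Life Master letter back to a
    digit so a player matches before and after making Life Master.
    Assumes the standard J-R -> 1-9 mapping of the first character."""
    first = num[0].upper()
    if 'J' <= first <= 'R':
        first = str(ord(first) - ord('J') + 1)  # J->1, K->2, ..., R->9
    return first + num[1:]

print(normalize_player_number('K123456'))  # '2123456'
```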
Switching to the over/underperform matrix below, we see Debbie and Alan Gailfus destroying Peter Lagodimos and Tom Tatham and significantly damaging Freda Anderson and Gail Dunham. Dave and Dave stuck it to Ray and Alan Rowen, Fumie and Joel, and Freda and Gail. Chuck Wilson and Barbara Norman were surprisingly good against Greg House and Maritha Pottenger and similarly Freda and Gail against Bill Grant and Lynne O’Neill. Alice Lane and Sandra Schumsky had a good track record against Freda and Gail. Emily Berkov and Barry Berkov vexed Hanan Deeby and Bernard Figueiredo.
A look back at 2008-2009
The payoff matrix for 2008-2009 includes the full 24 months. A 12 session minimum cut reduces to 41 partnerships. The red X flagging is again for 12 or fewer boards. The race this period between Dave and Dave (59.24 ± 1.00%) and the Gailfuses (59.07 ± 1.14%) is too close to call with any confidence. Steven Johnson and Diana Marquardt were a distant third at 55.56 ± 1.11%.
Switching to the over/underperform matrix below, the Gailfuses pounded the Huffakers. Steven Johnson and Diana Marquardt beat up on Bill Grant and Ed Layton. George Bessinger and Kathee Farrington vexed Hanan Deeby and Bernard Figueiredo. Manoochehr Bahmanian and Sally Ishihara did well against strong pairs only to give it back to weak pairs.
The six year payoff matrix
Below are the payoff matrices for 2008-2013, almost six years of unit game play. A 26 session minimum cut reduces to 48 pairs. The red X flagging is set for 16 or fewer boards.
Get the data