It’s tempting to say every improvement is a win. Perhaps you got an extra three percent on the response rate or the average gift nudged up.
The important question to ask when judging an improvement is: what goal are you optimizing toward?
Let’s say your goal is to climb the highest mountain. To reach that goal you climb the highest mountain you can see. Where I grew up in southwest Virginia that would get you to 5,730 feet. Where I live now the height would be about 486 feet above sea level. In both instances you’d have climbed the tallest peak around. BUT… you had a global goal – highest peak anywhere – and your tactic to get there was a locally optimized solution.
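The mountain analogy can be sketched as a greedy climb: from wherever you start, keep stepping to the higher neighboring point until nothing nearby is higher. This is a minimal illustration, not a real optimization routine. The elevation list and the `climb` function are hypothetical; only the 486 and 5,730 foot figures come from the text above.

```python
# Hypothetical elevation profile (feet) along a one-dimensional ridge line.
# Only 486 and 5730 come from the text; the other values are made up.
elevations = [100, 300, 486, 250, 150, 900, 2400, 5730, 3100]


def climb(elevations, start):
    """Greedy climb: step to the higher neighbor until neither neighbor is higher."""
    position = start
    while True:
        neighbors = [
            i for i in (position - 1, position + 1) if 0 <= i < len(elevations)
        ]
        best = max(neighbors, key=lambda i: elevations[i])
        if elevations[best] <= elevations[position]:
            return position  # nothing higher in sight: a local peak
        position = best


local_peak = elevations[climb(elevations, start=0)]  # tallest peak you can see
global_peak = max(elevations)                        # tallest peak anywhere

print(local_peak, global_peak)
```

Starting at the left edge, the climb stops at the 486-foot hill because both neighbors are lower, even though a far taller peak sits a few steps away. Where you start, and what you can see from there, decides which peak you reach.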
Analytics guru Avinash Kaushik talks about the opposite problem, citing Microsoft and NFL football games as an example. The Microsoft Surface tablet shows up in ads and in negotiated ‘product’ placement in the hands of coaches and players all over the place.
They’re doing great on reach and brand lift as ad metrics; they are at a local maximum.
But Surface has a 0.29% market share. They haven’t solved for the global maximum: being even remotely relevant in market share. This example has two implications: one for how we test to reach our goals and the other for what our goals are.
A traditional A/B test of which teaser or envelope color to use is a local optimization. You may get an improvement, but it’s specific to that communication at that time, and it is hardly going to move the larger needles of retention, falling yield rates, or stubbornly flat net revenue.
The problem is compounded if the ideas you generate for the best teaser or envelope color have no evidence or reason to believe behind them. Then you have a needle-in-a-haystack chance of being even locally optimized – i.e., of finding the best darn teaser possible rather than the best you could come up with at the time.
A way to break out of this is to go back to your seventh-grade science project: start with a hypothesis. More specifically, start with a hypothesis that, if proven one way or the other, will change how you do business.
Take the Nudge-Award-winning test that showed UNHCR posting a 42% lift when it presented its symbolic gifts symmetrically (e.g., five blankets, seven blankets, nine blankets versus blankets, radiator, and stove). This is something that, while small, changes every donation form and every response device they use. And the test ideas had theory and rationale behind them – i.e. a reason to believe.
As a side note, how do you know if you have theory or rationale behind your test idea? Whether your test wins or loses, you should have a very specific answer to the “why” question afterwards. If your answer is “don’t know” or something superficial like “because people prefer chartreuse,” then you don’t.
But Kaushik’s look at global optimization also raises the question of whether we are looking at the right goals.
There is nothing wrong, and most things are right, about setting goals for elements like response rate, average gift, and net revenue per communication, or larger goals like hitting your net revenue budget and aspirational file size.
But all these could use a refinement like the one Kaushik recommends for the Microsoft Surface: instead of measuring brand recall, test whether people are more likely to choose the Surface (since buying is the behavior you are shooting for).
So, for example, do you really want to measure file size? Or do you want to measure the file size of the donor segment with a connection to your cause, which you’ve determined is far more profitable than donors without a cause connection?
Donors are more valuable to nonprofits than organizations are to the donors. A global maximum metric needs to measure, or at least approximate, whether you are leaving your donor potential greater than when you started. A “successful” mailing that nets $100,000 isn’t successful if it turns off donors with lifetime values of $200,000.
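The arithmetic behind that last example can be written out in a few lines. This is a deliberately simplified sketch: it assumes the lost lifetime value can be subtracted directly from the mailing's net revenue, and the function name `donor_potential_change` is hypothetical. The dollar figures are the ones from the example above.

```python
def donor_potential_change(mailing_net, lost_lifetime_value):
    """Net revenue from a mailing minus the lifetime value of donors it turned off.

    Simplifying assumption: both figures are in the same dollars and can be
    subtracted directly, with no discounting or partial attrition.
    """
    return mailing_net - lost_lifetime_value


change = donor_potential_change(mailing_net=100_000, lost_lifetime_value=200_000)
print(change)  # -100000
```

Judged on net revenue per communication, the mailing is up $100,000. Judged on donor potential, it is down $100,000.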