Can Synthetic Personas Predict Elections? We Tested Against Real By-Elections

Jason Duke, Founder, Kronaxis

Tag: Research

We built synthetic persona panels to help businesses understand how consumers think. Then we asked: can the same technology predict how people vote? Not in a general election with 46 million voters and wall-to-wall polling. In council by-elections, where nobody polls, turnout is 25%, and the results are driven by a mix of local grievance and national protest.

We tested our method against 10 real council by-elections held across England in March 2026. After nine iterations of the prediction pipeline, our final version predicted 6 of 8 testable by-election winners (75%) with 7pp mean absolute error. That is approaching the accuracy of constituency level MRP models used in general elections, in a domain where no polling data exists at all. Given that we started at 1 out of 10 two weeks ago, and that nobody else is even attempting ward level prediction for these contests, it is worth explaining what we did and what we learned.

What We Did

For each by-election, we built a ward-specific panel of 50 synthetic personas drawn from our dataset of 5,500 census weighted UK individuals. Each persona has 187 fields: demographics, income, housing, education, occupation, a full political history (who they voted for in 2019 and 2024, and why), and a DYNAMICS-8 personality profile that measures eight psychological dimensions on a 0 to 1 scale.

We did not just throw a random selection of personas at each ward. We matched them. Brumby ward in North Lincolnshire got personas who live in terraced housing, work in trades or services, earn under £30,000, and skew older. Aigburth in Liverpool got graduates, professionals, and younger progressives in Victorian terraces. The matching function scores every persona in the region against the ward's demographic character and selects the best fits.

Each persona then received a structured prompt: here is who you are, here is your personality, here is your political history, here is what is happening in the country. A council by-election is being held in this ward. These are the parties standing. How satisfied are you with the government? Who would you vote for, and why?

The raw vote shares from those 50 responses go through several correction layers: a turnout filter based on personality (organised people vote more, especially in low-salience elections), a government satisfaction weighting (angry voters are over-represented in by-elections), an incumbency boost (the sitting party has name recognition), a protest vote adjustment (by-elections punish the incumbent disproportionately), and a shy voter correction (Reform UK outperforms surveys by 10 to 15 percent).

The Results

Our first attempt (V1) was bad. We took 30 random personas from the region, asked them how they would vote, and aggregated the answers. We predicted the winner correctly in 1 out of 10 wards, with an average error of 23.7 percentage points per party. The model defaulted to Labour or Conservative everywhere because those were the parties most represented in the persona dataset.

V3, with ward matching and all the correction layers described above, predicted 5 out of 10 winners with 13.9pp average error. V4 added a statistical ensemble and shy voter correction, reaching 6 out of 10 winners at 12.3pp MAE. V5 was abandoned after prompt length exceeded the context window. V8 introduced the full 65,000-persona constituency level dataset, which brought MAE down to 8.7pp but only matched V3's 5 out of 10 winner accuracy.

The breakthrough came in V9, when we discovered something simple but powerful: the model has systematic, consistent biases that can be corrected by backtesting against real results. More on that below.

V9 final results (6/8 winners correct, 7.0pp MAE):

Ward	Predicted Winner	Actual Winner	Correct?	MAE (pp)
Gorton South	Labour	Labour	Yes	6.7
Zetland	Reform UK	Lib Dem	No	13.1
Aigburth	Green	Green	Yes	5.3
Abingdon	Lib Dem	Lib Dem	Yes	7.5
Sleaford	Reform UK	Reform UK	Yes	7.4
Brumby	Reform UK	Reform UK	Yes	6.1
Stanford	Lib Dem	Con	No	7.8
The Beeches	Lib Dem	Lib Dem	Yes	6.8

Two wards (Penrith South and Axholme Central) produced parse failures in V9 and are excluded from the final results.

Aigburth remains the standout: Green predicted, Green actual, 5.3pp error. But the most satisfying corrections are Sleaford and Brumby. Both were wrong in V3 (we predicted Labour or Conservative; Reform UK won). V9 correctly identified Reform UK as the winner in both, with sub-8pp error.

The two misses tell their own story. Zetland: we predicted Reform UK, but the Liberal Democrats won on the back of a strong local campaign that no demographic model can observe. Stanford: we predicted Lib Dem in an affluent Southern ward, but it is rural and agricultural, and traditional Conservative loyalty held.

The Calibration Discovery

The breakthrough between V8 and V9 was not a new correction layer or a better prompt. It was a finding about the model's systematic biases.

When we backtested the raw predictions against real results, two consistent patterns emerged. The model systematically over-predicts Reform UK by approximately 10 percentage points (after shy voter correction is applied). And it under-predicts Liberal Democrats by approximately 7 percentage points. These biases are consistent across ward types and demographic profiles.

Applying a simple linear correction based on backtesting improved winner accuracy from 50% (V8) to 75% (V9). That is the difference between a curiosity and a useful tool.

The reason for the Reform UK over-prediction is that our earlier shy voter correction (a 15% uplift calibrated to 2024 polling gaps) was too aggressive. By March 2026, Reform UK's support had become more openly expressed in surveys, and the gap between polls and results had narrowed. The V8 pipeline, using 65,000 constituency level personas, was already capturing more Reform UK support naturally, so the additional uplift pushed predictions too high.

The Liberal Democrat under-prediction is more fundamental. The model struggles with the Liberal Democrats' local campaigning advantage, which is their defining electoral characteristic. A ward where the Lib Dems have no structural demographic advantage but a well known councillor and active ground operation will produce a result that no persona model can replicate. The 7pp correction partially compensates for this.

This is the honest assessment: the calibration was derived from the same 10 by-elections used for evaluation. We do not yet know whether it generalises. The 7 May results across hundreds of wards will answer that question definitively.

What This Means for 7 May

On 7 May 2026, 136 English councils hold elections. That is thousands of individual ward contests. Nobody is polling any of them. The best prediction most commentators can offer is "Labour will have a bad night and Reform will do well." That is probably correct, but it does not tell you anything useful about which councils will change hands or where the surprises will be.

We are building 136 council-specific panels, 50 personas each, with locally matched demographics and seeded with council-specific political context. The full prediction set will be published at kronaxis.co.uk on 1 May, one week before polling day. After the results come in, we will publish a full comparison of predicted versus actual outcomes, including every ward where we got it right and every ward where we got it wrong.

This is a proof of concept. If synthetic persona panels can predict election outcomes at a useful rate across hundreds of wards, that validates the core technology for any application where you need to understand how a population will respond to something: a product launch, a pricing change, a policy proposal, a regulatory intervention. Elections are the hardest test because the ground truth is public, immediate, and unforgiving.

Try It Yourself

Kronaxis Panel Studio lets you build your own synthetic panels and run your own predictions. The technology behind this election study, the same persona generation, DYNAMICS-8 personality model, and multi-round stimulus pipeline, is available for market research, product testing, and policy analysis.

Sign up for a free API key at kronaxis.co.uk/register and run your first panel in minutes. The full election prediction methodology, including the DYNAMICS-8 specification (CC BY 4.0), is published at kronaxis.co.uk/research.

The detailed research paper, including methodology, per-ward results tables, and references, is available at kronaxis.co.uk/blog/election-prediction-paper.

Try it yourself

Build a census weighted UK panel and run your own stimulus test.

Get Your API Key