Methodology

At its most basic level, the methodology is informed by universal swing. Naturally, this is limited when it comes to Ireland’s PR-STV electoral system, so there are a number of additional layers on top of it. This page will explain the underlying methods for first preference vote (FPV) and transfers, the models used for baselining (there are six of them!) and then the final model used to create the overall outcome. This remains a work in progress and I intend to find ways to refine and improve it as time goes on.

I’m going to try to explain this without using statistical or scientific notations and concepts, firstly because I find that language to be inaccessible, secondly because all this was done in a sprawling mass of Excel spreadsheets, and it would be a disservice to try to hide some of the more bootleg pieces, many of which are bootleg by necessity owing to gaps in the underlying data, behind an academic veneer, and thirdly because I don’t understand P values. (This is a joke, please stop emailing me about P values.)

Note that the model currently assumes that if a party ran in a constituency in 2020, it will in the next election, and if it did not, it will not. This is a current limitation of the data, and filling in those gaps would require an awful lot of assumptions and work. It is something I will consider once we know who exactly is running where, but for now it would bring in too many problems. The only exception to this was an SF candidate added to Cork North West.

FPV Model

FPV calculations are based on provincial level polling data. This data is tracked on both a 5-poll and 10-poll rolling basis. This creates two data points indicating current support levels for each political party. Rolling polls are used to avoid outliers and to react to trends in polling rather than individual polls; this is especially important given the smaller sample sizes used in provincial level data.
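As a rough illustration of the rolling figures (the actual work was done in Excel, so this Python is only a sketch), the function below averages the most recent polls for one party in one province. The poll numbers and the name `rolling_average` are made up for the example, and a plain unweighted mean is an assumption.

```python
def rolling_average(polls, window):
    """Plain mean of the most recent `window` polls (list is oldest first)."""
    recent = polls[-window:]
    return sum(recent) / len(recent)


# Hypothetical provincial polling figures (%) for one party, oldest first.
example_polls = [28.0, 31.0, 29.0, 33.0, 30.0, 27.0, 32.0, 30.0, 31.0, 29.0]

five_poll = rolling_average(example_polls, 5)    # last five polls only
ten_poll = rolling_average(example_polls, 10)    # all ten polls

print(f"5-poll: {five_poll:.1f}%, 10-poll: {ten_poll:.1f}%")
```

The shorter window reacts faster to a trend, while the longer one is steadier, which is why having both data points is handy with small provincial samples.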

These data points are then measured against the performance of each party in the 2020 election. For example, if SF got 20% of FPV in a certain province in the election and are now polling at 30%, the model applies a +50% swing to their numbers in that province. So if in a certain constituency within that province they got 10% FPV, the model will put them at 15%.
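The swing step above can be sketched in a few lines of Python. This is an illustration rather than the actual spreadsheet logic, and the function and parameter names are hypothetical.

```python
def apply_swing(constituency_2020, province_2020, province_poll):
    """Scale a party's 2020 constituency FPV by its proportional change
    in the province (current polling divided by the 2020 result)."""
    swing_ratio = province_poll / province_2020
    return constituency_2020 * swing_ratio


# The example from the text: 20% in the province in 2020, now polling 30%,
# and 10% in the constituency in 2020.
projected = apply_swing(constituency_2020=10.0, province_2020=20.0, province_poll=30.0)
print(f"Projected constituency FPV: {projected:.1f}%")
```

Note that the swing is proportional (multiply by 1.5), not additive (add 10 points), so strong constituencies move more in absolute terms than weak ones.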

This creates situations where the FPV numbers in a constituency no longer add up to 100%, so the numbers are smoothed out so they add up to that total. For example, if in a certain constituency you end up with SF 35%, FG 35%, FF 20%, LAB 10%, GP 5%, SD 5% (totaling 110%) the model will take their FPV numbers as (approximately) SF 31.8%, FG 31.8%, FF 18.2%, LAB 9.1%, GP 4.5%, SD 4.5%.
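The smoothing described above amounts to dividing every party's figure by the constituency total. A minimal Python sketch, using the numbers from the example (the name `normalise` is just for illustration):

```python
def normalise(shares):
    """Rescale FPV shares so that they sum to 100%."""
    total = sum(shares.values())
    return {party: 100 * value / total for party, value in shares.items()}


# Post-swing figures from the text, which add up to 110%.
raw = {"SF": 35, "FG": 35, "FF": 20, "LAB": 10, "GP": 5, "SD": 5}
smoothed = normalise(raw)

for party, share in smoothed.items():
    print(f"{party}: {share:.1f}%")
```

Because every party is scaled by the same factor, the relative order and ratios between parties are unchanged.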

Transfer Model

This one is more complicated and involves multiple steps.

The model takes the basic number of transfers per constituency from the 2020 election and adjusts them based on the same universal swing methodology used for FPV. This is not ideal, as it assumes transfers move at the same proportion as FPV, but there isn’t sufficient additional data on changes in transfer rates to indicate any other basis would be more valid. Similarly, there is no way to differentiate between second, third, fourth etc. preferences, so this is not currently considered.

To illustrate how this works, let’s say in a certain constituency 10% of SF transfers went to FF in the election. In that constituency’s province, FF got 20% FPV in the election but are currently polling at 16%. This is a -20% swing, so the same will be applied to the transfers, meaning the model will assume SF will transfer to FF at a rate of 8%.

Because elimination order, simultaneous eliminations, and non-transferable surpluses mean there are gaps in the data, some assumptions have to be made. To work around this I have used three methods, only applying later methods if there is no data for the prior one.

The first is reciprocity. This assumes that transfers flow equally between two parties. So if for the above example I have calculated that SF transfer to FF at 8% based on the past election, but have no data for FF to SF transfers, the model assumes that is also 8%. If there is no data at all (e.g. neither FF nor SF transferred to each other in 2020), the second step kicks in, which is regional averages. This takes the average of all transfers in that province between the two parties and plugs it in for that specific constituency. Finally, if that is not available (which really only applies to Sinn Fein to Sinn Fein transfers in Munster and Leinster), the same process is done but using a national, instead of provincial, average. The national average is kept as a last resort because relying on it more widely was bringing out too many anomalies; I switched to the provincial/regional average as the main fallback, which has its own flaws but seems to produce less wildly moving outcomes. This is something that still needs improvement, as transfers are the most challenging part of the modelling.

Again, there is no perfect method, but I believe this is the best approach given the current data, and the deficiencies with it. These methods are used before the swing calculation is applied.
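The order of fallbacks can be sketched as a simple lookup chain. This is a Python illustration only: the data layout (dictionaries keyed by source and destination party) and all names are assumptions, not a description of the actual spreadsheets.

```python
def lookup_transfer_rate(source, dest, constituency, provincial_avg, national_avg):
    """Return a 2020 transfer rate (%), trying each fallback in order.

    Each mapping is keyed by (source_party, destination_party).
    Returns None if no method has any data.
    """
    # Direct data from the constituency's own 2020 count.
    if (source, dest) in constituency:
        return constituency[(source, dest)]
    # 1. Reciprocity: borrow the rate observed in the opposite direction.
    if (dest, source) in constituency:
        return constituency[(dest, source)]
    # 2. Provincial average for this pair of parties.
    if (source, dest) in provincial_avg:
        return provincial_avg[(source, dest)]
    # 3. National average as the last resort.
    return national_avg.get((source, dest))


# Hypothetical data for one constituency.
constituency = {("SF", "FF"): 10.0}
provincial_avg = {("FF", "FG"): 30.0}
national_avg = {("SF", "SF"): 60.0}

print(lookup_transfer_rate("FF", "SF", constituency, provincial_avg, national_avg))
```

Here FF to SF has no direct data, so the SF to FF figure is reused under reciprocity.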

Finally, this inevitably results in total transfer numbers that do not add up to 100% of available ballots. The same smoothing to 100% that is applied to FPV is applied here.
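Putting the transfer swing and the smoothing together for one party's transfers might look like the Python sketch below. The figures for FG and SD are invented for the example, the names are hypothetical, and non-transferable votes are left out entirely, which is a simplifying assumption.

```python
def project_transfers(rates_2020, province_2020, province_poll):
    """Adjust one party's 2020 transfer rates by each destination party's
    provincial swing, then rescale so the rates sum to 100%."""
    swung = {
        dest: rate * province_poll[dest] / province_2020[dest]
        for dest, rate in rates_2020.items()
    }
    total = sum(swung.values())
    return {dest: 100 * rate / total for dest, rate in swung.items()}


# Where SF transfers went in 2020 in a hypothetical constituency (%).
sf_rates_2020 = {"FF": 10.0, "FG": 10.0, "SD": 80.0}
# Provincial FPV in 2020 and in current polling (%); FF is down from 20 to 16.
province_2020 = {"FF": 20.0, "FG": 20.0, "SD": 5.0}
province_poll = {"FF": 16.0, "FG": 20.0, "SD": 5.0}

projected_rates = project_transfers(sf_rates_2020, province_2020, province_poll)
for dest, rate in projected_rates.items():
    print(f"SF -> {dest}: {rate:.1f}%")
```

Before smoothing, the SF to FF rate drops from 10% to 8% as in the earlier example; the rescaling then nudges every rate up slightly so the total returns to 100%.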

Final Model

As it stands, the final model uses the below baselining exercises to figure out the number of candidates each party should rationally run to maximise seats. Naturally, this is considered independently for each party, so it doesn’t always create optimal outcomes for everyone.

An exception is made if a party has more sitting TDs in a constituency than it should rationally run. Then it will run as many sitting TDs as it currently has.

The FPV and transfer models outlined above are applied. If multiple TDs from one party are running, a vote split is then estimated based on 2020 performance. This gives a likely outcome, but some seats are very, very close, and the model is not going to be 100% on the money, so these situations will be discussed more on each constituency’s page as they arise.

This model will become more accurate once we know exactly how many people are running in each constituency, rather than relying on assumptions of optimal candidate strategy.

Baselining Models

Still with me? Brilliant. You’ve got through the stuff which matters. This part isn’t necessary to understand the overall model, but it’s a useful insight into how the baselining mentioned above works. Six models were used to create baselines on performance. These aren’t used to calculate final outcomes, but to game out what happens under different scenarios. The models are as follows:

National d’Hondt

This takes the FPV in each constituency, based on national-level swing, and distributes via the d’Hondt method, which assigns seats proportionally. The total FPV of each party is divided by the number of seats won plus 1 for each count.

So if you have a five-seater that ended up FG 40%, SF 35%, FF 15%, everybody else 10% total, the distribution works like this:

Count 1: FG 40%, SF 35%, FF 15%, rest 10% – FG win a seat
Count 2: FG 20%, SF 35%, FF 15%, rest 10% – SF win a seat
Count 3: FG 20%, SF 17.5%, FF 15%, rest 10% – FG win a seat
Count 4: FG 13.3%, SF 17.5%, FF 15%, rest 10% – SF win a seat
Count 5: FG 13.3%, SF 11.7%, FF 15%, rest 10% – FF win a seat

This obviously has poor predictive value because it cannot account for transfers, but it’s useful to see what the purely FPV proportions of a constituency should indicate.
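For anyone who wants to replay the five-seater above, here is a short Python sketch of the d’Hondt allocation. The "rest" bloc is left out on the assumption that it is a grouping of smaller parties rather than a single party (it would not win a seat in this example either way), and the function name is just for illustration.

```python
def dhondt(shares, seats):
    """Allocate seats by d'Hondt: at each count the seat goes to the party
    with the highest value of share / (seats already won + 1)."""
    won = {party: 0 for party in shares}
    order = []
    for _ in range(seats):
        quotients = {party: share / (won[party] + 1) for party, share in shares.items()}
        winner = max(quotients, key=quotients.get)
        won[winner] += 1
        order.append(winner)
    return won, order


seats_won, seat_order = dhondt({"FG": 40.0, "SF": 35.0, "FF": 15.0}, seats=5)
print(seat_order)  # order in which the seats are awarded
print(seats_won)   # final tally per party
```

The award order comes out as FG, SF, FG, SF, FF, matching the counts listed above. Ties are not handled specially here; `max` simply picks the first party it encounters.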

Provincial d’Hondt

Identical to the above, but uses provincial level instead of national level swings. This is a little more indicative as the rest of the model is based on provincial level polling, which is more nuanced even if it does have bigger margins of error.

Safe

This assumes that a party will only run candidates when it has a full quota for them (unless the party has less than one full quota, in which case they will still run one candidate). So if a party has 1.1 quotas or 1.9, it will still run 1 candidate. This is a highly conservative model that seeks to maximise the floor of candidates elected, and avoid putting seats at risk. It does throw up weird outcomes in five seat constituencies, where it gives unrealistic outputs for the final seat, but it’s brilliant for assessing three seaters.
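The candidate count under this model can be sketched in Python as below. How quotas are measured is an assumption here: the sketch treats the quota as a share of the valid vote, 100 / (seats + 1), which is the Droop quota used in Irish elections without its extra one vote. The function names are hypothetical.

```python
import math


def quotas_held(vote_share, seats):
    """Approximate number of quotas a vote share (%) represents,
    taking the quota as 100 / (seats + 1) percent of the vote."""
    return vote_share / (100 / (seats + 1))


def safe_candidates(quotas):
    """Run one candidate per full quota, with a minimum of one."""
    return max(1, math.floor(quotas))


# A party on 47.5% in a three-seater holds 1.9 quotas and runs one candidate.
q = quotas_held(47.5, seats=3)
print(f"{q:.1f} quotas -> {safe_candidates(q)} candidate(s)")
```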

This, and all subsequent models, use the transfer methodology outlined above.

Fianna Fail will probably want to strongly consider this approach in the next election. (Or they can run four candidates in Wexford again for some unfathomable reason and cost themselves a seat. I’m sure someone got paid very well to make that bizarre decision.)

Party Max

This is the opposite of the prior approach, and assumes each party will run one more candidate than it has full quotas. So 1.1 quotas or 1.9 both result in 2 candidates. This is a very common strategy and therefore is fairly useful as a baselining tool.
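In Python terms, the rule is a one-liner (sketch only, with a hypothetical function name). What happens at exactly a whole number of quotas is not specified; this sketch assumes 2.0 quotas would mean 3 candidates.

```python
import math


def party_max_candidates(quotas):
    """Run one more candidate than the number of full quotas held."""
    return math.floor(quotas) + 1


for quotas in (0.5, 1.1, 1.9):
    print(f"{quotas} quotas -> {party_max_candidates(quotas)} candidate(s)")
```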

2020 Candidates

This assumes that each party will run exactly as many candidates as it did in 2020, with a perfectly even division of votes. For example, say in a certain constituency FG got 20% in 2020 and ran two candidates, swing indicates they will get 24%, so each candidate is assigned 12%.
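A minimal Python sketch of the even division, using the FG figures from the example (the function name is hypothetical):

```python
def even_split(projected_share, candidates_2020):
    """Divide a party's projected FPV equally between the same number
    of candidates it ran in 2020."""
    return [projected_share / candidates_2020] * candidates_2020


# FG projected at 24% with two candidates, as in the text.
per_candidate = even_split(projected_share=24.0, candidates_2020=2)
print(per_candidate)
```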

This is fairly useful, particularly for measuring the likely impact of transfers devoid of all other factors, but is also a fun way of gauging just how bad parties are at deciding how many people to run.

2020 Split

This is the same as above, but with the projected vote divided between candidates in the same proportion as in 2020. For example, say in a certain constituency FG got 20% in 2020 and ran two candidates, one getting 15% and the other 5%. Swing indicates that FG will total 24% of votes in this constituency, so the candidates are assigned 18% and 6% respectively.
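The proportional division can be sketched in Python as follows, again with the FG figures from the example and a hypothetical function name:

```python
def split_as_2020(projected_share, candidate_shares_2020):
    """Divide a party's projected FPV between its candidates in the same
    proportions as their 2020 results."""
    total_2020 = sum(candidate_shares_2020)
    return [share * projected_share / total_2020 for share in candidate_shares_2020]


# FG candidates got 15% and 5% in 2020; the party is now projected at 24%.
projected_candidates = split_as_2020(24.0, [15.0, 5.0])
print(projected_candidates)
```

Each candidate keeps the same fraction of the party vote (three quarters and one quarter here), which is exactly the static-split assumption this model rests on.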

This is a really useful indicator of the impact of the transfer model on a “real world” scenario, but is based on the major assumption that vote divisions will remain static.
