Quali review and how stats can be deceiving: Races 1 to 4

No posts last week. I saw my mom after many years, and I believe that I made the right decision by choosing her during the weekend instead of the usual F1. I am back though, so let’s take a look at how quali has been going for the teams during this season and how the data can be manipulated (intentionally or unintentionally) to show results that are not necessarily the most accurate.

After three races, teams' average qualifying pace compared to fastest:
1 Mercedes
2 Ferrari +0.227secs
3 Red Bull +0.646secs
4 Haas +1.170
5 McLaren +1.377
6 Alfa Romeo +1.511
7 Renault +1.588
8 Toro Rosso +1.775
9 Racing Point +1.788
10 Williams +3.716
— Andrew Benson (@andrewbensonf1) April 24, 2019

Why is the methodology important?

Let’s start by taking a look at this tweet posted by Andrew Benson, BBC Sport’s chief F1 writer. Mr. Benson shows some good numbers here, but some important information seems to be missing. How were these stats calculated? Did he average the quali times of both drivers from each team? What if one driver missed the cut from Q1 to Q2 or from Q2 to Q3? This information is surely important to properly understand how the teams are doing during quali.

Without knowing how the numbers were obtained, we can draw wrong conclusions from them. The same thing happens if the data is processed in a way that is not optimal. We may think that a delta of one second between two drivers always means the same thing, regardless of the race, but that is not necessarily true. At long tracks, like Spa-Francorchamps, one second is not a lot, since a full lap takes over 1 minute and 40 seconds. At short tracks, like the Red Bull Ring, a second is worth a lot more, since a full lap takes just over one minute. We cannot necessarily compare time as an absolute unit between tracks, and it may be better to compare percentages, taking the fastest lap as 100% and slower laps from other drivers as increments over that total (e.g. 100.5%, 101%, etc.).
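To put a number on that idea, here is a minimal Python sketch. The lap times are round, illustrative values (100 seconds for a long lap, 65 seconds for a short one), not official timing data, and the function name is just something I made up for the example.

```python
def gap_as_percent(gap_seconds, fastest_lap_seconds):
    """Express a gap to the fastest lap as a percentage of that lap."""
    return 100.0 * gap_seconds / fastest_lap_seconds

# Illustrative lap lengths only (assumed, not real timing data).
long_lap = gap_as_percent(1.0, 100.0)   # one second on a long lap
short_lap = gap_as_percent(1.0, 65.0)   # one second on a short lap

print(f"Long lap:  {long_lap:.2f}% off the pace")
print(f"Short lap: {short_lap:.2f}% off the pace")
```

The same one-second gap is a noticeably bigger share of the short lap, which is why percentages travel better between tracks than raw seconds.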

Let’s see some examples from the first 4 races to show how methodology and data analysis can change the results in meaningful ways.

Delta per quali session

Here we have the average delta per quali session for the teams. How was this chart made? Well, in this case I did the following.

Selected the fastest driver from each team in each quali session.
Compared his time to the fastest time from that particular session. For example, if a driver moved on to Q3, I compared his Q3 time to the fastest Q3 time. If a driver only managed to move on to Q2, then I compared his time to the fastest Q2 time.
The table on the right shows how many times the fastest driver from each team finished his qualifying in Q1, Q2 or Q3.
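The steps above can be sketched in a few lines of Python. This is not the exact script behind the chart, just a rough illustration. The team names, drivers and lap times are all made up, and I am assuming that the "fastest driver" of a team is the one who progressed furthest, with the lap time as a tie-breaker.

```python
# Hypothetical data for one qualifying; times are in seconds and NOT real.
SEGMENT_ORDER = {"Q1": 1, "Q2": 2, "Q3": 3}

# Fastest time set by anyone in each segment (assumed values).
fastest_in_segment = {"Q1": 82.0, "Q2": 81.0, "Q3": 80.0}

# (team, driver, last segment reached, best time in that segment)
results = [
    ("Team A", "Driver 1", "Q3", 80.0),
    ("Team A", "Driver 2", "Q3", 80.3),
    ("Team B", "Driver 3", "Q2", 81.9),
    ("Team B", "Driver 4", "Q1", 83.5),
]

def team_deltas(results, fastest_in_segment):
    """Return {team: (segment, delta)} using each team's best-placed driver."""
    best = {}
    for team, _driver, segment, time in results:
        # Rank by furthest segment reached first, then by the lower lap time.
        key = (-SEGMENT_ORDER[segment], time)
        if team not in best or key < best[team][0]:
            best[team] = (key, segment, time)
    # Compare each chosen lap with the fastest lap of the same segment.
    return {
        team: (segment, round(time - fastest_in_segment[segment], 3))
        for team, (_key, segment, time) in best.items()
    }

deltas = team_deltas(results, fastest_in_segment)
print(deltas)
```

Averaging these per-race deltas over the season would then give one number per team, which is the kind of value the chart shows.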

Why not average the times of both drivers from each team? Well, the main issue for me is that sometimes one driver goes into Q2 or Q3 and the other one does not. Track evolution plays a big role here: the track usually gets faster as qualifying goes on, so a Q1 time and a Q3 time are not directly comparable, and averaging them would not be a fair comparison.

Why not take the data only from Q1? That is a good argument, but for me it was important to show not only the delta from the fastest time to the best driver of each team, but also whether they were consistently moving on to Q2 or Q3.

Here the data shows that Mercedes has been the dominating team, with Ferrari and Red Bull following behind. Haas is the best team from the midfield, which is consistent with what we have seen so far during the season, but after them, things get a little bit sketchy.

Why are McLaren and Alfa Romeo so far behind? They both have had at least one driver qualify 3 times into Q3 and only 1 time into Q2, so surely they must be doing a good job. That is the main issue actually. Since they have gone into Q3 more often than not, they have had to compete more often against Mercedes, Ferrari and Red Bull with their famous “party mode”. Racing Point seems to have a similar problem. While they always tend to start slow, surely they are not the slowest team during quali so far.

Delta per quali session (-Q3)

I decided to remove Q3 and redo the analysis. Now the data seems to make more sense. All teams have had at least one driver in Q2 during each race of the season (minus Williams, cough cough). The data between the midfield teams is now more representative, and we can see how Toro Rosso and Renault now move to the back of the pack. This makes sense, since they both have been able to qualify for Q3 only once, while McLaren and Alfa Romeo move towards the front due to their good Q2 performances. Racing Point also benefits, with Checo Perez doing a good job by making it 2 times into Q3.

The main issue with this new graph is that the data from the top 3 teams is not as representative anymore. The top 3 teams have been able to put at least one driver into Q3 at every race of the season, so it makes sense to compare their Q3 times and not their Q2 times.

A final issue is that here we are just showing time in seconds as an absolute unit, regardless of track length. The data may well change a little bit if we normalize it by expressing the delta as a percentage instead of as a time unit.

Delta per quali session (expressed as %)

Now we have what I would consider a more representative graph. This time the analysis was done with the following methodology:

Selected the fastest driver from each team in each quali session.
Compared his time to the fastest time from that particular session. For example, if a driver moved on to Q3, I compared his Q3 time to the fastest Q3 time. If a driver only managed to move on to Q2, then I compared his time to the fastest Q2 time.
Times were normalized by setting the fastest time to 100% and expressing every other time as a percentage of it. E.g. at the Azerbaijan GP, the fastest Q1 time (1:41.335) would be 100%, while the slowest time (1:45.455) would be converted to 104.066%.
The top 3 teams were compared on their Q3 times, while the rest of the field were compared on their Q2 times (minus Williams, who have not been able to qualify for Q2).

This new methodology normalizes the track length and time differences between the first 4 races, and allows us to compare the top 3 teams with their best Q3 times, while at the same time allowing us to get a fairer comparison for the midfield teams with their best Q2 times.

If we take a look at the data, we can see that the midfield teams are actually closer than we may think by looking at the first graph. Haas, McLaren and Alfa Romeo have all moved 3 times into Q3, showing that they have been the fastest teams from the midfield, while Renault and Toro Rosso now move to the back of the order, with Racing Point in between them.

Actually, Racing Point moves from 7th to 8th in the order after normalizing, which shows how normalizing track times can have an effect on our results.

In the end, the graph shows that Haas is dominating the midfield during quali, with McLaren doing a good job standing firmly in 5th place.

Limitations

It is important to understand that the way I analyze things is not perfect. My analysis only takes into consideration the fastest driver from each team during each particular weekend, which, for me, shows how the best car and best driver of each team are doing, but does not necessarily represent the team as a whole.

For example, my analysis for Racing Point is being done only with Checo Perez’s times, since Lance Stroll has been woeful during qualifying sessions and has not been able to get out of Q1. In this case, my analysis shows that the Racing Point car has good potential and is well capable of fighting for good positions, but only with Checo Perez (at least so far). Because of this, as I have said, the results may represent more what the fastest driver from each team can do, and not necessarily the team as a whole.

Finally, the main issue when the data is normalized is that the numbers are not very easy to interpret. We lose time as a unit, and it is hard to see what the difference between teams is. We can compare them relative to each other, but saying “how far apart” they are becomes much harder.
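One rough way to get a feeling for a percentage is to scale it back onto a reference lap. The sketch below is only an illustration: the 90-second reference lap and the 101% value are assumptions I picked for the example, not numbers from the charts.

```python
def percent_gap_to_seconds(percent, reference_lap_seconds):
    """Translate a percentage (fastest lap = 100%) into a gap in seconds."""
    return (percent - 100.0) / 100.0 * reference_lap_seconds

# Assumed values: a team at 101% and a 90-second reference lap.
gap = percent_gap_to_seconds(101.0, 90.0)
print(f"About {gap:.1f} s per lap on the reference lap")
```

The result depends entirely on the reference lap that is chosen, so it is a reading aid and not a measured gap.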

Conclusion

As we just saw, explaining the methodology is important since the results can change depending on how the analysis was done. In our case, the results change from showing Racing Point as one of the slowest teams during quali, to a team that is perhaps not as far from the rest as the original results may show.

As we may have expected, Haas is leading the midfield, with McLaren and Alfa Romeo not so far behind. Toro Rosso seems to be the slowest from the midfield, with Renault standing in a disappointing 7th position. Racing Point looks to be not so far behind the midfield teams in quali pace, and they will be looking to improve during the next races.

From the top 3 teams, we also have expected results. Mercedes has been dominant, taking pole in all races but Bahrain. Ferrari and Red Bull are behind, and the difference so far seems to be significant, albeit not insurmountable.

Williams is very far behind, and the truth is that they have no chance of competing during this season. Surely, their only aim is to improve the car and try to get within 102% of the fastest car in quali.

This was a slightly different article, which hopefully helps you to make your own decisions and question the data when it is presented. Mr. Benson did a good job with his numbers, but I would have liked to see his methodology to properly understand what his numbers mean. My analysis is not perfect either, but that is the whole point: despite stats and math, the end result can be different depending on how you analyze the numbers.

Hopefully you guys have enjoyed this article. Please tell me what you think in the comments below.

2 Comments

Fabian on May 7, 2019 at 8:13 am
Welcome back! Thanks for this great analysis (as always). I really like how you show us step for step what you have done and what your conclusions are. It’s great to have well thought out stats about Formula 1.
It’s always great seeing a new analysis by you, in fact after every GP I check your website for updates.
I’m already excited for your next analysis!
Cheers
- admin on May 8, 2019 at 5:46 am
  Thanks Fabian, it’s good to know that you have been enjoying the articles. I will try to post as often as I can.
  Take care.