High variance between tests can make it difficult to tell if your site got slower, or if you're just seeing random noise.
Two tests of the same website will never match by the millisecond, but there are some strategies to reduce test variability.
There are two ways that a slider transition affects your metrics:
Let's say your website takes between 9 and 11 seconds to fully load, and the slider transition happens after 10s. If Lighthouse is still waiting for your page to load after 10s, then it will observe the slider transition and count it towards the page CPU time and download size.
Observing this page activity will also cause Lighthouse to wait longer before finishing its tests, to make sure the page has now fully loaded. This in turn will lengthen the time window where network requests or CPU activity are captured.
To avoid this, disable sliders and animations for your test. The easiest way to do that is to add a query string like ?disableSliders to the test URL and then write some custom code that disables slider transitions on your page when that parameter is present.
If your company is running A/B tests, users will be randomly assigned sometimes to one version and sometimes to the other. That randomness will show up in your test results.
To fix this, pick one version and make sure you test it consistently. Again, you can usually use a query string or cookie to do this.
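As a rough sketch of the query string approach, the helper below adds fixed test parameters to the URL you pass to your testing tool. The disableSliders flag comes from the slider example above. The abVariant parameter name and its value are hypothetical, and your site needs its own code to read and honor both.

```python
from typing import Mapping
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url: str, params: Mapping[str, str]) -> str:
    """Return `url` with `params` merged into its existing query string."""
    parts = urlsplit(url)
    # Keep parameters already on the URL, including ones without a value.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


# "abVariant" is a made-up parameter name used only for illustration.
test_url = add_query_params(
    "https://example.com/products?page=2",
    {"disableSliders": "1", "abVariant": "control"},
)
print(test_url)
```

Using the same generated URL for every run keeps the tested version fixed, as long as the page actually honors these parameters.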
Ads, article recommendations, and other dynamic third-party content will be different every time. This will show up in your performance metrics, as many third parties are CPU-heavy and download a large amount of content.
A simple solution is to disable ads. However, this means you are now no longer testing your site the way a user would experience it.
A more complex option would be to create a mock third-party that always serves the same content. Load the page in your browser and save the third-party content. Then set up a static server that always serves the content you captured.
One issue here is that your server will have different performance characteristics than the third-party server. A static server might respond to a request within milliseconds, while the third-party might normally take seconds.
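One possible shape for such a mock, sketched with Python's standard library: a static file server that sleeps for a fixed time before each response, so it behaves a little more like a slow third party. The directory name, port, and delay are assumptions to adjust, and a fixed delay is only a crude approximation of real third-party behavior.

```python
import time
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class DelayedStaticHandler(SimpleHTTPRequestHandler):
    """Serves static files, waiting `delay_seconds` before each GET response."""

    delay_seconds = 0.0

    def do_GET(self):
        # Simulate the response time of the real third-party server.
        time.sleep(self.delay_seconds)
        super().do_GET()


def make_mock_server(directory, port=8000, delay_seconds=0.5):
    """Create a server that always serves the captured files in `directory`."""
    handler_cls = type(
        "ConfiguredHandler", (DelayedStaticHandler,), {"delay_seconds": delay_seconds}
    )
    handler = partial(handler_cls, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Example usage ("captured-third-party" is a hypothetical folder name):
# make_mock_server("captured-third-party").serve_forever()
```

Your test page would then need to load the third-party files from this local address instead of the real third-party host.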
In the real world there'll always be variance. Other processes on your test machine can use up CPU capacity, and the network infrastructure between your device and the server might be busy.
To reduce variance in spite of this, you can run each test 3 or more times and then look at the median result. This reduces the impact of outliers and saves time you'd otherwise spend trying to understand random one-off results.
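To illustrate why the median helps, here is a minimal sketch that takes the per-metric median across several runs. The metric names and numbers are made up for the example and do not come from a real test.

```python
from statistics import median


def median_metrics(runs):
    """Return the per-metric median across a list of {metric: value} dicts."""
    metric_names = runs[0].keys()
    return {name: median(run[name] for run in runs) for name in metric_names}


# Three hypothetical runs of the same page; the second one is an outlier.
runs = [
    {"performance_score": 62, "lcp_ms": 3100},
    {"performance_score": 41, "lcp_ms": 5400},
    {"performance_score": 64, "lcp_ms": 2900},
]

result = median_metrics(runs)
print(result)  # {'performance_score': 62, 'lcp_ms': 3100}
```

The slow second run does not affect the reported values, whereas an average would have pulled the score down to about 56.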
If you use DebugBear to monitor your website, you can enable multiple runs in the project settings.
Sometimes Lighthouse will not wait long enough for your page to fully load. For example, about 10% of the time it might finish a test before a chat widget appears. When this happens you'll see a spike in your performance score.