How can results for the same sample be different?

When in doubt, customers sometimes take repeat samples or send them to additional labs for clarification. Very rarely will two results be exactly the same between labs. Results for two samples can differ in a few ways.

When the samples are taken – The first is when the sample is taken. If the samples are taken at two separate times, then factors such as oil top-ups, deterioration of the oil, filtration performance etc. can lead to differing values on most tests. For instance, TAN and TBN measured on an engine over time, even on samples taken just a few days apart, could show large differences depending on the operating conditions of the equipment.

Sampling technique – This is the most common cause of differences. Dirt or other contamination may enter the container during sampling, so even if two samples are taken within minutes of each other the amount of these non-soluble contaminants will be different. Even if you attempt to sub-sample from one container into two others, the level of contamination will never be exactly the same in both. However, for properties that are not concentration based and that do not settle out or cannot be filtered out, such as density, viscosity etc., this should not be an issue.

The sample itself – If there is contamination or wear debris in a sample, then because it is never truly dissolved you can get differences in the measured values. For instance, measuring silicon by elemental analysis when sand is present could give quite significant differences depending on whether the instrument probe took up one grain of sand, two grains, or none at all when it aspirated the sample.

Error of the method – Even with some of the most precise equipment in the world, there is always some error in each test. This is true of every lab and every test. There are two ways of measuring this error:

  • Repeatability – This is the allowable difference on identical material analysed by the same operator under exactly the same operating conditions, including temperature, run number, day etc. It is usually used for statistical or QC testing work where you have a large sample volume and run the same test perhaps 6 or 10 times to get a very good average, and then use that average as the value. This tends to require large quantities of sample and is generally only done for research purposes or in very small QC labs. The issue is that in larger laboratory operations processing hundreds, or in our lab's case thousands, of samples per day, each usually having 30+ tests, it is not usually possible to do all this work on the same instrument with one lab operator, as there would not be enough time. For instance, we have over 50 of some instruments that are used routinely for some tests. Additionally, if we re-test a sample we often try to run it on a different instrument to rule out an instrument error too, so in this case repeatability is not the correct measure to use.
  • Reproducibility – This is measuring identical material (i.e. the sample in the same bottle) using different equipment types/models and different operators (i.e. lab chemists), and can even extend to different labs too. In this case the reproducibility is the allowable difference from the average of the values, expressed as a +/- for the method (usually specified in methods published by recognised standards organisations such as ASTM). So if you have a reproducibility of +/-10% and the average of the results is 50, then you can have values as low as 45 and as high as 55 and they are still considered the same value statistically. It is worth noting that this rule only holds for 95% of the times a sample is tested, so roughly 1 in 20 times the test is run you could get a value outside that band, such as 58, without the method being at fault.
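The reproducibility arithmetic above can be sketched in a few lines of Python. This is only an illustration of the +/- band described in the text, not a lab procedure: the function names and the example results are hypothetical, and real test methods define their own precision statements, which may not be a simple percentage of the average.

```python
def reproducibility_band(values, reproducibility_pct):
    """Return (mean, low, high) for a +/- percentage band around the mean.

    Assumption: reproducibility is treated as a simple percentage of the
    average of the results, as in the worked example in the text.
    """
    mean = sum(values) / len(values)
    margin = mean * reproducibility_pct / 100.0
    return mean, mean - margin, mean + margin


def within_reproducibility(values, reproducibility_pct):
    """True if every result lies inside the band around the mean of the results."""
    _, low, high = reproducibility_band(values, reproducibility_pct)
    return all(low <= v <= high for v in values)


# Hypothetical results for one bottle of sample tested on two instruments.
results = [46.0, 54.0]
mean, low, high = reproducibility_band(results, 10.0)
print(mean, low, high)                              # 50.0 45.0 55.0
print(within_reproducibility(results, 10.0))        # True
print(within_reproducibility([42.0, 58.0], 10.0))   # False
```

Because the band only covers about 95% of tests, a False result here is a prompt to look closer or re-test rather than proof that the two results genuinely differ.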

In light of this, it is hard for the end client reading a report to establish which changes in the data are significant. For instance, on an industrial gearbox an iron value of 3mg/kg followed by 15mg/kg is a fivefold rise (a 400% increase) and outside of reproducibility; the same would be the case with 25mg/kg and 125mg/kg of water. Most of this difference arises because it is difficult to present exactly the same sample each time, as water and sediment will naturally drop out of the sample as soon as it is put onto the instrument. Even if the instruments have mixers, a certain amount of time has to be allowed for the sample to settle to prevent bubbles being aspirated with the sample. Hence this is always a factor to take into consideration. However, when comparing data it is important to take into account not only reproducibility and sample type issues, but also the significance of the change. For instance, a change from 3mg/kg to 15mg/kg on a system where the caution limit is 400ppm (mg/kg) iron, despite being a fivefold change, is not diagnostically significant as it will not affect the diagnosis.
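The iron example can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and comparing a result with the caution limit is a simplification of how a diagnostician weighs a change. It shows that a rise from 3 to 15 is five times the earlier value (strictly a 400% increase), yet the new result is still under 4% of a 400ppm caution limit.

```python
def percent_increase(previous, current):
    """Percentage increase from the previous result to the current one."""
    return (current - previous) / previous * 100.0


def fraction_of_limit(value, caution_limit):
    """How far a result is towards the caution limit (1.0 means at the limit)."""
    return value / caution_limit


previous_fe = 3.0         # mg/kg iron, earlier sample (figures from the text)
current_fe = 15.0         # mg/kg iron, latest sample
caution_limit_fe = 400.0  # ppm (mg/kg) caution limit from the text

print(current_fe / previous_fe)                         # 5.0   (fivefold)
print(percent_increase(previous_fe, current_fe))        # 400.0 (percent increase)
print(fraction_of_limit(current_fe, caution_limit_fe))  # 0.0375
```

A large relative change on a very small number can therefore sit comfortably inside the normal range for the equipment.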

This type of judgement call sometimes requires some experience in this area, so it is always best to talk to the lab about a report. Likewise, it is useful to consider attending a training course on oil analysis. It is worth noting that if any sample shows a marked change since the last sample, especially with the diagnosis flags changing too, and the corrective action is likely to be costly, then it is always worth taking a re-sample to rule out sampling error as the cause of the change. A new confirmation sample is a much cheaper alternative to a corrective action that didn’t need to be made owing to a poorly taken sample.

To give its customers extra reassurance as to the quality of the data, a lab can also choose to take part in round robin proficiency testing, where its results for the same material are compared against those of multiple other labs to see how close they are to the average of all the labs, which statistically is the best estimate of the correct answer.