
## Heterogeneity Test (14 replies)

Capping at the 95th percentile is usually too severe for estimation. Usually you should use a cumulative probability plot and see where the points break away from the trend of the line; this is usually above the 98th percentile. Have a look at the coefficient of variation (CoV) as well. Ideally it should come down radically as you go from no cap to using a cap. At the absolute worst the CoV should come down below 2.5, and lower still is better. Capping is a whole field of study in itself.
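The CoV check described above can be sketched in a few lines of Python. The lognormal data here is purely hypothetical, standing in for real skewed gold assays; the percentile choices mirror the ones discussed in this thread:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical skewed gold-grade data (lognormal, g/t) standing in for real assays.
grades = rng.lognormal(mean=0.0, sigma=1.5, size=5000)

def cov(x):
    """Coefficient of variation: standard deviation divided by mean."""
    return np.std(x, ddof=1) / np.mean(x)

for pct in (None, 98, 95):
    if pct is None:
        capped = grades
        label = "no cap"
    else:
        cap = np.percentile(grades, pct)
        capped = np.minimum(grades, cap)  # capping: values above the cap are set to it
        label = f"cap at {pct}th percentile ({cap:.2f} g/t)"
    print(f"{label}: CoV = {cov(capped):.2f}")
```

Plotting the sorted grades on a probability scale (e.g. against normal quantiles) would show the break in slope that the post recommends using to pick the cap, rather than a fixed percentile.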

Good answer! Keep in mind, though it is somewhat off the question, that a cumulative probability plot can also help identify multiple populations, which may need to be separated and analyzed individually. Just my two cents.

I am assuming you tested for gold. If so, heterogeneity test work is mainly designed to measure the influence of any coarse gold. You can use the 100-sample particle method or the staged duplicate sample method - anomalies in both procedures are exactly what we are testing for, so outliers cannot be removed. You cannot lower your FSE - it is fundamental - but you can lower the error in sampling on site by taking larger samples (within practical considerations).

There are several common statistical tests for outliers. One frequently used is Grubbs' Outlier Test and you are also correct that simply using a two standard deviation criterion is not recommended. Furthermore, there is much more to it than simply removing the detected outliers; I know of no recommendation offered in the statistical literature that supports doing this. Consider that given sufficient data, one can remove enough selected data points to reach almost any conclusion desired. To the contrary, a detection of one or more outliers should trigger an investigation of the cause and there are many possible causes. You mentioned one which is certainly a possibility. Also, common outlier tests assume a Gaussian (Normal) distribution. The distribution under scrutiny may not be Gaussian. It may be heavier in the tails and it may not be symmetric. Outliers also can be generated by unexpected errors/mistakes in the sample preparation or laboratory analysis process.
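Since Grubbs' test is named above, here is a minimal sketch of it, implemented from the standard two-sided critical-value formula. The data values are invented for illustration, and note the caveat from the post: the test assumes a Gaussian distribution, and a flagged point is a trigger for investigation, not automatic removal:

```python
import numpy as np
from scipy import stats

def grubbs_statistic(x):
    """G = max |x_i - mean| / s, the Grubbs' test statistic."""
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x - x.mean())) / x.std(ddof=1)

def grubbs_critical(n, alpha=0.05):
    """Two-sided critical value for Grubbs' test at significance level alpha."""
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    return (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))

data = [4.2, 4.4, 4.1, 4.3, 4.5, 4.2, 9.8]  # hypothetical assays; last value is suspicious
G = grubbs_statistic(data)
G_crit = grubbs_critical(len(data))
print(f"G = {G:.2f}, critical value = {G_crit:.2f}")
if G > G_crit:
    print("Flagged as an outlier: investigate the cause before removing anything.")
```

If the underlying distribution is heavy-tailed or skewed, as gold grades usually are, this test will flag perfectly genuine high grades, which is exactly the trap discussed in this thread.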

Indeed. If the cumulative probability plot is not a nice straight line up to the high 90 percentiles you need to see if you are dealing with two or more sub domains. See if there is any chance that you can separate these populations spatially.

Looking at all the samples, the CoV is 3.85; capping at the 99th percentile lowers it to 2.25. Inside the resource model the CoV was 2.74, and capping at the 98th percentile lowered it to 1.88. Also, the change of slope indicating a different population is around the 98th percentile.

No sample from the heterogeneity test was higher than the capping grade, 57 g/t. The test was done on four different granulometries, and the CoVs were 1.50, 1.79, 1.09 and 0.79. When I removed the results that were over the mean plus 2 times the standard deviation, the CoVs changed to 1.25, 1.16, 0.78 and 0.59 respectively. Is this good practice, or could it mask the real heterogeneity of the ore? The results for alpha and K change from 1.28 and 842 to 1.15 and 309, and the liberation size changes from 319 microns to 199 microns, which is closer to what we see.
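To see what those two (K, alpha) calibrations imply in practice, the following sketch assumes the modified Gy relation used in duplicate-series calibrations, sigma_FSE^2 = K * d^alpha / M, with d the nominal fragment size in cm and M the sample mass in g. The fragment size and FSE target below are illustrative assumptions; only K and alpha come from the post:

```python
# Sensitivity of the required sample mass to the calibration constants.
# Assumes the modified Gy relation sigma_FSE**2 = K * d**alpha / M,
# with d the nominal fragment size in cm and M the sample mass in g.
def required_mass(K, alpha, d_cm, fse_target):
    """Sample mass (g) needed to keep the relative FSE below fse_target."""
    return K * d_cm**alpha / fse_target**2

d = 0.1      # 1 mm nominal fragment size (illustrative)
fse = 0.15   # 15% relative FSE target (illustrative)

m_all  = required_mass(842, 1.28, d, fse)  # constants with all data kept
m_trim = required_mass(309, 1.15, d, fse)  # constants after 2-sigma trimming
print(f"mass, all data: {m_all:.0f} g; after trimming: {m_trim:.0f} g")
```

Under these assumptions the trimmed calibration roughly halves the apparent mass requirement, which illustrates the concern raised below: removing the high results may simply be hiding the real heterogeneity rather than correcting the test.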

I think the purpose is different: you are not estimating the resource; you are conducting a heterogeneity test. Grade outliers have an impact on those (see papers by Minnitt for some case studies), but from the sounds of it, not that extreme, and the variance he's looking at may well be a "good" and proper reflection of the variance.

In dealing with the assays of gold-bearing samples, you discard assays at your peril. If you have coarse gold (>75 microns) in the ore and your sample mass is relatively small, you are in a prime position to have a highly skewed sample distribution for your results. Everybody LOVES to have Gaussian results, but in gold that is not a thing that happens unless you enjoy a fine gold grain size (volume) distribution in the ore and adequate sample masses. This is not the norm.

So I think you should abandon your objective to throw out results and learn from those that you have (unless of course your assay lab is unreliable).

I would suggest that you look at the ores (you are a geo!) in your domains and decide which ones might be providing the coarse grains and then embark on a campaign to find out just what you are dealing with. I have experience in this matter and tools with which to deal with it.

In a final statement, your assessment of the ore heterogeneity is probably correct (without throwing out data) and you should live with it, rather than trying to find means of reducing that heterogeneity result at your convenience.

Minnitt, in his paper (A comparison between the duplicate series method), emphasizes the "importance to identify and eliminate outliers, samples with biases greater than 5%". I didn't fully understand what that means, and looking at the graph in the paper I couldn't figure it out either.

The samples used in the heterogeneity test each had their entire mass assayed, so I thought the results should be the best I could get. But the gold grain size is coarser than what I actually see in the cores and samples, which directly impacts the required sample mass.

Also, with the same ore I did the sampling tree test, and its result was more plausible in terms of sample mass and gold grain size.

In the interest of learning, a metallurgist asks a "dumb" question:

Aren't outliers simply data points that don't fit the model?

My experience is that outliers are often very useful for identifying weaknesses in the model. They should only be removed from the data set if a technical explanation shows that they do not belong - e.g. near the ore body boundaries, where different conditions exist that are not reflective of the main ore body model.

I would suggest that you treat the outliers with some scrutiny. It is a fallacy to remove outliers because they don't agree with the model - that is what gives statistics a bad name. It is much better to try to understand why they are outliers in the first place.

Please counter as appropriate. Hoping I might learn something.

Have a look at Minnitt's paper from Sampling 2011, where he shows the difference between using Grubbs' Test, an undefined method used by "DFB", and other estimates. The data show that the slope is very sensitive to assumptions about outliers. Unfortunately the results without outlier removal are not shown.

I also think you need to consider that this test is really just a one-shot guideline to sample size and FSE. There are always questions about how representative the sample is of the ore body under consideration and how reliably the sub-sampling for the test was done. I would like to see results from somebody who has run a duplicate test, but I expect these are not available due to the cost of the exercise. I recently reviewed data from one heterogeneity test where splitting the finer fractions did not reduce the variability even though the gold was known to be quite fine - probably a case of less-than-ideal sub-sampling of the finer fractions.

I always feel a lot more comfortable when I have a few hundred duplicates from early drilling on which to base sampling precision judgements. Importantly, these results come from the real-world sampling environment and generally capture the variation due to people and process sampling errors as well as the FSE. Best to do more duplicates to start with, then reduce the frequency once you understand the variability. http://is.gd/pegRFa
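One common way to turn duplicate pairs into a precision figure is the root-mean-square of the pairwise relative differences; this is a sketch of that estimator (other variants exist, e.g. based on absolute relative differences), with invented assay values for illustration:

```python
import numpy as np

# Relative precision from duplicate pairs: for each pair (a, b), take the
# difference relative to the pair mean, then CV = sqrt(mean(rel**2) / 2).
def duplicate_cv_percent(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    mask = (a + b) > 0                     # drop pairs at or below detection
    rel = (a[mask] - b[mask]) / ((a[mask] + b[mask]) / 2)
    return 100 * np.sqrt(np.mean(rel**2) / 2)

orig = [1.2, 0.8, 3.5, 0.4, 2.1, 5.0]      # hypothetical original assays, g/t
dup  = [1.1, 0.9, 3.1, 0.5, 2.3, 4.4]      # their field duplicates
print(f"pairwise CV = {duplicate_cv_percent(orig, dup):.1f}%")
```

Because the duplicates are taken through the whole sampling chain, this CV bundles FSE together with the people-and-process errors the post mentions, which is exactly why it is a more realistic precision figure than the heterogeneity test alone.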

“Clean data” is a great temptation. It pleases managers, accountants, and tends to enhance the status, respect and confidence awarded to the presenter. Keep an outlier, identify a valid (reproducible) cause for a removed outlier, or clearly report the existence of removed outlier within any analysis or report. Removing an outlier without watchful follow-up or identification of a cause can be a slippery slope. If unusual conditions are not routinely investigated and linked to causes that have a degree of reproducible confidence, there is a strong potential of hidden "weaknesses in the model".

Used with caution, statistics are a practical tool for making sense of complexity and defining the reliability of that “sense.” Too much “cleanup” of data risks confirming the declaration that there are "Lies, damned lies, and statistics".

Time involved in explaining an outlier often pays greater dividends than iterative complex statistics. In the context of sub-sampling, if an outlier cannot be explained, a duplicate of that specific test should be performed rather than simply ignoring it.

Recently I did the heterogeneity test, following Pitard's procedure, and the results showed a need to sample large masses in order to lower my FSE. I have not identified nor removed any outliers, because the grades returned by the test were lower than our capping grade. Is there any relation between capping grade and outliers? Or should I just look at the mean grade and remove whatever lies more than 2 standard deviations above or below it? Our capping grade was defined as the grade below which 95% of our samples fall; it is also where the probability plot shows a gap.