Comparison of methods to assess the accuracy of the incorporation of censored chemical data in descriptive statistical analysis of contaminated groundwater

Published
2023-01-29
Keywords: Censored data, Hydrochemistry, Detection limit, Descriptive statistics.

    Authors

  • Vinicius Rodrigues dos Santos, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, MG, Brasil.
  • Luis de Almeida Prado Bacellar, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, MG, Brasil. https://orcid.org/0000-0003-1670-9471
  • Cícero Antônio Antunes Catapreta, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, MG, Brasil.

Abstract

Chemical analyses of groundwater often yield data sets with censored values, i.e., values below the limit of detection (LOD). When the proportion of censored values is significant, descriptive statistics (mean, median and standard deviation) and exploratory geochemical analysis may be impaired. Ignoring such data or replacing them with a predetermined value is not always advisable. This research therefore investigates the applicability of four methods for estimating censored chemical data from an area with contaminated groundwater. Three statistical methods were used: parametric (Maximum Likelihood Estimation, MLE), non-parametric (Kaplan-Meier, KM) and robust (Regression on Order Statistics, ROS), in addition to the traditional method of direct substitution of censored data with LOD/2. MLE assuming a Gaussian distribution of the data (MLE-no) yielded acceptable substitution factors, close to 0.5, similar to the traditional substitution method (LOD/2). Validation on complete datasets, using the same estimation methods and three artificial LODs, confirmed the good results of MLE-no and ROS with 25% and 50% of censored data, respectively, as well as of LOD/2. The first two methods are preferable to LOD/2 because they are statistically based. Future studies should combine such estimation methods with other geostatistical treatments to improve the spatial analysis of hydrochemical datasets.
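To make the comparison between LOD/2 substitution and the Gaussian MLE more concrete, the following minimal sketch estimates the mean and standard deviation of a left-censored sample with a single LOD. It is not the authors' implementation: the EM iteration used to reach the maximum likelihood estimate, the synthetic data, all function names, and the definition of the "implied substitution factor" (mean of the censored tail divided by the LOD) are assumptions made for illustration only.

```python
import random
from statistics import NormalDist, fmean, pstdev

STD_NORMAL = NormalDist()


def substitute_half_lod(detected, n_censored, lod):
    """Traditional approach: replace every censored value with LOD/2."""
    return list(detected) + [lod / 2.0] * n_censored


def censored_tail_moments(mu, sigma, lod):
    """Mean and variance of a normal(mu, sigma) truncated to values below lod."""
    alpha = (lod - mu) / sigma
    lam = STD_NORMAL.pdf(alpha) / STD_NORMAL.cdf(alpha)
    tail_mean = mu - sigma * lam
    tail_var = sigma ** 2 * (1.0 - alpha * lam - lam ** 2)
    return tail_mean, tail_var


def mle_normal_left_censored(detected, n_censored, lod, tol=1e-9, max_iter=10_000):
    """Gaussian MLE of (mean, sd) for data left-censored at one LOD, via EM."""
    n = len(detected) + n_censored
    # Start from the LOD/2-substituted sample (an arbitrary but convenient choice).
    start = substitute_half_lod(detected, n_censored, lod)
    mu, sigma = fmean(start), pstdev(start)
    for _ in range(max_iter):
        # E-step: expected moments of the unseen censored values.
        tail_mean, tail_var = censored_tail_moments(mu, sigma, lod)
        # M-step: update the parameters from the completed-data moments.
        new_mu = (sum(detected) + n_censored * tail_mean) / n
        ss = sum((x - new_mu) ** 2 for x in detected)
        ss += n_censored * (tail_var + (tail_mean - new_mu) ** 2)
        new_sigma = (ss / n) ** 0.5
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


def implied_substitution_factor(mu, sigma, lod):
    """Assumed definition: mean of the censored tail expressed as a fraction of LOD."""
    tail_mean, _ = censored_tail_moments(mu, sigma, lod)
    return tail_mean / lod


# Synthetic, hypothetical data: about 25% of the values fall below the artificial LOD.
random.seed(42)
true_mu, true_sigma, lod = 1.0, 0.5, 0.66
sample = [random.gauss(true_mu, true_sigma) for _ in range(2000)]
detected = [x for x in sample if x >= lod]
n_censored = len(sample) - len(detected)

mean_half_lod = fmean(substitute_half_lod(detected, n_censored, lod))
mu_hat, sigma_hat = mle_normal_left_censored(detected, n_censored, lod)
factor = implied_substitution_factor(mu_hat, sigma_hat, lod)

print(f"censored fraction: {n_censored / len(sample):.2f}")
print(f"LOD/2 mean: {mean_half_lod:.3f}")
print(f"MLE mean: {mu_hat:.3f}, MLE sd: {sigma_hat:.3f}")
print(f"implied substitution factor: {factor:.2f}")
```

In this synthetic setup the implied factor lands near 0.5, so both approaches give similar means. With other distributions or censoring levels the two can diverge, and a Gaussian model can place probability on negative concentrations, which is one reason the distributional assumption should be checked before relying on it.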

How to Cite
Santos, V. R. dos, Bacellar, L. de A. P., & Catapreta, C. A. A. (2023). Comparison of methods to assess the accuracy of the incorporation of censored chemical data in descriptive statistical analysis of contaminated groundwater. Águas Subterrâneas, 37(1), e–30104. https://doi.org/10.14295/ras.v37i1.30104