Journal of the Chilean Chemical Society

On-line version ISSN 0717-9707

J. Chil. Chem. Soc. v.54 n.1 Concepción  2009 

J. Chil. Chem. Soc., 54, No. 1 (2009); pp. 93-98





a1 Laboratorio de Química Analítica y Ambiental, Instituto de Química, P. Universidad Católica de Valparaíso, P.O. Box 4059, Valparaíso, Chile. *e-mail:
a SQM Industrial, Aníbal Pinto 3223, Antofagasta, Chile,
Departamento de Química Analítica, Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario e Instituto de Química Rosario (CONICET), Suipacha 531, Rosario (2000) Argentina.


Multivariate calibration of UV-visible spectral data using partial least-squares (PLS) has been applied to the determination of the nitrate content in Chilean Caliche samples, in the concentration range from 1 to 20% NaNO3. The multivariate approach is required because the samples also contain unknown interferences which are spectrally active in the useful wavelength region for nitrate quantitation (near 301 nm). A set of fifteen calibration samples was employed to build the multivariate model, selected using the Kennard-Stone methodology, starting from real Caliche samples whose nitrate content was previously determined using the reference Devarda method. The figures of merit of the multivariate model were satisfactory (the limits of detection and quantitation reached 0.04% and 0.12% NaNO3, respectively, with an average prediction error of 0.3% NaNO3). The PLS model was then applied to a set of independent Caliche samples. The results were compared with a univariate UV approach, and with the nitrate content determined by the reference method, using the linear regression of predicted vs. reference concentration values, together with the elliptical joint confidence region test for the slope and intercept of the latter regression. The results indicate that the univariate method is unsuitable for analyzing the presently studied samples, unlike the multivariate model. Finally, the proposed analytical methodology appears to be a reliable and inexpensive alternative for the routine analysis of a large number of Caliche samples.

Keywords: Nitrate determination; UV-visible spectrophotometry; Multivariate calibration; Partial least-squares


The north of Chile has the largest Caliche mineral deposits known in the world and the only source of commercially exploited natural nitrates on the planet. Caliche is the name of the raw material used for the production of fertilizers based on sodium nitrate. During the production process, Caliche ore is ground mechanically and transferred to a vat leaching plant, where its nitrate content is extracted and further processed to be crystallized as sodium or potassium nitrate. For this reason, a fast and accurate method to quantitate the nitrate content is fundamental to evaluate the quality of the raw material and to control the leaching process.

There are several experimental approaches to evaluate the nitrogen content in different kinds of samples [1], but the Kjeldahl method remains the reference method. It is both precise and accurate, but it is very time consuming, which is a disadvantage for industrial applications where the analytical data are required to regulate a manufacturing process. Other faster methods have been proposed, such as the Dumas method [2] for total nitrogen, or the Devarda method [3] for nitrate determination, which is based on the reduction of nitrate to ammonia in an alkaline medium by Devarda's alloy (45% Al, 50% Cu and 5% Zn), distillation of the ammonia and titration with standard acid. The latter two are recognized as reference methods applicable to different nitrogen matrices, and show shorter analysis times in comparison with the Kjeldahl method. However, they require an average time of ca. 5 minutes for their application, and hence they are impracticable in our case, where the analysis of several hundred samples each day is needed.

In this context, the UV determination of nitrate extracted from Caliche appears as a rapid and inexpensive alternative. The nitrate ion presents two UV absorption bands in water: a broad and very intense band at 220 nm, and a less intense one at 301 nm. However, several other inorganic anions absorb UV radiation in both of these spectral regions, and may therefore produce an overestimation of the nitrate content when the univariate calibration approach is employed.

Multivariate calibration methods applied to absorptive spectral data are being increasingly used for the analysis of complex mixtures [4,5]. Several chemometric tools have been reported in the literature for spectroscopic data [6], although the most popular are principal component regression (PCR) [7], partial least-squares regression (PLS) [8] and the recently introduced variants of hybrid linear analysis (HLA) [9-15], which are based on the net analyte signal (NAS) concept [16,17]. All these techniques have the advantage of using full spectral information: they allow for a rapid determination of mixture components, often with no prior separation, and the calibration can be performed ignoring the concentrations of all components except the analyte of interest in complex samples. This makes these methods especially appealing for the nitrate determination in Caliche, whose matrix components may show absorption spectra which are severely overlapped with those of the analyte. Some multivariate methods have been used to evaluate the nitrate concentration in filtered and unfiltered water samples from a water treatment plant by classical least-squares (CLS) [18] and PLS [19] with satisfactory results.

In this paper, we report the determination of nitrate in Caliche samples using UV-visible absorption spectroscopic data and PLS multivariate calibration. The latter methodology is expected to provide a predictive model for the quantitation of nitrate in samples showing severe spectral overlap with other sample components, in contrast to the classical univariate UV calibration at a single wavelength.

After a careful selection of samples assisted by a chemometric methodology, the PLS multivariate approach was applied to a set of real Caliche samples, and the predicted nitrate content was compared with the reference Devarda method. It should be noted that among the analyzed Caliche samples, the nitrate content varied between 1 and 20%. While the usual univariate method employs two separate calibrations, one for the lower concentration range and another one for the higher range, the multivariate approach provides good prediction results at both extremes of this rather wide concentration range using a single calibration set of measurements. Furthermore, the total time of analysis was less than 1 minute per sample, indicating that the developed method is well suited to routine use in an analytical laboratory.



UV-visible absorption spectra were obtained with a Unicam UV-VIS spectrophotometer Model 530 (Del Carpió Analysis, Santiago, Chile), using 1.00 cm quartz cells. For all samples, the spectra were recorded between 190 and 500 nm at 1 nm intervals (311 data points per spectrum), saved in ASCII format and transferred to a Pentium 4 PC for subsequent manipulation.

The reference procedure for nitrate determination in Caliche samples was adapted from the AOAC Official Method 892.01 for nitrogen determination in fertilizers. These analyses were carried out on a Gerhardt distillation system model Vapodest 6 (Gerhardt, Germany) with automated analysis of 12 samples. The nitrogen concentrations are reported as percentage of sodium nitrate (% NaNO3).


All experiments were performed with analytical-reagent grade chemicals purchased from Merck S.A. (Santiago, Chile). The solutions of boric acid and sodium hydroxide, both 0.1 mol L⁻¹, were prepared on a daily basis for nitrate quantitation.

Univariate UV method

Standards for the univariate method are prepared by dissolving sodium nitrate in Milli-Q water. Two separate calibrations are routinely performed, one for low nitrate concentrations and one for high nitrate concentrations. The first calibration is done using three triplicate standards in the range 0.20-1.00 g L⁻¹, with the following results: slope = 0.0910(8), intercept = -0.0023(5) and R² = 0.9994 (standard deviations in the last significant figure in parentheses). The second calibration is performed using five triplicate standards in the range 2.00-10.00 g L⁻¹, leading to slope = 0.0845(4), intercept = 0.0058(3) and R² = 0.9997.
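As an illustration of how the two calibration lines quoted above might be used, the following minimal sketch inverts them to convert an absorbance reading into a nitrate concentration in g L⁻¹. The function names and the rule for choosing between the two lines are hypothetical; the text does not state how the working range is chosen in routine practice.

```python
# Calibration lines quoted in the text: absorbance = slope * c + intercept,
# with c in g/L of NaNO3. Names and the selection rule below are illustrative.
LOW = {"range": (0.20, 1.00), "slope": 0.0910, "intercept": -0.0023}
HIGH = {"range": (2.00, 10.00), "slope": 0.0845, "intercept": 0.0058}


def concentration(absorbance, cal):
    """Invert one calibration line to obtain the concentration in g/L."""
    return (absorbance - cal["intercept"]) / cal["slope"]


def predict(absorbance):
    """Assumed rule: use the low-range line first and fall back to the
    high-range line when the result exceeds the top low-range standard.
    Results between 1.00 and 2.00 g/L lie outside both calibrated ranges."""
    c = concentration(absorbance, LOW)
    if c <= LOW["range"][1]:
        return c, "low"
    return concentration(absorbance, HIGH), "high"


if __name__ == "__main__":
    for a in (0.0432, 0.4283):
        c, used = predict(a)
        print(f"A = {a:.4f} -> {c:.2f} g/L NaNO3 ({used}-range line)")
```

Because a single wavelength is used, any interferent absorbing at that wavelength adds directly to the numerator and inflates the result.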

Calibration and test sets for PLS

The calibration set was composed of fifteen Caliche samples, with nitrate contents between 1 and 20% (expressed as % NaNO3), as previously evaluated by the Devarda method. They were selected from a set of 70 Caliche samples, collected in the period September 2006-February 2007 from the SQM facilities located in the Atacama Desert in Chile. The material was ground (< 100 µm), homogenized and stored in polyethylene bags until analysis. The sample selection was based on the Kennard-Stone methodology [20] applied to the UV spectra in the relevant region 250-350 nm (see below for details on its implementation). This chemometric procedure employs the distance between samples as a selection criterion. The test set was composed of the remaining 55 samples, which were therefore different from those employed for calibration.

For the UV-visible spectral measurements, 2.00 g of a ground Caliche sample were placed in a 100 mL Erlenmeyer flask with 50 mL of HPLC-grade water. The slurry was mechanically shaken for five minutes and filtered by gravity through a Whatman 42 filter. The filtered solution was directly analyzed by UV-visible spectrophotometry as described below.

Theory and software

PLS is a well-known multivariate calibration methodology, and details on its implementation are easily available [21]. The method involves a two-step procedure: 1) calibration, where the relation between spectra and reference component concentrations is established from a set of standard samples, and 2) prediction, in which the calibration results are employed to estimate the component concentrations in unknown samples [21]. In the PLS-1 version, all model parameters are optimized for the determination of one analyte at a time. During the model training step, the calibration data are decomposed by an iterative algorithm, which correlates the data with the calibration concentrations using a so-called 'inverse' model [21]. This provides a set of loadings (P, size J×A, where J is the number of wavelengths and A the number of latent PLS variables), weight-loadings (W, size J×A) and regression coefficients to be applied to a new sample (v, size A×1). Given the spectrum of an unknown sample x_u (size J×1), the latter is projected onto the space of the loadings and weight-loadings to provide the test sample scores t_u:
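The displayed equation is missing from this copy of the text. Assuming the usual PLS-1 projection, which is dimensionally consistent with the sizes of W, P and the sample spectrum given above, equation (1) would read:

```latex
\mathbf{t}_u = \left(\mathbf{W}^{\mathrm{T}}\mathbf{P}\right)^{-1}\mathbf{W}^{\mathrm{T}}\mathbf{x}_u \qquad (1)
```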

The sample scores are then multiplied by the regression coefficients to estimate the analyte concentration y:
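This equation is also absent from the extracted text. Given that the scores have size A×1 and the regression coefficients v have size A×1, the product described in the sentence above is presumably the inner product:

```latex
y = \mathbf{t}_u^{\mathrm{T}}\,\mathbf{v} \qquad (2)
```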

Before calibration, it is usual to assess the optimum number of latent variables in order to avoid overfitting, by applying the well-known cross-validation method described by Haaland and Thomas [21] (see below).
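The calibration and prediction steps described above can be sketched in a few lines of plain Python. This is only an illustrative NIPALS-type PLS-1 under stated assumptions, not the MVC1 implementation used in this work: the data are mean-centred (the text does not say whether centring was applied), the function names are hypothetical, and the test-sample scores are obtained by sequential deflation, which gives the same scores as projecting the spectrum with (WᵀP)⁻¹Wᵀ.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_factors):
    """PLS-1 calibration on mean-centred data (centring is an assumption).
    X: list of spectra (n samples x J wavelengths); y: list of concentrations.
    Returns weight-loadings W, loadings P and regression coefficients v."""
    n, J = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(J)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(J)] for row in X]  # spectral residuals
    f = [yi - y_mean for yi in y]                              # concentration residuals
    W, P, v = [], [], []
    for _ in range(n_factors):
        # Weight vector: direction of maximum covariance with concentration
        w = [dot([E[i][j] for i in range(n)], f) for j in range(J)]
        norm = dot(w, w) ** 0.5
        w = [wj / norm for wj in w]
        t = [dot(E[i], w) for i in range(n)]                   # calibration scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(J)]
        q = dot(f, t) / tt                                     # inverse-model coefficient
        # Deflate spectra and concentrations before the next factor
        E = [[E[i][j] - t[i] * p[j] for j in range(J)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(p)
        v.append(q)
    return {"W": W, "P": P, "v": v, "x_mean": x_mean, "y_mean": y_mean}


def pls1_predict(model, x_u):
    """Project an unknown spectrum onto the model and return the concentration."""
    e = [xj - mj for xj, mj in zip(x_u, model["x_mean"])]
    t_u = []
    for w, p in zip(model["W"], model["P"]):
        t = dot(e, w)                                          # score for this factor
        t_u.append(t)
        e = [ej - t * pj for ej, pj in zip(e, p)]              # remove modelled part
    return model["y_mean"] + dot(t_u, model["v"])


if __name__ == "__main__":
    # Synthetic, noise-free example: one analyte band plus one overlapping interferent
    analyte = [0.0, 0.2, 1.0, 0.3, 0.0]
    interferent = [0.5, 1.0, 0.6, 0.1, 0.0]
    mix = lambda ca, ci: [ca * a + ci * b for a, b in zip(analyte, interferent)]
    cal = [(1.0, 0.0), (5.0, 1.0), (10.0, 0.5), (15.0, 2.0), (20.0, 0.0)]
    model = pls1_fit([mix(*c) for c in cal], [c[0] for c in cal], n_factors=2)
    print(round(pls1_predict(model, mix(8.0, 1.5)), 6))
```

Note that only the analyte concentrations enter the calibration; the interferent levels are never supplied, which is the property that makes the inverse model attractive for samples of unknown composition.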

The figures of merit were evaluated in accordance with IUPAC's recommendations [22]. The sensitivity for a given analyte k has been calculated as:
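The equation image is not present in this copy. Assuming the standard definition of multivariate sensitivity as the inverse of the norm of the regression vector, which matches the symbols defined immediately afterwards, equation (3) would be:

```latex
\mathrm{SEN}_k = \frac{1}{\lVert \mathbf{b}_k \rVert} \qquad (3)
```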

where ||·|| indicates the Euclidean norm and b_k is the vector of regression coefficients provided by the PLS-1 model. The selectivity indicates the part of the total signal that is not lost due to spectral overlap, and can be defined in the multivariate context by resorting to NAS calculations:

SEL= ||Net analyte signal in unknown sample|| / ||Total signal of unknown sample|| (4)

where the net analyte signal is computed by projecting the spectrum for the unknown sample orthogonal to the space spanned by all other components except the analyte of interest [23].

Finally, the limits of detection (LOD) and quantitation (LOQ) were evaluated as:
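Equations (5) and (6) did not survive extraction. A plausible reconstruction, assuming the conventional multipliers of 3.3 and 10 (an assumption, although it is consistent with the reported LOD of 0.04% and LOQ of 0.12% NaNO3, whose ratio is close to 3), is:

```latex
\mathrm{LOD} = 3.3\, s_0 \qquad (5)
\qquad\qquad
\mathrm{LOQ} = 10\, s_0 \qquad (6)
```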

where s_0 is the standard error in concentration for samples of low or near-zero analyte concentration, and is given by:
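Equation (7) is likewise missing. One error-propagation expression commonly used in the multivariate calibration literature, written with the same four quantities that the text defines next, is sketched below; whether it matches the lost equation term for term is an assumption.

```latex
s_0 = \sqrt{\,h\, s_c^{2} \;+\; h\, \frac{s_r^{2}}{\mathrm{SEN}^{2}} \;+\; \frac{s_r^{2}}{\mathrm{SEN}^{2}}\,} \qquad (7)
```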

where h is the sample leverage, a parameter placing the unknown sample relative to the calibration space, s_c is the standard error in calibration concentrations, s_r is the instrumental noise level and SEN the sensitivity.

The PLS-1 algorithm was applied using the MVC1 toolbox [24] written for MATLAB [25], because these routines allow one to evaluate the figures of merit based on the NAS theory [17,22].

It should be noted that UV-visible spectra were measured in a rather wide spectral range, as can be appreciated in Figure 1a, where regions with both very high and very low absorbance are apparent. It is known that multivariate models may benefit from suitable wavelength selection, which avoids noisy or high-absorption regions, as well as heavily overlapped spectral zones. Although a useful spectral region can be visually identified between 250 and 350 nm, fine tuning is possible with the tools provided by MVC1. A moving-window strategy implemented in this toolbox allows one to perform leave-one-out cross-validation in a set of spectral regions, each defined by a given first wavelength and spectral window. A comprehensive search for the optimum cross-validation variance is performed, as a function of first wavelength and window width. This permits the location of a suitable working region where calibration results are optimal. The MVC1 toolbox is freely available on the internet [26].

The Kennard-Stone algorithm was applied using an in-house MATLAB routine.



The nitrate ion presents two UV absorption bands in water: a broad and very intense band at 220 nm, and a less intense one at 301 nm. Figure 1a shows the spectra for the entire set of 70 Caliche samples, where nitrate signals are identified. From the inspection of this Figure and the zoomed window (Figure 1b, between 250 and 350 nm), it is apparent that some samples present a strong overlap between the band ascribed to the nitrate ion and at least one spectral interference, whose absorption maximum is near 294 nm. According to our knowledge of the samples, this spectral interference appears to correspond to either sulfur- or iodine-containing species. Principal component analysis (PCA) of these UV spectra (in the 250-350 nm region) shows that two components explain ca. 94% of the spectral variability across this set of samples. The corresponding PCA loadings and scores are presented in Figures 2a and 2b respectively. As can be seen, most samples are grouped along the dashed line which is shown in the score-score plot of Figure 2b. For these particular samples, the spectral interference is minimal, and the univariate UV method provides nitrate concentrations which are comparable to those obtained by the reference Devarda method.

A group of samples, however, appears outside this region, some of them presenting significant contributions from the first PC score. On the other hand, the first PC loading shows a maximum at 294 nm (Figure 2a), a wavelength corresponding to the spectral interference described above. For these samples, the univariate method yields nitrate concentrations which are significantly higher than those obtained by the reference methodology.

Sample selection for PLS analysis

When a multivariate model is required for the prediction of the analyte concentration in samples containing other UV-responsive components, all possible sources of variation which can be found in test specimens must be included in the calibration set. A procedure for selecting samples more or less equally distributed over the calibration space is thus needed. In cases where only real samples are available, as in the presently discussed analytical problem, an experimental design is not possible. When many samples and their corresponding spectra are available, a representative set covering the calibration signal space can be built. Several approaches are available for selecting representative calibration samples. The simplest is random selection, but it is open to the possibility that some source of variation will be lost. Kennard and Stone proposed a sequential method uniformly covering the experimental region [20]. The procedure consists of selecting, as the next sample to be included for calibration, the one which is most distant from those already selected. The distance is usually the Euclidean distance in the signal space. As starting points, either the two objects which are most distant from each other can be selected, or the one closest to the mean (we have employed the first alternative). The distance from each of the remaining samples to each of the already selected samples is then measured, and the smallest of these distances is recorded for each remaining sample. The remaining sample for which this smallest distance is maximal is selected and added to the set. The calibration space is filled by successively applying the above scheme. Kennard and Stone called their procedure a uniform mapping algorithm [20].
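The selection scheme just described can be written compactly. The sketch below is a minimal illustration in plain Python, not the in-house MATLAB routine used in this work; the function name is hypothetical, and it starts from the two mutually most distant samples, as the authors did.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def kennard_stone(spectra, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone
    max-min distance rule. spectra: list of equal-length lists.
    At least the two starting samples are always returned."""
    n = len(spectra)
    # Starting points: the pair of samples that are farthest apart
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist(spectra[ij[0]], spectra[ij[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select and remaining:
        # For each candidate, find the distance to its nearest selected sample,
        # then pick the candidate for which that nearest distance is largest
        nxt = max(
            remaining,
            key=lambda i: min(dist(spectra[i], spectra[s]) for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


if __name__ == "__main__":
    # Five one-point "spectra" make the selection order easy to follow
    points = [[0.0], [1.0], [2.0], [10.0], [4.5]]
    print(kennard_stone(points, 4))
```

The pairwise search for the starting points costs O(n²) distance evaluations, which is negligible for a pool of 70 spectra.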

Samples were selected from the available set of 70 Caliche specimens following this procedure: for a given number of calibration samples, a PLS model was built in the relevant region 250-350 nm and prediction was made on the remaining samples. The root mean square error (RMSE) was computed for an increasing number of calibration samples. Figure 3 shows how the RMSE varies with the number of samples selected by the Kennard-Stone methodology. As can be seen, fifteen samples provide reasonably good prediction results on the remaining 55 samples. Moreover, it was confirmed that ca. half of the samples having spectral interferences were included in the calibration set, with the remaining ones left for prediction in the test set.

Univariate UV analysis

In the presence of interferences it is difficult to correlate absorbance values at a single wavelength (i.e., at 301 nm) to the concentration of nitrate, because an overestimation of nitrate concentration will be observed for these complex samples. For the sake of comparison with the multivariate approach described below, the set of 55 test samples was analyzed using the univariate UV methodology, and the results were compared with those provided by the reference Devarda method. Figure 4a shows the predicted vs. reference nitrate concentrations in the test samples, where clear outliers are visually detected, corresponding to samples containing the interferences discussed above.

One convenient way of comparing predictions made by two different methodologies when the concentration range is wide is to apply linear regression analysis of the values found by the method under test against those provided by the reference method [27]. The recommended test for the statistical study of the obtained results is the so-called elliptical joint confidence region (EJCR) of the slope and intercept of the regression line [26]. If the ellipse includes the theoretically expected point (slope = 1, intercept = 0), this indicates that the two methodologies are comparable in accuracy. The specific results when the univariate UV method at 301 nm is compared with the reference Devarda method are: slope = 1 ± 1 and intercept = 1 ± 8. This clearly shows that the univariate method is not suitable for the analytical problem at hand, because of the large uncertainties in defining the slope and intercept. Under these conditions, the EJCR is inappropriately large.
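A minimal sketch of the EJCR test is given below, assuming ordinary least-squares regression of predicted on reference values and the usual joint confidence ellipse for intercept and slope. The function names are hypothetical, and the critical F value (2 and n-2 degrees of freedom) is supplied by the caller from tables rather than computed, to keep the example free of external libraries.

```python
def ols(x, y):
    """Ordinary least-squares fit y = a + b*x.
    Returns intercept a, slope b and residual variance s2 (n - 2 dof)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    s2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return a, b, s2


def ejcr_contains_ideal(reference, predicted, f_crit):
    """True if the ideal point (intercept 0, slope 1) lies inside the ellipse
    n*da^2 + 2*sum(x)*da*db + sum(x^2)*db^2 <= 2*s2*F(2, n-2)."""
    n = len(reference)
    a, b, s2 = ols(reference, predicted)
    da, db = 0.0 - a, 1.0 - b
    lhs = (n * da ** 2
           + 2.0 * sum(reference) * da * db
           + sum(xi ** 2 for xi in reference) * db ** 2)
    return lhs <= 2.0 * s2 * f_crit


if __name__ == "__main__":
    ref = [2.0 * k for k in range(1, 11)]                 # 2, 4, ..., 20 (% NaNO3)
    noise = [0.1 if k % 2 == 0 else -0.1 for k in range(10)]
    unbiased = [r + e for r, e in zip(ref, noise)]
    biased = [1.3 * r + 2.0 + e for r, e in zip(ref, noise)]
    F_CRIT = 4.46  # tabulated F(0.05; 2, 8), valid for n = 10 points only
    print(ejcr_contains_ideal(ref, unbiased, F_CRIT))
    print(ejcr_contains_ideal(ref, biased, F_CRIT))
```

A very noisy method can also pass this test simply because a large residual variance inflates the ellipse, which is why the text treats the oversized region of the univariate method as a failure rather than a success.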

PLS Results

In order to build a multivariate calibration model for nitrate quantitation, PLS coupled to leave-one-out cross-validation was first performed to estimate the optimum number of latent variables, using the full spectral information recorded for the calibration samples. This was done by computing the ratios F(A) = PRESS(A < A*)/PRESS(A*) [where PRESS = Σ (y_i,nom − y_i,pred)², y_i,nom and y_i,pred are the nominal and predicted concentrations for each calibration sample, A is a trial number of factors and A* corresponds to the minimum PRESS], and selecting the number of factors leading to a probability of less than 75% that F > 1 [21]. For the full spectral data (190-500 nm), five factors were estimated on the basis of the above criterion.

Before proceeding with PLS calibration and prediction, UV-visible absorption spectra recorded for the calibration samples were restricted to the spectral region suggested by the moving-window strategy described above. A minimum spectral window of 10 nm and a maximum of five factors were employed for conducting the search for the optimal calibration conditions. The results indicated that a suitable working region for the analysis of nitrate is from 251 to 310 nm (see Figure 5), where the cross-validation results led to the conclusion that three PLS factors were needed (see Table 1). The selected spectral region appeared to be reasonable in view of the known spectral properties of the nitrate ion (Figure 1) and is close to the one employed for PCA on the complete set of samples (see above). Likewise, the reduction in the number of components (from five to three) stems from the selection of a restricted region where the number of contributing components is smaller than the number responding in the full spectral width.

Using three latent variables in the spectral region discussed above, PLS was then calibrated and applied to the set of test samples. The predicted concentrations, compared with the values obtained by the reference method, are displayed in Figure 4b. For assessing the predictive ability of the method, the root mean square error (RMSE) and relative error of prediction (REP%) were estimated as 0.35% NaNO3 and 3.7% respectively, with an R² value close to 1 (Table 1). This provides reasonable statistical indicators for the PLS predictions. It should be noted that the RMSE quoted in Table 1 carries the uncertainty of the Devarda method, which is propagated to the PLS predicted values. Since the reference technique shows an average error of ca. 0.2% NaNO3 (expressed as absolute nitrate content), the PLS RMSE value can be corrected as [28]:
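The correction formula itself is missing from this copy. Assuming the usual subtraction of the reference-method variance from the apparent prediction variance, equation (8) and the corresponding arithmetic with the values quoted above would be:

```latex
\mathrm{RMSE}_{\mathrm{corr}} = \sqrt{\mathrm{RMSE}^{2} - s_{\mathrm{ref}}^{2}} \qquad (8)

\mathrm{RMSE}_{\mathrm{corr}} = \sqrt{0.35^{2} - 0.2^{2}} = \sqrt{0.1225 - 0.04} \approx 0.29\ \%\ \mathrm{NaNO_3}
```

Here s_ref denotes the average error of the reference method (a symbol introduced only for this sketch).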

Equation (8) yields a corrected average error of 0.3% NaNO3 for the PLS method, comparable to that for the reference methodology.

Further insight into the comparison of PLS with the reference Devarda method is gained by linearly regressing the predicted nitrate contents vs. those determined by the reference method. The linear fit yields: slope = 1.02 ± 0.05, intercept = -0.09 ± 0.4. Not only do the individual confidence regions contain the ideal values (1 and 0 respectively), but the elliptical joint confidence region also includes the theoretically expected point (slope = 1, intercept = 0), indicating that the proposed methodology is comparable in accuracy with the Devarda method (Figure 6).

Finally, the multivariate figures of merit for the PLS method, which are useful for method selection, are reported in Table 1 as estimated by the MVC1 toolbox. As can be seen, they are appropriate for the determination of nitrate in the working range of nitrate concentrations, since the limits of detection and quantitation are well below the minimum tested concentration in the studied Caliche samples (1.7% NaNO3).


Multivariate calibration of UV-visible spectra for a set of Chilean Caliche samples allows one to develop a fast, reliable and precise method for nitrate determination in the presence of interferences. The results obtained with this method are comparable to the reference Devarda method, although it is considerably simpler and faster, and therefore highly suitable for the routine analysis of a large number of samples. In contrast, the univariate approach has been shown to be unsuitable because of the presence of spectrally active interferents at its working wavelength.


SQM is gratefully acknowledged for financial support. A. Olivieri thanks Universidad Nacional de Rosario, CONICET (Consejo Nacional de Investigaciones Científicas y Técnicas, Project No. PIP 5303) and ANPCyT (Agencia Nacional de Promoción Científica y Tecnológica, Project No. PICT04-25825) for financial support.



1. Official Methods of Analysis of AOAC International, 17th edition, Current through Revision No. 2, 2003.

2. D. Tate. J. AOAC Int. 11, 829, (1994).

3. AOAC Official Method 892.01. Nitrogen (Ammoniacal and Nitrate) in Fertilizers. Devarda Method, Official Methods of Analysis of AOAC International, 17th edition, Current through Revision No. 2, 2003.

4. J. A. Arancibia, G. Martínez-Delfa, C. E. Boschetti, G. M. Escandar, A. C. Olivieri. Anal. Chim. Acta 553, 141, (2005).

5. J. A. Arancibia, A. Rullo, A. C. Olivieri, S. Dionezio, M. Pistonesi, A. Lista, B. S. Fernández Band. Anal. Chim. Acta 512, 157, (2004).

6. H. Martens, T. Naes. Multivariate Calibration. Wiley, Chichester (1989).

7. H. Martens, M. Martens. Multivariate Analysis of Quality - An Introduction. Wiley, Chichester (2000).

8. S. Wold, M. Sjostrom, L. Eriksson. Chemom. Intell. Lab. Syst. 58, 109, (2001).

9. H. C. Goicoechea, A. C. Olivieri. Analyst 124, 725, (1999).

10. H. C. Goicoechea, A. C. Olivieri. Chemom. Intell. Lab. Syst. 56, 73, (2001).

11. H. C. Goicoechea, A. C. Olivieri. Talanta 49, 793, (1999).

12. H. C. Goicoechea, A. C. Olivieri. Anal. Chem. 19, 4361, (1999).

13. H. C. Goicoechea, A. C. Olivieri. Analyst 126, 1105, (2001).

14. C. E. Boschetti, A. C. Olivieri. J. NIR Spectrosc. 9, 245, (2001).

15. A. Muñoz de la Peña, A. Espinosa-Mansilla, M. I. Acedo Valenzuela, H. C. Goicoechea, A. C. Olivieri. Anal. Chim. Acta 463, 75, (2002).

16. A. Lorber. Anal. Chem. 58, 1167, (1986).

17. H. C. Goicoechea, A. C. Olivieri. Trends Anal. Chem. 19, 599, (2000).

18. M. Karlsson, B. Karlberg, R. J. O. Olsson. Anal. Chim. Acta 312, 107, (1995).

19. R. Linker, K. Amit, A. Shaviv, L. Singher, I. Shmulevich. Appl. Spectrosc. 58, 516, (2004).

20. R. W. Kennard, L. A. Stone. Technometrics 11, 137, (1969).

21. D. M. Haaland, E. V. Thomas. Anal. Chem. 60, 1193, (1988).

22. A. C. Olivieri, N. M. Faber, J. Ferré, R. Boque, J. H. Kalivas, H. Mark. Pure Appl. Chem. 78, 633, (2006).

23. H. C. Goicoechea, A. C. Olivieri. Trends Anal. Chem. 19, 599, (2000).

24. A. C. Olivieri, H. C. Goicoechea, F. A. Iñón. Chemom. Intell. Lab. Syst. 73, 189, (2004).

25. MATLAB Version 7.0, The MathWorks, Natick, Massachusetts, USA, 2004.

26.

27. A. G. González, M. A. Herrador, A. G. Asuero. Talanta 48, 729, (1999).

28. K. Faber, B. R. Kowalski. Appl. Spectrosc. 51, 660, (1997).


(Received 21 July 2008 - Accepted 3 November 2008)