Comparison of different statistical approaches to evaluate the orthogonality of chromatographic separations: Application to reverse phase systems
Introduction
The demand for the characterization of complex samples, i.e. samples containing several hundred compounds, is stronger than ever and requires analytical tools able to meet this increasing difficulty. Despite recent progress in column and instrument technology, traditional one-dimensional analytical techniques such as liquid chromatography or gas chromatography have now reached their limits, since they only allow the separation of a hundred compounds in a reasonable time. Increasing the separation capacity is possible, but at the cost of a longer analysis time [1]. Because of their unequalled resolving power, multidimensional separations have received great attention during the past few years for the detailed characterization of complex samples in biology, pharmaceutical analysis, proteomics and metabolomics [2], [3], environmental analysis [4], [5] and the petroleum industry [6]. For volatile compounds, comprehensive gas chromatography [7] is a natural choice; for non-volatile compounds, multidimensional liquid chromatography is the only option despite its lower degree of maturity.
The increase in resolution obtained in bidimensional liquid chromatography (2D-LC) depends on the degree of orthogonality of the coupled systems, i.e. on how much the separation mechanisms they involve differ [8]. Dissimilar separation mechanisms are obtained when the retention of solutes results from different interactions between the solutes, the stationary phase and the mobile phase; the organic modifier and the pH can therefore also have a dramatic effect, as illustrated by studies on column characterization [9], [10], [11]. For instance, in reverse phase liquid chromatography (RPLC), acetonitrile and methanol exhibit significantly different selectivity [12], [13]. Each organic modifier has a different influence on the solute–solvent interaction because of differences in dipole moment, polarizability, and hydrogen bond basicity and acidity. In RPLC, for charged molecules, the pH of the mobile phase also greatly influences retention. The effects of pH and of the organic modifier fraction have been widely studied [14], [15], [16], [17], [18], [19] with a single organic modifier, either methanol [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33] or acetonitrile [34]. Moreover, the choice of the probe solutes also has a striking impact on the evaluation of the orthogonality of couples of chromatographic systems [35], but it has not been studied extensively so far.
Up to now, several approaches have been proposed to evaluate the degree of orthogonality of two chromatographic systems. Liu et al. [36] used the retention times to establish a correlation matrix, from which a peak spreading angle matrix was calculated using a geometric approach to factor analysis. In that paper, the authors defined the orthogonality using the effective area of the 2D separation space covered by the eluting peaks. Slonecker et al. [37], [38] developed criteria for describing the independence of separation modes using information theory, i.e. the informational similarity and the synentropy percentage, to describe the data scatter plot in the 2D separation space. The authors also used additional descriptors, such as the peak spreading angle and the practical peak capacity (NP) introduced earlier by Liu et al. [36]. However, both mathematical approaches have some limitations. First, multiple descriptors are used to define the orthogonality. Second, the proposed methods may not satisfactorily describe the orthogonality in situations where the analytes are not distributed along the diagonal of the 2D separation space but form several distinct clusters that do not intersect the diagonal. More recently, Gilar et al. [39], [40] developed a geometric characterization of data orthogonality. The 2D separation space was divided into rectangular bins, and an orthogonality percentage was defined as the difference between the number of occupied bins and the number of bins on the diagonal, divided by the number of bins expected to be occupied in the case of an ideally orthogonal distribution. In [39], Gilar also showed that using different pH values of the mobile phase was a very powerful method for separating charged solutes in RPLC. The correlation coefficient is a frequently used parameter to evaluate the orthogonality of the two dimensions [35], [41], [42], [43], [44], [45], [46], [47], [48]. Van Gyseghem et al. [45], [47] used Pearson's correlation coefficient to evaluate the orthogonality of eight silica-based stationary phases that were applied in conjunction with four mobile phases at different pH values to determine the impurity profile of a drug. Similarly, Forlay-Frick et al. [49] attempted to compare the three classical coefficients (Pearson's, Spearman's and Kendall's) together with a generalization of the pair-correlation method (PCM) combined with different statistical tests [50]. Recently, another approach was applied to select orthogonal columns for cationic drug solutes by using the Snyder–Dolan (S–D) hydrophobic subtraction method of column classification [51]. The advantage of this model is that a single parameter, called the “column selectivity function, Fs”, can be used to quantitatively compare the overall selectivity of any two columns. This approach assumes that the column behavior is the same whatever the conditions (organic modifier fraction and type, temperature, solvent type, etc.). In [52], this model was also applied to non-ionized solutes.
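As an illustration of the three classical coefficients mentioned above, the following minimal sketch computes Pearson's, Spearman's and Kendall's coefficients for the retention times of the same solutes on two systems. It is not the authors' implementation: the function names and the retention times are hypothetical, and Kendall's coefficient is written in its simplest form (tau-a, without tie correction), which may differ from the variant used in the cited works.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total number of pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return s / (n * (n - 1) / 2)


# Hypothetical retention times (min) of five solutes on two systems.
t1 = [2.1, 4.3, 5.0, 7.8, 9.6]
t2 = [3.0, 2.2, 6.1, 5.5, 8.9]

print(pearson(t1, t2), spearman(t1, t2), kendall(t1, t2))
```

For all three coefficients, a value close to zero indicates weakly related retention on the two systems, whereas an absolute value close to one indicates strongly correlated retention.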
The present paper aims at comparing the criteria we consider most relevant for orthogonality evaluation in RPLC × RPLC, at establishing which one(s) is (are) most appropriate, and at determining quantitatively the factors having the largest influence on orthogonality. To this end, a set of 63 test compounds covering a wide distribution of physico-chemical properties was built in order to probe orthogonality between couples of RP chromatographic systems in a generic gradient mode. This set includes neutral, acidic and basic compounds differing in their pKa values (between 0.6 and 14.0), their molecular mass (between 76.12 and 1485.71 g/mol), their hydrophobicity (log P values evenly distributed from −1.08 to 7.72) and the presence of heteroatoms. To make the testing procedure widely accessible, the test compounds had to be readily available, i.e. inexpensive, non-forensic products with sufficient stability. The retention times of these compounds were measured with every combination of the eight different columns (i.e. stationary phases), the two different organic modifiers (methanol and acetonitrile) and the two different pH values (2.5 and 7.0), i.e. with 8 × 2 × 2 = 32 distinct chromatographic systems. The orthogonality of the 496 system couples was evaluated and ranked with the eight criteria we considered most relevant: the three classical correlation coefficients (Pearson, Spearman and Kendall), two geometric criteria characterizing the coverage of the 2D separation space, Slonecker's information similarity and two chi-square statistics of independence. Since the rankings obtained with the likelihood ratio statistic and with Slonecker's information similarity are shown to be equivalent (see Section 3.2), there are in fact only seven distinct criteria. Each of them was evaluated, using ANOVA, according to its capacity to reveal the influence of the factors acting upon orthogonality. Finally, the most orthogonal chromatographic systems among the ones evaluated are presented.
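The counts quoted above (32 systems, 496 couples) can be checked with a short enumeration. The sketch below is illustrative only: the column labels are placeholders, since the actual stationary phases are not named in this excerpt, and the helper `differences` is a hypothetical way of recording which factors differ within a couple.

```python
from itertools import combinations, product

# Placeholder labels for the eight stationary phases (not the real column names).
columns = [f"column_{i}" for i in range(1, 9)]
modifiers = ["methanol", "acetonitrile"]
ph_values = [2.5, 7.0]

# Every combination of column, organic modifier and pH is one system.
systems = list(product(columns, modifiers, ph_values))

# Every unordered pair of distinct systems is one couple.
couples = list(combinations(systems, 2))


def differences(a, b):
    """Return (column differs, modifier differs, pH differs) for a couple."""
    return tuple(u != v for u, v in zip(a, b))


print(len(systems))  # 8 * 2 * 2 = 32
print(len(couples))  # 32 * 31 / 2 = 496
```

Tagging each couple with `differences` gives the kind of categorical factors (difference in stationary phase, organic modifier or pH) that an ANOVA on the orthogonality scores can then use.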
Instrumentation
Gradient separations were carried out using a Dionex HPLC system (Ultimate™ 3000 Nano HPLC) equipped with a UV detector (Ultimate™ 3000 variable wavelength) operated at 3 detection wavelengths: 220, 230, and 250 nm depending on the solute (rate of data acquisition was 2.5 Hz, time constant was 0.40 s, conventional 2.5 μl cell with 7.5 mm path length), two pumps (Ultimate 3000), a degasser (LPG-3000), a thermostatic automated autosampler (Ultimate™ 3000 series Nano/Cap) and a column oven (Ultimate
Statistical analysis
In statistics, two random variables X and Y are orthogonal if the mathematical expectation of their product is zero, i.e. E(X Y) = 0. Since cov(X,Y) = E(X Y) − E(X) E(Y), orthogonality follows from their independence, or merely from the absence of correlation, provided one of them has zero mean. In 2D-LC, since the mean of the retention times plays no role in the selectivity, what is sought is in fact the maximal independence of the separation mechanisms. Quantifying the so-called orthogonality of a
Orthogonality evaluation
The scores of the 496 couples of chromatographic systems were computed using the seven criteria (twice for those requiring discretization, with 5 × 5 and 10 × 10 grids), and ANOVA was performed. For all criteria, there was never a significant interaction between the difference in pH or in stationary phase and the difference in organic modifier.
Thus, we performed the ANOVA without the two non-significant interactions with the organic modifier difference; hence, we had only five parameters. We did
Conclusion
The orthogonality of 496 couples of systems was evaluated and ranked based on different criteria: the three classical correlation coefficients (Pearson's, Spearman's and Kendall's), two geometric criteria characterizing the coverage of the 2D separation space, Slonecker's information similarity and two χ2 statistics of independence. Kendall's coefficient showed the greatest sensitivity to all factors, and the largest score difference between couples of identical systems, and couples differing
References (71)
- et al., J. Chromatogr. A (2009)
- et al., J. Chromatogr. A (2008)
- et al., J. Chromatogr. A (2007)
- et al., J. Chromatogr. A (2009)
- et al., J. Chromatogr. A (2003)
- et al., J. Chromatogr. A (2004)
- et al., J. Chromatogr. A (2005)
- et al., J. Chromatogr. (1978)
- et al., J. Chromatogr. A (2002)
- et al., Trends Anal. Chem. (1999)