Quartz Technology Ltd
Ten years or so ago the concept of using a number of sensors (an array) simultaneously and analysing the combined response was put forward. Like many good ideas it was borrowed from nature: it is the principle behind olfaction (hence the popularity of the term ‘electronic nose’ in some quarters).
Analysing information from an array of sensors
requires intensive data processing.
The array concept has come of age with the advent of fast and relatively
cheap computers. Array sensor
devices contain a number of different individual sensors that respond to
stimuli. The extent to which each
individual sensor responds will depend upon its affinity for a given analyte.
The data provided by each of the sensors are recorded, and sophisticated information-processing algorithms are used to make inferences about the stimulus (identity, quality, etc.).
If we are trying to sense a solvent, such as a paint thinner, we may be able to get away with a single sensor. However,
if we wish to profile a real, complex mixture such as coffee or a reaction product, the sample will contain hundreds of volatile components, all of which are labile.
In these cases
arrays can provide a degree of information just not possible with a single
sensor. Usually the approach is to
generate a pattern of responses from the array - a chemical fingerprint - and to
either match this pattern to a data library (template matching) or to extract
important features and match these.
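As an illustration of the template matching approach, the minimal sketch below (in Python, with entirely hypothetical fingerprint values) classifies an unknown response by finding the library pattern with the smallest Euclidean distance to it; a real system would typically pre-process the responses and may use more robust distance measures.

```python
import numpy as np

# Hypothetical library of 8-sensor fingerprints for known samples.
library = {
    "fuel_A": np.array([0.9, 0.2, 0.7, 0.1, 0.5, 0.3, 0.8, 0.4]),
    "fuel_B": np.array([0.3, 0.8, 0.2, 0.6, 0.1, 0.9, 0.4, 0.7]),
}

def match_fingerprint(response, library):
    """Return the library entry whose pattern is closest (Euclidean) to the response."""
    return min(library, key=lambda name: np.linalg.norm(response - library[name]))

unknown = np.array([0.85, 0.25, 0.65, 0.15, 0.45, 0.35, 0.75, 0.45])
print(match_fingerprint(unknown, library))  # -> "fuel_A"
```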
The conventional approach to sensing is generally to strive for a single sensor with maximum sensitivity and selectivity. For this philosophy to succeed the interaction between the analyte and the sensor needs to be strong. Generally, the stronger the interaction the less reversible it is, and this can lead to an inherent 'reliability' problem: almost by definition, a 'good' sensor in terms of high specificity will have less than ideal reversibility and hence will be prone to drift and long-term instability.
Patterns in Data
This sample data set (see below) shows the response pattern from an array of 8 sorption sensors to various diesel fuels. This type of fuel is frequently graded by 'sulphur' content, although the exact form of the sulphur will
vary. In the example we have two batches each of three different fuel types and
all are clearly distinguishable. However,
often the variations will be much more subtle and feature extraction methods will need to be applied to reveal the required information from the array data.
A trivial example such as the diesel fuel profiles demonstrates the robustness of the approach; visual examination of the data suggests that any one of three sensors alone could probably carry out the classification. However, without the information from the other sensors it would be difficult to predict which sensors are 'good', especially for a 'new' unknown. Sensor 4 in this example would give confusing results if used alone.
Figure 1.0 Response of an Array of Sensors (1-8) to four pairs of diesel fuels
A number of components make up an array system, and currently there is no consensus on the best overall approach among academics or instrument manufacturers.
To a certain extent the 'tool must suit the job' and different approaches
can benefit different measurement problems.
The purpose of the
following section is to review some of the most important
components of an array system and comment on the general criteria which
are important in developing a useful system.
Every current system will contain the following elements:
· the array of sensors
· a system for delivering the sample to the sensors
· a method of collecting the data
· a routine for extracting the features of interest and displaying them in a useful form.
Probably the most important part of the system is the transduction platform, the sensors themselves.
Within an array we
need to vary either the nature of the interaction or the degree of interaction
in order to generate a response pattern. Parameters which influence these interactions include dielectric constant, polarisability, dipole moment, distance and charge.
It is important to
make the distinction between recognition of chemical vapour
patterns or fingerprints and olfaction.
The human nose combines sensor responses with a pattern recognition
system which is very difficult to emulate.
For instance, it is possible to produce two flavourings which, to human perception, are identical but have quite different chemical constituents. Conversely, small changes in molecular shape or chirality can dramatically change the smell of a sample with very little change in chemical functionality.
A range of different sensors is used by different manufacturers; here the focus is placed upon quartz microbalance sensors, the core technology used in Quartz Technology instruments.
Quartz Microbalance Sensors
Piezoelectric quartz crystals are currently used commercially for frequency control in communications equipment, selective filters in electronic equipment, the measurement of the temperature and dew point of gases, and in very accurate clocks. King (1964) first demonstrated that quartz crystals, sometimes called bulk acoustic wave (BAW) devices or thickness shear mode (TSM) devices, could be used as sorption detectors (mass-sensitive devices) by coating the crystals with liquid GC stationary phases; subsequently they have been used in a wide range of chemical sensor applications, which have been reviewed by McCallum (1983, 1989).
Manufacture of Quartz Crystal Microbalances
The properties of a quartz crystal microbalance (QCM) depend on the plane in which it is cut. The AT cut, which has an orientation of approximately 35° to the z-axis, is normally employed because it has a particularly low temperature coefficient. In the manufacture of the sensor a thin wafer about one tenth of a millimetre thick is cut from the quartz and polished, and electrodes are then deposited on the opposite faces of the quartz by evaporation or sputtering. In commercial devices aluminium is often used; however, for sensor applications and higher-tolerance devices gold and silver are preferred. The frequency of oscillation of the crystal is typically 10 to 30 MHz. For this work 10 MHz crystals were employed, which had typical dimensions of 10.2 mm by 10.2 mm and gold electrodes. A quartz crystal is shown in Figure 1.1.
Figure 1.1 Schematic of a quartz crystal microbalance in an HC49 package.
Quartz Crystal Microbalance Operating Mechanism
If a quartz crystal oscillator is coated with a material such as a gas chromatographic stationary phase, the resonance frequency decreases by an amount quantified by the Sauerbrey equation, provided the acoustic impedance of the coating material does not change and is similar to that of quartz:
ΔF = -2.3 × 10⁶ F₀² Δm / A

where Δm is the change in mass of the crystal (g), A is the gas-sensitive area (cm²), ΔF is the resulting frequency change (Hz), and F₀ is the initial frequency of the quartz crystal (MHz). The graph in Figure 1.2 shows the frequency decrease on application of a coating and a subsequent decrease on exposure to a vapour. On complete desorption of the vapour the frequency returns to the coated frequency.
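The worked example below simply evaluates the Sauerbrey equation as quoted above, with the stated units (F₀ in MHz, Δm in g, A in cm², ΔF in Hz); the mass and area values are illustrative rather than taken from any particular device.

```python
def sauerbrey_shift(f0_mhz, delta_m_g, area_cm2):
    """Frequency shift (Hz) predicted by the Sauerbrey equation,
    using the constant and units quoted in the text."""
    return -2.3e6 * f0_mhz**2 * delta_m_g / area_cm2

# Illustrative values: a 10 MHz crystal, 1 microgram of added mass,
# and a gas-sensitive area of 0.2 cm2.
print(sauerbrey_shift(10.0, 1e-6, 0.2))  # about -1150 Hz
```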
Figure 1.2 Behaviour of a quartz crystal microbalance during the coating and vapour detection process.
Quartz crystal microbalances (QCMs) are essentially partition sensors which simply weigh the gas partitioning into the sensor coating. A wide range of coatings, often gas chromatographic stationary phases, are available whose sorbent properties are well characterised and stabilities known. Gases and vapours will partition into the sensing layer or coating in a reproducible fashion described by the partition coefficient K which is equal to the concentration of the vapour in the coating (CC) divided by the concentration of vapour in the gas phase (CG).
K = CC / CG
where K describes the coating/gas partitioning ratio. A schematic of the partition mechanism is shown in Figure 1.3.
Figure 1.3 Partitioning of vapour or gas into a gas chromatographic stationary phase coated on a QCM device (not to scale) (KC,G is the coating/gas partition coefficient, CC is the concentration of the analyte in the sensor coating and CG is the concentration of the analyte in the gas phase).
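To make the partition mechanism concrete, the sketch below (with illustrative numbers, not measured values) converts a gas-phase concentration into the mass taken up by the coating via K = CC / CG; it is that mass which the Sauerbrey equation then translates into a frequency shift.

```python
def sorbed_mass_g(K, c_gas_g_per_cm3, coating_volume_cm3):
    """Equilibrium mass of analyte in the coating, from K = C_C / C_G."""
    c_coating = K * c_gas_g_per_cm3       # concentration in the coating
    return c_coating * coating_volume_cm3

# Illustrative values: K = 1000, 1e-9 g/cm3 of vapour, 1e-5 cm3 of coating.
print(sorbed_mass_g(1000, 1e-9, 1e-5))   # 1e-11 g of sorbed vapour
```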
Data Pre-Processing
Previous studies suggest the pre-processing algorithm employed is important in determining the performance of the pattern recognition method. Various pre-processing parameters have been used in the field of gas sensing; for example, Heiland (1982), Yannopoulos (1987) and Horner and Hierold (1990) used difference models, relative models, fractional difference models and normalisation procedures. However, for bulk acoustic wave quartz crystal microbalances, data extraction from sensor response curves has not yet been fully exploited. The overall response magnitude is the most commonly used descriptor of sensor response; however, the gradient of the initial response or of the sensor recovery may also be a useful variable in pattern recognition regimes. Nanto et al. (1993) have used parameters which characterise the transient responses of quartz crystals to discriminate aromas. Edmonds et al. (1986) have also used initial rate measurements of quartz crystals and compared these to equilibrium shift values. Saunders et al. (1995) have examined the time-dependent frequency responses (termed kinetic signatures) of sensors. Using the initial (non-equilibrium) sensor response has two significant advantages: (1) it will reduce analysis time and (2) it may increase the lifetime of the sensor.
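As a simple illustration of extracting more than the equilibrium magnitude from a response curve, the sketch below computes the overall response and the initial gradient from a synthetic sorption transient; the exponential curve and time constant are invented for the example, not taken from real sensor data.

```python
import numpy as np

# Synthetic response curve: frequency shift (Hz) sampled once per second.
t = np.arange(0, 60, 1.0)
f = -120 * (1 - np.exp(-t / 10.0))   # invented sorption transient

magnitude = f[-1]                                  # overall response magnitude
initial_gradient = (f[5] - f[0]) / (t[5] - t[0])   # slope over the first 5 s
print(magnitude, initial_gradient)
```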
Since the efficiency of pattern recognition methods relies on samples having different response patterns, it is crucial to maximise these differences using the most appropriate pre-processing algorithm. The pre-processing methods that have been employed for gas sensing with arrays by other workers are listed in Table 1.1, with formulae applying to signals obtained from an array of quartz crystal microbalances.
Table 1.1 Pre-processing methods which have been employed for gas sensing with arrays (F(t)ij is the frequency of sensor i after t seconds of exposure to gas j, F0(t)ij is the initial frequency at t = 0 seconds, n is the number of sensors in the array, s is the standard deviation, and x̄ is the mean of the fractional frequency responses of sensor i to all gases).
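Since the body of the table is not reproduced here, the sketch below gives standard forms of the models named above (difference, relative and fractional difference models, plus one common normalisation and autoscaling using x̄ and s); the exact formulae used by individual workers may differ.

```python
import numpy as np

# F: frequencies of n sensors after t seconds of exposure to one gas;
# F0: the corresponding initial frequencies (illustrative values, Hz).
F  = np.array([9_999_500.0, 9_999_700.0, 9_999_900.0, 9_999_600.0])
F0 = np.array([10_000_000.0] * 4)

difference = F - F0                        # difference model
relative   = F / F0                        # relative model
fractional = (F - F0) / F0                 # fractional difference model
normalised = difference / np.linalg.norm(difference)  # one common normalisation
autoscaled = (fractional - fractional.mean()) / fractional.std()
```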
Suspected outliers can be detected by comparing the difference between the suspect value and the measurement nearest to it in value with the difference between the highest and lowest measurements. The ratio of these differences is known as Dixon’s Q, where
Q = | suspect value - nearest value | / (largest value - smallest value)
If the calculated value of Q exceeded the critical value for P = 0.05, as given by Miller and Miller (1992), the suspect value was rejected. Although outliers were identified using this method, they were retained in the data used to assess the pattern recognition methods investigated in this chapter, to allow a direct comparison of the ability of these methods to detect outliers in addition to classifying individuals.
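The test described above is easy to state in code. The sketch below computes Q for the most extreme measurement in a sample; the critical value (about 0.71 for five measurements at P = 0.05, from published tables such as Miller and Miller's) must be looked up for the sample size in question.

```python
def dixons_q(values):
    """Dixon's Q for the most extreme value in the sample."""
    xs = sorted(values)
    spread = xs[-1] - xs[0]
    q_low  = (xs[1] - xs[0]) / spread     # smallest value as the suspect
    q_high = (xs[-1] - xs[-2]) / spread   # largest value as the suspect
    return max(q_low, q_high)

measurements = [100.5, 100.7, 100.9, 101.2, 102.1]
print(dixons_q(measurements))  # 0.5625 < 0.71, so no value is rejected
```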
Methods for classification can be divided into unsupervised and supervised approaches. The difference is that supervised approaches require a training set, meaning that samples of known origin and classification must first be analysed in order to establish a model. Unsupervised methods require no prior training set. The most commonly employed unsupervised classification methods are clustering techniques, which fall into the categories of hierarchical clustering and partitional clustering; examples include hierarchical cluster analysis and self-organised mapping (SOM) by a Kohonen network respectively. There are two main approaches to supervised learning, known as soft modelling and hard modelling. Soft independent modelling of class analogy (SIMCA) is an example of soft modelling. Examples of hard modelling include linear discriminant function analysis (LDFA) and artificial neural networks using back-propagation, the former method employing numerical computation and the latter natural computation.
Unsupervised Classification - Principal Component Analysis
Principal component analysis aims to reduce the data from a large number of original measurements (e.g. 6 sensor responses) to a small number of principal trends (e.g. two or three main factors). The principles of PCA can be understood through a series of steps in visualising the data. The first step is to visualise the data as a cloud of points in a low dimensional space. This idea is illustrated in Figure 1.5 for one and two dimensions:
Figure 1.5 Point clouds in (a) one and (b) two dimensions (X1 = sensor 1, X2 = sensor 2).
With a single sensor measurement on each of a number of samples, the measurements may be plotted along a line, one point per sample, the distance along the line representing the value of the measurement (Figure 1.5(a)). With two sensor measurements two axes are needed to produce the scatter plot shown in Figure 1.5(b), in which, as before, each point is a sample and the distances in the two directions represent the sensor measurements. The extension to three sensor measurements per sample is also easy to visualise. Figure 1.5(b) is a two dimensional data set represented in two dimensions, that is, each sensor (dimension) has an axis. One easy way of reducing the dimensionality is to project the points onto a smaller dimensional subspace. A projection from two dimensions (a plane) onto one dimension (a line) is shown in Figure 1.6.
Figure 1.6 A projection from two dimensions to one
There is one point on the line for each of the original points in two-dimensional space. Performing this projection has created a new and simpler picture, but it has also created a new variable z. The value of this constructed variable may be obtained for each sample by measuring along the line from the origin to the point corresponding to that sample.
The value of z may also be established, with the aid of trigonometry, by measuring the angles between the line and the two axes and calculating for each sample
z = w1 x1 + w2 x2
where x1 and x2 are the two original sensor measurements for the sample and w1 and w2 are the cosines of the corresponding angles (and thus are constants once the line is fixed). Therefore the new variable z is a linear combination of the original sensor array data. Projecting onto the line has created a new factor, the values of z are the scores of the samples on this factor, and the w’s are called weights. Projection can be used to reduce from any number of dimensions to any smaller number.
In Figure 1.6 the line onto which the points are projected could have been drawn at any angle to the axes, and the results would have been very different. Some directions give much more interesting projections than others, in the sense that the projected points are well spread out; a direction at right angles to the one shown would give much poorer separation.
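The sketch below carries out exactly this construction on a toy two-sensor data set: the direction of maximum spread is found as the leading eigenvector of the covariance matrix, and the scores z = w1 x1 + w2 x2 are obtained by projection (the numbers are invented for illustration).

```python
import numpy as np

# Toy data: each row is a sample, each column a sensor (x1, x2).
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])
Xc = X - X.mean(axis=0)                  # centre the cloud of points

cov = np.cov(Xc, rowvar=False)           # 2 x 2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
w = eigvecs[:, -1]                       # weights w1, w2: direction of max spread

z = Xc @ w                               # scores: z = w1*x1 + w2*x2 per sample
print(w, z)
```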
Carey et al. (1986), Ide et al. (1993) and Gardner et al. (1992a) have applied principal component analysis to analyse the response of piezoelectric devices and more recently to classify odours.
Supervised Classification - Linear Discriminant Function Analysis
One approach to supervised (discrimination) learning is hard modelling using linear discriminant function analysis (LDFA). Hard modelling uses a training set consisting of all classes of interest and then tries to set up a model that classifies an unknown sample unambiguously into one of the already established classes. This is achieved by calculating the Mahalanobis distance (Manly 1994) from the unknown sample to each group centroid and allocating the sample to the group whose centroid is closest. The Mahalanobis distance is a measure that takes into account correlations between variables. Statisticians have used hard modelling for over 50 years.
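A minimal sketch of this allocation rule is given below, with invented two-sensor training data; a pooled within-group covariance matrix is used so that the distance accounts for the correlations between variables, as described above.

```python
import numpy as np

def mahalanobis(x, centroid, cov_inv):
    d = x - centroid
    return float(np.sqrt(d @ cov_inv @ d))

# Invented training data: two groups of two-sensor responses.
A = np.array([[1.0, 2.0], [1.2, 2.1], [0.9, 1.8]])
B = np.array([[3.0, 1.0], [3.2, 0.8], [2.8, 1.1]])

# Pooled within-group covariance, then its inverse for the distance metric.
pooled = np.cov(np.vstack([A - A.mean(0), B - B.mean(0)]), rowvar=False)
cov_inv = np.linalg.inv(pooled)
centroids = {"A": A.mean(0), "B": B.mean(0)}

unknown = np.array([1.1, 1.9])
group = min(centroids, key=lambda g: mahalanobis(unknown, centroids[g], cov_inv))
print(group)  # -> "A": allocated to the nearest group centroid
```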
A discriminant function divides the available feature space into regions that represent different classes. An unknown is classified according to the region in space to which it belongs. For example, for a very simple discriminant function x = 0, positive x values are indicative of one class and negative values of another class. The simplest form of discriminant analysis constructs linear boundaries in space. Quadratic or higher order functions can also be used; however, in this work a linear function was used due to the limited amount of data available.
Multivariate analysis of variance (MANOVA) has been used by Ide et al. (1993) to determine several linear combinations (canonical discriminant functions) for separating groups. The first discriminant function gives the maximum possible F ratio (the between-group mean square divided by the within-group mean square) on a one-way analysis of variance for the variation within and between groups. The second function gives the maximum possible F ratio on a one-way analysis of variance provided there is no correlation between the first and second functions within groups.
The relationship between groups can be visualized by plotting these two functions for each individual in a similar manner to PCA when the principal components are plotted.
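As a sketch of how such a plot can be produced, scikit-learn's LinearDiscriminantAnalysis is used below as a stand-in for the MANOVA-based canonical functions described above (its discriminant axes likewise maximise the between-group to within-group variance ratio); the three groups of four-sensor responses are randomly generated for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Randomly generated data: three groups of four-sensor responses.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.2, size=(10, 4))
               for m in ([0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1])])
y = np.repeat(["g1", "g2", "g3"], 10)

lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
scores = lda.transform(X)  # the first two discriminant functions
# scores[:, 0] against scores[:, 1] can be scatter-plotted like PCA scores
print(scores[:3])
```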