Introduction

The detection of early glaucomatous visual field (VF) damage is of great clinical interest, especially considering that early treatment is effective in preventing and/or delaying glaucomatous damage and progression.1 Standard automated perimetry (SAP) is still the clinical gold standard in the diagnosis and follow-up of glaucoma;2 however, SAP does not selectively test a specific retinal ganglion cell (RGC) pathway,3 which may limit its sensitivity (Se) in detecting early glaucomatous damage, partly because of the inherent redundancy of the visual system.4 Several other perimetric techniques developed over the past 20 years have been shown to detect VF defects before SAP, including short-wavelength automated perimetry (SWAP),5 frequency-doubling technology (FDT),6, 7 rarebit perimetry (RBP),8 and pulsar perimetry (PP).9 These newer psychophysical tests were designed to isolate a sub-population of RGCs by selectively evaluating a specific visual function,10, 11, 12 thus potentially limiting RGC redundancy and improving the detection of early glaucomatous VF damage.

Frequency-doubling technology is a perimetric technique that has been shown to be useful over the past 10 years.6, 7 It is based on the frequency-doubling effect first described by Kelly,13 in which a sinusoidal grating of low spatial frequency undergoing counterphase flicker at high temporal frequency is perceived as having twice its actual spatial frequency. Numerous studies have shown that FDT is able to detect VF loss before SAP14 and that it can predict the progression and morphology of future SAP VF defects.15

Rarebit perimetry is a perimetric method developed by Frisén8, 16 in 2002. It utilises spatially and temporally minute test stimuli (microdots or ‘rare bits’) to avoid the simultaneous stimulation of numerous retinal receptive fields, which may give an underestimation of the VF defect. RBP has shown promising preliminary results in the early detection of VF damage in patients with neurological disorders8 and glaucoma.17, 18

Pulsar perimetry is a psychophysical test developed by Gonzales de la Rosa and co-workers in 2000,9 which tests several visual functions, including colour perception, motion and temporal modulation, and contrast sensitivity function (CSF). Its use in glaucoma diagnosis is relatively recent.19 The PP T30W test was designed for glaucoma testing and tests temporal and spatial CSF simultaneously.9 It examines the central 30° VF using a circular sinusoidal grating pattern that undergoes a counterphase pulse motion at 30 Hz, in which both spatial resolution (SR) and contrast are simultaneously modified.19

The aim of our study was to compare the ability of FDT, RBP, and PP to discriminate between healthy and early glaucomatous eyes.

Materials and methods

This prospective observational cross-sectional study included 108 consecutive subjects: 54 normal subjects and 54 patients with primary open-angle glaucoma (POAG). The study complied with the tenets of the Declaration of Helsinki, and informed consent was obtained from all participants before testing. All subjects underwent a complete ophthalmologic examination and VF testing with SAP, FDT, RBP, and PP. Normal subjects had no previous experience with VF testing; all glaucomatous patients had undergone at least three SAP tests before enrolment, but were not familiar with the non-conventional testing methods. All patients underwent brief training sessions and practice runs with all instruments before testing. VF tests were carried out in random order within a 3-month period. Only one eye per patient was randomly selected for analysis. Normal subjects were recruited from staff members and volunteers. Glaucoma patients were sequentially recruited from the Department of Ophthalmology Glaucoma Clinic at S. Maria della Misericordia Hospital, Udine, Italy. These patients were all suspect or early glaucoma subjects who had been referred to our clinic for further assessment and diagnosis because of repeatable abnormal SAP results. The study was approved by our ethics committee and Institutional Review Board (IRB). We certify that all applicable institutional and governmental regulations regarding the ethical use of human volunteers were followed during this research.

The inclusion criteria were as follows: best corrected visual acuity better than or equal to 0.7 decimal; open anterior chamber angle; absence of ocular pathology other than glaucoma; reliable VF test results; and willingness to provide informed written consent. The exclusion criteria were as follows: ametropia >±5 dioptres; pupil diameter <2 mm; anterior chamber angle alterations; secondary causes of glaucoma; advanced glaucomatous VF defects; diabetes mellitus; neurological disorders; medication that could modify VF results; and previous intraocular surgery.

Optic nerve head (ONH) and retinal nerve fibre layer (RNFL) appearance were assessed by an expert ophthalmologist with slit-lamp indirect ophthalmoscopy and a 78-dioptre lens. Normal ONH and RNFL appearance was clinically defined as: inter-eye vertical cup-to-disc asymmetry <0.2; cup-to-disc ratio <0.6 (in normal sized optic discs); and the absence of diffuse or focal rim thinning, cupping, localised pallor, optic disc haemorrhage, or RNFL defects.

Standard automated perimetry testing was carried out using the Humphrey Field Analyzer (HFA) II 750 (Carl Zeiss Meditec Inc., Dublin, CA, USA) 30–2 test with the standard Swedish Interactive Thresholding Algorithm (SITA) strategy. Reliability criteria for HFA tests, based on the manufacturer's recommendations, were false-positive responses <15%, false-negative responses <33%, and fixation losses <20%. A normal SAP testing result was defined according to the Hodapp et al20 criteria. SAP tests were classified as glaucomatous according to the Anderson and Patella21 criteria, in which at least one of the following was present: (1) a cluster of 3 points in the pattern deviation probability (PDP) plot, located in areas which are typical of glaucoma, having a probability level of P<5%, with at least one point having a probability level of P<1%; none of the points could be edge-points unless they were located immediately above or below the nasal horizontal meridian; (2) pattern standard deviation (PSD) with a probability level of P<5%; (3) glaucoma hemifield test (GHT) ‘outside normal limits’. Only early glaucomatous SAP VF defects, having a mean deviation (MD) better than −5.0 dB and a PSD <5.0 dB, were included.
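The reliability and inclusion thresholds described above can be written as simple checks. The following is a minimal Python sketch, not the HFA software's logic: the `SapResult` class and all of its field names are hypothetical, and the point-wise cluster criterion is reduced to a pre-computed flag because it requires the full PDP map.

```python
from dataclasses import dataclass


@dataclass
class SapResult:
    """Hypothetical container for one HFA 30-2 SITA standard test."""
    false_pos_pct: float      # false-positive responses, %
    false_neg_pct: float      # false-negative responses, %
    fixation_loss_pct: float  # fixation losses, %
    md_db: float              # mean deviation, dB
    psd_db: float             # pattern standard deviation, dB
    cluster_abnormal: bool    # criterion (1): qualifying 3-point cluster in the PDP plot
    psd_p_lt_5: bool          # criterion (2): PSD with P<5%
    ght_outside_normal: bool  # criterion (3): GHT 'outside normal limits'


def is_reliable(r: SapResult) -> bool:
    """Reliability thresholds quoted in the text for HFA tests."""
    return (r.false_pos_pct < 15
            and r.false_neg_pct < 33
            and r.fixation_loss_pct < 20)


def is_glaucomatous(r: SapResult) -> bool:
    """At least one of the three criteria must be present."""
    return r.cluster_abnormal or r.psd_p_lt_5 or r.ght_outside_normal


def is_early_glaucomatous_defect(r: SapResult) -> bool:
    """Reliable, glaucomatous, MD better than -5.0 dB and PSD < 5.0 dB."""
    return (is_reliable(r) and is_glaucomatous(r)
            and r.md_db > -5.0 and r.psd_db < 5.0)
```

Keeping the three checks separate mirrors the order in which they are stated in the text: reliability first, then classification, then the severity limit.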

The patients were classified into two groups according to the following criteria:

  1. Control group (54 eyes): normal intraocular pressure (IOP), ONH and RNFL appearance, and SAP results; no family history of glaucoma and other ocular pathologies;

  2. POAG group (54 eyes): IOP >21 mm Hg before medication, and reproducible glaucomatous SAP VF defects.

Instruments

Frequency-doubling technology

The FDT test was carried out with the FDT N-30 full-threshold procedure (Welch Allyn FDT, Skaneateles Falls, New York and Carl Zeiss Meditec, Dublin, CA, USA), which has been described elsewhere.7 In brief, the FDT test stimulus consists of a sinusoidal grating of low spatial frequency (0.25 cycles/degree) undergoing counterphase flicker at high temporal frequency (25 Hz) superimposed on a dim background; contrast is modified according to a Modified Binary Search strategy.22 The threshold value for each test location is defined as the minimal contrast at which the pattern is detected. The FDT N-30 test includes 19 stimuli (18 square 10° × 10° targets and one central 5° × 5° circular target). The mean background illumination is 100 cd/m2 (31.5 asb) and the contrast ranges from 56 dB (0%) to 0 dB (100%). The FDT reliability criteria used were fixation loss, false-negative, and false-positive responses <33%.

Rarebit perimetry

The rarebit perimetry procedure has been described elsewhere.8, 16 In brief, RBP (version 4.0) is carried out on a standard PC with a 15-inch liquid crystal display. The software (Microsoft Windows format) is available free of charge from the author (lars.frisen@neuro.gu.se). The test stimulus is composed of two microdots with a diameter of one-half the minimum angle of resolution, spaced 4° apart and simultaneously shown for 200 ms. The paired dots are oriented either horizontally or vertically, and appear on the screen at random positions within each of 24 rectangular test areas. The areas tested range from 6° × 8° centrally to 6° × 14° peripherally, covering a horizontal eccentricity of 27.5°, a superior vertical eccentricity of 20° and an inferior vertical eccentricity of 22.5°. The test probes the 24 locations twice in each run. A minimum of five repeated runs is recommended. As a control, 10% of the presentations are composed of either a true blank or a single dot. The target and background luminance are set at 150 and 1 cd/m2, respectively. Fixation is not monitored but is encouraged by a computer-generated target that moves to pseudo-random positions after each presentation.

Subjects are instructed to maintain fixation on the target throughout the test, and to indicate the number of microdots seen (0, 1, or 2) by clicking a computer mouse not at all, once, or twice after each presentation. The test is carried out at a distance of 0.5 m to test the 20 peripheral areas, and then at 1 m for the four central areas. The RBP test results are shown as a hit-rate percent, defined as the total number of dots seen divided by the total number of dots shown. The printout provides a mean hit rate (MHR) and standard deviation (SD) (MHR–SD), representing an average of all tested areas (except for the one closest to the blind spot). Mean miss rates are also provided for each of the 24 tested areas in the form of a percent (shown as 0–100% in 10% increments when five RBP runs have been completed). The error statistic value represents the sum of the responses to control presentations, which should be close to zero. Only reliable RBP tests, defined as having an error statistic value of <2, were considered.
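The hit-rate definition above (dots seen divided by dots shown, averaged over all areas except the one closest to the blind spot) can be sketched in a few lines of Python. This is an illustration only, not the RBP software: the function names, the input layout, and the use of the sample SD are assumptions, and the counts in the usage example are invented.

```python
from statistics import mean, stdev


def hit_rate_percent(dots_seen: int, dots_shown: int) -> float:
    """Hit rate as defined in the text: dots seen / dots shown, in percent."""
    if dots_shown <= 0:
        raise ValueError("dots_shown must be positive")
    return 100.0 * dots_seen / dots_shown


def summarise_rbp(area_counts: dict, blind_spot_area) -> tuple:
    """Return (MHR, SD, per-area miss rates).

    area_counts maps a hypothetical area id to (dots_seen, dots_shown).
    The area closest to the blind spot is left out of MHR and SD.
    The sample SD is an assumption; the source does not say which SD is used.
    """
    rates = {a: hit_rate_percent(s, n) for a, (s, n) in area_counts.items()}
    included = [r for a, r in rates.items() if a != blind_spot_area]
    miss_rates = {a: 100.0 - r for a, r in rates.items()}
    return mean(included), stdev(included), miss_rates


# Usage with invented counts for three areas; area "B" plays the blind-spot area.
mhr, sd, miss = summarise_rbp(
    {"A": (18, 20), "B": (9, 20), "C": (20, 20)}, blind_spot_area="B"
)
print(f"MHR = {mhr:.1f}%, SD = {sd:.1f}%")
```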

Pulsar perimetry

Pulsar perimetry was used to evaluate all patients with the T30W test (Haag-Streit International, Bern, Switzerland). PP has already been described elsewhere.9, 19, 23 In brief, the PP stimulus consists of a circular sinusoidal 5° diameter grating pattern that is presented for 500 ms. The circular wave pattern is formed by light and dark alternating concentric bands that gradually decrease in contrast near the peripheral edges, until blending with the background of similar luminance (100 asb; 31.7 cd/m2). The circular sinusoidal grating stimulus undergoes a counterphase pulse motion at 30 Hz, in which both SR (from 0.5 to 6.3 cycles/degree on a 12-step log scale) and contrast (C, from 3 to 100% on a 32-step log scale) are simultaneously modified. Threshold Se is expressed in SR contrast units (src). The 36 levels range from easily seen targets of 0 src units to those most difficult to detect at 35 src units.

A tendency oriented perimetry (TOP) threshold strategy is used, in which each VF position is tested only once and thresholds are based on sensitivities obtained from surrounding areas.24 The T30W test examines 66 areas of the central VF separated by 6°, covering a horizontal eccentricity of 30° and a vertical eccentricity of 24°. Subjects are instructed to fixate on the centre fixation mark and click on the joystick when a stimulus is detected anywhere on the screen. The printout provides a numeric threshold plot of Se, grey scale plot, indices (mean sensitivity, MS; mean defect, MDf; loss variance square root, sLV), deviation curve (Bebie curve), and a comparison probability (CP) plot with three P-levels of abnormality. The manufacturer's recommended reliability criteria for PP included false-positive and false-negative responses <33% and fixation losses <20%.

Main outcome measure

The parameters considered in the analysis were as follows:

  1. Standard automated perimetry: MD, PSD, number of significantly depressed points with P<5% (NP<5%) and P<1% (NP<1%) in the PDP plot, and test duration;

  2. Frequency-doubling technology: MD, PSD, number of significantly abnormal areas with P<5% (NP<5%) and P<1% (NP<1%) in the PDP plot, and test duration;

  3. Rarebit perimetry: MHR; MHR–SD; number of tested areas with hit rate <90% (N HR<90%), number of tested locations with miss rates 50% (N MR50%), and 70% (N MR70%); and test duration;

  4. Pulsar perimetry: MDf, sLV, number of significantly abnormal areas with P<5% (NP<5%) and P<1% (NP<1%) in the CP, and test duration.

Statistical analysis

Left eye results were converted to a right eye format for the analysis. Normality of the data distribution was assessed with the Kolmogorov–Smirnov test. Differences between test results were calculated using the analysis of variance for variables that showed a normal distribution, and the Friedman test for those that did not. Comparisons between groups were assessed using the unpaired t-test for variables that showed a normal distribution and the Mann–Whitney test for those that did not. Fisher's least significant difference (LSD) test was used for pairwise multiple comparisons. The best cut-off point (defined as the value dividing healthy from glaucomatous eyes with the highest probability); Se at fixed specificities (Sp) of 80, 90, and 95%; Se and Sp at the best cut-off; and area under the receiver operating characteristic curve (AROC) for detecting glaucoma were calculated for all parameters considered for each instrument. The gold standard for diagnosis was based on the most recent and repeatable HFA 30-2 test; IOP measurement with Goldmann applanation tonometry; and clinical evaluation of ONH and RNFL appearance with fundus biomicroscopy. The parameter from each instrument with the highest AROC in diagnosing glaucoma was included in the comparison among instruments. Differences between sensitivities were calculated using the χ2-test; differences between the AROCs were evaluated using the Hanley–McNeil method.25
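As an illustration of the quantities named above, the AROC and the Se at a fixed Sp can be computed directly from the two groups' scores. The sketch below is not the SPSS procedure used in the study. It assumes that a higher score means a more abnormal result (for a parameter such as MHR, where lower is worse, the scores would be negated first), and the function names and example scores are hypothetical.

```python
def auc_mann_whitney(diseased, healthy):
    """AROC as the fraction of (diseased, healthy) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def sensitivity_at_specificity(diseased, healthy, target_sp):
    """Highest Se over all cut-offs whose Sp is at least target_sp.

    An eye is called abnormal when its score is strictly above the cut-off.
    """
    best_se = 0.0
    candidates = [float("-inf")] + sorted(set(diseased) | set(healthy))
    for cutoff in candidates:
        sp = sum(h <= cutoff for h in healthy) / len(healthy)
        se = sum(d > cutoff for d in diseased) / len(diseased)
        if sp >= target_sp:
            best_se = max(best_se, se)
    return best_se


# Invented scores, for illustration only.
poag = [3, 4, 5, 6]
controls = [1, 2, 3, 4]
print(auc_mann_whitney(poag, controls))
print(sensitivity_at_specificity(poag, controls, 0.75))
```

The pairwise formulation is quadratic in the sample size, which is acceptable for cohorts of around a hundred eyes and keeps the link between the AROC and the Mann–Whitney statistic explicit.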

The statistical analysis was carried out using SPSS 11.0 for Windows (SPSS Inc., Chicago, IL, USA). Statistical significance was defined as P<0.05.
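For readers who wish to reproduce the AROC comparison outside SPSS, the Hanley–McNeil standard error and the corresponding z-test can be sketched as follows. The correlation `r` between the two AROCs is not reported in this paper; in the original method it is obtained from a look-up table, so here it is left as an input and its default of 0 is an assumption that ignores the pairing of the tests.

```python
import math


def hanley_mcneil_se(auc: float, n_diseased: int, n_healthy: int) -> float:
    """Standard error of an AROC (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_diseased - 1) * (q1 - auc ** 2)
        + (n_healthy - 1) * (q2 - auc ** 2)
    ) / (n_diseased * n_healthy)
    return math.sqrt(variance)


def compare_aucs(auc1, auc2, n_diseased, n_healthy, r=0.0):
    """z statistic and two-sided P for two AROCs from the same cases.

    r is the correlation between the two areas (assumed, not estimated here).
    """
    se1 = hanley_mcneil_se(auc1, n_diseased, n_healthy)
    se2 = hanley_mcneil_se(auc2, n_diseased, n_healthy)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    z = (auc1 - auc2) / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Example call with the group sizes of this study (52 POAG, 53 control eyes)
# and two of the reported AROCs; r=0.0 is an assumption.
z, p = compare_aucs(0.933, 0.950, n_diseased=52, n_healthy=53, r=0.0)
print(f"z = {z:.2f}, P = {p:.2f}")
```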

Results

Four eyes were excluded: two due to unreliable SAP results and two because of unreliable RBP results. A total of 105 eyes from 53 normal subjects and 52 POAG patients met our inclusion criteria. Demographic and clinical characteristics of the study population are summarised in Table 1. Statistically significant differences between the two groups were found for all SAP parameters (P<0.0001).

Table 1 Patient demographics

Table 2 lists the mean, SD, and comparison P-values for the parameters of the three VF tests in the control and POAG groups. All parameters considered differed significantly between control and POAG eyes (P<0.0001), except for PP test duration (P=0.73). The durations of the three non-conventional VF tests were significantly shorter than that of SAP (data not shown, Fisher's LSD test, P<0.0001). PP showed the shortest test time (data not shown, Fisher's LSD test, P=0.002).

Table 2 Test results

Table 3 lists the best cut-off values; Se at 80, 90, and 95% Sp; Se and Sp at the best cut-off; and AROCs for glaucoma diagnosis for all parameters considered for the three instruments. The best cut-off reported here is the value giving the best overall accuracy in a cohort with equal numbers of diseased and non-diseased eyes; it may not be the best criterion for population screening or for diagnosis in a clinical office, where the proportion of individuals with glaucoma is different. The AROCs for discriminating between POAG and healthy eyes for FDT, RBP, and PP ranged from 0.706 to 0.933; 0.858 to 0.950; and 0.856 to 0.940, respectively. The single parameters associated with the greatest AROC were, respectively, NP<5% in the PDP; MHR; and MDf.
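The dependence of a cut-off's usefulness on the proportion of diseased individuals can be made concrete with predictive values. The sketch below applies Bayes' rule to a given Se, Sp, and prevalence; the Se and Sp of 0.90 and the 2% screening prevalence used in the example are hypothetical and are not taken from Table 3.

```python
def predictive_values(se: float, sp: float, prevalence: float):
    """Positive and negative predictive values for a given prevalence."""
    tp = se * prevalence                  # true positives per unit population
    fn = (1.0 - se) * prevalence          # false negatives
    fp = (1.0 - sp) * (1.0 - prevalence)  # false positives
    tn = sp * (1.0 - prevalence)          # true negatives
    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    return ppv, npv


# Same hypothetical test in a balanced cohort and in a screening setting.
for prevalence in (0.50, 0.02):
    ppv, npv = predictive_values(se=0.90, sp=0.90, prevalence=prevalence)
    print(f"prevalence {prevalence:.2f}: PPV = {ppv:.2f}, NPV = {npv:.3f}")
```

With these assumed values the positive predictive value falls from 0.90 in the balanced cohort to about 0.16 at 2% prevalence, which is why a cut-off tuned on equal group sizes may need to be shifted towards higher Sp for screening.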

Table 3 Best cut-off, sensitivity (Se), specificity (Sp), and area under the ROC curve (AROC) for discriminating healthy from POAG eyes

Statistical comparisons of Se, Sp, and AROC between the best parameters of each instrument are shown in Table 4. The differences among VF tests for Se at Sp set at 80 and 90%, Se and Sp at the best cut-off point, and AROCs were not statistically significant (P=0.16–1.0). Se at 95% Sp was significantly higher for RBP compared with the other tests (P=0.002 and 0.004).
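A comparison of two sensitivities of the kind reported above can be sketched as a Pearson χ2-test on a 2 × 2 table of detected and missed POAG eyes. This is an illustration rather than the exact SPSS procedure: no continuity correction is applied, the two tests are treated as independent samples although they were performed on the same eyes, and the counts in the example are invented.

```python
import math


def chi2_2x2(a_pos: int, a_neg: int, b_pos: int, b_neg: int):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table.

    Rows are the two tests; columns are detected / missed eyes.
    """
    row_a, row_b = a_pos + a_neg, b_pos + b_neg
    col_pos, col_neg = a_pos + b_pos, a_neg + b_neg
    if 0 in (row_a, row_b, col_pos, col_neg):
        raise ValueError("every row and column total must be non-zero")
    n = row_a + row_b
    chi2 = n * (a_pos * b_neg - a_neg * b_pos) ** 2 / (
        row_a * row_b * col_pos * col_neg
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: test A detects 45 of 52 POAG eyes, test B detects 35 of 52.
chi2, p = chi2_2x2(45, 7, 35, 17)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Because every eye underwent every test, a paired method such as McNemar's test would be a reasonable alternative; the χ2 form is shown here because it is the test named in the statistical analysis.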

Table 4 Statistical comparison expressed as P-values of sensitivity (Se), specificity (Sp), and area under the ROC curve (AROC) for discriminating between healthy and POAG eyes of the best parameter of each instrument

Discussion

Our study compares the diagnostic accuracy of three non-conventional perimetric tests in discriminating between healthy and early glaucomatous eyes. A direct comparative study involving FDT, RBP, and PP has not been previously reported. These psychophysical testing methods have been designed to selectively stimulate a specific visual pathway, which may be prone to early glaucomatous damage and be of limited redundancy.26 Some studies have reported nonselective RGC loss in glaucomatous eyes;27 however, numerous studies have shown that VF testing of specific RGC sub-populations can be more sensitive than SAP in the early detection of glaucoma because of reduced RGC redundancy.10, 11, 12, 27 Numerous studies have shown that SWAP and FDT are able to detect VF loss before SAP and predict the progression and morphology of future SAP VF defects.5, 14, 15, 28

Pulsar perimetry and FDT use a similar counterphase flickering grating stimulus with high temporal frequency. The difference between them is that FDT uses a fixed low spatial frequency and varies contrast, whereas PP T30W measures contrast threshold resolution at several spatial frequencies.9 It has been reported that FDT selectively analyses a subset of the magnocellular pathway, known as My cells;29 however, several studies have shown that the existence of My cells in primates is debatable and that the neurophysiologic substrate for the FD illusion may lie at higher cortical levels.30 PP uses targets that reach high spatial frequencies, which theoretically could stimulate the parvocellular system, while the high temporal frequency component could act on the magnocellular system.19

Rarebit perimetry does not assess VF threshold Se at tested points, but aims at evaluating the integrity of the visual system by assessing the proportion of observed stimuli compared with the total number of microdot presentations. The basic assumption behind RBP is that although the total number of RGCs may differ in the general population, the neuro-retinal architecture in most normal eyes should be complete, thus permitting the detection of paired dots of appropriate size, contrast, and separation in the VF.31 Some misses can be explained by factors such as the blind spot, angioscotomas, age-related neuro-retinal architecture depletion, blinks, artefacts, and attention lapses.32 Based on the characteristics of the stimulus, the pathway probably assessed with RBP is the parvocellular pathway.8

Our results show that FDT, RBP, and PP have desirable characteristics for early glaucoma detection. Several parameters for each test provided good diagnostic accuracy, in agreement with previous reports. The ability of FDT to diagnose early POAG eyes was comparable with that reported in other studies.15, 33, 34 Several abnormality criteria for FDT have been proposed; however, in agreement with previous reports,34 our data show that the criterion with the best performance in discriminating between glaucomatous and healthy eyes was the presence of two or more abnormal locations on the PDP plot, regardless of the probability level of the defect. Our RBP results were comparable with those of a study using a similar cohort of early POAG patients.17 With regard to diagnostic performance, PP showed a good ability to detect early-stage glaucoma, with MDf being the best diagnostic parameter, as in other studies.19

The best performing parameters of the three perimetric tests showed comparable AROCs (Tables 3 and 4). Our study design involved a randomised testing order, so it is unlikely that fatigue or learning effects introduced bias. Moreover, considering that age affects RBP test results35 and that RBP results are not corrected for age, the fact that healthy subjects were age-matched with the glaucomatous patients may reasonably exclude the possibility of having artificially skewed the AROCs. Our results are consistent with previous findings showing that glaucoma does not necessarily affect one selective RGC subtype first.10, 11, 34, 36, 37 The differences among tests suggest that not all eyes with POAG are affected in the same manner, which may explain the initial loss of parvocellular VF Se in some patients and the early loss of the magnocellular system in others.10

When compared with FDT and PP, RBP showed similar Se at 80 and 90% Sp. RBP provided slightly (not significantly) lower Se and higher Sp at the best cut-off. It also showed significantly higher Se at 95% Sp, which may suggest a role for RBP as a screening test. Considering that the individual results for most instruments are compared with a normative database and assessed in terms of SD,38 our results suggest that the current statistical packages of FDT and PP provide similar performances, whereas RBP, which does not provide a comparison with an internal database, may tend to have (at least in our population) a slightly higher Sp. No single diagnostic test showed perfect Se and Sp; thus, clinical decisions should not be based on isolated test results. Our study is limited in that topographic concordance of defect areas among instruments was not considered. VF tests were not repeated for confirmation of defect severity and location, which may have improved Se and decreased Sp, although our results showed good specificities for all perimetric tests.

Frequency-doubling technology, RBP, and PP testing took significantly less time than the SAP SITA standard test. Moreover, PP was significantly faster than FDT and RBP because of the TOP strategy used, in which each position is tested only once.24 The TOP strategy is an algorithm that estimates thresholds from information gathered from adjacent points.39 Several studies have shown this strategy of determining threshold Se to be highly correlated with conventional staircase strategies;39, 40 others, however, have found that the TOP strategy tends to underestimate the depth of focal defects and that defects appear smaller and have softer edges.40, 41, 42

Our study has some limitations. Our cohort of patients is relatively small and may not be representative of the general population. Furthermore, patients participating in this study had previous experience in automated perimetry; thus, results may have been different in subjects lacking VF testing experience. In addition, subjects with suspect ONH or glaucomatous optic neuropathy (GON) having a normal IOP and VF were excluded from the control group. These strict criteria for normal subjects may have produced a supernormal population, which may have contributed to the failure of the FDT and PP statistical packages to flag the expected 5% of points at the P<5% level and 1% of points at the P<1% level in the controls, and to the 100% Sp observed for some parameters, which contrasts with clinical experience. This limitation is common in studies similar to ours and is difficult to overcome. Moreover, the use of repeatable SAP VF defects as the gold standard to define POAG is also a limitation, in that sensitivity may be falsely enhanced by the exclusion of preperimetric patients, who have a normal VF but early morphological optic nerve and/or RNFL defects.

In summary, FDT, RBP, and PP are fast and easy perimetric procedures that provide similarly good diagnostic accuracy in discriminating between healthy and early POAG eyes. Further studies are needed to assess the ability of these tests to define VF damage severity and topography. Longitudinal studies are currently underway in this cohort of patients, who are undergoing repeated SAP and non-conventional VF testing; thus, the use of SAP as the gold standard in our present study can be confirmed in future studies. Although SAP is considered the gold standard, non-conventional testing may prove useful in providing additional information in the diagnosis of glaucoma suspects with normal SAP results and in the therapeutic decision-making process for early glaucomatous patients. These modern perimeters, which may be faster and easier to perform, could be used instead of SAP in subjects who may have trouble undergoing SAP testing (eg, the elderly, patients with neurologic or retinal disorders, and children); however, further studies are needed. Test–retest variability also needs to be compared among instruments in multicentre long-term studies to assess clinical use in monitoring glaucoma progression.