Methodological considerations related to sleep paradigm using event related potentials

In the last few decades, several studies of event-related potentials (ERPs hereafter) during sleep have been reported. Despite these numerous studies, clear methodological rules for this kind of research are often missing, making it difficult to assess the scope of the results. We describe here the methodological aspects to be considered when evaluating ERPs during sleep. The use of the Rechtschaffen and Kales rules versus automatic methods is assessed, together with the additional use of certain quantitative measures. In addition, two issues that must be controlled in ERP sleep studies are discussed: the First Night Effect and sleep disturbances. Better control of experimental paradigms is relevant for the growth of the neuroscience of sleep.
Key terms: sleep, ERP, Rechtschaffen and Kales rules, First Night Effect, automatic sleep scoring systems
Corresponding author: A Ibanez (amibanez@puc.cl; web: http://neuro.udp.cl/)
Received: January 21, 2008. In revised form: September 12, 2008. Accepted: October 20, 2008.
IBÁÑEZ ET AL. Biol Res 41, 2008, 271-275

INTRODUCTION
In the last few decades, several works on ERP research during sleep have emerged. Sleep studies have reported ERPs related to intrinsic sleep dynamics (e.g., Coté et al., 1999), sensory activity (Atienza et al., 2001), and even high-level cognitive processes typically reported in wakefulness (e.g., Ibáñez et al., 2006; see reviews: Colrain and Campbell, 2007; Ibáñez et al., 2008). Despite these numerous studies, clear methodological rules for this kind of research are often missing, making it difficult to assess the scope of the results. In an attempt to fill this gap, we discuss three elemental issues for any study of ERPs during sleep: methods for monitoring sleep staging; controlling the First Night Effect in laboratory research; and detecting and ruling out sleep abnormalities. Each of these is the subject of challenging controversies when considering experimental results in this field.

How to monitor sleep stages: Rechtschaffen and Kales vs automatic measures
The most commonly used method for determining sleep stages in electrophysiological studies is the so-called Rechtschaffen and Kales rules (1968; R&K hereafter). This method involves the examination of polysomnograms that combine electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) measures.
R&K are a set of descriptive, qualitative rules designed to differentiate sleep stages and different events during sleep, for example: "Stage I is defined by a relatively low voltage…the transition from an alpha record to stage I is characterized by a decrease in the amount, amplitude and frequency of alpha activity" (Rechtschaffen and Kales, 1968, p. 5). This method has frequently been criticized because it implies subjective interpretation on the part of the human scorer (e.g., Himanen and Hasan, 2000) and because of other specific problems (e.g., the low temporal resolution provided by the one-page epoch, the division of sleep processes into a few discrete stages, and the neglect of spatial information; Hasan, 1996). Nevertheless, this methodology remains the best criterion for distinguishing sleep stages. Many clinical and physiological studies of sleep take these criteria as the standard and only method of sleep scoring (e.g., Ferri et al., 2001; Kaufmann et al., 2006; Peszka and Harsh, 2002), and most ERP studies during sleep have been based on it.
On the other hand, automatic sleep scoring systems (Agarwal and Gotman, 2002; Hasan, 1996; Penzel and Conradt, 2000; Robert et al., 1998), based on the different processing approaches currently available (e.g., artificial neural networks, period analysis, automated hypnogram, multiple discriminant analysis, hybrid techniques, Bayesian approaches, pattern recognition, wave detection, expert systems), are not good enough to replace the established R&K rules (Penzel and Conradt, 2000). The information available about sleep stages is imprecise and always difficult to encode into an algorithm (Agarwal and Gotman, 2002), and unexpected artifacts and parameter configuration may lead to misclassification. Most of the methods try to mimic the R&K rules, or are evaluated according to their degree of agreement with R&K visual scoring (Hasan, 1996; Penzel and Conradt, 2000), so they cannot be improved much beyond these rules per se. Automatic methods have been more successful at identifying individual events than at stage classification (Agarwal and Gotman, 2002; Hasan, 1996). For example, some methods successfully classify microevents (Hasan, 1996), spindles (Uchida et al., 1994) or drowsiness (Hasan, 1996). Automatic methods can identify and calculate additional variables compared to the R&K rules, but no standardized variables have been defined as yet. Until now, automatic analysis has never been better than analysis by an experienced sleep scorer (Penzel and Conradt, 2000).
Some quantitative measurements can be used together with the R&K rules to improve visual inspection. First, blind comparisons among human scorers can be used, and the agreement between the different human scorings can then be assessed statistically with a measure of non-random agreement among observers (e.g., kappa values). Thus, only the epochs with high agreement are selected, validating the stage selection for stimulus presentation.
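The agreement check described above can be sketched as follows. This is only an illustration under stated assumptions: the text mentions "Kappa values" without naming a variant, so Cohen's kappa for two scorers is assumed here, and the stage labels, function names and example scorings are hypothetical.

```python
from collections import Counter


def cohen_kappa(scores_a, scores_b):
    """Cohen's kappa for two scorers' epoch-by-epoch stage labels."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("scorers must label the same, non-empty set of epochs")
    n = len(scores_a)
    # Observed proportion of epochs on which the scorers agree.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected by chance, from each scorer's label frequencies.
    freq_a = Counter(scores_a)
    freq_b = Counter(scores_b)
    expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used one and the same label throughout
    return (observed - expected) / (1.0 - expected)


def agreed_epochs(scores_a, scores_b):
    """Indices of epochs to which both scorers assigned the same stage."""
    return [i for i, (a, b) in enumerate(zip(scores_a, scores_b)) if a == b]


# Hypothetical blind scorings of eight epochs
# (W = wake, S1/S2 = stages I/II, REM = REM sleep).
scorer_a = ["W", "S1", "S2", "S2", "S2", "REM", "REM", "W"]
scorer_b = ["W", "S2", "S2", "S2", "S2", "REM", "S1", "W"]

kappa = cohen_kappa(scorer_a, scorer_b)
kept = agreed_epochs(scorer_a, scorer_b)
```

Here `kept` lists the epochs that would remain candidates for stimulus presentation. The kappa threshold that counts as "high agreement" is a design decision that the text leaves open.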
Second, the alpha slow-wave index (ASI), calculated as the ratio between the power of alpha activity and the sum of the power in the theta and delta bands, can be used as an index of sleep. This index has been used to detect wakefulness during sleep (Jobert et al., 1994; Hasan, 1996), and has also been shown to be a relatively reliable tool for detecting stage shifts between wakefulness and sleep. The recorded signal of each contiguous, visually selected stage is split into brief temporal windows (about 5 seconds), for which the power spectrum is estimated using the Fast Fourier Transform and then averaged for each subject and each stage condition. Each mean power spectrum is averaged within the standard EEG frequency bands: delta (δ, 0.5-4.5 Hz), theta (θ, 4.5-8.5 Hz) and alpha (α, 8.5-11.5 Hz). After that, the ASI values can be calculated for each subject in the wake and sleep stages. The ratio is expressed as follows:

ASI = α / (δ + θ)

where α, δ and θ denote the mean power in the corresponding bands. The maximum ASI value calculated for each participant must be standardized, owing to the high inter-individual variation of alpha activity.
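The ASI computation described above can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: the sampling rate, the synthetic test signals and the function names are assumptions, and the per-participant standardization step is left out. Only the band limits and the 5-second window come from the text.

```python
import numpy as np

# Band limits in Hz, as given in the text.
BANDS = {"delta": (0.5, 4.5), "theta": (4.5, 8.5), "alpha": (8.5, 11.5)}


def band_powers(signal, fs, window_s=5.0):
    """Mean FFT power per band, averaged over consecutive windows."""
    n = int(window_s * fs)              # samples per window
    n_windows = len(signal) // n
    if n_windows == 0:
        raise ValueError("signal is shorter than one window")
    # Drop the incomplete tail and stack the windows row by row.
    windows = np.asarray(signal[: n_windows * n], dtype=float).reshape(n_windows, n)
    spectra = np.abs(np.fft.rfft(windows, axis=1)) ** 2   # power spectrum per window
    mean_spectrum = spectra.mean(axis=0)                   # average across windows
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return {
        name: mean_spectrum[(freqs >= lo) & (freqs < hi)].mean()
        for name, (lo, hi) in BANDS.items()
    }


def alpha_slow_wave_index(signal, fs, window_s=5.0):
    """ASI = alpha / (delta + theta), computed from mean band powers."""
    p = band_powers(signal, fs, window_s)
    return p["alpha"] / (p["delta"] + p["theta"])


# Synthetic 30 s signals (assumed fs = 100 Hz): alpha-dominated "wake"
# versus delta-dominated "sleep".
fs = 100
t = np.arange(30 * fs) / fs
wake = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 2 * t)
sleep = 0.5 * np.sin(2 * np.pi * 10 * t) + 2.0 * np.sin(2 * np.pi * 2 * t)

asi_wake = alpha_slow_wave_index(wake, fs)
asi_sleep = alpha_slow_wave_index(sleep, fs)
```

With these synthetic signals the index is larger for the alpha-dominated segment than for the delta-dominated one, which is the contrast the ASI is meant to capture between wakefulness and sleep.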
Finally, amplitude measurements and time-frequency charts can be used to distinguish sleep stages that were previously selected visually. For example, delta band power has been used to differentiate among sleep stages, since it is believed to reflect a basic sleep regulatory mechanism (Borbély and Achermann, 2000). Figure 1 (a, b and c) shows an example comparison of time-frequency charts of the delta and sigma-beta1 bands in each stage (stage II and REM sleep), together with the difference chart (stage II minus REM sleep), which shows an increase in sigma and beta1 activity in stage II (Ibáñez et al., 2006). Figure 1d shows the means and standard deviations of the (averaged) delta amplitude in each sleep stage. Both measurements suggest that the preceding sleep stage classification was adequate.
All of these quantitative criteria can be added to the R&K rules to obtain a degree of correspondence between quantitative measurements and visual inspection, and thereby to achieve enhanced sleep stage selection.

First Night Effect and normal sleep parameters
The first night effect (FNE hereafter) is a well-known phenomenon characterized by increased fragmentation of sleep architecture, increased sleep latencies, decreased REM sleep and decreased slow wave sleep due to the uncomfortable laboratory setting (Lorenzo and Barbanoj, 2002). Figure 2 shows a first night hypnogram and then the re-establishment of the sleep architecture on the second night. Given FNE, it is recommended to use two consecutive nights of sleep and to carry out the study during the second night, when sleep architecture, stage latency and stage duration are closer to normal parameters.
Recording two consecutive nights to obtain a normal sleep architecture can discourage research in the fields of cognition and sleep, especially considering the increase in economic cost and time that it implies for both the experimenter and the volunteer participant. Nevertheless, if certain factors are controlled, it is possible to carry out studies during the first night in the laboratory, without recording a second night. In such a case, the following aspects must be taken into consideration. It is recommended that studies take place with healthy, young participants in a comfortable laboratory setting, since FNE shows moderate or small sleep disorganization in such instances (Lorenzo and Barbanoj, 2002). The study should not consider slow wave sleep (SWS), which is highly reduced due to FNE (Browman and Cartwright, 1980). Moreover, the study must be performed during the second part of the night, in which FNE is less pronounced.
If a study is carried out during the first night in the laboratory, then, in addition to the previous recommendations, the FNE on normal sleep parameters must be evaluated after the study is completed, considering the hypnogram and the following sleep parameters: total sleep time; sleep latency (increased under FNE); REM sleep (decreased); SWS (decreased); sleep efficiency (reduced); latency to REM (increased); and percentage of Stage I.

Sleep Disorders
Finally, any study of cognitive processing during sleep must rule out any sleep abnormality that could alter the electrophysiological recordings. To achieve this, it is recommended to hold a pre-recording interview with the participant, thoroughly exploring sleep habits and behaviors during the past few months and applying a scale of subjective symptoms of sleep abnormalities. During the polysomnographic recording and infrared video monitoring, several sleep abnormalities can be ruled out: insomnia; delayed or advanced sleep-phase syndrome; sensory or movement disorders associated with restless legs syndrome; episodes of sleep-walking or sleep terror; and symptoms of REM sleep parasomnia (REM sleep behavior disorder, e.g., absence of somatic muscle atonia during REM sleep, or motor behaviors during REM sleep; see Mahowald and Schenck, 2005).
Additionally, the offline analysis can confirm normal sleep. A low alpha slow-wave index (ASI) suggests the absence of insomnia in the recordings (Jobert et al., 1994). Very long latencies for the sleep onset period, coupled with an increased arousal level in the EEG, are both indicative of insomnia (Perlis et al., 2001). An increase in awakenings, elevated Stage I, abnormal distribution of SWS, REM abnormalities and sleep disturbance events are frequently associated with narcolepsy (e.g., Harsh et al., 2005). Motor or repetitive arousals and REM activity accompanied by respiratory disturbances are characteristic of sleep apnea (Li et al., 2004; Scholle et al., 2003). The typical sleep abnormalities of EEG activity related to epilepsy can be detected even in the case of sub-clinical manifestations (e.g., reduced SWS; increased Stage I; seizures and/or interictal epileptiform discharges during non-REM sleep, with reduced REM sleep; Marzec et al., 2005).

CONCLUSION
Much knowledge is still required even to understand the conjunction of dramatic changes in cerebral dynamics and the recording of ERPs during sleep. Methodological requirements of the experimental design are especially important. To address these requirements, we have discussed three elemental issues for any study of ERPs during sleep: methods for monitoring sleep staging; controlling the first night effect in laboratory research; and detecting and ruling out sleep abnormalities. As better control of experimental paradigms is developed and, simultaneously, the ERPs that occur during sleep are differentially investigated, a general ERP model of sleep activity will be advanced.

Figure 1: Comparisons of frequency bands between sleep stages. (a) Time-frequency chart for stage II sleep. (b) Time-frequency chart for REM sleep. (c) Stage II minus REM sleep time-frequency chart. (d) Microvolt differences (means and standard deviations) of delta band activity during stage II and REM sleep. Reproduced from Ibanez et al. (2006), with permission from Elsevier © 2006.

Figure 2: Hypnogram from a single subject in two consecutive nights sleeping in the lab (same recording conditions). Note the reduction of the total sleep time and the lengthening of sleep latency in the first recording, along with the other features characteristic of the first night effect.
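
The sleep parameters used to evaluate the first night effect can be derived from a scored hypnogram such as the one in Figure 2. The sketch below shows one possible way to do so. It rests on several assumptions that are not in the text: 30-second epochs, the stage labels used, sleep efficiency defined as total sleep time over total recording time, and REM latency measured from sleep onset. The function name and the example hypnogram are hypothetical.

```python
def sleep_parameters(hypnogram, epoch_s=30):
    """Summary sleep parameters from a list of per-epoch stage labels.

    Assumed labels: "W" (wake), "S1", "S2", "SWS", "REM".
    """
    sleep_stages = {"S1", "S2", "SWS", "REM"}
    epoch_min = epoch_s / 60.0
    asleep = [i for i, stage in enumerate(hypnogram) if stage in sleep_stages]
    if not asleep:
        raise ValueError("no sleep epochs in hypnogram")
    onset = asleep[0]                              # first epoch of any sleep stage
    rem = [i for i in asleep if hypnogram[i] == "REM"]

    def pct(stage):
        # Percentage of total sleep time spent in the given stage.
        return 100.0 * sum(hypnogram[i] == stage for i in asleep) / len(asleep)

    return {
        "total_sleep_time_min": len(asleep) * epoch_min,
        "sleep_latency_min": onset * epoch_min,
        "rem_latency_min": (rem[0] - onset) * epoch_min if rem else None,
        "sleep_efficiency_pct": 100.0 * len(asleep) / len(hypnogram),
        "pct_stage1": pct("S1"),
        "pct_sws": pct("SWS"),
        "pct_rem": pct("REM"),
    }


# Hypothetical, very short hypnogram (20 epochs of 30 s each).
night = ["W"] * 4 + ["S1"] * 2 + ["S2"] * 6 + ["SWS"] * 4 + ["REM"] * 3 + ["W"]
params = sleep_parameters(night)
```

Comparing these values between a first-night recording and normative or second-night values gives a simple quantitative check on how strongly FNE affected the recording.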