MLR, LLR, MMN, P300, N400, T complex



 Contents

MLR, LLR, MMN, P300, N400, T complex
· Generators
· Principles of recording
· Factors affecting recording
· Interpretation
· Correlation with fMRI, PET
· Electrical LLR
· Clinical Disorders
   AUDITORY MIDDLE LATENCY RESPONSES, 40 Hz AND 20 Hz RESPONSES

Auditory middle latency responses are also called the Middle Latency Response (MLR), Auditory Middle Response (AMR) or Middle Latency Auditory Evoked Potentials (MLAEPs).
AMLRs are the replicable positive and negative peaks that occur between 10 and 50 ms after the onset of the eliciting signal (Goldstein).
Waveform characteristics of AMLR:
It usually consists of three positive and three negative peaks, which are labelled No, Po, Na, Pa, Nb and Pb (Goldstein & Rodman, 1967).
The AMLR is a biphasic waveform with a negative wave occurring at about 20 ms (Na), a positive wave at about 30 ms (Pa), a second negative wave at about 40 ms (Nb) and a second positive wave at about 50 ms (Pb).  The Pb component of the MLAEP is often identified as the P1 component of the LAEP.  The wave amplitudes range from 0.5 to 3.0 µV.

Different investigators have given the latency values of various components of AMLR as follows:

Investigator (Year)          No        Po        Na        Pa        Nb        Pb
Goldstein & Rodman (1967)    8-10 ms   10-13 ms  16-30 ms  30-55 ms  40-60 ms  55-80 ms
Mendel & Goldstein (1972)    -         11.3 ms   20.8 ms   32.4 ms   32.4 ms   45.5 ms
Lane et al. (1972)           -         10.7 ms   19.7 ms   29.7 ms   47.2 ms   64 ms

Generators:
The Na component receives contributions from subcortical regions of the auditory system, specifically the medial geniculate body of the thalamus (Fischer, Bognar, Turjman, & Lapras, 1995) and perhaps portions of the inferior colliculus (Hashimoto, 1982).  However, evidence from intracranial electrophysiologic recordings and magnetic responses in humans suggests that generation of the Na component also involves the primary auditory cortex within the temporal lobe – Heschl’s gyrus (Liegeois-Chauvel, Musolino, Badier, Marquis & Chauvel, 1994).
There is general agreement that the Pb component of the AMLR arises from the auditory cortex, perhaps the posterior region of the planum temporale.  Finally, the pronounced effects of state of arousal, sedatives and, especially, anesthetic agents on the Pa and Pb components provide evidence that the reticular activating system is involved in generation of the AMLR.

FACTORS AFFECTING AMLR
A. Subject (Endogenous) Factors
1. Gender
The effects of the subject’s gender vary considerably among AERs. AMLR components tend to be shorter in latency and larger in amplitude in female than in male subjects (Wilson et al. 1989). In one early study (Mendel & Goldstein, 1969), women showed shorter latencies and larger amplitudes for some AMLR peaks.

2. Age
The effects of the subject’s age vary considerably among AERs, including the AMLR and ALLR along with the P300.  Responses are apparently not adult-like until age 8-10 years or even later.  Some reports indicate difficulty in obtaining reasonable waveforms in neonates (Engel 1971; Davis et al. 1974).  Other studies note little difference between adult and infant morphology for the middle components as a function of intensity or rate of stimulus presentation (Smith et al. 1974; Goldstein & McRandle, 1976).
The major differences between these populations are that neonates demonstrate slightly longer latencies and smaller amplitudes than adults. Amplitude of Pa increases steadily from infancy through late childhood and then decreases with advancing age.  At slower rates, latency of the Pa component is usually in the 50 ms range, or twice the expected adult latency value, although it may be further delayed in very young but normal infants (Fifer & Sierra-Irizarry, 1988).
Non-inverting electrode location appears to be an important factor in recording AMLR activity.  Until approximately 1982, almost all studies of the AMLR in children, or adults for that matter, used a midline vertex (Cz) or high-forehead (Fz) electrode site to record the response when either ear was stimulated.  Recent experimental and clinical evidence indicates that a response may be detected from a midline electrode when there is no response from an electrode located over the temporoparietal region of the brain (Kileny et al. 1987; Kraus et al. 1988).
Advancing Age in Adults
There is relatively little mention in the literature of the effects of advancing age on the AMLR.  Woods & Clayworth (1986) reported that latency of the Pa component was longer (by 2.3 ms on average) in older versus younger subjects.  Lanzi, Chiarelli & Sambataro (1989) found distinct AMLR deterioration and latency shifts in advanced aging.
Summary of Neuroanatomical and Neurophysiologic Changes that are thought to play a role in the changes in the AMLR with Advancing Age
·      Alterations in cortical “folds” (and possibly the orientation and scalp distribution of cortical AERs).
·      Cortical atrophy, including loss of functioning neurons in the auditory cortex (e.g. superior temporal gyrus).
·      Reduced communication and feedback between the auditory cortex and subcortical structures (e.g. the inferior colliculus in the brainstem and the medial geniculate body in the thalamus; decrease in the number of neurons projecting caudally from the temporal lobe).
·      Decrease in gamma-aminobutyric acid (GABA) levels within thalamic auditory centers (i.e. decreased capacity for inhibition of cortical activity).
·      Reduction in white matter in prefrontal cortex.

3.   Body temperature

Hypo- and hyperthermia exert the greatest effect on short-latency AERs, and there are few studies of the AMLR and temperature.  Kileny, Dodson & Gelfand (1983) monitored the AMLR in hypothermic patients undergoing open-heart surgery.
Hall (1987) and Hall & Cornaum (1988) applied the AMLR in monitoring patients undergoing hyperthermia treatment for advanced cancer, during which the whole body is heated to as high as 42°C. There is evidence of decreased latency and reduced amplitude of the Pa component in some patients as body temperature is elevated above normal levels (37°C); however, this was not a consistently observed finding.

4.   Attention and State of Arousal

In adult subjects, the AMLR can be reliably recorded in light sleep and after mild sedation and in different states of subject attention (Hirabayashi & Kobayashi, 1983).  However, sleep, sedation and attention exert clinically important influences on the AMLR, especially in infants and children, and must be considered in interpretation of the response (Jerger).
The AMLR can be clearly recorded in REM sleep and also in sleep stage 1, but it is more variable and inconsistent in sleep stage 2. The AMLR is rarely detected in sleep stage 3 and is altogether absent in sleep stage 4.
Kadobayashi and Toyoshima (1984) found decreased amplitude of the Pa component of the AMLR when subjects paid no attention to click stimuli.  On the other hand, the Pa component in adults was stable across sleep conditions (wakefulness, slow-wave sleep and rapid eye movement sleep), according to Erwin and Buchwald (1986).  In general, wave Pa amplitude tends to be reduced during sleep in comparison with the awake state, by up to 40 percent of maximum in stage IV sleep, whereas latency remains stable in sleep (Terkildsen, 1985).

5. DRUGS:

Drugs that influence the CNS (e.g. sedatives, anesthetic agents) exert the greatest effect on longer-latency, cortically generated AERs and almost no effect on ECochG and the ABR.
(i) Anesthetic Agents
Anesthesia is defined as loss of sensation (partial or complete), with or without loss of consciousness, which may be drug induced or due to disease or injury.
The MLAEP is suppressed when the patient is under general anesthesia with halothane, enflurane, isoflurane or desflurane (Schwender et al. 1996).  The amount of suppression is dose dependent, which makes the MLAEP an excellent means of monitoring the depth of anesthesia; the MLAEP is sensitive to changes in the level of anesthesia (Plourde et al. 1997; Tatsumi et al. 1995).

(ii) Sedatives and Hypnosis (Depressants)
Sedatives and hypnotics facilitate the onset and maintenance of sleep, but patients can be easily aroused with stimulation.
Chloral hydrate is the oldest synthetic “sleeping drug” and, by far, the most popular sedative for quieting children for AER measurement.  It is a halogenated alcohol that undergoes chemical reduction after ingestion and causes CNS depression.  Shallop et al. (1985) and Wilson et al. (1989) report that chloral hydrate causes Pa amplitude to decrease, while latency may be increased or decreased. AMLR changes with chloral hydrate sedation are more pronounced when stimulus rate is increased (from 4/sec to 10/sec).

(iii) Alcohol
Amplitude of AMLR is decreased by acute alcohol intoxication (Gross et al. 1966; Perry et al. 1978).

6.   Muscular Artifact

i. Muscle Interference (Artifact)
High-frequency muscle, and electrical, artifact is usually not a serious problem in AMLR recordings because the low pass (high-frequency limit of the recommended filter setting – 200 to 1500 Hz) is effective in minimizing these types of measurement contamination.  Low-frequency muscle artifact is, on the other hand, very troublesome since it often occurs within the same frequency region as the response.  Elimination of this low frequency artifact by filtering is, therefore, not an alternative.  The most effective clinical strategy for minimizing muscle artifact in the measurement of the AMLR is to verify that the patient is motionless and resting comfortably, with the head supported and the neck neither flexed nor extended.  Best results are obtained when the patient is resting in a recliner or lying supine on a bed or gurney.  For measurement of the AMLR, it is not advisable for the patient to be sitting upright in a straight-back chair with no head support.

ii. Postauricular Muscle (PAM) Activity
PAM activity is elicited with a sound and recorded with an electrode near the ear (e.g. earlobe) and, especially, behind the ear (e.g. mastoid).  The PAM response is one of numerous “sonomotor” or myogenic responses (Davis, 1965) that are described by the final efferent component, i.e. the muscle involved.
Postauricular muscle (PAM) artifact is sometimes apparent toward the end of an ABR waveform if an analysis time of 15 ms or longer is used, and it is not uncommon in AMLR measurement.  PAM artifact is more likely to occur in patients who are tense and is usually observed when the inverting electrode is located on the earlobe or mastoid ipsilateral to the stimulus and at high (70 dB nHL or greater) intensity levels.  However, PAM can also be recorded under other measurement conditions.  Interestingly, PAM activity may be recorded from electrodes located on the side ipsilateral to the stimulus, contralateral to the stimulus, or even bilaterally with a monaural stimulus.  The most effective electrode array for minimizing PAM activity includes a noncephalic reference electrode.
Initially there was a debate as to whether the AMLR was really a neurogenic response (arising from the nervous system) or, instead, a myogenic (arising from muscle) response from the postauricular muscles (Bickford & Rosenblith, 1958).
Knowledge about PAM is useful for the clinician who is intent on eliminating it as a factor in measurement of the AMLR.  Anatomically, the PAM response is a brainstem reflex that is somewhat similar to the acoustic stapedial reflex.
The triphasic response occurs in the 13 to 20 ms range and is optimally detected with an electrode on or in (via a needle electrode) the PAM lying over the mastoid region of the temporal bone, behind the ear.  Amplitude ranges from 2 to 4 µvolts at low intensity levels to 20 µvolts or greater for high intensity levels (90 dB and above).
The likelihood of observing a PAM artifact is increased if the patient is anxious, tense, smiling or in a head-down position (flexion of the neck).  With these maneuvers, Dus and Wilson (1975) recorded PAM activity to clicks from 89 percent of a group of thirty-seven adults.  PAM artifact is diminished or eliminated by neck extension (Bickford et al. 1963; Kiang et al. 1963), anesthesia, muscle relaxants, alcohol and tranquilizers, and facial nerve paralysis (Cody, Jacobson, Walker, & Bickford, 1964; Gibson, 1975).  There are reports of reduced or absent PAM activity during natural sleep (Erwin & Buchwald, 1986).
Robinson and Rudge (1977) found that the PAM response was absent when recorded from an electrode ipsilateral to the stimulus in 15 percent of normal subjects and absent bilaterally in 40 percent of normal subjects (all of whom showed clear AMLRs).  The effect of repetitive stimulation on the PAM is unclear (e.g. Davis).  Amplitude of the PAM response is somewhat variable (usually in the range of 5 to 15 µvolts), but latency is remarkably constant (estimated at approximately 8 ms by Gibson, 1975) and bilaterally symmetrical, with an interaural difference of less than 0.6 ms (Clifford-Jones, Clarke, & Mayles, 1979).

7.   Handedness
Hood et al. (1990) found that Pb varies with handedness, being 4 ms longer in left-handed adults. Stewart et al. (1993) found a progressive increase in the latency of middle latency components in left-handed individuals, with the greatest effect on Pb.

B. Exogenous (Stimulus) Factors
1.   Types of stimuli
The MLR can be elicited with different types of stimuli; both electrical and acoustical stimulation can be used.  Burton et al. (1989) report no significant difference between the latencies of electrically and acoustically evoked waveforms in guinea pigs.  Kemink et al. (1987) found an electrical MLR in profoundly deaf ears; the latency of the most positive peak was around 26-30 ms, similar to the latency of the acoustic MLR.  Stimulation of the VIIIth nerve to produce an electrically evoked MLR can be accomplished via a transtympanic needle electrode on the round window membrane (Black et al. 1987).
Zerlin et al. (1971) advocated the use of 1/3-octave filtered clicks and reported that filtered clicks elicited clearer waveforms than tone bursts.
Kileny et al. (1986) found that clicks evoked a well-defined and easily identifiable MLR, although the amplitude of Na-Pa was larger when tone bursts were used.
On the other hand, Kupperman & Mendel (1974) preferred gated tone bursts with a rise time of about 2 to 2.5 ms.  Using either of these stimuli (filtered clicks or gated tone bursts) it was possible to obtain frequency-specific stimuli in the range of 500-8000 Hz.
Maurizi et al. (1984) reported that tone pips provided more frequency specificity than clicks; Na, Pa, Nb and Pb showed greater latency but smaller amplitude for tone pips.  Low-frequency tone bursts are found effective in obtaining responses from adults who are awake (Murick et al. 1981).
2.   Number of Stimuli

MLRs are usually obtained after 400-500 stimulus presentations, although Goldstein et al. (1978) managed to obtain clear recordings after only 225 stimuli. Lorson et al. (1966) stated that 200 to 400 stimuli should be presented to obtain an averaged response, while Goldstein et al. (1974) used 1024 stimuli.  Increasing the number of stimuli from 1000 to 4000 does not increase the ease of identification of the MLR.  McCandless et al. (1974) found 256 stimuli at a rate of 4.5/sec, or 512 stimuli at a rate of 9.6/sec, to be adequate.
3.   Stimulus rate

Stimulus rate is the number of stimuli repeated per unit of time.  MLR is generally obtained at a rate of 10/sec (Picton et al. 1974).  Mendel (1973) reported that a change in repetition rate has little effect on the amplitude of the response.  A change in repetition rate from 1 to 16 stimuli/sec has no effect on MLR amplitude (McFarland et al. 1973).  However, if repetition rate is increased beyond 16/sec, reduction in the overall amplitude may be seen (McFarland et al. 1979).
Amplitude of the Pa component recorded from normal adult subjects remains stable for rates of 1/sec to 15/sec, but latency is significantly shorter for very slow rates (0.5 and 1/sec) than for faster rates (Erwin et al. 1986; Goldstein et al. 1972).
For rates higher than about 15/sec in adults, response latency decreases and amplitude increases until the stimulus rate approaches 40/sec.  Among infants, a Pa component at a latency of about 50 ms can be recorded if the stimulus rate is as slow as 1 or 2/sec; with a faster rate (4-10/sec) the response is usually not observed.

4.   Stimulus frequency

There are not many studies showing clear effects of frequency on the MLR.  Peak latencies decrease with increasing stimulus frequency (Thornton et al. 1977), and linear changes in amplitude are noted for the early peaks with increasing stimulus frequency.  Kupperman (1970) demonstrated that the middle components were more dependent on the stimulus itself than on its frequency.

5.   Intensity of the Stimulus
a.  Amplitude intensity function

As stimulus intensity increases, the amplitude of the MLR waves increases (Goldstein et al. 1967; Picton et al. 1977).  Amplitude increases steadily over the intensity range of 0-70 dB SL, but the amplitude-intensity function is not linear.  Madell & Goldstein (1972) found a high linear correlation between AMLR amplitude, especially for the Po-Na components, and loudness.
Kupperman & Mendel (1974) reported an absence of systematic growth in amplitude with an increase in the intensity of tone pips.
b.   Latency intensity function
As click stimulus intensity increases from behavioral threshold (for the click) up to about 40-50 dB SL, latency systematically decreases.  At higher intensity levels, latency remains relatively constant (Goldstein, 1967; Thornton et al. 1977).
6.    Duration of Stimuli

Stimulus duration is the sum of the rise time, plateau time and fall time.  The MLR is considered an “onset” response, i.e. it depends on the onset of the stimulus, so a fast rise time is very important for its elicitation.  Skinner & Antinoro (1965) found that a rise time greater than 25 ms was ineffective.  Faster rise times give more consistent and clearer responses.  Clicks have faster rise times than tone pips or tone bursts and elicit waveforms with larger amplitude (Berlin et al. 1974).
A change in decay time has no effect on the AMLR waveform, as it is an “onset” response. Kupperman & Goldstein (1974) used a 1 kHz tone burst at 50 dB SL with rise times of 5, 10, 15 and 25 ms and durations of 20-40 ms.  Early components of the AMLR were not affected, but the later waves showed a decrease in amplitude when the 25 ms rise/decay time was used.  An increase in rise/decay time results in an increase in latency of 1 to 3 ms for all peaks and an overall reduction in amplitude at all intensity levels (Weber et al. 1972).
Kupperman (1970) found no consistent change in the AMLR as stimulus duration was varied from 1.5 to 4 ms. A rise/fall time of at least 2 ms is recommended to reduce spectral splatter and appears to maximize frequency specificity.  Rise/fall times greater than 4 ms appear to significantly reduce the amplitude of the Na-Pa response; that is, the amplitude of the Na-Pa response increases for rise/fall times up to about 2 ms and then decreases for rise/fall times greater than 2 ms.  A rise/fall time of 4 cycles and a plateau of 2 cycles are believed to be a good compromise when estimating an AMLR threshold (Xu et al. 1997).
7.   Masking

Masking is the amount by which the threshold of audibility of a sound is raised by the presence of another (masking) sound (ANSI, 1960).  The presentation of contralateral masking stimuli of moderate intensity does not appear to affect component amplitude (Gutnick et al. 1978); the shift in amplitude is about 0.7 µV, which is insignificant.  With ipsilateral masking noise, the peak-to-peak amplitude varies directly with the signal-to-noise ratio (Smith et al. 1975).
8.   Monaural versus Binaural Stimulations

In general, amplitude of the AMLR components is smaller for true binaural recordings than for the sum of the monaural responses (Grossmann et al. 1986).  Kelly-Ballweber & Dobie (1984) assessed binaural interaction (BI) with the ABR and AMLR in 12 younger and 12 older adult subjects.  The two groups were matched for hearing impairment, and each showed a moderate to severe sloping high-frequency hearing loss.  No latency differences in the AMLR were found for the summed monaural versus true binaural conditions, but Na-Pa and Nb-Pb amplitude values in the younger group were significantly reduced for the binaural condition in comparison with the summed monaural condition.  This expected binaural AMLR amplitude reduction was, on average, not observed for the older subjects.
Woods & Clayworth (1985) found evidence of a binaural difference waveform in AMLR recordings from 12 normal subjects.  Wave Pa amplitude values were about 2% larger and latencies about 1.5 ms longer for binaural versus monaural stimulation, while Na amplitude was larger and latency shorter.  When the inverting electrode was located contralateral versus ipsilateral to the stimulated ear, there was little effect on Pa component amplitude or latency.

Effects of Acquisition Factors
1.   Averaging

The AMLR Pa component normally has approximately twice the amplitude of ABR wave V.  As a result the SNR is usually greater, and less averaging is required to obtain a clear and easily identifiable response for the AMLR.  A total of 1000 stimulus presentations is typically adequate, and under ideal measurement conditions (i.e. high stimulus intensity level, quiet but awake normal-hearing subject), 512 sweeps or fewer produce a suitable waveform (Goldstein et al. 1972).
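The way the required number of sweeps follows from the signal-to-noise ratio can be illustrated with a short calculation. The sketch below is a hypothetical illustration, not measured data: it assumes stationary background noise, so that averaging N sweeps reduces the residual noise amplitude by roughly √N, and it uses illustrative single-sweep noise and response amplitudes.

```python
import numpy as np

# Illustrative assumptions (not measured values): single-sweep background
# noise amplitude and typical response amplitudes in microvolts.
noise_per_sweep_uv = 20.0
abr_wave_v_uv = 0.5          # typical ABR wave V amplitude
amlr_pa_uv = 1.0             # AMLR Pa, roughly twice ABR wave V

def residual_noise(n_sweeps, noise_uv=noise_per_sweep_uv):
    """Residual noise in the average, assuming a sqrt(N) reduction."""
    return noise_uv / np.sqrt(n_sweeps)

for n in (256, 512, 1000, 2000):
    rn = residual_noise(n)
    print(f"{n:5d} sweeps: residual noise ~{rn:.2f} uV, "
          f"SNR(Pa) ~{amlr_pa_uv / rn:.1f}, SNR(wave V) ~{abr_wave_v_uv / rn:.1f}")

# Because Pa is about twice the amplitude of ABR wave V, the same SNR is
# reached with roughly one quarter of the sweeps.
```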
2.   Analysis time

It is the period after presentation of stimuli in which AER data are collected and the AER normally appears.  It is also known as Epoch/window size.  Different analysis times are required for AER that differ in latency.  The overall objective is to include the entire response.
Some evoked-response equipment manufacturers recommend, and some investigators report using, an AMLR analysis time of 50 or 60 ms because the major component Pa is invariably located within this latency region.  The second major component, Pb, occurs at about 50-60 ms and may therefore be truncated or not detected.  For this reason an analysis time of 100 ms is suggested.  All AMLR components are included within this time frame, yet even with a minimal number of data points (256 per channel) latency resolution is adequate for the response.
Because the frequency content of the AMLR is primarily in the 10-40 Hz region, the latency resolution associated with this analysis time, approximately 0.4 ms (100 ms divided by 256 data points), is sufficient for precise latency analysis.
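The latency-resolution figure quoted above follows directly from the analysis time and the number of data points; a minimal check of that arithmetic, using only the values given in the text:

```python
analysis_time_ms = 100.0   # suggested AMLR analysis time
data_points = 256          # minimal number of data points per channel

sample_interval_ms = analysis_time_ms / data_points   # ~0.39 ms per point
sampling_rate_hz = 1000.0 / sample_interval_ms        # ~2560 Hz

print(f"sample interval ~{sample_interval_ms:.2f} ms, "
      f"sampling rate ~{sampling_rate_hz:.0f} Hz")
# With AMLR energy mainly in the 10-40 Hz region, a ~0.4 ms sample
# interval is more than adequate for latency analysis.
```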
3.   Filtering

In AER measurement, filters reject electrical energy at certain frequencies and pass energy at others.  Filtering is used in AMLR recording to reduce the unwanted influence of low-frequency EEG activity as a source of noise in the response.  Differences between the adult and child EEG have led to the expectation that testing children requires special consideration in the choice of filtering parameters.
a.       Low filter setting
Suzuki and colleagues (1983, 1984) reported that AMLR variability in both children and adults can be reduced when EEG activity below 20 Hz is filtered out, although settings higher than 20 Hz caused unacceptable amplitude reductions in the child AMLR.
In a study of 217 children, Kraus et al. (1987) examined AMLR detectability, amplitude and latency using two filtering conditions: 3-2000 Hz with a 6 dB/octave slope and 15-2000 Hz with a 12 dB/octave slope.
In all age groups studied, the detectability of waves Na and Pa was better with a high-pass filter setting of 15 Hz than with lower settings.  These results are consistent with the hypothesis that large-amplitude, low-frequency EEG activity obscures the AMLR in children, i.e. the AMLR may be masked by EEG activity in the 3-15 Hz range.
The amplitude of AMLR waves increases as filter settings are lowered from 30 to 3 Hz in human adults (McGee et al. 1987).
b.    High filter setting and filter slope
Two other technical issues arise with filtering.  One concerns the high filter setting.  The spectral energy of the AMLR lies below 100 Hz, and no significant changes in the morphology or latency of Pa are seen with settings greater than 300 Hz (McGee et al. 1983).  However, when the equipment allows, it is often desirable to open the filter to 2000-3000 Hz to permit simultaneous recording of the ABR (Suzuki et al. 1981; Mran et al. 1983).
The second issue concerns the slope of the response filter.  Scherg (1982) and Smith et al. (1987) demonstrated that steep (24-48 dB/octave) analog filtering causes distortion of the AMLR and the emergence of non-physiologic peaks; that is, in patients with no AMLR, such filters can produce an AMLR-like artifact.  In recording the AMLR, analog filtering of 6 or 12 dB/octave yields an undistorted waveform.
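As an illustration of the gentle slopes recommended above, the sketch below designs a 15-200 Hz band-pass filter with SciPy; a first-order Butterworth section rolls off at roughly 6 dB/octave and a second-order section at roughly 12 dB/octave. The sampling rate, the zero-phase filtering routine and the pass-band edges chosen here are assumptions for the example, not part of any published protocol.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 5000.0  # assumed sampling rate in Hz

# Gentle band-pass: order 1 ~ 6 dB/octave, order 2 ~ 12 dB/octave.
# Steep (24-48 dB/octave) filtering risks the non-physiologic
# "filter artifact" described above.
b, a = butter(N=1, Wn=[15.0, 200.0], btype="bandpass", fs=fs)

def filter_epoch(epoch):
    """Zero-phase filtering of one averaged epoch.
    Note that filtfilt applies the filter forward and backward,
    which doubles the effective slope."""
    return filtfilt(b, a, epoch)

# Example with a dummy averaged waveform (random noise stands in for data).
epoch = np.random.randn(512)
print(filter_epoch(epoch).shape)
```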
4.   Electrode:

Single-channel recordings are most common for the MLAEP.  The MLAEP is maximally recorded over the vertex at Cz referenced to the ipsilateral earlobe/mastoid.  However, this gives a single vertex response (Cz) and does not provide information about inter-hemispheric responses.  A more useful technique is to use three channels with non-inverting electrodes located at Cz, T3 and T4 and a non-cephalic electrode located at C7 or linked mastoids (A1+A2).  The ground in either case is located at Fpz.  Similar montages have been consistently reported in the literature for measures of middle-latency auditory pathway function (for neurodiagnostic purposes) (Kraus et al. 1982, 1994).

ANALYSIS AND INTERPRETATION OF WAVEFORM
Goldstein and Rodman (1967) first introduced the labels used to describe the AMLR components, choosing Na, Pa and Nb to designate polarity (N for negative and P for positive). They defined the normal latency range for each of the AMLR peaks, taking into account the increase in latency with decreasing intensity:
Na: 16.25 to 30ms
Pa: 30 to 45ms
Nb: 46.25 to 56.25ms
Normal waveform variations
With a filter setting of 10 to 1500 Hz, the ABR is apparent within the initial portion of the waveform, followed by a relatively slow, negative-going component in the 12-15 ms region. This is labelled Na (N for negative and ‘a’ to denote the first component of the MLR). The major MLR component is Pa.
There are several techniques for measuring the amplitude of the prominent Pa component. Traditionally, in AMLR analysis amplitude was calculated for the Na-Pa wave complex. This straightforward approach is still widely applied because each component tends to be distinct, at least in the normal waveform; in cases of neuropathology, however, the wave Na usually remains robust, and this becomes the limitation of the Na-Pa amplitude calculation. An alternative approach is to calculate the amplitude of the Pa-Nb complex, but in cases of higher-level auditory dysfunction the Nb component may be absent. A third possible approach is to take the difference between the Pa peak and a measure of baseline activity, but defining a valid index of the baseline is difficult.
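The three amplitude conventions described above can be made concrete in a small sketch. The latency windows and the use of the mean prestimulus interval as the baseline are illustrative assumptions; individual clinics may define them differently.

```python
import numpy as np

def amlr_amplitudes(waveform_uv, fs_hz, prestim_ms=10.0):
    """Na-Pa, Pa-Nb and baseline-to-Pa amplitudes from one averaged AMLR epoch.
    The latency windows below are rough illustrative values."""
    t_ms = np.arange(len(waveform_uv)) / fs_hz * 1000.0 - prestim_ms

    def peak(lo, hi, positive=True):
        idx = np.where((t_ms >= lo) & (t_ms <= hi))[0]
        seg = waveform_uv[idx]
        i = idx[np.argmax(seg)] if positive else idx[np.argmin(seg)]
        return waveform_uv[i]

    na = peak(15, 30, positive=False)          # Na: negative trough
    pa = peak(25, 45, positive=True)           # Pa: positive peak
    nb = peak(40, 60, positive=False)          # Nb: negative trough
    baseline = waveform_uv[t_ms < 0].mean()    # mean of prestimulus interval

    return {"Na-Pa": pa - na, "Pa-Nb": pa - nb, "baseline-to-Pa": pa - baseline}
```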
Guidelines for Auditory Middle-Latency Response (AMLR) Test Protocol

Stimulus Parameters
·      Transducer: ER-3A insert earphones. Supra-aural earphones are acceptable for the AMLR, but insert earphones are more comfortable and, because the insert cushions are disposable, contribute to infection control.
·      Type: Click, for neurodiagnosis only; a more robust AMLR is usually recorded with longer-duration tone-burst signals. Tone burst, for neurodiagnosis or frequency-specific estimation of auditory sensitivity. Detection of the Pb component of the AMLR is enhanced for lower-frequency tone-burst signals.
·      Duration: click signal 0.1 ms; tone-burst signal rise/fall of 1 cycle with a plateau of multiple cycles. Click signals are less effective than tone bursts in evoking the AMLR. A rather abrupt tone-burst onset is important for the AMLR, as it is for the ABR. Plateau durations of 10 ms or longer are appropriate for evoking the AMLR, especially the Pb component.
·      Rate: ≤7.1/second. A slower rate of signal presentation is indicated for younger children or for patients with cortical pathology. Signal presentation rates as low as 1 per second, or 0.5/second (one signal every two seconds), are required to consistently record the Pb component.
·      Polarity: Rarefaction. An AMLR can also be recorded for condensation or alternating polarity signals.
·      Intensity: ≤70 dB nHL. For neurodiagnosis, a moderate signal intensity level is appropriate; signal intensity is decreased, of course, for estimation of thresholds. High signal intensity levels should be avoided. Tone-burst signals should be biologically calibrated to dB nHL in the space where clinical AMLRs are recorded.
·      Number: ≤1000. Signal repetitions vary depending on the size of the response and the background noise. Remember, the signal-to-noise ratio is the key; averaging may require as few as 50 to 100 signals at high intensity levels for a very quiet and normal-hearing patient.
·      Presentation ear: Monaural, for estimation of auditory sensitivity and for neurodiagnosis. There is no apparent clinical indication for binaural AMLR measurement.
·      Masking: 50 dB. Rarely required with insert earphones and not needed for stimulus intensity levels of ≤70 dB HL.

Acquisition Parameters
·      Amplification: ×75,000. Less amplification is required for larger responses.
·      Sensitivity: 50 µvolts. Smaller sensitivity values are equivalent to higher amplification.
·      Analysis time: 100 ms. Long enough to encompass the Pa and Pb components.
·      Prestimulus time: 10 ms. Provides a convenient estimate of background noise and a baseline for calculating the amplitudes of the waveform components (Na, Pa, Nb, and Pb).
·      Data points: 512.
·      Sweeps: ≤1000. See the comment above on signal number.
·      Filters (band-pass): 10 to 1500 Hz for recording an ABR and an AMLR with Na and Pa components; 10 to 200 Hz for recording an AMLR with Na and Pa components. Do not over-filter (e.g. high-pass setting of 30 Hz and low-pass setting of 100 Hz), as this may remove important spectral energy from the response and may produce a misleading filter artifact. Use 0.1 to about 200 Hz, i.e. decrease the high-pass filter to 1 Hz or less, to detect the Pb (P50) component.
·      Notch filter: None. A notch filter (removing spectral energy in the region of 60 Hz) is never indicated in AMLR measurement because important frequencies in the response (around 40 Hz or below for young children) may also be removed.
·      Electrode type: Disc. Disc electrodes applied with paste (versus gel) secure the non-inverting electrodes on the scalp. It is helpful to use red- and blue-coloured electrode leads for the right and left hemisphere locations, respectively. Ear-clip electrodes are recommended when an earlobe inverting electrode site is used.
·      Electrode sites:
Channel 1: C3 to Ai/Ac, or C3 to NC. Hemisphere electrode locations are required for neurodiagnosis. A linked-earlobe inverting electrode arrangement (Ai = ipsilateral ear; Ac = contralateral ear) or a non-cephalic (NC) inverting electrode (on the nape of the neck) is appropriate and reduces the likelihood of PAM artifact.
Channel 2: C4 to Ai/Ac, or NC. C3 = left hemisphere site; C4 = right hemisphere site. See comments above.
Channel 3: Fz to Ai/Ac, or NC. A third channel is optional for neurodiagnosis; only the midline non-inverting electrode channel is needed for estimation of hearing sensitivity.
Channel 4: Outer canthi of the eye. Optional: for detection of eye blinks and rejection of averages contaminated by eye blinks.
Ground: Fpz.

















   AUDITORY LONG LATENCY RESPONSES

 The cortical auditory evoked potentials (CAEPs) are scalp-recorded evoked potentials that occur in response to a variety of stimuli (Naatanen & Picton, 1987). CAEPs can be classified into ‘obligatory’ and ‘discriminative’ potentials. Discriminative potentials are evoked by the change from a frequent ‘standard’ stimulus to an infrequent ‘deviant’ stimulus and include the MMN and P300.  The ‘obligatory’ CAEPs are classified in terms of their latencies, or time of occurrence after presentation of a stimulus (Hall, 1992), and are also called auditory long latency responses (ALLRs).

The long latency auditory evoked potentials are characterized by components within the time domain of 50-500 ms (McPherson and Starr, 1993) and are labeled according to their polarity and latency at the vertex (Picton et al., 1978). Major components include a positive component at about 60 ms, another positive component at about 160 ms and two negative components at about 100 and 200 ms (McPherson and Starr, 1993).

Principles important for the acquisition and analysis of CAEPs and their potential applications for audiologists and hearing scientists are presented here.  There is considerable clinical and scientific interest in using CAEPs to probe threshold and suprathreshold auditory processes because they are believed to reflect the neural detection and/or discrimination of sound.  The underlying assumption is that sound perception results from the neural detection and discrimination of its acoustic properties.  For this reason, researchers are examining the clinical utility of CAEPs for assessing auditory processing in individuals with normal or impaired auditory systems.

Event-related potentials are classified in several ways. For example, the P1-N1-P2 complex is traditionally considered to be composed of slow components (50-300 ms), while the MMN is considered to be a late component (beginning around 150 ms).  ERPs can also be classified as sensory-evoked, processing-contingent or movement-related.  Movement-related components are not auditory and are therefore not discussed here.  However, no classification system is perfect.  The P1-N1-P2 complex, while composed of sensory-evoked components, is not purely sensory; it is affected by attention and can be modified by auditory training.  Similarly, the MMN, while considered a processing-contingent component, is affected by the acoustics of the eliciting stimuli.

Normal ALR Anatomy-  generators
The neuroanatomic origins of the major ALR components (N1 and P2), occurring in the range of approximately 60 to 250 ms, have for many years been the subject of study and debate.
·  The generators of the N1 complex: within the N1 wave complex, multiple individual components can be recorded under certain stimulus and subject conditions, among them N1b, N1c and the “processing negativities”.
·  The major N1 and P2 components receive contributions from the primary auditory cortex and the supratemporal plane located anterior to this region.
·  It appears that both tonal and speech signals elicit N1 and P2 components generated within the auditory cortex. However, there is evidence that N1 activity elicited by vowels is limited to the left auditory cortex, consistent with the specialization of the left hemisphere for speech processing.
·  Subcomponents (N1b and N1c) may reflect different orientations (vertical or lateral) of the dipoles underlying N1 within the temporal lobe region related to the primary auditory cortex.
·  There is also evidence that subcortical structures, including the thalamus, hippocampus and the reticular activating system, play a role in generating N1 (Naatanen & Picton, 1987).

·  The generators of P2: there is no well-established generator for P2. Based on topographic recordings, it appears that the P2 wave receives contributions from multiple anatomic sources. The subcortical reticular activating system plays a major role in the generation of the P2 wave, and the auditory cortex is also a possible source, including the planum temporale and the auditory association regions (area 22).  The P2 waveform is essentially mature by 2 to 3 years of age, whereas developmental changes in the N1 wave may continue until 16 years of age.

Components:
The P1-N1-P2 Complex
While it is possible to dissociate them, P1, N1 and P2 are typically recorded together, at least in adults. N2 may or may not be present, even in normal subjects, so less importance is given to it. When elicited together, the response is referred to as the P1-N1-P2 complex.  The specific latencies and amplitudes of each peak depend on the acoustic characteristics of the incoming sound and on subject factors.
P1, the first major component of the P1-N1-P2 complex, is a vertex-positive voltage deflection that occurs approximately 50 ms after sound onset.  P1 is usually small in amplitude in adults (typically <2 µV) but is large in young children and may dominate their response.  Generators of P1 have traditionally been identified in the primary auditory cortex, specifically Heschl’s gyrus.  P1 is typically largest when measured by electrodes over midline central to lateral central scalp regions.  While often described as part of the middle-latency response (component Pb), recent work has suggested that these are separate components.

Generation of P1 may actually be more complex than early studies suggest, and additional regions that may contribute to this response have been identified, including the hippocampus, planum temporale, and lateral temporal cortex.  Recent work has also focused attention on the importance of neocortical areas to P1.

N1 appears as a negative peak that often occurs approximately 100 ms after sound onset.  N1 latency can be longer in some cases, depending on the duration and complexity of the signals used to evoke the response.  N1 follows P1 and precedes P2.  Compared to P1, N1 is relatively large in amplitude in adults (typically 2-5 µV, depending on stimulus parameters).  In young children, however, the N1 generators may be immature and the response therefore absent, particularly if stimuli are presented rapidly.  N1 is known to have multiple generators in the primary and secondary auditory cortex and is therefore described as having at least three components:
      The first is a frontocentral negativity (N1₁) generated by bilateral vertical dipoles in or near the auditory cortex in the superior portion of the temporal lobe; it is largest when measured by electrodes near the vertex.  It is thought that this component may reflect attention to sound arrival, the reading out of sensory information from the auditory cortex, or the formation of a sensory memory of the sound stimulus in the auditory cortex.
      The second component is known as the T complex: a positive wave occurring approximately 100 ms after sound onset, followed by a negative wave at approximately 150 ms after sound onset.
      The T complex is generated by a radial source in the secondary auditory cortex within the superior temporal gyrus and is therefore largest when measured by electrodes over mid-temporal scalp regions.  While it has been proposed that the T complex may involve a simple inversion of the N1₁ component, it has subsequently been shown to reflect separate processes.

      The third component is a negativity occurring approximately 100 ms after sound onset that is best recorded near the vertex using long inter-stimulus intervals.  The generator of this component is unknown, and it may not be specific to sound; it may reflect a widespread transient arousal that facilitates efficient sound processing.  In general, N1 is recorded from electrodes at midline central scalp locations; for this reason, it is wise also to include electrodes over lateral temporal sites to optimally pick up contributions from the secondary auditory cortex.

P2 is a positive waveform that occurs approximately 180 ms after sound onset.  It is relatively large in amplitude in adults (approximately 2-5 µV or more) but may be absent in young children.  P2 is not as well understood as the P1 and N1 components, but it appears to have generators in multiple auditory areas, including the primary auditory cortex, the secondary cortex and the mesencephalic reticular activating system. It has been hypothesized that P2 (or at least its magnetic counterpart, P2m) is generated from multiple sources, with a center of activity near Heschl’s gyrus.  P2 is best recorded using electrodes over midline central scalp regions.  As with N1, P2 does not appear to be a unitary potential; several component generation processes likely occur in the time frame of P2, and these components may differ across age groups and subject states. P2 latencies are consistently reported to be delayed in older adults.

Equipment for Recording CAEPS
Two basic types of equipment can be used to record the P1-N1-P2: research equipment and clinical equipment.  Each has advantages and disadvantages.  Equipment designed for research (e.g. Neuroscan system, Geodesic system) is more expensive and more difficult to operate than clinical equipment.  But with the added expense comes added flexibility.  It is possible to create almost any stimulus and stimulus presentation format. Furthermore, research equipment typically has the capacity to record from more amplifier channels (32 or more), and it is possible to save the continuous EEG for later offline processing.  This permits almost limitless types of post hoc analyses and allows information from all electrode sites to be analyzed.  For example, one simple but underused type of analysis is a calculation of global field power.  Global field power indices quantify the instantaneous global activity across the spatial potential field sampled over the entire scalp.  Because it takes into account contributions to the response from all electrode sites, there is an advantage in terms of signal-to-noise issues for global field power estimates, relative to the averaged waveform obtained at a single electrode site.
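Global field power has a simple definition: at each time point it is the standard deviation of the voltage across all electrodes. A minimal sketch, assuming the averaged data are stored as a channels-by-samples array (the array shape and variable names are illustrative):

```python
import numpy as np

def global_field_power(erp):
    """Global field power of an averaged ERP.

    erp: array of shape (n_channels, n_samples). The channel mean is
    removed at each time point before taking the spatial standard deviation."""
    erp = np.asarray(erp, dtype=float)
    demeaned = erp - erp.mean(axis=0, keepdims=True)
    return np.sqrt((demeaned ** 2).mean(axis=0))

# Example with simulated data: 32 channels, 700 samples.
gfp = global_field_power(np.random.randn(32, 700))
print(gfp.shape)  # (700,)
```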

Equipment designed for the clinic (e.g. Biologic Navigator, GSI Audera) is typically lower in cost and is easier to operate than equipment designed for research.  Despite these advantages, there is little flexibility.  On systems designed for the clinic, it might only be possible to save averaged waveforms rather than the continuous EEG, which provides fewer data processing options.  Additionally, there is typically less control over the sweep time, the number of points per millisecond, and the recording rate.  Some clinical equipment cannot accommodate the presentation of speech stimuli or other complex sounds and may not be capable of producing the oddball paradigm necessary for eliciting the MMN, or a software upgrade may be necessary.  Furthermore, only two to four amplifier channels are generally available, which significantly impedes the ability to examine the scalp distribution of the response of interest.


Noise Sources
The CAEP is a small signal embedded in higher-amplitude background noise.  All of the signal processing techniques presented in earlier chapters, such as filtering, artifact rejection, and averaging, are critical to minimize the background noise and maximize the signal-to-noise ratio of the recorded CAEPs.  Because of their relatively short latencies, myogenic artifacts originating from the postauricular, temporalis, neck and frontalis muscles are somewhat less problematic for CAEPs than for ABR and MLR recordings.  However, other sources of noise can contaminate CAEP recordings.
      One important source is line noise (60 Hz interference).
      Another is eye blink and eye movement potentials.  The amplitude of eye blinks is quite large, particularly at frontal electrode sites, and the morphology of the eye blink response can sometimes mimic the response of interest.  Therefore, several approaches to reducing their effect on the recorded CAEPs have been used.  The first approach is to instruct the subject to minimize eye blinks during the recording, to blink between stimuli when possible, and not to blink when the deviant stimulus is presented.  This approach, however, is problematic when recording CAEPs with a passive paradigm, in which subjects are instructed to ignore the stimuli, and also when recording from young children, who may not be able to follow the instructions.  As a result, it is common to use artifact rejection to eliminate trials that may be contaminated by eye movements.
      To do this, the electrooculogram (EOG) is recorded from a vertical (VEOG) and sometimes a horizontal (HEOG) eye channel using electrodes above, beside, and below the eyes.  Asking the participant to blink and measuring the average amplitude of the individual's blink response allows the artifact-rejection criterion (±100 µV is typical) to be set; trials in which the potentials in the EOG channels exceed the criterion are then rejected for all recording channels (a minimal sketch of this step follows this list).  In the event that blinks contaminate much of the EEG recording, it is possible to model the morphology and scalp distribution of the eye-blink response and to correct the CAEPs at each electrode site.  This approach is advantageous because it preserves more of the EEG data while reducing or eliminating the effects of the eye blinks.
      Collectively, these issues reinforce the importance of using multiple recording channels, because analyzing evoked responses obtained from the EOG channel and the scalp distribution of the ocular response can help to determine whether a response is contaminated by EOG artifact.
      Another noise source is alpha activity. Alpha is a rhythmic oscillation (8-13 Hz) in the electroencephalogram that typically occurs when a patient is awake and relaxed.  It is largest in amplitude over parietal and occipital regions and is especially problematic when the patient’s eyes are closed.  For this reason, it is important to record CAEPs with the patient’s eyes open.  In some individuals, alpha activity can make it difficult or even impossible to record the P1-N1-P2 complex.
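A minimal sketch of the threshold-based rejection step mentioned above, assuming the epochs are stored as a trials × channels × samples array and that one channel carries the VEOG; the ±100 µV criterion comes from the text, while the array layout, channel index and simulated data are assumptions made for the example.

```python
import numpy as np

def reject_eog_trials(epochs_uv, veog_channel, criterion_uv=100.0):
    """Drop every trial in which the vertical EOG channel exceeds the
    artifact-rejection criterion at any sample.

    epochs_uv: array of shape (n_trials, n_channels, n_samples), in microvolts.
    Returns the retained epochs and a boolean mask of the kept trials."""
    veog = epochs_uv[:, veog_channel, :]
    keep = np.all(np.abs(veog) < criterion_uv, axis=1)
    return epochs_uv[keep], keep

# Example: 200 simulated trials, 34 channels (channel 33 = VEOG), 700 samples.
epochs = np.random.randn(200, 34, 700) * 20.0
clean, kept = reject_eog_trials(epochs, veog_channel=33)
print(f"kept {kept.sum()} of {len(kept)} trials")
```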

Utility of Multiple Recording Channels
CAEPs are typically recorded from multiple electrode sites.  The standard montage is referred to as the 10-20 system, which has been modified several times as technological advances have allowed recordings from increasing numbers of electrodes.  Studies employing more than 8 channels often use special electrode caps that permit fast electrode application and reasonably accurate electrode placement.  Accurate placement of the electrodes is critical for CAEP recordings.  The generators of these components are close to the surface of the head (relative, for example, to those of the auditory brainstem response), which means that the electrical activity is volume conducted, or projected, to the scalp much more narrowly than for the ABR, which is generated by deeper sources.  As a result, misplacement of an electrode by several millimeters is considered a large error and can even result in missing the response of interest.  Using a standardized electrode array also allows data comparison across different research laboratories.

The recorded waveforms represent complex, overlapping activity from multiple generators, and responses recorded at a particular electrode may or may not be generated in adjacent brain regions.  That being said, multiple-electrode-site recordings are used to identify CAEP components based on their topography (their amplitude distribution across the scalp).  For example, the MMN is known to be maximal in amplitude at fronto-central scalp locations; therefore, a negativity that peaks at a parietal electrode site (P2, for example) is not likely to be an MMN.

 In general, the more electrodes used to compute the map, the more accurately the topography of the response is represented.  Several mathematical modeling techniques can improve estimates of topography and neural generators, including scalp current density analysis, BESA (brain electrical source analysis) and LORETA (low-resolution brain electromagnetic tomography).  The primary disadvantage of many of these techniques is that often more than one solution may account for the activation patterns observed, and the assumptions behind these models are not always valid.

The maximum number of channels is a hot topic, and the number is growing rapidly.  It is possible to record from 256 (or more) channels; however, with such large numbers of electrodes, inaccurate electrode placement can cause problems.  For this reason, in many cases, a practical upper limit is approximately 64 electrodes.  Ultimately, the minimum number of channels that should be used to record CAEPs depends heavily on the investigator’s question. 

                      For questions of presence or absence of a response (for example, when using the P1-N1-P2 to estimate the patient’s audiogram), it is possible to obtain reasonable results using as few as 1 or 2 channels. 

                      Suprathreshold questions, including topography and/or location and orientation of the neural generators, require at least 16 to 32 channels, often many more.  Even though more channels provide a better estimate of the response distribution, it is important to note the limitations.  Without clear separation of source waveforms, it can be difficult to assess component amplitudes and latencies.  This can be particularly problematic with recording of response complexes such as P1-N1-P2 because each component has different responses to stimulus, subject, and recording parameters, and amplitude changes in one component can affect the measured amplitude of the adjacent component.
For studies of patient populations, one should err on the side of too many rather than too few electrodes.  For example, emerging studies suggest that individuals with autism and/or language impairment show abnormalities in the obligatory CAEP at lateral temporal electrode sites.  These abnormalities are missed if the CAEP is recorded solely from Cz.  For recommendations regarding the number of channels for recording CAEPs and other recording parameters, refer to Table 3.
Summary of recommended parameters for the P1-N1-P2 (Table 3)

A. To estimate audiometric threshold

Subjects
·      State: awake and quiet (adults, children, infants)
·      Eyes: open
·      Condition: attend or ignore conditions

Stimuli
·      Frequency: 250-4000 Hz tone bursts
·      Rise/fall times: 20 ms
·      Plateau time: 20 ms or more
·      Interonset interval: 1-2 seconds
·      Intensity: 10-80 dB peSPL (use clinical judgement)

Recordings
·      Save: averaged waveforms
·      Electrodes: 1-2 channels
·      Non-inverting: vertex
·      Inverting: ipsilateral or contralateral mastoid, or tip of nose
·      Additional: consider a vertical eye channel
·      Artifact rejection: ±100 µV
·      EEG filters: 1-30 Hz
·      Amplification (gain): 10,000-30,000×
·      Analysis time: prestimulus 100 ms; poststimulus 700 ms
·      Number of trials per average: 50-100
·      Replications: at least 2

Measurements
·      Measures: adults, N1-P2; children, P1 and N200-250; infants, whichever components are reliable. Peak-to-peak amplitude is required; peak latency is recommended.
·      Response presence: determined by replicable components and a response 2-3 times larger than the amplitude in the prestimulus interval.

B. For suprathreshold applications

Subjects
·      State: awake and quiet (adults, children, infants)
·      Eyes: open
·      Condition: attend or ignore conditions

Stimuli
·      Types of stimuli: tone bursts, speech (vowels or consonant-vowel combinations), other complex stimuli
·      Interonset interval: 1-2 seconds
·      Intensity: 60-80 dB peSPL

Recordings
·      Save: ongoing, continuous EEG for later post hoc analysis
·      Electrodes: 16-32 channels or more
·      Reference electrode: tip of nose or averaged reference
·      Artifact rejection: ±100 µV on all channels, or first use an eye-blink correction algorithm
·      EEG filters: 0.15-100 Hz (acquisition); 1-30 Hz (post hoc digital filtering, 12 dB/octave filter slope)
·      Amplification (gain): 10,000-30,000×
·      Analysis time: prestimulus 100 ms; poststimulus 700 ms or more
·      Number of trials per average: 50-300
·      Replications: at least 2

Measurements
·      Measures: adults, P1-N1-P2; children, P1 and N200-250; infants, whichever components are reliable. Baseline-to-peak amplitude and peak latency, using latency windows established from grand mean data.
·      Response presence: determined by statistical means, replicable components, a response 2-3 times larger than the amplitude in the prestimulus interval, and an appropriate scalp distribution.

Another important issue for multichannel recordings is placement of the reference electrode.  Use of a reference electrode at the tip of the nose is typically recommended, because it is in line with the main generators of these potentials on the supratemporal plane of the auditory cortex; therefore, responses recorded below the supratemporal plane invert in polarity when a nose reference is used.  The disadvantage of this approach is that the nose is only relatively inactive.

 Waveform inversion is used to identify components based on their scalp distribution.  Both the P1-N1-P2 and the MMN typically exhibit this inversion.  It is also possible to record using an ipsilateral (or even contralateral) ear reference.  This is useful when recording CAEPs to estimate hearing threshold, because it is the same setup used to record the ABR.  The disadvantage is that response inversion above the supratemporal plane is not typically present.  Use of a linked-earlobe reference is not recommended, as this reduces response differences across the hemispheres.  Some research systems allow calculation of a common average reference, derived by summing the activity in all electrode channels and dividing by the number of channels.  This approach has the advantage of minimizing bias from a single reference site.  Regardless of the reference site used, it is important to note the location of the reference electrode so that the scalp distribution of the response can be appropriately interpreted.
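Re-referencing to a common average is arithmetically simple: the mean of all electrode channels at each time point is subtracted from every channel. A minimal sketch, assuming a channels-by-samples array recorded against any single reference:

```python
import numpy as np

def common_average_reference(eeg):
    """Re-reference EEG/ERP data to the common average.

    eeg: array of shape (n_channels, n_samples). The channel mean at each
    sample is subtracted from every channel, so no single site biases
    the reference."""
    eeg = np.asarray(eeg, dtype=float)
    return eeg - eeg.mean(axis=0, keepdims=True)

# Example: 32-channel recording, 1000 samples of simulated data.
rereferenced = common_average_reference(np.random.randn(32, 1000))
print(np.allclose(rereferenced.mean(axis=0), 0.0))  # True
```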

In general, the CAEP is a very small-amplitude signal embedded in large-amplitude background noise.  Hardware and software should be selected to allow the response of interest to be maximized and the background noise to be minimized.  The recording parameters can dramatically affect the quality of the CAEP waveform and must be chosen very carefully.

Test Protocols and Procedures

Stimulus Parameters
Stimulus type: tones. Tonal stimuli have typically been used to elicit the ALR. Whereas shorter-latency responses generally are not effectively evoked by stimuli with rise/fall times longer than 5 ms, optimal ALR stimuli have rise/fall times and plateau times greater than about 10 ms (Onishi & Davis, 1968). Rise/fall times of over 20 ms and durations of hundreds of milliseconds are even effective in eliciting the ALR.

                               As a rule, amplitudes for the N1 and P2 components of the ALR are larger, and latencies longer, for low-frequency tonal signals when compared to high frequency signals. Attention during ALR measurements is usually verified by asking the subject to count silently the number of target stimuli presented and to keep a mental or written record of the number until the averaging run is complete.
                               Some components of the ALR (e.g., P100 and N250) show larger amplitude and shorter latency for complex tones than for single frequency tonal stimuli.
Two major ALR components – N1 and P2 – can also be elicited by modulation of the amplitude or frequency of a tonal signal and by acoustic manipulation of features of speech stimuli (e.g., amplitude, spectrum, and formant frequencies), reflecting neural detection of the acoustical changes (Kaukoranta).
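As an illustration of the tonal-stimulus parameters discussed above, the sketch below generates a 1000 Hz tone burst with linear onset and offset ramps. The sampling rate and the specific rise/fall and plateau durations are assumptions chosen for the example, within the ranges the text describes as effective for the ALR.

```python
import numpy as np

def tone_burst(freq_hz=1000.0, rise_fall_ms=20.0, plateau_ms=50.0, fs=44100):
    """Gated tone burst with linear rise/fall ramps (durations in ms)."""
    rise_n = int(fs * rise_fall_ms / 1000.0)
    plateau_n = int(fs * plateau_ms / 1000.0)
    n = 2 * rise_n + plateau_n

    t = np.arange(n) / fs
    carrier = np.sin(2 * np.pi * freq_hz * t)

    envelope = np.ones(n)
    envelope[:rise_n] = np.linspace(0.0, 1.0, rise_n)    # linear onset ramp
    envelope[-rise_n:] = np.linspace(1.0, 0.0, rise_n)   # linear offset ramp
    return carrier * envelope

burst = tone_burst()
print(f"{len(burst) / 44100 * 1000:.0f} ms stimulus")  # rise + plateau + fall
```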

Stimulus type: speech- Speech stimuli are quite effective in eliciting the ALR. ALR findings have been reported for different types of speech signals, including natural and synthetic vowels, syllables, and words. Complex tonal stimuli generate ALR components P100 and N250 with larger amplitudes than speech (vowel) sounds (Ceponiene et al., 2001). There are other differences in ALRs evoked by simple tonal versus speech signals.
Latency of the ALR N1 component, for example, varies with the frequency of tonal stimuli, whereas for natural speech sounds the N1 latency is consistently about 120 ms. The ALR can be applied in the electrophysiological assessment of the representation of speech cues in the central auditory nervous system. For example, latency of the ALR N1 wave evoked by speech sounds varies with voice onset time (Kurtzberg, 1989; Sharma et al.). The effects of other speech cues on the ALR have also been reported for normal subjects in studies of auditory function in aging and in the clinical application of the ALR in varied patient populations.

Natural vowel sounds generate ALR components (N1 and later waves) that are detected with considerably larger amplitude from the left hemisphere, whereas tonal stimuli produce symmetrical brain activity (Szymanski et al., 1999).

Most investigators of speech-evoked ALRs utilize synthetically created speech sounds (e.g., the syllables /ga/ and /da/). In one study, the stimuli were four consonant-vowel syllables (/bi/, /pi/, /si/, and /shi/), each a token from the Nonsense Syllable Test (NST). Taken together, the stimuli include a variety of acoustic features of speech, such as place of articulation, fricative phonemes with high-frequency energy, low-frequency vowel energy, and voice onset time. Stimuli were presented in the sound field at an intensity level of 64 dB SPL (at the ear) and with an ISI of almost 2 seconds (1910 ms). The ALR was recorded with a 31-channel electrode array (NeuroscanTM Quick-Cap system) over a 1400 ms analysis time (prestimulus time of 100 ms).

During ALR measurement, subjects watched a video of their choosing after being instructed to ignore the stimuli. There is long-standing evidence that tonal stimuli and synthetic speech sounds produce repeatable ALRs (Pekkonen), and this study demonstrated that natural speech sounds also elicit reliable ALR components (P1, N1, and P2). Intersubject reliability was high, and the ALR was stable within subjects from one test session to the next. ALR morphology varied as a function of the speech stimulus. That is, speech sounds with different acoustic features generated differences in ALR waves, including smaller or larger amplitudes for specific negative and positive waves (e.g., N345 and P413).

There were also reliably distinctive ALR findings (e.g., neural patterns) when natural speech sounds differed according to important acoustic dimensions, such as two fricative sounds with different places of articulation or two stop consonants that differed in voice onset time. Previous investigators found that synthetically generated voiced speech sounds evoked ALR waves (N130 and P217) with larger amplitudes than waves evoked by voiceless speech sounds.

Given the stability of the ALR to natural speech sounds, and its sensitivity to changes in acoustic properties of speech, we can anticipate investigations of the potential clinical application of the ALR in documenting auditory processing in various clinical populations including children with hearing aids.
Stimulus type: Other- Speech stimuli at the word level are effective in eliciting the N400 wave within the ALR. Words with semantic content (e.g., common names and proper names), specifically those that are semantically anomalous or incongruent, are particularly effective in eliciting the N400 response. Amplitude of the response increases directly with the extent of semantic incongruence (e.g., Kutas & Hillyard, 1980). The N400 can be recorded in this way during certain sleep stages, as well as during wakefulness.

Duration- In a study of the effects of duration on the ALR in normal-hearing subjects, the stimuli were 1000 Hz tone bursts with linear onset-offset ramps. Varying rise/fall and plateau times produced somewhat complex effects on ALR latency and amplitude. For example, at a fixed rise/fall time (30 ms), there was no change in latency (of the N1 or P1 components) or in amplitude (N1 to P1) as duration was varied from 0 through 300 ms. With a brief rise/fall time of 3 ms, a progressive reduction of the plateau time from 30 ms down to 0 ms produced a corresponding reduction in ALR amplitude.

Also, with a relatively long fixed plateau time, ALR amplitude remained constant as rise/fall time was varied from 50 to 300 ms. Steeper slopes for the rise/fall time resulted in shorter ALR latencies. A significant reduction in ALR threshold was found as a function of signal duration, a pattern consistent with the conclusions of numerous psychophysical studies of temporal integration. Latencies for the N1 and P2 waves decreased with increasing duration, although the latency changes occurred mostly for the change in signal duration from 8 to 32 ms.

Amplitude increases as stimulus duration increases up to approximately 30 to 50 ms, but decreases when rise and fall times exceed 50 ms. Amplitude for the N1 wave increased linearly as a function of stimulus duration, and the change was the same in all subject age groups. Amplitude changes were noted with stimulus duration differences of 2 to 4 ms. An age-related effect, however, was observed for the P2 component. Young and middle-aged subjects showed an increase in amplitude with longer durations, whereas duration changes did not produce a significant amplitude change for the older adults. The authors interpreted this finding as evidence of impairment in the encoding of signal duration with advanced age.

Intensity- P1-N1-P2 amplitude increases with stimulus intensity in an essentially linear manner, though the amplitude-intensity function may saturate at intensities exceeding approximately 70 dB normal hearing level (nHL), particularly when short ISIs are used.  P2 amplitude may saturate at higher stimulus intensities than N1.  In general, latencies decrease as stimulus intensity increases.  At low intensities, P2 latency increases more than N1 latencies.

  As intensity approximates behavioural threshold for the same stimulus (e.g., 1000 Hz), the P2 wave disappears first, and then the N1 wave. ALR latency changes with intensity vary for clicks versus tonal stimuli.
  For an ALR evoked by a click stimulus, latency of the N1 or P2 components changes relatively little as stimulus intensity increases, except at intensity levels very close to auditory threshold. As Rapin et al. (1966) point out, ALR latency therefore has limited potential for estimation of audiometric threshold.
  Variability in response latency occurs at intensity levels near threshold, but it decreases as the stimulus intensity level is increased to about 40 dB or higher.
  The N1 wave was detected at signal intensity levels that were an average of 8 dB higher than behavioural thresholds for 1000 Hz (standard deviation of 3.7 dB) and 7 dB higher for 4000 Hz (standard deviation of 3.2 dB).
  ALR threshold is clearly influenced by signal duration, reflecting an electrophysiological version of the psychophysical process of temporal integration or time-intensity trading.

ALRs recorded from midline sites (e.g., Fz and Cz) are more dependent on signal intensity and the order of signal presentation (of different intensities) than those from lateral scalp electrode sites over the temporal lobe regions (Carrillo-de-la-Pena & Garcia-Larrea, 1999). Amplitude grows gradually with increasing intensity levels, in some persons actually reaching a plateau or “saturation” above approximately 75 dB. Amplitude increases as a function of intensity are steeper for lower frequency stimuli (e.g., 500 Hz) than for higher frequencies (e.g., 8000 Hz).
Rate and Interstimulus Interval (ISI).
P1-N1-P2 amplitude increases as the rate of stimulus presentation decreases until the ISI is approximately 10 seconds.  At low stimulus intensities, amplitudes asymptote or level off at shorter ISIs, while at high stimulus intensities, amplitude increases continue to occur even beyond ISIs of 10 seconds.  The most pronounced effect of longer ISI times is within 1 to 6 seconds. There is little change in latency as stimulus rate changes.

 The ALRs are highly dependent on ISI (Budd et al 1998). ISI is a more accurate and straightforward way of describing the rate factor in ALR measurement than simply noting the number of stimuli presented per second.

  ALR studies have confirmed that longer ISIs and, concomitantly, slower stimulus rates produce substantially larger amplitudes for the N1 and P2 components, but have little effect on the latency of these ALR components.
  The refractory time is directly related not only to the latency of the evoked response, but also to response amplitude. Presentation of a signal during the neuronal recovery process (i.e., when the ISI is shorter than the refractory time) results in smaller-than-optimal amplitude. Conversely, with increases in the ISI there are predictable increases in ALR amplitude (see the sketch after this list).
  The increased ISI required to produce maximum-amplitude ALR waves is not necessarily related, temporally or neurophysiologically, to the refractory period of individual neurons.

  The most pronounced effect of longer ISI times is within the range of 1 to 6 seconds. However, further increases in amplitude may be observed by lengthening ISI times to 10 seconds or even longer.
  At these slower stimulus presentation rates (longer ISI times), the amplitude of the N1 or P2 components of the ALR is, on average, 6 to 8 µV when evoked with a similarly moderate stimulus intensity level.
  A practical implication of the relationship between stimulus presentation rate and ALR amplitude is to employ slow stimulus rates (longer ISIs) when recording the ALR in patient populations.
  Stimulus intensity also interacts with rate. The amount of amplitude increase associated with lengthened ISIs (that is, the amplitude-versus-ISI slope) is steeper for higher intensity levels.
  For ISI values of less than 4 seconds, ALR N100 amplitude is comparable for frontal versus central electrode recordings.
  With longer ISIs (greater than 4 seconds), vertex electrode recordings yield larger amplitudes.
  Longer ISIs (e.g., > 1 second) are required to consistently record an N1 component from children (Bruneau et al., 1997).
  For children, stimulus rate in general is an important factor in the amplitude of the N1 component, with decreases in amplitude on the order of 50 percent or more when the ISI is reduced from 4 seconds to 1 second.
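As referenced in the list above, the refractory/recovery behavior can be pictured with a saturating exponential recovery function. This is only a descriptive sketch; the maximum amplitude and time constant below are assumed for illustration rather than taken from the literature:

```python
import numpy as np

def n1_amplitude(isi_s, a_max=8.0, tau_s=3.0):
    """Illustrative recovery model: amplitude grows toward a_max as the ISI lengthens.
    a_max (uV) and tau_s (s) are assumed, not empirical, values."""
    return a_max * (1 - np.exp(-isi_s / tau_s))

for isi in (0.5, 1, 2, 4, 6, 10):
    print(f"ISI {isi:4.1f} s -> ~{n1_amplitude(isi):.1f} uV")
# With these assumed constants, most of the amplitude growth occurs between roughly
# 1 and 6 s, with smaller additional gains out to 10 s, mirroring the pattern above.
```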

Stimulus Repetition- Late AERs have been elicited with various patterns of stimulus repetition, including presentation of single stimuli at regular intervals, single stimuli at irregular intervals, and trains of stimuli (a cluster of one or more signals separated by relatively short intervals) followed by longer (intertrain) intervals.

 Crowley and Colrain (2004) reported decreased N1 amplitude for signals within a train, indicating short-term habituation, and decreased N1 amplitude from train to train as an indication of long-term habituation.

The ALR N1 component was evoked with repeated trains (sequences) of four tones presented with ISIs of 1 second and separated by an interval of 12 seconds. In children, amplitude of the N1 wave decreased by about 50 percent from the first to the fourth successive tones within the sequence, and N1 latency increased. During the stimulus repetition, the N2 wave in children increased in amplitude. With continued recording of the ALR from children, there was a gradual dominance of the N2 wave and loss of the N1 wave. Although signal repetition with the four tone sequence also produced smaller N1 amplitude in adults, the decrease was less than in the children, and the N1 wave clearly remained.

Contralateral Signals
The ALR may be altered by sounds presented to the nonstimulus ear. The contralateral sounds may be tones, some type of noise (e.g., white noise), or speech (e.g., multitalker babble or meaningful discourse). Competing sounds presented to one ear appear to interfere with subject attention to signals presented to the other ear. Cranford and colleagues reported amplitude reduction for the N1 to P2 wave complex with the presentation of a speech signal (babble) to the nonstimulus ear.

Cranford, Rothermel, Walker, Stuart, and Elangovan (2004) further investigated the effects of the difficulty of a listening task and a competing signal on the N1 and P2 components of the ALR. In this study, subjects were ten young normal-hearing female adults (age 20 to 35 years). The tasks involved discrimination of two frequencies that were separated by either an octave (1000 versus 2000 Hz) or only 100 Hz (1000 versus 1100 Hz), and these tasks were performed in quiet (with no competing signal) and then with speech competition presented to the nontarget ear.
Results
  Amplitude for the N1 wave was the same for each discrimination task (easy versus difficult) and in the quiet versus competition signal conditions.
  In contrast, there was a reduction in amplitude for the P2 component for the difficult versus easy task and with the competing signal in comparison to the quiet condition.
  These findings are another example of the independence of the N1 and P2 waves and argue against simple analysis of the N1 – P2 complex within the ALR waveform.
  The work of Dr. Cranford and colleagues also points to some potential clinical applications, such as measurement of the ALR with competing sounds in children with auditory processing disorder (APD).

Acquisition Parameters of LLR
Analysis time- The ALRs are long-latency responses, with major components (P1, N1, P2, N2) and other waves (e.g., N400) beginning or persisting long after the “middle-latency” region. The ALR analysis time should therefore extend for at least 500 ms after the stimulus. Poststimulus analysis times of 1000 to 1500 ms (1 to 1.5 seconds) are often reported in the ALR literature, almost always with a prestimulus analysis period (e.g., 100 ms).
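A sketch of how epochs with a long poststimulus window and a prestimulus baseline might be cut from a continuous recording (the sampling rate, window lengths, variable names, and simulated data are assumptions for illustration):

```python
import numpy as np

fs = 500  # sampling rate in Hz (assumed)

def epoch(continuous_eeg, stim_onsets, pre_ms=100, post_ms=1000):
    """Extract epochs from -pre_ms to +post_ms around each stimulus onset (in samples)
    and subtract the mean of the prestimulus period (baseline correction)."""
    pre = int(fs * pre_ms / 1000)
    post = int(fs * post_ms / 1000)
    epochs = []
    for onset in stim_onsets:
        seg = continuous_eeg[onset - pre:onset + post].copy()
        seg -= seg[:pre].mean()          # baseline to the prestimulus interval
        epochs.append(seg)
    return np.array(epochs)

# Example: simulated single-channel recording with stimulus onsets every 2 seconds
eeg = np.random.randn(60 * fs)
onsets = np.arange(fs, eeg.size - fs, 2 * fs)
alr_epochs = epoch(eeg, onsets)
alr_average = alr_epochs.mean(axis=0)
```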

Electrodes- Much of the current information on the different effects of electrode location on the ALR was generated by investigators attempting to determine the neural sources of the response. Pauline Davis (1939), in the first description of ALR, noted that the response was largest when recorded at the vertex. Many other investigators have subsequently presented evidence confirming that the vertex, or a location within two or three centimeters lateral or anterior, is an optimal electrode site. The figure adapted from the classic Vaughan and Ritter (1970) study showing ALR waveforms recorded from different coronal electrode arrays offers a concise illustration of the influence of recording site on the response. There is diminishing response amplitude at greater distances from midline and then clear reversal of the waveform polarity in the region or plane of the temporal lobe (the Sylvian fissure).

  The ALR can, therefore, be reliably recorded with a noninverting electrode located anywhere over the frontal portion of the scalp of the head, especially along the midline.
  The ALR components usually have maximum amplitude with a vertex site.
  Major ALR components (e.g., N1 and P2) have smaller amplitudes when recorded with hemispheric electrodes over coronal (e.g., C3 and C4) and temporal regions (T3 and T4).
  However, some ALR components, such as the Nc wave with a latency of about 150 ms, are recorded with noninverting electrodes over the temporal lobes.
  Wolpaw and Penry (1975, 1978), for example, provided evidence of a difference in waveform morphology in the 80 to 200 ms region for Cz versus T3/T4 electrode sites, which they referred to as the “T complex”.
  The T complex was composed of a positive voltage peak at about 105 to 110 ms and then a negative peak at about 150 to 160 ms.
  These investigators further showed that the conventional N1 – P2 complex and new T complexes were greater in amplitude when recorded from electrodes located on the scalp contralateral to the stimulus, and greater for T4 (the right side) than T3 (the left side).
  Right hemisphere dominance in brain activity (and sometimes a left ear effect) is often observed for nonverbal stimulation.
  ALR generation, at least for the N1 component, involves in part the posterior superior temporal plane and nearby parietal lobe regions.
  Amplitude of the N1 – P2 complex is influenced by an interaction of signal intensity, the order of signal presentation, and the noninverting electrode site (Carrillo-de-la-Pena & Garcia-Larea, 1999). ALRs recorded from frontal-central electrode sites (e.g., Fz and Cz) are more dependent on signal intensity and the order of signal presentation (of different intensities) than ALRs detected with lateral scalp electrode sites over the temporal lobe regions.
  In addition, amplitude of the N1 component is larger when it is recorded from an electrode over the frontal or temporal lobe contralateral to the side of stimulation, whereas amplitude of the P1 and P2 components is diminished for a contralateral (versus ipsilateral) noninverting electrode array.
  The inverting electrode for ALR measurements, as reported in the literature, is usually located either on the mastoid or on the earlobe ipsilateral to the stimulus ear, or electrodes linked between both ears.
  Commonly used reference sites (e.g., mastoid, nose, and ear) in ALR measurement are highly active.
  The nape of the neck is a practical and effective option for a noncephalic inverting, and true reference, electrode site.
  When the inverting electrode serves as a true reference, all of the brain activity contributing to the response is detected with the noninverting electrode and amplitude is maximal.
Filter settings- The frequency composition or spectrum of the ALR and the P300 response is mainly in the region below 30 Hz. Band-pass filter settings of less than 1 Hz (e.g., 0.1 Hz) to 30 or 100 Hz are typically employed in ALR and P300 measurement, with commonly reported roll-off values of 24 dB/octave for the high-pass filter and 12 dB/octave for the low-pass filter (e.g., at the 100 Hz setting).
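In software, a band-pass characteristic like the one described above can be approximated with Butterworth high-pass and low-pass sections whose orders correspond to the quoted roll-offs (roughly 6 dB/octave per pole, so a 4th-order high-pass ≈ 24 dB/octave and a 2nd-order low-pass ≈ 12 dB/octave). The sampling rate and the use of zero-phase filtering here are assumptions for illustration, not values from the text:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 500  # sampling rate in Hz (assumed)

# High-pass at 0.1 Hz, 4th order (~24 dB/octave per pass);
# low-pass at 30 Hz, 2nd order (~12 dB/octave per pass)
hp = butter(4, 0.1, btype='highpass', fs=fs, output='sos')
lp = butter(2, 30.0, btype='lowpass', fs=fs, output='sos')

def bandpass_alr(x):
    """Forward-backward (zero-phase) filtering so peak latencies are not shifted;
    note the two passes double the effective attenuation slope."""
    return sosfiltfilt(lp, sosfiltfilt(hp, x))

filtered = bandpass_alr(np.random.randn(10 * fs))
```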
Analysis and Interpretation
Normal Variation. ALR components are not necessarily the same as ALR peaks. Although this issue of terminology may appear to be simply a matter of semantics, the point is rather fundamental and related to the anatomic source of the ALR. That is, a wave may in fact include more than one individual component, perhaps multiple components arising from different neural sources and affected differentially by manipulation of stimulus parameters or even neuropathology.
 Relatively minor alterations in the acoustical properties of the signal(s) and subtle variations in subject behavior markedly influence the morphology of the late responses and even the presence or absence of specific ALR waves or components within a single wave.
  Waveform morphology for earlier latency AERs, such as ECochG, ABR, and even AMLR, is remarkably consistent from one subject to the next and within subjects for most variations of stimulus characteristics. Subject factors, such as state of arousal or attention to the signals, are negligible for the early latency responses.
  In general, the morphology of auditory late response waveforms is complex and highly variable, especially for certain types of signals (e.g., speech sounds) and for demanding listening tasks.
  The ALR N1 (N100) wave, really a wave complex, is usually a well-defined and sharp wave occurring within the latency region of 75 to 150 ms. Parameters of the N1 component (including its presence, latency, and amplitude), and the presence of subcomponents or other negative waves within the same time frame, are determined by the physical properties of the stimulus (such as the type, e.g., tone burst or speech stimulus, the frequency for tonal stimuli, the intensity, the duration, and the rate of presentation) and also by subject factors, such as state of arousal and sleep, attention, and memory.
  Indeed, the N1 wave is enhanced (made more negative) when a subject selectively attends or listens to a specific stimulus. Researchers have debated whether the N1 is actually increased in amplitude during selective attention or whether the negativity in the region of N1 is really made greater by an overlapping processing negativity (the Nd component). Variations or components of the N1 wave complex include the N1b and N1c components, although it is not always easy with modification of measurement parameters to clearly differentiate the two waves (Perrault & Picton, 1984).
  The N1b component is recorded with a latency of about 100 ms with a noninverting electrode at a midline site (e.g., Cz or Fz),
  whereas the N1c component is recorded with electrode sites over the temporal lobe (e.g., C3 and C4).
  The N110 response, another variant or component of the N1 wave complex, can be evoked with speech stimuli or by specific acoustic properties or features of speech stimuli.
  The Nd wave, referred to as the “processing negativity” or PN (Naatanen & Michie, 1979), is a broad wave that follows and persists after presentation of a stimulus. The Nd wave usually begins at a latency of about 150 ms, although with alterations in stimulus parameters (e.g., the ISI) or specific characteristics of subject attention to the stimuli, processing negativity may coincide with the earliest portion of the N1 wave.

The N400 response, as the label implies, is a negative wave in the region of 400 ms. The auditory N400 response is evoked with speech stimuli. The typical stimulus paradigm for the auditory N400 response involves semantic properties of language for single words (e.g., related versus nonrelated) or for words within sentences. A sentence that is semantically appropriate and expected – that makes sense (e.g., “I like my beer in a glass”) - does not elicit an N400 response, whereas a sentence that is, unexpectedly, not semantically appropriate (e.g., “I like my beer in a shoe”) produces an N400 response.

The P2 component is a robust, positive wave in the latency region of 180 to 250 ms. In normal persons, P2 may appear as a rather sharply peaked wave, as a broad wave with no distinct peak, or as a wave complex with multiple peaks. As noted in the preceding discussion about the N1 component, it’s likely that major waves of the ALR as recorded from the scalp, are really wave complexes. Much of our information about the ALR dates back to investigations conducted in the 1960s and 1970s. The N2 wave, the negative wave following P2, is substantially influenced by stimulus intensity, stimulus probability, the difficulty in determining a difference between two stimuli, and subject attention.

Finally, underlying components of the ALR N1 wave (e.g., N1, Nb, Nc, and Nd) is a general negative voltage shift in brain electrical activity referred to as the “sustained negativity potential”. The sustained negativity potential has negative voltage relative to the prestimulus baseline and is maintained throughout the duration of a stimulus.
Abnormal patterns- Abnormal ALR findings include reductions in amplitude, prolongations in latency, polarity reversal for selected components, and total absence of one or more components. Because of the inherent normal response variability, the rather strict criteria used in analysis of shorter latency responses, such as the ABR, are not appropriate for the ALR. Even interaural (between-ear) differences in response parameters are not applicable in most cases, because binaural stimulation is often employed.
       
Maturation and Aging
The morphology of the P1-N1-P2 complex is affected by maturation. The complex changes dramatically over the first 2 years of life.  The complex begins as a large P1 wave followed by a broad, slow negativity occurring near 200 to 250 ms after the onset of the sound.  A P1-N1-P2 complex similar to that of adults is not seen until approximately 9 to 10 years of age unless stimuli are presented at a very slow rate.  Refractory changes occurring between the ages of 6 and 18 years can affect waveform morphology.  Also, responses recorded at midline central electrode sites, reflecting contributions from primary auditory cortex, mature more rapidly than those from lateral temporal sites, which reflect maturation of secondary auditory cortex. These potentials continue to mature until the second decade of life and then change again with old age.  Prolonged N1 and P2 latencies and amplitude changes have been reported in aging adults. (Fig 3)





ALR latency decreases and amplitude increases as a function of age during childhood, up until about age 10 years (Weitman, Fishbin & Graziani, 1965; Whiteman & Graziani, 1968). Some investigators have described a latency increase and an amplitude decrease with advanced age (Callaway & Halliday, 1973; Goodlin et al., 1978; Roth & Kopell, 1980). Games et al. (1997) examined maturational changes in spectro-temporal features of the central and lateral N1 components of the auditory evoked potential to tone stimuli presented with a long stimulus onset asynchrony.
     Peak latencies of both components decreased with age.
     Peak amplitude also decreased with age; consequently, the difference between the lateral N1 and the central N1 amplitude also decreased with age.
     Deepa (1997) studied age-related changes in the ALLR.  LLR waveforms were obtained at 70, 50, and 30 dB nHL.
     There was a significant difference between children and adults for all peak latencies and amplitudes.
     There was no significant difference between males and females for either adults or children.
There was a significant difference only in N1 peak latency between the 7- and 8-year age groups, in P2 latency between the 8- and 9-year groups, and in P2 latency between the 7- and 9-year age groups. A study of cognitive potentials in elderly persons (J. Am. Acad. Audiol.) analyzed changes in the acoustically evoked cognitive potentials N200 and P300 with age in 232 participants who were 60 years or older.
-        N200 was elicited in 46.9% and
-        P300 in 45% of participants.
Significant predictors of the presence of cortical responses were the participant’s age and hearing level at the target frequency.
Gender: There is some evidence that N1 latencies are shorter and amplitudes larger in women than in men.  Additionally, amplitude-intensity functions have been reported to be steeper for females than males. 
Handedness: There was no handedness effect on N1 amplitude, but the latency of the N1 component was shorter for left-handed versus right-handed subjects. P2 amplitude values were smaller in left-handed subjects. Handedness was not a factor in N2 amplitude.
State of arousal and sleep: Sleep has a pronounced effect on the LLR. There are significant but differential changes in the major ALR waves as the person becomes drowsy and falls asleep. Progressively diminished amplitude of N1 is seen from the wake to the sleep state (Campbell & Colrain, 2002). During the transition to deep sleep, P2 amplitude increases (Campbell et al., 1992). The overall amplitude of N1 and P2 may remain reasonably stable across sleep stages (de Lugt, Loewy, & Campbell, 1996).





Attention: The N1 and P2 waves of the ALR are altered differentially when the subject is paying close attention to the stimulus or listening for a change in some aspect of the stimulus. For the N1 wave, an increase in attention causes greater amplitude. The P2 wave, in contrast, appears to diminish with increased attention by the subject to the signals (Michie et al., 1993).

Drugs:
Sedatives: ALR variability is increased.
Measurement of ALR and P300 responses under sedation is ill advised, as the validity of the findings may be compromised. Opioid analgesics such as morphine have no apparent effect on the ALR or ABR. Droperidol produces prolongation of the P1 and N1 components by about 10 ms and also a reduction in amplitude.
Anesthetic agents: Results vary greatly across studies. It is generally concluded that there is little effect on the latency of ALR components, but amplitude reduction may be seen as anesthesia is administered.
Alcohol: Amplitude of the ALR is reduced by acute alcohol intoxication. Latency of the N1 component is prolonged after alcohol ingestion, whereas P2 latency is unchanged (Teo & Ferguson, 1986).

Recording Factors
The P1-N1-P2 complex and threshold estimation
Table 3 summarizes recommended stimulus, recording, and measurement parameters for P1-N1-P2 threshold estimation.  Clinical judgement should be used in determining the most efficient order of testing; it is often wise first to obtain thresholds for a high frequency in each ear and a low frequency in each ear and subsequently to fill in other frequencies, provided the patient remains quiet, alert, and cooperative.  It is not always necessary to begin testing at high intensities (e.g., 80 dB peak equivalent sound pressure level (peSPL)), and it may be more efficient to begin with stimuli of low to moderate intensity (e.g., 20-40 dB peSPL).  If a response is present at low intensities, it is not necessary to test at higher intensities.


P1-N1-P2 for Suprathreshold Applications
Table 3 summarizes recommended parameters for P1-N1-P2 for suprathreshold applications, such as determining cortical responsiveness to sound, examining the integrity of the central auditory system pathways, and applications related to the processing of complex sounds such as speech, topographical mapping, and other research applications.  There is greater flexibility in these recommendations (equipment permitting) in terms of stimulus options and recording options.  It is therefore critical that all parameters in an experiment be set exactly the same across subjects, because filter settings that vary from subject to subject have the potential to confound the results of an experiment.

In addition to tone bursts, a variety of complex sounds are appropriate for suprathreshold application.  At least 16 to 32 channels are required to reasonably estimate scalp distribution of the response.  It is recommended that the ongoing EEG be saved for post hoc analysis.  For these research applications it is appropriate to obtain baseline-to-peak amplitude measures.

Applications of CAEPs
This section provides an overview of applications for CAEPs.  As previously described, the auditory P1-N1-P2 complex signals the cortical detection of an auditory event, can be reliably recorded in groups and individuals, and is highly sensitive to disorders affecting the central processing of sound.  Therefore, P1-N1-P2 responses are typically used by audiologists to estimate threshold sensitivity, especially in adults; to index changes in neural processing with hearing loss and aural rehabilitation; and to identify underlying biological processing disorders in people with impaired speech understanding.


Estimation of Hearing Threshold    
The P1-N1-P2 complex is highly sensitive to hearing loss, and P1-N1-P2 and behavioral thresholds typically fall within approximately 10 dB of each other.  Larger discrepancies have been reported; however, these are most likely due to lack of control over subject state.
The N1 is used in some clinical settings for the assessment of threshold in adult compensation cases and medicolegal patients.  Similar to the intensity functions for the ABR, N1 latency increases and amplitude decreases as the intensity of the stimulating signal approaches threshold.  This pattern of results is evident in Figure 4, which shows good agreement between evoked responses and behavioral thresholds recorded from a 12-year-old child with a high-frequency sensorineural hearing loss.  As a result, the P1-N1-P2 complex is the method of choice when estimating hearing thresholds in cooperative, awake patients.








                                                                              Fig 4
    
The P1-N1-P2 complex has several advantages over ABR for threshold estimation:
(1) It can be elicited by longer-duration, more frequency-specific stimuli than the ABR and thus provides a better estimate of the audiogram.
(2) It involves less data collection time because cortical responses are larger in amplitude and thus easier to identify than the ABR. 
(3) It is more resistant to electrophysiological noise.
(4) It provides a measure of the integrity of the auditory system beyond the brainstem.
(5) It can be evoked by complex stimuli, such as speech, and can therefore be used to assess cortical speech detection. 
It is for these reasons that the P1-N1-P2 complex is considered to be superior to the ABR for threshold estimation.  Yet in the United States the P1-N1-P2 is rarely used for this purpose, and for the most part has been replaced by the ABR.  This is likely because the largest population of patients requiring physiological estimates of hearing sensitivity is infants and young children, who need to be sleeping and/or sedated during testing.


Indexing changes in neural processing with hearing loss and aural rehabilitation
One of the more recent applications of CAEPs is monitoring experience-related changes in neural activity.  Because the central auditory system is plastic, that is, capable of reorganization as a function of deprivation and stimulation, CAEPs have been used to monitor changes in the neural processing of speech in patients with hearing loss and various forms of auditory rehabilitation, such as use of hearing aids, cochlear implants, and auditory training.


Hearing Loss
     CAEPs have been used to examine changes in the neural processing of speech in simulated and actual hearing loss.  Martin and colleagues examined N1, MMN (along with P3), and behavioral measures in response to the stimuli /ba/ and /da/ in normally hearing listeners when audibility was reduced using high-pass, low-pass, or broadband noise masking, partially simulating the effects of high-frequency, low-frequency, and flat hearing loss, respectively.  In general, N1 amplitude decreased and latency increased systematically as audibility was reduced.  This finding is consistent with the role of N1 in the cortical detection of sound.  In contrast, the MMN showed decreasing amplitude and increasing latency changes beginning only when the masking noise affected audibility in the 1000- to 2000-Hz region, which is the spectral region containing the acoustic cues differentiating /ba/ and /da/.  This finding is consistent with the role of MMN in the cortical discrimination of sound.  Similar results have been obtained in individuals with sensorineural hearing loss.  That is, P1-N1-P2 latencies increase and amplitudes decrease in the presence of hearing loss, and MMN latencies increase and amplitudes decrease as behavioral speech discrimination becomes more difficult.
     Figure 5 shows the ACC in a subject with flat sensorineural hearing loss.  In this example, nine synthetic vowels contained a range of second-formant frequency changes at midpoint.  The stimuli ranged from no acoustic change at midpoint (perceived as a long /u/) up to a 1200-Hz acoustic change (perceived as /ui/).  The detection of sound onset is evident in all conditions, as reflected by the presence of a P1-N1-P2 onset response.  A second P1-N1-P2 complex (the ACC) is clearly elicited by the change at stimulus midpoint in the 1200-Hz condition and persists through approximately the 38-Hz acoustic change condition.  The subject’s behavioral discrimination of the /u/-/i/ contrast exceeded chance in the 38-Hz condition (Fig. 5, starred condition).  Thus, the response complex signaling the neural detection of second-formant frequency change is clearly elicited in individuals with sensorineural hearing loss.  This response decreases in amplitude and increases in latency as the amount of second-formant frequency change decreases and shows good agreement with behavioral thresholds.  These results are consistent with those obtained by Ostroff (parametric study of the acoustic change complex elicited by second formant change in synthetic vowel stimuli (unpublished doctoral dissertation), City University of New York, 1999) in a study using listeners with normal hearing, except that the ACC and behavioral discrimination thresholds were sometimes elevated in the current study, as would be expected for listeners with hearing loss.













                                                           Fig 5
     Only a few studies have investigated the effects of conductive hearing loss on suprathreshold CAEPs.  Prolonged latencies are typically associated with conductive hearing losses, but central changes have also been reported.  Tecchio and colleagues found that the magnetic N1 responses to tones showed enlarged cortical representation after surgery and that these plastic changes occurred within a few weeks of surgery.  Rapid cortical reorganization, when measured using the N1 component of the P1-N1-P2 response, is also seen in adults with sudden unilateral hearing loss.  In fact, children with congenital unilateral hearing loss not only show central effects of auditory deprivation; development of the magnetic N1 response is also delayed.

Hearing Aids
     ERPs can be reliably recorded in individuals, even when the sound is processed through a hearing aid.  Yet to date only a few studies have examined ERPs in patients using hearing aids.  In earlier studies, cortical ERPs were recorded in aided versus unaided conditions in children with varying degrees of hearing loss.  These studies showed good agreement with the neural detection and audibility of sound.  That is, in unaided conditions (sub-threshold) the equivalent of the P1-N1-P2 was a clear obligatory P1 response, followed by a prominent negativity (N200-250).  In other examples, a child with a progressive hearing loss initially demonstrated a large obligatory response in the aided condition that later disappeared when she could no longer behaviorally respond to the sound.  Kraus and McGee reported findings in two subjects with sensorineural hearing loss.  One subject had good behavioral discrimination of the /ba/-/da/ contrast tested while using his hearing aid and showed a present MMN (and P3), while the other subject had poor behavioral discrimination of the contrast, even while using his hearing aid, and had an absent MMN (but a present P3).
     Korczak, Kurtzberg, and Stapells also demonstrated that hearing aids improve the detectability of CAEPs (as well as improving behavioral discrimination performance), particularly for individuals with severe to profound hearing loss.  Even though most of the subjects with hearing loss showed increased amplitudes, decreased latencies, and improved waveform morphology in the aided conditions, the amount of response change was quite variable across individuals.  This variability may be related to the fact that a hearing aid alters the acoustics of a signal, which in turn affects the evoked response pattern.  Therefore, when sound is processed through a hearing aid, it is necessary to understand what the hearing aid is doing to the signal.  Otherwise erroneous conclusions can be drawn from waveform morphology.  Despite the latency and amplitude changes that can occur with amplification, most subjects with hearing loss tested by Korczak and colleagues still showed longer peak latencies and reduced amplitudes than a normally hearing group.



Cochlear Implants               
CAEPs can be recorded from individuals with cochlear implants (Friesen and Tremblay, in press).  Modeling of auditory N1-P3 generator sources in the auditory cortex of implant users produces results similar to those in normally hearing listeners.  Latencies and amplitudes of N1 and P2 in “good” implant users are similar to those seen in normally hearing adults but are abnormal in “poor” implant users, and P2 in particular may be prognostic in terms of separating “good” from “poor” users.
CAEPs can be recorded in implant users in response to sound presented either electrically (directly to the speech processor) or acoustically (presented via loudspeaker to the implant microphone); however, stimulus-related cochlear implant artifact can sometimes interfere (Martin, in press; Friesen and Tremblay, submitted).  In many cases, radio frequency pulses (artifact) appear in electrodes surrounding the implant magnet.  Depending on the type of implant, however, it is still sometimes possible to view the response from a large number of remaining sites.
Some researchers have avoided the problem by using stimuli of relatively short duration, so that the stimulus artifact ends prior to the latency of the neural response of interest.  Another approach is to move the reference electrode along the isopotential field of the artifact until a point of null polarity is obtained and stimulus artifact is minimal.  Martin recently demonstrated that CAEPs can be recorded in this population and that the neural response and the implant artifact can be teased apart (Martin, in press; Boothroyd, in press).  Two factors can be used to assist in the identification of the neural response:
(1) Attention increases the neural response amplitude, whereas the implant artifact does not,
(2) In some electrode sites the implant artifact inverts in polarity but the neural response does not (Martin, in press). 
The ability to identify the neural response in cochlear implant patients will also be critical for the application of speech-evoked potentials to individuals with bilateral implants.
     A final cautionary note is that like hearing aids, cochlear implant devices alter the incoming signal.  Therefore, device settings (e.g. channel number, volume control) have the potential to interact with evoked waveform morphology (Friesen and Tremblay, in press).  For this reason, it is important to understand how the implant device is set and control for such variables.

Auditory Training
Hearing aids and cochlear implants deliver an amplified signal to a patient in the hopes of improving the detection and discrimination of speech.  Even if the central auditory system is capable of processing the signal, the individual’s ability to integrate these new neural response patterns into meaningful perceptual events may vary.  For this reason, CAEPs have been used to examine the brain and behavior changes associated with auditory training.










                                       Fig 6
The objective of auditory training is to improve the perception of acoustic contrasts.  In other words, patients are taught to make new perceptual distinctions.  When individuals are trained to perceive different sounds, changes in the N1-P2 complex and the MMN have been reported.  As perception improves, N1-P2 peak-to-peak amplitudes increase (Fig. 6) and MMN latency decreases, while the overall duration and area of the MMN response increase.  Because CAEP changes have been shown to occur prior to improved behavioral perception of speech sounds, physiological recordings may be helpful to clinicians.  Audiologists could monitor changes in the neural detection of sound during auditory rehabilitation.  For this reason, there is much interest in how these potentials may assist the audiologist who is programming a hearing aid or cochlear implant or managing an auditory training program.

Other Applications
     In addition to hearing loss, CAEPs are being used to explore the biological processes underlying impaired speech understanding in response to various types of sound and in various populations with communication disorders.  In some cases, the motivation is to learn more about the relationship between the brain and behavior.  In other cases, it is necessary to use physiological measures when an individual cannot reliably participate in traditional behavioral methods of assessment.  Take for example children with learning disabilities.  Abnormal neural response patterns have been recorded in children with various types of learning problems.  Older adults with and without hearing loss and individuals with auditory neuropathy or dys-synchrony also show abnormal neural response patterns.  Because CAEPs reflect experience-related change in neural activity, CAEPs are now being used to examine children with learning problems undergoing speech sound training and other forms of learning such as speech sound segregation and music training.
     Finally, CAEPs are being used to explore physiological correlates of various psychoacoustic processes such as gap detection, monaural and binaural time resolution, binaural release from masking, and auditory stream segregation.  For the most part, these studies have shown that P1-N1-P2 and MMN responses can successfully index auditory temporal resolution thresholds using methods that are independent of attention.
     Cortical auditory evoked potentials (CAEPs) are brain responses generated in or near the auditory cortex that are evoked by the presentation of auditory stimuli. 
     The P1-N1-P2 complex signals the arrival of stimulus information to the auditory cortex and the initiation of cortical sound processing.
     The MMN provides an index of pre-attentive sound discrimination.
Taken together, these CAEPs provide a tool for tapping different stages of neural processing of sound within the auditory system, and current research is exploring exciting applications for the assessment and remediation of hearing loss.          





Mismatch negativity:
     The mismatch negativity response is a negative wave elicited by a combination of standard and deviant stimuli and occurring in the latency region of about 100 to 300 ms. The MMN is elicited with an oddball paradigm in which infrequently occurring deviant sounds are embedded in a series of frequently occurring standard sounds. The MMN response can be evoked by a vast array of sounds, ranging from simple tones to complex patterns of acoustic features to speech stimuli. It is best detected when scalp electrodes are placed over the frontocentral region of the head. Generation of the MMN is a reflection of several simultaneous or sequential fundamental brain processes, including preattentive analysis of features of sound (frequency, intensity, duration, speech cues), extraction or derivation of the invariance within multiple acoustic stimuli, a sensory memory trace in the auditory modality that represents the sound stimulation, and ongoing comparison of the invariant (standard) stimuli versus the different (deviant) stimulus. In order for the MMN system to recognize that a deviant is different from the standard, there must be a memory of the standard. Naatanen (1992) considered the relevant memory to be auditory sensory memory.
However, 2 levels of representation seem to be involved,
-Representation of the recent acoustic past (sensory memory)
-Representation of regularities or invariances extracted from what is available in sensory memory.
Sensory memory provides the raw data from which invariant aspects of the stimuli are extracted. A minimum of two stimuli must be presented in order for the system to establish a representation of invariance for frequency. Once a representation of invariance has been established, a stimulus that violates that representation can elicit an MMN.

The MMN response is often categorized as an event-related potential, a cognitive evoked response, or a discriminative cortical evoked response, in contrast to the so-called obligatory earlier responses (ECochG, ABR, AMLR). The MMN can be used to assess auditory processing in very young children and in other patient populations that are challenging to assess with behavioral audiologic techniques due to deficits in state of arousal, motivation, attention, cognition, and other subject variables. The MMN response is best recorded in a passive condition with the patient paying no attention to the stimuli (reading, watching a video, or even sleeping). The MMN can be recorded from infants in deep sleep and from patients in coma (Kane, Curry, Rowlands, Manara, Lewis, Moss, Cummins & Butler, 1996). Another feature of the MMN response is the feasibility of eliciting it with very fine distinctions between the standard and the target stimuli, such as acoustic cues within speech signals. The discrimination of sounds as reflected by the MMN response is equivalent to behavioral discrimination of just noticeable differences in the features of sound. The MMN is an objective reflection or index of automatic central auditory processing. The MMN reflects a preconscious or perceptual detection of a change in acoustic stimulation, even a slight change that is barely greater than the perceptual and behavioral discrimination threshold. The MMN response reflects information processing that precedes, and may be a prerequisite for, behavioral (conscious) attentive processing of auditory changes in the environment.
The MMN appears as an enhanced negativity in response to the deviant sound relative to that obtained in response to the standard sound. The figure shows the P1-N1-P2 complex obtained in an adult with normal hearing to the speech sound /i/ presented as a standard stimulus (solid line). The presence of this complex indicates that the auditory system has detected the /i/ sound at a cortical level. The deviant waveform can be seen as an enlarged N1, a second negative peak, or an attenuation of P2 compared to the standard waveform. The MMN is best observed by subtracting responses to sounds presented as standards from the responses to deviants. The amplitude is less than 2 µV. Amplitude and latency of the MMN reflect the amount of deviance, with larger amplitudes and shorter latencies for larger acoustic deviations.
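The difference-waveform derivation described above amounts to subtracting the averaged standard response from the averaged deviant response; a minimal sketch in which the array contents, sampling rate, and the 100-300 ms search window are placeholders or assumptions:

```python
import numpy as np

fs = 500  # sampling rate in Hz (assumed)

# Averaged responses (uV) to the standard and the deviant stimuli over the
# analysis window; placeholder arrays stand in for real averaged data.
standard_avg = np.zeros(400)
deviant_avg = np.zeros(400)

# MMN difference waveform: deviant minus standard.
difference_wave = deviant_avg - standard_avg

# MMN peak: the largest negativity within roughly the 100-300 ms latency region.
win = slice(int(0.100 * fs), int(0.300 * fs))
mmn_amplitude = difference_wave[win].min()
mmn_latency_ms = (np.argmin(difference_wave[win]) + win.start) / fs * 1000
```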
Anatomic origins of the MMN response include the supratemporal plane and posterior regions of the auditory cortex and regions within the frontal lobe.
 Differentiation of the MMN versus other responses:
     The MMN response may be influenced by, related to, or even confused with and contaminated by other AERs within the same general latency region, such as N1, N2, P300 and N400 waves, or components of these waves. Any change in the acoustic stimulus, including the onset and the offset, may either evoke an N1 or a MMN response or both.
     The N1 wave and the MMN response are closely related, both in terms of their elicitation by stimuli and their latencies. The MMN emerges as a distinct wave when a different stimulus follows a sequence of similar or identical stimuli. Picton et al. (2000) identify five findings that differentiate the N1 from the MMN wave:
1.    The MMN will be elicited by any change between the standard and the deviant stimuli, even if the change is a decrease in intensity. The typical effect of decreased stimulus intensity on auditory evoked responses, including the N1 wave, is a reduction in response amplitude. Similarly, the MMN can be elicited by a change in stimulus duration, whereas the N1 is not affected by changes in duration.
2.    The MMN response is relatively unaffected by ISI and may actually begin to decrease in amplitude or disappear for long intervals (10 seconds) between deviant stimuli. In contrast, N1 amplitude decreases as the rate of stimulation increases.
3.    The MMN is effectively evoked by small (fine) differences between the standard and the deviant stimuli, whereas the N1 is more likely to be generated by large differences.
4.    MMN latency varies with the difference between the standard and the deviant stimuli, while N1 latency remains unchanged.
5.    There are differences in neuroanatomic origins for the N1 wave versus the MMN response.
There are strategies to increase the likelihood of recording a pure MMN that is not contaminated by other auditory evoked responses superimposed within the same time frame.
1.    Calculation of the difference waveform by subtracting the response elicited by the standard stimuli from the response elicited by the deviant stimuli. Auditory evoked response waves elicited by and common to both sets of stimuli (e.g., N1 and P2) are eliminated or at least minimized with this process. However, deriving the MMN waveform with this technique introduces additional noise into the MMN and may therefore decrease the signal-to-noise ratio.
2.    Increase the rate of stimulation. The MMN is enhanced when the standard ISI is decreased. The amplitudes of the other waves will be reduced, but the MMN amplitude will remain essentially the same.
3.    Subject attention can be directed away from the deviant stimulus during MMN measurement.
4.    Use a small difference between the standard and the deviant stimuli.
Applications of the MMN response:
-Evaluation of speech perception
-Prelanguage function
-Auditory processing disorders in infants
-Objective documentation of neural plasticity with auditory, phonologic, and language intervention
-Assessment of benefit from hearing aids and cochlear implants in children
-Determination of the capacity or talent for music and learning foreign languages
-Diagnosis of a variety of psychoneurological disorders (schizophrenia, Alzheimer’s disease and other dementias, Parkinson’s disease)
-Prognosis of outcome in comatose patients.
Recording parameters of MMN:
Analysis time:
A lengthy time window is required for recording the MMN response. MMN latency values for various speech and nonspeech stimuli fall within the region of 100-300 ms. To encompass this rather broad wave, analysis times of 500 to 750 ms are commonly used in MMN measurement, with a prestimulus baseline period of 50 to 100 ms.
Electrode montage:
Electrode types and locations in MMN measurement are similar to those for other cortical evoked responses.
Non inverting: Fz, Cz, Pz, C3, C4, F3, F4, Fpz,
Inverting: nose tip or mastoid
Common: low forehead
 Studies of the MMN often employ 21, 30, or even more scalp electrodes, with noninverting and inverting electrodes on the scalp, or with inverting electrodes on the forehead, the earlobes, or each mastoid region.
Naatanen (1992) recommends use of the tip of the nose as the reference instead of the ear or mastoid, because the phase shift at parasagittal and temporal derivations makes it easier to identify the MMN topographically and to distinguish it from the N2b waveform, which has a different scalp distribution.
Lang et al. (1995) recommend seven noninverting electrode sites (Cz, C3, C4, Fz, F3, and Fpz). Different locations (e.g., Fpz on the forehead) are used for the ground electrode. Electrodes located around the eyes (above and below and at each side of one eye) are mandatory to detect horizontal and vertical eye movement during blinking.
Filter settings and averaging:
Filtering is an effective measure to enhance the SNR; it optimally eliminates energy above and below the spectrum of the MMN response. The spectrum of the MMN is dominated by low frequencies, and a band-pass of 0.1 to 30 Hz is usually employed in MMN measurement.
A notch filter of 50/60 Hz can be used.
Averaging more deviant responses will decrease the noise level in the recording.
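The benefit of averaging more deviant trials follows from the usual 1/√N reduction of residual noise; a small worked sketch, in which the single-sweep noise level and deviant probability are assumed values for illustration only:

```python
import numpy as np

single_trial_noise_uv = 20.0   # assumed RMS background EEG per sweep
p_deviant = 0.15               # assumed deviant probability in the oddball sequence

for n_trials in (500, 1000, 2000):
    n_deviants = int(n_trials * p_deviant)
    residual = single_trial_noise_uv / np.sqrt(n_deviants)
    print(f"{n_trials} trials -> {n_deviants} deviants -> ~{residual:.1f} uV residual noise")
# Because the MMN itself is typically under 2 uV, many deviant trials are needed
# before the residual noise drops comfortably below the response amplitude.
```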
Stimulus parameters:
The MMN is elicited by any discriminable change in the auditory stimulation, such as a change in frequency, intensity, duration, or rise time, or when a constant ISI is occasionally shortened. The MMN can be elicited by differences between the standard and deviant stimuli that approach the behavioral just-noticeable difference for the sounds. The MMN is also elicited by changes in more complex stimuli, such as speech stimuli, rhythmic patterns, and complex spatiotemporal patterns (Naatanen, 1992; Naatanen & Picton, 1987; Picton et al., 2000). The larger the acoustic difference, the earlier and larger the MMN.
Stimulus type: Tones: The MMN is often recorded with a straightforward frequency difference between the standard and deviant stimuli, e.g., a 1000 Hz standard versus a 1100 Hz deviant. The memory trace is formed by repetitive presentation of a tone at one frequency (1000 Hz). This standard stimulus evokes a waveform that consists of an N1-P2 complex. A second tone at another frequency (1100 Hz) generates a negative wave reflecting the brain’s detection of a change in stimulation (neuronal mismatch). The actual MMN response (difference waveform) is usually derived by subtracting the waveform evoked by the standard stimulus from the waveform evoked by the deviant stimuli.
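A sketch of how an oddball sequence of this kind (frequent 1000 Hz standards, rare 1100 Hz deviants) might be assembled; the deviant probability, trial count, and the rule that two deviants never occur in a row are illustrative choices, not prescribed values:

```python
import numpy as np

rng = np.random.default_rng(1)

def oddball_sequence(n_trials=500, p_deviant=0.15):
    """Return a list of stimulus frequencies for one oddball run.
    1000 Hz = standard, 1100 Hz = deviant; no two deviants in a row (assumed rule)."""
    seq = []
    prev_deviant = True  # force the run to begin with standards
    for _ in range(n_trials):
        if not prev_deviant and rng.random() < p_deviant:
            seq.append(1100)
            prev_deviant = True
        else:
            seq.append(1000)
            prev_deviant = False
    return seq

sequence = oddball_sequence()
print(sequence.count(1100) / len(sequence))  # realized deviant proportion
```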
Csepe (1998) studied the effect of frequency deviance from 25% to 5% on generation of the MMN and found that the smaller the difference, the larger the latency range and the larger the cortical areas involved.
Amplitude enhancement was seen with increasing frequency deviance, reaching a plateau at a deviance of 15% (Csepe & Milner, 1997). No MMN was elicited for frequency deviances of less than 5% (using a 5 ms stimulus duration).
Paavilainen et al. (1997) elicited an MMN for a frequency deviance of even 5% (using a 30 ms stimulus duration).
Sams, Paavilainen, Alho, and Naatanen (1985) studied the MMN for frequency deviances. The standard stimulus frequency was 1000 Hz and the deviants were 1002, 1004, 1008, 1016, and 1032 Hz. They found that the MMN was elicited at deviances equivalent to the behavioral discrimination threshold, with a large MMN for deviances above threshold (1016 and 1032 Hz) and a small MMN at threshold (1008 Hz).
Naatanen (1995) and Lang (1995) found that when the difference between the standard and deviant frequencies is small, the MMN amplitude is low and the SNR poor. They also reported that with a large difference in frequency, the neurons in primary auditory cortex activated by the deviants are different from those activated by the standard.
Kraus et al. (1993) obtained an MMN for speech stimuli (variants of /da/) that differed in the onset frequencies of F2 and F3. A significant MMN was present in both adults and children, with similar MMN duration and peak latency; children showed a larger MMN magnitude (MMN amplitude and MMN area).
Speech signals:
The MMN can be evoked by different units of speech, including, at one extreme, acoustic cues within a single speech sound, and progressing to phonological units (phonemes), larger speech segments (words), and even prosodic and semantic (grammatical) features of speech and language.
Many studies have demonstrated the MMN to speech stimuli, including vowels (Aaltonen et al., 1987; Luntenen et al., 1995) and CV syllables (Kraus et al., 1992; Merlin et al., 1999). The MMN is an objective, "preconscious" measure of speech perception and a technique for studying in humans the neurobiologic mechanisms and processes that take place during speech perception.
Differences between categorically distinct speech sounds (/da/ vs. /ga/) are often used to elicit the MMN response (Kraus et al., 1992; Kraus et al., 1996; Kraus & Nicol, 2003).
Persons highly adept at behaviorally distinguishing /da/ from /ga/ (just perceptibly different variants) yield correspondingly clear MMN responses to the same speech-sound differences, whereas no reliable MMN response is detected in poor /da/-/ga/ perceivers (Kraus, McGee, & Koch, 1998).
Three factors influence the MMN response to speech stimuli (Kraus & Cheour, 2000): (1) the physical properties of the stimulus, e.g., frequency, duration, intensity, and changes in these properties over time; (2) the acoustic context, e.g., the sounds and acoustic conditions surrounding the speech stimulus; and (3) the perceptual and linguistic experience of the subject.
Pulvermüller et al. (2001) reported that MMN amplitude was larger when evoked by a syllable that completed a real word rather than a pseudoword.
Sandeep (2003) examined the effect of stimulus type on the amplitude and latency of the MMN, along with other variables such as ear of stimulation and electrode site.
When speech was used as the stimulus, greater amplitudes were obtained than with a tonal stimulus. With left-ear presentation, the greater amplitudes for speech were observed only at the Cz and TR sites. Shorter latencies with the speech stimuli /da/ and /ga/ were observed at all sites.
Speech stimuli elicit larger MMN responses when presented to the left ear than to the right ear (Csepe, 1995; Jaramillo, Paavilainen, & Naatanen, 2000). The left-hemisphere laterality of the MMN with speech signals appears to be reduced, eliminated, or even reversed when speech stimuli are presented in background noise (Shtyrov et al., 1998).
MMN responses elicited by non-native speech syllables are initially symmetric but become especially enhanced over the left hemisphere following training (Tremblay, 1996).
Duration:
Duration may be the acoustic feature distinguishing standard and deviant stimuli. Sreevidya (2001) determined the threshold for duration discrimination using a 5 ms tone burst (1000 Hz) as the standard. Results revealed that adults could discriminate a duration deviance of 3 ms, i.e., an MMN was elicited for this deviance, in accordance with psychophysical measures. Children (8-12 yrs) could discriminate 5 ms deviances (i.e., this deviance produced a good MMN).
Korpilahti & Lang (1994) elicited a duration MMN in 12 normal children (7-13 yrs) using a 1000 Hz tone at 70 dB SPL; the standard was 150 ms and the deviants were 110 ms and 50 ms. Results indicated that peak amplitude increased as the deviance increased, and there was a negative correlation between age and peak latency.
Joutsiniemi et al. (1998) found a better MMN for greater duration deviance in normal subjects.
Paavilainen et al. (1993) measured MMN amplitude for a frequency deviation as a function of stimulus duration. Stimuli were presented in separate blocks with durations of 4, 10, 90, 100, or 300 ms, and the deviants were higher in frequency than the standards. Results revealed that the minimum stimulus duration for efficient coding of the critical frequency information is on the order of 20-30 ms.
Intensity deviance:
The MMN can be elicited by both intensity increments and decrements. Naatanen (1992) recorded the MMN using 1000 Hz standards and varied the intensity of the deviants; an MMN was recorded for intensity changes as small as 3 dB (smaller changes were not assessed). In general, as the amount of intensity change between the standard and deviant stimuli increases, MMN amplitude increases and latency decreases; as stimulus intensity decreases, MMN amplitude decreases and latency increases. Similar results have been reported by Naatanen & Picton (1987).
Solo et al. (1999) evaluated the intensity dependence of automatic frequency-change detection using the MMN (standard: 1000 Hz; deviant: 1141 Hz; intensities: 40, 50, 60, 70, and 80 dB nHL). The results indicated that MMN mean amplitude increased with increasing intensity, and MMN onset latency shortened at 60 to 70 dB nHL.
Jose & Naatanen (1993) highlighted the contribution of attention to the intensity-deviance MMN: the MMN may be attenuated in the absence of attention, suggesting that responses evoked by intensity deviance are vulnerable to attentional state.
Schröger (1996) and Stapells (1998) reported that, if the degree of deviance is held constant, MMN amplitude decreases and latency increases as the stimulus level is lowered. According to Stapells and colleagues (1998), increasing the degree of deviance for these lower-intensity stimuli restores MMN amplitude and latency.
Interstimulus interval (ISI):
The ISI influences the representation of the standard stimulus and the detection of a change from the one stored in sensory memory (Naatanen, 1992). In general, the amplitude of the MMN decreases as the ISI increases, and the MMN is absent with very long ISIs. It is believed that longer intervals between stimuli allow the sensory-memory trace of the standards to decay, which in turn results in lower amplitudes. Typically an ISI of 300-500 ms is used in MMN recording (Lang et al., 1995), although clear MMNs have been obtained with ISIs as short as 60 ms (Naatanen, 1992). According to Naatanen & Mantysalo (1987) and Sams et al. (1993), the memory trace required for the MMN lasts for about 10 s; that is, an ISI of more than 10 s rarely, if ever, elicits an MMN.
 Several timing parameters may determine the response in ISI experiments:
a.      The interval between the standard stimuli
b.      The interval between the deviant and preceding standard
c.      The probability that the deviant stimulus will occur on any particular trial.
d.      The number of intervening standard stimuli between the deviants.
Manipulating one of these variables can indirectly affect the others. The most common experimental manipulation, changing the general ISI (standard-to-standard and standard-to-deviant), will change both the time between the last standard and the deviant and the time between the preceding deviant and the present deviant.
For the MMN, increasing the stimulus rate and decreasing the ISI help in identifying the MMN and in differentiating it from other cortical components: as the ISI is shortened, N1 amplitude decreases while the MMN remains unchanged. MMN amplitude decreases as the interval between the deviant and the preceding standard stimulus increases, at least for simple stimuli (Naatanen, 1992; Naatanen et al., 1987). When this interval approaches 10 seconds, the MMN may not be generated at all; presumably this phenomenon is related to the duration of the sensory-memory trace produced by the repetitive standard stimuli. In addition, MMN amplitude increases directly with the interval between deviants; however, lengthening the interval between deviant stimuli also lowers deviant probability, which is itself a possible factor in the amplitude change.
Number of stimuli:
The MMN response is buried in the ongoing EEG background activity, which is not fully cancelled by averaging. In theory, more than 10,000 deviant responses would have to be averaged to resolve a hypothetical 0.3 µV MMN. In practice, the duration of a recording session is limited, and it is seldom possible to collect more than 200 to 300 deviant and 1200 to 2700 standard responses (Lang et al., 1995).
 For practical purposes, deviants comprising about one quarter to one half of the total number of stimuli are averaged to obtain a better signal-to-noise ratio (Picton, Linden, Hamel, & Maru, 1983).
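The back-of-the-envelope sketch below illustrates why so many trials are needed: averaged residual noise falls roughly as the single-trial noise divided by the square root of the number of trials. Only the 0.3 µV MMN amplitude comes from the text; the 10 µV single-trial background-EEG figure is an assumption for illustration.

```python
# Hypothetical arithmetic: residual noise after averaging N trials is roughly
# single_trial_noise / sqrt(N), so SNR grows with the square root of N.
import math

single_trial_noise_uv = 10.0      # assumed RMS background EEG per trial
mmn_amplitude_uv = 0.3            # hypothetical MMN amplitude from the text

for n_trials in (100, 300, 1000, 10000):
    residual = single_trial_noise_uv / math.sqrt(n_trials)
    print(f"N={n_trials:>6}: residual noise ~ {residual:.2f} uV, "
          f"SNR ~ {mmn_amplitude_uv / residual:.2f}")
```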
Rate of stimulation:
If simple stimuli are used, MMN amplitude increases when the ISI is shortened, provided the intervals between deviants remain the same (Naatanen et al., 1987), whereas the N1 and P3b components decrease rapidly in amplitude as the ISI decreases (Mantysalo et al., 1987; Sams et al., 1993). This is a powerful feature for dissociating the MMN from the N1 response. Increasing the repetition rate also shortens the recording time, making the measurement more efficient.
Reddy, Iyengar, & Vanaja (2001) used rates of 1.1, 2.7, and 3.1 Hz with tones of 250 Hz, 1 kHz, and 6 kHz; the standard stimuli were presented at 45 dB and the deviant stimuli at 40 dB. Results revealed that MMN amplitude increased as the repetition rate increased, but no such trend was seen for latency.
An ISI of about 300 ms has been shown to be appropriate for MMN applications using simple tonal or vowel stimuli. When speech stimuli are used, the MMN may deteriorate if the ISI is too short (Lang et al., 1995).
Stimulus duration is a contributing factor when choosing the stimulation rate: the longer the duration, the slower the rate of presentation should be. Lang et al. recommended an ISI of 400-500 ms for the pediatric population.
Probability of the deviant stimuli:
MMN amplitude increases as the probability of the deviant decreases (Javitt, Grochowski, Shelley, & Ritter, 1998). The MMN is also larger as the number of standards preceding the deviant increases, and if two deviants are presented in succession, the MMN to the second deviant is smaller. Therefore, when setting up a stimulus sequence for an MMN experiment, it is important to ensure that two or more deviants are never presented in succession and that the first standard following a deviant is not included in the average (see the sketch below). As the number of standard stimuli before the deviant increases, the representation of the standard strengthens, so a greater mismatch occurs and a larger-amplitude MMN results.
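A minimal sketch of generating a stimulus sequence under these constraints; the sequence length and the 15% deviant probability are illustrative assumptions, not recommended values.

```python
# Hypothetical oddball sequence builder: fixed deviant probability, never two
# deviants in succession, and the first standard after a deviant is flagged so
# it can be excluded from the standard average.
import random

def oddball_sequence(n_stimuli=1000, p_deviant=0.15, seed=1):
    rng = random.Random(seed)
    seq, prev_was_deviant = [], False
    for _ in range(n_stimuli):
        is_deviant = (not prev_was_deviant) and (rng.random() < p_deviant)
        exclude = prev_was_deviant and not is_deviant   # first standard after a deviant
        seq.append({"deviant": is_deviant, "exclude_from_average": exclude})
        prev_was_deviant = is_deviant
    return seq

# Example: seq = oddball_sequence() gives roughly 15% deviants, none back to back.
```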
Subject factors:
Subject state:
The MMN can be elicited in sleep. MMN amplitude decreases with increasing sleepiness (Lang et al., 1985), and during sleep the MMN is small; it can be recorded in stage 2 and REM sleep.
The MMN is relatively independent of attention. It can be recorded in sleep, can be obtained even when subjects ignore the stimuli or are engaged in a difficult task unrelated to the stimulus, and can be obtained in comatose patients.
The MMN can be recorded when subjects attend to the stimuli, but it is difficult to measure in this condition because of overlap from the N2b component. Hence it is recommended that the MMN be recorded while the subject ignores the stimuli, e.g., while reading or watching a silent captioned video.
Maturation and aging:
Maturational changes in the MMN are subtler than those in the P1-N1-P2 complex. The MMN may show small decreases in latency during infancy and childhood; however, these changes have not been consistently demonstrated. Inconsistencies in the literature may arise when results are drawn from different electrode sites, because MMN topography changes with maturation through at least 11 years of age.
In adults, the amplitude of the MMN has been reported to be smaller in older than in younger adults. However, these age effects may depend on stimulus presentation factors, because age related differences are not reported when short ISIs are used.
Cheour et al. (1998) compared the MMN in preterm and full-term infants (3 months). The amplitudes for the infants resembled those of adults, and there was no significant difference between the groups; MMN latency decreased significantly with age.
Pang et al. (1998) measured the MMN in 15 normal 8-month-old infants and compared the results with adult data across different electrode sites. The infants showed a clear MMN only at the C3 and T3 sites, whereas in adults it was present at Fz, Cz, C3, C4, and P4 and largest at Cz and C3; the authors concluded that there is a possible maturational change in the MMN.
Ponton et al. (1998) reported that the MMN to a durational change is present by 5-6 yrs of age. In adults, MMN amplitude has been reported to be smaller in older adults than in younger adults (Pekkonen et al., 1993; Gaeta et al., 1998).
Gender:
MMN latency may be longer and amplitude larger in females than in males, although these findings are inconsistent.
Drugs:
Ketamine: reduces the MMN response amplitude.
Lorazepam: no effect in patients with schizophrenia; reduced signal detection.
Psilocybin: no effect on the MMN response.
Haloperidol: no influence on MMN response amplitude.
Scopolamine: no clear direct effect on the MMN response.
Chlorpheniramine: no effect on MMN response amplitude.

Memory:
Sensory memory of the features of the invariant (standard) stimulus is a prerequisite for recording the MMN response. The presence of the MMN implies that the deviant stimulus generated a neural response due to the detection of a change in the incoming information relative to the information stored in sensory memory.


Guidelines for an auditory mismatch negativity response test protocol:
Stimulus parameters:
Transducer: ER-3A insert earphones. Supra-aural earphones are acceptable, but insert earphones are more comfortable for longer AER recording sessions and attenuate background sound in the test setting; since the inserts are disposable, their use also contributes to infection control.
Type: tones or speech. A variety of differences between standard and deviant tonal stimuli are effective for evoking an MMN response, including frequency, intensity, duration, and source in space. The MMN can also be elicited with speech signals (natural or synthetic) such as /da/ and /pa/; for example, a voice-onset-time difference can be used.
Duration (rise/fall): ~10 ms. Longer onset times are feasible for signals used to elicit the MMN.
P300 RESPONSE
        P300 is an event-related or endogenous evoked response identified in the 1960s (David, 1965). The P300 is a component within an extended ALR time frame, recorded using an oddball paradigm (standard and target signals). The target signal produces a positive peak at a latency of about 300 ms, also called P3. A missing, rare, or deviant signal can elicit the P300 response. It is often described as a cognitive evoked response because it depends on detection of the difference between frequent and rare signals.
        Diverse regions of the brain contribute to the generation of the P300, including subcortical structures (the hippocampus, other structures within the limbic system, and the thalamus), auditory regions of the cortex, and the frontal lobe.
Nomenclature:
        The morphology of endogenous response waveforms depends on the details of the test paradigm and on subtle variations in the subject's attention to the stimuli. Anticipation of the stimulus and processing time affect the amplitude and latency of the P300.

Terminology
Stimulus
*Oddball paradigm: there are at least two stimuli. One stimulus is presented frequently and the other is presented infrequently.
*Standard: the frequent signal in the oddball paradigm. Standard stimuli are predictable, accounting for 80% of the stimuli presented in the oddball paradigm.
*Target: the infrequent, unpredictable, rare stimuli, accounting for 20% of stimuli presented.
Response
*Novelty P3: this is elicited by highly deviant stimuli.
*P3a (passive): a shorter-latency P3 that is evoked independently of attention to the target stimuli.
*P3b: the conventional P300 response that appears about 300 ms after presentation of the rare stimulus in the oddball paradigm.
TEST PROTOCOLS AND PROCEDURES:
STIMULUS PARAMETERS:
        The oddball stimulus paradigm is typically used to elicit the P300 response: two different stimuli are presented in random order, with one presented much less frequently than the other.
        The frequent, predictable, non-target signal is referred to as the standard stimulus. The infrequent, relatively rare, unpredictable stimulus is referred to as the target stimulus. In the auditory P300 paradigm, the standard and rare stimuli are typically distinguished by a difference in frequency.
        The subject is typically required to listen carefully for the target stimulus and to respond to it by pressing a button or by silently counting the number of target stimulus presentations. This paradigm characteristically elicits a positive peak in the latency region of 300 ms (the P300 response), increased negativity of the N2 component, and a slow-wave component. As the difficulty of detecting the rare stimulus increases, N2-P3 latencies increase and amplitudes decrease. No auditory ERP is elicited if the frequency difference between the target and standard stimuli is smaller than the discrimination threshold.
        The P300 response can be elicited with a one-, two-, or three-stimulus paradigm.
*P300 latency is shortest for the single-stimulus paradigm in comparison with the other two, as the task is easier and there is less information to process.
*Changes in probability and interstimulus interval for the standard vs. target stimuli yield similar effects on the P300 response elicited by the three different stimulus paradigms.

Single stimulus paradigm
*Measurement paradigm includes target stimulus but no standard stimulus (McCarthy, 1992).
*The place within the sequence of stimuli typically occupied by the standard stimuli is instead replaced by silence.
*Requires simple instrumentation and subject task.
*By requiring a response to the stimulus rather than passive presentation, the subject is obliged to allocate attentional resources to the stimulus, so that a robust P3 is produced consistently, in a fashion similar to the oddball paradigm (Polich et al., 1994).
*Peak latencies and amplitudes for the single-stimulus vs. oddball paradigms are similar for target probabilities from 0.2 to 0.8 and for different ISIs (2-6 s).
*The single-stimulus task can provide the same information as the typical two-stimulus paradigm, and because the task is less demanding it can be performed easily by young children and by subjects with dementia or intellectual disability (Katayama & Polich, 1996).

Multiple stimulus paradigm
*The three-stimulus paradigm yields information on automatic cognitive processing not available from one or two stimulus paradigm.
*There are two standard stimuli and one target stimulus.
*The subject is required to attend to and identify the target stimulus. A decrease in the frequency of presentation and predictability of occurrence for one of the standard stimuli elicits a P300 response, even though the stimulus whose frequency of occurrence changes is not the designated target.
*Similar P300 responses may be evoked by the official target stimulus and by the infrequent standard stimulus; therefore, the response elicited by the infrequent standard stimulus is thought to reflect automatic cognitive processes (Katayama & Polich, 1996).
*The three-stimulus paradigm can include a novel or alerting stimulus (noise or dog bark) as a distracter within the sequence of standard and target. The resulting P300 waveform consists of a P3a component elicited by the novel distracter stimulus and the conventional P300 (P3b) elicited by the target.
*P3a component has shorter latency, larger amplitude compared to P3b for frontal and central electrode site and rapid habituation. P3b has large amplitude for parietal site.
*Lesions within the frontal lobe and/or hippocampus are associated with abnormalities of the P3a component (Knight, 1984), as confirmed by neuroradiologic findings.
*Intact temporal-parietal cortical function is important for generation of the P3b.
    Katayama, Polich (1996). P300 from one-, two-, and three-stimulus auditory paradigms.
    P300 event-related potentials (ERPs) from 1-, 2-, and 3-tone oddball paradigms were elicited and compared from the same subjects. In the 1-tone paradigm, only a target tone was presented, with the standard tone replaced by silence. The 2-tone paradigm was a typical oddball task, wherein the target and standard tones were presented every 2.0 s in a random order with a target-tone probability of 0.10. In the 3-tone paradigm, in addition to the infrequent target (p = 0.10) and the frequent standard (p = 0.80), infrequent non-target tones (p = 0.10) also were presented. The subject responded with a button press only to the target stimulus in each task. The target stimulus in each paradigm elicited a P300 component with a parietal maximum distribution. No P300 amplitude differences were found among paradigms, although peak latency from the 1-tone paradigm was shorter than those from the other two tasks. Both P300 peak amplitude and latency demonstrated strong positive correlations between each pair of paradigms. The results suggest that P300 was produced by the same neural and cognitive mechanisms across tasks. The possible utility of each paradigm in clinical testing is discussed.

Stimulus probability
*Amplitude of the P300 response decreases as the probability of the target stimulus increases (Duncan-Johnson, 1977), whereas the effect of target stimulus probability on P300 latency is minimal.
*There is little change in P300 response when probability of target stimulus is decreased below 20%.

Stimulus repetition
*Decreased P300 amplitude with repeated signal presentation is attributed to habituation and is related to the response task, e.g., counting the rare stimuli vs. pressing a button upon perceiving a rare stimulus (Lew & Polich, 1993).
*Habituation or fatigue in the oddball paradigm is apparent when the number of target stimulus repetitions exceeds 80-100.
*The interstimulus interval (ISI), intertarget interval (ITI), and interval between blocks of stimulation trains (IBI) are the temporal measurement factors that influence P300 habituation.
    Schwent, Hillyard, & Galambos (1976) studied, in a selective attention task, twelve subjects who received random sequences of 800 and 1500 c/sec tone pips in their right and left ears, respectively. They were instructed to attend to one channel (ear) of tones, to ignore the other, and to press a button whenever occasional "targets", tones of a slightly higher pitch, were detected in the attended ear. In separate experimental conditions the randomized interstimulus intervals (ISIs) were "short" (averaging 350 msec), "medium" (960 ms) and "long" (1920 ms). The N1 component of the auditory evoked potential (latency 80-130 msec) was found to be enlarged to all stimuli in an attended channel (both targets and non-targets) but only in the short ISI condition. Thus, a high "information load" appears to be a prerequisite for producing channel-selective enhancement of the N1 wave; this high load condition was also associated with the most accurate target detectability scores (d'). The pattern of attention-related effects on N1 was dissociated from the pattern displayed by the subsequent P3 wave (300-450 msec), substantiating the view that the two waves are related to different modes of selective attention.

*Larger P300 amplitudes and less habituation are associated with longer values of ISI, ITI (>13 s), and IBI (>2 min).

Stimulus types
Tones
*In the oddball paradigm, standard and rare stimuli differ in frequency.
*In applying a stimulus-frequency-based oddball paradigm clinically, the patient's hearing sensitivity should be comparable at the two frequencies.
*Hearing-related changes in P300 latency or amplitude might be expected if hearing sensitivity is depressed in the frequency region of the target stimulus.
*Negative or positive ERP components can also be recorded when the target stimulus is omitted.
*Amplitude of P300 increases directly with the frequency difference between the standard and target stimuli.

*Pollock & Schneider (1989) studied the effects of tone stimulus frequency on late positive component (P3) activity in normal elderly subjects.
The effects of tone stimulus frequency on the auditory P3 of normal younger (n = 16) and elderly (n = 23) subjects were assessed using latency and amplitude measures. Because presbycusis is prevalent among the elderly, it was hypothesized that P3s of elderly subjects elicited by a 2000 Hz target tone would show longer latencies and smaller amplitudes than those elicited by a 500 Hz target tone. As hypothesized, the results indicated that elderly subjects showed prolonged P3 latencies under a 2000 Hz, as compared to 500 Hz target tone condition, whereas younger subjects did not show such latency differences. The P3 amplitudes of the younger and elderly groups were not, however, affected by target tone frequency.
Speech sounds
*The P300 response can be elicited during REM sleep with frequent vs. rare speech stimuli differing in acoustic characteristics, using the oddball paradigm and probability of occurrence (Bastuji, Perrin, & Garcia-Larrea, 2002).
*The relevance of the speech stimuli influences the P300 response.
*With speech stimuli in a passive, sleeping recording condition, P300 amplitudes are larger when elicited with a personally significant word than with other, irrelevant words (Perrin et al., 2000).
*Measurement of the P300 is enhanced or facilitated by first familiarizing the subject with the rare speech stimuli (Bastuji, 1995).
*Syntactically anomalous words presented in an oddball paradigm generate a positivity in the 600 ms region (Osterhout et al., 2002).
*A semantic difference between the standard and target stimuli evokes a large positive peak in the 500 to 800 ms region; the response is recorded at parietal and central scalp electrodes but not over frontal areas (Kotchoubey & Lange, 2001).

Sound environment
*Distinct differences in the P300 response are found when recordings are made with the stimulus presented in the silent environment vs. in background noise.
*The nature of the background sound, e.g., white noise vs. culturally significant music, can influence the P300 (Arikan et al., 1999).
*When the P300 is recorded with frequent vs. rare stimuli presented in competing noise, amplitude is usually reduced directly as a function of the SNR.

Frequency
*Stimulus frequency can be considered in absolute and relative terms.
*The P300 can be recorded with two 500 Hz tone bursts that differ in duration or intensity.
*The P300 may be elicited by a relative difference in stimulus frequency.
*P300 amplitude is smaller and latency longer for low vs. high stimulus frequencies (Sugg & Polich, 1995).

Intensity
*P300 amplitude increases and latency decreases as stimulus intensity increases, for both standard and target stimuli (Ryan & Polich, 1993).
*Intensity difference between rare vs. frequent stimuli can elicit P300

Rate & ISI
*The P300 is optimally recorded with slow stimulus rates and long ISIs.
*Rates of 1 or 0.5 per second are typical.
*Amplitude of P300 decreases as the stimulus rate increases (Polich, 1986)

ACQUISITION PARAMETERS
Analysis time
*An analysis time of about 450-500 ms is required.
*The response consists entirely of low-frequency (less than 30-40 Hz) energy.
*Sweeps – 250 to 500



Electrode location
*The P300 response can be reliably recorded clinically with a non-inverting electrode located anywhere over the frontal portion of the scalp, especially on the midline.
*P300 has maximum amplitude with the frontal or central or parietal sites (Polich et al, 1997).
*P300 evoked by conventional tonal signals is largest for the midline parietal electrode site and decreases as the non-inverting electrode is placed at the more anterior sites, whereas P3a is maximal with electrodes at frontal central sites.
*With right-handed normal subjects, larger amplitude for the P300 wave is recorded with non-inverting electrode over the right vs. left hemisphere, especially for anterior location such as the frontal sites F3/4 and central sites C3/4 (Alexander et al, 1996).
*The inverting electrode for P300 measurement is usually located on the mastoid or earlobe ipsilateral to the stimulus, or linked-ear electrodes (A1/A2) are used. The common (ground) electrode is located on the forehead.
*To exclude eye blink artifact, vertical ocular activity is detected with a pair of electrodes above and below an eye and horizontal ocular activity is detected with electrodes located on each side. 

Filter settings
*Typical high pass filter settings in the region of 0.01 to 0.25Hz and low pass filter settings on the order of 30 to 50 Hz (Polich, 2004).
*The analog filter used is typically about 3 dB down at the cutoff, with a slope of 12 dB/octave.
Averaging
*Techniques applied for detecting a P300 elicited by a single stimulus include adaptive filtering (Woody, 1967) and correction techniques (Suwazono, 1994).
*The neural-network technique permits detection of a P300 elicited by a single stimulus with accuracy equivalent to visual inspection of the averaged P300 waveform by experienced observers.
*An adequate SNR, larger than about 0.8, is required (Nishida et al., 1997).

CLINICAL TEST PROTOCOL
Stimulus
*Transducer: ER-3A insert earphones
*Type: tone bursts or speech sounds
*Duration: rise/fall 10 ms; plateau 50 ms
*Rate: <1.1/sec
*Oddball paradigm: standard and target stimuli
*Signal difference: frequency, intensity, or duration
*Probability: target 20%
*Polarity: rarefaction
*Intensity: 70 dB HL
*Masking: 50 dB
Acquisition
*Amplification: 50,000
*Analysis time: 600 ms
*Prestimulus time: 100 ms
*Sweeps: <500
*Band-pass filter: 0.1 to 100 Hz
*Notch filter: none
*Electrodes: disc


ANALYSIS AND INTERPRETATION
        P300 latency and amplitude vary considerably as a function of signal factors, such as stimulus complexity, and subject factors, such as memory and speed of information processing.
Latency
·      The initial step in the calculation of P300 latency or timing is to identify a peak within the appropriate latency region. 
·      The P300 is a positive deflection in the waveform within the latency region of 250-400 ms, and up to 800 ms for infants.
·      Actual latency values for individual subjects differ because of intersubject variability, a host of measurement parameters (such as the test paradigm, passive vs. attending condition, stimulus intensity, relevance of the stimulus, and recording electrode site), and subject-related factors such as age, gender, and cognitive status.
·      Latency is calculated from the onset of the stimulus.
·      P300 wave is often broad and characterized by multiple positive peaks (Dalebout and Robey, 1997).
·      Dalebout and Robey (1997) described the intersection method: latency is determined as the midpoint of the wave, calculated from the latency point where the initial (leading) portion of the peak crosses the baseline to the corresponding latency point on the final (trailing) portion of the peak (see the sketch after this list).
·      With faster information processing including quick recognition and categorization of the stimulus, P300 latency is shorter.
·      P300 latency increases directly with the complexity of the processing task and with STM demands (Polich, Howard, Starr, 1983).
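A minimal sketch of that midpoint calculation, assuming the averaged waveform is available as a NumPy array; the 250-500 ms search window and the 0 µV baseline are illustrative assumptions.

```python
# Hypothetical illustration of an intersection-style latency measure: find where
# the broad P300 deflection crosses the baseline on its leading and trailing
# edges within an assumed window, and report the midpoint of those latencies.
import numpy as np

def p300_midpoint_latency(t, wave, t_min=0.250, t_max=0.500, baseline=0.0):
    """t in seconds, wave in microvolts."""
    win = (t >= t_min) & (t <= t_max)
    tw, ww = t[win], wave[win]
    above = ww > baseline
    if not above.any():
        return None                                       # no positive deflection found
    onset = tw[np.argmax(above)]                          # first sample above baseline
    offset = tw[len(above) - 1 - np.argmax(above[::-1])]  # last sample above baseline
    return (onset + offset) / 2.0                         # midpoint latency (s)
```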

Amplitude
·      Response amplitude is generally within the range of 10-20µV.
·      Two common approaches for determining P300 amplitude
-        Calculation of the amplitude in µV from the P300 peak to the trough that immediately precedes or follows the peak.
-        Calculation of the difference in µV from a prestimulus baseline period to the maximum positive voltage point (peak) of the P300 wave. This results in an absolute value that reflects the amplitude of a single peak and presumes that the prestimulus baseline activity is a stable reference point. (A sketch of both approaches follows this list.)
·      Two common reference points are the trough of the preceding component (usually N2) or the baseline (determined from the prestimulus baseline analysis time).
·      Calculation of the N2 –P3 amplitude excursion appeared to be more stable (Segalowitz and Barnes, 1993). It’s a hybrid measure of both N2 and P3 amplitude.
·      P300 amplitude is generally larger when recorded with electrodes located in frontal / central (Fz, CZ).
·      P300 amplitude is higher with subject anticipation of an impending or probable rare-signal presentation (Johnson, 1986).
·      The subject's degree of confidence in perceiving a difference between the standard and the target is associated with larger amplitude (Pritchard, 1981).
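A minimal sketch of the two amplitude conventions listed above (baseline-to-peak and peak-to-preceding-trough), assuming an averaged waveform in a NumPy array; the latency windows and variable names are illustrative assumptions rather than fixed clinical values.

```python
# Hypothetical illustration: baseline-to-peak and trough-to-peak (N2-P3) amplitudes.
import numpy as np

def p300_amplitudes(t, wave, baseline_window=(-0.1, 0.0), p3_window=(0.25, 0.5)):
    """t in seconds, wave in microvolts."""
    base = wave[(t >= baseline_window[0]) & (t < baseline_window[1])].mean()
    p3_mask = (t >= p3_window[0]) & (t <= p3_window[1])
    p3_idx = np.flatnonzero(p3_mask)[np.argmax(wave[p3_mask])]  # P300 peak sample
    n2_mask = (t >= 0.150) & (t < t[p3_idx])                    # assumed window for preceding trough
    trough = wave[n2_mask].min()                                # trough (e.g., N2) before the peak
    return {"baseline_to_peak_uv": wave[p3_idx] - base,
            "peak_to_trough_uv": wave[p3_idx] - trough}
```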

Polich et al (1997). P300 topography of amplitude/latency correlations.
The correlational association from 19 electrode sites between peak amplitude and latency for the P300 event-related brain potential (ERP) for n = 80 homogeneous subjects was assessed using a simple auditory discrimination task. The correlation strength varied systematically across scalp topography in different ways for the various ERP components. For the target stimuli, P3 amplitude and latency were negatively correlated and most tightly coupled over the frontal-central and right medial/lateral recording sites. In contrast, the N1 produced negative correlations that were strongest over the left and right central/lateral locations; P2 demonstrated a positive correlation that was strongest frontally and centrally; N2 demonstrated a positive correlation that was strongest over the central and parietal sites. ERPs from the standard stimuli produced generally similar patterns for the P3 and P2 components, with only weak or no reliable effects observed for the N1 and N2 potentials. Taken together, the findings suggest that analysis of amplitude/latency correlational relationships can provide information about ERP component generation.

Reliability
·      Most variability in the P300 response can be attributed or traced to manipulation of measurement conditions, and intersubject factors such as age, gender, and cognitive status.
·      Kinley and Kripal (1987) reported a high degree of test-retest reliability for the P300 response elicited by tonal signals from young, normal-hearing adults.
·      Dalebout and Robey (1997) reported greater P300 latency differences within subjects than from one subject to the next.

Non-pathologic subject factors
Age
Infancy & childhood
·      There is relatively less normative data on the P300 response in children.
·      Passive P300 can be used with infants & young children
·      From 6 yrs to late adolescence, P300 amplitude increases, latency decreases, and morphology improves (Squires & Hecox, 1983).
·      The relation between age (6 to 15 yrs) and latency is defined by an average change in P300 latency of approximately -19 ms/year (Fernandez, 1989).
·      P300 latency changes of 20ms/yr over the age range of 5 to 13yrs (Pearce et al, 1989).
·      Kurtzberg & colleagues (1989) described P300 response to speech sounds in awake infants with /ta/, /da/, /ba/ using oddball paradigm.
·      McIsaac H, Polich J. (1992). Event-related brain potentials (ERPs) using auditory stimuli were recorded from 5- to 10-month-old infants and normal young adults with a passive tone sequence paradigm. A P300 component was obtained which demonstrated the same central-parietal maximum scalp distribution for both subject groups. P3 amplitude was smaller and its peak latency longer for infants compared to those of the adults across all electrode sites. P3 measures remained stable across stimulus trials indicating that ERP habituation was not occurring. Evaluation of individual subjects suggests that the P3 can be elicited from infants with auditory stimuli in a manner similar to that from adults and may serve as a useful index for the assessment of cognitive function in infants.

Advancing age in adults
·      Average P300 latency over the age range of 10 to 90 yrs steadily increases from 300 to 450 ms (a change of 1 to 2 ms/yr), while amplitude decreases at an average rate of 0.2 µV/yr.
·      Polich (1997) found a positive correlation between the spectral power of the EEG and P300 amplitude, but not latency.
·      The strongest relation between age and changes in P300 latency and amplitude is found for Cz and Pz, not for Fz or lateral locations (Fjell & Walhovd, 2001).
·      Young adults have a pronounced parietal distribution; the P300 becomes more frontal with advancing age (Pfefferbaum et al., 1984).
·      Ford et al. (1979) used an event-related brain potential (ERP) technique developed by Hillyard et al. (1973) to test abilities to attenuate irrelevant stimuli and to detect target stimuli. Subjects, 12 healthy old (80.3 years) and 12 healthy young adults (22.0 years), heard 1500 Hz tones in one ear and 800 Hz tones in the other ear. Infrequently, the pitch of either tone was raised. During one run, infrequent tones in the right ear were targets, and in the other run those in the left ear were targets. Subjects counted targets. For both groups, an early component of the ERP (N1) was larger to tones in the attended ear than in the unattended ear, and a later component (P3) was largest to the target. This suggests that both groups can attenuate irrelevant stimuli and can use stimulus probability information in this task. That P3 was later for old subjects suggests that they take longer to decide stimulus relevance.

·      Vesco, Bone, Ryan, & Polich (1993) studied the P300 in young and elderly subjects: auditory frequency and intensity effects.
        Auditory event-related potentials (ERPs) were assessed in young and elderly subjects when stimulus intensity (40 vs. 60 dB SL) and standard/target tone frequency (250/500 Hz and 1000/2000 Hz) were manipulated to study the effects of these variables on the P3(00) and N1, P2 and N2 components. Auditory thresholds for each stimulus type were obtained, and the stimulus intensity was adjusted to effect perceptually equal intensities across conditions for each subject. Younger subjects demonstrated larger P3 amplitudes and shorter latencies than elderly subjects. The low frequency stimuli produced larger P3 amplitude and shorter latencies than the high frequency stimuli. Low intensity stimuli yielded somewhat smaller P3 amplitudes and longer peak latencies than high intensity stimulus tones. Although additional stimulus intensity and frequency effects were obtained for the N1, P2 and N2 components, these generally differed relatively little with subject age. The findings suggest that auditory stimulus parameters contribute to P3 measures, which are different for young compared to elderly subjects.

        Knott et al. (2003) studied the effects of stimulus modality and response mode on the P300 event-related potential differentiation of young and elderly adults. The P300 event-related brain potential (ERP) was examined in 14 young (20 - 29 years of age) and 16 elderly (60 - 82 years of age) adult subjects during the performance of auditory and visual discrimination tasks requiring silent counting or key pressing in response to target stimuli. P300 latencies were longer in elderly (vs. young) adults and in visual (vs. auditory) tasks, and visual tasks elicited larger P300 amplitudes than auditory tasks in both age groups. Neither stimulus modality nor response mode affected P300 differentiation of young and elderly subjects. Steeper P300 anterior-posterior scalp amplitude gradients were seen in the young (vs. elderly) adults, regardless of stimulus or response type. Examination of inter-subject variability with the coefficient of variation (CV) statistic found the lowest (i.e., best) CV values to be exhibited in the visual task requiring the counting of target stimuli. Implications of the findings are discussed in relation to P300 applications in the clinical assessment of dementia and aging-associated cognitive alterations.
Gender
·      No significant difference in latency or amplitude between genders (Polich, 1986).
·      Greater P3 amplitude and shorter latency for females than males over 15 yrs of age (Morita et al., 2001).
Handedness
Alexander & Polich (1997) reported the following
·      Larger P300 amplitude for left-handed individuals at Fz, and for right-handed subjects at posterior sites.
·      Shorter latency for left- vs. right-handed subjects.
·      Possible explanations include a larger corpus callosum in left-handed subjects, differences in skull thickness, STM, and attention.
Attention
·      McPherson & Salamat (2004) found a direct relation between ISI and both behavioral reaction time and P300 latency: longer ISIs were associated with longer reaction times and longer P300 latencies.
·      P300 latency is a reflection of information processing time.
·      Amplitude of P300 decreases as ISI increases.
·      In a dual-task paradigm, Singhal, Doerfling, & Fowler (2002) found that performing a visual task during auditory P300 measurement involving a dichotic listening task was associated with decreased P300 amplitude. As the difficulty of the visual task increases and attention is allocated more to the visual modality, P300 amplitude decreases.
·      Polich (1987) demonstrated that P300 latency was longer and amplitude larger when the subject silently counted the target signals vs. when the subject pressed a button with a thumb. 
State of arousal
·      The P300 response has larger amplitude and shorter latency with conscious, focused attention to the rare signal.
·      The P3a has smaller amplitude and shorter latency than the P300 in the oddball paradigm.
·      Selective attention to one stimulus vs. another is a characteristic feature of the P300.
Sleep
·      The P300 can be recorded in stage I and REM sleep with posterior electrodes, similar to the awake state; the P300 is not apparent during stage II.
·      The P300 recorded at the Fz electrode site is absent in stages I and II (Cote, 2002).
·      P300 amplitude decreases and latency increases during the transition from the alert awake state to drowsiness and then to sleep stage I (Koshino et al., 1993).
·      The P3a can be consistently recorded in sleep (Atienza, Cantero, & Escera, 2001).
·      Sleep deprivation increases the latency and reduces the amplitude of the P300 (Danos et al., 1994).
Task difficulty
·      P300 latency becomes longer and amplitude smaller as the difficulty of the listening task increases (Katayama & Polich, 1998).
·      A highly novel target produces a large, short-latency response (Simpson, 1982).
Memory
·      Updating of memory for the target signal is required, since the standard forms a strong representation.
·      Anticholinergic drugs have a negative effect on the P300 (Potter et al., 1992).
Exercise
·      Regular exercise enhances cognitive function and therefore enhances P300 amplitude, but not latency (Polich & Lardon, 1997).
Motivation
·      Attaching monetary value to correct identification of the target signal produces a larger P300 (Johnson, 1986).
·      P300 amplitude is larger for motivational instructions than for neutral ones; P2 amplitude for standard signals increases and N2 latency decreases with motivational compared with neutral instructions.
Pregnancy
·      Longer P300 latencies and smaller amplitudes were found in a group of 18 pregnant women (Tandon, Bhatia, & Goel, 1996).
·      This has been attributed to changes in estrogen and an inhibitory effect of pregnancy on cognitive processing.
Drugs
·      P300 latencies are increased in children receiving phenobarbital for the management of epilepsy.
·      Acute alcohol ingestion slightly increases P300 latency (Pfefferbaum et al., 1980).


N400
·      The N400 is a negative potential that occurs at about 400 ms; it was first described by Kutas and Hillyard (1980) as being present during the presentation of semantic material.
·      The negative potential at about 400 ms is related to a semantic mismatch between the context of a sentence and its final word, i.e., semantic priming.
·      The greater the semantic mismatch, the more robust the response.
·      This is in contrast to a positive response at about 500 ms that occurs when the ending word is different from that of the preceding word.
·      Likewise, a slow negative potential occurring between 250 and 400 ms has been described by Fischler, Bloom, and Childers (1983) following presentation of the object of a false sentence, e.g., "a car is a boat".
·      N400: approximately 6 µV, 80-100 ms wide.
·      P500: approximately 20 µV, 80-100 ms wide.
·      This is a cortical response with multiple sequential and parallel sources.
Acquisition parameters
·      These are similar to P300 recording.
·      An ISI of 2 s is used.
·      A series of about 100 averages per trial is obtained.



