Abstract
We compared sleep staging by an automated artificial neural network (ANN) system, BioSleep™ (Oxford BioSignals), with staging by a human scorer using the Rechtschaffen and Kales (R&K) scoring system. Sleep study recordings from 114 patients with suspected obstructive sleep apnoea syndrome (OSA) were analysed by the ANN and by a blinded human scorer. We also examined human scorer reliability by calculating the agreement between the index scorer and a second independent blinded scorer for 28 of the 114 studies. For each study, we built contingency tables from an epoch-by-epoch comparison (30 s epochs). From these, we derived kappa (κ) coefficients for different combinations of sleep stages. The overall agreement between automatic and manual scoring for the 114 studies was poor for the classification {wake | light-sleep | deep-sleep | REM} (median κ=0.305) and only a little better (κ=0.449) for the crude {wake | sleep} distinction. For the subgroup of 28 randomly selected studies, the overall agreement between automatic and manual scoring was again relatively low (κ=0.331 for {wake | light-sleep | deep-sleep | REM} and κ=0.505 for {wake | sleep}), whereas inter-scorer reliability was higher (κ=0.641 for {wake | light-sleep | deep-sleep | REM} and κ=0.737 for {wake | sleep}). We conclude that such an ANN-based analysis system is not sufficiently accurate for sleep study analyses using the R&K classification system.
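The epoch-by-epoch agreement measure described in the abstract can be illustrated with a minimal sketch of Cohen's κ computed from two scorers' stage labels. This is not the authors' code. The stage codes (`W`, `S1` to `S4`, `REM`), the grouping of S1/S2 as light sleep and S3/S4 as deep sleep, and the function names are assumptions made for illustration only.

```python
from collections import Counter
import math

# Assumed grouping of R&K stage codes into the four classes named in the
# abstract; the study's exact mapping is not stated there.
FOUR_CLASS = {"W": "wake", "S1": "light", "S2": "light",
              "S3": "deep", "S4": "deep", "REM": "REM"}
# Cruder two-class grouping: wake versus any sleep stage.
TWO_CLASS = {k: ("wake" if v == "wake" else "sleep")
             for k, v in FOUR_CLASS.items()}


def cohen_kappa(scorer_a, scorer_b, grouping):
    """Cohen's kappa for two equal-length sequences of per-epoch stage codes."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("need two non-empty sequences of equal length")
    a = [grouping[s] for s in scorer_a]
    b = [grouping[s] for s in scorer_b]
    n = len(a)
    # Contingency table: counts of (class from A, class from B) per epoch.
    table = Counter(zip(a, b))
    observed = sum(c for (x, y), c in table.items() if x == y) / n
    # Chance agreement from the marginal totals of the table.
    row, col = Counter(a), Counter(b)
    expected = sum(row[k] * col[k] for k in row) / (n * n)
    if expected == 1.0:
        return math.nan  # kappa is undefined when both use one single class
    return (observed - expected) / (1.0 - expected)


# Hypothetical four-epoch recording, not data from the study.
manual = ["W", "W", "S2", "S2"]
automatic = ["W", "S2", "S2", "S2"]
print(cohen_kappa(manual, automatic, FOUR_CLASS))  # 0.5
```

In this toy example the scorers agree on 3 of 4 epochs (observed agreement 0.75), chance agreement from the marginals is 0.5, and so κ = (0.75 − 0.5) / (1 − 0.5) = 0.5. A per-study κ computed this way could then be summarised across studies, for example with `statistics.median`.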
Original language | English |
---|---|
Pages (from-to) | 105-110 |
Number of pages | 6 |
Journal | Medical and Biological Engineering and Computing |
Volume | 44 |
Publication status | Published - 26 Jan 2006 |
Externally published | Yes |