Abstract
Automatic speech processing devices have become popular for quantifying the amount of ambient language input to children in their home environments. We assessed error rates in language input estimates from the Language ENvironment Analysis (LENA) audio processing system, asking whether error rates differed as a function of adult talkers’ gender and whether they were speaking to children or adults. Audio was sampled from LENA recordings of 23 families with children aged 4–34 months. Human coders identified vocalizations by adults and children, counted intelligible words, and determined whether adults’ speech was addressed to children or adults. LENA’s classification accuracy was assessed by parceling audio into 100-ms frames and comparing, for each frame, human and LENA classifications. LENA correctly classified adult speech 67% of the time across families (average false negative rate: 33%). LENA’s Adult Word Count showed a mean +47% error relative to human counts. Classification and Adult Word Count error rates were significantly affected by talkers’ gender and whether speech was addressed to a child or an adult. The largest systematic errors occurred when adult females addressed children. Results show that LENA’s classifications and Adult Word Count entailed random – and sometimes large – errors across recordings, as well as systematic errors as a function of talker gender and addressee. Because estimates of the amount of adult language input show systematic and sometimes high error, relying on this metric alone may lead to invalid clinical and/or research conclusions. Further validation studies and circumspect usage of LENA are warranted.
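The frame-based comparison and the word count error described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline: the `(start_ms, end_ms, label)` segment format, the label names, the rule that a frame takes the label of any segment overlapping it, and the signed percent-error formula are all assumptions made for illustration. Only the 100-ms frame length and the comparison against human coding come from the text.

```python
from typing import List, Tuple

FRAME_MS = 100  # frame length stated in the abstract

# Hypothetical annotation format: (start_ms, end_ms, label)
Segment = Tuple[int, int, str]


def frame_labels(segments: List[Segment], total_ms: int,
                 default: str = "other") -> List[str]:
    """Assign one label to each 100-ms frame.

    Assumption: a frame takes the label of any segment that overlaps it;
    if segments overlap, the later one in the list wins.
    """
    n_frames = total_ms // FRAME_MS
    labels = [default] * n_frames
    for start_ms, end_ms, label in segments:
        first = start_ms // FRAME_MS
        last = min(n_frames, -(-end_ms // FRAME_MS))  # ceiling division
        for i in range(first, last):
            labels[i] = label
    return labels


def adult_speech_rates(human: List[str], lena: List[str],
                       target: str = "adult") -> Tuple[float, float]:
    """Return (hit rate, false negative rate) over frames the human
    coder labeled as `target`."""
    system_labels = [l for h, l in zip(human, lena) if h == target]
    if not system_labels:
        raise ValueError("no frames with the target label in human coding")
    hits = sum(1 for l in system_labels if l == target)
    hit_rate = hits / len(system_labels)
    return hit_rate, 1.0 - hit_rate


def percent_error(system_count: int, human_count: int) -> float:
    """Signed percent error of a system word count relative to the
    human count (assumed formula)."""
    if human_count == 0:
        raise ValueError("human count must be nonzero")
    return 100.0 * (system_count - human_count) / human_count
```

Under these assumptions, the hit rate and false negative rate sum to 1 by construction, which matches the 67% and 33% pairing reported above, and a positive percent error indicates that the system counted more words than the human coders did.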