Computational models of auditory scene analysis: A review

Beáta T. Szabó, Susan L. Denham, I. Winkler

Research output: Contribution to journal › Review article

8 Citations (Scopus)

Abstract

Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility for integrating complementary aspects of the models into a more comprehensive theory of ASA.

Original language: English
Article number: 524
Journal: Frontiers in Neuroscience
Volume: 10
Issue number: NOV
DOI: 10.3389/fnins.2016.00524
Publication status: Published - 2016

Fingerprint

Acoustics
Ear

Keywords

  • Auditory object representation
  • Auditory scene analysis
  • Auditory streaming
  • Bi-/multi-stable perception
  • Computational model
  • Predictive processing

ASJC Scopus subject areas

  • Neuroscience(all)

Cite this

Computational models of auditory scene analysis: A review. / Szabó, Beáta T.; Denham, Susan L.; Winkler, I.

In: Frontiers in Neuroscience, Vol. 10, No. NOV, 524, 2016.

Szabó, Beáta T.; Denham, Susan L.; Winkler, I. / Computational models of auditory scene analysis: A review. In: Frontiers in Neuroscience. 2016; Vol. 10, No. NOV.
@article{eb598f7ade56440fba7d5512f4b011e4,
title = "Computational models of auditory scene analysis: A review",
abstract = "Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility for integrating complementary aspects of the models into a more comprehensive theory of ASA.",
keywords = "Auditory object representation, Auditory scene analysis, Auditory streaming, Bi-/multi-stable perception, Computational model, Predictive processing",
author = "Szab{\'o}, {Be{\'a}ta T.} and Denham, {Susan L.} and Winkler, I.",
year = "2016",
doi = "10.3389/fnins.2016.00524",
language = "English",
volume = "10",
journal = "Frontiers in Neuroscience",
issn = "1662-4548",
publisher = "Frontiers Research Foundation",
number = "NOV",

}

TY - JOUR

T1 - Computational models of auditory scene analysis

T2 - A review

AU - Szabó, Beáta T.

AU - Denham, Susan L.

AU - Winkler, I.

PY - 2016

Y1 - 2016

N2 - Auditory scene analysis (ASA) refers to the process(es) of parsing the complex acoustic input into auditory perceptual objects representing either physical sources or temporal sound patterns, such as melodies, which contributed to the sound waves reaching the ears. A number of new computational models accounting for some of the perceptual phenomena of ASA have been published recently. Here we provide a theoretically motivated review of these computational models, aiming to relate their guiding principles to the central issues of the theoretical framework of ASA. Specifically, we ask how they achieve the grouping and separation of sound elements and whether they implement some form of competition between alternative interpretations of the sound input. We consider the extent to which they include predictive processes, as important current theories suggest that perception is inherently predictive, and also how they have been evaluated. We conclude that current computational models of ASA are fragmentary in the sense that rather than providing general competing interpretations of ASA, they focus on assessing the utility of specific processes (or algorithms) for finding the causes of the complex acoustic signal. This leaves open the possibility for integrating complementary aspects of the models into a more comprehensive theory of ASA.

KW - Auditory object representation

KW - Auditory scene analysis

KW - Auditory streaming

KW - Bi-/multi-stable perception

KW - Computational model

KW - Predictive processing

UR - http://www.scopus.com/inward/record.url?scp=85009774554&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85009774554&partnerID=8YFLogxK

U2 - 10.3389/fnins.2016.00524

DO - 10.3389/fnins.2016.00524

M3 - Review article

AN - SCOPUS:85009774554

VL - 10

JO - Frontiers in Neuroscience

JF - Frontiers in Neuroscience

SN - 1662-4548

IS - NOV

M1 - 524

ER -