We model the audio scene as a collection of a few dominant sound
sources. Each source is assumed to have stationary properties that
can be characterized by a small set of features. An audio scene change is
said to occur when the majority of the dominant sources in the scene change.
We use three types of features: (a) scalar sequences, (b) vector
sequences, and (c) scalar point data. The scalar sequences are further
analyzed in terms of their:
- Trends
- Periodic components
- Noise
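One way to picture this three-part analysis is the minimal sketch below. It is an illustration under stated assumptions, not the metrics developed in this work: the trend is taken to be a least-squares straight line, the periodic component a single sinusoid at the strongest DFT bin, and the noise whatever remains. The function names linear_trend, dominant_sinusoid and decompose are hypothetical.

```python
import cmath
import math


def linear_trend(x):
    """Least-squares straight line fitted to a non-empty sequence x."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    num = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den if den else 0.0
    return [x_mean + slope * (t - t_mean) for t in range(n)]


def dominant_sinusoid(x):
    """Sinusoid at the strongest nonzero DFT bin of x (naive O(n^2) DFT)."""
    n = len(x)
    best_k, best_c = 0, 0j
    for k in range(1, n // 2 + 1):
        c = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                for t, v in enumerate(x))
        if abs(c) > abs(best_c):
            best_k, best_c = k, c
    if best_k == 0:
        return [0.0] * n
    # A real signal splits its energy between bins k and n-k,
    # except at the Nyquist bin, which appears only once.
    scale = 1.0 if 2 * best_k == n else 2.0
    return [scale / n * (best_c * cmath.exp(2j * math.pi * best_k * t / n)).real
            for t in range(n)]


def decompose(x):
    """Split x into (trend, periodic, noise); the three parts sum back to x."""
    trend = linear_trend(x)
    detrended = [v - tr for v, tr in zip(x, trend)]
    periodic = dominant_sinusoid(detrended)
    noise = [d - p for d, p in zip(detrended, periodic)]
    return trend, periodic, noise
```

By construction the three returned parts add up to the input sequence, so nothing is lost in the split. A real system would likely use a more robust trend model and more than one periodic term.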
We have developed novel metrics to analyze each of the feature types
used in our work. These metrics are then used, within our memory model
framework, to compute one correlation value per one-second interval.
Local minima of this correlation sequence mark audio scene changes.
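The final detection step can be sketched as follows. This is only an illustration: it assumes the per-second correlation values are already available as a list, and the optional threshold parameter is a hypothetical addition that is not described in the text.

```python
def local_minima(corr, threshold=None):
    """Return indices i where corr[i] is strictly below both neighbours.

    corr holds one correlation value per one-second interval.
    threshold (hypothetical) keeps only minima below the given value.
    """
    hits = []
    for i in range(1, len(corr) - 1):
        if corr[i] < corr[i - 1] and corr[i] < corr[i + 1]:
            if threshold is None or corr[i] < threshold:
                hits.append(i)
    return hits


# Made-up correlation values, one per second.
corr = [0.9, 0.8, 0.3, 0.7, 0.85, 0.6, 0.9]
print(local_minima(corr))                 # [2, 5]
print(local_minima(corr, threshold=0.5))  # [2]
```

With this strict comparison, the first and last intervals are never reported, and a flat run of equal values is not counted as a minimum.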