Windows 8.1 Audio streaming – Part 3: Audio processing

What type of audio processing is included in Windows? Does Windows support Echo Cancellation? Noise Suppression? Some other type of audio processing?

One common misconception about the Windows audio stack is that Windows natively includes audio effects that modify the audio signal, e.g. acoustic echo cancellation, noise reduction, gain control, etc. In reality, the audio stack itself just passes the raw signal, unmodified, from the application to the speakers or from the microphone to the application; no Windows component alters it. Instead, the audio stack loads external components (called "Audio Processing Objects" or APOs) that modify the audio signal. Multiple companies create APOs for Windows, such as Dolby, DTS, Waves, Conexant, ForteMedia, Realtek, etc. All commercial systems from the well-known OEMs include at least one APO. It is up to the OEM to decide what type of processing they want in each model and which company will create the APO for that model. In Windows 8.1 the APOs are bundled and installed together with the audio driver.

All APOs are required to declare the type of processing that they do, so that applications can find out what processing is available on each system. Windows defines a list of 18 types of audio effects (as shown in this MSDN link):

  1. Acoustic Echo Cancellation
  2. Noise Suppression
  3. Automatic Gain Control
  4. Beam Forming
  5. Constant Tone Removal
  6. Equalizer
  7. Loudness Equalizer
  8. Bass Boost
  9. Virtual Surround
  10. Virtual Headphones
  11. Speaker Fill
  12. Room Correction
  13. Bass Management
  14. Environmental Effects
  15. Speaker Protection
  16. Speaker Compensation
  17. Dynamic Range Compression
  18. Other
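To make the idea of declared effect types more concrete, here is a minimal sketch that models the 18 categories above as a Python enum and checks what a system declares. The names `AudioEffectType`, `Apo` and `declared_effects`, as well as the two vendors, are hypothetical and exist only for illustration; this is a toy model, not the Windows API.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AudioEffectType(Enum):
    """Illustrative stand-ins for the 18 effect categories listed above."""
    ACOUSTIC_ECHO_CANCELLATION = auto()
    NOISE_SUPPRESSION = auto()
    AUTOMATIC_GAIN_CONTROL = auto()
    BEAM_FORMING = auto()
    CONSTANT_TONE_REMOVAL = auto()
    EQUALIZER = auto()
    LOUDNESS_EQUALIZER = auto()
    BASS_BOOST = auto()
    VIRTUAL_SURROUND = auto()
    VIRTUAL_HEADPHONES = auto()
    SPEAKER_FILL = auto()
    ROOM_CORRECTION = auto()
    BASS_MANAGEMENT = auto()
    ENVIRONMENTAL_EFFECTS = auto()
    SPEAKER_PROTECTION = auto()
    SPEAKER_COMPENSATION = auto()
    DYNAMIC_RANGE_COMPRESSION = auto()
    OTHER = auto()


@dataclass(frozen=True)
class Apo:
    """Hypothetical APO record: a vendor name plus the effects it declares."""
    vendor: str
    effects: frozenset


def declared_effects(apos):
    """Return the union of the effect types declared by all given APOs."""
    result = set()
    for apo in apos:
        result |= apo.effects
    return result


# Example system with two made-up APOs.
installed = [
    Apo("VendorA", frozenset({AudioEffectType.EQUALIZER,
                              AudioEffectType.BASS_BOOST})),
    Apo("VendorB", frozenset({AudioEffectType.ACOUSTIC_ECHO_CANCELLATION,
                              AudioEffectType.NOISE_SUPPRESSION})),
]
available = declared_effects(installed)
has_aec = AudioEffectType.ACOUSTIC_ECHO_CANCELLATION in available
```

In this model an application would inspect `available` to decide, for example, whether it needs to run its own echo canceller.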

Each audio effect category is described in more detail here (just note that this link is for Windows 10).

 

How can I find the type of processing that was selected by the OEM for my system?

  1. For playback processing: Control Panel -> Sound -> Playback tab -> Right click on speakers -> Properties -> Enhancements
  2. For capture processing: Control Panel -> Sound -> Recording tab -> Right click on microphone -> Properties -> Enhancements

Alternatively, if you don't want to go through Control Panel, you can right click on the speaker icon (at the bottom right of the screen) and select "Playback devices" (opens the "Playback tab" directly) or "Recording devices" (opens the "Recording tab" directly).

 

If there is no Enhancements tab, then this PROBABLY means that there are no APOs in your system. I used the word "probably" because the Enhancements tab is actually a page that is created by the party supplying the APOs, so it is up to them to create this page. This also means that the content of the page is totally up to them, and they can decide how to name the audio effects that they provide (that's why the names in the list that you will see there do not always correspond to the above list of 18 audio effect types).

 

You can enable/disable individual effects from that list and listen to the impact that each one has on the output of your system. However, the OEM has typically tuned the default settings to be optimal for your device, so be careful about what you change.

 

Typically, the third parties also provide a checkbox with a title such as "Disable all enhancements" or "Disable all sound effects". If you want to understand the impact of all these audio effects on your system, you can check that box, click "Apply" and play sounds. Most probably you will find that the audio is now of lower quality (e.g. lower volume). Remember to uncheck the box afterwards, so that the audio effects are enabled again.

 

Are all audio effects applied to all streams? Which effect is applied first?

In order to explain this easily, let's take a look at the following diagram of the audio stack:

Applications have the option to select if they want to open a stream using default mode or raw mode:

  • Default mode is appropriate for most applications. The stream will go through all audio effects (SFX, MFX, EFX) that have been selected by the OEM, in order to optimize audio quality.
  • Raw mode will avoid most processing (only EFX are applied to raw streams). This should be used by applications that want an unprocessed audio signal (e.g. Pro Audio applications).

Audio effects can be applied to:

  1. An individual stream: Stream effects (SFX)
  2. All streams that use the same mode (explained above): Mode effects (MFX)
  3. All streams that use the same endpoint (e.g. speakers, microphone, etc.): Endpoint effects (EFX)

Each of the 18 audio effect types from the above list can be applied to any of the 3 positions (SFX, MFX, EFX). Also, each position (SFX, MFX, EFX) can have processing for multiple effect types.
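The relationship between stream mode and effect positions can be sketched as a small Python model. Everything here is hypothetical and for illustration only (`StreamMode`, `STAGES_BY_MODE`, `process`, and the `gain` and `clip` toy effects); it is not the Windows API. The stage order SFX, then MFX, then EFX is an assumption for the playback direction, and the sketch handles a single stream, whereas a real mode effect would run on the mix of all streams in that mode.

```python
from enum import Enum


class StreamMode(Enum):
    DEFAULT = "default"
    RAW = "raw"


# Stages applied per mode, as described in the text: default streams go
# through SFX, MFX and EFX; raw streams only get EFX.
# Assumption: for playback, stages run in the order stream -> mode -> endpoint.
STAGES_BY_MODE = {
    StreamMode.DEFAULT: ("SFX", "MFX", "EFX"),
    StreamMode.RAW: ("EFX",),
}


def process(samples, mode, effects):
    """Run samples through every effect of each stage that applies to `mode`.

    `effects` maps a stage name to a list of callables. A stage may hold
    several effects, and a stage with no entry is simply skipped.
    """
    for stage in STAGES_BY_MODE[mode]:
        for effect in effects.get(stage, []):
            samples = effect(samples)
    return samples


def gain(factor):
    """Toy effect: multiply every sample by `factor`."""
    return lambda samples: [s * factor for s in samples]


def clip(limit):
    """Toy effect: clamp every sample to the range [-limit, limit]."""
    return lambda samples: [max(-limit, min(limit, s)) for s in samples]


# Made-up configuration: one effect per position.
effects = {
    "SFX": [gain(2.0)],
    "MFX": [gain(1.5)],
    "EFX": [clip(1.0)],
}

default_out = process([0.1, 0.5], StreamMode.DEFAULT, effects)
raw_out = process([0.1, 0.5], StreamMode.RAW, effects)
```

In this toy configuration the raw stream only passes through the endpoint clipper, so in-range samples come out unchanged, while the default stream is amplified twice before being clipped.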

Audio effects can be implemented in software, in hardware, or in a combination of both.

If you want more information about this topic, you can look at the Audio Processing Object Architecture page on MSDN. Just note that the MSDN page describes the Windows 10 architecture, which is slightly different from the Windows 8.1 architecture described in this post.