Audio Flamingo 3

An open-source audio-language model from NVIDIA for understanding and reasoning over speech, sound, and music, with support for long audio, multi-turn chat, and voice-to-voice interaction.

About Audio Flamingo 3

Audio Flamingo 3 (AF3) is a Large Audio-Language Model (LALM) developed by NVIDIA ADLR. It is open-source and designed for understanding and reasoning over speech, sound effects, and music, including long audio inputs. It also supports interactive use, such as multi-turn chat and voice-to-voice interaction.