This paper has been accepted to ISMIR 2024 (code: https://github.com/fundwotsai2001/AP-adapter).

On this website, we present generated samples for three distinct tasks (timbre transfer, genre transfer, and accompaniment generation), using both out-of-domain and in-domain evaluation datasets.

Note: In all of the samples below, negative prompts are only applicable to our AP-Adapter and AudioLDM2, not to MusicGen.

MusicGen: We use MusicGen-Melody (1.5B), which achieves melody conditioning by using a chromagram as a proxy for the melody.
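
To illustrate what a chromagram keeps and discards, the following is a minimal sketch that folds a few spectral peaks into 12 pitch-class bins. It is not MusicGen's implementation: the helper names `pitch_class` and `chroma_vector`, the 440 Hz tuning reference, and the peak normalization are all assumptions made for illustration.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_index(freq_hz, a4_hz=440.0):
    """Return the pitch-class index (0 = C, ..., 11 = B) of a frequency.

    The octave is discarded, which is why a chromagram is mostly
    insensitive to register and timbre. The 440 Hz reference is an assumption.
    """
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def pitch_class(freq_hz, a4_hz=440.0):
    """Return the pitch-class name of a frequency."""
    return PITCH_CLASSES[pitch_class_index(freq_hz, a4_hz)]


def chroma_vector(partials):
    """Fold (frequency_hz, magnitude) pairs into a 12-bin chroma vector.

    The vector is normalized by its largest bin (a hypothetical choice).
    """
    bins = [0.0] * 12
    for freq_hz, magnitude in partials:
        bins[pitch_class_index(freq_hz)] += magnitude
    peak = max(bins)
    return [b / peak for b in bins] if peak > 0 else bins


# Two A's an octave apart land in the same bin; middle C lands in bin 0.
chroma = chroma_vector([(440.0, 1.0), (880.0, 0.5), (261.63, 0.75)])
```

Because only pitch-class energy survives, a model conditioned on such a representation receives the melody contour but little information about the source instrument.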

AudioLDM2: Following SDEdit, we run the forward process (i.e., adding noise to the audio input) only partially, for 0.75T steps, where T is the original number of diffusion steps, and then denoise the result with the editing command.
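
The partial forward process can be sketched as follows. This is a toy illustration on a plain list of numbers, not the AudioLDM2 code: the linear beta schedule, the function names, and the use of a raw vector instead of an audio latent are assumptions. Only the 0.75T starting point comes from the description above.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical DDPM-style linear noise schedule (an assumption)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta) over the first t steps."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def partial_forward(x0, num_steps, strength=0.75, seed=0):
    """Noise x0 up to step round(strength * num_steps) in closed form.

    Returns the noisy sample and the step from which denoising would start.
    """
    rng = random.Random(seed)
    t_start = int(round(strength * num_steps))
    a_bar = alpha_bar(linear_betas(num_steps), t_start)
    noisy = [
        math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for v in x0
    ]
    return noisy, t_start


# With T = 200, the input is noised up to step 150; the reverse process
# would then run from step 150 down to 0 under the editing prompt.
noisy, t_start = partial_forward([0.0] * 8, num_steps=200)
```

Starting from 0.75T rather than T keeps part of the input's structure in the noisy sample, which is what lets the denoised output stay related to the original audio.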

Out-of-Domain Instruments

The out-of-domain dataset contains instruments

that are not included in our AudioSet subset for AP-Adapter training, and
whose timbre cannot be generated by AudioLDM2 or MusicGen through prompting.

The instruments are:

Chinese flute
Chinese plucked string
Korean mallet percussion
Korean percussion
Korean plucked string 
Korean rubbed string
Korean woodwind

All of these are ethnic instruments.

Task 1 — Timbre Transfer

Hyperparameters:

AP scale α = 0.5
pooling rate ω = 2
guidance scale λ = 7.5
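
As a rough illustration of the guidance scale, the sketch below shows the standard classifier-free guidance combination, assuming that the negative prompt takes the place of the unconditional branch (a common convention in diffusion pipelines, not confirmed here for AP-Adapter). The function name `guided_noise` and the list-based inputs are hypothetical; the AP scale and pooling rate are not modeled.

```python
def guided_noise(eps_cond, eps_neg, guidance_scale=7.5):
    """Classifier-free guidance on two noise predictions.

    eps_cond: prediction conditioned on the editing prompt.
    eps_neg:  prediction conditioned on the negative (or empty) prompt.
    The result extrapolates from eps_neg toward eps_cond by guidance_scale.
    """
    return [n + guidance_scale * (c - n) for c, n in zip(eps_cond, eps_neg)]


# A scale of 1.0 returns the conditional prediction unchanged;
# larger values push the result further away from the negative prompt.
unchanged = guided_noise([1.0, 2.0], [0.0, 0.5], guidance_scale=1.0)
pushed = guided_noise([1.0, 2.0], [0.0, 0.5], guidance_scale=7.5)
```

Under this reading, a model without a negative-prompt input has no second branch to steer away from, which is consistent with negative prompts not applying to MusicGen.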

Audio prompt: Original audio (given below)