auffusion / auffusion

huggingface.co
Total runs: 0
24-hour runs: 0
7-day runs: 0
30-day runs: 0
Model's Last Updated: January 03 2024

Introduction of auffusion

Model Details of auffusion

Auffusion is a latent diffusion model (LDM) for text-to-audio (TTA) generation. It can generate realistic audio, including human sounds, animal sounds, natural and artificial sounds, and sound effects, from textual prompts. Auffusion adapts text-to-image (T2I) model frameworks to the TTA task, leveraging their inherent generative strengths and precise cross-modal alignment. Our objective and subjective evaluations demonstrate that Auffusion surpasses previous TTA approaches while using limited data and computational resources. We release our model, inference code, and pre-trained checkpoints for the research community.

📣 We are releasing Auffusion-Full-no-adapter, which was pre-trained on all datasets described in the paper and is intended for easy audio manipulation.

📣 We are releasing Auffusion-Full, which was pre-trained on all datasets described in the paper.

📣 We are releasing Auffusion, which was pre-trained on AudioCaps.

Auffusion Model Family
Code

Our code is released here: https://github.com/happylittlecat2333/Auffusion

We uploaded several Auffusion generated samples here: https://auffusion.github.io

Please follow the instructions in the repository for installation, usage and experiments.

Quickstart Guide

First, clone the repository and install the requirements:

git clone https://github.com/happylittlecat2333/Auffusion/
cd Auffusion
pip install -r requirements.txt

Download the Auffusion model and generate audio from a text prompt:

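import IPython.display
import soundfile as sf
from auffusion_pipeline import AuffusionPipeline

# Load the pre-trained checkpoint (downloaded on first use, then read from cache).
pipeline = AuffusionPipeline.from_pretrained("auffusion/auffusion")

# Generate audio from a text prompt and take the first result.
prompt = "Birds singing sweetly in a blooming garden"
output = pipeline(prompt=prompt)
audio = output.audios[0]

# Save the result as a 16 kHz WAV file and play it back in a notebook.
sf.write(f"{prompt}.wav", audio, samplerate=16000)
IPython.display.Audio(data=audio, rate=16000)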
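
The example above uses the raw prompt as the file name, which can fail when a prompt contains characters such as "/" or ":". A minimal sketch of a sanitizing helper follows; prompt_to_filename is a hypothetical name introduced here for illustration and is not part of the Auffusion repository.

```python
import re


def prompt_to_filename(prompt: str, max_length: int = 64) -> str:
    """Turn a free-text prompt into a conservative .wav file name (hypothetical helper)."""
    # Replace every run of characters other than ASCII letters, digits,
    # '-' or '_' with a single underscore, then trim underscores at the ends.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", prompt).strip("_")
    # Truncate long prompts; fall back to a fixed name if nothing usable remains.
    stem = stem[:max_length].rstrip("_") or "audio"
    return f"{stem}.wav"
```

With this helper, the save step could be written as sf.write(prompt_to_filename(prompt), audio, samplerate=16000).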

The Auffusion model is downloaded automatically from Hugging Face and saved in the local cache. Subsequent runs load the model directly from the cache.

By default, the pipeline samples from the latent diffusion model with 100 inference steps (num_inference_steps) and a guidance scale of 7.5 (guidance_scale). You can vary these parameters to obtain different results:

prompt = "Rolling thunder with lightning strikes"
output = pipeline(prompt=prompt, num_inference_steps=100, guidance_scale=7.5)
audio = output.audios[0]
IPython.display.Audio(data=audio, rate=16000)
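
To compare settings systematically, the call above can be wrapped in a small grid sweep. This is only a sketch: sweep_settings is a hypothetical helper, and it assumes no more than the example above shows, namely that the pipeline accepts prompt, num_inference_steps, and guidance_scale and returns an object with an audios list.

```python
from itertools import product


def sweep_settings(pipeline, prompt, steps_options, scale_options):
    """Run the pipeline once per (steps, guidance scale) pair (hypothetical helper).

    Returns a dict mapping (num_inference_steps, guidance_scale)
    to the first audio generated with those settings.
    """
    results = {}
    for steps, scale in product(steps_options, scale_options):
        # Same call signature as in the quickstart example.
        output = pipeline(prompt=prompt, num_inference_steps=steps, guidance_scale=scale)
        results[(steps, scale)] = output.audios[0]
    return results
```

For example, sweep_settings(pipeline, prompt, [50, 100], [5.0, 7.5]) would produce four clips that can be listened to side by side. In diffusion models, fewer steps typically run faster, and a higher guidance scale typically follows the prompt more closely at some cost in variety.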

Citation

Please consider citing the following article if you find our work useful:

@article{xue2024auffusion,
  title={Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation}, 
  author={Jinlong Xue and Yayue Deng and Yingming Gao and Ya Li},
  journal={arXiv preprint arXiv:2401.01044},
  year={2024}
}

