---
license: cc-by-nc-nd-4.0
language:
- en
library_name: torch
tags:
- audio
- music-generation
- accompaniment-generation
- unconditional-audio-generation
- pytorch
---

## AnyAccomp: Generalizable Accompaniment Generation via Quantized Melodic Bottleneck

This is the official Hugging Face model repository for **AnyAccomp**, an accompaniment generation framework from the paper **AnyAccomp: Generalizable Accompaniment Generation via Quantized Melodic Bottleneck**.

AnyAccomp addresses two critical challenges in accompaniment generation: **generalization** to in-the-wild singing voices and **versatility** in handling solo instrumental inputs.

The core of our framework is a **quantized melodic bottleneck**, which extracts robust melodic features. A subsequent flow matching model then generates a matching accompaniment based on these features.

For more details, please visit our [GitHub Repository](https://github.com/AmphionTeam/AnyAccomp).

<img src="https://anyaccomp.github.io/data/framework.jpg" alt="framework" width="500">
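
To make the data flow concrete, here is a minimal illustrative sketch of the three-stage pipeline. Every module below is a stand-in stub with made-up shapes; the real components and their interfaces live in the GitHub repository and the checkpoints listed below:

```python
import torch
import torch.nn as nn

class StubVQ(nn.Module):
    """Stand-in for the quantized melodic bottleneck: waveform -> discrete codes."""
    def forward(self, wav):                        # wav: (batch, samples)
        frames = wav.unfold(1, 320, 320).mean(-1)  # crude 320-sample framing
        return (frames * 64).long().clamp(0, 127)  # fake quantization to 128 codes

class StubFlowMatching(nn.Module):
    """Stand-in for the flow matching generator: codes -> accompaniment features."""
    def forward(self, codes):                      # codes: (batch, frames)
        return torch.randn(*codes.shape, 80)       # fake 80-dim features

class StubVocoder(nn.Module):
    """Stand-in for the vocoder: features -> waveform."""
    def forward(self, feats):                      # feats: (batch, frames, dim)
        return torch.randn(feats.shape[0], feats.shape[1] * 320)

wav = torch.randn(1, 16000)         # 1 s of placeholder input audio at 16 kHz
codes = StubVQ()(wav)               # stage 1: quantized melodic bottleneck
feats = StubFlowMatching()(codes)   # stage 2: flow matching, conditioned on codes
accomp = StubVocoder()(feats)       # stage 3: features -> accompaniment waveform
print(codes.shape, feats.shape, accomp.shape)
```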

## Model Checkpoints

This repository contains the three pretrained components of the AnyAccomp framework:

| Model Name        | Directory                    | Description                                        |
| ----------------- | ---------------------------- | -------------------------------------------------- |
| **VQ**            | `./pretrained/vq`            | Extracts core melodic features from audio.         |
| **Flow Matching** | `./pretrained/flow_matching` | Generates accompaniments from melodic features.    |
| **Vocoder**       | `./pretrained/vocoder`       | Converts generated features into audio waveforms.  |

## How to Use

To run this model, follow the steps below:

1. Clone the repository.
2. Download the pretrained models.
3. Install the environment.
4. Run the Gradio demo or the inference script.

### 1. Clone the Repository

```bash
git clone https://github.com/AmphionTeam/AnyAccomp.git

# enter the repository directory
cd AnyAccomp
```

### 2. Download the Pretrained Models

We provide a one-line Python command to download all the necessary pretrained models from Hugging Face into the correct directory.

Before running it, make sure you are in the `AnyAccomp` root directory. Then run:

```bash
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='amphion/anyaccomp', local_dir='./pretrained', repo_type='model')"
```
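
After the download finishes, the three component directories listed in the Model Checkpoints table should exist under `./pretrained`. A quick sanity check (a minimal sketch; the exact files inside each directory may vary):

```python
from pathlib import Path

# The three component directories from the Model Checkpoints table.
for name in ("vq", "flow_matching", "vocoder"):
    d = Path("./pretrained") / name
    print(f"{d}: {'found' if d.is_dir() else 'MISSING'}")
```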

If you have trouble connecting to Hugging Face, you can try switching to a mirror endpoint before running the command:

```bash
export HF_ENDPOINT=https://hf-mirror.com
```
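
Alternatively, you can set the mirror from within Python; note that `HF_ENDPOINT` is read when `huggingface_hub` is imported, so it must be set before the import (a minimal sketch of the same download with the mirror applied):

```python
import os

# Point huggingface_hub at the mirror BEFORE importing it,
# since the endpoint is read when the library loads.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from huggingface_hub import snapshot_download

snapshot_download(repo_id="amphion/anyaccomp",
                  local_dir="./pretrained",
                  repo_type="model")
```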

### 3. Install the Environment

Before you start installing, make sure you are in the `AnyAccomp` directory; if not, `cd` into it first.

```bash
conda create -n anyaccomp python=3.9
conda activate anyaccomp
conda install -c conda-forge ffmpeg=4.0
pip install -r requirements.txt
```
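
As a quick sanity check of the environment (assuming `requirements.txt` installs PyTorch, which the models are built on), you can run:

```python
import shutil
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("ffmpeg found:", shutil.which("ffmpeg") is not None)
```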

### 4. Run the Model

Once the setup is complete, you can run the model using either the Gradio demo or the inference script.

#### Run Gradio 🤗 Playground Locally

You can run the following command to interact with the playground:

```bash
python gradio_app.py
```

#### Inference Script

To run inference on multiple audio files at once, use the batch inference script:

```bash
python infer_from_folder.py
```

By default, the script loads input audio from `./example/input` and saves the results to `./example/output`. You can customize these paths in the [inference script](./anyaccomp/infer_from_folder.py).
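
For example, you can stage your own audio for batch inference like this (a convenience sketch; `my_song.wav` is a hypothetical placeholder for any audio file you want to process):

```python
import shutil
from pathlib import Path

# Default input directory used by infer_from_folder.py.
input_dir = Path("./example/input")
input_dir.mkdir(parents=True, exist_ok=True)

# Copy your own audio in; "my_song.wav" is a placeholder name.
shutil.copy("my_song.wav", input_dir / "my_song.wav")
```

Then run `python infer_from_folder.py` and collect the results from `./example/output`.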

## Citation

If you use AnyAccomp in your research, please cite our paper:

```bibtex
@article{zhang2025anyaccomp,
  title={AnyAccomp: Generalizable Accompaniment Generation via Quantized Melodic Bottleneck},
  author={Zhang, Junan and Zhang, Yunjia and Zhang, Xueyao and Wu, Zhizheng},
  journal={arXiv preprint arXiv:2509.14052},
  year={2025}
}
```