mirror of
https://github.com/fumiama/Retrieval-based-Voice-Conversion-WebUI.git
synced 2026-06-05 01:10:22 +08:00
* feat(audio): use PyAV instead of ffmpeg

  replaced usage of ffmpeg in favor of PyAV (`av`)

* refactor(audio): store all of the audio related functions in the `infer.lib.audio`

  refactors previous commit to have singular functions for each task, all located in `infer.lib.audio`

* fix(audio): remove downsample_audio from mdxnet.py

  it is no longer needed, since it's imported from infer.lib.audio

* docs: remove every ffmpeg mention in the documentation to avoid confusion

* chore(requirements): remove ffmpeg-python and ffmpy from all requirements

* fix(audio): fix loading for UVR

  wrapped gathering of META info from the stream into a function
  fixes loading for UVR

* fix(audio): use np.frombuffer() instead of direct conversion of the resampled frames

  this fixes traceback on preprocessing

* feat(audio): pre-allocate decoded_audio array in the load_audio function

  this should improve performance, even if just a little

* Revert "docs: remove every ffmpeg mention in the documentation to avoid confusion"

  This reverts commit 1e05bbce03.

* chore(format): run black on dev

* fix(requirements): revert removal of ffmpeg in unitest.yml and Dockerfile

* Revert "fix(requirements): revert removal of ffmpeg in unitest.yml and Dockerfile"

  This reverts commit e28a0eebb2.

* feat(audio): pre-allocate numpy array to store the AudioFrame data in ndarray of dtype float32

* chore(format): run black on dev

* fix(audio): fix the decoded_audio size estimation

  in estimated_total_samples we multiply by `sr` instead of `container.streams.audio[0].rate` since we want to estimate size of the OUTPUT file, not the input one.
  - Added dynamic resizing, in case something goes wrong and the size of decoded_audio is estimated incorrectly

  Fixed function `load_audio` when the input audio's samplerate does not match the desired samplerate (`sr`)

* chore(format): run black on dev

* refactor(audio): remove `clean_path()` function as it serves no purpose anymore

* docs: remove everything related to ffmpeg

  this includes everything except for formats support specification in the training_tips docs, since it has nothing to do with what ffmpeg does/did but rather what audio formats are supported (all the ones that ffmpeg supports!)

* docs: fix order of the steps in preparation in the READMEs

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
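
The pre-allocation and dynamic resizing described in the commit message can be illustrated with a minimal sketch. It assumes the decoded audio arrives as a sequence of float32 chunks already resampled to `sr`. The helper names `estimate_total_samples` and `collect_chunks`, and the doubling growth strategy, are hypothetical and not taken from `infer.lib.audio`; the actual `load_audio` decodes with PyAV, which is omitted here.

```python
import math

import numpy as np


def estimate_total_samples(duration_seconds: float, sr: int) -> int:
    """Estimate the output length in samples (hypothetical helper).

    The estimate uses the OUTPUT sample rate `sr`, not the input stream's
    rate, because the buffer holds the resampled audio.
    """
    return max(1, math.ceil(duration_seconds * sr))


def collect_chunks(chunks, duration_seconds: float, sr: int) -> np.ndarray:
    """Copy float32 chunks into one pre-allocated buffer (hypothetical helper)."""
    decoded_audio = np.zeros(
        estimate_total_samples(duration_seconds, sr), dtype=np.float32
    )
    offset = 0
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=np.float32).reshape(-1)
        end = offset + chunk.size
        if end > decoded_audio.size:
            # The estimate was too small: grow the buffer (assumed strategy:
            # at least double it) and keep the samples written so far.
            grown = np.zeros(max(end, decoded_audio.size * 2), dtype=np.float32)
            grown[:offset] = decoded_audio[:offset]
            decoded_audio = grown
        decoded_audio[offset:end] = chunk
        offset = end
    # Trim any unused tail left over from an over-estimate.
    return decoded_audio[:offset]
```

Growing the buffer only when the estimate falls short keeps the common case to a single allocation, while the final slice handles an estimate that was too large.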
48 lines
720 B
Plaintext
joblib>=1.1.0
numba
numpy
scipy
librosa==0.9.1
llvmlite
fairseq @ git+https://github.com/One-sixth/fairseq.git
faiss-cpu
gradio
Cython
pydub>=0.25.1
soundfile>=0.12.1
tensorboardX
Jinja2>=3.1.2
json5
Markdown
matplotlib>=3.7.0
matplotlib-inline>=0.1.3
praat-parselmouth>=0.4.2
Pillow>=9.1.1
resampy>=0.4.2
scikit-learn
tensorboard
tqdm>=4.63.1
tornado>=6.1
Werkzeug>=2.2.3
uc-micro-py>=1.0.1
sympy>=1.11.1
tabulate>=0.8.10
PyYAML>=6.0
pyasn1>=0.4.8
pyasn1-modules>=0.2.8
fsspec>=2022.11.0
absl-py>=1.2.0
audioread
uvicorn>=0.21.1
colorama>=0.4.5
pyworld==0.3.2
httpx
onnxruntime; sys_platform == 'darwin'
onnxruntime-gpu; sys_platform != 'darwin'
torchcrepe==0.0.20
fastapi
torchfcpe
python-dotenv>=1.0.0
av
pybase16384