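Introduction
The NeMo toolkit brings together collections of models for automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS).
Each module in the toolkit ships with pretrained checkpoints and the configuration needed for training, so modules are easy to use and combine.
NeMo uses PyTorch Lightning for simple, efficient multi-GPU/multi-node mixed-precision training.
Quick start (prebuilt image)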
docker run --gpus all -it --rm --shm-size=8g \
-p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit \
stack=67108864 --device=/dev/snd raidavid/nemo bash -c "jupyter notebook"


Build from scratch
Clone NeMo
git clone git@github.com:NVIDIA/NeMo.git
Run Docker
Replace <nemo_github_folder> with the path to your NeMo directory, e.g. /home/ubuntu/NeMo
docker run --gpus all -it --rm -v <nemo_github_folder>:/NeMo --shm-size=8g \
-p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit \
stack=67108864 --device=/dev/snd nvcr.io/nvidia/pytorch:21.05-py3
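The placeholder substitution above can also be scripted. The following is a minimal sketch, not part of NeMo: the helper name build_docker_command is hypothetical, and the flags are copied from the command above. It resolves the NeMo directory to an absolute path (Docker bind mounts need one) and prints the full command.

```python
import shlex
from pathlib import Path

IMAGE = "nvcr.io/nvidia/pytorch:21.05-py3"  # image used in the command above


def build_docker_command(nemo_dir, image=IMAGE):
    """Return the `docker run` argument list with nemo_dir mounted at /NeMo.

    Hypothetical helper for illustration; the flags mirror the command above.
    """
    # Docker bind mounts need an absolute host path.
    host_path = Path(nemo_dir).expanduser().resolve()
    return [
        "docker", "run", "--gpus", "all", "-it", "--rm",
        "-v", f"{host_path}:/NeMo",
        "--shm-size=8g",
        "-p", "8888:8888", "-p", "6006:6006",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        "--device=/dev/snd",
        image,
    ]


if __name__ == "__main__":
    # Example path taken from the text above; adjust to your own checkout.
    print(shlex.join(build_docker_command("/home/ubuntu/NeMo")))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run directly.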
Install dependencies
cd /NeMo/
./reinstall.sh
Test the environment
wget https://nemo-public.s3.us-east-2.amazonaws.com/mcv-samples-ru/common_voice_ru_19034087.wav
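Before transcribing, it can help to confirm that the download is a readable WAV file. The sketch below uses only the Python standard library. The helper name describe_wav is hypothetical, and it assumes the sample is plain PCM, since the wave module raises wave.Error on other encodings.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        # Duration is the number of frames divided by frames per second.
        duration = wav.getnframes() / rate
    return channels, rate, duration


# Usage, once the wget above has finished:
# print(describe_wav("common_voice_ru_19034087.wav"))
```

The pretrained QuartzNet models are generally trained on 16 kHz mono audio, so those are the values worth checking for here.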
import nemo
import nemo.collections.asr as nemo_asr
import nemo.collections.nlp as nemo_nlp
import nemo.collections.tts as nemo_tts
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="stt_ru_quartznet15x5").cuda()
nmt_model = nemo_nlp.models.MTEncDecModel.from_pretrained(model_name='nmt_ru_en_transformer6x6').cuda()
spectrogram_generator = nemo_tts.models.FastPitchModel.from_pretrained(model_name="tts_en_fastpitch").cuda()
vocoder = nemo_tts.models.HifiGanModel.from_pretrained(model_name="tts_hifigan").cuda()
russian_text = quartznet.transcribe(['common_voice_ru_19034087.wav'])
print(russian_text)
english_text = nmt_model.translate(russian_text)
print(english_text)
def text_to_audio(text):
    parsed = spectrogram_generator.parse(text)
    spectrogram = spectrogram_generator.generate_spectrogram(tokens=parsed)
    audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)
    return audio.to('cpu').detach().numpy()
audio = text_to_audio(english_text[0])
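The script above produces audio but never writes it out. One way to listen to the result is to save it as a WAV file. The sketch below uses only the standard library. The helper name save_wav is hypothetical, and the 22050 Hz sample rate is an assumption about the pretrained TTS models, so verify it against the model configuration.

```python
import struct
import wave

SAMPLE_RATE = 22050  # assumption: output rate of the pretrained TTS models


def save_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1.0, 1.0] to a mono 16-bit PCM WAV file."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


# Usage, assuming text_to_audio returns an array of shape (1, num_samples):
# save_wav("english.wav", audio[0])
```

The pure-Python conversion is slow for long clips. A library such as soundfile would be faster, but this version needs no extra packages.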

Start Jupyter Notebook
cd /NeMo/
jupyter notebook
Build the Docker image
Dockerfile
FROM nvcr.io/nvidia/pytorch:21.05-py3
WORKDIR /
RUN git clone https://github.com/NVIDIA/NeMo.git
WORKDIR /NeMo
RUN ./reinstall.sh
Build
docker build -t raidavid/nemo .
Test
docker run --gpus all -it --rm -v /home/ubuntu:/NeMo --shm-size=8g \
-p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit \
stack=67108864 --device=/dev/snd raidavid/nemo bash -c "jupyter notebook"
Push the image
docker push raidavid/nemo