diff --git a/speaker_encoder/README.md b/speaker_encoder/README.md
index 8f7b6750..53f77724 100644
--- a/speaker_encoder/README.md
+++ b/speaker_encoder/README.md
@@ -4,9 +4,11 @@ This is an implementation of https://arxiv.org/abs/1710.10467. This model can be
 
 ![](https://user-images.githubusercontent.com/1402048/64603079-7fa5c100-d3c8-11e9-88e7-88a00d0e37d1.png)
 
+Download a pretrained model from [Released Models](https://github.com/mozilla/TTS/wiki/Released-Models) page.
+
 To run the code, you need to follow the same flow as in TTS.
 
 - Define 'config.json' for your needs. Note that, audio parameters should match your TTS model.
 - Example training call ```python speaker_encoder/train.py --config_path speaker_encoder/config.json --data_path ~/Data/Libri-TTS/train-clean-360```
 - Generate embedding vectors ```python speaker_encoder/compute_embeddings.py --use_cuda true /model/path/best_model.pth.tar model/config/path/config.json dataset/path/ output_path``` . This code parses all .wav files at the given dataset path and generates the same folder structure under the output path with the generated embedding files.
-- Watch training on Tensorboard as in TTS
\ No newline at end of file
+- Watch training on Tensorboard as in TTS