From 4b99eacb38d9983271610947267063cfd39606a6 Mon Sep 17 00:00:00 2001
From: erogol
Date: Fri, 19 Jun 2020 16:51:48 +0200
Subject: [PATCH] README update add models and method section

---
 speaker_encoder/README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/speaker_encoder/README.md b/speaker_encoder/README.md
index 38b4bb1b..b6f541f8 100644
--- a/speaker_encoder/README.md
+++ b/speaker_encoder/README.md
@@ -1,16 +1,16 @@
-### Speaker embedding (Experimental)
+### Speaker Encoder
 
 This is an implementation of https://arxiv.org/abs/1710.10467. This model can be used for voice and speaker embedding.
 
 With the code here you can generate d-vectors for both multi-speaker and single-speaker TTS datasets, then visualise and explore them along with the associated audio files in an interactive chart.
 
-Below is an example showing embedding results of various speakers. You can generate the same plot with the provided notebook as demonstrated in [this video](https://youtu.be/KW3oO7JVa7Q). 
+Below is an example showing embedding results of various speakers. You can generate the same plot with the provided notebook as demonstrated in [this video](https://youtu.be/KW3oO7JVa7Q).
 
 ![](umap.png)
 
 Download a pretrained model from [Released Models](https://github.com/mozilla/TTS/wiki/Released-Models) page.
 
-To run the code, you need to follow the same flow as in TTS. 
+To run the code, you need to follow the same flow as in TTS.
 
 - Define 'config.json' for your needs. Note that, audio parameters should match your TTS model.
 - Example training call ```python speaker_encoder/train.py --config_path speaker_encoder/config.json --data_path ~/Data/Libri-TTS/train-clean-360```