Combine the power of Microsoft's Edge TTS with RVC voice models to create high-quality, expressive AI speech. In this tutorial, I show you how to install the RVC TTS WebUI by litagin02, which generates speech with Edge TTS and then converts it with your RVC voice models for custom character audio.

Whether you're looking to create character-specific narration or simply want to experiment with voice synthesis, this workflow runs locally on your machine, even if you don't have a high-end GPU (though NVIDIA hardware is recommended for faster processing).

Original YouTube Tutorial: https://youtu.be/6DcSJWI32JY

Video Details:
* Original Publish Date: January 24, 2024
* Focus: TTS Integration / RVC Voice Conversion / Character Synthesis
* Tested on: GTX 1660 Super
---
Resources & Tools:
* RVC TTS WebUI (litagin02): https://github.com/litagin02/rvc-tts-webui

Follow lordcaocao2025 on Dailymotion for more technical AI research and generative workflow guides!

---
Connect with me:
📺 YouTube: https://www.youtube.com/@CaoCao2025
📱 TikTok: https://www.tiktok.com/@caocao20250
💎 Patreon: https://www.patreon.com/cw/Caocao2025

#EdgeTTS #RVC #TextToSpeech #AIVoice #AIWorkflow #lordcaocao2025
Transcript
00:00 Hello guys, welcome back with me, CaoCao2025. Today we're going to learn
00:05 about how to use text-to-speech with RVC models. For example, we use the Ryuji voice
00:12 here. It's already converted; I'll convert it again.
00:15 You see? "This is an English text-to-speech conversation demo. This is
00:28 CaoCao2025. Please enjoy the video." And it's pretty fast too with a 1660 Super.
00:40 That is: "This is CaoCao2025. Please enjoy the video." Okay, let's learn about how to
00:48 install it first. Let's go. I'm going to create a new folder, so I'm not going to
00:54 use this installation. The first step you need to do is to install Python and
01:01 Git. I'm not going to explain this; too many videos have already explained that. And now
01:06 access this link, or not, because I'm going to put the links in the description
01:11 below. The text-to-speech that I use is the RVC TTS WebUI by litagin02. It
01:17 uses Edge TTS to create this, so basically it's going to use the text-to-voice by
01:24 Windows, by Edge, and then the MP3 is going to be converted by an RVC model. That is the
01:33 RVC text-to-speech web UI. Okay, step one: you need to create a folder. For example, I
01:40 create a folder here. Create this folder, then we need to git clone this thing. So
01:47 first let's open cmd in the folder that I created. Simply copy and paste this,
01:58 like so. That's what I need you to do.
02:03 Paste it, and we're going to get this thing.
02:10 It looks really fast. After this, we need to enter the folder, so we
02:20 are now in the folder. Next we're going to create a virtual environment, so no
02:27 matter what our setup is in the regular setting, it's not going to change when we
02:34 call this thing. Okay, the virtual environment is created. Now we need to
02:43 activate it. Call it with venv\Scripts\activate. Now next we're going to
02:49 install things in our virtual environment. I'm going to install PyTorch. This is
02:57 necessary if you have an NVIDIA GPU; otherwise, if you're going to use the CPU, you don't need
03:01 to do this. But yeah, we're going to install it. It's 2.7 gigs.
03:11 It takes seven to eight minutes on my PC, so yeah, we're seeing on one hand it's
03:18 10 minutes, which kind of doesn't look too... oh, it's good. So I don't know that any
03:25 other thing too,
03:31 but nothing big, I guess.
03:38 you
03:49 available
03:50 Okay, next we're going to install the requirements.
04:33 Okay, it should be very fast.
04:56 Okay, we've already finished this.
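
The installation steps up to this point could be scripted roughly as below. This is a minimal sketch, not the exact commands typed in the video: the `venv` folder name, the `requirements.txt` file name, and the CUDA 11.8 PyTorch index URL are assumptions, and only the repository URL comes from the description. The function just builds the command lines; nothing is executed until each one is passed to `subprocess.run(cmd, check=True)`.

```python
import sys
from pathlib import Path

# Repository link taken from the video description.
REPO_URL = "https://github.com/litagin02/rvc-tts-webui"


def setup_commands(workdir: Path, use_nvidia_gpu: bool = True) -> list[list[str]]:
    """Build the clone / venv / install commands in the order used in the video."""
    repo = workdir / "rvc-tts-webui"
    # Windows venvs keep executables in Scripts, other platforms in bin.
    scripts = "Scripts" if sys.platform == "win32" else "bin"
    venv_python = str(repo / "venv" / scripts / "python")

    commands = [
        ["git", "clone", REPO_URL, str(repo)],
        [sys.executable, "-m", "venv", str(repo / "venv")],
    ]
    if use_nvidia_gpu:
        # Assumption: CUDA 11.8 wheels; pick the index that matches your driver.
        commands.append([
            venv_python, "-m", "pip", "install",
            "torch", "torchvision", "torchaudio",
            "--index-url", "https://download.pytorch.org/whl/cu118",
        ])
    # Assumption: the repository ships a requirements.txt at its root.
    commands.append(
        [venv_python, "-m", "pip", "install", "-r", str(repo / "requirements.txt")]
    )
    return commands


if __name__ == "__main__":
    # Print the plan instead of running it, so it can be reviewed first.
    for cmd in setup_commands(Path.cwd()):
        print(" ".join(cmd))
```

Calling the virtual environment's own `python -m pip` has the same effect as activating the environment first, which is why the sketch has no separate activate step.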
04:58 The next part is to go back to the folder where you installed it.
05:05 You're going to have a weights folder here.
05:07 And this folder is empty.
05:10 Now we need to put our models here.
05:14 We're going to put some here, for example.
05:17 Let's place a folder, two folders, here.
05:19 This is the folder name.
05:22 And inside each named folder, you're going to put two files: the .pth model and the .index file.
05:29 So I put the Ryuji files and Futaba Sakura here as the models.
05:34 This is your model name.
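
The folder layout described here (one subfolder per model, each holding a `.pth` file and an `.index` file) can be checked with a short script. This is a sketch under assumptions: the folder is taken to be named `weights`, the function names are hypothetical, and the "one .pth plus at least one .index" rule simply mirrors what the video does rather than a documented requirement of the web UI.

```python
from pathlib import Path


def scan_weights(weights_dir: Path) -> dict[str, dict[str, list[str]]]:
    """Map each model folder name to the .pth and .index files found inside it."""
    found = {}
    for folder in sorted(weights_dir.iterdir()):
        if not folder.is_dir():
            continue  # loose files next to the model folders are ignored
        found[folder.name] = {
            "pth": sorted(f.name for f in folder.glob("*.pth")),
            "index": sorted(f.name for f in folder.glob("*.index")),
        }
    return found


def layout_problems(found: dict[str, dict[str, list[str]]]) -> list[str]:
    """Describe every model folder that does not match the layout from the video."""
    problems = []
    for name, files in found.items():
        if len(files["pth"]) != 1:
            problems.append(f"{name}: expected one .pth file, found {len(files['pth'])}")
        if not files["index"]:
            problems.append(f"{name}: no .index file")
    return problems


if __name__ == "__main__":
    weights = Path("weights")  # assumed folder name
    if weights.is_dir():
        for line in layout_problems(scan_weights(weights)) or ["layout looks fine"]:
            print(line)
```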
05:36 Okay, after you do this, you need to call the application.
05:44 Remember, you need to enter the virtual environment first.
05:47 I'm going to exit first.
05:51 Now, how to call the application.
05:56 You need to copy this path, the location where you have app.py, the Python file, here.
06:05 And you open a new CMD.
06:10 You use cd.
06:11 You write cd, then paste the path by right-clicking.
06:16 Press Enter, and then enter the drive.
06:20 You know, for example, I used D, so I used D.
06:24 Then press Enter.
06:25 Enter, and we are here.
06:26 Remember, always enter the virtual environment first by writing venv\Scripts\activate.
06:36 Enter.
06:37 Now, use python app.py to call the application.
06:45 Okay, guys, I forgot to do this.
06:47 I need to do this first.
06:50 Download the models before I do anything else.
06:53 And then we're going to copy-paste this, curl this.
07:00 We've done this the first time.
07:05 Then it takes some time to download this, I guess.
07:09 46 megs.
07:10 It's not really that big.
07:15 Okay.
07:16 I asked the other model.
07:18 Sorry, guys.
07:19 I kind of forgot to do this first.
07:21 But it doesn't matter.
07:22 You could always do it later.
07:26 Okay.
07:27 Now the same steps.
07:31 venv\Scripts\activate
07:36 to activate the virtual environment.
07:38 And then python app.py.
07:48 Now it's loading the models.
07:50 Because, you know,
07:52 now we have the two models that we put in before,
07:55 Ryuji and Futaba Sakura.
07:58 It depends on the weights folder here.
08:06 Now, the trick is, for example, if you use a girl's voice, like Futaba Sakura here, you need to select
08:15 the model.
08:16 First, if you use English, just use US here.
08:20 And then, if it's a girl, use a girl's voice.
08:24 So you use a female voice.
08:26 For example, we're going to use Ana, female, here.
08:30 I'm going to say, "Hello.
08:35 This is Futaba Sakura from Persona 5.
08:40 Hope you enjoy this text-to-speech web UI using RVC."
08:52 You're going to convert it.
08:53 You could transpose.
08:55 Transpose the voice or something.
08:57 Increase the speech speed and stuff.
09:00 Index.
09:00 You could select the index that you want, and protect.
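
As a rough illustration of what the transpose setting does: assuming the slider counts semitones, as is usual in RVC front ends (the video does not state the unit), shifting by n semitones multiplies the pitch by 2^(n/12). The function names below are hypothetical and are not part of the web UI.

```python
def transpose_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def shift_pitch(f0_hz: float, semitones: float) -> float:
    """Apply a semitone shift to a fundamental frequency given in Hz."""
    return f0_hz * transpose_ratio(semitones)


if __name__ == "__main__":
    # Twelve semitones is one octave, so the pitch doubles or halves.
    for n in (-12, 0, 12):
        print(n, shift_pitch(220.0, n))
```

This is why a large shift is typically only needed when the Edge voice and the target character sit in very different registers; picking a matching male or female Edge voice, as the video suggests, keeps the shift near zero.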
09:06 It's going to convert it.
09:08 This is the first time you run it.
09:10 It may take a while.
09:13 But usually it doesn't.
09:16 Okay.
09:17 This is the original Edge voice.
09:20 "Hello.
09:21 This is Futaba Sakura from Persona 5.
09:24 Hope you enjoy this text-to-speech web UI using RVC."
09:29 This is the original Edge voice.
09:31 Okay.
09:31 The original.
09:32 And then the voice is converted with this model.
09:37 "Hello.
09:38 This is Futaba Sakura from Persona 5.
09:42 Hope you enjoy this text-to-speech web UI using RVC."
09:47 So it converts the voice.
09:49 It doesn't use its own model to create the voice.
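
The first stage described here, generating the plain Edge voice that the RVC model then converts, could be sketched with the `edge-tts` Python package as below. This is an illustration under assumptions rather than the web UI's actual code: the voice name `en-US-AnaNeural` is my guess at the "Ana" voice picked in the video, the output file name is made up, and the RVC conversion step itself is not shown. Running it needs `pip install edge-tts` and an internet connection.

```python
import asyncio


def speed_to_rate(speed_percent: int) -> str:
    """Format a speed change as a signed percentage string such as '+10%'."""
    return f"{speed_percent:+d}%"


async def synthesize(text: str, voice: str, speed_percent: int, out_path: str) -> None:
    """Generate speech with Edge TTS and save it as an audio file."""
    # Third-party package, imported lazily so speed_to_rate works without it.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice, rate=speed_to_rate(speed_percent))
    await communicate.save(out_path)


# Example call (performs a network request, so it is left commented out):
# asyncio.run(synthesize(
#     "Hello. This is Futaba Sakura from Persona 5.",
#     "en-US-AnaNeural",   # assumed voice name
#     10,                  # speak 10% faster
#     "edge_output.mp3",   # hypothetical output file
# ))
```

Because the RVC model only changes the timbre of this audio, the pacing and intonation of the final result come from the Edge voice chosen at this stage.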
09:55 But you could play with the speed here, and the transpose, and set the index that you like.
10:00 So it's more and more like Futaba's intonation.
10:04 The voice is right.
10:05 But the intonation follows the Edge TTS speaker that you select.
10:11 So we're going to increase the speed, like Futaba talks fast.
10:17 And we could reduce the protect, or the index, you could play with it.
10:23 If you like playing with the index.
10:25 I rarely use it.
10:27 But sometimes it makes it good.
10:28 "Hello.
10:29 This is Futaba Sakura from Persona 5.
10:32 Hope you enjoy this text-to-speech web UI using RVC."
10:36 Okay.
10:37 Now we could also use Ryuji here.
10:41 But make sure you use a male voice.
10:43 Like Christopher here.
10:45 "This is Ryuji Sakamoto."
10:53 Ryuji's book.
10:55 Maybe.
10:56 More slowly.
10:57 Okay.
11:03 "This is Ryuji Sakamoto from Persona 5.
11:07 Hope you enjoy this text-to-speech web UI using RVC."
11:11 Okay.
11:12 This is the original Edge TTS.
11:14 Now it converts to Ryuji's voice.
11:16 "This is Ryuji Sakamoto from Persona 5.
11:20 Hope you enjoy this text-to-speech web UI using RVC."
11:24 Okay guys.
11:25 That's how it works.
11:27 If you want to download the voice, you just right-click and download here.
11:31 You could also set the playback speed if you want.
11:34 Anyway, that's that.
11:35 And make sure you use rmvpe because it's really better.
11:39 And with the GTX 1660 Super that I have,
11:42 it's converted very fast.
11:43 So don't worry about it.
11:44 But maybe you could use pm if you don't have an NVIDIA GPU,
11:49 or if you feel that the conversion takes too long.
11:54 But it has worse quality, I guess.
12:01 "Hello.
12:02 This is Ryuji Sakamoto from Persona 5."
12:05 It's worse.
12:06 "Hope you enjoy this text-to-speech web UI using RVC."
12:09 The voice is not really clean if you use this pm mode.
12:13 Not "PM mode", but pitch extraction method.
12:17 You could also use the transpose to transpose.
12:21 Transpose to no voice.
12:25 Anyway, that's that.
12:27 Hope it's useful for you.
12:29 And have a nice day.