adirik / styletts2

Generates speech from text

replicate.com
Total runs: 130.4K
24-hour runs: 0
7-day runs: 0
30-day runs: 0
Github
Model's Last Updated: February 01 2024

Model Details of styletts2

Readme

StyleTTS 2

StyleTTS 2 is a text-to-speech model. It can generate speech from text alone, or from text plus a reference speech clip whose style it copies (speaker adaptation). See the original repository and paper for details.

API Usage

To use the model, provide the text you would like converted to speech as input. Optionally provide a reference speech clip (.mp3 or .wav), 2 to 8 seconds long, for speaker adaptation. The API returns an .mp3 file with the generated speech.

Input parameters are as follows:
- text: Text to convert to speech.
- reference: (optional) Reference speech to copy style from.
- alpha: Only used for long text inputs or when a reference speech is provided. Determines the timbre of the speaker. Use lower values to sample the style from the previous or reference speech instead of from the text.
- beta: Only used for long text inputs or when a reference speech is provided. Determines the prosody of the speaker. Use lower values to sample the style from the previous or reference speech instead of from the text.
- diffusion_steps: Number of diffusion steps.
- embedding_scale: Embedding scale, use higher values for pronounced emotion.
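As a rough illustration of how the parameters above fit together, the following Python sketch builds an input dictionary and passes it to the Replicate Python client. The helper names `build_input` and `synthesize` are hypothetical, and the bare model reference `adirik/styletts2` is an assumption: the client may require a pinned version suffix copied from the model page. No default values are assumed here; optional parameters are sent only when explicitly set.

```python
from typing import Any, Dict, Optional

# Assumption: a pinned "owner/name:version" reference may be required instead.
MODEL_REF = "adirik/styletts2"


def build_input(
    text: str,
    reference: Optional[Any] = None,      # URL or open file handle of a 2-8 s .mp3/.wav clip
    alpha: Optional[float] = None,        # timbre control
    beta: Optional[float] = None,         # prosody control
    diffusion_steps: Optional[int] = None,
    embedding_scale: Optional[float] = None,
) -> Dict[str, Any]:
    """Build the input payload, leaving out any optional parameter that is unset."""
    if not text or not text.strip():
        raise ValueError("text must be a non-empty string")
    payload: Dict[str, Any] = {"text": text}
    optional = {
        "reference": reference,
        "alpha": alpha,
        "beta": beta,
        "diffusion_steps": diffusion_steps,
        "embedding_scale": embedding_scale,
    }
    # Only send values the caller chose, so the model's own defaults apply otherwise.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def synthesize(payload: Dict[str, Any]) -> Any:
    """Run the model on Replicate; needs the `replicate` package and REPLICATE_API_TOKEN."""
    import replicate  # imported lazily so build_input works without the package

    # The return type (URL string or file-like object) depends on the client version.
    return replicate.run(MODEL_REF, input=payload)
```

A call such as `synthesize(build_input("Hello there.", alpha=0.3, beta=0.7))` would then request speech with custom timbre and prosody settings; the values 0.3 and 0.7 are placeholders, not recommendations.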

References
@article{Li2023StyleTTS2T,
  title={StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models},
  author={Yinghao Aaron Li and Cong Han and Vinay S. Raghavan and Gavin Mischler and Nima Mesgarani},
  journal={ArXiv},
  year={2023},
  volume={abs/2306.07691},
  url={https://api.semanticscholar.org/CorpusID:259145293}
}

Pricing of styletts2 replicate.com

Run time and cost

This model costs approximately $0.034 to run on Replicate, or 29 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 3 minutes. The predict time for this model varies significantly based on the inputs.
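The runs-per-dollar figure follows directly from the approximate per-run cost. The short Python sketch below reproduces that arithmetic and gives a rough budget estimate; the function names are illustrative, and the result is only as accurate as the quoted $0.034 average, which varies with inputs.

```python
import math

COST_PER_RUN_USD = 0.034  # approximate per-run cost quoted above


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar."""
    return math.floor(1.0 / cost_per_run)


def estimate_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough cost in USD for n_runs, rounded to cents."""
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar())  # 29
```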


More Information About styletts2 replicate.com Model

styletts2 replicate.com

styletts2 is an AI model hosted on replicate.com under the adirik account that generates speech from text and can be used immediately. replicate.com offers a free trial of the model as well as paid use. The model can be called through the API from Node.js, Python, or plain HTTP.

styletts2 replicate.com Url

https://replicate.com/adirik/styletts2

adirik styletts2 online free

replicate.com is a platform for trying models online and calling them through an API. It hosts styletts2 and offers a free online trial, available at the link below.

adirik styletts2 online free url in replicate.com:

https://replicate.com/adirik/styletts2

styletts2 install

styletts2 is an open source model, so anyone can find it on GitHub and install it at no cost. Alternatively, replicate.com hosts an installed copy that can be used directly for debugging and trial runs, including through the API.

styletts2 install url in replicate.com:

https://replicate.com/adirik/styletts2

styletts2 install url in github:

https://github.com/yl4579/StyleTTS2

Other API from adirik

replicate

Detect everything with language!

Total runs: 4.2M
Run Growth: 0
Growth Rate: 0.00%
Updated:October 23 2023
replicate

Realistic interior design with text and image inputs

Total runs: 663.5K
Run Growth: 0
Growth Rate: 0.00%
Updated:April 06 2024
replicate

Photorealism with RealVisXL V3.0 Turbo based on SDXL

Total runs: 190.7K
Run Growth: 0
Growth Rate: 0.00%
Updated:January 19 2024
replicate

Flux lora, use "CNSTLL" to trigger

Total runs: 74.8K
Run Growth: 0
Growth Rate: 0.00%
Updated:August 24 2024
replicate

Photorealism with RealVisXL V4.0

Total runs: 46.5K
Run Growth: 0
Growth Rate: 0.00%
Updated:February 20 2024
replicate

Zero-shot / open vocabulary object detection

Total runs: 23.6K
Run Growth: 0
Growth Rate: 0.00%
Updated:October 13 2023
replicate

Monocular depth estimation

Total runs: 8.0K
Run Growth: 0
Growth Rate: 0.00%
Updated:December 15 2023
replicate

Lightweight multimodal model for visual question answering, reasoning and captioning

Total runs: 7.8K
Run Growth: 0
Growth Rate: 0.00%
Updated:February 26 2024
replicate

Generate videos from text prompts with Kandinsky-2.2

Total runs: 7.3K
Run Growth: 0
Growth Rate: 0.00%
Updated:October 18 2023
replicate

Zero-shot speech synthesizer for text-to-speech and voice conversion

Total runs: 4.4K
Run Growth: 0
Growth Rate: 0.00%
Updated:December 15 2023
replicate

Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Total runs: 4.3K
Run Growth: 0
Growth Rate: 0.00%
Updated:December 07 2023
replicate

Text-guided image generation and editing

Total runs: 3.9K
Run Growth: 0
Growth Rate: 0.00%
Updated:March 08 2024
replicate

Photorealism with Realistic Vision v6.0

Total runs: 3.6K
Run Growth: 0
Growth Rate: 0.00%
Updated:March 11 2024
replicate

Editable image generation with MasaCtrl-SDXL

Total runs: 3.4K
Run Growth: 0
Growth Rate: 0.00%
Updated:December 07 2023
replicate

Generates 3D assets from images

Total runs: 2.9K
Run Growth: 0
Growth Rate: 0.00%
Updated:February 28 2024
replicate

[Non-commercial] A multi-image visual language model

Total runs: 2.6K
Run Growth: 0
Growth Rate: 0.00%
Updated:March 13 2024
replicate

[Non-commercial] A multi-image visual language model

Total runs: 2.2K
Run Growth: 0
Growth Rate: 0.00%
Updated:March 13 2024
replicate

PyTorch version of Lightweight OpenPose as introduced in "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose"

Total runs: 1.6K
Run Growth: 0
Growth Rate: 0.00%
Updated:September 09 2023
replicate

Detects objects in an image

Total runs: 1.5K
Run Growth: 0
Growth Rate: 0.00%
Updated:November 16 2023
replicate

Image-Prompt Multi-view Diffusion for 3D Generation

Total runs: 1.5K
Run Growth: 0
Growth Rate: 0.00%
Updated:January 19 2024
replicate

Generate texture for your mesh with text prompts

Total runs: 1.2K
Run Growth: 0
Growth Rate: 0.00%
Updated:November 28 2023
replicate

Multilingual speech translation that preserves original vocal style and prosody

Total runs: 1.2K
Run Growth: 0
Growth Rate: 0.00%
Updated:March 12 2024
replicate

Generate 3D assets using text descriptions

Total runs: 1.0K
Run Growth: 0
Growth Rate: 0.00%
Updated:November 16 2023
replicate

Text-Guided Image Generation and Manipulation

Total runs: 824
Run Growth: 0
Growth Rate: 0.00%
Updated:February 01 2022
replicate

Base version of Mamba 2.8B, a 2.8 billion parameter state space language model

Total runs: 810
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Mamba 2.8B state space language model fine tuned for chat

Total runs: 791
Run Growth: 0
Growth Rate: 0.00%
Updated:February 16 2024
replicate

LEdits++ for image editing

Total runs: 760
Run Growth: 0
Growth Rate: 0.00%
Updated:March 27 2024
replicate

E5-mistral-7b-instruct language embedding model

Total runs: 629
Run Growth: 0
Growth Rate: 0.00%
Updated:February 23 2024
replicate

Multilingual E5-large language embedding model

Total runs: 536
Run Growth: 0
Growth Rate: 0.00%
Updated:February 26 2024
replicate

Inst-Inpaint: Instructing to Remove Objects with Diffusion Models

Total runs: 521
Run Growth: 0
Growth Rate: 0.00%
Updated:October 04 2023
replicate

[Non-commercial] Generate texture for 3D assets using text descriptions

Total runs: 268
Run Growth: 0
Growth Rate: 0.00%
Updated:March 21 2024
replicate

Fast text-to-3D Gaussian generation by bridging 2D and 3D diffusion models

Total runs: 245
Run Growth: 0
Growth Rate: 0.00%
Updated:March 04 2024
replicate

Image editing with Prompt-to-Prompt for SDXL

Total runs: 237
Run Growth: 0
Growth Rate: 0.00%
Updated:March 16 2024
replicate

Base version of Mamba 130M, a 130 million parameter state space language model

Total runs: 140
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Generate panoramic images with text prompts

Total runs: 118
Run Growth: 0
Growth Rate: 0.00%
Updated:January 30 2024
replicate

Generating object-level shape variations with Stable Diffusion

Total runs: 82
Run Growth: 0
Growth Rate: 0.00%
Updated:December 11 2023
replicate

Performs speaker identity verification

Total runs: 76
Run Growth: 0
Growth Rate: 0.00%
Updated:November 21 2023
replicate

Base version of Mamba 1.4B, a 1.4 billion parameter state space language model

Total runs: 72
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Base version of Mamba 2.8B Slim Pyjama, a 2.8 billion parameter state space language model

Total runs: 71
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Multilingual E5-small language embedding model

Total runs: 49
Run Growth: 0
Growth Rate: 0.00%
Updated:February 26 2024
replicate

Base version of Mamba 370M, a 370 million parameter state space language model

Total runs: 48
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Base version of Mamba 790M, a 790 million parameter state space language model

Total runs: 47
Run Growth: 0
Growth Rate: 0.00%
Updated:February 06 2024
replicate

Multilingual E5-large language embedding model

Total runs: 20
Run Growth: 0
Growth Rate: 0.00%
Updated:February 26 2024