Bark is a transformer-based text-to-audio model created by Suno. It can generate highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects. The model can also produce nonverbal sounds such as laughing, sighing, and crying. To support the research community, Suno provides pretrained model checkpoints that are ready for inference and available for commercial use.
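As a rough illustration of running inference with the pretrained checkpoints, the sketch below assumes the open-source `bark` Python package and `scipy` are installed, and that nonverbal sounds are requested with bracketed tags such as `[laughs]` and `[sighs]` in the prompt. The helper names `build_prompt` and `synthesize`, the tag table, and the output path are hypothetical and chosen only for this example; the exact set of supported tags should be checked against the Bark documentation.

```python
import importlib.util

# Assumed cue tags; verify the supported set against the Bark documentation.
NONVERBAL_TAGS = {"laugh": "[laughs]", "sigh": "[sighs]", "music": "[music]"}


def build_prompt(text, cues=()):
    """Append nonverbal cue tags to a text prompt (hypothetical helper)."""
    tags = [NONVERBAL_TAGS[cue] for cue in cues]
    return " ".join([text.strip(), *tags])


def synthesize(prompt, out_path="bark_out.wav"):
    """Generate a WAV file with Bark; return None if Bark is not installed."""
    if importlib.util.find_spec("bark") is None:
        return None
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                  # downloads and caches the checkpoints on first use
    audio = generate_audio(prompt)    # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path
```

Calling `synthesize(build_prompt("Hello there.", ["laugh"]))` would then write the generated speech to `bark_out.wav`, provided the package and its model weights are available.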