Google has introduced Lyria 3, a new family of music generation models, now available in public preview on Vertex AI. The models generate high-fidelity stereo audio, including vocals, from text prompts and images.
Two distinct models are offered through the API:
- Lyria 3 Pro: Capable of generating complete compositions up to three minutes long, it understands musical structures such as intros, verses, choruses, and bridges.
- Lyria 3: Designed for rapid prototyping, this model generates tracks lasting up to 30 seconds, making it ideal for social media and short-form audio.
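The duration limits above suggest a simple way to decide which model a request needs. The following is a minimal sketch, assuming only the 30-second and three-minute limits stated in the list. The names `LyriaTier`, `TIERS`, and `pick_tier` are hypothetical helpers, and the labels are display names rather than Vertex AI model IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LyriaTier:
    """A model tier and the longest track it can generate (hypothetical helper)."""
    label: str
    max_seconds: int


# Limits taken from the list above. The labels are display names only;
# they are NOT actual Vertex AI model identifiers.
TIERS = (
    LyriaTier("Lyria 3", 30),
    LyriaTier("Lyria 3 Pro", 180),
)


def pick_tier(duration_seconds: int) -> LyriaTier:
    """Return the smallest tier whose limit covers the requested duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    for tier in TIERS:  # ordered from shortest to longest limit
        if duration_seconds <= tier.max_seconds:
            return tier
    raise ValueError(
        f"{duration_seconds}s exceeds the longest supported track "
        f"({TIERS[-1].max_seconds}s)"
    )


if __name__ == "__main__":
    print(pick_tier(20).label)
    print(pick_tier(95).label)
```

Choosing the smallest tier that fits keeps short clips on the model built for rapid prototyping and reserves the Pro model for full compositions.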
Why it matters: These models provide structural coherence and the ability to produce studio-quality audio directly within applications. Key features include:
- Multi-modal input: Users can generate audio from text prompts and can supply reference images to influence mood and style.
- Vocal and lyrics generation: The models can write and sing their own timed lyrics or use lyrics supplied by the user. For purely instrumental tracks, users can say so in the prompt.
- Flexible compositions: Duration controls enable the generation of full songs with distinct musical sections.
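As an illustration of how these features might come together in a single text prompt, here is a minimal sketch. The function `build_prompt` and the exact wording it produces are assumptions made for this example, not a documented Lyria prompt format. The prompting guide listed under "Getting Started with Lyria 3" is the place to confirm what the models actually respond to.

```python
from typing import Optional, Sequence, Tuple


def build_prompt(
    style: str,
    sections: Sequence[Tuple[str, int]] = (),
    lyrics: Optional[str] = None,
    instrumental: bool = False,
) -> str:
    """Assemble a plain-text music prompt (hypothetical format, for illustration)."""
    if instrumental and lyrics:
        raise ValueError("an instrumental track cannot also carry lyrics")

    # Start with the style description, normalised to end in one full stop.
    parts = [style.strip().rstrip(".") + "."]

    # Optional song structure: (section name, length in seconds) pairs.
    if sections:
        layout = ", ".join(f"{name} ({seconds}s)" for name, seconds in sections)
        parts.append(f"Structure: {layout}.")

    # Either ask for no vocals or pass user-provided lyrics along.
    if instrumental:
        parts.append("Instrumental only, no vocals.")
    elif lyrics:
        parts.append(f"Lyrics: {lyrics.strip()}")

    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Warm lo-fi hip hop",
        sections=[("intro", 10), ("verse", 20)],
        instrumental=True,
    ))
```

Keeping the prompt assembly in one function makes it easy to adjust the phrasing later without touching the code that calls the model.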
Accessing Lyria: The models can be accessed via the Vertex AI API and Vertex AI Media Studio.
Customer Experiences with Lyria 3
Artlist's Chief Product and Technology Officer, Roee Peled, highlighted the model's potential, stating that it merges their music expertise with Google's advanced generative capabilities, offering unprecedented creative control and high-fidelity output.
Carlos Perez, Engineering Lead at Freepik, noted that Lyria 3 facilitates a shift from basic generation to enhanced creative control, allowing clients to produce music that aligns closely with their vision and workflows.
Commercial Safety and Responsible Creation
Google emphasizes responsibility in the design and training of Lyria 3 models. The models use materials compliant with YouTube and Google’s terms of service. Filters are in place to ensure outputs do not infringe on existing content, and users must comply with the Terms of Service and Gen AI prohibited use policies. All outputs carry an embedded SynthID watermark and support the C2PA standard.
Getting Started with Lyria 3
For those interested in exploring best practices, the following resources are available:
- Model card
- Ultimate prompting guide for Lyria
- Gen AI SDK for Python notebook
- Agent Skill, which provides domain-specific sound design for users of Gen Media MCP tools
Google Workspace customers and Google AI subscribers can also use Lyria 3 models in Google Vids to create custom tracks that reflect their brand's style.