Tencent Releases HY-Motion 1.0: A Billion-Parameter Text-to-Motion Model


These articles are AI-generated summaries. Please check the original sources for full details.

Billion-Parameter Text-to-Motion with HY-Motion 1.0

Tencent Hunyuan has released HY-Motion 1.0, an open-weight text-to-3D human motion model with 1 billion parameters. Built on the Diffusion Transformer (DiT) architecture and Flow Matching, HY-Motion 1.0 generates 3D human motion clips on an SMPL-H skeleton from natural language prompts and specified durations.

Why This Matters

Current text-to-motion systems often struggle to generate realistic, semantically accurate movement, especially for complex activities or longer sequences. Existing models frequently produce unnatural poses or jitter, or fail to follow the text instructions. These limitations hinder adoption in applications such as game development and virtual avatars, where believable animation is critical and flawed output often requires extensive manual correction, which drives up production costs.

Key Insights

  • 78.6% SSAE Score: HY-Motion 1.0 achieves a Structural Similarity and Animation Evaluation (SSAE) score of 78.6%, outperforming baseline models like DART and MoMask by over 20 percentage points.
  • Diffusion Transformers for Motion: The model uses the DiT architecture, adapted for motion data, and relies on its attention mechanism to model dependencies across the motion sequence.
  • Flow Matching for Stable Training: Training with Flow Matching, rather than traditional denoising diffusion, is reported to give more stable training and better performance on long sequences.
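
The idea behind Flow Matching can be sketched in a few lines. This is a toy illustration, not HY-Motion's training code: it assumes the common straight-line path between a noise sample and a data sample, and every function name here is hypothetical. The model is trained to predict the velocity along that path, and sampling integrates the predicted velocity from noise to data.

```python
def interpolate(x0, x1, t):
    """Point at time t in [0, 1] on the straight path from noise x0 to data x1."""
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; constant in t for linear interpolation."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict_velocity, x0, steps=10):
    """Generate a sample by integrating the velocity field from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noise = [0.0, 1.0]   # stand-in for a Gaussian noise sample
    data = [2.0, -1.0]   # stand-in for one frame of motion features

    # A "perfect" predictor for this single pair, used only for the demo.
    def oracle(x, t):
        return target_velocity(noise, data)

    print(flow_matching_loss(oracle, noise, data, t=0.3))
    print(euler_sample(oracle, noise, steps=10))
```

In a real system the predictor would be a neural network (here, a DiT conditioned on the text prompt), and the loss would be averaged over many noise/data pairs and random values of t.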

Working Example

# Inference script example (simplified).
# Note: the hy_motion module, the HYMotion class, and its methods are
# illustrative; check the official repository for the actual interface.
from hy_motion import HYMotion

# Load the model
model = HYMotion(size="1B")  # or "Lite" for the smaller model

# Define the prompt and duration
prompt = "a person walking slowly"
duration = 5  # seconds

# Generate the motion
motion = model.generate(prompt, duration)

# Save the motion data (e.g., as a .bvh file)
motion.save("walking.bvh")

Practical Applications

  • Game Development: Automate the creation of character animations based on narrative scripts.
  • Virtual Reality/Metaverse: Enable more realistic and responsive avatars for immersive experiences.
  • Pitfall: Relying on synthetic prompting data without sufficient domain-specific fine-tuning can result in models that produce unrealistic or unnatural motions.
