AIDive

AvatarFX

Turns a photo into a realistic video avatar

Description

AvatarFX is a multimodal model from Character.AI that animates a single photo into a long-form, photorealistic video. It focuses on natural-looking facial expressions, lip sync with the supplied speech, and temporally consistent motion across the face, hands, and body, including scenes with more than one speaker.

How to access

AvatarFX is currently in a closed beta. To try it:

  • Go to the official site and sign in
  • Request access to the closed beta
  • Upload a source image and an audio file, or use built-in text-to-speech
  • Click Generate to receive the video

Key capabilities

  • Generate video from one photo with realistic expressions, gestures, and audio
  • Maintain consistent face/hand/body motion in longer clips
  • Support multiple speakers and dynamic dialogue in a single video

Tech, availability, and safety

AvatarFX uses flow-based diffusion models built on a DiT (Diffusion Transformer) architecture, trained on diverse video data from which low-quality content has been filtered out. Inference is accelerated through distillation, which, according to Character.AI, reduces the number of sampling steps without sacrificing quality.

Early full access is planned for C.ai+ subscribers ($10/month), with a waitlist at character.ai/video.

Character.AI applies policy filters (including blocks on images of minors and public figures, and on prohibited content) and adds a watermark to generated videos.
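
To make the idea of "fewer sampling steps" concrete, here is a toy Python sketch. It is not AvatarFX's model or Character.AI's code. It assumes a one-dimensional flow in which a sample is produced by integrating a velocity field from t=0 to t=1 with Euler steps, and it stands in for distillation with a "student" that jumps straight to the teacher's endpoint. All names and constants (TARGET, RATE, teacher_velocity, distilled_velocity_for) are hypothetical and chosen only for illustration.

```python
import math


def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x, dt = x0, 1.0 / num_steps
    for k in range(num_steps):
        x += dt * velocity(x, k * dt)
    return x


TARGET = 2.0  # hypothetical "data" value the flow should move toward
RATE = 4.0    # hypothetical constant that makes the teacher trajectory curved


def teacher_velocity(x, t):
    # Toy stand-in for a learned velocity field (assumption, not the real model).
    return RATE * (TARGET - x)


def exact_endpoint(x0):
    # Closed-form solution of the toy ODE at t=1, used as the reference result.
    return TARGET + (x0 - TARGET) * math.exp(-RATE)


def distilled_velocity_for(x0):
    # Toy "student": a constant velocity that reaches the teacher's endpoint
    # in a single step. A real student would be a trained network; the closed
    # form is used here only to keep the sketch self-contained.
    jump = exact_endpoint(x0) - x0
    return lambda x, t: jump


x0 = -1.0
many_steps = euler_sample(teacher_velocity, x0, 200)        # slow, accurate
few_steps = euler_sample(teacher_velocity, x0, 2)           # fast, inaccurate
one_step = euler_sample(distilled_velocity_for(x0), x0, 1)  # fast, accurate

print("reference:", exact_endpoint(x0))
print("teacher, 200 steps:", many_steps)
print("teacher, 2 steps:", few_steps)
print("student, 1 step:", one_step)
```

In this toy setting the teacher needs many small steps to follow its curved trajectory, while the student covers the same distance in one step. That is the general motivation behind step distillation; how AvatarFX actually implements it is not described here.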
