Qwen2.5-Omni Multimodal Chat Demo

Experience the omni-modal capabilities of Qwen2.5-Omni through text, images, audio, and video interactions.

Voice Type

Model Capabilities

Qwen2.5-Omni can:

  • Process and understand text
  • Analyze images and answer questions about them
  • Transcribe and understand audio
  • Analyze video content (with or without audio)
  • Generate natural speech responses

Example Prompts

Text Examples
Text Input