UX/UI Design / Web / AI

Opus Clip User Experience Enhancement: Integrating Multi-Video Control and Text-to-Video

Duration
Jan 2024 - Apr 2024
My Role
  • UI/UX Designer
Team
  • Yihao Shi
  • Xiaowei Zhang
  • Clark Huang

01. Project Overview

Context

Opus Clip is a generative AI video tool that repurposes long videos into engaging short clips with a single click. Using a special model powered by GPT-4, Opus Clip picks out the best moments from a video, rearranges the content, and creates impactful clips. Our project aimed to enhance the existing features and introduce new functionalities to improve the user experience and expand the tool’s capabilities.

Problem Statement

Despite its advanced AI capabilities, Opus Clip faced challenges in making its video editing features more robust and user-friendly. Users needed better control over multi-video projects and a seamless way to create videos from text prompts.

Goals

  1. Enhance Multi-Video Control: Allow users to upload, edit, and mix multiple videos seamlessly.
  2. Introduce Text-to-Video Feature: Enable users to create videos from text prompts.

02. Research

Use Case Analysis

Use Case 1:

  • Scenario: Users have creative ideas but find shooting video too cumbersome.
  • Problem: Shooting and organizing footage is time-consuming and often discourages users from following through on their ideas.
  • Solution: Introduce a text-to-video feature that allows users to generate videos from text prompts, simplifying the initial video creation process.

Use Case 2:

  • Scenario: Users find it challenging to edit multiple videos and compare different results.
  • Problem: Managing and editing multiple video inputs to produce a cohesive final product is difficult and inefficient.
  • Solution: Implement a multi-video control feature that allows users to upload, mix, and edit multiple videos seamlessly, and compare different versions easily.

Target Users Comparison

Competitors: Vimeo, Veed, Canva

  • User Skill Level: High
    • These platforms target professional video editors with advanced skills.
    • Interfaces are designed for team collaboration and complex editing tasks.

Opus Clip:

  • User Skill Level: Low to Medium
    • Designed for individual content creators with less video editing experience.
    • Focuses on simplicity and ease of use.

Design Goals:

  • Effortless and Clean: The interface is straightforward and user-friendly, ideal for solo users.
  • Minimalist Approach: Prioritizes intuitive navigation and reduces complexity to make video editing accessible to everyone.

03. Design Process

Current Design

Opus Clip currently allows users to extract clips from a single video using AI.

The process involves three steps:

  1. Input Video Link: Users paste a video link into the homepage input box.
  2. Adjust Settings: Users set clip length, time frame, and keywords.
  3. Generate Clips: Users click a button to generate and compare clips, aided by AI-generated scores.

While effective, this flow accepts only one video at a time. We aimed to extend it with multi-video control and text-to-video generation.

Integrating these new features without complicating the user experience was challenging. We needed a design that maintained simplicity while offering expanded functionality.

Prototype

Design Solution 1: Merged Functionalities

Initially, we imagined a single, robust input box that merged all functionalities. Users could choose to input video, text, or both, and AI-generated content would be presented accordingly. However, this interface asked users to handle several tasks in one place, which led to confusion.

Design Solution 2: Separated Entrances

Later, we explored a clearer approach that gave each feature its own independent entrance. This made the information architecture easy for users to understand. However, every user had to go through an extra step to choose an entrance, making the interface less efficient.

Final Solution: 2 Tabs

After several iterations, we introduced two tabs on the homepage. The "Video to Clips" and "Multi-Video Mix" functionalities were combined into the first tab, while "Text-to-Video" stood on its own in the second tab.
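
To make this information architecture concrete, the two-tab entry point can be sketched as a small routing function. This is only an illustration, not Opus Clip's implementation: the names Tab, HomepageInput, and resolve_feature are hypothetical, and the rule that the first tab chooses between "Video to Clips" and "Multi-Video Mix" by counting the submitted video links is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tab(Enum):
    """The two homepage tabs (hypothetical names)."""
    VIDEO = "video"         # holds "Video to Clips" and "Multi-Video Mix"
    TEXT_TO_VIDEO = "text"  # holds "Text-to-Video"


@dataclass
class HomepageInput:
    """What a user submits from the homepage (hypothetical shape)."""
    tab: Tab
    video_links: list[str] = field(default_factory=list)
    prompt: str = ""


def resolve_feature(inp: HomepageInput) -> str:
    """Map a homepage submission to the feature that should handle it."""
    if inp.tab is Tab.TEXT_TO_VIDEO:
        if not inp.prompt.strip():
            raise ValueError("Text-to-Video needs a text prompt")
        return "Text-to-Video"

    if not inp.video_links:
        raise ValueError("The video tab needs at least one video link")

    # Assumption: one link means single-video clipping, several mean a mix.
    if len(inp.video_links) == 1:
        return "Video to Clips"
    return "Multi-Video Mix"
```

In this sketch the user never picks between single-video and multi-video modes explicitly; the first tab infers the mode from what was submitted, which is one way to add a feature without adding a step.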

Pros and Cons Comparison:

Compared with the first two options, the final solution is both intuitive and efficient. It adds no extra steps, and each step focuses on a single task, which makes the interface easy to navigate and understand.