Omost Art: High Quality AI Image Generation with the Shortest Prompts
How to use Omost, best practices and explainers
Prompt: Little witch in the woods
Prompt: a crowded expo
Prompt: the yellow river
Omost Explained
Omost is a project designed to leverage the coding capabilities of large language models (LLMs) to generate and compose images.
The project's name, Omost (pronounced "almost"), reflects its purpose: after using Omost, your image is almost finished.
The "O" stands for "omni" (multi-modal), and "most" signifies the goal of getting the most out of the model.
How Omost works
Core Concept
Omost enables LLMs to write code that composes visual content on a virtual canvas. This canvas serves as a blueprint that specific image-generator implementations can render into actual images. In effect, Omost mediates between textual descriptions and visual content creation.
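To make the canvas idea concrete, here is a minimal sketch of what "code that composes a canvas" could look like. The Canvas and Region classes, their method names, and their fields are simplified stand-ins invented for illustration; they are not Omost's actual implementation, and a real renderer would need much richer information than this.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One described area of the canvas (hypothetical structure)."""
    location: str
    description: str


@dataclass
class Canvas:
    """Toy stand-in for a virtual canvas; not Omost's actual class."""
    global_description: str = ""
    regions: list[Region] = field(default_factory=list)

    def set_global_description(self, description: str) -> None:
        # Describe the image as a whole.
        self.global_description = description

    def add_local_description(self, location: str, description: str) -> None:
        # Describe one part of the image and roughly where it sits.
        self.regions.append(Region(location, description))

    def to_blueprint(self) -> dict:
        """Return a plain-data blueprint that an image generator could consume."""
        return {
            "global": self.global_description,
            "regions": [(r.location, r.description) for r in self.regions],
        }


# The kind of code an LLM might write for the prompt "little witch in the woods".
canvas = Canvas()
canvas.set_global_description("A little witch walking through a misty forest")
canvas.add_local_description("center", "a young witch in a pointed hat")
canvas.add_local_description("on the left", "tall, moss-covered trees")

blueprint = canvas.to_blueprint()
print(blueprint)
```

The point of the sketch is the division of labor: the LLM only has to emit valid code against a small API, and the resulting blueprint is plain data that a separate image generator can interpret however it likes.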
Model Training
Omost provides three pretrained LLMs based on variations of Llama3 and Phi3. These models are trained on a mix of data sources:
Ground-Truth Annotations: Data from several datasets, including Open-Images, which provide accurate annotations.
Automatically Annotated Images: Annotations produced by labeling images automatically rather than by hand.
Direct Preference Optimization (DPO): Reinforcement data that uses whether the generated code can be compiled by Python 3.10 as the preference signal.
Tuning Data: A small amount of tuning data produced with OpenAI GPT-4o’s multi-modal capabilities.
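The DPO signal described above can be sketched as follows. This is only an illustration of the idea, assuming candidates are judged with Python's built-in compile(); the source does not say how the check was actually run. The helper names and the prompt/chosen/rejected record layout are hypothetical, and compile() tests against whichever interpreter runs the script, not specifically Python 3.10.

```python
def compiles(source: str) -> bool:
    """Return True if the source parses and compiles as Python (it is not executed)."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def make_preference_pairs(prompt: str, candidates: list[str]) -> list[dict]:
    """Pair every compiling candidate (chosen) with every failing one (rejected)."""
    good = [c for c in candidates if compiles(c)]
    bad = [c for c in candidates if not compiles(c)]
    return [
        {"prompt": prompt, "chosen": g, "rejected": b}
        for g in good
        for b in bad
    ]


# Two hypothetical model outputs for the same prompt: one valid, one with
# unbalanced parentheses.
candidates = [
    "canvas = Canvas()\ncanvas.set_global_description('a crowded expo')",
    "canvas = Canvas(\ncanvas.set_global_description('a crowded expo'",
]
pairs = make_preference_pairs("a crowded expo", candidates)
print(len(pairs))
```

Because compile() only checks syntax, the undefined Canvas name in the candidates does not matter here; a stricter filter could also execute the code in a sandbox.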
How to Deploy Omost Locally
Who Made Omost
Lvmin Zhang (Lyumin Zhang) is a Ph.D. student in Computer Science at Stanford University, where he has been working under the guidance of Prof. Maneesh Agrawala since 2022. Prior to this, he served as a Research Assistant in the lab of Prof. Tien-Tsin Wong at the Chinese University of Hong Kong starting from 2021. Additionally, he has collaborated on numerous intriguing projects with Prof. Edgar Simo-Serra. Lvmin earned his B.Eng. degree from Soochow University in 2021, under the supervision of Prof. Yi Ji and Prof. Chunping Liu.
Lvmin's research interests span computational art and design, interactive content creation, computer graphics, and image and video processing, with a particular passion for anime. Reflecting this enthusiasm, he founded the Style2Paints Research group, which focuses on these areas. He also developed Style2Paints, an anime drawing tool.