Omost Art: High Quality AI Image Generation with the Shortest Prompts

How to use Omost, best practices, and explainers

Prompt: Little witch in the woods

Prompt: a crowded expo

Prompt: the yellow river

a crowded expo, by Omost
the yellow river, by Omost

Try Omost

⚠️ Due to GPU limitations on Hugging Face, you may only be able to try it once or twice here.

Explore Omost Art Gallery

A hyperrealistic model with delicate watercolor-inspired makeup
Portraits

Real results, no cherry picking

Blurry image of palm leaves, white and emerald nature-based patterns, flower and nature motifs, soft tonal range
Products
Art
Abstract

Omost Explained

Omost is a project designed to leverage the coding capabilities of large language models (LLMs) to generate and compose images.

The project's name, Omost (pronounced "almost"), reflects its purpose: after using Omost, your image is almost finished.

The "O" stands for "omni" (multi-modal), and "most" signifies the goal of maximizing its capabilities.

Omost's workflow

How Omost works

Core Concept

Omost enables LLMs to write code that composes visual content on a virtual canvas. This canvas serves as a blueprint that can be rendered into actual images by specific implementations of image generators. Essentially, Omost acts as a mediator between textual descriptions and visual content creation.
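To make the idea concrete, here is a simplified sketch of what such canvas-composing code could look like. This is an illustration of the concept, not the actual Omost API: the class and method names below (`Canvas`, `set_global_description`, `add_local_description`) and all descriptions are stand-ins for whatever the real renderer consumes.

```python
# Simplified sketch (NOT the real Omost API): the LLM emits code that
# records a global scene description plus local regions on a virtual
# canvas; a renderer later turns that blueprint into an image.

class Canvas:
    """A virtual canvas holding one global description and a list of
    locally placed region descriptions."""

    def __init__(self):
        self.global_description = None
        self.regions = []

    def set_global_description(self, description, tags=""):
        self.global_description = {"description": description, "tags": tags}

    def add_local_description(self, location, description, tags=""):
        # Locations are coarse phrases like "in the center"; the image
        # generator would resolve them to concrete canvas areas.
        self.regions.append(
            {"location": location, "description": description, "tags": tags}
        )

# Code in this style is what the LLM might generate for a prompt like
# "Little witch in the woods":
canvas = Canvas()
canvas.set_global_description(
    "a little witch walking through a dark forest",
    tags="witch, forest, fantasy",
)
canvas.add_local_description("in the center", "a small witch in a pointed hat")
canvas.add_local_description("in the background", "tall moss-covered trees")
```

The point of the intermediate code step is that spatial layout becomes explicit and editable, instead of being buried inside a single text prompt.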

Model Training

Omost provides three pre-trained LLM models based on variations of Llama3 and Phi3. These models are trained using a mix of data sources:

  1. Ground-Truth Annotations: Data from several datasets, including Open-Images, which provide accurate annotations.

  2. Automatically Annotated Images: Data extracted by automatically annotating images.

  3. Direct Preference Optimization (DPO): Reinforcement data based on whether the generated code compiles under Python 3.10 or not.

  4. Tuning Data: A small amount of data generated with OpenAI GPT-4o's multi-modal capabilities.
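The compilability signal in the DPO step can be sketched with Python's built-in `compile`: output that parses cleanly is treated as preferred, output that raises a `SyntaxError` as rejected. The function name `compiles_ok` is illustrative, not from Omost.

```python
# Minimal sketch of a compilability check usable as a DPO preference
# signal: does the LLM-generated source parse as valid Python?

def compiles_ok(source: str) -> bool:
    try:
        # compile() only checks syntax; it does not execute the code
        # or verify that names like Canvas actually exist.
        compile(source, "<llm-output>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles_ok("canvas = Canvas()"))    # valid syntax -> True
print(compiles_ok("canvas = = Canvas("))   # invalid syntax -> False
```

Note that this only catches syntax errors; runtime failures (undefined names, bad arguments) would need a separate check.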

Limitations of Omost

  • Limited to specific types of interactions and scenarios.

  • Requires significant computational resources.

  • Dependency on SDXL's performance.

  • Potential biases present in the model.

  • Limited support for non-English languages.

  • The Hugging Face version is heavily censored; you cannot generate images of celebrities.

  • May produce less accurate results for complex or nuanced queries.

2024 NBA final, made by Omost

Prompt: 2024 NBA FINAL

A FAILED example. See more examples: Omost Art Blog

Comments on Omost

"It's a huge improvement in how image generation models understand prompts"

—— op7418

"Imo it's quite interesting, but also not a game changer."

—— Devikar

"This is instruct-pix2pix on asteroids"

—— cunicode

"Omost is very jank from a software design perspective and reckless from a security perspective."

—— Reddit comment

"I'm tired of prompt engineering for SD models. Omost has put an end to that"

—— Luka Hou

"This is ridiculous. I've used Claude and GPT for the same purpose; yeah, it works, but really no big news"

—— Reddit Comment

❤️ Lovers: Omost is so powerful and has so much potential!
💩 Haters: fancy concept, but the outcome is no better than an SD 1.5 realistic model
"This should be one of the major upfront techs for achieving AGI"

—— iPrompt

"Tried it. Still can't connect an umbrella handle lol."

—— Youtube Comment

How to Deploy Omost Locally

Requirements: an Nvidia GPU with at least 8 GB of VRAM
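A local setup typically follows the pattern below. This is a sketch based on common practice for projects of this kind, not a transcript of the official instructions; the conda environment name, CUDA index URL, and entry-point script are assumptions, so check the official Omost repository before running it.

```shell
# Clone the Omost repository and enter it.
git clone https://github.com/lllyasviel/Omost.git
cd Omost

# Create an isolated Python 3.10 environment (venv works too);
# the environment name "omost" is just a convention.
conda create -n omost python=3.10 -y
conda activate omost

# Install PyTorch with CUDA support, then the project's dependencies.
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Launch the local web UI (entry-point name assumed).
python gradio_app.py
```

On first launch, expect a long wait while the LLM and SDXL weights download; subsequent runs use the local cache.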

Who Made Omost

Lvmin Zhang (Lyumin Zhang) is a Ph.D. student in Computer Science at Stanford University, where he has been working under the guidance of Prof. Maneesh Agrawala since 2022. Prior to this, he served as a Research Assistant in the lab of Prof. Tien-Tsin Wong at the Chinese University of Hong Kong starting from 2021. Additionally, he has collaborated on numerous intriguing projects with Prof. Edgar Simo-Serra. Lvmin earned his B.Eng. degree from Soochow University in 2021, under the supervision of Prof. Yi Ji and Prof. Chunping Liu.

Lvmin's research interests span computational art and design, interactive content creation, computer graphics, and image and video processing, with a particular passion for anime. Reflecting this enthusiasm, he founded the Style2Paints Research group, which focuses on these areas, and developed anime drawing software of the same name, Style2Paints.

Zhang's Other Projects
Lvmin Zhang