Omost Art: Use the Shortest Prompts
to Generate High-Quality AI Images
How to Use Omost / Best Practices and Instructions
Prompt: Little witch in the woods
Prompt: a crowded expo
Prompt: the yellow river
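What Is Omost
Omost is a project that uses the coding capability of large language models (LLMs) to generate and compose images.
The name Omost (pronounced "almost") reflects this goal: after you use Omost, your image is almost there.
The "O" stands for "omni" (multi-modal), and "most" expresses the aim of getting the most out of it.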
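How Omost Works
Core Concept
Omost lets an LLM write code that composes visual content on a virtual canvas. The canvas serves as a blueprint that a specific image-generator implementation can render into an actual image. In essence, Omost acts as an intermediary between a text description and visual content generation.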
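To make the "code on a virtual canvas" idea concrete, here is a minimal Python sketch. It is not the Omost API: the `SketchCanvas` and `Region` classes, their method names, and the blueprint layout are all hypothetical, chosen only to show how a short program can describe a scene that a separate renderer would later turn into pixels.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One described area of the canvas (hypothetical structure)."""
    box: tuple  # (left, top, right, bottom) as fractions of the canvas, 0.0 to 1.0
    description: str


@dataclass
class SketchCanvas:
    """Toy stand-in for a virtual canvas; an assumption, not the Omost API."""
    global_description: str = ""
    regions: list = field(default_factory=list)

    def set_global_description(self, description):
        self.global_description = description

    def add_region(self, box, description):
        left, top, right, bottom = box
        # Reject boxes that are empty or fall outside the unit square.
        if not (0.0 <= left < right <= 1.0 and 0.0 <= top < bottom <= 1.0):
            raise ValueError(f"box out of bounds: {box}")
        self.regions.append(Region(box, description))

    def to_blueprint(self):
        """Return a plain dict that an image generator could consume."""
        return {
            "global": self.global_description,
            "regions": [
                {"box": r.box, "description": r.description} for r in self.regions
            ],
        }


# The kind of program an LLM might emit for the prompt "Little witch in the woods".
canvas = SketchCanvas()
canvas.set_global_description("A little witch walking through a misty forest")
canvas.add_region((0.3, 0.4, 0.7, 1.0), "a young witch in a pointed hat holding a broom")
canvas.add_region((0.0, 0.0, 1.0, 0.5), "tall dark trees with shafts of light")
blueprint = canvas.to_blueprint()
print(blueprint)
```

The point of the sketch is the separation of concerns: the LLM only has to produce a valid program, and the blueprint it yields stays independent of whichever image generator renders it.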
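Model Training
Omost provides three pretrained LLM models based on variants of Llama3 and Phi3. These models are trained on a mix of several data sources:
Ground-truth annotations: precisely annotated data from several datasets, including Open-Images.
Automatically annotated images: data extracted by annotating images automatically.
Direct Preference Optimization (DPO): reinforcement data based on whether the code can be compiled by Python 3.10.
Fine-tuning data: a small amount of data from the multi-modal capabilities of OpenAI GPT-4o.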
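The DPO signal described above, whether generated code compiles, can be sketched in a few lines of Python. This is an illustration only, not the project's training pipeline: the helper names `compiles` and `preference_pair` are hypothetical, and the check uses whichever Python interpreter runs the script rather than Python 3.10 specifically.

```python
def compiles(source: str) -> bool:
    """Return True if `source` compiles as Python under the running interpreter."""
    try:
        # compile() parses and byte-compiles the text without executing it.
        compile(source, "<llm_output>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def preference_pair(candidate_a: str, candidate_b: str):
    """Return (chosen, rejected) if exactly one candidate compiles, else None."""
    ok_a, ok_b = compiles(candidate_a), compiles(candidate_b)
    if ok_a == ok_b:
        # Both compile or both fail: this check alone gives no preference.
        return None
    return (candidate_a, candidate_b) if ok_a else (candidate_b, candidate_a)


# Two hypothetical model outputs for the prompt "a crowded expo".
good = "canvas = {}\ncanvas['global'] = 'a crowded expo'\n"
bad = "canvas = {\ncanvas['global'] = 'a crowded expo'\n"  # unclosed brace

pair = preference_pair(good, bad)
print(pair is not None and pair[0] == good)
```

A pair is produced only when the two candidates disagree, since a preference needs one output that is clearly better than the other.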
How to Deploy Omost Locally
Who Made Omost
Lvmin Zhang (Lyumin Zhang) is a Ph.D. student in Computer Science at Stanford University, where he has been working under the guidance of Prof. Maneesh Agrawala since 2022. Prior to this, he served as a Research Assistant in the lab of Prof. Tien-Tsin Wong at the Chinese University of Hong Kong starting from 2021. Additionally, he has collaborated on numerous intriguing projects with Prof. Edgar Simo-Serra. Lvmin earned his B.Eng. degree from Soochow University in 2021, under the supervision of Prof. Yi Ji and Prof. Chunping Liu.
Lvmin's research interests span computational art and design, interactive content creation, computer graphics, and image and video processing, with a particular passion for anime. Reflecting this enthusiasm, he founded the Style2Paints Research group, which focuses on these areas. Furthermore, he developed an anime drawing software named Style2Paints.