Diff-Oracle: Diffusion Model for Oracle Character Generation with Controllable Styles and Contents

21 Dec 2023  ·  Jing Li, Qiu-Feng Wang, Kaizhu Huang, Rui Zhang, Siyuan Wang ·

Deciphering the oracle bone script plays a significant role in Chinese archaeology and philology. However, it is significantly challenging due to the scarcity of oracle character images. To overcome this issue, we propose Diff-Oracle, based on diffusion models (DMs), to generate sufficient controllable oracle characters. In contrast to most DMs that rely on text prompts, we incorporate a style encoder to control style information during the generation process. This encoder extracts style prompts from existing oracle character images, where style details are converted from a CLIP model into a text embedding format. Inspired by ControlNet, we introduce a content encoder to capture desired content information from content images, ensuring the fidelity of character glyphs. To train Diff-Oracle effectively, we propose to obtain pixel-level paired oracle character images (i.e., style and content images) by a pre-trained image-to-image translation model. Extensive qualitative and quantitative experiments conducted on two benchmark datasets, Oracle-241 and OBC306, demonstrate that our Diff-Oracle outperforms existing generative methods in terms of image generation, further enhancing recognition accuracy. Source codes will be available.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods