FreeStyle

Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining

  1. Jinghong Lan1,2,*
  2. Wei Cheng2,*
  3. Yunuo Chen2
  4. Ziqi Ye1
  5. Peng Xing2
  6. Yixiao Fang2
  7. Rui Wang2
  8. Yufeng Yang2
  9. Xuanyang Zhang2
  10. Xianfang Zeng2
  11. Difan Zou4
  12. Gang Yu2,‡
  13. Chi Zhang3,‡
  • 1 Fudan University
  • 2 StepFun
  • 3 Westlake University
  • 4 University of Hong Kong

* Equal contribution. ‡ Corresponding authors.

We propose FreeStyle, a scalable dual-reference generation framework based on community LoRA mining. Treating community LoRAs as compositional anchors for style and content, we design a rigorous generation and filtering pipeline to build large-scale Style-Reference (SRef) and Content-Reference (CRef) triplets across multiple base models, together with a benchmark for style-reference and dual-reference generation. An attention-level constraint and a RoPE weighting method further suppress style-reference leakage while preserving style richness.

Style Transfer

FreeStyle style transfer results

FreeStyle transfers a reference style onto the content while keeping the style strength explicit. Each card shows the two references followed by our output.

Style-Content Reference Generation

FreeStyle style-content reference generation results

Given a content reference and a style reference, FreeStyle generates an image that follows both. Thanks to the style diversity and style-content disentanglement of our dataset, it handles this dual-reference setting well.

Dataset

A dataset covering both reference settings

The FreeStyle data covers both reference settings. The CRef+SRef dataset provides triplets for content-and-style dual-reference generation, with about 480K sequences (Flux 273,682 + Illustrious 172,589 + Qwen 33,582 = 479,853) spanning 1,704 styles. The traditional SRef dataset targets pure style-reference generation, with 619,302 sequences across 622 styles.
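As a quick sanity check on the figures above, the short Python sketch below sums the per-model counts and derives each base model's share and the average number of sequences per style. The averages and shares are derived here for illustration only and are not reported on this page; the variable names are our own.

```python
# Sequence counts per base model for the CRef+SRef dataset, as quoted above.
cref_sref_counts = {
    "Flux": 273_682,
    "Illustrious": 172_589,
    "Qwen": 33_582,
}
cref_sref_styles = 1_704

# SRef dataset figures, as quoted above.
sref_sequences = 619_302
sref_styles = 622

# Total CRef+SRef sequences across the three base models.
cref_sref_total = sum(cref_sref_counts.values())

# Average sequences per style (derived figures, not reported by the page).
cref_sref_per_style = cref_sref_total / cref_sref_styles
sref_per_style = sref_sequences / sref_styles

print(f"CRef+SRef total: {cref_sref_total:,}")
for name, count in cref_sref_counts.items():
    # Share of the CRef+SRef dataset contributed by each base model.
    print(f"  {name}: {count:,} ({count / cref_sref_total:.1%})")
print(f"CRef+SRef sequences per style: {cref_sref_per_style:.1f}")
print(f"SRef sequences per style: {sref_per_style:.1f}")
```

The total confirms that "480K" is a rounding of the three per-model counts.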

SRef dataset

Style Transfer Pair Generated By Style Trigger

For the style-transfer pairs, we collect effective style trigger words from the community and use the Nano Banana model to restyle the collected content images, which cover many subject types such as people, animals, and scenes.

CRef + SRef dataset

Content and Style Reference Pair Generated By Community LoRA

We mine community LoRAs to identify effective style LoRAs and content LoRAs, then combine them to generate the dual-reference dataset. The resulting dataset covers a wide range of styles and content, which helps the model learn to disentangle style from content.
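To make the idea of combining style and content LoRAs concrete, here is a minimal Python sketch of how such combinations could be enumerated into triplet generation jobs. This page does not specify how the references and target are actually rendered, so everything here is an assumption for illustration: the `TripletJob` class, the `plan_triplets` helper, the LoRA names, and in particular the choice to render the content reference with the content LoRA alone, the style reference with the style LoRA alone, and the target with both. No image generation or filtering is performed.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TripletJob:
    """Hypothetical description of one (content ref, style ref, target) triplet."""
    base_model: str
    content_lora: str
    style_lora: str
    prompt: str

    def lora_sets(self):
        # Assumption: each image in the triplet is rendered with a different
        # LoRA subset on the same base model.
        return {
            "content_ref": [self.content_lora],
            "style_ref": [self.style_lora],
            "target": [self.content_lora, self.style_lora],
        }


def plan_triplets(base_model, content_loras, style_loras, prompt):
    """Enumerate every content/style LoRA pairing as a triplet job."""
    return [
        TripletJob(base_model, content, style, prompt)
        for content, style in product(content_loras, style_loras)
    ]


if __name__ == "__main__":
    # Placeholder LoRA names, not real community LoRAs.
    jobs = plan_triplets(
        base_model="Flux",
        content_loras=["content_lora_a", "content_lora_b"],
        style_loras=["style_lora_x", "style_lora_y", "style_lora_z"],
        prompt="a portrait in a garden",
    )
    print(len(jobs), "triplet jobs planned")
    print(jobs[0].lora_sets())
```

Under this assumption the number of candidate triplets grows as the product of the two LoRA pools, which is one reason a filtering stage would matter before the triplets are used for training.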