Text to Image

Overview

Image large models themselves have learned many general artistic styles and objects, but they cannot understand what specific people or things look like. Through fine-tuning, we can teach large models to draw specific people or things around you.

Currently supported fine-tuning models include: FLUX.1-Krea-dev, FLUX.1-Dev, Qwen Image, and SDXL 1.0.

This article will use a dataset of 5 specific dog photos and leverage the out-of-the-box model fine-tuning functionality provided by Model Power Ark to make the large model "remember" the characteristics of a specific dog.

Step 1: Create Fine-tuning Task

Find Model Fine-tuning in the left sidebar of the Model Power Ark console, and click New Task in the upper right corner.

entry

Fill in the task name and select task type, fine-tuning method, base model, etc.:

entry

Step 2: Dataset Preparation

The dataset greatly affects fine-tuning results. The dataset includes both data and labeling.

It is recommended that dataset images contain multiple angles of the target object. Supported image upload formats include png, jpg, jpeg, with maximum supported image size of 5M, and no more than 100 images.

Example images for this tutorial

Upload images to the fine-tuning task by clicking the Upload Images button and selecting image files.

entry

After image upload is complete, click the Label/Crop button, and the system will label your images.

entry entry

Click on an image to view the automatic labeling results in a popup, and modify as needed.

Delete labels you think are unrelated to the image
Add label content you think is needed

entry

Step 3: Fine-tuning Parameter Settings

Click Next, Configure Training Parameters to enter the task settings page, including model selection, training epochs, learning iterations per image, etc. You can set these according to your needs.

Epochs: Determines how many times the model learns the entire dataset (saves one LoRA model per epoch)
Learning Iterations per Image: Determines how many times the model learns each image during each epoch

Both determine the model's training time and also affect the model's training effectiveness.

entry

Step 4: Computing Power Selection

Click Next, Select Computing Power to enter the computing power selection page. After confirmation, click Pay and Start Training

entry

Click the Start Training button in the lower right corner to enter task queuing status. Wait a moment, and the system will allocate resources for your task and begin training.

entry

After training is complete, you can view the training results, including the model's loss. The training program saves the model once per epoch and simultaneously generates 3 images to evaluate the model's training effectiveness.

entry

Step 5: View LoRA Effects

After fine-tuning is complete, each epoch generates one LoRA model. Each LoRA model uses labels to infer and generate 3 images. Generally, lower loss values indicate better results.

You can also perform image generation tests by entering custom prompts to view model effects.

entry

Overview​

Step 1: Create Fine-tuning Task​

Step 2: Dataset Preparation​

Step 3: Fine-tuning Parameter Settings​

Step 4: Computing Power Selection​

Step 5: View LoRA Effects​

Overview

Step 1: Create Fine-tuning Task

Step 2: Dataset Preparation

Step 3: Fine-tuning Parameter Settings

Step 4: Computing Power Selection

Step 5: View LoRA Effects