LVM-Image-Text-Similarity Filter (DLC) - Platform For AI

The LVM-Image-Text-Similarity Filter (DLC) component filters out image samples whose similarity to their description text is too low.

Supported computing resources

Deep Learning Containers (DLC)

Algorithm

The LVM-Image-Text-Similarity Filter (DLC) component uses the clip-vit-base-patch32 model to calculate the similarity between each image and its description text in the training data. The component then filters out samples whose text-image similarity is too low, which helps ensure the quality of the training data. The description text is the content that follows the <__dj__image> marker in the training data file. In most cases, the component is used to prepare data for the subsequent training of image generation models.
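
As a rough illustration, the text-image similarity can be pictured as the cosine similarity between an image embedding and a text embedding. The following minimal sketch uses hand-written toy vectors in place of real clip-vit-base-patch32 embeddings. The cosine_similarity function and the sample vectors are illustrative assumptions and are not part of the component.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Toy embeddings (assumption: real embeddings would come from the CLIP model).
image_embedding = [0.2, 0.7, 0.1]
matching_text = [0.25, 0.65, 0.05]
unrelated_text = [0.9, -0.1, 0.3]

score_match = cosine_similarity(image_embedding, matching_text)
score_unrelated = cosine_similarity(image_embedding, unrelated_text)
```

In this sketch, a description that matches the image yields a higher score than an unrelated description, which is the property that the filter relies on.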

The input is a JSONL file. In each record, the <__dj__image> marker indicates the start of the description text and the <|__dj__eoc|> marker indicates the end of the description text.

The images field is the OSS path of the image.
The text field is the description text.
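
The record layout described above can be sketched as follows. This is a minimal example under stated assumptions: the images field is written as a list that contains one path, the OSS path is a hypothetical placeholder, and the helper names make_record and extract_description are illustrative only.

```python
import json

IMAGE_TOKEN = "<__dj__image>"   # start marker of the description text
EOC_TOKEN = "<|__dj__eoc|>"     # end marker of the description text


def make_record(image_path, description):
    """Build one training record (assumption: images is a list of paths)."""
    return {
        "images": [image_path],
        "text": f"{IMAGE_TOKEN} {description} {EOC_TOKEN}",
    }


def extract_description(text):
    """Return the description text between the start and end markers."""
    start = text.index(IMAGE_TOKEN) + len(IMAGE_TOKEN)
    end = text.index(EOC_TOKEN, start)
    return text[start:end].strip()


# Hypothetical OSS path, used only for illustration.
record = make_record("oss://example-bucket/images/cat.jpg",
                     "a cat sitting on a sofa")
jsonl_line = json.dumps(record)          # one line of the JSONL file
parsed = json.loads(jsonl_line)
description = extract_description(parsed["text"])
```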

Inputs and outputs

Input ports

Connect the Read File Data component to read the OSS path in which the training data is stored.
Alternatively, configure the Image Data OSS Path parameter to select the training data file.

For more information about the training data file, see Algorithm.

Output port

The filtering results. For more information, see the description of the Output File OSS Path parameter in the following section.

Configure the component

You can configure the parameters of the LVM-Image-Text-Similarity Filter (DLC) component in Machine Learning Designer. The following table describes the parameters.

Tab	Parameter		Required	Description	Default value
Field Settings	Image Data OSS Path		No	The training data file. For more information, see Algorithm.	No default value
	Output File OSS Path		Yes	The OSS directory in which the filtering results are stored. The directory contains the following files: {name}.jsonl: the output file, whose name is specified by the Output Filename parameter. {name}_stats.jsonl: the statistics file. dj_run_yaml.yaml: the parameter configuration file used when the algorithm runs.	No default value
	Output Filename		Yes	The file name of the filtering results.	result.jsonl
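Parameter Settings	Minimum Text-Frame Similarity Score		Yes	The minimum text-image similarity. Samples whose similarity is lower than this value are filtered out.	0.1
	Maximum Text-Frame Similarity Score		Yes	The maximum text-image similarity. Samples whose similarity is higher than this value are filtered out. In most cases, set this parameter to 1.	1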
Execution Tuning	Select Resource Group	Public Resource Group	No	The instance type (CPU or GPU) and virtual private cloud (VPC) that you want to use. You must select the GPU instance type for the algorithm.	No default value
	Select Resource Group	Dedicated resource group	No	The number of CPU cores, memory, shared memory, and number of GPUs that you want to use.	No default value
	Maximum Running Duration (seconds)		No	The maximum period of time for which the component can run. If the specified period of time is exceeded, the job is terminated.	No default value
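
The effect of the two similarity score parameters and the naming of the result files can be sketched as follows. This is an illustrative sketch, not the component's implementation. It assumes that both bounds are inclusive and that {name} is the Output Filename value without its .jsonl extension. The function names, sample scores, and OSS directory are hypothetical.

```python
def keep_sample(score, min_score=0.1, max_score=1.0):
    """Return True if the score lies inside [min_score, max_score].

    Assumption: both bounds are inclusive.
    """
    return min_score <= score <= max_score


def expected_output_files(output_dir, output_filename="result.jsonl"):
    """List the three result files described in the parameter table."""
    suffix = ".jsonl"
    if output_filename.endswith(suffix):
        name = output_filename[:-len(suffix)]
    else:
        name = output_filename
    base = output_dir.rstrip("/")
    return [
        f"{base}/{name}.jsonl",        # filtered samples
        f"{base}/{name}_stats.jsonl",  # per-sample statistics
        f"{base}/dj_run_yaml.yaml",    # configuration used for the run
    ]


# Hypothetical similarity scores keyed by image file name.
scores = {"cat.jpg": 0.31, "blurry.jpg": 0.04, "dog.jpg": 0.27}
kept = [name for name, score in scores.items() if keep_sample(score)]

# Hypothetical output directory.
files = expected_output_files("oss://example-bucket/filtered/")
```

With the default values, only the sample whose score is below 0.1 is removed in this sketch.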