Platform for AI: LVM-Duration Filter (DLC)

Last Updated: Jul 08, 2024

The LVM-Duration Filter (DLC) component of Platform for AI (PAI) filters out videos whose duration is too long or too short. Only MP4 videos can be processed.

Supported computing resources

Deep Learning Containers (DLC)

Algorithm

The LVM-Duration Filter (DLC) component calculates the duration of each video and filters out videos that are too long or too short, which helps ensure the quality of the video data. In most cases, the component is used to prepare data for the subsequent training of video generation models.
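The filtering idea can be illustrated with a minimal sketch. This is not the component's implementation: the record keys `video` and `duration` are hypothetical, and treating both bounds as inclusive is an assumption. The default bounds mirror the documented defaults of 0 and 60 seconds.

```python
def filter_by_duration(records, min_duration=0.0, max_duration=60.0):
    """Keep records whose duration in seconds lies in [min_duration, max_duration].

    Assumption: both bounds are inclusive. The real component may differ.
    """
    if min_duration > max_duration:
        raise ValueError("min_duration must not exceed max_duration")
    return [r for r in records if min_duration <= r["duration"] <= max_duration]


# Hypothetical metadata records; the key names are illustrative only.
samples = [
    {"video": "a.mp4", "duration": 3.5},
    {"video": "b.mp4", "duration": 75.0},
    {"video": "c.mp4", "duration": 60.0},
]

kept = filter_by_duration(samples)
print([r["video"] for r in kept])
```

In this sketch, `b.mp4` is dropped because its duration exceeds the maximum, while `c.mp4` is kept because the upper bound is assumed to be inclusive.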

Inputs and outputs

Input ports

  • The Read File Data component is used to read the Object Storage Service (OSS) path in which the training data is stored.

  • You can configure the Video Data OSS Path parameter to select the OSS directory in which the video data is stored or the video metadata file. For more information, see the parameter description in the following section.

  • You can use any component of LVM Data Processing (DLC) as the input.

Output port

The filtering results. For more information, see the parameter description in the following section.
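The filtering results are written as a JSON Lines file, so they can be inspected with a few lines of Python once the file has been downloaded from OSS. The following is a minimal sketch under that assumption. The record schema is not documented here, so the `video` key is hypothetical and an in-memory stream stands in for the downloaded file.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for an opened local copy of the results file.
# The key "video" is hypothetical; check the real file for its actual fields.
sample = io.StringIO('{"video": "a.mp4"}\n\n{"video": "c.mp4"}\n')

rows = list(read_jsonl(sample))
print(len(rows))
```

To read a real file, replace the `io.StringIO` object with `open("result.jsonl", encoding="utf-8")`, adjusting the name if you changed the output filename.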

Configure the component

You can configure the parameters of the LVM-Duration Filter (DLC) component in Machine Learning Designer. The following table describes the parameters.

| Tab | Parameter | Required | Description | Default value |
| --- | --- | --- | --- | --- |
| Field Settings | Video Data OSS Path | No | If no upstream component exists the first time you run this component, you must manually select the OSS directory in which the video data is stored. When the component runs, the video metadata file video_meta.jsonl is generated in the upper-level directory of the directory specified by this parameter. When you use the component to process the video data later, you can directly select the file video_meta.jsonl. | No default value |
| Field Settings | Output File OSS Path | Yes | The OSS directory in which the filtering results are stored. The results include the following files: {name}.jsonl, the output file, whose name you can specify by configuring the Output Filename parameter; {name}_stats.jsonl, the state file; dj_run_yaml.yaml, the parameter configuration file used when the algorithm runs. | No default value |
| Field Settings | Output Filename | Yes | The file name of the filtering results. | result.jsonl |
| Parameter Settings | Minimum Duration Time (s) | Yes | The minimum duration. Unit: seconds. | 0 |
| Parameter Settings | Maximum Duration Time (s) | Yes | The maximum duration. Unit: seconds. | 60 |
| Execution Tuning | Number of Processes | Yes | The number of processes. | 4 |
| Execution Tuning | Select Resource Group: Public Resource Group | No | The instance type (CPU or GPU) and virtual private cloud (VPC) that you want to use. We recommend that you use the CPU instance type to save costs. | No default value |
| Execution Tuning | Select Resource Group: Dedicated resource group | No | The number of vCPUs, memory, shared memory, and number of GPUs that you want to use. | No default value |
| Execution Tuning | Maximum Running Duration (seconds) | No | The maximum period of time for which the component can run. If the specified period of time is exceeded, the job is terminated. | No default value |
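Before you enter the values in Machine Learning Designer, it can help to sanity-check them. The sketch below collects the required parameters and their documented defaults in a small Python class. The class, its field names, and the validation rules are hypothetical and are not part of any PAI SDK; in particular, rejecting negative durations and requiring at least one process are assumptions. The OSS path is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class DurationFilterSettings:
    """Hypothetical container for the documented parameters and defaults."""
    output_oss_path: str                   # Output File OSS Path (required)
    output_filename: str = "result.jsonl"  # Output Filename
    min_duration_s: float = 0              # Minimum Duration Time (s)
    max_duration_s: float = 60             # Maximum Duration Time (s)
    num_processes: int = 4                 # Number of Processes

    def validate(self):
        """Return a list of problems; an empty list means the values look sane."""
        problems = []
        if not self.output_oss_path:
            problems.append("Output File OSS Path is required")
        if not self.output_filename:
            problems.append("Output Filename is required")
        if self.min_duration_s < 0:  # assumption: negative durations are invalid
            problems.append("Minimum Duration Time must not be negative")
        if self.max_duration_s < self.min_duration_s:
            problems.append("Maximum Duration Time is less than Minimum Duration Time")
        if self.num_processes < 1:  # assumption: at least one process is needed
            problems.append("Number of Processes must be at least 1")
        return problems


# Placeholder OSS path, for illustration only.
settings = DurationFilterSettings(output_oss_path="oss://example-bucket/output/")
problems = settings.validate()

bad = DurationFilterSettings(output_oss_path="oss://example-bucket/output/", min_duration_s=90)
bad_problems = bad.validate()
print(problems, bad_problems)
```

The second instance is flagged because a minimum of 90 seconds exceeds the default maximum of 60 seconds, which would leave no video able to pass the filter.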