HFAI X MMCV¶
To make it easy to use FFRecord with frameworks such as mmdetection and mmsegmentation, we provide the FFRecordClient interface. In the configuration file, users can select FFRecordClient to replace the default file-reading backend.
Before using FFRecordClient, we recommend first understanding how mmcv's FileClient works.
Usage¶
Pack the whole dataset folder with FFRecord:
```python
import ffrecord
ffrecord.pack_folder("/path/to/dataset", "packed.ffr")
```
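Conceptually, packing a folder means storing each file's bytes together with an index that maps relative paths to offsets. The toy format below is invented purely for illustration and is not the real `.ffr` layout; `toy_pack_folder` and `toy_read` are hypothetical names.

```python
import json
import os
import struct
import tempfile

def toy_pack_folder(src_dir, out_file):
    """Write every file under src_dir into one archive: a JSON index of
    relative path -> (offset, size), then the concatenated file bytes.
    Illustrative only; the real FFRecord format differs."""
    entries, blobs, offset = {}, [], 0
    for root, _, names in os.walk(src_dir):
        for name in sorted(names):
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                data = f.read()
            rel = os.path.relpath(path, src_dir)
            entries[rel] = (offset, len(data))
            blobs.append(data)
            offset += len(data)
    index = json.dumps(entries).encode()
    with open(out_file, "wb") as f:
        f.write(struct.pack("<Q", len(index)))  # 8-byte index-length header
        f.write(index)
        for blob in blobs:
            f.write(blob)

def toy_read(archive, rel_path):
    """Read one file back from the toy archive by its relative path."""
    with open(archive, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        entries = json.loads(f.read(n))
        offset, size = entries[rel_path]
        f.seek(8 + n + offset)
        return f.read(size)

with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    with open(os.path.join(src, "x.txt"), "wb") as f:
        f.write(b"hello")
    archive = os.path.join(dst, "toy.ffr")
    toy_pack_folder(src, archive)
    result = toy_read(archive, "x.txt")
    print(result)  # b'hello'
```

Because every read is an offset-based seek into one large file, a packed dataset avoids the per-file open overhead of reading many small files.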
During training, use the FFRecordClient reading backend we provide.
Import FFRecordClient in your training code:
```python
import hfai.utils.mm
```
Modify the configuration file and add the `file_client_args` parameter to every operation that reads small files. For example, in mmseg:

```python
file_client_args = dict(
    backend="ffrecord",
    fname="packed.ffr",
)
train_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(type='LoadAnnotations', reduce_zero_label=True,
         file_client_args=file_client_args),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    ...
]
```
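When a pipeline has many steps, adding `file_client_args` to each loader by hand is repetitive. The helper below injects it into every loader-type step; `inject_file_client` is a hypothetical convenience function, not part of hfai or mmcv.

```python
def inject_file_client(pipeline, file_client_args,
                       loader_types=("LoadImageFromFile", "LoadAnnotations")):
    """Return a copy of the pipeline with file_client_args added to every
    step whose type is in loader_types. Hypothetical helper, not an API."""
    out = []
    for step in pipeline:
        step = dict(step)  # shallow copy so the input pipeline is untouched
        if step.get("type") in loader_types:
            step["file_client_args"] = file_client_args
        out.append(step)
    return out

file_client_args = dict(backend="ffrecord", fname="packed.ffr")
train_pipeline = inject_file_client(
    [
        dict(type="LoadImageFromFile"),
        dict(type="LoadAnnotations", reduce_zero_label=True),
        dict(type="Resize", img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    ],
    file_client_args,
)
print(train_pipeline[0]["file_client_args"]["backend"])  # ffrecord
```

Non-loader steps such as `Resize` are passed through unchanged, matching the configs below, which only attach `file_client_args` to the loading operations.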
Examples¶
The examples below use the following package versions:
mmcv==1.6.2
mmdet==2.25.2
mmseg==0.28.0
mmdetection¶
Take the COCO dataset as an example. Assume we already have the raw COCO dataset with the following directory structure:
```
coco/
├── annotations
├── train2017
└── val2017
```
We first pack the whole dataset into `data/coco/coco.ffr`:

```python
import ffrecord
ffrecord.pack_folder("coco/", "data/coco/coco.ffr")
```
Then copy the annotations folder out separately into `data/coco/annotations`. The directory structure is now:

```
data
└── coco
    ├── annotations
    └── coco.ffr
```
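The annotations are kept unpacked because the dataset class reads the JSON annotation files directly from disk. This copy step can be scripted; `copy_annotations` below is a hypothetical helper written for this layout.

```python
import os
import shutil

def copy_annotations(dataset_root, packed_root):
    """Copy <dataset_root>/annotations to <packed_root>/annotations so the
    JSON annotation files stay readable outside the packed archive."""
    src = os.path.join(dataset_root, "annotations")
    dst = os.path.join(packed_root, "annotations")
    os.makedirs(packed_root, exist_ok=True)
    shutil.copytree(src, dst)
    return dst

# For the layout above this would be:
# copy_annotations("coco", "data/coco")
```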
Import FFRecordClient in your training code:

```python
import hfai.utils.mm
```
Change the contents of the configuration file `configs/_base_/datasets/coco_detection.py` to:

```python
# dataset settings
dataset_type = 'CocoDataset'
# data_root = 'data/coco/'
file_client_args = dict(
    backend="ffrecord",
    fname="data/coco/coco.ffr",
)
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='val2017/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='bbox')
```
mmsegmentation¶
Take the ADE20k dataset as an example. Assume we already have the raw ADE20k dataset with the following directory structure:
```
ADEChallengeData2016
├── annotations
├── images
├── objectInfo150.txt
└── sceneCategories.txt
```
We first pack the whole dataset into `data/ade20k.ffr`:

```python
import ffrecord
ffrecord.pack_folder("ADEChallengeData2016/", "data/ade20k.ffr")
```
Import FFRecordClient in your training code:

```python
import hfai.utils.mm
```
Change the contents of the configuration file `configs/_base_/datasets/ade20k.py` to:

```python
# dataset settings
dataset_type = 'ADE20KDataset'
# data_root = 'data/ade/ADEChallengeData2016'
data_root = None
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
file_client_args = dict(
    backend="ffrecord",
    fname="data/ade20k.ffr",
)
gt_seg_map_loader_cfg = dict(file_client_args=file_client_args)
train_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(type='LoadAnnotations', reduce_zero_label=True,
         file_client_args=file_client_args),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile', file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/training',
        ann_dir='annotations/training',
        pipeline=train_pipeline,
        file_client_args=file_client_args),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=test_pipeline,
        file_client_args=file_client_args,
        gt_seg_map_loader_cfg=gt_seg_map_loader_cfg),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='images/validation',
        ann_dir='annotations/validation',
        pipeline=test_pipeline,
        file_client_args=file_client_args,
        gt_seg_map_loader_cfg=gt_seg_map_loader_cfg)
)
```