# HFAI X MMCV

To make it easy to use FFRecord with frameworks such as mmdetection and mmsegmentation, we provide the FFRecordClient interface. In the configuration file, users can select FFRecordClient to replace the default read backend.

Before using FFRecordClient, we recommend first reading about [how mmcv's FileClient works](https://zhuanlan.zhihu.com/p/339190576).

## Usage

1. Pack the whole dataset folder with FFRecord:

   ```python
   import ffrecord
   ffrecord.pack_folder("/path/to/dataset", "packed.ffr")
   ```

2. Use the FFRecordClient read backend we provide during training:

   - Import FFRecordClient in the training code:

     ```python
     import hfai.utils.mm
     ```

   - Modify the configuration file and add a `file_client_args` argument to every operation that reads small files. For example, in mmseg:

     ```python
     file_client_args = dict(
         backend="ffrecord",
         fname="packed.ffr",
     )

     train_pipeline = [
         dict(type='LoadImageFromFile', file_client_args=file_client_args),
         dict(type='LoadAnnotations', reduce_zero_label=True, file_client_args=file_client_args),
         dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
         ...
     ]
     ```

## Examples

The examples below use the following package versions:

```
mmcv==1.6.2
mmdet==2.25.2
mmseg==0.28.0
```

### mmdetection

Take the COCO dataset as an example. Suppose we already have the raw COCO dataset with the following layout:

```
coco/
├── annotations
├── train2017
└── val2017
```

1. First, pack the whole dataset into `data/coco/coco.ffr`:

   ```python
   import ffrecord
   ffrecord.pack_folder("coco/", "data/coco/coco.ffr")
   ```

2. Then copy the annotations folder out separately to `data/coco/annotations` (the annotation JSON files are loaded directly from disk, not through FFRecordClient). The layout is now:

   ```
   data
   └── coco
       ├── annotations
       └── coco.ffr
   ```

3. Import FFRecordClient in the training code:

   ```python
   import hfai.utils.mm
   ```
4. Modify the configuration file `configs/_base_/datasets/coco_detection.py` to:

   ```python
   # dataset settings
   dataset_type = 'CocoDataset'
   # data_root = 'data/coco/'
   file_client_args = dict(
       backend="ffrecord",
       fname="data/coco/coco.ffr",
   )
   img_norm_cfg = dict(
       mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
   train_pipeline = [
       dict(type='LoadImageFromFile', file_client_args=file_client_args),
       dict(type='LoadAnnotations', with_bbox=True),
       dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
       dict(type='RandomFlip', flip_ratio=0.5),
       dict(type='Normalize', **img_norm_cfg),
       dict(type='Pad', size_divisor=32),
       dict(type='DefaultFormatBundle'),
       dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
   ]
   test_pipeline = [
       dict(type='LoadImageFromFile', file_client_args=file_client_args),
       dict(
           type='MultiScaleFlipAug',
           img_scale=(1333, 800),
           flip=False,
           transforms=[
               dict(type='Resize', keep_ratio=True),
               dict(type='RandomFlip'),
               dict(type='Normalize', **img_norm_cfg),
               dict(type='Pad', size_divisor=32),
               dict(type='ImageToTensor', keys=['img']),
               dict(type='Collect', keys=['img']),
           ])
   ]
   data = dict(
       samples_per_gpu=2,
       workers_per_gpu=2,
       train=dict(
           type=dataset_type,
           ann_file='data/coco/annotations/instances_train2017.json',
           img_prefix='train2017/',
           pipeline=train_pipeline),
       val=dict(
           type=dataset_type,
           ann_file='data/coco/annotations/instances_val2017.json',
           img_prefix='val2017/',
           pipeline=test_pipeline),
       test=dict(
           type=dataset_type,
           ann_file='data/coco/annotations/instances_val2017.json',
           img_prefix='val2017/',
           pipeline=test_pipeline))
   evaluation = dict(interval=1, metric='bbox')
   ```

### mmsegmentation

Take the ADE20k dataset as an example. Suppose we already have the raw ADE20k dataset with the following layout:

```
ADEChallengeData2016
├── annotations
├── images
├── objectInfo150.txt
└── sceneCategories.txt
```

1. First, pack the whole dataset into `data/ade20k.ffr`:

   ```python
   import ffrecord
   ffrecord.pack_folder("ADEChallengeData2016/", "data/ade20k.ffr")
   ```

2. Import FFRecordClient in the training code:

   ```python
   import hfai.utils.mm
   ```
3. Modify the configuration file `configs/_base_/datasets/ade20k.py` to:

   ```python
   # dataset settings
   dataset_type = 'ADE20KDataset'
   # data_root = 'data/ade/ADEChallengeData2016'
   data_root = None
   img_norm_cfg = dict(
       mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
   crop_size = (512, 512)
   file_client_args = dict(
       backend="ffrecord",
       fname="data/ade20k.ffr",
   )
   gt_seg_map_loader_cfg = dict(file_client_args=file_client_args)
   train_pipeline = [
       dict(type='LoadImageFromFile', file_client_args=file_client_args),
       dict(type='LoadAnnotations', reduce_zero_label=True, file_client_args=file_client_args),
       dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
       dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
       dict(type='RandomFlip', prob=0.5),
       dict(type='PhotoMetricDistortion'),
       dict(type='Normalize', **img_norm_cfg),
       dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
       dict(type='DefaultFormatBundle'),
       dict(type='Collect', keys=['img', 'gt_semantic_seg']),
   ]
   test_pipeline = [
       dict(type='LoadImageFromFile', file_client_args=file_client_args),
       dict(
           type='MultiScaleFlipAug',
           img_scale=(2048, 512),
           # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
           flip=False,
           transforms=[
               dict(type='Resize', keep_ratio=True),
               dict(type='RandomFlip'),
               dict(type='Normalize', **img_norm_cfg),
               dict(type='ImageToTensor', keys=['img']),
               dict(type='Collect', keys=['img']),
           ])
   ]
   data = dict(
       samples_per_gpu=4,
       workers_per_gpu=4,
       train=dict(
           type=dataset_type,
           data_root=data_root,
           img_dir='images/training',
           ann_dir='annotations/training',
           pipeline=train_pipeline,
           file_client_args=file_client_args),
       val=dict(
           type=dataset_type,
           data_root=data_root,
           img_dir='images/validation',
           ann_dir='annotations/validation',
           pipeline=test_pipeline,
           file_client_args=file_client_args,
           gt_seg_map_loader_cfg=gt_seg_map_loader_cfg),
       test=dict(
           type=dataset_type,
           data_root=data_root,
           img_dir='images/validation',
           ann_dir='annotations/validation',
           pipeline=test_pipeline,
           file_client_args=file_client_args,
           gt_seg_map_loader_cfg=gt_seg_map_loader_cfg))
   ```
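## How the backend switch works

The reason `import hfai.utils.mm` plus a `backend="ffrecord"` entry in `file_client_args` is enough to reroute all small-file reads can be sketched as follows. This is a simplified, self-contained illustration of mmcv's `FileClient` dispatch pattern, not the real implementation: `ToyFFRecordBackend` is a hypothetical stand-in that serves bytes from an in-memory dict instead of a packed `.ffr` file.

```python
# Simplified sketch of the mmcv FileClient backend mechanism.
# The real FFRecordClient is registered by `import hfai.utils.mm`
# under the name "ffrecord"; this toy backend fakes the packed file.

class FileClient:
    _backends = {}  # backend name -> backend class

    @classmethod
    def register_backend(cls, name, backend):
        cls._backends[name] = backend

    def __init__(self, backend="disk", **kwargs):
        # The `file_client_args` dict from the config is passed here,
        # e.g. dict(backend="ffrecord", fname="packed.ffr").
        self.client = self._backends[backend](**kwargs)

    def get(self, filepath):
        # Loaders such as LoadImageFromFile call get() for raw bytes.
        return self.client.get(filepath)


class ToyFFRecordBackend:
    """Hypothetical stand-in: maps relative paths to bytes in memory."""

    def __init__(self, fname):
        # A real backend would open the packed .ffr file here.
        self.fname = fname
        self._packed = {"train2017/0001.jpg": b"\xff\xd8fake-jpeg"}

    def get(self, filepath):
        return self._packed[filepath]


# What `import hfai.utils.mm` effectively does once at import time:
FileClient.register_backend("ffrecord", ToyFFRecordBackend)

# What a pipeline transform does with file_client_args from the config:
file_client_args = dict(backend="ffrecord", fname="packed.ffr")
client = FileClient(**file_client_args)
data = client.get("train2017/0001.jpg")  # raw bytes, ready for decoding
```

Because every loader builds its client from `file_client_args`, changing that one dict in the config is all it takes to switch the whole pipeline from per-file disk reads to reads from the single packed file.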