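## Introduction to TF-Slim
## Pre-trained models provided by TF-Slim
## Preparing the dataset and generating TFRecord files
### Organizing your image dataset directory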
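Under the dataset root, create two folders, `train` and `val`, holding the training data and the validation data respectively. Inside each, use one subdirectory per class.
### Generating the TFRecord files
Reference: https://github.com/tensorflow/models/tree/master/research/inception#how-to-construct-a-new-dataset-for-retraining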
```shell
# TRAIN_DIR and VALIDATION_DIR point at the train/val folders described above;
# LABELS_FILE lists one class name per line. Set all three before running.

# Location where the TFRecord data will be saved.
OUTPUT_DIRECTORY=$HOME/my-custom-data/

# Build the preprocessing script.
cd tensorflow-models/inception
bazel build //inception:build_image_data

# Convert the data.
bazel-bin/inception/build_image_data \
  --train_directory="${TRAIN_DIR}" \
  --validation_directory="${VALIDATION_DIR}" \
  --output_directory="${OUTPUT_DIRECTORY}" \
  --labels_file="${LABELS_FILE}" \
  --train_shards=128 \
  --validation_shards=24 \
  --num_threads=8
```
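The conversion step needs a labels file whose entries correspond to the class subdirectories. Below is a minimal sketch of how that file could be produced from the folder layout. The helper names `list_classes` and `write_labels_file` are hypothetical and not part of the TensorFlow models repository. The sketch also assumes that `train` and `val` should contain exactly the same class folders.

```python
import os


def list_classes(split_dir):
    """Return the sorted class names: one per immediate subdirectory."""
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )


def write_labels_file(train_dir, val_dir, labels_path):
    """Check that both splits share the same classes, then write one name per line."""
    train_classes = list_classes(train_dir)
    val_classes = list_classes(val_dir)
    if train_classes != val_classes:
        raise ValueError(
            'train and val class folders differ: %s vs %s'
            % (train_classes, val_classes))
    with open(labels_path, 'w') as f:
        f.write('\n'.join(train_classes) + '\n')
    return train_classes
```

The resulting path is what would be passed as `LABELS_FILE`. Failing early on mismatched class folders is a design choice of this sketch, not a requirement stated by the build script.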
Under `research/slim/datasets`, create your own dataset file, for example `mydata.py`. Copy the contents of `flowers.py` into it, then adjust the following lines to match your data:
```python
_FILE_PATTERN = 'flowers_%s_*.tfrecord'

SPLITS_TO_SIZES = {'train': 3320, 'validation': 350}

_NUM_CLASSES = 5

_ITEMS_TO_DESCRIPTIONS = {
    'image': 'A color image of varying size.',
    'label': 'A single integer between 0 and 4',
}
```

Then edit `research/slim/datasets/dataset_factory.py` to register your dataset, `mydata`:

```python
from datasets import mydata

datasets_map = {
    'cifar10': cifar10,
    'flowers': flowers,
    'imagenet': imagenet,
    'mnist': mnist,
    'mydata': mydata,
}
```
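When filling in `_FILE_PATTERN`, `SPLITS_TO_SIZES` and `_NUM_CLASSES` in `mydata.py`, it can help to derive the values from the data instead of guessing. The sketch below uses only the standard library. `count_images` and `matching_shards` are hypothetical helper names, and the list of image extensions is an assumption. Note that `build_image_data` names its shards like `train-00000-of-00128`, which the flowers pattern does not match, so the pattern most likely needs changing (for example to `'%s-*'`). It is also worth checking in the build script whether label 0 is reserved for a background class, since that would make `_NUM_CLASSES` one larger than the number of folders.

```python
import fnmatch
import os

# Assumption: these are the image types present in the dataset.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def count_images(split_dir):
    """Return (num_images, num_classes) for a folder with one subfolder per class."""
    num_images = 0
    num_classes = 0
    for name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, name)
        if not os.path.isdir(class_dir):
            continue
        num_classes += 1
        num_images += sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS))
    return num_images, num_classes


def matching_shards(file_pattern, split_name, filenames):
    """Return the filenames matched once %s in file_pattern is filled in."""
    return fnmatch.filter(filenames, file_pattern % split_name)
```

Calling `matching_shards(_FILE_PATTERN, 'train', os.listdir(output_dir))` and getting an empty list is a quick sign that the pattern and the generated file names disagree. The image counts are only an estimate of the record counts if the conversion skips any files.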
## Training a model from scratch
Adjust the parameters to match your own training setup.
```shell
CUDA_VISIBLE_DEVICES=2 nohup python train_image_classifier.py \
  --train_dir=/tmp/md_train \
  --dataset_name=mydata \
  --dataset_split_name=train \
  --dataset_dir=/data5/mydata_tfrecording/ \
  --model_name=mobilenet_v1 > /tmp/md.txt &
```
## Fine-tuning from a pre-trained model
In this run the dataset was registered in `dataset_factory.py` under the name `dishes`; substitute your own dataset name.
```shell
CUDA_VISIBLE_DEVICES=3 nohup \
python train_image_classifier.py \
  --train_dir=/tmp/m2_train \
  --dataset_dir=/data5/zxt/fdata/log \
  --dataset_name=dishes \
  --dataset_split_name=train \
  --model_name=mobilenet_v1 \
  --checkpoint_path=/data5/model/mobilenet_v1_1.0_224.ckpt \
  --checkpoint_exclude_scopes=MobilenetV1/Logits,MobilenetV1/AuxLogits \
  --trainable_scopes=MobilenetV1/Logits,MobilenetV1/AuxLogits > /tmp/m3.txt &
```
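The two scope flags decide which variables are loaded from the checkpoint and which are updated during training. The sketch below illustrates that idea on plain strings. It simplifies the selection to prefix matching, and it is not the code in `train_image_classifier.py`. The variable names are illustrative only, and the function names are hypothetical.

```python
def split_scopes(flag_value):
    """Turn a comma-separated flag value into a list of scope prefixes."""
    return [s.strip() for s in flag_value.split(',') if s.strip()]


def variables_to_restore(variable_names, exclude_scopes):
    """Keep variables that do NOT fall under any excluded scope."""
    return [
        name for name in variable_names
        if not any(name.startswith(scope) for scope in exclude_scopes)
    ]


def variables_to_train(variable_names, trainable_scopes):
    """Keep variables that fall under at least one trainable scope."""
    return [
        name for name in variable_names
        if any(name.startswith(scope) for scope in trainable_scopes)
    ]


# Illustrative variable names, not read from a real checkpoint.
EXAMPLE_VARIABLES = [
    'MobilenetV1/Conv2d_0/weights',
    'MobilenetV1/Logits/Conv2d_1c_1x1/weights',
    'MobilenetV1/Logits/Conv2d_1c_1x1/biases',
]
```

With the flag values from the command above, the feature-extraction layers would be restored from the checkpoint and frozen, while the logits layer would start fresh and be the only part that is trained. That is the usual arrangement when the new dataset has a different number of classes than the checkpoint.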
## Evaluating the model
First fix a bug in `research/slim/eval_image_classifier.py`; for details see https://github.com/tensorflow/models/issues/694.
Around line 156, change `slim.metrics.streaming_recall_at_k(logits, labels, 5)` to `slim.metrics.streaming_sparse_recall_at_k(logits, labels, 5)`.
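For readers unsure what this metric reports, here is a plain-Python sketch of Recall@k for single-label classification with integer class ids as labels. It only illustrates the quantity being measured; it is not how TensorFlow implements the streaming metric, and `recall_at_k` is a hypothetical name.

```python
def recall_at_k(logits, labels, k):
    """Fraction of examples whose true class id is among the k highest scores.

    logits: list of per-class score lists, one per example.
    labels: list of integer class ids, one per example (must be non-empty).
    """
    hits = 0
    for scores, label in zip(logits, labels):
        # Indices of the k largest scores for this example.
        top_k = sorted(
            range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / float(len(labels))
```

With `k=5`, as in the evaluation script, an example counts as correct whenever its true class is among the five highest-scoring classes.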
After that, the evaluation runs:
```shell
CUDA_VISIBLE_DEVICES=2 python eval_image_classifier.py \
  --alsologtostderr \
  --checkpoint_path=/tmp/m2_train/ \
  --eval_dir=/tmp/m2_eval \
  --dataset_dir=/data5/zxt/fdata/log \
  --dataset_name=dishes \
  --dataset_split_name=validation \
  --model_name=mobilenet_v1
```
Finally, here is a Python script that can train several models at the same time:
```python
import os

SLIM_DIR = '/data5/zxt/models/research/slim/'
LOG_DIR = '/data5/zxt/flowers/train_log/'
MODELS = ['inception_v3', 'inception_resnet_v2']
DATASET_NAME = 'flowers'
DATASET_DIR = '/data5/zxt/flowers/log'

# Placeholders: {0} GPU id, {1} log dir, {2} model name,
# {3} dataset name, {4} dataset dir.
CMD_TRAIN = (
    'CUDA_VISIBLE_DEVICES={0} nohup python train_image_classifier.py --learning_rate=0.01 --num_epochs_per_decay=2.0 --optimizer=adam --train_dir={1}/{2}_train '
    '--dataset_name={3} --dataset_dir={4} --dataset_split_name=train --model_name={2} > {1}/{3}_{2}_train.txt & '
)
CMD_VAL = (
    'CUDA_VISIBLE_DEVICES={0} nohup python eval_image_classifier.py --alsologtostderr --checkpoint_path={1}/{2}_train'
    ' --eval_dir={1}/{2}_eval --dataset_name={3} --dataset_dir={4} --dataset_split_name=validation '
    '--model_name={2} --preprocessing_name inception --eval_image_size 299 --eval_loop=True > {1}/{3}_{2}_eval.txt &'
)


def race():
    # os.makedirs(LOG_DIR)
    for index, model in enumerate(MODELS, start=1):
        # Each model uses two GPUs: an odd id for training, the next even id for evaluation.
        cmd_train = CMD_TRAIN.format(index * 2 - 1, LOG_DIR, model, DATASET_NAME, DATASET_DIR)
        cmd_eval = CMD_VAL.format(index * 2, LOG_DIR, model, DATASET_NAME, DATASET_DIR)
        print(cmd_train)
        print(cmd_eval)
        # Uncomment to actually launch the jobs instead of only printing them.
        # os.system(cmd_train)
        # os.system(cmd_eval)
    # os.system("CUDA_VISIBLE_DEVICES=0 tensorboard --logdir {0} &".format(LOG_DIR))


if __name__ == '__main__':
    race()
```
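As an alternative to building shell strings for `os.system`, the same idea can be sketched with `subprocess.Popen`, passing the GPU id through the environment and the log file as `stdout`. This is a minimal sketch under several assumptions: it launches training jobs only, assigns one GPU per model (unlike the script above, which pairs a training GPU with an evaluation GPU), and the names `build_jobs` and `launch` are hypothetical.

```python
import os
import subprocess


def build_jobs(models, log_dir, dataset_name, dataset_dir):
    """Return one (gpu_id, argv, log_path) tuple per model."""
    jobs = []
    for gpu_id, model in enumerate(models):
        argv = [
            'python', 'train_image_classifier.py',
            '--train_dir=' + os.path.join(log_dir, model + '_train'),
            '--dataset_name=' + dataset_name,
            '--dataset_dir=' + dataset_dir,
            '--dataset_split_name=train',
            '--model_name=' + model,
        ]
        log_path = os.path.join(
            log_dir, '{}_{}_train.txt'.format(dataset_name, model))
        jobs.append((gpu_id, argv, log_path))
    return jobs


def launch(jobs, cwd):
    """Start every job in the background and return the Popen handles."""
    procs = []
    for gpu_id, argv, log_path in jobs:
        os.makedirs(os.path.dirname(log_path), exist_ok=True)
        # Restrict each process to a single GPU via the environment.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        log_file = open(log_path, 'w')
        procs.append(subprocess.Popen(
            argv, cwd=cwd, env=env,
            stdout=log_file, stderr=subprocess.STDOUT))
    return procs
```

Here `cwd` would be the slim directory (`SLIM_DIR` in the script above). Keeping command construction separate from launching makes it easy to print and inspect the jobs first, which is what the original script does with its commented-out `os.system` calls.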