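Summary: this post introduces the basic principles of GANs and walks through a code implementation.
Basic principles of GANs
A GAN (generative adversarial network) is a generative model. Its adversarial nature comes from two networks that are trained against each other: a D network and a G network.
The D network is the discriminator. Its job is to tell real samples apart from generated samples as well as it can.
The G network is the generator. Its job is to produce samples that are as close to the real data as possible, so that the discriminator judges them to be real.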
[Figure: example]
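Mathematical principle
Discriminator formula
With the best possible discriminator and generator, the discriminator output satisfies D(x) = 1/2: on generated samples the discriminator does no better than chance, so it can no longer tell whether a sample is real or generated.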
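The formula that this heading refers to does not appear in the text. As a reference point, and as an assumption about what was intended, the standard result from the GAN literature for a fixed generator can be written as below. The symbols p_data (density of the real data) and p_g (density of the generated samples) are introduced here only for illustration.

```latex
D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{g}(x)},
\qquad
p_{g} = p_{\text{data}} \;\Longrightarrow\; D^{*}(x) = \frac{1}{2}.
```

Under this reading, D(x) = 1/2 everywhere is what the optimal discriminator returns once the generator matches the data distribution.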
[Figure: the ideal outcome]
The generated samples and the real samples almost coincide, and the discriminator output is constant at 1/2.
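Definition of the loss functions
The GAN loss optimizes the discriminator and the generator together. Here x is a real sample, G(z) is a generated sample, and D1 and D2 are the same discriminator (shared weights) applied to real and to generated samples respectively:
- Discriminator: minimize -log(D1(x)) - log(1 - D2(G(z)))
  The discriminator should assign real samples a probability close to 1 and generated samples a probability close to 0, so that both log terms approach 0. At that point the discriminator is optimal.
- Generator: minimize -log(D2(G(z)))
  The generator should make the discriminator assign its samples a probability close to 1 of being real, so that this objective approaches 0.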
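To make the two objectives concrete, here is a minimal sketch that evaluates them for single samples using only the standard library. The function names and the probability values are hypothetical and chosen for illustration; they are not part of the demo code.

```python
import math

def discriminator_loss(d_real, d_fake):
    """-log(D1(x)) - log(1 - D2(G(z))) for one real and one generated sample."""
    return -math.log(d_real) - math.log(1.0 - d_fake)

def generator_loss(d_fake):
    """-log(D2(G(z))) for one generated sample."""
    return -math.log(d_fake)

# A confident, correct discriminator: both log terms are close to 0.
loss_d_confident = discriminator_loss(d_real=0.99, d_fake=0.01)

# The equilibrium case: the discriminator outputs 1/2 everywhere.
loss_d_equilibrium = discriminator_loss(d_real=0.5, d_fake=0.5)
loss_g_equilibrium = generator_loss(d_fake=0.5)

print(round(loss_d_confident, 4))     # 0.0201
print(round(loss_d_equilibrium, 4))   # 1.3863, i.e. 2*log(2)
print(round(loss_g_equilibrium, 4))   # 0.6931, i.e. log(2)
```

In a training log, losses that settle near 2*log(2) for the discriminator and log(2) for the generator are therefore consistent with D(x) being close to 1/2.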
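That covers the basic principles of GANs.
A simple GAN demo
The real samples are drawn from a Gaussian distribution with mean 4 and standard deviation 0.5. The generator's input is a set of evenly spaced points with a small amount of noise added.
[Figure: final result of the demo]
The demo illustrates the GAN with curves on a two-dimensional plot.
Code structure
1. Import the required libraries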
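import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
import matplotlib.pyplot as plt
from matplotlib import animation
from scipy.stats import norm  # SciPy: numerical computing library
import seaborn as sns  # statistical data visualization
import argparse  # command-line argument parsing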
2. Define the real-data distribution and the generator's input distribution
# Gaussian data distribution: mean = 4, std = 0.5
class DataDistribution(object):
    def __init__(self):
        self.mu = 4
        self.sigma = 0.5

    # sampling function
    def samples(self, N):
        # draw N samples with mean mu and std sigma, sorted in ascending order
        samples = np.random.normal(self.mu, self.sigma, N)
        samples.sort()
        return samples

class GeneratorDistribution(object):
    def __init__(self, range):
        self.range = range

    def samples(self, N):
        # N evenly spaced points in [-range, range] plus small uniform noise
        return np.linspace(-self.range, self.range, N) + np.random.random(N) * 0.01
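The real data follow a Gaussian distribution with mean 4 and standard deviation 0.5 (np.random.normal takes the standard deviation, not the variance, so the variance is 0.25).
The generator's input is a set of evenly spaced points with a small amount of uniform noise added.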
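The two samplers can be checked without NumPy or TensorFlow. The sketch below is a standard-library stand-in, not the demo's own code: `noise_input` is a hypothetical helper that mimics `GeneratorDistribution.samples`, and the sample size and seed are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 4.0, 0.5  # same parameters as DataDistribution

# random.gauss, like np.random.normal, takes the standard deviation.
real = sorted(random.gauss(MU, SIGMA) for _ in range(20000))

def noise_input(value_range, n):
    """n evenly spaced points in [-value_range, value_range] plus noise in [0, 0.01)."""
    step = 2.0 * value_range / (n - 1)
    return [-value_range + i * step + random.random() * 0.01 for i in range(n)]

z = noise_input(8, 12)

print(round(statistics.mean(real), 2))       # close to 4
print(round(statistics.pstdev(real), 2))     # close to 0.5
print(round(statistics.pvariance(real), 2))  # close to 0.25, not 0.5
print(round(z[0], 2), round(z[-1], 2))       # close to -8 and 8
```

The noise input covers the whole interval evenly, so the generator has to learn a mapping that squeezes [-8, 8] into a narrow bump around 4.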
3. Define the linear operation y = wx + b
# Linear layer: computes y = xW + b
# input: batch of input samples, output_dim: dimension of the output
# scope: variable scope name, stddev: std of the weight initializer
def linear(input, output_dim, scope=None, stddev=1.0):
    # normal initializer for the weights
    norm = tf.random_normal_initializer(stddev=stddev)
    # constant (zero) initializer for the bias
    const = tf.constant_initializer(0.0)
    # open the variable scope named by `scope`, or 'linear' by default
    with tf.variable_scope(scope or 'linear'):
        # create (or reuse) variable 'w' with shape [input.get_shape()[1], output_dim],
        # initialized from the normal distribution; 'b' is initialized to zero
        w = tf.get_variable('w', [input.get_shape()[1], output_dim], initializer=norm)
        b = tf.get_variable('b', [output_dim], initializer=const)
        return tf.matmul(input, w) + b
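The code above takes the input samples x and computes y = wx + b. norm and const are the initializers for w and b, and tf.variable_scope() gives the variables their own namespace, which keeps the code simpler.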
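The shapes involved in y = xW + b are easy to lose track of, so here is a small pure-Python sketch of the same computation. `linear_forward` and the numbers are hypothetical; the TensorFlow version does this with tf.matmul on tensors.

```python
def linear_forward(x, w, b):
    """Compute y = xW + b for a batch x of shape [batch, in_dim]."""
    out_dim = len(b)
    return [
        [sum(row[i] * w[i][j] for i in range(len(row))) + b[j] for j in range(out_dim)]
        for row in x
    ]

# Hypothetical values: 3 samples with 1 feature each, mapped to 2 outputs.
x = [[1.0], [2.0], [3.0]]   # shape [3, 1]
w = [[0.5, -1.0]]           # shape [1, 2], i.e. [in_dim, out_dim]
b = [0.0, 1.0]              # shape [2]

y = linear_forward(x, w, b)
print(y)  # [[0.5, 0.0], [1.0, -1.0], [1.5, -2.0]]
```

The weight shape [in_dim, out_dim] is the same one that `linear` requests with [input.get_shape()[1], output_dim].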
4. Define the generator and discriminator networks
Generator network
# Generator: one softplus hidden layer followed by a linear output layer
def generator(input, hidden_size):
    # softplus: log(exp(features) + 1)
    # h0 is the output of the hidden layer; 'g0' is the scope of its linear layer
    h0 = tf.nn.softplus(linear(input, hidden_size, 'g0'))
    # the output dimension is 1
    h1 = linear(h0, 1, 'g1')
    return h1
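The generator is a two-layer perceptron. The first layer applies softplus to a linear transform of the input, and the second layer is a plain linear operation that produces the generated data.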
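For reference, the softplus activation used in the hidden layer is log(1 + exp(x)), a smooth version of ReLU. The sketch below is a standalone scalar version; the stable rewrite max(x, 0) + log1p(exp(-|x|)) is a common numerical trick and an addition of this sketch, not something taken from the demo.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    # max(x, 0) + log1p(exp(-|x|)) equals log(1 + exp(x)) without overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

print(round(softplus(0.0), 4))     # 0.6931, i.e. log(2)
print(round(softplus(-20.0), 6))   # near 0 for very negative inputs
print(round(softplus(20.0), 4))    # near x for large positive inputs
print(softplus(1000.0))            # 1000.0, where a naive exp(1000) would overflow
```

Because softplus is always positive and smooth, the hidden layer stays differentiable everywhere.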
Discriminator network:
# Discriminator: a deeper tanh network. It is given more capacity than the
# generator, because a stronger discriminator gives the generator a better
# training signal.
def discriminator(input, hidden_size, minibatch_layer=True):
    # hidden layers are 2 * hidden_size wide to give the network more capacity
    h0 = tf.tanh(linear(input, hidden_size * 2, 'd0'))
    h1 = tf.tanh(linear(h0, hidden_size * 2, 'd1'))
    # third layer: minibatch discrimination, or another tanh layer
    if minibatch_layer:
        h2 = minibatch(h1)
    else:
        h2 = tf.tanh(linear(h1, hidden_size * 2, 'd2'))
    # sigmoid output: probability that the input is a real sample
    h3 = tf.sigmoid(linear(h2, 1, 'd3'))
    return h3
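The discriminator is a deeper, four-layer network. It needs enough discriminative power to push the generator toward better samples, so its hidden layers use tanh() activations and the output layer uses a sigmoid. (The minibatch layer is an optimization of the GAN model and is introduced later.)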
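The minibatch layer mentioned above is minibatch discrimination: for each sample it adds features that measure how close that sample is to the others in the same batch. This lets the discriminator notice when the generator produces near-identical outputs. The following is a pure-Python sketch of the feature computation only. It assumes the projected activations are already given as nested lists, and `minibatch_features` is a hypothetical name.

```python
import math

def minibatch_features(activation):
    """activation[i][k] is a length-D vector for sample i and kernel k.

    Returns features[i][k] = sum over samples j (including j == i) of
    exp(-L1 distance between activation[i][k] and activation[j][k]).
    """
    batch = len(activation)
    num_kernels = len(activation[0])
    features = []
    for i in range(batch):
        row = []
        for k in range(num_kernels):
            total = 0.0
            for j in range(batch):
                l1 = sum(abs(a - b) for a, b in zip(activation[i][k], activation[j][k]))
                total += math.exp(-l1)
            row.append(total)
        features.append(row)
    return features

# Hypothetical batches with 3 samples, 1 kernel, kernel_dim 2.
collapsed = [[[1.0, 1.0]], [[1.0, 1.0]], [[1.0, 1.0]]]   # identical samples
diverse = [[[0.0, 0.0]], [[5.0, 5.0]], [[-5.0, 5.0]]]    # spread-out samples

print(minibatch_features(collapsed))  # [[3.0], [3.0], [3.0]]
print([[round(v, 3) for v in row] for row in minibatch_features(diverse)])
```

A batch of identical samples gives each feature the value of the batch size, while well-separated samples give values near 1, so the feature is a direct signal of low diversity.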
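5. Define the optimizer
The generator, the discriminator, and the discriminator pretraining model (introduced below) all use the same optimizer setup: plain gradient descent with a decaying learning rate.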
# Optimizer: gradient descent with a staircase, exponentially decaying learning rate
def optimizer(loss, var_list, initial_learning_rate=0.005):
    decay = 0.95  # decay factor
    num_decay_steps = 150  # the learning rate decays once every 150 steps
    batch = tf.Variable(0)  # step counter, incremented by minimize()
    learning_rate = tf.train.exponential_decay(
        initial_learning_rate,
        batch,
        num_decay_steps,
        decay,
        staircase=True
    )
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss,
        global_step=batch,
        var_list=var_list
    )
    return optimizer
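The schedule produced by tf.train.exponential_decay with staircase=True is initial_rate * decay ** floor(step / decay_steps). A short standalone sketch with the same constants as above shows how the rate moves; `staircase_decay` is a hypothetical helper and not a TensorFlow function.

```python
def staircase_decay(initial_lr, step, decay_steps=150, decay_rate=0.95):
    """initial_lr * decay_rate ** floor(step / decay_steps)."""
    return initial_lr * decay_rate ** (step // decay_steps)

# The rate is flat within each 150-step window and drops by 5% at each boundary.
for step in (0, 149, 150, 300, 1200):
    print(step, staircase_decay(0.005, step))
```

Over the default 1200 training steps the rate is multiplied by 0.95 eight times, so it ends at roughly two thirds of its initial value.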
6. Build the model
With the networks defined and the optimizer chosen, the next step is to build the training model.
def _create_model(self):
    # To make sure the discriminator provides useful gradient information,
    # pretrain it by fitting its output to the true data density with a
    # squared-error loss. The network for this step is scoped as D_pre.
    with tf.variable_scope('D_pre'):
        self.pre_input = tf.placeholder(tf.float32, shape=[self.batch_size, 1])
        self.pre_labels = tf.placeholder(tf.float32, shape=[self.batch_size, 1])
        D_pre = discriminator(self.pre_input, self.mlp_hidden_size, self.minibatch)
        self.pre_loss = tf.reduce_mean(tf.square(D_pre - self.pre_labels))
        self.pre_opt = optimizer(self.pre_loss, None, self.learning_rate)

    # Generator network: takes samples from a noise distribution as input
    # and passes them through an MLP.
    with tf.variable_scope('Gen'):
        self.z = tf.placeholder(tf.float32, [self.batch_size, 1])
        self.G = generator(self.z, self.mlp_hidden_size)

    # The discriminator tries to tell real samples (self.x) from generated
    # samples (self.G, produced from the noise self.z). D1 and D2 are two
    # copies of the same network that share their variables.
    with tf.variable_scope('Disc') as scope:
        self.x = tf.placeholder(tf.float32, [self.batch_size, 1])
        self.D1 = discriminator(self.x, self.mlp_hidden_size, self.minibatch)
        scope.reuse_variables()
        self.D2 = discriminator(self.G, self.mlp_hidden_size, self.minibatch)

    # Define the losses for the discriminator and generator networks
    # and create an optimizer for each.
    self.loss_d = tf.reduce_mean(-tf.log(self.D1) - tf.log(1 - self.D2))
    self.loss_g = tf.reduce_mean(-tf.log(self.D2))

    self.d_pre_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'D_pre')
    self.d_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'Disc')
    self.g_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'Gen')

    self.opt_d = optimizer(self.loss_d, self.d_params, self.learning_rate)
    self.opt_g = optimizer(self.loss_g, self.g_params, self.learning_rate)
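Three models are involved here, D_pre, Gen, and Disc: the discriminator pretraining model, the generator, and the discriminator. The discriminator has to perform reasonably well before adversarial training can work, so it is pretrained in advance, which helps training.
This step builds the computation graph and defines, for each of the three networks, its inputs, outputs, loss function, and optimizer.
7. Run training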
def train(self):
    with tf.Session() as session:
        tf.global_variables_initializer().run()

        # pretrain the discriminator
        num_pretraining_steps = 1000
        for step in range(num_pretraining_steps):
            # inputs are uniform in [-5, 5); labels are the true Gaussian pdf
            d = (np.random.random(self.batch_size) - 0.5) * 10.0
            labels = norm.pdf(d, loc=self.data.mu, scale=self.data.sigma)
            pretrain_loss, _ = session.run(
                [self.pre_loss, self.pre_opt],
                {
                    self.pre_input: np.reshape(d, (self.batch_size, 1)),
                    self.pre_labels: np.reshape(labels, (self.batch_size, 1))
                }
            )
        self.weightsD = session.run(self.d_pre_params)

        # copy weights from pretraining over to the new D network
        for i, v in enumerate(self.d_params):
            session.run(v.assign(self.weightsD[i]))

        for step in range(self.num_steps):
            # update discriminator
            x = self.data.samples(self.batch_size)
            z = self.gen.samples(self.batch_size)
            loss_d, _ = session.run(
                [self.loss_d, self.opt_d],
                {
                    self.x: np.reshape(x, (self.batch_size, 1)),
                    self.z: np.reshape(z, (self.batch_size, 1))
                }
            )
            # update generator
            z = self.gen.samples(self.batch_size)
            loss_g, _ = session.run(
                [self.loss_g, self.opt_g],
                {
                    self.z: np.reshape(z, (self.batch_size, 1))
                }
            )
            if step % self.log_every == 0:
                print('{}:{}\t{}'.format(step, loss_d, loss_g))
            if self.anim_path:
                self.anim_frames.append(self._samples(session))

        if self.anim_path:
            self._save_animation()
        else:
            self._plot_distributions(session)
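Training first initializes the variables, then draws samples to pretrain the discriminator, copies the pretrained parameters into the discriminator, and finally trains the discriminator and the generator adversarially.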
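The pretraining labels are values of the Gaussian density itself, so it helps to see their range. The sketch below recomputes the density with the standard library instead of scipy.stats.norm.pdf. `gaussian_pdf` is a hypothetical helper, and the evaluation points are arbitrary.

```python
import math

def gaussian_pdf(x, mu=4.0, sigma=0.5):
    """Density of N(mu, sigma^2) at x (what scipy.stats.norm.pdf computes)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Pretraining inputs lie in [-5, 5); the label is the data density at that point.
for d in (-5.0, 0.0, 3.0, 4.0, 4.99):
    print(d, gaussian_pdf(d))

peak = gaussian_pdf(4.0)  # 1 / (0.5 * sqrt(2 * pi)), about 0.798
```

With sigma = 0.5 the peak density stays below 1, so the sigmoid output of the discriminator can reach every label. A much smaller sigma would push the peak above 1 and make this regression target unreachable.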
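Complete code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File : main.py
# @Author: Wong
# @Date : 2018/11/8
# @Desc : a simple GAN that learns a 1D curve
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
import matplotlib.pyplot as plt
from matplotlib import animation
from scipy.stats import norm  # SciPy: numerical computing library
import seaborn as sns  # statistical data visualization
import argparse  # command-line argument parsing

sns.set(color_codes=True)  # set the plot color theme
seed = 42
# fix the random seeds so that every run generates the same random numbers
np.random.seed(seed)
tf.set_random_seed(seed)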
# Gaussian data distribution: mean = 4, std = 0.5
class DataDistribution(object):
    def __init__(self):
        self.mu = 4
        self.sigma = 0.5

    # sampling function
    def samples(self, N):
        # draw N samples with mean mu and std sigma, sorted in ascending order
        samples = np.random.normal(self.mu, self.sigma, N)
        samples.sort()
        return samples

# Linear layer: computes y = xW + b
# input: batch of input samples, output_dim: dimension of the output
# scope: variable scope name, stddev: std of the weight initializer
def linear(input, output_dim, scope=None, stddev=1.0):
    # normal initializer for the weights
    norm = tf.random_normal_initializer(stddev=stddev)
    # constant (zero) initializer for the bias
    const = tf.constant_initializer(0.0)
    # open the variable scope named by `scope`, or 'linear' by default
    with tf.variable_scope(scope or 'linear'):
        # create (or reuse) variable 'w' with shape [input.get_shape()[1], output_dim],
        # initialized from the normal distribution; 'b' is initialized to zero
        w = tf.get_variable('w', [input.get_shape()[1], output_dim], initializer=norm)
        b = tf.get_variable('b', [output_dim], initializer=const)
        return tf.matmul(input, w) + b

# Noise distribution: N evenly spaced points in [-range, range]
# plus small uniform noise
class GeneratorDistribution(object):
    def __init__(self, range):
        self.range = range

    def samples(self, N):
        return np.linspace(-self.range, self.range, N) + np.random.random(N) * 0.01
# Generator: one softplus hidden layer followed by a linear output layer
def generator(input, hidden_size):
    # softplus: log(exp(features) + 1)
    # h0 is the output of the hidden layer; 'g0' is the scope of its linear layer
    h0 = tf.nn.softplus(linear(input, hidden_size, 'g0'))
    # the output dimension is 1
    h1 = linear(h0, 1, 'g1')
    return h1

# Discriminator: a deeper tanh network. It is given more capacity than the
# generator, because a stronger discriminator gives the generator a better
# training signal.
def discriminator(input, hidden_size, minibatch_layer=True):
    # hidden layers are 2 * hidden_size wide to give the network more capacity
    h0 = tf.tanh(linear(input, hidden_size * 2, 'd0'))
    h1 = tf.tanh(linear(h0, hidden_size * 2, 'd1'))
    # third layer: minibatch discrimination, or another tanh layer
    if minibatch_layer:
        h2 = minibatch(h1)
    else:
        h2 = tf.tanh(linear(h1, hidden_size * 2, 'd2'))
    # sigmoid output: probability that the input is a real sample
    h3 = tf.sigmoid(linear(h2, 1, 'd3'))
    return h3

# Minibatch discrimination: append to each sample features that measure
# how close it is to the other samples in the same batch
def minibatch(input, num_kernels=5, kernel_dim=3):
    x = linear(input, num_kernels * kernel_dim, scope='minibatch', stddev=0.02)
    activation = tf.reshape(x, (-1, num_kernels, kernel_dim))
    # pairwise differences between all samples in the batch
    diffs = tf.expand_dims(activation, 3) - tf.expand_dims(tf.transpose(activation, [1, 2, 0]), 0)
    # L1 distance over the kernel dimension
    abs_diffs = tf.reduce_sum(tf.abs(diffs), 2)
    # sum of exp(-distance) over the samples in the batch
    minibatch_features = tf.reduce_sum(tf.exp(-abs_diffs), 2)
    return tf.concat([input, minibatch_features], 1)
# Optimizer: gradient descent with a staircase, exponentially decaying learning rate
def optimizer(loss, var_list, initial_learning_rate=0.005):
    decay = 0.95  # decay factor
    num_decay_steps = 150  # the learning rate decays once every 150 steps
    batch = tf.Variable(0)  # step counter, incremented by minimize()
    learning_rate = tf.train.exponential_decay(
        initial_learning_rate,
        batch,
        num_decay_steps,
        decay,
        staircase=True
    )
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss,
        global_step=batch,
        var_list=var_list
    )
    return optimizer
class GAN(object):
    def __init__(self, data, gen, num_steps, batch_size, minibatch, log_every, anim_path):
        self.data = data
        self.gen = gen
        self.num_steps = num_steps
        self.batch_size = batch_size
        self.minibatch = minibatch
        self.log_every = log_every
        self.mlp_hidden_size = 4
        self.anim_path = anim_path
        self.anim_frames = []
        # use a smaller learning rate with minibatch discrimination,
        # and a larger one without it
        if self.minibatch:
            self.learning_rate = 0.005
        else:
            self.learning_rate = 0.03
        self._create_model()
    def _create_model(self):
        # To make sure the discriminator provides useful gradient information,
        # pretrain it by fitting its output to the true data density with a
        # squared-error loss. The network for this step is scoped as D_pre.
        with tf.variable_scope('D_pre'):
            self.pre_input = tf.placeholder(tf.float32, shape=[self.batch_size, 1])
            self.pre_labels = tf.placeholder(tf.float32, shape=[self.batch_size, 1])
            D_pre = discriminator(self.pre_input, self.mlp_hidden_size, self.minibatch)
            self.pre_loss = tf.reduce_mean(tf.square(D_pre - self.pre_labels))
            self.pre_opt = optimizer(self.pre_loss, None, self.learning_rate)

        # Generator network: takes samples from a noise distribution as input
        # and passes them through an MLP.
        with tf.variable_scope('Gen'):
            self.z = tf.placeholder(tf.float32, [self.batch_size, 1])
            self.G = generator(self.z, self.mlp_hidden_size)

        # The discriminator tries to tell real samples (self.x) from generated
        # samples (self.G, produced from the noise self.z). D1 and D2 are two
        # copies of the same network that share their variables.
        with tf.variable_scope('Disc') as scope:
            self.x = tf.placeholder(tf.float32, [self.batch_size, 1])
            self.D1 = discriminator(self.x, self.mlp_hidden_size, self.minibatch)
            scope.reuse_variables()
            self.D2 = discriminator(self.G, self.mlp_hidden_size, self.minibatch)

        # Define the losses for the discriminator and generator networks
        # and create an optimizer for each.
        self.loss_d = tf.reduce_mean(-tf.log(self.D1) - tf.log(1 - self.D2))
        self.loss_g = tf.reduce_mean(-tf.log(self.D2))

        self.d_pre_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'D_pre')
        self.d_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'Disc')
        self.g_params = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'Gen')

        self.opt_d = optimizer(self.loss_d, self.d_params, self.learning_rate)
        self.opt_g = optimizer(self.loss_g, self.g_params, self.learning_rate)
    def train(self):
        with tf.Session() as session:
            tf.global_variables_initializer().run()

            # pretrain the discriminator
            num_pretraining_steps = 1000
            for step in range(num_pretraining_steps):
                # inputs are uniform in [-5, 5); labels are the true Gaussian pdf
                d = (np.random.random(self.batch_size) - 0.5) * 10.0
                labels = norm.pdf(d, loc=self.data.mu, scale=self.data.sigma)
                pretrain_loss, _ = session.run(
                    [self.pre_loss, self.pre_opt],
                    {
                        self.pre_input: np.reshape(d, (self.batch_size, 1)),
                        self.pre_labels: np.reshape(labels, (self.batch_size, 1))
                    }
                )
            self.weightsD = session.run(self.d_pre_params)

            # copy weights from pretraining over to the new D network
            for i, v in enumerate(self.d_params):
                session.run(v.assign(self.weightsD[i]))

            for step in range(self.num_steps):
                # update discriminator
                x = self.data.samples(self.batch_size)
                z = self.gen.samples(self.batch_size)
                loss_d, _ = session.run(
                    [self.loss_d, self.opt_d],
                    {
                        self.x: np.reshape(x, (self.batch_size, 1)),
                        self.z: np.reshape(z, (self.batch_size, 1))
                    }
                )
                # update generator
                z = self.gen.samples(self.batch_size)
                loss_g, _ = session.run(
                    [self.loss_g, self.opt_g],
                    {
                        self.z: np.reshape(z, (self.batch_size, 1))
                    }
                )
                if step % self.log_every == 0:
                    print('{}:{}\t{}'.format(step, loss_d, loss_g))
                if self.anim_path:
                    self.anim_frames.append(self._samples(session))

            if self.anim_path:
                self._save_animation()
            else:
                self._plot_distributions(session)
    def _samples(self, session, num_points=10000, num_bins=100):
        # return a tuple (db, pd, pg), where db is the current decision boundary,
        # pd is a histogram of samples from the data distribution,
        # and pg is a histogram of generated samples.
        xs = np.linspace(-self.gen.range, self.gen.range, num_points)
        bins = np.linspace(-self.gen.range, self.gen.range, num_bins)

        # decision boundary
        db = np.zeros((num_points, 1))
        for i in range(num_points // self.batch_size):
            db[self.batch_size * i:self.batch_size * (i + 1)] = session.run(
                self.D1,
                {
                    self.x: np.reshape(
                        xs[self.batch_size * i:self.batch_size * (i + 1)],
                        (self.batch_size, 1)
                    )
                }
            )

        # data distribution
        d = self.data.samples(num_points)
        pd, _ = np.histogram(d, bins=bins, density=True)

        # generated samples
        zs = np.linspace(-self.gen.range, self.gen.range, num_points)
        g = np.zeros((num_points, 1))
        # // is integer division
        for i in range(num_points // self.batch_size):
            g[self.batch_size * i:self.batch_size * (i + 1)] = session.run(
                self.G,
                {
                    self.z: np.reshape(
                        zs[self.batch_size * i: self.batch_size * (i + 1)],
                        (self.batch_size, 1)
                    )
                }
            )
        pg, _ = np.histogram(g, bins=bins, density=True)
        return db, pd, pg
    def _plot_distributions(self, session):
        db, pd, pg = self._samples(session)
        db_x = np.linspace(-self.gen.range, self.gen.range, len(db))
        p_x = np.linspace(-self.gen.range, self.gen.range, len(pd))
        f, ax = plt.subplots(1)
        ax.plot(db_x, db, label='decision boundary')
        ax.set_ylim(0, 1)
        plt.plot(p_x, pd, label='real data')
        plt.plot(p_x, pg, label='generated data')
        plt.title('1D Generative Adversarial Network')
        plt.xlabel('Data values')
        plt.ylabel('Probability density')
        plt.legend()
        plt.show()
    def _save_animation(self):
        f, ax = plt.subplots(figsize=(6, 4))
        f.suptitle('1D Generative Adversarial Network', fontsize=15)
        plt.xlabel('Data values')
        plt.ylabel('Probability density')
        ax.set_xlim(-6, 6)
        ax.set_ylim(0, 1.4)
        line_db, = ax.plot([], [], label='decision boundary')
        line_pd, = ax.plot([], [], label='real data')
        line_pg, = ax.plot([], [], label='generated data')
        frame_number = ax.text(
            0.02,
            0.95,
            '',
            horizontalalignment='left',
            verticalalignment='top',
            transform=ax.transAxes
        )
        ax.legend()

        db, pd, _ = self.anim_frames[0]
        db_x = np.linspace(-self.gen.range, self.gen.range, len(db))
        p_x = np.linspace(-self.gen.range, self.gen.range, len(pd))

        def init():
            line_db.set_data([], [])
            line_pd.set_data([], [])
            line_pg.set_data([], [])
            frame_number.set_text('')
            return (line_db, line_pd, line_pg, frame_number)

        def animate(i):
            frame_number.set_text(
                'Frame: {}/{}'.format(i, len(self.anim_frames))
            )
            db, pd, pg = self.anim_frames[i]
            line_db.set_data(db_x, db)
            line_pd.set_data(p_x, pd)
            line_pg.set_data(p_x, pg)
            return (line_db, line_pd, line_pg, frame_number)

        anim = animation.FuncAnimation(
            f,
            animate,
            init_func=init,
            frames=len(self.anim_frames),
            blit=True
        )
        anim.save(self.anim_path, fps=30, extra_args=['-vcodec', 'libx264'])
def main(args):
    model = GAN(
        DataDistribution(),
        GeneratorDistribution(range=8),
        args.num_steps,
        args.batch_size,
        args.minibatch,
        args.log_every,
        args.anim
    )
    model.train()

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--num-steps', type=int, default=1200,
                        help='the number of training steps to take')
    parser.add_argument('--batch-size', type=int, default=12,
                        help='the batch size')
    # store_true instead of type=bool: bool('False') would evaluate to True
    parser.add_argument('--minibatch', action='store_true',
                        help='use minibatch discrimination')
    parser.add_argument('--log-every', type=int, default=10,
                        help='print loss after this many steps')
    parser.add_argument('--anim', type=str, default=None,
                        help='the name of the output animation file (default: none)')
    return parser.parse_args()

if __name__ == '__main__':
    main(parse_args())
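A note on boolean command-line switches such as --minibatch: argparse applies the `type` callable to the raw string, and bool() of any non-empty string is True, so `type=bool` cannot be switched off from the command line. The standalone sketch below compares that behavior with `action='store_true'`. `build_parser` is a hypothetical helper used only for the comparison.

```python
import argparse

def build_parser(use_store_true):
    parser = argparse.ArgumentParser()
    if use_store_true:
        # The flag is False unless --minibatch is present on the command line.
        parser.add_argument('--minibatch', action='store_true',
                            help='use minibatch discrimination')
    else:
        # bool('False') is True because any non-empty string is truthy.
        parser.add_argument('--minibatch', type=bool, default=False,
                            help='use minibatch discrimination')
    return parser

surprising = build_parser(False).parse_args(['--minibatch', 'False']).minibatch
flag_on = build_parser(True).parse_args(['--minibatch']).minibatch
flag_off = build_parser(True).parse_args([]).minibatch

print(surprising)  # True
print(flag_on)     # True
print(flag_off)    # False
```

With `store_true`, running the script with `--minibatch` enables minibatch discrimination and omitting the flag leaves it off.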