TensorFlow Study Notes: Variables and Shared Variables

Author: 于翔宇 | Editor: 刘美利 | 2018-08-29 17:44 | Source: 知乎 (Zhihu)

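[IT168 Tech] In TensorFlow, variables mainly represent the parameters of a machine learning model, and they are manipulated through the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it. Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.

Internally, a tf.Variable stores a persistent tensor. Specific ops let you read and modify the value of this tensor. These modifications are visible across multiple tf.Session objects, so multiple workers can see the same values for a tf.Variable.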

1. Creating variables with tf.Variable

The constructor of tf.Variable has the following signature:

    __init__(
        initial_value=None,
        trainable=True,
        collections=None,
        validate_shape=True,
        caching_device=None,
        name=None,
        variable_def=None,
        dtype=None,
        expected_shape=None,
        import_scope=None,
        constraint=None
    )

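The main parameters are:

initial_value: the initial value of the variable, given as a Tensor.

trainable: whether the variable is trained. If True, the variable is added to the tf.GraphKeys.TRAINABLE_VARIABLES collection, that is, the set of variables for which TensorFlow computes gradients.

collections: a list of graph collection keys. The new variable is added to each of these collections; the default is [GraphKeys.GLOBAL_VARIABLES].

name: the name of the variable; the default is 'Variable'.

dtype: the data type of the variable.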

For example, we create a few variables and inspect their names and shapes:

    import tensorflow as tf

    w1 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
    b1 = tf.Variable(tf.zeros([200]), name="biases")
    w2 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")  # same name as w1
    b2 = tf.Variable(tf.zeros([200]), name="biases")  # same name as b1

    print(w1.name, w1.shape)
    print(b1.name, b1.shape)
    print(w2.name, w2.shape)
    print(b2.name, b2.shape)

    ************ Output ************
    weights:0 (784, 200)
    biases:0 (200,)
    weights_1:0 (784, 200)
    biases_1:0 (200,)

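As the output shows, when the requested name is already taken, the new variable is renamed: w2 becomes "weights_1:0", and further duplicates keep incrementing the suffix. The trailing ":0" is the output index of the op that holds the variable.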
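The renaming rule can be sketched without TensorFlow. The following is a toy model, not TensorFlow's implementation: the class NameRegistry and its method unique_name are hypothetical names introduced only for illustration, and the sketch assumes nothing beyond the behaviour visible in the output above (the first use keeps the name, later uses receive the suffixes _1, _2, and so on).

```python
class NameRegistry:
    """Toy model of name uniquification (hypothetical helper, not a TF API)."""

    def __init__(self):
        # Maps a requested name to the number of times it has been requested.
        self._counts = {}

    def unique_name(self, name):
        """Return `name` on first use, then `name_1`, `name_2`, ..."""
        count = self._counts.get(name, 0)
        self._counts[name] = count + 1
        return name if count == 0 else "{}_{}".format(name, count)


registry = NameRegistry()
requested = ["weights", "biases", "weights", "biases"]
# Append ":0" to mimic the output index shown in the variable names above.
assigned = [registry.unique_name(n) + ":0" for n in requested]
print(assigned)
```

The sketch keeps one counter per requested name, which is enough to reproduce the four names printed in the example.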

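2. Variable collections

By default, every tf.Variable is placed in the following two collections:
* tf.GraphKeys.GLOBAL_VARIABLES - variables that can be shared across multiple devices,
* tf.GraphKeys.TRAINABLE_VARIABLES - variables for which TensorFlow will compute gradients.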

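2.1 Listing the variables in a collection

To list all variables that have been placed in a collection, call tf.get_collection:

    import tensorflow as tf

    w1 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
    b1 = tf.Variable(tf.zeros([200]), name="biases")
    w2 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")  # same name as w1
    b2 = tf.Variable(tf.zeros([200]), name="biases")  # same name as b1

    print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))

    ************ Output ************
    [<tf.Variable 'weights:0' shape=(784, 200) dtype=float32_ref>, <tf.Variable 'biases:0' shape=(200,) dtype=float32_ref>, <tf.Variable 'weights_1:0' shape=(784, 200) dtype=float32_ref>, <tf.Variable 'biases_1:0' shape=(200,) dtype=float32_ref>]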

The output is the list of all four variables, since each of them is trainable.

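2.2 Creating variable collections

If you do not want a variable to be trained, you can add it to the tf.GraphKeys.LOCAL_VARIABLES collection. For example, the following snippet adds a variable named my_local to that collection:

    my_local = tf.get_variable("my_local", shape=(),
                               collections=[tf.GraphKeys.LOCAL_VARIABLES])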

Alternatively, you can pass trainable=False to tf.get_variable:

    my_non_trainable = tf.get_variable("my_non_trainable",
                                       shape=(),
                                       trainable=False)

The test below shows the effect: b2 is created with trainable=False, so it does not appear in the printed collection.

    import tensorflow as tf

    w1 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
    b1 = tf.Variable(tf.zeros([200]), name="biases")
    w2 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")  # same name as w1
    b2 = tf.Variable(tf.zeros([200]), name="biases", trainable=False)  # same name as b1, not trainable

    print(tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES))

    ************ Output ************
    [<tf.Variable 'weights:0' shape=(784, 200) dtype=float32_ref>, <tf.Variable 'biases:0' shape=(200,) dtype=float32_ref>, <tf.Variable 'weights_1:0' shape=(784, 200) dtype=float32_ref>]

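You can also use your own collections. A collection name can be any string, and you do not need to create the collection explicitly. To add a variable (or any other object) to a collection after it has been created, call tf.add_to_collection. For example, the following code adds the existing variable my_local to a collection named my_collection_name:

    tf.add_to_collection("my_collection_name", my_local)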
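Conceptually, a collection is just a named list kept by the graph. The sketch below is a toy model under that assumption, not TensorFlow code: ToyGraph, create_variable and the plain-string keys "GLOBAL_VARIABLES" and "TRAINABLE_VARIABLES" are hypothetical stand-ins chosen for illustration, and variables are represented by their names only.

```python
from collections import defaultdict


class ToyGraph:
    """Toy model of graph collections: each key maps to a list of objects."""

    def __init__(self):
        self._collections = defaultdict(list)

    def add_to_collection(self, key, value):
        # A collection comes into existence the first time something is added.
        self._collections[key].append(value)

    def get_collection(self, key):
        # Return a copy so that callers cannot modify the stored list.
        return list(self._collections.get(key, []))

    def create_variable(self, name, trainable=True):
        """Register `name` globally and, if trainable, as a trainable variable."""
        self.add_to_collection("GLOBAL_VARIABLES", name)
        if trainable:
            self.add_to_collection("TRAINABLE_VARIABLES", name)
        return name


graph = ToyGraph()
for var_name, is_trainable in [("weights:0", True), ("biases:0", True),
                               ("weights_1:0", True), ("biases_1:0", False)]:
    graph.create_variable(var_name, trainable=is_trainable)

# A custom collection needs no explicit creation step.
graph.add_to_collection("my_collection_name", "my_local:0")

print(graph.get_collection("TRAINABLE_VARIABLES"))
print(graph.get_collection("my_collection_name"))
```

As in the test above, the non-trainable variable is present in the global list but absent from the trainable one.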

3. Sharing variables

Consider the code below, which defines a small convolutional network with four parameters, that is, four variables: conv1_weights, conv1_biases, conv2_weights and conv2_biases.

    def my_image_filter(input_images):
        conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                    name="conv1_weights")
        conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
        conv1 = tf.nn.conv2d(input_images, conv1_weights,
                             strides=[1, 1, 1, 1], padding='SAME')
        relu1 = tf.nn.relu(conv1 + conv1_biases)

        conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                    name="conv2_weights")
        conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
        conv2 = tf.nn.conv2d(relu1, conv2_weights,
                             strides=[1, 1, 1, 1], padding='SAME')
        return tf.nn.relu(conv2 + conv2_biases)

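Suppose we use this function to apply the same operation to two images, that is, we call it twice. Each call creates its own 4 variables, so the two calls end up with 8 independent variables. If these variables are then optimized, the parameters learned through one call are not reused by the other, and the two calls no longer represent one shared filter.

    # First call: creates 4 variables.
    result1 = my_image_filter(image1)
    # Second call: creates 4 more variables; nothing is shared with the first call.
    result2 = my_image_filter(image2)

Because tf.Variable always creates a new variable, these two calls do not raise an error. If the variables were created with tf.get_variable instead, the second call would fail with:

    ValueError: Variable weight already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at: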

TensorFlow solves the problem of sharing variables (parameters) with variable scopes (tf.variable_scope) and the tf.get_variable function.

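3.1 tf.variable_scope and tf.get_variable

Every call to tf.Variable() creates a new variable, so it cannot be used to share variables. tf.get_variable, combined with a variable scope, states explicitly whether we want to create a new variable or share an existing one. Variable scopes let you control variable reuse when calling functions that implicitly create and use variables, and they also let you name variables in a hierarchical, understandable way. tf.get_variable() works very differently from tf.Variable(): when reuse is enabled for the current scope and a variable with the given name already exists (that is, it was created earlier through get_variable() under the same name), get_variable() returns that existing variable; when reuse is not enabled, it creates a new variable. The examples below illustrate this.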

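The example above has two convolutional layers. We first write a function that creates one convolution/ReLU layer; the variable names used by this function are 'weights' and 'biases'.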

    def conv_relu(input, kernel_shape, bias_shape):
        # Create variable named "weights".
        weights = tf.get_variable("weights", kernel_shape,
                                  initializer=tf.random_normal_initializer())
        # Create variable named "biases".
        biases = tf.get_variable("biases", bias_shape,
                                 initializer=tf.constant_initializer(0.0))
        conv = tf.nn.conv2d(input, weights,
                            strides=[1, 1, 1, 1], padding='SAME')
        return tf.nn.relu(conv + biases)

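A real model needs several convolutional layers, so we use variable scopes to tell the variables of different layers apart. A variable created inside a scope is named scope_name/variable_name. In the code below, the variables of the first convolutional layer are named 'conv1/weights' and 'conv1/biases', and those of the second layer are named 'conv2/weights' and 'conv2/biases'.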

    def my_image_filter(input_images):
        with tf.variable_scope("conv1"):
            # Variables created here will be named "conv1/weights", "conv1/biases".
            relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
        with tf.variable_scope("conv2"):
            # Variables created here will be named "conv2/weights", "conv2/biases".
            return conv_relu(relu1, [5, 5, 32, 32], [32])

Even so, calling this function more than once still raises an exception:

    result1 = my_image_filter(image1)
    result2 = my_image_filter(image2)
    # Raises ValueError(... conv1/weights already exists ...)

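This happens because creating two variables with the same name through get_variable() is an error: by default it only checks the name, to prevent accidental duplicates. To share variables, you must state explicitly in which scope sharing is allowed.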

There are two ways to enable variable sharing.

Method 1

Call scope.reuse_variables() to trigger reuse, as shown below:

    with tf.variable_scope("model") as scope:
        output1 = my_image_filter(input1)
        scope.reuse_variables()
        output2 = my_image_filter(input2)

Method 2

Open a scope with the same name and pass reuse=True:

    with tf.variable_scope("model"):
        output1 = my_image_filter(input1)
    with tf.variable_scope("model", reuse=True):
        output2 = my_image_filter(input2)

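3.2 Understanding variable_scope

It is important to understand how variable scopes work, so let us walk through what actually happens when we call tf.get_variable(name, shape, dtype, initializer).

First, TensorFlow determines whether variables are to be shared by checking the value of tf.get_variable_scope().reuse. If it is False (that is, you neither called scope.reuse_variables() inside the scope nor opened the scope with reuse=True), TensorFlow assumes that you want to create a new variable and then checks whether a variable with this name already exists. If it does, a ValueError is raised; otherwise the variable is created and initialized with the given initializer: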

    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])
    assert v.name == "foo/v:0"

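If tf.get_variable_scope().reuse == True, TensorFlow does the opposite: it searches for an existing variable whose name is the scope name plus name. If no such variable exists, a ValueError is raised; otherwise the variable that was found is returned: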

    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])
    with tf.variable_scope("foo", reuse=True):
        v1 = tf.get_variable("v", [1])
    assert v1 is v
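The two branches described above can be condensed into a few lines of plain Python. This is a toy model of the decision logic only, not TensorFlow's implementation: ToyVariableStore is a hypothetical class, the scope name and the reuse flag are passed in explicitly instead of being read from the current scope, and a variable is represented by a small dictionary.

```python
class ToyVariableStore:
    """Toy model of the create-or-reuse decision (hypothetical, not a TF API)."""

    def __init__(self):
        self._vars = {}  # full variable name -> stored variable object

    def get_variable(self, scope_name, name, reuse=False):
        full_name = "{}/{}".format(scope_name, name) if scope_name else name
        exists = full_name in self._vars
        if reuse:
            # Sharing requested: the variable must already exist.
            if not exists:
                raise ValueError("Variable {} does not exist".format(full_name))
            return self._vars[full_name]
        # Creation requested: the name must still be free.
        if exists:
            raise ValueError("Variable {} already exists".format(full_name))
        self._vars[full_name] = {"name": full_name + ":0"}
        return self._vars[full_name]


store = ToyVariableStore()
v = store.get_variable("foo", "v")               # reuse off: variable is created
v1 = store.get_variable("foo", "v", reuse=True)  # reuse on: same object comes back
print(v1 is v)

try:
    store.get_variable("foo", "v")               # reuse off, but the name is taken
except ValueError as err:
    print("error:", err)

try:
    store.get_variable("foo", "w", reuse=True)   # reuse on, but nothing to reuse
except ValueError as err:
    print("error:", err)
```

Both failure cases raise ValueError, which matches the symmetric description in the text: creating an existing name fails, and reusing a missing name fails.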

Variable scopes can be nested. For example, the variable below is created inside two nested scopes, so its name is 'foo/bar/v:0':

    with tf.variable_scope("foo"):
        with tf.variable_scope("bar"):
            v = tf.get_variable("v", [1])
    assert v.name == "foo/bar/v:0"

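Within a single variable scope, requesting the same variable name a second time only requires enabling reuse first. In the example below, v1 and v are the same variable, because both are named 'foo/v':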

    with tf.variable_scope("foo"):
        v = tf.get_variable("v", [1])
        tf.get_variable_scope().reuse_variables()
        v1 = tf.get_variable("v", [1])
    assert v1 is v

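Summary

By default, tf.get_variable() only checks the variable name and raises an error if that name is already taken. tf.Variable() creates a new variable on every call, even when the name is repeated, so it cannot be used to share variables.

If sharing is enabled on the scope (reuse=True), tf.get_variable() looks up the variable with the given name and returns it if it exists; if it does not exist, a ValueError is raised. With reuse=tf.AUTO_REUSE, a missing variable is created instead.

If sharing is not enabled on the scope (the default), tf.get_variable() raises an error when a variable with the same name already exists, and creates a new variable otherwise.

To reuse variables, sharing must be enabled on the scope. Of the two methods shown above, the first is recommended.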
