# [TensorFlow] The Road from PyTorch to TF2 - Training Mode vs. Inference Mode

Posted by John on 2020-08-16
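"This is the story of a hot-blooded young man who used to write PyTorch, switched to TensorFlow 2 because his job required it, and vowed to write great TF2 code: a grand lyrical epic forged in Taiwan." (adapted from the opening narration of Yakitate!! Japan)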

## Introduction

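• In inference mode, Dropout no longer masks any neurons.
• In inference mode, BatchNormalization normalizes with the moving mean and moving variance accumulated during training instead of the statistics of the current batch; the learned gamma and beta are still applied as usual.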

## Dropout: the training argument

Note that the Dropout layer only applies when training is set to True such that no values are dropped during inference. When using model.fit, training will be appropriately set to True automatically, and in other contexts, you can set the kwarg explicitly to True when calling the layer.

(This is in contrast to setting trainable=False for a Dropout layer. trainable does not affect the layer’s behavior, as Dropout does not have any variables/weights that can be frozen during training.)
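The behavior described above can be checked with a small sketch. It assumes TensorFlow 2 running eagerly; the input shape and `rate=0.5` are arbitrary choices for illustration, not values from the documentation.

```python
import numpy as np
import tensorflow as tf

rate = 0.5  # illustrative value
dropout = tf.keras.layers.Dropout(rate)
x = tf.ones((4, 1000))

# Inference mode: the layer passes its input through unchanged.
y_infer = dropout(x, training=False).numpy()

# Training mode: each element is zeroed with probability `rate`, and the
# survivors are scaled by 1 / (1 - rate) so the expected sum is preserved.
y_train = dropout(x, training=True).numpy()

# Dropout has no trainable weights, so trainable=False freezes nothing
# and values are still dropped when training=True.
dropout.trainable = False
y_frozen = dropout(x, training=True).numpy()

infer_is_identity = bool(np.array_equal(y_infer, x.numpy()))
dropped_train = float(np.mean(y_train == 0.0))
dropped_frozen = float(np.mean(y_frozen == 0.0))
kept_scale = float(y_train[y_train != 0.0].mean())

print("inference is identity:", infer_is_identity)
print("fraction dropped (training=True):", dropped_train)
print("fraction dropped (trainable=False, training=True):", dropped_frozen)
print("scale of kept values:", kept_scale)
```

The dropped fractions should land near `rate`, and the kept values should all equal `1 / (1 - rate)`.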


## BatchNormalization: the trainable attribute

About setting layer.trainable = False on a BatchNormalization layer:
The meaning of setting layer.trainable = False is to freeze the layer, i.e. its internal state will not change during training: its trainable weights will not be updated during fit() or train_on_batch(), and its state updates will not be run.

Usually, this does not necessarily mean that the layer is run in inference mode (which is normally controlled by the training argument that can be passed when calling a layer). “Frozen state” and “inference mode” are two separate concepts.

However, in the case of the BatchNormalization layer, setting trainable = False on the layer means that the layer will be subsequently run in inference mode (meaning that it will use the moving mean and the moving variance to normalize the current batch, rather than using the mean and variance of the current batch).

This behavior has been introduced in TensorFlow 2.0, in order to enable layer.trainable = False to produce the most commonly expected behavior in the convnet fine-tuning use case.
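As a rough check of this special case, the sketch below calls a single BatchNormalization layer three ways. It assumes TensorFlow 2 in eager mode; the random input, its shape, and the variable names are made up for illustration.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = tf.constant(rng.normal(loc=5.0, scale=2.0, size=(64, 3)), dtype=tf.float32)

bn = tf.keras.layers.BatchNormalization()

# Training mode: normalize with the statistics of the current batch
# and update the moving mean / moving variance as a side effect.
y_train = bn(x, training=True).numpy()

# Inference mode: normalize with the moving statistics instead.
y_infer = bn(x, training=False).numpy()
moving_mean_before = bn.moving_mean.numpy().copy()

# Frozen layer: in TF2 this forces inference mode even if training=True,
# and the moving statistics are no longer updated.
bn.trainable = False
y_frozen = bn(x, training=True).numpy()
moving_mean_after = bn.moving_mean.numpy()

print("batch-normalized mean per feature:", y_train.mean(axis=0))
print("frozen output equals inference output:", np.allclose(y_frozen, y_infer))
print("moving mean unchanged while frozen:",
      np.allclose(moving_mean_before, moving_mean_after))
```

After only one training-mode call the moving statistics are still close to their initial values, so the inference-mode output is far from zero mean. That gap makes the two modes easy to tell apart here.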

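• Note that this behavior was only introduced in TF2. In TF1, trainable=False merely freezes the weights; it does not switch the layer to inference mode (that is, the layer does not start normalizing with the moving mean and variance).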

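• Setting trainable (for example, trainable=True) on a layer or model that contains other layers also sets the trainable attribute of every inner layer recursively.
• If trainable is changed after compile() has been called, the new value does not take effect until compile() is called again.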
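The two notes above can be sketched with a nested model. This is a minimal illustration that assumes TensorFlow 2; the layer sizes, the `inner` and `model` names, and the `"sgd"` / `"mse"` compile settings are placeholders, not anything prescribed by the documentation.

```python
import tensorflow as tf

# A hypothetical "backbone" nested inside a larger model.
inner = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
    tf.keras.layers.BatchNormalization(),
])
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    inner,
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Setting trainable on the container propagates to every inner layer.
inner.trainable = False
frozen_flags = [layer.trainable for layer in inner.layers]
head_trainable = model.layers[-1].trainable
n_trainable = len(model.trainable_weights)  # only the final Dense layer remains

# trainable was changed after compile(), so compile again
# for the change to take effect during fit().
model.compile(optimizer="sgd", loss="mse")

# Setting it back to True propagates in the same way.
inner.trainable = True
unfrozen_flags = [layer.trainable for layer in inner.layers]

print("inner flags after freezing:", frozen_flags)
print("head still trainable:", head_trainable)
print("trainable weights while frozen:", n_trainable)
print("inner flags after unfreezing:", unfrozen_flags)
```

In a fine-tuning workflow, the second `compile()` call is the step that is easy to forget after toggling `trainable`.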

