Artificial Intelligence / Deep Reinforcement Learning: Tips-for-Training-Deep-Network slides (.pptx)
- Keywords:
- Artificial intelligence, deep reinforcement learning, Tips for Training Deep Network, slides
- Resource description:
Tips for Training Deep Network
- Training Strategy: Batch Normalization
- Activation Function: SELU
- Network Structure: Highway Network

Batch Normalization

Feature Scaling
- Scale the input so that the means of all dimensions are 0 and the variances are all 1.
- For each dimension i: x_i^r ← (x_i^r − m_i) / σ_i, where m_i and σ_i are the mean and standard deviation of dimension i over the training examples (r indexes the examples).
- In general, gradient descent converges much faster with feature scaling than without it.
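The standardization above is easy to sketch in code. Here is a minimal NumPy illustration (the function name feature_scaling, the eps guard, and the demo array are my own assumptions, not from the slides):

```python
import numpy as np

def feature_scaling(X, eps=1e-8):
    """Standardize each input dimension so its mean is 0 and its variance is 1.

    X has shape (num_examples, num_dims); m and sigma correspond to the
    slide's per-dimension mean m_i and standard deviation sigma_i.
    """
    m = X.mean(axis=0)               # mean of each dimension i over the examples
    sigma = X.std(axis=0)            # standard deviation of each dimension i
    return (X - m) / (sigma + eps)   # eps guards against constant dimensions

# Example: columns with very different scales end up with mean ~0 and variance ~1.
X = np.random.rand(100, 3) * np.array([1.0, 100.0, 0.01])
X_scaled = feature_scaling(X)
print(X_scaled.mean(axis=0), X_scaled.var(axis=0))
```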
How about the Hidden Layers?
- (Slide figure: Layer 1's output feeds the later layers, each input marked "Feature Scaling?", asking whether the same scaling should be applied inside the network.)
- A smaller learning rate can be helpful, but the training would be slower.
- Difficulty: the statistics of the hidden-layer outputs change during training (internal covariate shift), so a fixed scaling does not work. This motivates batch normalization.
Batch Normalization
- (Slide figures: a batch of examples is processed together; the mean μ and standard deviation σ of the pre-activations are computed over the batch, and each pre-activation is normalized before the sigmoid.)
- Note: batch normalization cannot be applied on a small batch, since μ and σ estimated from only a few examples are unreliable.
- How to do backpropagation? μ and σ depend on the examples in the batch, so the gradients have to flow through them as well.

Batch Normalization at the Testing Stage
- We do not have a batch at the testing stage.
- Ideal solution: compute μ and σ using the whole training set.
- Practical solution: keep a moving average of the μ and σ computed on the batches during training, and use the accumulated statistics at test time.
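A hedged sketch of how the training-time batch statistics and the test-time moving averages described above fit together (the class name, interface, and momentum value are illustrative assumptions; gamma and beta are the usual learnable scale and shift of batch normalization):

```python
import numpy as np

class BatchNorm1D:
    """Batch normalization for a (batch, features) pre-activation, applied before the sigmoid."""

    def __init__(self, num_features, momentum=0.99, eps=1e-5):
        self.gamma = np.ones(num_features)            # learnable scale
        self.beta = np.zeros(num_features)            # learnable shift
        self.running_mu = np.zeros(num_features)      # moving average of batch means
        self.running_sigma2 = np.ones(num_features)   # moving average of batch variances
        self.momentum = momentum
        self.eps = eps

    def __call__(self, z, training=True):
        if training:
            mu = z.mean(axis=0)        # statistics of the current batch
            sigma2 = z.var(axis=0)
            # Practical solution for testing: accumulate moving averages during training.
            self.running_mu = self.momentum * self.running_mu + (1 - self.momentum) * mu
            self.running_sigma2 = self.momentum * self.running_sigma2 + (1 - self.momentum) * sigma2
        else:
            # Testing stage: no batch available, so use the accumulated statistics.
            mu, sigma2 = self.running_mu, self.running_sigma2
        z_hat = (z - mu) / np.sqrt(sigma2 + self.eps)
        return self.gamma * z_hat + self.beta

bn = BatchNorm1D(4)
z = np.random.randn(32, 4) * 5 + 3             # a batch of pre-activations
a = 1 / (1 + np.exp(-bn(z, training=True)))    # normalize over the batch, then sigmoid
```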
Batch Normalization: Benefits
- BN reduces training time and makes very deep networks trainable.
- Because of less internal covariate shift, we can use larger learning rates.
- Less exploding/vanishing gradient; especially effective for sigmoid, tanh, etc.
- Learning is less affected by initialization.
- BN reduces the demand for regularization.

To Learn More
- Batch Renormalization
- Layer Normalization
- Instance Normalization
- Weight Normalization
- Spectral Normalization

Activation Function: SELU

ReLU (Rectified Linear Unit)
Reasons for using it:
1. Fast to compute
2. Biological reason
3. Equivalent to an infinite number of sigmoids with different biases
4. Handles the vanishing gradient problem
(The preview text ends here.)
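The preview breaks off just as the slides turn to the activation functions named above. As a small illustrative sketch (not the slides' own code), ReLU and SELU can be written as follows; the SELU constants are the values from the Klambauer et al. (2017) paper, not from this preview:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: cheap to compute and keeps gradients alive for x > 0."""
    return np.maximum(0.0, x)

def selu(x, alpha=1.6732632423543772, lam=1.0507009873554805):
    """Scaled Exponential Linear Unit.

    With suitable weight initialization, SELU is designed to keep activations
    at roughly zero mean and unit variance across layers.
    """
    return lam * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.linspace(-3, 3, 7)
print(relu(x))
print(selu(x))
```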