# LeNet

[Figure: LeNet]

Conv2d output shape: 	 torch.Size([1, 6, 28, 28])
Sigmoid output shape: 	 torch.Size([1, 6, 28, 28])
AvgPool2d output shape: 	 torch.Size([1, 6, 14, 14])
Conv2d output shape: 	 torch.Size([1, 16, 10, 10])
Sigmoid output shape: 	 torch.Size([1, 16, 10, 10])
AvgPool2d output shape: 	 torch.Size([1, 16, 5, 5])
Flatten output shape: 	 torch.Size([1, 400])
Linear output shape: 	 torch.Size([1, 120])
Sigmoid output shape: 	 torch.Size([1, 120])
Linear output shape: 	 torch.Size([1, 84])
Sigmoid output shape: 	 torch.Size([1, 84])
Linear output shape: 	 torch.Size([1, 10])
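The spatial sizes in this printout follow from the usual output-size formula, floor((n + 2p - k) / s) + 1. The sketch below reproduces them in plain Python. The kernel sizes, padding and strides are assumptions (the usual d2l LeNet settings), since the notes only show the resulting shapes, and `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed layer settings (usual d2l LeNet; not stated in the notes).
size = 28                                  # Fashion-MNIST images are 28x28
c1 = conv_out(size, kernel=5, padding=2)   # first Conv2d, 6 output channels
p1 = conv_out(c1, kernel=2, stride=2)      # first AvgPool2d
c2 = conv_out(p1, kernel=5)                # second Conv2d, 16 output channels
p2 = conv_out(c2, kernel=2, stride=2)      # second AvgPool2d
flat = 16 * p2 * p2                        # Flatten: channels * height * width

print(c1, p1, c2, p2, flat)  # 28 14 10 5 400
```

The final value, 400, is the input width of the first Linear layer.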

# Dataset

Here b is the batch size and n is the number of DataLoader worker processes:

    train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=b, num_workers=n)

# Efficiency evaluation

# Experimental data

learning rate = 0.9, num_epochs = 5

GPU = NVIDIA GeForce RTX 3050 4GB Laptop GPU

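| Batch Size | Num Workers | Device | Time | Accuracy |
|---|---|---|---|---|
| 64 | 2 | GPU | 53.6s | 0.795 |
| 256 | 4 | GPU | 57.5s | 0.635 |
| 32 | 2 | GPU | 1m05s | 0.803 |
| 32 | 1 | GPU | 1m09s | 0.822 |
| 16 | 1 | GPU | 1m20s | 0.854 |
| 8 | 1 | GPU | 1m55s | 0.856 |
| 256 | 8 | GPU | 1m55s | 0.509 |
| 8 | 1 | CPU | 3m01s | 0.852 |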

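# Analysis

- Too many num workers reduces efficiency (batch size 256 took 57.5s with 4 workers but 1m55s with 8), possibly because of excessive disk usage (virtual memory).
- With the number of epochs fixed, a larger batch size lowers prediction accuracy to some extent.
- Time hits a bottleneck: it does not keep decreasing as batch size grows (53.6s at batch size 64 vs. 57.5s at 256), probably because data loading cannot keep up with the GPU.
- Accuracy is low at batch size 256 and 64, probably because there were too few epochs and training had not yet converged.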
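The "not yet converged" explanation can be made concrete by counting parameter updates. The sketch below is a back-of-the-envelope calculation, assuming the 60,000-image Fashion-MNIST training set, 5 epochs, and that the last partial batch is kept (the DataLoader default). `total_updates` is a hypothetical helper name.

```python
import math

TRAIN_SIZE = 60_000   # Fashion-MNIST training images
NUM_EPOCHS = 5        # as in the experiments above

def total_updates(batch_size, num_epochs=NUM_EPOCHS, n=TRAIN_SIZE):
    """Number of optimizer steps in a full run (last partial batch kept)."""
    return math.ceil(n / batch_size) * num_epochs

for b in (8, 16, 32, 64, 256):
    print(f"batch size {b:>3}: {total_updates(b)} updates")
```

Under these assumptions, batch size 256 gets 1175 updates in 5 epochs while batch size 8 gets 37500, so the large-batch runs have had far fewer steps at the same learning rate.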

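# Problems encountered during training

1. DataLoader worker exited unexpectedly
2. VS Code freezes and quits (OOM crash)
3. Out of GPU memory
4. Training hangs, staying at the same progress for a long time
5. GPU memory is still occupied (~1500MB) after training finishes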

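Problems 1 and 2 were mainly caused by insufficient system RAM. The fix was to enlarge the virtual memory (page file on drive D) under Advanced System Settings.

Problem 3 is solved by using a smaller batch size.

The cause of problem 4 is still unclear (a deadlock? GPU cache?). The workaround is to restart the kernel.

Problem 5 raises open questions: does the leftover allocation have any effect, will it interfere with subsequent training, and is it necessary to restart the kernel or call torch.cuda.empty_cache()?

In addition, some GPU memory remains occupied even after torch.cuda.empty_cache(), apparently because load_data_fashion_mnist uses cuda:0 as its default device.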
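For problem 5, one way to see what PyTorch itself is still holding is to query its allocator before and after clearing the cache. This is a minimal sketch, not a diagnosis: `gpu_memory_mb` and `release_cached_memory` are hypothetical helper names. One possible explanation (an assumption, not verified for this setup) is that the usage reported by tools such as nvidia-smi includes the CUDA context, which torch.cuda.empty_cache() does not release.

```python
import gc
import torch

def gpu_memory_mb():
    """Return (allocated, reserved) MB as tracked by PyTorch's caching allocator."""
    if not torch.cuda.is_available():
        return 0.0, 0.0
    mb = 1024 ** 2
    return torch.cuda.memory_allocated() / mb, torch.cuda.memory_reserved() / mb

def release_cached_memory():
    """Collect unreachable Python objects, then return cached blocks to the driver."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return gpu_memory_mb()

print("before:", gpu_memory_mb())
print("after: ", release_cached_memory())
```

Tensors that are still referenced (for example a model or a batch left in a notebook variable) are not freed by this; they would need to be deleted with `del` first.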

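# Modifying the network architecture

# Replacing all sigmoid with ReLU

Keeping the training settings lr = 0.9, num_epochs = 5 unchanged, the result was:

train acc and test acc are both only 0.1 and the loss is very large, which is no better than the random guessing of an untrained model.

This underfitting may be caused by gradient explosion triggered by a learning rate that is too large.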
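As a reference point for "no better than random guessing", the chance-level numbers for a 10-class problem can be computed directly. This is a small worked example rather than part of the experiment; it assumes the loss is the usual cross-entropy averaged over samples.

```python
import math

num_classes = 10                           # Fashion-MNIST has 10 classes

chance_accuracy = 1 / num_classes          # always guessing uniformly
chance_loss = -math.log(1 / num_classes)   # cross-entropy of a uniform prediction

print(f"chance accuracy: {chance_accuracy:.3f}")
print(f"chance loss:     {chance_loss:.3f}")
```

An accuracy of 0.1 matches chance exactly, and a loss well above ln(10) would suggest that training diverged rather than simply made no progress.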

After lowering lr from 0.9 to 0.1, accuracy improved markedly, reaching 0.84~0.85.

After 10 epochs of training: loss 0.290, train acc 0.891, test acc 0.867

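# Using MaxPool

lr = 0.1, num_epochs = 10

loss 0.261, train acc 0.902, test acc 0.885

A slight improvement (test acc 0.867 to 0.885).

Addendum: after several more runs, lr = 0.1 still occasionally produces the underfitting (gradient explosion) case, so lr is set to 0.05.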
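The two changes in this section (sigmoid to ReLU, average pooling to max pooling) can be sketched as a single builder. This is an illustration under assumptions: `make_lenet` is a hypothetical helper, and the layer sizes are the usual d2l LeNet settings inferred from the output shapes printed at the top, not copied from the original notebook.

```python
import torch
from torch import nn

def make_lenet(activation=nn.Sigmoid, pool=nn.AvgPool2d):
    """Build a LeNet-style network with a swappable activation and pooling layer."""
    return nn.Sequential(
        nn.Conv2d(1, 6, kernel_size=5, padding=2), activation(),
        pool(kernel_size=2, stride=2),
        nn.Conv2d(6, 16, kernel_size=5), activation(),
        pool(kernel_size=2, stride=2),
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), activation(),
        nn.Linear(120, 84), activation(),
        nn.Linear(84, 10),
    )

# ReLU + MaxPool variant, with the lower learning rate from the addendum.
net = make_lenet(activation=nn.ReLU, pool=nn.MaxPool2d)
optimizer = torch.optim.SGD(net.parameters(), lr=0.05)

X = torch.rand(1, 1, 28, 28)   # dummy input with the Fashion-MNIST shape
out = net(X)
print(out.shape)
```

Calling `make_lenet()` with no arguments gives the original sigmoid and average-pooling version, which makes side-by-side comparisons easier.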

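# Visualizing convolutional layer outputs

# Comparison with the original LeNet (b=64, n=2)

# More results

The 6 channels of the first ReLU layer

The 16 channels of the second ReLU layer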
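Per-channel feature maps like these can be captured with forward hooks. The sketch below is one possible way to do it, not necessarily how the figures were produced: the network definition repeats the assumed ReLU + MaxPool layout, `save_output` is a hypothetical helper, and a random tensor stands in for a real Fashion-MNIST image.

```python
import torch
from torch import nn

# Assumed ReLU + MaxPool LeNet layout (indices 1 and 4 are the two conv-stage ReLUs).
net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10),
)

activations = {}

def save_output(name):
    """Return a forward hook that stores the layer's output under `name`."""
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

handles = [
    net[1].register_forward_hook(save_output("relu1")),
    net[4].register_forward_hook(save_output("relu2")),
]

X = torch.rand(1, 1, 28, 28)   # stand-in for one 28x28 grayscale image
with torch.no_grad():
    net(X)

for h in handles:
    h.remove()                 # detach the hooks once the maps are captured

print(activations["relu1"].shape)  # torch.Size([1, 6, 28, 28])
print(activations["relu2"].shape)  # torch.Size([1, 16, 10, 10])
```

Each slice `activations["relu1"][0, c]` is a 28x28 map that can be drawn as a grayscale image, one per channel.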

First written 2024/08/07

Last updated 2024/08/08