This time we use the same MNIST dataset as in the previous post and see what performance we can get with a CNN.
Unlike a fully-connected network, a CNN does not need to flatten the input image up front; it shares convolution weights over local receptive fields, so it uses comparatively few parameters, computes quickly, and is strong at extracting local features.
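As a quick sanity check of that parameter argument, the sketch below compares a single 5x5 conv layer against a fully-connected layer that maps the flattened 28x28 image to 32 feature maps of the same size. The layer names and numbers here are illustrative, not from the post itself.

import torch.nn as nn

conv = nn.Conv2d(1, 32, kernel_size=5, padding=2)   # 32*(1*5*5) + 32 = 832 parameters
fc = nn.Linear(28 * 28, 32 * 28 * 28)               # 784*25088 + 25088 = 19,694,080 parameters
print(sum(p.numel() for p in conv.parameters()))    # 832
print(sum(p.numel() for p in fc.parameters()))      # 19694080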
Import libraries and set the device
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable  # note: Variable is deprecated; plain tensors behave the same way
import os
import matplotlib.pyplot as plt
import numpy as np
torch.__version__
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(2891)
num_gpu = 1
if torch.cuda.device_count() > 1:
    num_gpu = torch.cuda.device_count()
print("Let's use", num_gpu, "GPUs!")
print('our device', device)
'''
Let's use 1 GPUs!
our device cuda
'''
Design a 2-layer CNN (in the "add here" sections, practice adding batch normalization and stacking the network deeper)
class CNN(nn.Module):
    def __init__(self, num_class, drop_prob):
        super(CNN, self).__init__()
        # input is 28x28
        # padding=2 for same padding
        self.conv1 = nn.Conv2d(1, 32, 5, padding=2)  # in_channels, out_channels, kernel_size, padding
        # feature map size is 14*14 after pooling
        # padding=2 for same padding
        self.conv2 = nn.Conv2d(32, 64, 5, padding=2)
        # feature map size is 7*7 after pooling
        '''
        add here.. make it deeper...
        batch normalization ++
        '''
        self.dropout = nn.Dropout(p=drop_prob)
        self.fc1 = nn.Linear(64*7*7, 1024)
        self.reduce_layer = nn.Linear(1024, num_class)
        self.log_softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> (B, 32, 14, 14)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> (B, 64, 7, 7)
        '''
        add here.. make it deeper...
        and use dropout
        '''
        x = x.view(-1, 64*7*7)  # flatten: nn.Linear expects a 1D feature vector per sample
        x = F.relu(self.fc1(x))
        output = self.reduce_layer(x)
        return self.log_softmax(output)
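One possible way to fill in the "add here" placeholders is sketched below: batch normalization after each conv, one extra 3x3 conv block, and the dropout layer applied before the classifier. The layer names (bn1, conv3, ...) are illustrative choices of mine, not part of the original post.

class DeeperCNN(nn.Module):
    def __init__(self, num_class, drop_prob):
        super(DeeperCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 5, padding=2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 5, padding=2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 64, 3, padding=1)  # extra block, keeps the 7x7 spatial size
        self.bn3 = nn.BatchNorm2d(64)
        self.dropout = nn.Dropout(p=drop_prob)
        self.fc1 = nn.Linear(64*7*7, 1024)
        self.reduce_layer = nn.Linear(1024, num_class)
        self.log_softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.bn1(self.conv1(x))), 2)  # (B, 32, 14, 14)
        x = F.max_pool2d(F.relu(self.bn2(self.conv2(x))), 2)  # (B, 64, 7, 7)
        x = F.relu(self.bn3(self.conv3(x)))                   # (B, 64, 7, 7)
        x = x.view(-1, 64*7*7)
        x = self.dropout(F.relu(self.fc1(x)))                 # dropout before the classifier
        return self.log_softmax(self.reduce_layer(x))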
Load the model and check its parameters and shapes
model = CNN(10, 0.3)
model.to(device)
'''
CNN(
(conv1): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(dropout): Dropout(p=0.3, inplace=False)
(fc1): Linear(in_features=3136, out_features=1024, bias=True)
(reduce_layer): Linear(in_features=1024, out_features=10, bias=True)
(log_softmax): LogSoftmax(dim=1)
)
'''
# model parameter shapes
for p in model.parameters():
    print(p.size())
'''
torch.Size([32, 1, 5, 5])
torch.Size([32])
torch.Size([64, 32, 5, 5])
torch.Size([64])
torch.Size([1024, 3136])
torch.Size([1024])
torch.Size([10, 1024])
torch.Size([10])
'''
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

model_hp = count_parameters(model)
print('number of trainable parameters:', model_hp)
# number of trainable parameters: 3274634
Data setup and train/test DataLoader configuration
batch_size = 64
train_loader = torch.utils.data.DataLoader(datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor()),batch_size=batch_size, shuffle=True)
print(len(train_loader)) # 938, 64 * 938 = 60032
test_loader = torch.utils.data.DataLoader(datasets.MNIST('data', train=False, transform=transforms.ToTensor()),batch_size=1000)
print(len(test_loader)) # 10, (10 * 1000 = 10000)
'''
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
9913344/? [04:51<00:00, 34050.71it/s]
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
29696/? [00:01<00:00, 26930.77it/s]
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664/? [00:00<00:00, 3989386.71it/s]
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120/? [00:00<00:00, 139107.35it/s]
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
Processing...
Done!
938
10
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py:502: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
'''
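The loaders above use a bare ToTensor(); if you also want to normalize the inputs, you could rebuild the train loader as sketched below. The mean/std values (0.1307, 0.3081) are the commonly quoted MNIST statistics, not something computed in this post.

mnist_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # per-channel mean / std for MNIST
])
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=True, download=True, transform=mnist_transform),
    batch_size=batch_size, shuffle=True)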
Set up the Adam optimizer with learning rate 1e-4
optimizer = optim.Adam(model.parameters(), lr=1e-4)
Train the model; the number of epochs is set to 10
model.train()
epochs = 10  # change as needed
total_loss = 0
total_acc = 0
train_loss = []
train_accuracy = []
i = 0
for epoch in range(epochs):
    for data, target in train_loader:
        data, target = Variable(data), Variable(target)
        data = data.to(device)
        target = target.to(device)

        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()  # compute gradients
        optimizer.step()  # update parameters

        total_loss += loss.item()
        train_loss.append(total_loss / (i + 1))  # running average of the loss

        prediction = output.data.max(1)[1]  # index of the max log-probability
        accuracy = prediction.eq(target.data).sum().item() / data.size(0) * 100
        total_acc += accuracy
        train_accuracy.append(total_acc / (i + 1))  # running average of the accuracy

        if i % 10 == 0:
            print('Epoch: {}\t Train Step: {}\tLoss: {:.3f}\tAccuracy: {:.3f}'.format(epoch+1, i, loss.item(), accuracy))
        i += 1
    print('Epoch: {} finished'.format(epoch+1))
'''
Epoch: 10 Train Step: 8450 Loss: 0.015 Accuracy: 100.000
Epoch: 10 Train Step: 8460 Loss: 0.015 Accuracy: 100.000
Epoch: 10 Train Step: 8470 Loss: 0.052 Accuracy: 98.438
Epoch: 10 Train Step: 8480 Loss: 0.005 Accuracy: 100.000
Epoch: 10 Train Step: 8490 Loss: 0.012 Accuracy: 100.000
Epoch: 10 Train Step: 8500 Loss: 0.032 Accuracy: 98.438
Epoch: 10 Train Step: 8510 Loss: 0.014 Accuracy: 100.000
Epoch: 10 Train Step: 8520 Loss: 0.037 Accuracy: 98.438
Epoch: 10 Train Step: 8530 Loss: 0.006 Accuracy: 100.000
Epoch: 10 Train Step: 8540 Loss: 0.006 Accuracy: 100.000
Epoch: 10 Train Step: 8550 Loss: 0.060 Accuracy: 98.438
Epoch: 10 Train Step: 8560 Loss: 0.004 Accuracy: 100.000
Epoch: 10 Train Step: 8570 Loss: 0.011 Accuracy: 100.000
Epoch: 10 Train Step: 8580 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8590 Loss: 0.075 Accuracy: 96.875
Epoch: 10 Train Step: 8600 Loss: 0.006 Accuracy: 100.000
Epoch: 10 Train Step: 8610 Loss: 0.035 Accuracy: 98.438
Epoch: 10 Train Step: 8620 Loss: 0.005 Accuracy: 100.000
Epoch: 10 Train Step: 8630 Loss: 0.059 Accuracy: 98.438
Epoch: 10 Train Step: 8640 Loss: 0.026 Accuracy: 98.438
Epoch: 10 Train Step: 8650 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 8660 Loss: 0.017 Accuracy: 100.000
Epoch: 10 Train Step: 8670 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 8680 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8690 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 8700 Loss: 0.005 Accuracy: 100.000
Epoch: 10 Train Step: 8710 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8720 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8730 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 8740 Loss: 0.049 Accuracy: 98.438
Epoch: 10 Train Step: 8750 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8760 Loss: 0.028 Accuracy: 98.438
Epoch: 10 Train Step: 8770 Loss: 0.031 Accuracy: 98.438
Epoch: 10 Train Step: 8780 Loss: 0.008 Accuracy: 100.000
Epoch: 10 Train Step: 8790 Loss: 0.059 Accuracy: 98.438
Epoch: 10 Train Step: 8800 Loss: 0.011 Accuracy: 100.000
Epoch: 10 Train Step: 8810 Loss: 0.025 Accuracy: 98.438
Epoch: 10 Train Step: 8820 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8830 Loss: 0.034 Accuracy: 96.875
Epoch: 10 Train Step: 8840 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 8850 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 8860 Loss: 0.009 Accuracy: 100.000
Epoch: 10 Train Step: 8870 Loss: 0.020 Accuracy: 98.438
Epoch: 10 Train Step: 8880 Loss: 0.011 Accuracy: 100.000
Epoch: 10 Train Step: 8890 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 8900 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 8910 Loss: 0.013 Accuracy: 98.438
Epoch: 10 Train Step: 8920 Loss: 0.043 Accuracy: 98.438
Epoch: 10 Train Step: 8930 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 8940 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 8950 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 8960 Loss: 0.018 Accuracy: 98.438
Epoch: 10 Train Step: 8970 Loss: 0.006 Accuracy: 100.000
Epoch: 10 Train Step: 8980 Loss: 0.033 Accuracy: 98.438
Epoch: 10 Train Step: 8990 Loss: 0.022 Accuracy: 100.000
Epoch: 10 Train Step: 9000 Loss: 0.008 Accuracy: 100.000
Epoch: 10 Train Step: 9010 Loss: 0.011 Accuracy: 100.000
Epoch: 10 Train Step: 9020 Loss: 0.000 Accuracy: 100.000
Epoch: 10 Train Step: 9030 Loss: 0.013 Accuracy: 100.000
Epoch: 10 Train Step: 9040 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 9050 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 9060 Loss: 0.030 Accuracy: 98.438
Epoch: 10 Train Step: 9070 Loss: 0.013 Accuracy: 100.000
Epoch: 10 Train Step: 9080 Loss: 0.009 Accuracy: 100.000
Epoch: 10 Train Step: 9090 Loss: 0.018 Accuracy: 98.438
Epoch: 10 Train Step: 9100 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 9110 Loss: 0.007 Accuracy: 100.000
Epoch: 10 Train Step: 9120 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 9130 Loss: 0.008 Accuracy: 100.000
Epoch: 10 Train Step: 9140 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 9150 Loss: 0.042 Accuracy: 98.438
Epoch: 10 Train Step: 9160 Loss: 0.004 Accuracy: 100.000
Epoch: 10 Train Step: 9170 Loss: 0.001 Accuracy: 100.000
Epoch: 10 Train Step: 9180 Loss: 0.055 Accuracy: 98.438
Epoch: 10 Train Step: 9190 Loss: 0.016 Accuracy: 98.438
Epoch: 10 Train Step: 9200 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 9210 Loss: 0.007 Accuracy: 100.000
Epoch: 10 Train Step: 9220 Loss: 0.000 Accuracy: 100.000
Epoch: 10 Train Step: 9230 Loss: 0.007 Accuracy: 100.000
Epoch: 10 Train Step: 9240 Loss: 0.004 Accuracy: 100.000
Epoch: 10 Train Step: 9250 Loss: 0.101 Accuracy: 96.875
Epoch: 10 Train Step: 9260 Loss: 0.017 Accuracy: 100.000
Epoch: 10 Train Step: 9270 Loss: 0.007 Accuracy: 100.000
Epoch: 10 Train Step: 9280 Loss: 0.005 Accuracy: 100.000
Epoch: 10 Train Step: 9290 Loss: 0.002 Accuracy: 100.000
Epoch: 10 Train Step: 9300 Loss: 0.006 Accuracy: 100.000
Epoch: 10 Train Step: 9310 Loss: 0.012 Accuracy: 100.000
Epoch: 10 Train Step: 9320 Loss: 0.009 Accuracy: 100.000
Epoch: 10 Train Step: 9330 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 9340 Loss: 0.004 Accuracy: 100.000
Epoch: 10 Train Step: 9350 Loss: 0.003 Accuracy: 100.000
Epoch: 10 Train Step: 9360 Loss: 0.030 Accuracy: 98.438
Epoch: 10 Train Step: 9370 Loss: 0.008 Accuracy: 100.000
Epoch: 10 finished
'''
We can see that the training accuracy is higher than what the MLP reached.
Plot the results
plt.figure()
plt.plot(np.arange(len(train_loss)), train_loss)
plt.show()
#plt.savefig('./train_loss_result.png')
plt.figure()
plt.plot(np.arange(len(train_accuracy)), train_accuracy)
plt.show()
#plt.savefig('./train_accuracy_result.png')
Evaluation results
with torch.no_grad():
    model.eval()
    correct = 0
    for data, target in test_loader:
        data, target = Variable(data), Variable(target)
        data = data.to(device)
        target = target.to(device)
        output = model(data)
        prediction = output.data.max(1)[1]
        correct += prediction.eq(target.data).sum()
    print('\nTest set: Accuracy: {:.2f}%'.format(100. * correct / len(test_loader.dataset)))
# Test set: Accuracy: 99.11%
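Beyond the single overall number, it can be useful to see which digits the model handles best; a minimal per-class accuracy sketch over the same test_loader (the class_correct/class_total names are mine, not from the post) is:

class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    model.eval()
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        prediction = model(data).max(1)[1]
        for c in range(10):
            mask = target == c
            class_total[c] += mask.sum().item()
            class_correct[c] += (prediction[mask] == c).sum().item()
for c in range(10):
    print('digit {}: {:.2f}%'.format(c, 100. * class_correct[c] / class_total[c]))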
Likewise, the CNN clearly outperforms the MLP on the test set.