
This time we use the same MNIST dataset as in the previous post, but train a CNN and see how well it performs.

 

Unlike a network built only from fully-connected layers, a CNN does not have to flatten the input image right away, so it needs comparatively few parameters and is fast to compute; through its receptive fields it is also good at extracting local features. A rough parameter comparison is sketched below.
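
As a back-of-the-envelope check of the parameter claim (my own toy comparison, not from the original post): a 5x5 convolution producing 32 feature maps from a 28x28 grayscale image needs only a few hundred weights thanks to weight sharing, while a fully-connected layer producing the same number of output values needs millions.

import torch.nn as nn

# Toy comparison (illustrative numbers, not part of the original post):
# a conv layer shares its 5x5 filters across the whole image,
# while a Linear layer needs one weight per (input pixel, output value) pair.
conv = nn.Conv2d(1, 32, kernel_size=5, padding=2)   # 32*1*5*5 + 32 = 832 parameters
fc = nn.Linear(28 * 28, 32 * 28 * 28)               # 784*25088 + 25088 ~= 19.7M parameters

print(sum(p.numel() for p in conv.parameters()))    # 832
print(sum(p.numel() for p in fc.parameters()))      # 19694080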

 

Importing libraries and setting up the device

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import os
import matplotlib.pyplot as plt
import numpy as np
torch.__version__

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(2891)
num_gpu = 1
if torch.cuda.device_count() > 1:
    num_gpu = torch.cuda.device_count()
print("Let's use", num_gpu, "GPUs!")

print('our device', device)
'''
Let's use 1 GPUs!
our device cuda
'''

Designing a 2-layer CNN (as an exercise, add batch normalization and stack more layers in the "add here" spots; one possible sketch follows the class definition)

class CNN(nn.Module):
    def __init__(self, num_class, drop_prob):
        super(CNN, self).__init__()
        # input is 28x28
        # padding=2 for same padding
        self.conv1 = nn.Conv2d(1, 32, 5, padding=2)  # in_channels, out_channels, kernel_size, padding
        # feature map size is 14*14 by pooling
        # padding=2 for same padding
        self.conv2 = nn.Conv2d(32, 64, 5, padding=2)
        # feature map size is 7*7 by pooling
        '''
        add here.. make more deep...

        batchnormalization ++
        '''
        self.dropout = nn.Dropout(p=drop_prob)

        self.fc1 = nn.Linear(64*7*7, 1024)
        self.reduce_layer = nn.Linear(1024, num_class)
        self.log_softmax = nn.LogSoftmax(dim=1)
        
    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # -> (B, 32, 14, 14)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # -> (B, 64, 7, 7)
        '''
        add here.. make more deep...
        and use dropout
        '''
        x = x.view(-1, 64*7*7)   # flatten to (B, 64*7*7) so the feature map can be fed to the Linear layers
        x = F.relu(self.fc1(x))
        
        output = self.reduce_layer(x)
        return self.log_softmax(output)
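
For the "add here" exercise, here is one possible sketch of a deeper variant with batch normalization and dropout. The extra layer sizes are my own illustrative choices, not from the original post.

class DeeperCNN(nn.Module):
    """One possible answer to the exercise: a third conv block with BatchNorm,
    plus dropout before the classifier (sizes are illustrative)."""
    def __init__(self, num_class, drop_prob):
        super(DeeperCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 5, padding=2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 5, padding=2)
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)   # extra block, keeps the 7x7 spatial size
        self.bn3 = nn.BatchNorm2d(128)
        self.dropout = nn.Dropout(p=drop_prob)
        self.fc1 = nn.Linear(128*7*7, 1024)
        self.reduce_layer = nn.Linear(1024, num_class)
        self.log_softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.bn1(self.conv1(x))), 2)   # -> (B, 32, 14, 14)
        x = F.max_pool2d(F.relu(self.bn2(self.conv2(x))), 2)   # -> (B, 64, 7, 7)
        x = F.relu(self.bn3(self.conv3(x)))                    # -> (B, 128, 7, 7)
        x = x.view(x.size(0), -1)
        x = self.dropout(F.relu(self.fc1(x)))
        return self.log_softmax(self.reduce_layer(x))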

 

Loading the model and checking its parameters and shapes

model = CNN(10, 0.3)
model.to(device)
'''
CNN(
  (conv1): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (dropout): Dropout(p=0.3, inplace=False)
  (fc1): Linear(in_features=3136, out_features=1024, bias=True)
  (reduce_layer): Linear(in_features=1024, out_features=10, bias=True)
  (log_softmax): LogSoftmax(dim=1)
)
'''
# parameter shapes
for p in model.parameters():
    print(p.size())
'''
torch.Size([32, 1, 5, 5])
torch.Size([32])
torch.Size([64, 32, 5, 5])
torch.Size([64])
torch.Size([1024, 3136])
torch.Size([1024])
torch.Size([10, 1024])
torch.Size([10])
'''
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

model_hp = count_parameters(model)
print('model"s hyper parameters', model_hp)
# model"s hyper parameters 3274634

Setting up the data and the train/test loaders

batch_size = 64
train_loader = torch.utils.data.DataLoader(datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor()),batch_size=batch_size, shuffle=True)
print(len(train_loader)) # 938 batches (937 * 64 + 32 = 60000 images)
test_loader = torch.utils.data.DataLoader(datasets.MNIST('data', train=False, transform=transforms.ToTensor()),batch_size=1000)
print(len(test_loader)) # 10, (10 * 1000 = 10000)
'''
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
9913344/? [04:51<00:00, 34050.71it/s]

Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
29696/? [00:01<00:00, 26930.77it/s]

Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664/? [00:00<00:00, 3989386.71it/s]

Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120/? [00:00<00:00, 139107.35it/s]

Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw

Processing...
Done!
938
10
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py:502: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
'''
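
transforms.ToTensor() only scales the pixels to [0, 1]. A common variant, not used in this post, also normalizes with the usual MNIST mean and standard deviation (0.1307 and 0.3081); a minimal sketch:

# Optional variant (not used in this post): loaders with normalization added.
mnist_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),   # commonly quoted MNIST mean / std
])

normalized_train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=True, download=True, transform=mnist_transform),
    batch_size=batch_size, shuffle=True)
normalized_test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=False, transform=mnist_transform),
    batch_size=1000)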

 

Setting the optimizer to Adam with a learning rate of 1e-4

optimizer = optim.Adam(model.parameters(), lr=1e-4)
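
If you want to decay the learning rate during training (this post keeps it fixed at 1e-4), one common option is a step scheduler; a minimal sketch:

# Optional (not used in this post): halve the learning rate every 5 epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
# then call scheduler.step() once per epoch, after the inner batch loop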

 

Training the model, with the number of epochs set to 10

model.train()
epochs = 10 ### change
total_loss = 0
total_acc = 0
train_loss = []
train_accuracy = []
i = 0
for epoch in range(epochs):
    for data, target in train_loader:
        data = data.to(device)
        target = target.to(device)

        optimizer.zero_grad()
        output = model(data)

        loss = F.nll_loss(output, target)
        loss.backward()    # compute gradients
        optimizer.step()   # update parameters

        total_loss += loss.item()              # .item() detaches the value so the graph is not kept around
        train_loss.append(total_loss / (i + 1))  # running average of the loss

        prediction = output.data.max(1)[1]     # index of the max log-probability
        accuracy = prediction.eq(target).sum().item() / len(data) * 100

        total_acc += accuracy
        train_accuracy.append(total_acc / (i + 1))  # running average of the accuracy

        if i % 10 == 0:
            print('Epoch: {}\t Train Step: {}\tLoss: {:.3f}\tAccuracy: {:.3f}'.format(epoch+1, i, loss.item(), accuracy))
        i += 1
    print('Epoch: {} finished'.format(epoch+1))
'''
Epoch: 10	 Train Step: 8450	Loss: 0.015	Accuracy: 100.000
Epoch: 10	 Train Step: 8460	Loss: 0.015	Accuracy: 100.000
Epoch: 10	 Train Step: 8470	Loss: 0.052	Accuracy: 98.438
Epoch: 10	 Train Step: 8480	Loss: 0.005	Accuracy: 100.000
Epoch: 10	 Train Step: 8490	Loss: 0.012	Accuracy: 100.000
Epoch: 10	 Train Step: 8500	Loss: 0.032	Accuracy: 98.438
Epoch: 10	 Train Step: 8510	Loss: 0.014	Accuracy: 100.000
Epoch: 10	 Train Step: 8520	Loss: 0.037	Accuracy: 98.438
Epoch: 10	 Train Step: 8530	Loss: 0.006	Accuracy: 100.000
Epoch: 10	 Train Step: 8540	Loss: 0.006	Accuracy: 100.000
Epoch: 10	 Train Step: 8550	Loss: 0.060	Accuracy: 98.438
Epoch: 10	 Train Step: 8560	Loss: 0.004	Accuracy: 100.000
Epoch: 10	 Train Step: 8570	Loss: 0.011	Accuracy: 100.000
Epoch: 10	 Train Step: 8580	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8590	Loss: 0.075	Accuracy: 96.875
Epoch: 10	 Train Step: 8600	Loss: 0.006	Accuracy: 100.000
Epoch: 10	 Train Step: 8610	Loss: 0.035	Accuracy: 98.438
Epoch: 10	 Train Step: 8620	Loss: 0.005	Accuracy: 100.000
Epoch: 10	 Train Step: 8630	Loss: 0.059	Accuracy: 98.438
Epoch: 10	 Train Step: 8640	Loss: 0.026	Accuracy: 98.438
Epoch: 10	 Train Step: 8650	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 8660	Loss: 0.017	Accuracy: 100.000
Epoch: 10	 Train Step: 8670	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 8680	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8690	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 8700	Loss: 0.005	Accuracy: 100.000
Epoch: 10	 Train Step: 8710	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8720	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8730	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 8740	Loss: 0.049	Accuracy: 98.438
Epoch: 10	 Train Step: 8750	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8760	Loss: 0.028	Accuracy: 98.438
Epoch: 10	 Train Step: 8770	Loss: 0.031	Accuracy: 98.438
Epoch: 10	 Train Step: 8780	Loss: 0.008	Accuracy: 100.000
Epoch: 10	 Train Step: 8790	Loss: 0.059	Accuracy: 98.438
Epoch: 10	 Train Step: 8800	Loss: 0.011	Accuracy: 100.000
Epoch: 10	 Train Step: 8810	Loss: 0.025	Accuracy: 98.438
Epoch: 10	 Train Step: 8820	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8830	Loss: 0.034	Accuracy: 96.875
Epoch: 10	 Train Step: 8840	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 8850	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 8860	Loss: 0.009	Accuracy: 100.000
Epoch: 10	 Train Step: 8870	Loss: 0.020	Accuracy: 98.438
Epoch: 10	 Train Step: 8880	Loss: 0.011	Accuracy: 100.000
Epoch: 10	 Train Step: 8890	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 8900	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 8910	Loss: 0.013	Accuracy: 98.438
Epoch: 10	 Train Step: 8920	Loss: 0.043	Accuracy: 98.438
Epoch: 10	 Train Step: 8930	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 8940	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 8950	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 8960	Loss: 0.018	Accuracy: 98.438
Epoch: 10	 Train Step: 8970	Loss: 0.006	Accuracy: 100.000
Epoch: 10	 Train Step: 8980	Loss: 0.033	Accuracy: 98.438
Epoch: 10	 Train Step: 8990	Loss: 0.022	Accuracy: 100.000
Epoch: 10	 Train Step: 9000	Loss: 0.008	Accuracy: 100.000
Epoch: 10	 Train Step: 9010	Loss: 0.011	Accuracy: 100.000
Epoch: 10	 Train Step: 9020	Loss: 0.000	Accuracy: 100.000
Epoch: 10	 Train Step: 9030	Loss: 0.013	Accuracy: 100.000
Epoch: 10	 Train Step: 9040	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 9050	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 9060	Loss: 0.030	Accuracy: 98.438
Epoch: 10	 Train Step: 9070	Loss: 0.013	Accuracy: 100.000
Epoch: 10	 Train Step: 9080	Loss: 0.009	Accuracy: 100.000
Epoch: 10	 Train Step: 9090	Loss: 0.018	Accuracy: 98.438
Epoch: 10	 Train Step: 9100	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 9110	Loss: 0.007	Accuracy: 100.000
Epoch: 10	 Train Step: 9120	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 9130	Loss: 0.008	Accuracy: 100.000
Epoch: 10	 Train Step: 9140	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 9150	Loss: 0.042	Accuracy: 98.438
Epoch: 10	 Train Step: 9160	Loss: 0.004	Accuracy: 100.000
Epoch: 10	 Train Step: 9170	Loss: 0.001	Accuracy: 100.000
Epoch: 10	 Train Step: 9180	Loss: 0.055	Accuracy: 98.438
Epoch: 10	 Train Step: 9190	Loss: 0.016	Accuracy: 98.438
Epoch: 10	 Train Step: 9200	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 9210	Loss: 0.007	Accuracy: 100.000
Epoch: 10	 Train Step: 9220	Loss: 0.000	Accuracy: 100.000
Epoch: 10	 Train Step: 9230	Loss: 0.007	Accuracy: 100.000
Epoch: 10	 Train Step: 9240	Loss: 0.004	Accuracy: 100.000
Epoch: 10	 Train Step: 9250	Loss: 0.101	Accuracy: 96.875
Epoch: 10	 Train Step: 9260	Loss: 0.017	Accuracy: 100.000
Epoch: 10	 Train Step: 9270	Loss: 0.007	Accuracy: 100.000
Epoch: 10	 Train Step: 9280	Loss: 0.005	Accuracy: 100.000
Epoch: 10	 Train Step: 9290	Loss: 0.002	Accuracy: 100.000
Epoch: 10	 Train Step: 9300	Loss: 0.006	Accuracy: 100.000
Epoch: 10	 Train Step: 9310	Loss: 0.012	Accuracy: 100.000
Epoch: 10	 Train Step: 9320	Loss: 0.009	Accuracy: 100.000
Epoch: 10	 Train Step: 9330	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 9340	Loss: 0.004	Accuracy: 100.000
Epoch: 10	 Train Step: 9350	Loss: 0.003	Accuracy: 100.000
Epoch: 10	 Train Step: 9360	Loss: 0.030	Accuracy: 98.438
Epoch: 10	 Train Step: 9370	Loss: 0.008	Accuracy: 100.000
Epoch: 10 finished
'''

The training accuracy is visibly higher than with the MLP.

 

Plotting the results

plt.figure()
plt.plot(np.arange(len(train_loss)), train_loss)
plt.show()
#plt.savefig('./train_loss_result.png')

plt.figure()
plt.plot(np.arange(len(train_accuracy)), train_accuracy)
plt.show()
#plt.savefig('./train_accuracy_result.png')
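
If you want to actually save the figures (the savefig calls above are commented out), call plt.savefig before plt.show, since in notebook backends the figure may already be released after show; for example:

# Example: save the loss curve to disk before displaying it.
plt.figure()
plt.plot(np.arange(len(train_loss)), train_loss)
plt.savefig('./train_loss_result.png')
plt.show()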

Training loss over training steps
Training accuracy over training steps

 

Evaluation results

with torch.no_grad():
    model.eval()
    correct = 0

    for data, target in test_loader:
        data = data.to(device)
        target = target.to(device)
        output = model(data)
        prediction = output.data.max(1)[1]   # index of the max log-probability
        correct += prediction.eq(target).sum().item()

print('\nTest set: Accuracy: {:.2f}%'.format(100. * correct / len(test_loader.dataset)))
# Test set: Accuracy: 99.11%
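
If you also want a per-digit breakdown (not computed in the original post), here is a minimal sketch reusing test_loader:

# Hypothetical extension: accuracy per digit class on the test set.
class_correct = [0] * 10
class_total = [0] * 10

with torch.no_grad():
    model.eval()
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        prediction = model(data).max(1)[1]
        for t, p in zip(target, prediction):
            class_total[t.item()] += 1
            class_correct[t.item()] += int(t.item() == p.item())

for digit in range(10):
    print('digit {}: {:.2f}%'.format(digit, 100. * class_correct[digit] / class_total[digit]))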

Likewise, the CNN scores higher than the MLP on the test set.

 
