In this post, we code up a model that stacks several fully-connected layers.
First, import the required libraries:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
import os
import matplotlib.pyplot as plt
import numpy as np
You can check the installed PyTorch version with:
torch.__version__
Next, declare the device the computation will run on.
Set the device to GPU (falling back to CPU if none is available) as follows:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(2891)
num_gpu = 1
if torch.cuda.device_count() > 1:
    num_gpu = torch.cuda.device_count()
print("Let's use", num_gpu, "GPUs!") # 1
print('device', device) # cuda
Next, implement a simple MLP (multi-layer perceptron) model.
The input is the MNIST dataset.
Each MNIST sample has shape (28, 28), so it must be flattened before it can pass through the MLP,
i.e., reshaped from (28, 28) to (1, 784).
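As a quick illustration (a minimal sketch of my own using the torch imported above, not part of the original notebook), the flattening looks like this:

dummy = torch.randn(28, 28)   # one fake MNIST-sized image
flat = dummy.view(-1, 784)    # flatten (28, 28) -> (1, 784)
print(flat.shape)             # torch.Size([1, 784])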
class MnistMLP(nn.Module):
    def __init__(self, num_class, drop_prob):
        super(MnistMLP, self).__init__()
        # input is 28x28, flattened to 784
        self.dropout = nn.Dropout(p=drop_prob)
        self.linear1 = nn.Linear(784, 512)
        self.linear2 = nn.Linear(512, 256)
        self.linear3 = nn.Linear(256, 10)
        self.reduce_layer = nn.Linear(10, num_class)
        self.logsoftmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        x = x.float()
        mlp1 = F.relu(self.linear1(x.view(-1, 784)))
        mlp1 = self.dropout(mlp1)
        mlp2 = F.relu(self.linear2(mlp1))
        mlp2 = self.dropout(mlp2)
        mlp3 = F.relu(self.linear3(mlp2))
        mlp3 = self.dropout(mlp3)
        output = self.reduce_layer(mlp3)
        return self.logsoftmax(output)
Now declare the model and move it to the GPU as shown below.
model = MnistMLP(10, 0.3)
model.to(device)
'''
MnistMLP(
(dropout): Dropout(p=0.3, inplace=False)
(linear1): Linear(in_features=784, out_features=512, bias=True)
(linear2): Linear(in_features=512, out_features=256, bias=True)
(linear3): Linear(in_features=256, out_features=10, bias=True)
(reduce_layer): Linear(in_features=10, out_features=10, bias=True)
(logsoftmax): LogSoftmax(dim=1)
)
'''
MNIST has 10 classes, so 10 is passed as the first argument, and dropout is applied with a probability of 30%.
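As a small aside (an illustrative sketch of my own, not from the original post), nn.Dropout with p=0.3 zeroes roughly 30% of activations during training and rescales the survivors by 1/(1-0.3); in eval mode it is the identity:

drop = nn.Dropout(p=0.3)
x = torch.ones(8)
drop.train()
print(drop(x))   # about 30% of entries become 0, the rest are scaled to ~1.43
drop.eval()
print(drop(x))   # unchanged: dropout is disabled at evaluation time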
The parameter shapes of every layer can be checked as follows:
# model shape
for p in model.parameters():
    print(p.size())
'''
torch.Size([512, 784])
torch.Size([512])
torch.Size([256, 512])
torch.Size([256])
torch.Size([10, 256])
torch.Size([10])
torch.Size([10, 10])
torch.Size([10])
'''
The total number of trainable parameters can be checked as follows:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

model_hp = count_parameters(model)
print("model's trainable parameters:", model_hp)
'''
model's trainable parameters: 535928
'''
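The 535,928 figure can be reproduced by hand from the layer shapes above (each Linear layer has out_features x in_features weights plus one bias per output); this is just a sanity-check calculation, not part of the original post:

# linear1:      784*512 + 512 = 401,920
# linear2:      512*256 + 256 = 131,328
# linear3:      256*10  + 10  =   2,570
# reduce_layer: 10*10   + 10  =     110
print(784*512 + 512 + 512*256 + 256 + 256*10 + 10 + 10*10 + 10)  # 535928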
The model is now declared; next, we need to load the data.
The code below downloads the MNIST dataset and builds train and test loaders.
batch_size = 128
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor()),
    batch_size=batch_size, shuffle=True)
print(len(train_loader)) # 469 batches: ceil(60000 / 128) = 469
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=False, transform=transforms.ToTensor()),
    batch_size=1000)
print(len(test_loader)) # 10 batches: 10 * 1000 = 10000
'''
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
9913344/? [04:54<00:00, 33670.03it/s]
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
29696/? [00:01<00:00, 26891.25it/s]
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664/? [00:00<00:00, 3911534.90it/s]
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120/? [00:00<00:00, 159181.34it/s]
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
Processing...
Done!
469
10
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py:502: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:143.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
'''
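As a quick shape check (an illustrative snippet, not in the original post), one batch from train_loader looks like this; ToTensor scales pixel values into [0, 1]:

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([128, 1, 28, 28])
print(labels.shape)  # torch.Size([128])
print(images.min().item(), images.max().item())  # roughly 0.0 and 1.0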
The optimizer is declared as follows; we use Adam, the most widely used optimizer, with a learning rate of 1e-4.
optimizer = optim.Adam(model.parameters(), lr=1e-4)
Now run training (only up to 10 epochs).
model.train()
epochs = 10
total_loss = 0
total_acc = 0
train_loss = []
train_accuracy = []
i = 0
for epoch in range(epochs):
    for data, target in train_loader:
        data = data.to(device)
        target = target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()   # compute gradients
        optimizer.step()  # update parameters
        total_loss += loss.item()
        train_loss.append(total_loss / (i + 1))
        prediction = output.data.max(1)[1]  # index of the max log-probability
        accuracy = prediction.eq(target.data).sum().item() / len(target) * 100
        total_acc += accuracy
        train_accuracy.append(total_acc / (i + 1))
        if i % 10 == 0:
            print('Epoch: {}\t Train Step: {}\tLoss: {:.3f}\tAccuracy: {:.3f}'.format(epoch+1, i, loss.item(), accuracy))
        i += 1
    print('Epoch: {} finished'.format(epoch+1))
'''
Epoch: 9 Train Step: 4200 Loss: 0.652 Accuracy: 78.906
Epoch: 9 Train Step: 4210 Loss: 0.422 Accuracy: 85.156
Epoch: 9 Train Step: 4220 Loss: 0.496 Accuracy: 61.719
Epoch: 9 finished
Epoch: 10 Train Step: 4230 Loss: 0.432 Accuracy: 84.375
Epoch: 10 Train Step: 4240 Loss: 0.435 Accuracy: 89.062
Epoch: 10 Train Step: 4250 Loss: 0.370 Accuracy: 86.719
Epoch: 10 Train Step: 4260 Loss: 0.468 Accuracy: 83.594
Epoch: 10 Train Step: 4270 Loss: 0.479 Accuracy: 85.156
Epoch: 10 Train Step: 4280 Loss: 0.422 Accuracy: 85.156
Epoch: 10 Train Step: 4290 Loss: 0.538 Accuracy: 78.906
Epoch: 10 Train Step: 4300 Loss: 0.493 Accuracy: 87.500
Epoch: 10 Train Step: 4310 Loss: 0.531 Accuracy: 82.031
Epoch: 10 Train Step: 4320 Loss: 0.524 Accuracy: 82.031
Epoch: 10 Train Step: 4330 Loss: 0.520 Accuracy: 83.594
Epoch: 10 Train Step: 4340 Loss: 0.557 Accuracy: 82.812
Epoch: 10 Train Step: 4350 Loss: 0.597 Accuracy: 80.469
Epoch: 10 Train Step: 4360 Loss: 0.272 Accuracy: 90.625
Epoch: 10 Train Step: 4370 Loss: 0.402 Accuracy: 85.938
Epoch: 10 Train Step: 4380 Loss: 0.552 Accuracy: 78.906
Epoch: 10 Train Step: 4390 Loss: 0.450 Accuracy: 85.156
Epoch: 10 Train Step: 4400 Loss: 0.505 Accuracy: 85.156
Epoch: 10 Train Step: 4410 Loss: 0.498 Accuracy: 79.688
Epoch: 10 Train Step: 4420 Loss: 0.550 Accuracy: 77.344
Epoch: 10 Train Step: 4430 Loss: 0.515 Accuracy: 84.375
Epoch: 10 Train Step: 4440 Loss: 0.556 Accuracy: 78.125
Epoch: 10 Train Step: 4450 Loss: 0.363 Accuracy: 88.281
Epoch: 10 Train Step: 4460 Loss: 0.376 Accuracy: 88.281
Epoch: 10 Train Step: 4470 Loss: 0.409 Accuracy: 86.719
Epoch: 10 Train Step: 4480 Loss: 0.494 Accuracy: 84.375
Epoch: 10 Train Step: 4490 Loss: 0.550 Accuracy: 82.031
Epoch: 10 Train Step: 4500 Loss: 0.349 Accuracy: 90.625
Epoch: 10 Train Step: 4510 Loss: 0.465 Accuracy: 82.812
Epoch: 10 Train Step: 4520 Loss: 0.577 Accuracy: 78.906
Epoch: 10 Train Step: 4530 Loss: 0.412 Accuracy: 85.938
Epoch: 10 Train Step: 4540 Loss: 0.557 Accuracy: 81.250
Epoch: 10 Train Step: 4550 Loss: 0.481 Accuracy: 83.594
Epoch: 10 Train Step: 4560 Loss: 0.373 Accuracy: 86.719
Epoch: 10 Train Step: 4570 Loss: 0.445 Accuracy: 84.375
Epoch: 10 Train Step: 4580 Loss: 0.543 Accuracy: 77.344
Epoch: 10 Train Step: 4590 Loss: 0.358 Accuracy: 88.281
Epoch: 10 Train Step: 4600 Loss: 0.408 Accuracy: 87.500
Epoch: 10 Train Step: 4610 Loss: 0.523 Accuracy: 82.812
Epoch: 10 Train Step: 4620 Loss: 0.418 Accuracy: 86.719
Epoch: 10 Train Step: 4630 Loss: 0.423 Accuracy: 85.938
Epoch: 10 Train Step: 4640 Loss: 0.512 Accuracy: 79.688
Epoch: 10 Train Step: 4650 Loss: 0.625 Accuracy: 77.344
Epoch: 10 Train Step: 4660 Loss: 0.379 Accuracy: 86.719
Epoch: 10 Train Step: 4670 Loss: 0.440 Accuracy: 82.812
Epoch: 10 Train Step: 4680 Loss: 0.499 Accuracy: 81.250
Epoch: 10 finished
'''
Use matplotlib to visualize the training loss and accuracy.
plt.figure()
plt.plot(np.arange(len(train_loss)), train_loss)
plt.show()
#plt.savefig('./train_loss_result.png')
plt.figure()
plt.plot(np.arange(len(train_accuracy)), train_accuracy)
plt.show()
#plt.savefig('./train_accuracy_result.png')
To evaluate the model's real performance, run evaluation on the test data that was never used during training:
model.eval()
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        data = data.to(device)
        target = target.to(device)
        output = model(data)
        prediction = output.data.max(1)[1]
        correct += prediction.eq(target.data).sum().item()
print('\nTest set: Accuracy: {:.2f}%'.format(100. * correct / len(test_loader.dataset)))
# Test set: Accuracy: 96.04%
With just a simple 3-layer MLP, we reached about 96% accuracy on MNIST.