
Installing PyTorch & Llama2 on AWS Ubuntu 22.04

sunshout 2023. 11. 14. 14:59

Reference: https://github.com/samlhuillier/code-llama-fine-tune-notebook/blob/main/fine-tune-code-llama.ipynb

Base environment: Python 3.10, CUDA 11.8

EC2 Instance Type: g3s.xlarge

  • Create a VM with a GPU attached (see the example launch command below).
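
For reference, an instance like this can also be launched from the AWS CLI. This is only a sketch: the AMI ID, key pair name, and security group ID below are placeholders, so substitute your own values.

# Launch a g3s.xlarge (1x Tesla M60 GPU) instance.
# ami-xxxxxxxx, my-key, and sg-xxxxxxxx are placeholders - replace with your own.
aws ec2 run-instances \
    --image-id ami-xxxxxxxx \
    --instance-type g3s.xlarge \
    --key-name my-key \
    --security-group-ids sg-xxxxxxxx \
    --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=100}' \
    --count 1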

Installing PyTorch

root@ip-172-31-10-59:~# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torch
  Downloading https://download.pytorch.org/whl/cu118/torch-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl (2325.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 GB ? eta 0:00:00
Collecting torchvision
  Downloading https://download.pytorch.org/whl/cu118/torchvision-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl (6.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 19.4 MB/s eta 0:00:00
Collecting torchaudio
  Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl (3.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 11.2 MB/s eta 0:00:00
Collecting networkx
  Downloading https://download.pytorch.org/whl/networkx-3.0-py3-none-any.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 16.7 MB/s eta 0:00:00
Requirement already satisfied: jinja2 in /usr/lib/python3/dist-packages (from torch) (3.0.3)
Collecting fsspec
  Downloading https://download.pytorch.org/whl/fsspec-2023.4.0-py3-none-any.whl (153 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.0/154.0 KB 1.4 MB/s eta 0:00:00
Collecting typing-extensions
  Downloading https://download.pytorch.org/whl/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting filelock
  Downloading https://download.pytorch.org/whl/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting triton==2.1.0
  Downloading https://download.pytorch.org/whl/triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.2/89.2 MB 14.4 MB/s eta 0:00:00
Collecting sympy
  Downloading https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 79.6 MB/s eta 0:00:00
Collecting numpy
  Downloading https://download.pytorch.org/whl/numpy-1.24.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.3/17.3 MB 38.1 MB/s eta 0:00:00
Collecting pillow!=8.3.*,>=5.3.0
  Downloading https://download.pytorch.org/whl/Pillow-9.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 18.2 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from torchvision) (2.25.1)
Collecting mpmath>=0.19
  Downloading https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 47.8 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, pillow, numpy, networkx, fsspec, filelock, triton, torch, torchvision, torchaudio
Successfully installed filelock-3.9.0 fsspec-2023.4.0 mpmath-1.3.0 networkx-3.0 numpy-1.24.1 pillow-9.3.0 sympy-1.12 torch-2.1.0+cu118 torchaudio-2.1.0+cu118 torchvision-0.16.0+cu118 triton-2.1.0 typing-extensions-4.4.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
root@ip-172-31-10-59:~#
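
As the warning above notes, running pip as root can conflict with system packages. The same install can instead be done inside a virtual environment; a minimal sketch (the environment path ~/torch-env is arbitrary):

sudo apt install -y python3-venv          # venv module, if not already present
python3 -m venv ~/torch-env               # create an isolated environment
source ~/torch-env/bin/activate           # activate it
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118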

To check whether PyTorch and NVIDIA CUDA are working:

import torch
torch.cuda.is_available()

If the result is False, the NVIDIA driver (and CUDA runtime) is not yet installed.

Installing the NVIDIA Driver

sudo apt install software-properties-common -y
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update
sudo apt install ubuntu-drivers-common

sudo ubuntu-drivers autoinstall
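
Before (or instead of) autoinstall, you can check which driver Ubuntu recommends for the detected GPU and, if you prefer, install a specific version yourself. The version number below is just an example.

# List detected GPUs and the recommended driver package
ubuntu-drivers devices

# Or install a specific driver version manually, e.g.:
# sudo apt install -y nvidia-driver-535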

Reboot the server for good measure.

Then run the nvidia-smi command to check which GPU is detected.

root@ip-172-31-10-59:~# nvidia-smi
Tue Nov 14 05:58:52 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla M60                      Off | 00000000:00:1E.0 Off |                    0 |
| N/A   25C    P0              38W / 150W |      0MiB /  7680MiB |     99%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@ip-172-31-10-59:~#
ubuntu@ip-172-31-10-59:~$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
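
Once torch.cuda.is_available() returns True, a quick sanity check that actually runs a computation on the GPU can look like this (a minimal sketch):

import torch

# Report the installed versions and the detected device
print(torch.__version__)                # e.g. 2.1.0+cu118
print(torch.version.cuda)               # e.g. 11.8
print(torch.cuda.get_device_name(0))    # e.g. Tesla M60

# Run a small matrix multiplication on the GPU
device = torch.device("cuda")
a = torch.randn(1024, 1024, device=device)
b = torch.randn(1024, 1024, device=device)
c = a @ b
torch.cuda.synchronize()
print(c.shape, c.device)                # torch.Size([1024, 1024]) cuda:0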