In 2023, the Open Compute Project (OCP) released the Microscaling Formats (MX) Specification, representing an initial effort to standardize an open, interoperable family of data formats with a shared, fine-grained block scale. This standardization effort includes contributions from industry leaders such as AMD, Arm, Intel, Meta, Microsoft, NVIDIA, and Qualcomm, aiming to create a unified approach to narrow precision data formats in AI workloads. To facilitate the adoption of MX data formats, Microsoft has developed a PyTorch emulation library that provides drop-in replacements for standard PyTorch modules and functions.
In this blog post, I will go through the step-by-step process of installing the Microxcaling repo using Docker images. (repo link: https://github.com/microsoft/microxcaling)
Below is my desktop configuration.
OS: Ubuntu 20.04
NVIDIA-SMI 550.54.14
Driver Version: 550.54.14
CUDA Version: 12.4
$sudo apt update
$sudo apt install -y docker.io
$sudo docker pull nvcr.io/nvidia/pytorch:24.06-py3
$distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
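The `$distribution` variable in the snippet above is just the distro ID and version concatenated (e.g. `ubuntu20.04` on the machine described here), which selects the right package list from nvidia.github.io. A quick standalone check (assumes a distro that ships `/etc/os-release`, which any systemd-based system does):

```shell
# /etc/os-release defines ID and VERSION_ID on most modern distros;
# the nvidia-docker repo URL is keyed on their concatenation, e.g. "ubuntu20.04".
distribution=$(. /etc/os-release; echo "$ID$VERSION_ID")
echo "$distribution"
```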
$sudo apt update
$sudo apt install -y nvidia-container-toolkit
$sudo docker run --gpus all -it --rm -v $(pwd):/workspace nvcr.io/nvidia/pytorch:24.06-py3
$python -m pip install --upgrade pip
Clone the repo into the mounted workspace (its requirements.txt and the mx package live under /workspace/microxcaling after this step):
$git clone https://github.com/microsoft/microxcaling.git /workspace/microxcaling
$cd /workspace/microxcaling
The pinned package versions used here:
torch==2.4.1
torchvision==0.19.1
torchaudio==2.4.1
$pip install -r requirements.txt
$export PYTHONPATH=/workspace/microxcaling:$PYTHONPATH
$python -c "import mx"
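The export above prepends the clone path (`/workspace/microxcaling`, per the bind mount in the docker run step) to `PYTHONPATH`, so `import mx` resolves to the repo copy. A pure-shell sanity check of what the interpreter will see:

```shell
# Python searches PYTHONPATH entries in order, so the first entry wins
# and `import mx` picks up the repo's package.
export PYTHONPATH=/workspace/microxcaling:${PYTHONPATH:-}
first_entry=${PYTHONPATH%%:*}   # strip everything after the first ':'
echo "$first_entry"             # → /workspace/microxcaling
```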
/workspace/microxcaling/examples# sh run_mxfp6.sh
I have an RTX 4060 Ti desktop running Windows 11. Below are the steps to set it up.
Link: https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/
- GeForce Experience Software
- PhysX
- Driver (already installed)
Example of output:
```
Installed:
```
Download cuDNN for Windows (CUDA 12.x).
Copy the include/bin/lib files into the NVIDIA CUDA Toolkit folder.
Reboot
Download ChatWithRTX
Run setup, select clean installation (it takes around 30-60 mins ...)
To run it, search for "nvidia chat with rtx" in the Start menu and select Open.
$ssh <username>@<remoteIP>
$ipython kernel install --user --name=deeplearning
$export PATH="/home/leiming/miniconda3/envs/deeplearning/bin:$PATH"
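Prepending the env's bin directory means its `python`/`ipython` shadow the system ones (the Miniconda path is the one from the line above, specific to this machine). A minimal check:

```shell
# After prepending, the env's bin dir is the first PATH entry, so commands
# resolve inside the deeplearning env rather than the system install.
export PATH="/home/leiming/miniconda3/envs/deeplearning/bin:$PATH"
echo "${PATH%%:*}"   # → /home/leiming/miniconda3/envs/deeplearning/bin
```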
Now start the notebook server on the remote machine:
$jupyter notebook --no-browser --port 8889
Then, from your local machine, open an SSH tunnel that forwards local port 8899 to remote port 8889:
$ssh -N -L localhost:8899:localhost:8889 <username>@<remoteIP>
Finally, open the tunneled local port in your browser (using the token printed by the jupyter command):
http://localhost:8899/?token=dd5ac17bc80e068ece002ea35ccf867547030cfa0328
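The port order in the tunnel command is the part that trips people up: the first port is on your local machine, the second on the remote. A hypothetical helper (`make_tunnel` is my name, not a real tool) that just prints the command makes the mapping explicit:

```shell
# Prints the ssh tunnel command: local <lport> forwards to remote <rport>.
make_tunnel() {
  local lport=$1 rport=$2 host=$3
  echo "ssh -N -L localhost:${lport}:localhost:${rport} ${host}"
}
make_tunnel 8899 8889 "user@remote"
# → ssh -N -L localhost:8899:localhost:8889 user@remote
```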
In this installation guide, I record the details of installing NVIDIA driver 515.57 on CentOS 9.
In this tutorial (2022-rehl8-miniconda-tfgpu.pdf), I go over the steps to set up Miniconda and tensorflow-gpu (v2.7.0) on RHEL 8.
When your SSH connection breaks, the previous screen session continues to run on the remote machine, so you won't lose your work!
On Ubuntu
$sudo apt-get install screen
On CentOS/Fedora
$sudo yum install screen
# list active sessions
$screen -ls
# create a new session
$screen -S <name>
# detach current working session
on your keyboard, press ctrl + a, then press d
# resume an active session (if # of sessions > 1)
$screen -r <name>
# resume an active session (if the session is shown attached)
$screen -d -r <name>
# check the session name when you are using a screen session
$echo $STY
# resume an active session (if # of sessions == 1)
$screen -x
# terminate current session
type exit in the terminal, or press ctrl + a, then k, then y
# terminate session outside the active one
use $screen -ls to list sessions (shown as [pid].[sessionname])
note the pid number
run $kill [pid] to terminate that session
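Since screen lists sessions as [pid].[sessionname], the pid is just the part before the first dot. A tiny hypothetical helper (`session_pid` is my name for it) to pull it out:

```shell
# Strips everything from the first '.' onward, leaving the pid to pass to kill.
session_pid() { printf '%s\n' "${1%%.*}"; }
session_pid 12345.deeplearning   # → 12345
```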
# wipe a dead session
$screen -wipe