121 changes: 0 additions & 121 deletions .github/workflows/cuda12.8_whl_release.yml

This file was deleted.

8 changes: 4 additions & 4 deletions .github/workflows/pypi.yml
@@ -18,8 +18,8 @@ jobs:
env:
PYTHON_VERSION: ${{ matrix.pyver }}
PLAT_NAME: manylinux2014_x86_64
DOCKER_TAG: cuda12.4
OUTPUT_FOLDER: cuda12_dist
DOCKER_TAG: cuda12.8
OUTPUT_FOLDER: cuda12.8_dist
steps:
- name: Free disk space
uses: jlumbroso/free-disk-space@main
@@ -75,11 +75,11 @@ jobs:
shell: pwsh
run: ./builder/windows/setup_cuda.ps1
env:
INPUT_CUDA_VERSION: '12.6.2'
INPUT_CUDA_VERSION: '12.8.1'
- name: Build wheel
run: |
python -m build --wheel -o build/wheel
Get-ChildItem -Path "build" -Filter "*.whl" | ForEach-Object { change_wheel_version $_.FullName --local-version cu121 --delete-old-wheel }
Get-ChildItem -Path "build" -Filter "*.whl" | ForEach-Object { change_wheel_version $_.FullName --local-version cu128 --delete-old-wheel }
Copilot AI Apr 23, 2026
In the Windows build step, python -m build --wheel -o build/wheel writes wheels under build/wheel, but the subsequent Get-ChildItem -Path "build" -Filter "*.whl" won’t find them (non-recursive), so change_wheel_version likely never runs. Point Get-ChildItem at build/wheel (or use -Recurse) so the local version update is actually applied (or remove the step if it’s not needed).

Suggested change
Get-ChildItem -Path "build" -Filter "*.whl" | ForEach-Object { change_wheel_version $_.FullName --local-version cu128 --delete-old-wheel }
Get-ChildItem -Path "build/wheel" -Filter "*.whl" | ForEach-Object { change_wheel_version $_.FullName --local-version cu128 --delete-old-wheel }

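A quick way to see the non-recursive behavior the comment describes is a small shell sketch (the temporary directory and wheel filename below are illustrative):

```shell
# Illustrative: a wheel written under build/wheel is missed by a
# non-recursive listing of build/, but found when listing build/wheel.
dir=$(mktemp -d)
mkdir -p "$dir/build/wheel"
touch "$dir/build/wheel/lmdeploy-0.13.0-cp312-cp312-manylinux2014_x86_64.whl"

top_level=$(ls "$dir"/build/*.whl 2>/dev/null | wc -l)        # matches nothing
in_subdir=$(ls "$dir"/build/wheel/*.whl 2>/dev/null | wc -l)  # finds the wheel
echo "top-level: $top_level, subdir: $in_subdir"
```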
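For context, the `change_wheel_version ... --local-version cu128` step tags the built wheel with a PEP 440 local version. A minimal, hypothetical sketch of just the filename transformation (the real tool also rewrites the wheel's internal metadata; the variable names here are illustrative):

```shell
# Hypothetical sketch of the filename part of a local-version rename:
# {name}-{version}-{python}-{abi}-{platform}.whl -> {name}-{version}+cu128-...
wheel="lmdeploy-0.13.0-cp312-cp312-manylinux2014_x86_64.whl"
local_ver="cu128"
name_ver="${wheel%%-cp*}"       # lmdeploy-0.13.0 (longest suffix from "-cp" stripped)
rest="${wheel#"$name_ver"-}"    # cp312-cp312-manylinux2014_x86_64.whl
echo "${name_ver}+${local_ver}-${rest}"
```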
- name: Upload Artifacts
uses: actions/upload-artifact@v4
with:
10 changes: 1 addition & 9 deletions README.md
@@ -224,15 +224,7 @@ conda activate lmdeploy
pip install lmdeploy
```

Since v0.3.0, the default prebuilt package is compiled on **CUDA 12**. Starting from v0.10.2, LMDeploy no longer supports CUDA 11 series.

If you are using a GeForce RTX 50 series graphics card, please install the LMDeploy prebuilt package compiled with **CUDA 12.8** as follows:

```shell
export LMDEPLOY_VERSION=0.12.3
export PYTHON_VERSION=312
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
```
Starting from **v0.13.0**, the default prebuilt wheels published on **PyPI** are built against **CUDA 12.8**, so `pip install lmdeploy` is sufficient for typical setups, including the GeForce RTX 50 series.
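The `+cu128` tag that the older GitHub-release wheels carried is a PEP 440 local version segment; it marks a build variant, not a different release. A small shell sketch of stripping it for comparison:

```shell
# A local version segment ("+cu128") marks a build variant, not a new release:
# stripping everything from '+' onward yields the same base version.
v1="0.13.0+cu128"
v2="0.13.0"
echo "${v1%%+*}"   # 0.13.0
echo "${v2%%+*}"   # 0.13.0
```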

## Offline Batch Inference

5 changes: 4 additions & 1 deletion README_ja.md
@@ -201,7 +201,10 @@ conda activate lmdeploy
pip install lmdeploy
```

Since v0.3.0, the default prebuilt package has been compiled with CUDA 12.
Starting from **v0.13.0**, the default prebuilt wheels published on **PyPI** are built against **CUDA 12.8**. Starting from v0.10.2, LMDeploy no longer supports the CUDA 11 series.

For typical use cases, including the GeForce RTX 50 series, the `pip install lmdeploy` above is sufficient.

For information on installing on CUDA 11+ platforms, or for instructions on building from source, see the [installation guide](docs/en/get_started/installation.md).

## Offline Batch Inference
10 changes: 1 addition & 9 deletions README_zh-CN.md
@@ -226,15 +226,7 @@ conda activate lmdeploy
pip install lmdeploy
```

Since v0.3.0, the default prebuilt package has been compiled with **CUDA 12**. As of v0.10.2, LMDeploy no longer supports CUDA 11+.

If you are using a GeForce RTX 50 series graphics card, install the LMDeploy prebuilt package compiled with **CUDA 12.8** as follows.

```shell
export LMDEPLOY_VERSION=0.12.3
export PYTHON_VERSION=312
pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
```
Starting from **v0.13.0**, the default prebuilt wheels on **PyPI** are built against **CUDA 12.8**; typical users (including the GeForce RTX 50 series) can simply use the `pip install lmdeploy` above.

## Offline Batch Inference
2 changes: 1 addition & 1 deletion lmdeploy/version.py
@@ -1,6 +1,6 @@
# Copyright (c) OpenMMLab. All rights reserved.

__version__ = '0.12.3'
__version__ = '0.13.0'
short_version = __version__

