
[CUDA build bug] ld: cannot find -lcudart

After installing PyTorch and CUDA with Conda, building a library against the Conda-provided CUDA environment failed with the following error:

/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/compiler_compat/ld: cannot find -lcudart: No such file or directory
collect2: error: ld returned 1 exit status
error: command '/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/bin/g++' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for diff_gaussian_rasterization_32d
Running setup.py clean for diff_gaussian_rasterization_32d
Failed to build diff_gaussian_rasterization_32d
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (diff_gaussian_rasterization_32d)

❯ which nvcc
/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/bin/nvcc
❯ echo $CUDA_HOME
/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar
❯ echo $PATH
/home/xxxx/local/bin:/home/xxxx/local/bin:/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/bin:/mnt/data/home/xxxx/miniforge3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
❯ echo $LD_LIBRARY_PATH
/mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/lib:

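Looking into the environment's lib directory shows that the symlink is broken: libcudart.so points to libcudart.so.12.1.55, but the only libcudart.so.12.1.x file present is libcudart.so.12.1.105: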

❯ ls /mnt/data/home/xxxx/miniforge3/envs/GAGAvatar/lib
libcudart.so -> libcudart.so.12.1.55
libcudart.so.12
libcudart.so.12.1.105
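
The same check can be scripted. The following is a minimal sketch, not part of the original troubleshooting: the helper name `find_dangling_symlinks` is hypothetical, and it only assumes that the environment's libraries live in a single directory such as the `lib` folder listed above.

```python
import fnmatch
import os
from pathlib import Path


def find_dangling_symlinks(lib_dir, pattern="libcudart*"):
    """Return (link name, target) pairs for symlinks whose target is missing.

    `find_dangling_symlinks` is a hypothetical helper written for this post.
    """
    dangling = []
    for entry in sorted(Path(lib_dir).iterdir()):
        if not fnmatch.fnmatch(entry.name, pattern):
            continue
        # Path.exists() follows symlinks, so it is False for a link
        # that points at a file which is not there.
        if entry.is_symlink() and not entry.exists():
            dangling.append((entry.name, os.readlink(entry)))
    return dangling
```

Calling it on the environment's `lib` directory (for example the path built from the `CONDA_PREFIX` variable that conda sets on activation) should report `libcudart.so` together with the missing target it points to, if the environment is in the state shown above.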

Further investigation shows that installing PyTorch pulls in cuda-cudart=12.1.105.

These are all the packages from the pytorch and nvidia channels that get installed along with PyTorch:

  + pytorch-mutex         1.0  cuda                          pytorch     Cached
  + libcublas       12.1.0.26  0                             nvidia      Cached
  + libcufft         11.0.2.4  0                             nvidia      Cached
  + libcusolver     11.4.4.55  0                             nvidia      Cached
  + libcusparse     12.0.2.55  0                             nvidia      Cached
  + libnpp          12.0.2.50  0                             nvidia      Cached
  + cuda-cudart      12.1.105  0                             nvidia      Cached
  + cuda-nvrtc       12.1.105  0                             nvidia      Cached
  + libnvjitlink     12.1.105  0                             nvidia      Cached
  + libnvjpeg       12.1.1.14  0                             nvidia      Cached
  + cuda-cupti       12.1.105  0                             nvidia      Cached
  + cuda-nvtx        12.1.105  0                             nvidia      Cached
  + ffmpeg                4.3  hf484d3e_0                    pytorch     Cached
  + libjpeg-turbo       2.0.0  h9bf148f_0                    pytorch     Cached
  + cuda-version         12.6  3                             nvidia      Cached
  + libcurand       10.3.7.77  0                             nvidia      Cached
  + libcufile        1.11.1.6  0                             nvidia      Cached
  + cuda-opencl       12.6.77  0                             nvidia      Cached
  + cuda-libraries     12.1.0  0                             nvidia      Cached
  + cuda-runtime       12.1.0  0                             nvidia      Cached
  + pytorch-cuda         12.1  ha16c6d3_6                    pytorch     Cached
  + pytorch             2.4.1  py3.12_cuda12.1_cudnn9.1.0_0  pytorch     Cached
  + torchtriton         3.0.0  py312                         pytorch     Cached
  + torchaudio          2.4.1  py312_cu121                   pytorch     Cached
  + torchvision        0.19.1  py312_cu121                   pytorch     Cached

And these are the packages installed by cuda-toolkit 12.1.0:

  + cuda-documentation           12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvml-dev                12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + libnvvm-samples              12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cccl                    12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-driver-dev              12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-profiler-api            12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cudart                  12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvrtc                   12.1.55  0      nvidia/label/cuda-12.1.0       21MB
  + cuda-opencl                  12.1.56  0      nvidia/label/cuda-12.1.0       11kB
  + libcublas                  12.1.0.26  0      nvidia/label/cuda-12.1.0     Cached
  + libcufft                    11.0.2.4  0      nvidia/label/cuda-12.1.0     Cached
  + libcufile                   1.6.0.25  0      nvidia/label/cuda-12.1.0      782kB
  + libcurand                  10.3.2.56  0      nvidia/label/cuda-12.1.0       54MB
  + libcusolver                11.4.4.55  0      nvidia/label/cuda-12.1.0     Cached
  + libcusparse                12.0.2.55  0      nvidia/label/cuda-12.1.0     Cached
  + libnpp                     12.0.2.50  0      nvidia/label/cuda-12.1.0     Cached
  + libnvjitlink                 12.1.55  0      nvidia/label/cuda-12.1.0       18MB
  + libnvjpeg                  12.1.0.39  0      nvidia/label/cuda-12.1.0        3MB
  + cuda-cupti                   12.1.62  0      nvidia/label/cuda-12.1.0        5MB
  + cuda-cuobjdump               12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cuxxfilt                12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvcc                    12.1.66  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvprune                 12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-gdb                     12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvdisasm                12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvprof                  12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvtx                    12.1.66  0      nvidia/label/cuda-12.1.0       58kB
  + cuda-sanitizer-api           12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nsight                  12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + nsight-compute           2023.1.0.15  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cudart-dev              12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvrtc-dev               12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-opencl-dev              12.1.56  0      nvidia/label/cuda-12.1.0     Cached
  + libcublas-dev              12.1.0.26  0      nvidia/label/cuda-12.1.0     Cached
  + libcufft-dev                11.0.2.4  0      nvidia/label/cuda-12.1.0     Cached
  + gds-tools                   1.6.0.25  0      nvidia/label/cuda-12.1.0     Cached
  + libcufile-dev               1.6.0.25  0      nvidia/label/cuda-12.1.0     Cached
  + libcurand-dev              10.3.2.56  0      nvidia/label/cuda-12.1.0     Cached
  + libcusolver-dev            11.4.4.55  0      nvidia/label/cuda-12.1.0     Cached
  + libcusparse-dev            12.0.2.55  0      nvidia/label/cuda-12.1.0     Cached
  + libnpp-dev                 12.0.2.50  0      nvidia/label/cuda-12.1.0     Cached
  + libnvjitlink-dev             12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + libnvjpeg-dev              12.1.0.39  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-libraries                12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cupti-static            12.1.62  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-compiler                 12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvvp                    12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-command-line-tools       12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nsight-compute           12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-cudart-static           12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-nvrtc-static            12.1.55  0      nvidia/label/cuda-12.1.0     Cached
  + libcublas-static           12.1.0.26  0      nvidia/label/cuda-12.1.0     Cached
  + libcufft-static             11.0.2.4  0      nvidia/label/cuda-12.1.0     Cached
  + libcufile-static            1.6.0.25  0      nvidia/label/cuda-12.1.0     Cached
  + libcurand-static           10.3.2.56  0      nvidia/label/cuda-12.1.0     Cached
  + libcusolver-static         11.4.4.55  0      nvidia/label/cuda-12.1.0     Cached
  + libcusparse-static         12.0.2.55  0      nvidia/label/cuda-12.1.0     Cached
  + libnpp-static              12.0.2.50  0      nvidia/label/cuda-12.1.0     Cached
  + libnvjpeg-static           12.1.0.39  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-libraries-dev            12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-libraries-static         12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-visual-tools             12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-tools                    12.1.0  0      nvidia/label/cuda-12.1.0     Cached
  + cuda-toolkit                  12.1.0  0      nvidia/label/cuda-12.1.0     Cached

These are the packages installed by cuda-toolkit 12.1.1:

  + cuda-documentation         12.1.105  0      nvidia/label/cuda-12.1.1       91kB
  + cuda-nvml-dev              12.1.105  0      nvidia/label/cuda-12.1.1       87kB
  + libnvvm-samples            12.1.105  0      nvidia/label/cuda-12.1.1       33kB
  + cuda-cccl                  12.1.109  0      nvidia/label/cuda-12.1.1        1MB
  + cuda-driver-dev            12.1.105  0      nvidia/label/cuda-12.1.1       17kB
  + cuda-profiler-api          12.1.105  0      nvidia/label/cuda-12.1.1       19kB
  + cuda-cudart                12.1.105  0      nvidia/label/cuda-12.1.1     Cached
  + cuda-nvrtc                 12.1.105  0      nvidia/label/cuda-12.1.1     Cached
  + cuda-opencl                12.1.105  0      nvidia/label/cuda-12.1.1       11kB
  + libcublas                  12.1.3.1  0      nvidia/label/cuda-12.1.1      367MB
  + libcufft                  11.0.2.54  0      nvidia/label/cuda-12.1.1      108MB
  + libcufile                   1.6.1.9  0      nvidia/label/cuda-12.1.1      783kB
  + libcurand                10.3.2.106  0      nvidia/label/cuda-12.1.1       54MB
  + libcusolver              11.4.5.107  0      nvidia/label/cuda-12.1.1      116MB
  + libcusparse              12.1.0.106  0      nvidia/label/cuda-12.1.1      177MB
  + libnpp                    12.1.0.40  0      nvidia/label/cuda-12.1.1      147MB
  + libnvjitlink               12.1.105  0      nvidia/label/cuda-12.1.1     Cached
  + libnvjpeg                  12.2.0.2  0      nvidia/label/cuda-12.1.1        3MB
  + cuda-cupti                 12.1.105  0      nvidia/label/cuda-12.1.1     Cached
  + cuda-cuobjdump             12.1.111  0      nvidia/label/cuda-12.1.1      245kB
  + cuda-cuxxfilt              12.1.105  0      nvidia/label/cuda-12.1.1      302kB
  + cuda-nvcc                  12.1.105  0      nvidia/label/cuda-12.1.1       55MB
  + cuda-nvprune               12.1.105  0      nvidia/label/cuda-12.1.1       67kB
  + cuda-gdb                   12.1.105  0      nvidia/label/cuda-12.1.1        6MB
  + cuda-nvdisasm              12.1.105  0      nvidia/label/cuda-12.1.1       50MB
  + cuda-nvprof                12.1.105  0      nvidia/label/cuda-12.1.1        5MB
  + cuda-nvtx                  12.1.105  0      nvidia/label/cuda-12.1.1     Cached
  + cuda-sanitizer-api         12.1.105  0      nvidia/label/cuda-12.1.1       18MB
  + cuda-nsight                12.1.105  0      nvidia/label/cuda-12.1.1      119MB
  + nsight-compute           2023.1.1.4  0      nvidia/label/cuda-12.1.1      808MB
  + cuda-cudart-dev            12.1.105  0      nvidia/label/cuda-12.1.1      381kB
  + cuda-nvrtc-dev             12.1.105  0      nvidia/label/cuda-12.1.1       12kB
  + cuda-opencl-dev            12.1.105  0      nvidia/label/cuda-12.1.1       59kB
  + libcublas-dev              12.1.3.1  0      nvidia/label/cuda-12.1.1       76kB
  + libcufft-dev              11.0.2.54  0      nvidia/label/cuda-12.1.1       14kB
  + gds-tools                   1.6.1.9  0      nvidia/label/cuda-12.1.1       43MB
  + libcufile-dev               1.6.1.9  0      nvidia/label/cuda-12.1.1       13kB
  + libcurand-dev            10.3.2.106  0      nvidia/label/cuda-12.1.1      460kB
  + libcusolver-dev          11.4.5.107  0      nvidia/label/cuda-12.1.1       51kB
  + libcusparse-dev          12.1.0.106  0      nvidia/label/cuda-12.1.1      178MB
  + libnpp-dev                12.1.0.40  0      nvidia/label/cuda-12.1.1      525kB
  + libnvjitlink-dev           12.1.105  0      nvidia/label/cuda-12.1.1       15MB
  + libnvjpeg-dev              12.2.0.2  0      nvidia/label/cuda-12.1.1       13kB
  + cuda-libraries               12.1.1  0      nvidia/label/cuda-12.1.1        2kB
  + cuda-cupti-static          12.1.105  0      nvidia/label/cuda-12.1.1       12MB
  + cuda-compiler                12.1.1  0      nvidia/label/cuda-12.1.1        1kB
  + cuda-nvvp                  12.1.105  0      nvidia/label/cuda-12.1.1      120MB
  + cuda-command-line-tools      12.1.1  0      nvidia/label/cuda-12.1.1        1kB
  + cuda-nsight-compute          12.1.1  0      nvidia/label/cuda-12.1.1        1kB
  + cuda-cudart-static         12.1.105  0      nvidia/label/cuda-12.1.1      948kB
  + cuda-nvrtc-static          12.1.105  0      nvidia/label/cuda-12.1.1       18MB
  + libcublas-static           12.1.3.1  0      nvidia/label/cuda-12.1.1      389MB
  + libcufft-static           11.0.2.54  0      nvidia/label/cuda-12.1.1      199MB
  + libcufile-static            1.6.1.9  0      nvidia/label/cuda-12.1.1        3MB
  + libcurand-static         10.3.2.106  0      nvidia/label/cuda-12.1.1       55MB
  + libcusolver-static       11.4.5.107  0      nvidia/label/cuda-12.1.1       76MB
  + libcusparse-static       12.1.0.106  0      nvidia/label/cuda-12.1.1      185MB
  + libnpp-static             12.1.0.40  0      nvidia/label/cuda-12.1.1      143MB
  + libnvjpeg-static           12.2.0.2  0      nvidia/label/cuda-12.1.1        3MB
  + cuda-libraries-dev           12.1.1  0      nvidia/label/cuda-12.1.1        2kB
  + cuda-libraries-static        12.1.1  0      nvidia/label/cuda-12.1.1        2kB
  + cuda-visual-tools            12.1.1  0      nvidia/label/cuda-12.1.1        1kB
  + cuda-tools                   12.1.1  0      nvidia/label/cuda-12.1.1        1kB
  + cuda-toolkit                 12.1.1  0      nvidia/label/cuda-12.1.1        2kB
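
The three listings are long, so it can help to compare a few key packages side by side. The sketch below copies four version strings from each listing by hand; the dictionary names and the `mismatches` helper are hypothetical, introduced only for illustration.

```python
# Versions copied by hand from the three package listings above.
pytorch_default = {
    "cuda-cudart": "12.1.105",
    "cuda-nvrtc": "12.1.105",
    "libcublas": "12.1.0.26",
    "libcusparse": "12.0.2.55",
}
toolkit_12_1_0 = {
    "cuda-cudart": "12.1.55",
    "cuda-nvrtc": "12.1.55",
    "libcublas": "12.1.0.26",
    "libcusparse": "12.0.2.55",
}
toolkit_12_1_1 = {
    "cuda-cudart": "12.1.105",
    "cuda-nvrtc": "12.1.105",
    "libcublas": "12.1.3.1",
    "libcusparse": "12.1.0.106",
}


def mismatches(installed, candidate):
    """Names of packages present in both mappings whose versions differ."""
    return sorted(
        name
        for name in installed
        if name in candidate and installed[name] != candidate[name]
    )


print(mismatches(pytorch_default, toolkit_12_1_0))
print(mismatches(pytorch_default, toolkit_12_1_1))
```

For these four packages, the default PyTorch install disagrees with cuda-toolkit 12.1.0 on the runtime packages (cuda-cudart, cuda-nvrtc) and with cuda-toolkit 12.1.1 on the math libraries (libcublas, libcusparse).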

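Comparing the lists shows that it is cuda-toolkit 12.1.1, not 12.1.0, whose cuda-cudart (12.1.105) matches the CUDA 12.1 build of PyTorch. However, if we install the CUDA 12.1 build of PyTorch first and then install cuda-toolkit 12.1.1, the solver reports a conflict: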

└─ pytorch-cuda is not installable because it requires
   └─ libcublas >=12.1.0.26,<12.1.3.1 , which conflicts with any installable versions previously reported.

In other words, annoyingly, the CUDA 12.1 build of PyTorch needs a libcublas that matches cuda-toolkit 12.1.0, while its cuda-cudart and related libraries match cuda-toolkit 12.1.1.
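
The constraint in the error message can be checked by hand against the libcublas versions shipped by the two toolkit releases. The following is a rough sketch: `parse_version` and `satisfies` are hypothetical helpers, and the plain tuple comparison is a simplification that only holds for purely numeric dotted versions like these, not a reimplementation of conda's version matching.

```python
def parse_version(text):
    """Turn '12.1.0.26' into (12, 1, 0, 26) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower, upper):
    """True if lower <= version < upper (simplified; numeric versions only)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Bounds taken from the error: libcublas >=12.1.0.26,<12.1.3.1
for candidate in ("12.1.0.26", "12.1.3.1"):
    print(candidate, satisfies(candidate, "12.1.0.26", "12.1.3.1"))
```

The libcublas from cuda-toolkit 12.1.0 (12.1.0.26) falls inside the range, while the one from cuda-toolkit 12.1.1 (12.1.3.1) sits exactly on the excluded upper bound, which is why the 12.1.1 toolkit cannot be installed next to this PyTorch build.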

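As shown, pytorch-cuda strictly requires libcublas >=12.1.0.26,<12.1.3.1, so we have to accommodate PyTorch and install CUDA 12.1.0. The trick is that we can replace the nvidia channel given in the official PyTorch install command with nvidia/label/cuda-12.1.0.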

Use the following command:

mamba install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia/label/cuda-12.1.0

PyTorch then pulls in CUDA libraries at the same versions as the ones in our cuda-toolkit 12.1.0 install:

  + pytorch-mutex         1.0  cuda                          pytorch                      Cached
  + libcublas       12.1.0.26  0                             nvidia/label/cuda-12.1.0     Cached
  + libcufft         11.0.2.4  0                             nvidia/label/cuda-12.1.0     Cached
  + libcusolver     11.4.4.55  0                             nvidia/label/cuda-12.1.0     Cached
  + libcusparse     12.0.2.55  0                             nvidia/label/cuda-12.1.0     Cached
  + libnpp          12.0.2.50  0                             nvidia/label/cuda-12.1.0     Cached
  + libnvjpeg       12.1.0.39  0                             nvidia/label/cuda-12.1.0        3MB
  + cuda-cudart       12.1.55  0                             nvidia/label/cuda-12.1.0     Cached
  + cuda-nvrtc        12.1.55  0                             nvidia/label/cuda-12.1.0       21MB
  + cuda-opencl       12.1.56  0                             nvidia/label/cuda-12.1.0       11kB
  + libcufile        1.6.0.25  0                             nvidia/label/cuda-12.1.0      782kB
  + libcurand       10.3.2.56  0                             nvidia/label/cuda-12.1.0       54MB
  + cuda-cupti        12.1.62  0                             nvidia/label/cuda-12.1.0        5MB
  + cuda-nvtx         12.1.66  0                             nvidia/label/cuda-12.1.0       58kB
  + cuda-version         12.1  h1d6eff3_3                    conda-forge                    21kB
  + ffmpeg                4.3  hf484d3e_0                    pytorch                      Cached
  + libjpeg-turbo       2.0.0  h9bf148f_0                    pytorch                      Cached
  + libnvjitlink     12.1.105  hd3aeb46_0                    conda-forge                    16MB
  + cuda-libraries     12.1.0  0                             nvidia/label/cuda-12.1.0     Cached
  + cuda-runtime       12.1.0  0                             nvidia/label/cuda-12.1.0     Cached
  + pytorch-cuda         12.1  ha16c6d3_6                    pytorch                      Cached
  + pytorch             2.4.1  py3.12_cuda12.1_cudnn9.1.0_0  pytorch                      Cached
  + torchtriton         3.0.0  py312                         pytorch                      Cached
  + torchvision        0.19.1  py312_cu121                   pytorch                      Cached
  + torchaudio          2.4.1  py312_cu121                   pytorch                      Cached

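At this point the problem is solved. From now on, whenever we need both pytorch-cuda and cuda-toolkit, we only have to run the following commands (the order should no longer matter):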

mamba install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia/label/cuda-12.1.0
mamba install nvidia/label/cuda-12.1.0::cuda-toolkit -c nvidia/label/cuda-12.1.0

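Installing cuda-toolkit then simply fills in the remaining CUDA libraries on top of the subset that pytorch-cuda already pulled in. Since everything comes from the same channel, no conflict arises.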
