
Problems Encountered While Installing AlphaFold2 on a Server (1)

I made a mistake: I trusted DeepSeek too readily and deleted cuDNN 8.9.7 by accident.

[root@localhost ~]# cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 9
#define CUDNN_PATCHLEVEL 7
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)/* cannot use constexpr here since this is a C-only file */
[root@localhost ~]# ldconfig -p | grep libcudnn.so.8
libcudnn.so.8 (libc6,x86-64) => /usr/local/cuda/targets/x86_64-linux/lib/libcudnn.so.8
libcudnn.so.8 (libc6,x86-64) => /lib64/libcudnn.so.8
[root@localhost ~]# export LD_PRELOAD=/usr/local/cuda/lib64/libcudnn.so.8
[root@localhost ~]# cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 9
#define CUDNN_PATCHLEVEL 7
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)/* cannot use constexpr here since this is a C-only file */
[root@localhost ~]# cat /usr/include/cudnn_version.h
cat: /usr/include/cudnn_version.h: No such file or directory
#???
cat /usr/include/cudnn.h
[root@localhost ~]# ls /usr/lib64/libcudnn.so*
/usr/lib64/libcudnn.so.8
[root@localhost ~]# dnf list installed | grep cudnn
cudnn-local-repo-rhel8-9.10.0.x86_64               1.0-1                                                      @System     
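
The version check in the transcript above greps three `#define` lines out of `cudnn_version.h`. As a minimal sketch, the same check can be wrapped in a small function that prints a dotted version string. The function name `cudnn_version` is made up for illustration, and the default header path is simply the one used in this transcript.

```shell
# Print the cuDNN version recorded in a cudnn_version.h header as MAJOR.MINOR.PATCH.
# Usage: cudnn_version [HEADER]   (HEADER defaults to the path used in this post)
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            # Fail if any of the three fields was not found
            if (major == "" || minor == "" || patch == "") exit 1
            printf "%s.%s.%s\n", major, minor, patch
        }
    ' "$header"
}
```

For the header shown above this would print 8.9.7. The same function can also be pointed at the header inside the unpacked tarball, for example `cudnn_version /home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive/include/cudnn_version.h`.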

I should never have done this!!!

[root@localhost ~]# dnf remove -y libcudnn* libcudnn8* libcudnn-devel*
No match for argument: libcudnn*
No match for argument: libcudnn8*
No match for argument: libcudnn-devel*
No packages marked for removal.
Dependencies resolved.
Nothing to do.
Complete!
[root@localhost ~]# rm -f /usr/local/cuda/include/cudnn*.h
[root@localhost ~]# rm -f /usr/local/cuda/lib64/libcudnn*
[root@localhost ~]# ldconfig
[root@localhost ~]# find / -name "*cudnn*" 2>/dev/null
/home/Softwares/AlphaFold2/cudnn-local-repo-rhel8-9.10.0-1.0-1.x86_64.rpm
/home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
/home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive
…………………………
/var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0.solv
/var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0-filenames.solvx
/var/cache/dnf/cudnn-local-rhel8-9.10.0-903bc33f34604e66
/var/cache/dnf/cudnn-local-rhel8-9.10.0.solv
/var/cache/dnf/cudnn-local-rhel8-9.10.0-filenames.solvx
/var/cudnn-local-repo-rhel8-9.10.0
/var/cudnn-local-repo-rhel8-9.10.0/cudnn-9.10.0-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn-jit-9.10.0-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn-local-D3C757D7-keyring.gpg
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-9.10.0-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-cuda-11-8-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-cuda-12-9-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-jit-9.10.0-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-jit-cuda-11-8-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-jit-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-jit-cuda-12-9-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/cudnn9-jit-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-devel-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-devel-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-headers-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-headers-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-jit-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-jit-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-jit-devel-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-jit-devel-cuda-12-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-samples-9.10.0.56-1.noarch.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-static-cuda-11-9.10.0.56-1.x86_64.rpm
/var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-static-cuda-12-9.10.0.56-1.x86_64.rpm
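
Since the `rm -f` commands earlier in this transcript turned out to be the mistake, one hedged alternative is to move the files aside rather than delete them, so they can be put back if something breaks. The sketch below is an assumption about how one might do that, not something from the original session; `quarantine` and the backup location are hypothetical names.

```shell
# Move files into a timestamped backup directory instead of deleting them.
# Usage: quarantine BACKUP_ROOT FILE...
# Prints the directory the files were moved to.
quarantine() {
    backup_root="$1"
    shift
    dest="$backup_root/quarantine-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        # Skip glob patterns that matched nothing (the shell passes them through literally)
        [ -e "$f" ] || [ -L "$f" ] || continue
        mv -- "$f" "$dest/" || return 1
    done
    echo "$dest"
}
```

A call such as `quarantine /root/cudnn-backup /usr/local/cuda/lib64/libcudnn* /usr/local/cuda/include/cudnn*.h` would have kept the 8.9.7 files recoverable. Note that all files land in one flat directory, so files with the same name from different directories would collide.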

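1. Where the files come from:
These files are normally created automatically when cuDNN is installed from NVIDIA's official .rpm package.
The /var/cudnn-local-repo-* directory holds the local repository metadata and a cache of the installation packages.
2. What the files do:
cudnn-local-repo-rhel8-9.10.0 is the local repository configuration directory.
cudnn-9.10.0-1.x86_64.rpm is the original downloaded installation package.
They can be deleted safely, because:
They are only cache files and repository configuration left over from the installation.
Deleting them does not affect cuDNN library files already installed into system directories.
If a reinstall is needed, the package can be downloaded again from the NVIDIA website.
How to delete them safely
# Delete the entire local repository directory
sudo rm -rf /var/cudnn-local-repo-rhel8-9.10.0
# Or delete only a single RPM package
sudo rm -f /var/cudnn-local-repo-rhel8-9.10.0/cudnn-9.10.0-1.x86_64.rpm
After deleting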

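  1. If you plan to reinstall the same version later:
    You can keep the .rpm files for reuse,
    but downloading the latest version from the official site is usually recommended.
  2. Rebuild the repository cache after cleaning up:
    sudo dnf clean all
    sudo dnf makecache
  3. Verify that the installed cuDNN still works:
    ldconfig -p | grep libcudnn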

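Other similar files that can be deleted
Similar files that can also be deleted safely include:
/var/cuda-repo-* directories
/var/nvidia-driver-* directories
Any directory under /var/ with -repo- in its name
These are all temporary repository directories created while installing NVIDIA software; deleting them does not affect the installed software.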

Because of the deletion above, problems showed up here:

[root@localhost ~]# ls -l /var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-static-cuda-12-9.10.0.56-1.x86_64.rpm
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
-rw-r--r--. 1 root root 749479982 55 13:25 /var/cudnn-local-repo-rhel8-9.10.0/libcudnn9-static-cuda-12-9.10.0.56-1.x86_64.rpm
[root@localhost ~]# ls -l /home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive/include/cudnn_version.h
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
-rw-r--r--. 1 25503 2174 4019 1130 2023 /home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive/include/cudnn_version.h
[root@localhost ~]# cd /home/
[root@localhost home]# ls
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Softwares  user  wulu1  yhm
[root@localhost home]# cd Softwares/
[root@localhost Softwares]# ls
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
AlphaFold2
[root@localhost Softwares]# cd AlphaFold2/
[root@localhost AlphaFold2]# ls
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
cuda-repo-rhel8-12-9-local-12.9.0_575.51.03-1.x86_64.rpm  cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
cudnn-linux-x86_64-8.9.7.29_cuda12-archive                cudnn-local-repo-rhel8-9.10.0-1.0-1.x86_64.rpm
[root@localhost AlphaFold2]# cd cudnn-linux-x86_64-8.9.7.29_cuda12-archive/
[root@localhost ~]# ldconfig -p | grep cudnn
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

I wanted to comment out the environment variable:

[root@localhost ~]# vi /ect/profile
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[root@localhost ~]# vi ~/.bashrc
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[root@localhost ~]# vi /etc/profile
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[root@localhost ~]# rm -rf /var/cudnn-local-repo*
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

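This error message looks unrelated to the delete operation; in fact, the system is trying to load a CUDA/cuDNN library file that has been deleted or is otherwise unavailable.
Cause of the error
Symptom: when the rm command runs, the system reports that it cannot preload libcudnn.so.8.
This is not an error in the delete operation itself but a problem with the LD_PRELOAD environment variable.
Root cause:
Something set LD_PRELOAD=/usr/local/cuda/lib64/libcudnn.so.8 in the environment: some program (possibly a CUDA-related tool installed earlier) or, as in this session, the manual export command run near the top of the transcript.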
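
The dynamic loader prints this message for every new process as long as `LD_PRELOAD` names a file that does not exist, which is why even `ls` and `vi` trigger it. As a small sketch, a helper can check each entry of `LD_PRELOAD` before (or after) library files are removed. The function name `check_ld_preload` is hypothetical; the only assumption about the loader is that entries are separated by spaces or colons.

```shell
# Report LD_PRELOAD entries that point at files which no longer exist.
# Usage: check_ld_preload [VALUE]   (VALUE defaults to the current $LD_PRELOAD)
# Returns 0 if every entry exists (or the list is empty), 1 otherwise.
check_ld_preload() {
    value="${1-${LD_PRELOAD-}}"
    missing=0
    # Entries may be separated by colons or spaces; normalise to spaces
    for entry in $(printf '%s' "$value" | tr ':' ' '); do
        if [ ! -e "$entry" ]; then
            echo "LD_PRELOAD entry not found: $entry" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run after the `rm -f` commands in this session, the helper would report `/usr/local/cuda/lib64/libcudnn.so.8` as not found. Paths containing spaces are not handled by this sketch.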

1. Temporary fix (current session only)

# Clear the LD_PRELOAD setting
unset LD_PRELOAD
# Then rerun the delete command
rm -rf /var/cudnn-local-repo*

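2. Permanent fix (recommended)

# Find which configuration files set LD_PRELOAD
grep -r "LD_PRELOAD" /etc/ ~/.bash* ~/.profile* /etc/profile.d/
# Edit the file that contains the setting and delete or comment out the line
sudo nano /etc/profile  # or whichever file the search found
# Comment out a line like this one: export LD_PRELOAD=/usr/local/cuda/lib64/libcudnn.so.8
# Apply the change. Sourcing the file does not clear a variable the current shell
# has already exported, so also run unset LD_PRELOAD or open a new login shell.
source ~/.bashrc  # or whichever file was modified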

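3. If the file really does need to stay

If libcudnn.so.8 is required by other programs:
# Reinstall the correct library files
sudo dnf install libcudnn8
# Or create the correct symbolic link
sudo ln -s /usr/local/cuda/lib64/libcudnn.so.8 /lib64/
sudo ldconfig

The library file has now been deleted or moved, so the system cannot load it.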

After commenting out the environment variable, I got an even bigger pile of errors:

[root@localhost ~]# source /etc/profile
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
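
The errors above keep appearing because re-sourcing a profile file only runs the lines still in it; it never removes a variable that the current shell has already exported. A small sketch with a stand-in variable shows the behavior. `DEMO_PRELOAD` and its value are made-up names, used so that the real `LD_PRELOAD` is not touched.

```shell
# Demonstrate that commenting out an export and re-sourcing does not unset the variable.
profile=$(mktemp)

# 1. The profile exports the variable, and the shell sources it
echo 'export DEMO_PRELOAD=/opt/demo/libdemo.so' > "$profile"
. "$profile"
after_first_source="${DEMO_PRELOAD-unset}"

# 2. "Comment out" the line, as one would in /etc/profile, and source again
echo '# export DEMO_PRELOAD=/opt/demo/libdemo.so' > "$profile"
. "$profile"
after_resource="${DEMO_PRELOAD-unset}"

# 3. Only an explicit unset (or a fresh login shell) clears it
unset DEMO_PRELOAD
after_unset="${DEMO_PRELOAD-unset}"

rm -f "$profile"
echo "after first source: $after_first_source"
echo "after re-sourcing:  $after_resource"
echo "after unset:        $after_unset"
```

The second line of output still shows the old path; only the third shows `unset`. The same applies to `LD_PRELOAD` in this session.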

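What was actually needed here was to unset the environment variable (commenting it out in the file does not clear it from the running shell):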

[root@localhost ~]# vi /etc/profile
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[root@localhost ~]# ls
ERROR: ld.so: object '/usr/local/cuda/lib64/libcudnn.so.8' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
公共  模板  视频  图片  文档  下载  音乐  桌面  anaconda-ks.cfg  initial-setup-ks.cfg  NVIDIA-Linux-x86_64-550.144.03.run
[root@localhost ~]# unset LD_PRELOAD
[root@localhost ~]# ls
公共  模板  视频  图片  文档  下载  音乐  桌面  anaconda-ks.cfg  initial-setup-ks.cfg  NVIDIA-Linux-x86_64-550.144.03.run
[root@localhost ~]# rm -rf /var/cudnn-local-repo*
[root@localhost ~]# ls
公共  模板  视频  图片  文档  下载  音乐  桌面  anaconda-ks.cfg  initial-setup-ks.cfg  NVIDIA-Linux-x86_64-550.144.03.run
[root@localhost ~]# cd /home/
[root@localhost home]# ls
Softwares  user  wulu1  yhm
[root@localhost home]# cd Softwares/
[root@localhost Softwares]# ls
AlphaFold2
[root@localhost Softwares]# cd AlphaFold2/
[root@localhost AlphaFold2]# ls
cuda-repo-rhel8-12-9-local-12.9.0_575.51.03-1.x86_64.rpm  cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
cudnn-linux-x86_64-8.9.7.29_cuda12-archive                cudnn-local-repo-rhel8-9.10.0-1.0-1.x86_64.rpm

Manually deleting the leftover files

[root@localhost Softwares]# rm -rf /etc/yum.repos.d/cudnn-local-rhel8-9.10.0.repo
[root@localhost Softwares]# rm -rf /var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0.solv
[root@localhost Softwares]# rm -rf /var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0-filenames.solvx
[root@localhost Softwares]# rm -rf /var/cache/dnf/cudnn-local-rhel8-9.10.0-903bc33f34604e66
[root@localhost Softwares]# rm -rf /var/cache/dnf/cudnn-local-rhel8-9.10.0.solv
[root@localhost Softwares]# rm -rf /var/cache/dnf/cudnn-local-rhel8-9.10.0-filenames.solvx

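| File path | Type | Deletable? | Notes |
|---|---|---|---|
| /etc/yum.repos.d/cudnn-local-rhel8-9.10.0.repo | Repository configuration file | ✅ Yes | Run dnf clean all afterward to refresh the cache |
| /var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0.solv | Package manager cache | ✅ Yes | PackageKit dependency-resolution cache |
| /var/cache/PackageKit/8.7/hawkey/cudnn-local-rhel8-9.10.0-filenames.solvx | Package manager cache | ✅ Yes | Filename index cache |
| /var/cache/dnf/cudnn-local-rhel8-9.10.0-903bc33f34604e66 | DNF cache directory | ✅ Yes | Contains downloaded metadata |
| /var/cache/dnf/cudnn-local-rhel8-9.10.0.solv | DNF dependency-resolution cache | ✅ Yes | Dependency data in binary format |
| /var/cache/dnf/cudnn-local-rhel8-9.10.0-filenames.solvx | DNF filename index | ✅ Yes | Index file that speeds up package searches |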

from deepseek

[root@localhost Softwares]# cd AlphaFold2/
[root@localhost AlphaFold2]# ls
cuda-repo-rhel8-12-9-local-12.9.0_575.51.03-1.x86_64.rpm  cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz
cudnn-linux-x86_64-8.9.7.29_cuda12-archive                cudnn-local-repo-rhel8-9.10.0-1.0-1.x86_64.rpm
[root@localhost AlphaFold2]# cd cudnn-linux-x86_64-8.9.7.29_cuda12-archive/
[root@localhost cudnn-linux-x86_64-8.9.7.29_cuda12-archive]# ls
include  lib  LICENSE
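
The unpacked 8.9.7 tarball still contains `include` and `lib`, so the deleted files can in principle be copied back. The original session does not show this step; the sketch below follows the usual tarball install procedure (copy headers and libraries into the CUDA directory, preserving symlinks), and the function name `install_cudnn_from_archive` is hypothetical.

```shell
# Copy cuDNN headers and libraries from an unpacked archive into a CUDA directory.
# Usage: install_cudnn_from_archive ARCHIVE_DIR [CUDA_HOME]
# CUDA_HOME defaults to /usr/local/cuda, the path used in this post.
install_cudnn_from_archive() {
    archive_dir="$1"
    cuda_home="${2:-/usr/local/cuda}"
    if [ ! -d "$archive_dir/include" ] || [ ! -d "$archive_dir/lib" ]; then
        echo "not a cuDNN archive directory: $archive_dir" >&2
        return 1
    fi
    mkdir -p "$cuda_home/include" "$cuda_home/lib64" || return 1
    # -P keeps the libcudnn*.so symlinks as symlinks instead of duplicating the files
    cp -P "$archive_dir"/include/cudnn*.h "$cuda_home/include/" || return 1
    cp -P "$archive_dir"/lib/libcudnn* "$cuda_home/lib64/" || return 1
    # Make the files readable by all users
    chmod a+r "$cuda_home"/include/cudnn*.h "$cuda_home"/lib64/libcudnn*
}
```

On this machine the call would be `install_cudnn_from_archive /home/Softwares/AlphaFold2/cudnn-linux-x86_64-8.9.7.29_cuda12-archive`, run as root and followed by `ldconfig` to refresh the loader cache.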

Verifying that everything was cleaned up

# Check the repository configuration
ls /etc/yum.repos.d/ | grep -i cudnn
# Check the cache files
ls /var/cache/{dnf,PackageKit}/* | grep -i cudnn
[root@localhost cudnn-linux-x86_64-8.9.7.29_cuda12-archive]# ls /etc/yum.repos.d/ | grep -i cudnn
[root@localhost cudnn-linux-x86_64-8.9.7.29_cuda12-archive]# ls /var/cache/{dnf,PackageKit}/* | grep -i cudnn
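
Both checks above print nothing, which is the expected result when no leftovers remain. As a sketch, the two `ls | grep` pipelines can be folded into one helper based on `find`, which also descends into subdirectories. `find_leftovers` is a hypothetical name, and `-iname` is assumed to be available (it is in GNU find).

```shell
# List files or directories whose name contains PATTERN (case-insensitive)
# under each of the given directories. Directories that do not exist are skipped.
# Usage: find_leftovers PATTERN DIR...
find_leftovers() {
    pattern="$1"
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -iname "*${pattern}*" 2>/dev/null
    done
}
```

For the paths in this post the call would be `find_leftovers cudnn /etc/yum.repos.d /var/cache/dnf /var/cache/PackageKit`; empty output means nothing matching is left.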

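With these files deleted, the system is cleaner and other CUDA/cuDNN components are not affected. If the same version needs to be reinstalled later, re-downloading the official installation package restores the repository configuration.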

[root@localhost cudnn-linux-x86_64-8.9.7.29_cuda12-archive]# dnf clean all
31 files removed
[root@localhost cudnn-linux-x86_64-8.9.7.29_cuda12-archive]# dnf makecache
Rocky Linux 8 - AppStream                                                                            1.5 MB/s |  18 MB     00:11    
Rocky Linux 8 - BaseOS                                                                               4.0 MB/s |  23 MB     00:05    
Rocky Linux 8 - Extras                                                                                13 kB/s |  15 kB     00:01    
cuda-rhel8-12-9-local                                                                                 81 MB/s | 121 kB     00:00    
Metadata cache created.

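Complete steps for removing cuDNN leftovers

  1. First fix the LD_PRELOAD problem: unset LD_PRELOAD
  2. Then delete the target files: sudo rm -rf /var/cudnn-local-repo*
  3. Clean up other leftovers (note that the first command removes the cuDNN libraries themselves, which is how cuDNN 8.9.7 was lost in this session):
    sudo rm -f /usr/local/cuda/lib64/libcudnn*
    sudo rm -f /usr/include/cudnn.h
    sudo ldconfig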

Verifying the deletion

# Check that the files are gone
ls /var/cudnn-local-repo* 2>/dev/null
# Check that library loading is back to normal
ldconfig -p | grep cudnn