当前位置: 首页 > news >正文

病理软件Cellprofiler使用教程

在这里插入图片描述


一、软件安装

下载地址:https://cellprofiler.org/releases/

直接照安装提示安装即可(我的电脑可能安装过相关环境,所以可以直接运行)

若不行可参考其他教程:https://blog.csdn.net/weixin_38594676/article/details/125034672


二、使用

2.1 了解软件大概界面

我第一次使用cellprofiler时根据下面这个视频初步了解了一下软件

https://mp.weixin.qq.com/s/gKlhPD_qBR9QImxykKcPSw


2.2 Cellprofiler pipeline示例

Cellprofiler是通过创建一系列pipeline对图像进行处理(可保存为CellProfiler Project (.cpproj)即流程文件),本次示例pipeline参考下述文献:

Machine learning-based pathomics signature of histology slides as a novel prognostic indicator in primary central nervous system lymphoma https://link.springer.com/article/10.1007/s11060-024-04665-8#MOESM4


示例流程可通过百度网盘提取:

通过网盘分享的文件:cellprofiler demo
链接: https://pan.baidu.com/s/17dhvhnh3VaanLRE2BmvbHQ?pwd=zmpj 提取码: zmpj


具体操作描述内容如下(文献):

First, the images were split into hematoxylin-stained and eosin-stained greyscale images by the “UnmixColors” module. The nuclei of tumor cells were identified with the “IdentifyPrimaryObjects” module. Then the “IdentifySecondaryObjects” module identified the cell body by using the nuclei as a “seed” region, growing outwards until stopped by the image threshold or by a neighbor. Thus it identified the cytoplasm by “subtracting” the nuclei objects from the cell objects using the “IdentifyTertiaryObjects” module. The quantitative features were extracted with modules including “Measure Image Quality,” “Measure Image Intensity,” “Measure Granularity,” “Measure Colocalization,” “Measure Object Intensity,” “Measure Object Neighbors,” “Measure Object Size Shape,” and “Measure Texture” (Fig. 1b).

具体解释如下:
在这里插入图片描述

具体操作实操:

Step1:双击流程文件页面如下,根据Step逐步进行操作(每一步的具体参数调整本篇不做介绍):
在这里插入图片描述

(1)输入:准备一个包含病理patch图片(png/jpg)的文件夹

(2)过程展示:
在这里插入图片描述

在这里插入图片描述

(3)输出文件:

在这里插入图片描述

  • MyExpt_Image:图像级别的全局特征,包括图像的,如平均亮度、标准差、纹理特征、图像中识别出的对象数量(如细胞总数、细胞核总数)。
  • MyExpt_Experiment: 包含关于整个实验或整个批次运行的汇总统计数据和元数据
  • MyExpt_IdentifyPrimaryObjects:包含了每个被识别出的细胞核的详细测量值。
  • MyExpt_IdentifySecondaryObjects:指每个完整的细胞的详细测量值。它们通常是通过从主要对象(细胞核)向外扩展到细胞膜边界来识别的。
  • MyExpt_Cytoplasm:包含了每个细胞的细胞质区域的详细测量值。细胞质通常是通过从次要对象(整个细胞)中减去主要对象(细胞核)来定义的。

Step2合并为单个文件夹用于后面分析

import numpy as np
import pandas as pd
import os# 文件保存目录
indir = "./00.raw_data//"
print(indir)# 图像级别特征
infile = os.path.join(indir, "MyExpt_Image.csv")
print(infile)
df_image = pd.read_csv(infile)
prefixes = ['Correlation','Granularity','ImageQuality','Intensity','Texture','Threshold','FileName_HEslide','ImageNumber'
]# '^' 表示字符串的开头
# '|' 表示 '或' (OR)
# 我们将所有前缀用 '|' 连接起来,并用括号括起来,确保它们是作为一个整体进行 '或' 操作
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"生成的正则表达式: {regex_pattern}")
df_image = df_image.filter(regex=regex_pattern)
df_image = df_image.groupby([ 'FileName_HEslide',"ImageNumber"]).agg("mean").reset_index()
df_image.columns = ["image_" + i for i in df_image.columns]
df_image = df_image.rename({'image_ImageNumber':"ImageNumber",'image_FileName_HEslide':'FileName_HEslide'},axis=1)
df_image["case_id"] = df_image["FileName_HEslide"].str.split("_").str[0]
df_image.head(1)# 细胞核
infile = os.path.join(indir, "MyExpt_IdentifyPrimaryObjects.csv")
print(infile)df_Primary= pd.read_csv(infile)
prefixes = ['Texture','Neighbors','Location','Intensity','AreaShape','ObjectNumber','ImageNumber'
]
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"生成的正则表达式: {regex_pattern}")
df_Primary = df_Primary.filter(regex=regex_pattern)
df_Primary = df_Primary.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Primary.columns = ["nucl_" + i for i in df_Primary.columns]
df_Primary = df_Primary.rename({'nucl_ImageNumber':"ImageNumber"},axis=1)
df_Primary.head(1)# 细胞
infile = os.path.join(indir, "MyExpt_IdentifySecondaryObjects.csv")
print(infile)df_Sec= pd.read_csv(infile)
prefixes = ['Texture','Location','Intensity','AreaShape','ObjectNumber','ImageNumber'
]
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"生成的正则表达式: {regex_pattern}")
df_Sec = df_Sec.filter(regex=regex_pattern)df_Sec = df_Sec.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Sec.columns = ["cell_" + i for i in df_Sec.columns]
df_Sec = df_Sec.rename({'cell_ImageNumber':"ImageNumber"},axis=1)
df_Sec.head(1)# 细胞质
infile = os.path.join(indir, "MyExpt_Cytoplasm.csv")
print(infile)df_Cyto= pd.read_csv(infile)
prefixes = ['Texture', 'Location', 'Intensity', 'AreaShape','ObjectNumber','ImageNumber'
]
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"生成的正则表达式: {regex_pattern}")
df_Cyto = df_Cyto.filter(regex=regex_pattern)
df_Cyto = df_Cyto.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Cyto.columns = ["cyto_" + i for i in df_Cyto.columns]
df_Cyto = df_Cyto.rename({'cyto_ImageNumber':"ImageNumber"},axis=1)
df_Cyto.head(1)# 特征合并
# ImageNumber 指 每个patch,每个ObjectNumber指每张patch上分割出来的核,细胞质等
df_m = pd.merge(df_image,df_Primary,how="inner",on = "ImageNumber")
df_m = pd.merge(df_m,df_Sec,how="inner",on = "ImageNumber")
df_m = pd.merge(df_m,df_Cyto,how="inner",on = "ImageNumber")
print(df_m.shape)
df_m.to_csv("Feature_byCellprofilers_BRAF.csv",index=False)

最后输出文件如下:

在这里插入图片描述

Step3特征提取后处理

特征提取后处理流程1:病理学家标注ROI区域-> ROI patches (512 × 512 pixels) were tiled using OpenSlide -> color-normalized using the Vahadane method -> 50 non-overlapping representative patches that contained more tumor cells from each patient were selected for feature extraction -> The final value of each feature was averaged over 50 patches for each slide.

特征提取后处理流程2:The extracted features wereCellProfiler platform aggregated by mean, median, SD, 25-quantiles, and 75-quantiles of the values for the ROl in each slide. Intotal, 525 pathomics nucleus features (pNUC) features were generated for each patient.


三、一些其他pipeline参考(来于文献)

文献1: Development and interpretation of a pathomics-driven ensemble model for predicting the response to immunotherapy in gastric cancer

  • Cellprofiler Pipeline: First, pathomics tumor nucleus features were extracted. After segmenting tumor nuclei using a HoVer-Net model for each ROI, we extracted three categories of pathomics nucleus features, including nuclear intensity, morphology, and texture features, using the “MeasureObjectIntensity”, “MeasureObject SizeShape”, and “Measure Texture” modules in the CellProfiler platform.

文献2: Clinical use of machine learning-based pathomics signature for diagnosis and survival prediction of bladder cancer https://onlinelibrary.wiley.com/doi/full/10.1111/cas.14927

  • Patch选择策略:Patch with 1000 × 1000 pixels

  • Cellprofiler Pipeline:We built an image processing pipeline (Document S1) for segmentation and feature extraction using multiple modules in CellProfiler. H&E-stained images were firstly unmixed with 1000 × 1000 pixels via the ‘UnmixColors’ module. Afterwards, unmixed images were automatically segmented via an ‘IdentifyPrimaryObjects’ module and an ‘IdentifySecondaryObjects’ module to identify the cell nuclei and cell cytoplasm. Quantitative image features of object shape, size, texture, and pixel intensity distribution were further extracted via multiple modules, including measure models of ‘Object Intensity Distribution’, ‘Object Intensity’, ‘Texture’, and ‘Object Size Shape’. After eliminating unnecessary image features, 345 available quantitative image features (Document S2) were finally selected for further analysis, which were also listed in Table S1.


文献3: Prognostic and predictive value of a pathomics signature in gastric cancer(IF:17)(https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653436

  • Patch选择策略: Ten nonoverlapping representative tiles of each case containing the greatest number of tumour cells with a field of view of 1000 × 1000 pixels (one pixel is equal to 0.504 μm) were selected by a pathologist and then confirmed by the other pathologist.

  • Cellprofiler Pipeline: The quantitative pathomics features of the selected tiles were extracted by using CellProfiler (version 4.0.7), an open-source image analysis software developed by the Broad Institute (Cambridge, MA). The H&E-stained images were split into haematoxylin-stained and eosin-stained greyscale images using the “UnmixColors” module. The H&E-stained images were also converted to greyscale images using the “ColorToGray” module based on the “Combine” method for further analysis. First, the features that indicated the image quality of the greyscale H&E, haematoxylin and eosin images were assessed by using the “MeasureImageQuality” and “MeasureImageIntensity” modules with three types of features, including blurred features, intensity features and threshold features. The threshold features were extracted by automatically calculating the threshold for each image to identify the tissue foreground from the unstained background with the Otsu algorithm. Subsequently, the colocalization and correlation between intensities in each haematoxylin-stained image and eosin-stained image were calculated on a pixel-by-pixel basis across an entire image by using the *“*MeasureColocalization” module. In addition, the granularity features of each image were assessed using the “MeasureGranularity” module, which outputted spectra of size measurements of the textures in the image, with a granular spectrum range of 16. Further description of the pipeline for feature extraction is described in the Supplementary Methods. A summary of the pathomics features is presented in Supplementary Table 15.

http://www.xdnf.cn/news/1369549.html

相关文章:

  • vue2 和 vue3 生命周期的区别
  • 一篇文章拆解Java主流垃圾回收器及其调优方法。
  • LeetCode-22day:多维动态规划
  • 代码随想录Day62:图论(Floyd 算法精讲、A * 算法精讲、最短路算法总结、图论总结)
  • vue2和vue3的对比
  • TensorFlow 深度学习:使用 feature_column 训练心脏病分类模型
  • Day3--HOT100--42. 接雨水,3. 无重复字符的最长子串,438. 找到字符串中所有字母异位词
  • CentOS 7 服务器初始化:从 0 到 1 的安全高效配置指南
  • 肌肉力量训练
  • 木马免杀工具使用
  • 产品经理操作手册(3)——产品需求文档
  • 全链路营销增长引擎闭门会北京站开启倒计时,解码营销破局之道
  • 构建生产级 RAG 系统:从数据处理到智能体(Agent)的全流程深度解析
  • 书生大模型InternLM2:从2.6T数据到200K上下文的开源模型王者
  • word批量修改交叉引用颜色
  • 【SystemUI】新增实体键盘快捷键说明
  • 常用Nginx正则匹配规则
  • ruoyi-vue(十二)——定时任务,缓存监控,服务监控以及系统接口
  • 软件检测报告:XML外部实体(XXE)注入漏洞原因和影响
  • 服务器初始化流程***
  • 在分布式环境下正确使用MyBatis二级缓存
  • 在 UniApp 中,实现下拉刷新
  • Python爬虫: 分布式爬虫架构讲解及实现
  • IjkPlayer 播放 MP4 视频时快进导致进度回退的问题
  • iOS 26 正式版即将发布,Flutter 完成全新 devicectl + lldb 的 Debug JIT 运行支持
  • 深度学习(三):PyTorch 损失函数:按任务分类的实用指南
  • Milvus介绍及多模态检索实践
  • 系统设计中的幂等性
  • 【LeetCode 热题 100】31. 下一个排列
  • 进入docker中mysql容器的方法