
PyTorch Basics: Tensors (Creation, Indexing, and Reshaping)

The tensor is the most fundamental object you work with in the PyTorch framework. Be clear that a tensor is a data container similar to NumPy's ndarray, not a tensor in the physics sense. Below is an excerpt of the signature of the tensor() method:

tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor

As the signature shows, a tensor object carries the data itself (data), a data type (dtype), the device it computes on (device), and whether gradients are tracked (requires_grad), among other attributes. These attributes are what allow tensors to support both data preprocessing and automatic differentiation in models.
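
These attributes can be inspected directly; here is a minimal sketch (the printed device depends on your machine):

import torch

t = torch.tensor([1.0, 2.0], requires_grad=True)
print(t.dtype)          # torch.float32
print(t.device)         # cpu (by default)
print(t.requires_grad)  # True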

Tensor Creation

Method 1: manual assignment

Use torch.tensor(), passing in the data and the corresponding dtype.

import torch
a = torch.tensor(data=[1, 2, 3, 4], dtype=torch.float)
print(a)

#### Output
tensor([1., 2., 3., 4.])

Method 2: the torch.ones function

torch.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor

Returns a tensor filled with the scalar value 1, with the shape defined
by the variable argument size.

Args:
    size (int...): a sequence of integers defining the shape of the output tensor.
        Can be a variable number of arguments or a collection like a list or tuple.

Keyword arguments:
    out (Tensor, optional): the output tensor.
    dtype (torch.dtype, optional): the desired data type of the returned tensor.
        Default: if None, uses the global default (see torch.set_default_dtype).
    layout (torch.layout, optional): the desired layout of the returned tensor.
        Default: torch.strided.
    device (torch.device, optional): the desired device of the returned tensor.
        Default: if None, uses the current device for the default tensor type
        (see torch.set_default_device); the CPU for CPU tensor types and the
        current CUDA device for CUDA tensor types.
    requires_grad (bool, optional): if autograd should record operations on the
        returned tensor. Default: False.

Example:
    >>> torch.ones(2, 3)
    tensor([[1., 1., 1.],
            [1., 1., 1.]])
    >>> torch.ones(5)
    tensor([1., 1., 1., 1., 1.])

Creates a tensor filled with ones. The arguments are typically the desired shape, e.g. 2x2 or 3x3, and a dtype can also be specified with dtype=....
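
A quick sketch of the two call styles described above:

import torch

a = torch.ones(2, 2)                     # shape given as separate ints
b = torch.ones((3, 3), dtype=torch.int)  # shape as a tuple, with an explicit dtype
print(a)
print(b)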

Method 3: the torch.zeros function

torch.zeros(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor

Returns a tensor filled with the scalar value 0, with the shape defined
by the variable argument size. The size argument and keyword arguments
are the same as for torch.ones.

Example:
    >>> torch.zeros(2, 3)
    tensor([[0., 0., 0.],
            [0., 0., 0.]])
    >>> torch.zeros(5)
    tensor([0., 0., 0., 0., 0.])

Creates a tensor filled with zeros. The arguments are typically the desired shape, e.g. 2x2 or 3x3, and a dtype can also be specified with dtype=....
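
A matching sketch for zeros:

import torch

a = torch.zeros(2, 2)                     # 2x2 tensor of zeros, default dtype
b = torch.zeros((3, 3), dtype=torch.int)  # 3x3 integer tensor of zeros
print(a)
print(b)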

Method 4: the torch.rand function (uniform distribution)

torch.rand(*size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) -> Tensor

Returns a tensor filled with random numbers drawn from the uniform
distribution on the interval [0, 1). The shape is defined by the variable
argument size. Beyond the keyword arguments of torch.ones, it accepts
generator (a pseudorandom number generator for sampling) and pin_memory
(allocate the result in pinned memory; CPU tensors only).

Example:
    >>> torch.rand(4)
    tensor([0.5204, 0.2503, 0.3525, 0.5673])
    >>> torch.rand(2, 3)
    tensor([[0.8237, 0.5781, 0.6879],
            [0.3816, 0.7249, 0.0998]])

Creates a tensor of random values in [0, 1), drawn from a uniform distribution. The arguments are typically the desired shape, e.g. 2x2 or 3x3, and a dtype can also be specified with dtype=....
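
For example (the values are random, so your output will differ):

import torch

a = torch.rand(2, 2)              # uniform samples from [0, 1)
print(a)
print(a.min() >= 0, a.max() < 1)  # always tensor(True) tensor(True)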

Method 5: the torch.randn function (standard normal distribution)

torch.randn(*size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) -> Tensor

Returns a tensor filled with random numbers from a normal distribution
with mean 0 and variance 1 (the standard normal distribution):

    out_i ~ N(0, 1)

For complex dtypes, each element is sampled i.i.d. from a complex normal
distribution with zero mean and unit variance, out_i ~ CN(0, 1), which is
equivalent to sampling the real and imaginary parts independently:
Re(out_i) ~ N(0, 1/2) and Im(out_i) ~ N(0, 1/2). The shape is defined by
the variable argument size; the keyword arguments are the same as for
torch.rand.

Example:
    >>> torch.randn(4)
    tensor([-2.1436,  0.9966,  2.3426, -0.6366])
    >>> torch.randn(2, 3)
    tensor([[ 1.5954,  2.8929, -1.0923],
            [ 1.1719, -0.4709, -0.1996]])

Usage is similar to rand, but the generated values can be positive or negative and follow the standard normal distribution.
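
For example (again, random values):

import torch

a = torch.randn(2, 3)  # samples from N(0, 1); roughly half come out negative
print(a)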

Method 6: the torch.normal function (normal distribution with specified parameters)

torch.normal(mean, std, *, generator=None, out=None) -> Tensor

Returns a tensor of random numbers drawn from separate normal
distributions whose mean and standard deviation are given. mean is a
tensor of per-element means and std is a tensor of per-element standard
deviations. The shapes of mean and std do not need to match, but the
total number of elements in each must be the same; when the shapes
differ, the shape of mean is used for the output. (When std is a CUDA
tensor, this function synchronizes its device with the CPU.)

Example:
    >>> torch.normal(mean=torch.arange(1., 11.), std=torch.arange(1, 0, -0.1))
    tensor([ 1.0425,  3.5672,  2.7969,  4.2925,  4.7229,  6.2134,
             8.0505,  8.1408,  9.0563, 10.0566])

The inputs are the mean and std: either tensors giving per-element parameters (as in the signature above), or scalars together with an explicit output size.
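
Both call styles, sketched (the per-element variant follows the signature above; the scalar variant is the one used in the examples later in this article):

import torch

# Scalar mean/std with an explicit output size:
a = torch.normal(mean=2.0, std=0.1, size=(3, 3))
# Per-element means and stds passed as tensors (no size argument):
b = torch.normal(mean=torch.arange(1., 5.), std=torch.full((4,), 0.5))
print(a)
print(b)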

Tensor Indexing

First, be clear about what a tensor's dimensionality means: a 0-D tensor is a scalar; a 1-D tensor is a vector; a 2-D tensor is a matrix; a 3-D tensor can be pictured as a cube, the typical case being image data of shape CxHxW; a 4-D tensor can be read as n images, or a video sequence; dimensions beyond that rarely come up.
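
A quick sketch of tensors at each dimensionality (the shapes here are chosen only for illustration):

import torch

s = torch.tensor(3.14)               # 0-D: scalar
v = torch.tensor([1., 2., 3.])       # 1-D: vector
m = torch.zeros(3, 3)                # 2-D: matrix
img = torch.zeros(3, 224, 224)       # 3-D: one CxHxW image
batch = torch.zeros(8, 3, 224, 224)  # 4-D: a batch of 8 images
for t in (s, v, m, img, batch):
    print(t.dim(), tuple(t.shape))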

How to check a tensor's dimensions: dim(), size(), shape

import torch
a = torch.normal(mean=2, std=0.1, size=(3, 3))
print(a.dim())
print(a.size())
print(a.shape)

#### Output
2
torch.Size([3, 3])
torch.Size([3, 3])

Indexing method 1: integer indexing

Among the ways to index a multi-dimensional tensor, integer indexing is the simplest: you give the index you want along each dimension directly. (Remember that indexing starts at 0; for a 3x3 matrix, the element in the third row and third column is at index (2, 2), not (3, 3).)

import torch
a = torch.normal(mean=2, std=0.1, size=(3, 3))
print(a)
# Element in the first row, third column
print(a[0, 2])

#### Output
tensor([[2.1063, 2.0607, 2.0592],
        [2.0272, 2.1103, 2.0569],
        [1.9521, 2.1360, 2.0703]])
tensor(2.0592)

Indexing method 2: slice indexing

Slice indexing uses the : symbol. First, the notation:

For example, 1:7 denotes the half-open interval [1, 7). :7 is shorthand for [0, 7). 7: means [7, ∞), i.e. everything from index 7 on. A bare : means everything. A form like 0:5:2 takes every other element from [0, 5): the 2 is the step, which lets you skip through the data at a regular interval; the selected indices are [0, 2, 4].

import torch
a = torch.normal(mean=2, std=0.1, size=(3, 3))
print(a)
# First row, third column (the slice keeps the row dimension)
print(a[0:1, 2])
# Same selection, with the start index omitted
print(a[:1, 2])
# All rows, third column
print(a[:, 2])
# Third row, all columns
print(a[2, :])
# First and third rows, third column (step of 2)
print(a[::2, 2])

#### Output
tensor([[2.1054, 2.0106, 1.9586],
        [2.0267, 1.9019, 1.9440],
        [2.0521, 1.9646, 1.9683]])
tensor([1.9586])
tensor([1.9586])
tensor([1.9586, 1.9440, 1.9683])
tensor([2.0521, 1.9646, 1.9683])
tensor([1.9586, 1.9683])

Indexing method 3: fancy indexing (indexing with lists)

Integer indexing pinpoints a value by its exact coordinates, but it can only fetch one element at a time. Slice indexing fetches several elements at once, but they must be contiguous or regularly spaced (every k-th). So how would you take just the first and fifth rows of a 10x10 matrix? One option is to slice twice, but to do it in a single step you need fancy indexing.

Put simply, you use lists in the same slots as integer and slice indices. Indexing is fundamentally about selecting along each dimension; a rough model is data(dim0, dim1, dim2, ..., dimN). Integer indexing puts an integer in the dim0, dim1, ... slots; slice indexing puts a range there. So what happens if you put a list in one of those slots?

Back to the earlier question: the answer is data([0, 4], :). Dimension 0 gets the list [0, 4], selecting the first and fifth rows; dimension 1 gets :, selecting all columns.

import torch
a = torch.normal(mean=2, std=0.1, size=(5, 5))
print(a)
# First and fifth rows
print(a[[0, 4], :])

#### Output
tensor([[1.9448, 2.0447, 2.1332, 1.9940, 2.0615],
        [2.0787, 2.1697, 2.0324, 2.0906, 1.9377],
        [1.9631, 2.1131, 1.9026, 2.1066, 2.1266],
        [2.0901, 1.9980, 1.9581, 1.9519, 1.9402],
        [1.8239, 2.0482, 2.0029, 2.1383, 2.0160]])
tensor([[1.9448, 2.0447, 2.1332, 1.9940, 2.0615],
        [1.8239, 2.0482, 2.0029, 2.1383, 2.0160]])

Indexing method 4: boolean indexing

Boolean indexing selects (indexes) data by a condition. For example, to find all values greater than 2 in the 5x5 matrix above:

import torch
a = torch.normal(mean=2, std=0.1, size=(5, 5))
print(a)
# Select values greater than 2
condition_1 = a > 2
print(condition_1)
print(a[condition_1])

#### Output
tensor([[1.9632, 2.0117, 2.2046, 1.9589, 2.0198],
        [1.8498, 2.1950, 2.0849, 1.6962, 2.0406],
        [1.8606, 2.1223, 2.1240, 1.9681, 1.8088],
        [2.0776, 2.0092, 1.8629, 2.0593, 2.0015],
        [1.9666, 1.9746, 2.0476, 2.0693, 1.9117]])
tensor([[False,  True,  True, False,  True],
        [False,  True,  True, False,  True],
        [False,  True,  True, False, False],
        [ True,  True, False,  True,  True],
        [False, False,  True,  True, False]])
tensor([2.0117, 2.2046, 2.0198, 2.1950, 2.0849, 2.0406, 2.1223, 2.1240, 2.0776,
        2.0092, 2.0593, 2.0015, 2.0476, 2.0693])

Looks like magic? The key mechanism is that with boolean indexing, positions where the mask is True are kept and positions where it is False are dropped.

import torch
a = torch.arange(1., 11.)  # torch.range(1, 10) is deprecated; arange excludes the endpoint
print(a)
condition_1 = torch.tensor([True, True, True, False, False, False, False, False, False, False])
print(a[condition_1])

#### Output
tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
tensor([1., 2., 3.])

A practical example: from a 10x7x7x30 YOLOv1-style output, select the predictions whose confidence (stored at index 4 of the last dimension) exceeds 0.5. Note that boolean indexing flattens the masked dimensions: the 10x7x7 mask picks out individual cells, each keeping its 30-vector (the count of 248 below depends on the random data):

import torch
a = torch.rand(size=(10, 7, 7, 30))
condition_1 = a[:, :, :, 4] > 0.5
out_put = a[condition_1]
print(a.shape)
print(condition_1.shape)
print(out_put.shape)

#### Output
torch.Size([10, 7, 7, 30])
torch.Size([10, 7, 7])
torch.Size([248, 30])

Tensor Reshaping

Reshaping a tensor involves two kinds of operations: squeezing and expanding dimensions, and changing the data's shape.

Squeezing and expanding dimensions

Squeezing: torch.squeeze(input, dim). When dim is not given, every dimension of size 1 is removed; e.g. 1x3x3 becomes 3x3, and 3x1x3 becomes 3x3. If no dimension has size 1, the output equals the input and nothing is squeezed.

import torch
a = torch.rand(size=(3, 1, 1))
b = torch.squeeze(a)
print(a)
print(b)
print(a.shape)
print(b.shape)

#### Output
tensor([[[0.6981]],

        [[0.0029]],

        [[0.0866]]])
tensor([0.6981, 0.0029, 0.0866])
torch.Size([3, 1, 1])
torch.Size([3])

The reason this works is that a size-1 dimension in a multi-dimensional tensor does nothing but pad the dimensionality: 1x3x3 and 3x3 hold the same data at different dimensionalities, so squeeze is used to strip the superfluous size-1 dimensions.
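
squeeze also accepts a dim argument to target one dimension; that dimension is removed only if its size is 1, otherwise the tensor comes back unchanged (a small sketch):

import torch

a = torch.rand(3, 1, 1)
print(torch.squeeze(a, dim=1).shape)  # torch.Size([3, 1]): only dim 1 removed
print(torch.squeeze(a, dim=0).shape)  # torch.Size([3, 1, 1]): dim 0 has size 3, unchanged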

Distinct from squeezing, concatenation along dimensions (cat / stack) is another dimension-level operation; it will be covered later.

Expanding: torch.unsqueeze(input, dim). Where squeezing removes size-1 dimensions, expanding inserts one; dim is the position at which the new size-1 dimension is created.

import torch
a = torch.rand(size=(3, 3, 3))
b = torch.unsqueeze(a, dim=1)
print(a)
print(b)
print(a.shape)
print(b.shape)

#### Output
tensor([[[0.5458, 0.1297, 0.9105],
         [0.3012, 0.3555, 0.3145],
         [0.7861, 0.7796, 0.9566]],

        [[0.8420, 0.9238, 0.7980],
         [0.7637, 0.6829, 0.9350],
         [0.0533, 0.4995, 0.9782]],

        [[0.9134, 0.8123, 0.6730],
         [0.3774, 0.5909, 0.2171],
         [0.3497, 0.8188, 0.8477]]])
tensor([[[[0.5458, 0.1297, 0.9105],
          [0.3012, 0.3555, 0.3145],
          [0.7861, 0.7796, 0.9566]]],

        [[[0.8420, 0.9238, 0.7980],
          [0.7637, 0.6829, 0.9350],
          [0.0533, 0.4995, 0.9782]]],

        [[[0.9134, 0.8123, 0.6730],
          [0.3774, 0.5909, 0.2171],
          [0.3497, 0.8188, 0.8477]]]])
torch.Size([3, 3, 3])
torch.Size([3, 1, 3, 3])

Tensor reshaping methods

Changing a tensor's shape reorganizes its dimensions without changing its data, a fundamental operation in deep learning. PyTorch provides two main methods for this, reshape and view:

1. The reshape method

reshape() returns a tensor with the same data as the original but a different shape. It can compute a missing dimension size automatically (when -1 is passed for that dimension).

Characteristics:

  • Returns a view of the original data when the shape is compatible with the memory layout, and a newly allocated copy otherwise
  • Suitable when you are unsure whether the original tensor is contiguous
  • Syntax: tensor.reshape(new_shape)

Example:

import torch
x = torch.arange(12)  # a 1-D tensor holding 0-11
y = x.reshape(3, 4)   # reshaped into a 3x4 matrix
z = x.reshape(2, -1)  # the second dimension is computed automatically as 6

2. The view method

view() returns a new tensor that shares the original tensor's data but has a different shape. It requires the original tensor to be contiguous in memory.

Characteristics:

  • Does not copy data; only the "view" changes
  • Faster than reshape, but requires a contiguous tensor
  • Raises an error if the shape is incompatible
  • Syntax: tensor.view(new_shape)

Example:

import torch
a = torch.tensor([[1, 2], [3, 4]])
b = a.view(4)     # flattened into a 1-D tensor
c = a.view(1, 4)  # reshaped into a 1x4 matrix

Notes

  1. The total number of elements must stay the same, whatever the change of shape
  2. When -1 is passed for a dimension, PyTorch computes its size automatically
  3. For non-contiguous tensors, use reshape rather than view (see the sketch below)
  4. Neither method modifies the original tensor; both return a new one
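
To illustrate note 3, a minimal sketch: a transpose yields a non-contiguous tensor, on which view raises an error while reshape still works:

import torch

a = torch.arange(6).reshape(2, 3)
b = a.t()                  # the transpose shares storage but is non-contiguous
print(b.is_contiguous())   # False

try:
    b.view(6)              # view needs contiguous memory
except RuntimeError as e:
    print("view failed:", e)

print(b.reshape(6))            # reshape copies when needed, so it succeeds
print(b.contiguous().view(6))  # alternatively, make it contiguous first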
