
PyTorch Notes (21): A Complete Guide to torch.randn in PyTorch

backend 2025/7/6 7:02:25


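A Complete Guide to `torch.randn` in PyTorch
- 1. Interface Definition
- 2. Parameter Details
- 3. Common Use Cases
- 4. Positional Arguments vs. Tuple Arguments: A Numeric Example
- 5. Arguments That Must Be Passed by Keyword
- Summary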


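In deep learning we often need to sample from the standard normal distribution $\mathcal{N}(0,1)$, and PyTorch provides a flexible interface for this: torch.randn. This article walks through the interface definition, the parameters, common use cases, an example with its output, and the reasoning behind the keyword-only arguments.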

1. Interface Definition

torch.randn(*size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) → Tensor

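Purpose: returns a tensor filled with samples drawn from the standard normal distribution (mean 0, variance 1).
Reading the signature:
- *size: a variable-length sequence of positional integers, or a single tuple/list of integers, that specifies the shape of the output tensor.
- out: optional; an existing tensor that the result is written into.
- dtype: the data type (for example torch.float32 or torch.float64). If omitted, the global default dtype is used.
- layout: the memory layout; defaults to torch.strided (a dense tensor).
- device: the device, for example "cpu" or "cuda:0".
- requires_grad: whether autograd should track operations on the returned tensor.
- generator: a custom random number generator, useful for isolating random streams in multi-threaded or multi-GPU setups.
- pin_memory: whether to allocate the returned tensor in pinned memory; this only applies to CPU tensors.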
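To make the `generator` argument more concrete, here is a minimal sketch (the variable names are illustrative only) showing that a locally created `torch.Generator` yields its own reproducible stream and leaves the global RNG state alone:

```python
import torch

# Two separate generators seeded identically produce the same stream.
g1 = torch.Generator().manual_seed(1)
g2 = torch.Generator().manual_seed(1)

x1 = torch.randn(2, 3, generator=g1)
x2 = torch.randn(2, 3, generator=g2)
same_stream = torch.equal(x1, x2)

# Drawing from a local generator does not advance the global RNG.
state_before = torch.get_rng_state()
_ = torch.randn(2, 3, generator=g1)
global_untouched = torch.equal(state_before, torch.get_rng_state())

print(same_stream, global_untouched)
```

This is why a dedicated generator is handy when one component (say, a data augmentation step) should not disturb the random numbers consumed elsewhere.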

2. Parameter Details

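| Parameter | Meaning | Example |
|---|---|---|
| `*size` | Shape of the output tensor, e.g. `2,3` or `(2,3)` | `torch.randn(2,3)` or `torch.randn((2,3))` |
| `out` | Tensor that receives the result; can only be passed as the `out=` keyword | `torch.randn(2,3, out=my_tensor)` |
| `dtype` | Data type of the output | `torch.randn(2,3, dtype=torch.float64)` |
| `layout` | Memory layout; rarely needs to be changed | `torch.randn(2,3, layout=torch.strided)` |
| `device` | Target device | `torch.randn(2,3, device='cuda:0')` |
| `requires_grad` | Whether to record gradients | `torch.randn(2,3, requires_grad=True)` |
| `generator` | A `torch.Generator()` to draw from | `g = torch.Generator().manual_seed(1)`, then `torch.randn(2,3, generator=g)` |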
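The following sketch exercises several of these keyword arguments together and inspects the result. The device selection is an assumption made so that the snippet also runs on machines without a GPU:

```python
import torch

# Fall back to the CPU when no CUDA device is available (assumption for portability).
device = "cuda:0" if torch.cuda.is_available() else "cpu"

x = torch.randn(2, 3, dtype=torch.float64, device=device, requires_grad=True)
print(x.dtype, x.device, x.requires_grad)

# Without dtype=, the global default dtype is used.
y = torch.randn(2, 3)
uses_default_dtype = y.dtype == torch.get_default_dtype()

# out= fills an existing tensor instead of allocating a new one.
buf = torch.empty(2, 3)
ret = torch.randn(2, 3, out=buf)
same_memory = ret.data_ptr() == buf.data_ptr()
print(uses_default_dtype, same_memory)
```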

3. Common Use Cases

Model weight initialization

self.weight = torch.randn(out_channels, in_channels) * std + mean
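Note that a plain tensor assigned to `self.weight` is not registered as a trainable parameter. A minimal sketch of how the line above might sit inside a module follows; `TinyLinear` and its default `mean`/`std` values are hypothetical and chosen only for illustration:

```python
import torch
from torch import nn


class TinyLinear(nn.Module):
    """Hypothetical bias-free linear layer with N(mean, std^2) weights."""

    def __init__(self, in_channels, out_channels, mean=0.0, std=0.02):
        super().__init__()
        # Scale and shift standard normal samples: w = z * std + mean.
        init = torch.randn(out_channels, in_channels) * std + mean
        # Wrapping in nn.Parameter registers the tensor with the module.
        self.weight = nn.Parameter(init)

    def forward(self, x):
        return x @ self.weight.t()


layer = TinyLinear(in_channels=4, out_channels=8)
out = layer(torch.randn(5, 4))
print(out.shape, [name for name, _ in layer.named_parameters()])
```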

Noise injection

noise = torch.randn(*x.shape, device=x.device)
x_noisy = x + noise * noise_level
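The call above copies the shape and device of `x` but not its dtype. When the noise should match `x` in all three respects, `torch.randn_like` is a compact alternative. A small sketch, with a made-up input tensor and noise level:

```python
import torch

x = torch.zeros(4, 3, dtype=torch.float64)  # stand-in for real data
noise_level = 0.1                            # illustrative value

# randn_like copies shape, dtype, and device from x.
noise = torch.randn_like(x)
x_noisy = x + noise * noise_level

print(noise.shape, noise.dtype, noise.device)
```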

Random input or simulation

random_input = torch.randn(batch_size, latent_dim)
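As a quick simulation sketch, one can draw a large sample and check that it behaves like $\mathcal{N}(0,1)$: the mean should be close to 0, the standard deviation close to 1, and roughly 68% of the values should fall within one standard deviation. The sample size and seed below are arbitrary choices:

```python
import torch

torch.manual_seed(0)
z = torch.randn(1_000_000)

sample_mean = z.mean().item()
sample_std = z.std().item()
# Fraction of samples with |z| < 1; the theoretical value is about 0.6827.
frac_within_one_sigma = (z.abs() < 1).float().mean().item()

print(sample_mean, sample_std, frac_within_one_sigma)
```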

Reproducible experiments
```
torch.manual_seed(42)
torch.randn(3,3)
```
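Seeding fixes the whole random stream, not each individual call. The sketch below shows that re-seeding reproduces the same tensor, while a second call without re-seeding simply continues the stream:

```python
import torch

torch.manual_seed(42)
first = torch.randn(3, 3)

torch.manual_seed(42)          # reset the global RNG to the same state
second = torch.randn(3, 3)     # identical to `first`
third = torch.randn(3, 3)      # next values in the stream, so different

print(torch.equal(first, second), torch.equal(second, third))  # True False
```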

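4. Positional Arguments vs. Tuple Arguments: A Numeric Example

The example below fixes the random seed and shows that both calling styles produce tensors with the same shape and the same output format.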

import torch
torch.manual_seed(0)
a = torch.randn(2, 3)    # Style A: positional arguments
print("a:\n", a)
b = torch.randn((2, 3))  # Style B: a tuple of ints
print("\nb:\n", b)
print("\na.shape =", a.shape, ", b.shape =", b.shape)

Sample output (exact values may differ across PyTorch versions and platforms):

a:
 tensor([[ 1.5410, -0.2934, -2.1788],
        [ 0.5684, -1.0845, -1.3986]])

b:
 tensor([[-0.4033,  0.8380, -0.7193],
        [ 0.0921, -0.3950, -0.0132]])

a.shape = torch.Size([2, 3]) , b.shape = torch.Size([2, 3])

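Conclusions:
- Both calls produce a tensor of shape (2,3). The values differ only because the second call continues the same random stream rather than restarting it.
- The positional form and the tuple form are equivalent; they are simply two ways of passing the shape.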
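To confirm that the shape forms really are interchangeable, and not just similar in shape, the seed can be reset before each call. This is a small sketch; the list and `torch.Size` variants are added here for completeness:

```python
import torch

torch.manual_seed(0)
a = torch.randn(2, 3)                 # positional ints

torch.manual_seed(0)
b = torch.randn((2, 3))               # tuple

torch.manual_seed(0)
c = torch.randn([2, 3])               # list

torch.manual_seed(0)
d = torch.randn(torch.Size([2, 3]))   # torch.Size, e.g. another tensor's .shape

all_equal = torch.equal(a, b) and torch.equal(a, c) and torch.equal(a, d)
print(all_equal)  # True
```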

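5. Arguments That Must Be Passed by Keyword

In Python, a `*size` parameter in a function signature means that all positional arguments are collected into a tuple named size. So when you call: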

torch.randn(2, 3,                   # these two positional arguments are the sizes
            out=my_out,             # can only be given by keyword
            dtype=torch.float64,    # keyword form
            layout=torch.strided,   # keyword form
            device='cuda:0',        # keyword form
            requires_grad=True,     # keyword form
            generator=g)            # keyword form

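If you try to sneak `out` in positionally, for example by writing torch.randn(2,3,my_tensor), my_tensor is treated as the size of a third dimension, which must be an int, so the call fails with a type error.
For this reason out, dtype, layout, device, requires_grad, and generator are all designed as keyword-only arguments: they can only be passed in key=value form, which keeps them from colliding with the shape arguments.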
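The failure mode is easy to reproduce. In this sketch `my_tensor` is a throwaway buffer; the error is expected to be a `TypeError`, and the exact message depends on the PyTorch version:

```python
import torch

my_tensor = torch.empty(2, 3)

try:
    # The tensor lands in *size and is rejected because it is not an int.
    torch.randn(2, 3, my_tensor)
    raised = False
except (TypeError, RuntimeError) as err:  # typically a TypeError
    raised = True
    print(type(err).__name__, "-", err)

# The keyword form is accepted and fills the buffer.
result = torch.randn(2, 3, out=my_tensor)
print(raised, result.shape)
```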

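Summary

torch.randn(*size): both positional integers and a tuple of integers can be used to specify the output shape;
Sample output: the two styles produce tensors of the same shape, and the values differ only because each call draws fresh numbers from the random stream;
Keyword arguments: out, dtype, layout, device, requires_grad, and generator must be written as name=value, so that positional arguments always mean "shape" and nothing else.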