当前位置：首页 > backend >正文

python中读取 Excel 表格数据

backend 2025/7/21 6:35:03

在pandas中读取 Excel 表格后，有多种方式可以按列、按行提取数据，下面我将详细介绍常见的方法。

0.声明

在本文中我使用的excel表内容如下：

1. 读取 Excel 文件

首先，我们需要使用 pandas 的 read_excel 函数读取 Excel 文件，示例代码如下：

import pandas as pd# 读取 Excel 文件
df = pd.read_excel(excel_file_path, sheet_name)

excel_file_path为文件路径
sheet_name为工作表名称

2. 按列提取数据

2.1 通过列名提取单列数据

可以使用方括号 [] 并传入列名来提取单列数据，提取后的数据类型为 pandas.Series。

# 提取名为 'age' 的列
column_data = df['age']
print(column_data)

2.2 通过列名提取多列数据

若要提取多列数据，可在方括号 [] 中传入一个包含列名的列表，提取后的数据类型为 pandas.DataFrame。

# 提取 'age' 和 'name' 两列
columns_data = df[['age', 'name']]
print(columns_data)

2.3 通过列索引提取单列数据

使用 iloc 方法，通过列的整数索引来提取单列数据。

# 提取第 0 列（索引从 0 开始）
first_column = df.iloc[:, 0]
print(first_column)

2.4 通过列索引提取多列数据

同样使用 iloc 方法，通过指定列索引范围来提取多列数据。

# 提取第 0 列到第 2 列（不包含第 2 列）
selected_columns = df.iloc[:, 0:2]
print(selected_columns)

3. 按行提取数据

3.1 通过行索引提取单行数据

使用 iloc 方法，通过行的整数索引来提取单行数据，提取后的数据类型为 pandas.Series。

# 提取第 0 行（索引从 0 开始）
first_row = df.iloc[0]
print(first_row)

3.2 通过行索引提取多行数据

使用 iloc 方法，通过指定行索引范围来提取多行数据，提取后的数据类型为 pandas.DataFrame。

# 提取第 0 行到第 2 行（不包含第 2 行）
selected_rows = df.iloc[0:2]
print(selected_rows)

3.3 通过条件筛选提取行数据

可以根据某列的条件来筛选出符合条件的行。

#提取 'age 列中值大于 10 的行
filtered_rows = df[df['age'] > 10]
print(filtered_rows)

4. 按行和列同时提取数据

4.1 使用 `iloc` 按行和列索引提取数据

# 提取第 0 行到第 2 行（不包含第 2 行）和第 0 列到第 2 列（不包含第 2 列）的数据
subset = df.iloc[0:2, 0:2]
print(subset)

5.总代码

这是本文中所使用的代码，大家可以复制下来自己运行一遍

import pandas as pdexcel_file_path="./data.xlsx"
sheet_name="Sheet1"
# 读取 Excel 文件
df = pd.read_excel(excel_file_path, sheet_name)# 提取名为 'age' 的列
print("提取名为 'age' 的列")
column_data = df['age']
print(column_data)# 提取 'age' 和 'name' 两列
print("提取 'age' 和 'name' 两列")
columns_data = df[['age', 'name']]
print(columns_data)# 提取第 0 列（索引从 0 开始）
print("提取第 0 列（索引从 0 开始）")
first_column = df.iloc[:, 0]
print(first_column)# 提取第 0 列到第 2 列（不包含第 2 列）
print("提取第 0 列到第 2 列（不包含第 2 列）")
selected_columns = df.iloc[:, 0:2]
print(selected_columns)# 提取第 0 行（索引从 0 开始）
print("提取第 0 行（索引从 0 开始）")
first_row = df.iloc[0]
print(first_row)# 提取第 0 行到第 2 行（不包含第 2 行）
print("提取第 0 行到第 2 行（不包含第 2 行）")
selected_rows = df.iloc[0:2]
print(selected_rows)# 提取 'age 列中值大于 10 的行
print("提取 'age 列中值大于 10 的行")
filtered_rows = df[df['age'] > 10]
print(filtered_rows)# 提取第 0 行到第 2 行（不包含第 2 行）和第 0 列到第 2 列（不包含第 2 列）的数据
print("提取第 0 行到第 2 行")
subset = df.iloc[0:2, 0:2]
print(subset)