当前位置：首页 > web >正文

【数据结构中哈希函数与哈希表】

web 2025/6/17 11:56:59

文章目录

前言
一、哈希函数是什么？
二、哈希冲突与解决方案
- 1. 拉链法（Separate Chaining）
- 2. 开放地址法（Open Addressing）
三、哈希表的性能分析
四、C++ STL 中的 unordered\_map
五、词频统计
总结

前言

哈希表（Hash Table）就是一种常见的、高效的数据结构，它利用哈希函数将数据映射到固定大小的空间，从而实现常数级别的插入、删除和查找操作。

一、哈希函数是什么？

哈希函数（Hash Function）是将任意长度的数据映射为固定长度的哈希值的函数。它的目标是尽量均匀地分布键值，避免哈希冲突。

示例（字符串转哈希）：

int simpleHash(string key, int tableSize) {int hashVal = 0;for (char ch : key) {hashVal = (hashVal * 31 + ch) % tableSize;}return hashVal;
}

该函数将字符串转换为整数索引，31 是常见的“乘法因子”，用于增强哈希分布。

二、哈希冲突与解决方案

由于哈希函数不可能完全避免冲突，常用的冲突解决方案有以下两种：

1. 拉链法（Separate Chaining）

每个桶（bucket）保存一个链表，将冲突的元素链在一起。

#include <iostream>
#include <list>
#include <vector>
using namespace std;class HashTable {
private:vector<list<int>> table;int size;int hash(int key) {return key % size;}public:HashTable(int s) : size(s) {table.resize(size);}void insert(int key) {int idx = hash(key);table[idx].push_back(key);}bool search(int key) {int idx = hash(key);for (int val : table[idx]) {if (val == key) return true;}return false;}
};

2. 开放地址法（Open Addressing）

如果出现冲突，就向后找下一个空位。常用方法有线性探测、二次探测和双重哈希。

线性探测示例：

#include <iostream>
#include <vector>
using namespace std;class LinearProbingHash {
private:vector<int> table;int size;int EMPTY = -1;int hash(int key) {return key % size;}public:LinearProbingHash(int s) : size(s) {table.resize(size, EMPTY);}void insert(int key) {int idx = hash(key);while (table[idx] != EMPTY) {idx = (idx + 1) % size;}table[idx] = key;}bool search(int key) {int idx = hash(key);int start = idx;while (table[idx] != EMPTY) {if (table[idx] == key) return true;idx = (idx + 1) % size;if (idx == start) break; // 表满}return false;}
};

三、哈希表的性能分析

操作	平均时间复杂度	最坏时间复杂度
查找	O(1)	O(n)（所有冲突）
插入	O(1)	O(n)（所有冲突）
删除	O(1)	O(n)（所有冲突）

注意： 哈希表性能取决于负载因子（load factor），负载因子高时容易冲突，需要扩容重哈希。

四、C++ STL 中的 unordered_map

标准库中的 unordered_map 就是哈希表的封装：

#include <iostream>
#include <unordered_map>
using namespace std;int main() {unordered_map<string, int> mp;mp["apple"] = 3;mp["banana"] = 5;if (mp.find("apple") != mp.end()) {cout << "Found apple, count = " << mp["apple"] << endl;}
}

unordered_map 使用哈希表实现，插入、删除、查找操作平均为 O(1)。

五、词频统计

假设有一段文本，要求统计每个单词出现的频率：

#include <iostream>
#include <unordered_map>
#include <sstream>
using namespace std;int main() {string text = "this is a test this is a test again";unordered_map<string, int> freq;stringstream ss(text);string word;while (ss >> word) {freq[word]++;}for (auto& [k, v] : freq) {cout << k << ": " << v << endl;}return 0;
}