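Python JSON: The Complete Guide

This guide is written for complete beginners. By the end you should understand:

What JSON is and what it looks like
How Python data maps to JSON
The main parameters of json.dumps() / json.dump()
The main parameters of json.loads() / json.load()
How to pretty-print output, sort keys, and handle Chinese text
How to handle special values in JSON (NaN, Infinity, dates, and so on)
Custom encoders and decoders
Common errors and debugging tips
Practical case studies

Every topic starts from the basics and comes with runnable examples.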

Part 1: What Is JSON?

1.1 What JSON Looks Like

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It looks like Python dicts and lists, but its syntax rules are stricter.

{
    "姓名": "张三",
    "年龄": 25,
    "身高": 175.5,
    "已婚": false,
    "爱好": ["读书", "爬山", "摄影"],
    "地址": {
        "城市": "北京",
        "区": "海淀区"
    },
    "备注": null
}

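The six JSON data types (the table lists integers and floats separately because Python maps them to different types):

JSON type	Example	Python type
String	`"hello"`	`str`
Number (integer)	`42`	`int`
Number (float)	`3.14`	`float`
Boolean	`true` / `false`	`True` / `False`
Null	`null`	`None`
Array	`[1, 2, 3]`	`list`
Object	`{"key": "value"}`	`dict`

Three key differences between JSON and Python:

JSON booleans are true/false (lowercase); Python uses True/False (capitalized)

The JSON null value is null; Python uses None

JSON strings must use double quotes "; Python accepts both single and double quotes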

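1.2 Why Use JSON?

# Scenario: save program data to a file, or send it to another program

# Problem: the output of str() cannot be parsed by other languages or programs
data = {"name": "张三", "score": 95}
print(str(data))   # {'name': '张三', 'score': 95}
# This is Python-specific syntax; JavaScript's JSON.parse rejects it

# Solution: use JSON, a language-independent format
import json
print(json.dumps(data))   # {"name": "\u5f20\u4e09", "score": 95}
# This is standard JSON, which parsers in most languages can read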

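Three main uses of JSON:

Use 1: Data persistence (saving to files)
  Python dict/list → JSON file → read back and restore later

Use 2: Network transfer (API communication)
  Server returns a JSON string → Python parses it → use the data

Use 3: Configuration files
  Program reads config.json → gets its settings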

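1.3 The Four Core Functions of the json Module

import json

# In-memory operations (strings):
json.dumps(obj)   # Python object → JSON string    (dumps = dump string)
json.loads(s)     # JSON string → Python object    (loads = load string)

# File operations:
json.dump(obj, f)  # Python object → written to a JSON file  (dump to file)
json.load(f)       # Read from a JSON file → Python object   (load from file)

How to remember them:

With an s (string) → works on strings (in memory)
Without an s       → works on files

dumps: convert a Python object to a JSON string
loads: convert a JSON string to a Python object
dump : write a Python object to a JSON file
load : read a JSON file and convert it to a Python object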

Part 2: json.dumps(), from Object to String

2.1 Basic Usage

import json

# ─── Dict ───
person = {
    "name": "张三",
    "age": 25,
    "score": 95.5,
    "active": True,
    "address": None,
}

json_str = json.dumps(person)
print(json_str)
# {"name": "\u5f20\u4e09", "age": 25, "score": 95.5, "active": true, "address": null}

print(type(json_str))   # <class 'str'>

# ─── List ───
colors = ["红", "绿", "蓝"]
print(json.dumps(colors))
# ["\u7ea2", "\u7eff", "\u84dd"]

# ─── Nested structure ───
data = {
    "学生": [
        {"name": "张三", "score": 90},
        {"name": "李四", "score": 85},
    ],
    "班级": "高三1班",
}
print(json.dumps(data))

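2.2 The ensure_ascii Parameter: Showing Chinese Text Directly

import json

data = {"name": "张三", "city": "北京"}

# Default (ensure_ascii=True): non-ASCII characters are escaped as \uXXXX
print(json.dumps(data))
# {"name": "\u5f20\u4e09", "city": "\u5317\u4eac"}

# ensure_ascii=False: Chinese text is written as-is (recommended)
print(json.dumps(data, ensure_ascii=False))
# {"name": "张三", "city": "北京"}

Takeaway: when the data contains Chinese (or any non-ASCII characters) and people need to read the output, pass ensure_ascii=False, and write files with encoding="utf-8".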

2.3 The indent Parameter: Pretty-Printing with Indentation

import json

data = {
    "学校": "清华大学",
    "学生": [
        {"姓名": "张三", "成绩": 90, "爱好": ["读书", "运动"]},
        {"姓名": "李四", "成绩": 85, "爱好": ["音乐"]},
    ],
    "创建时间": "2024-01-01",
}

# No indent (default): everything on one line, hard to read
print(json.dumps(data, ensure_ascii=False))

print()

# indent=4: indent with 4 spaces, easy for people to read
print(json.dumps(data, ensure_ascii=False, indent=4))

Output (indent=4):

{
    "学校": "清华大学",
    "学生": [
        {
            "姓名": "张三",
            "成绩": 90,
            "爱好": [
                "读书",
                "运动"
            ]
        },
        {
            "姓名": "李四",
            "成绩": 85,
            "爱好": [
                "音乐"
            ]
        }
    ],
    "创建时间": "2024-01-01"
}

Tip: use indent=4 when saving to a file (easier to inspect by hand), and omit indent for network transfer (smaller payload).
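
The size difference is easy to measure. The following is a minimal sketch; the sample data is made up for illustration and is not taken from the examples above.

```python
import json

# Sample data invented for this illustration
data = {"学校": "清华大学", "学生": [{"姓名": "张三", "成绩": 90}]}

pretty = json.dumps(data, ensure_ascii=False, indent=4)
compact = json.dumps(data, ensure_ascii=False)

# Both strings decode to the same object; only the whitespace differs
assert json.loads(pretty) == json.loads(compact)

print(len(compact) < len(pretty))  # True
```

The gap grows with nesting depth, because every nested level adds another run of leading spaces to each line.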

2.4 The sort_keys Parameter: Sorting by Key

import json

data = {"zebra": 1, "apple": 2, "mango": 3, "banana": 4}

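# Default: keeps the original insertion order
print(json.dumps(data))
# {"zebra": 1, "apple": 2, "mango": 3, "banana": 4}

# sort_keys=True: keys sorted in ascending string order
print(json.dumps(data, sort_keys=True))
# {"apple": 2, "banana": 4, "mango": 3, "zebra": 1}

# Sorting plus indentation gives very tidy output
print(json.dumps(data, sort_keys=True, indent=2))

When should you use sort_keys=True?

✅ Comparing the text of two JSON documents (sorted output makes the comparison reliable)
✅ Generating documents or reports where field order should be stable
✅ Writing files kept under version control (cleaner git diff)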
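
The comparison case can be sketched as follows. The two dicts are illustrative. Note that comparing parsed Python dicts with == already ignores key order; sort_keys matters when you compare or hash the serialized text.

```python
import json

# Same content, different insertion order
a = {"name": "张三", "age": 25}
b = {"age": 25, "name": "张三"}

# Without sorting, the text follows insertion order and differs
plain_equal = json.dumps(a) == json.dumps(b)

# With sort_keys=True, both serialize to the same text
sorted_equal = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)

print(plain_equal)   # False
print(sorted_equal)  # True
```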

2.5 The separators Parameter: Controlling Delimiters

import json

data = {"a": 1, "b": 2, "c": 3}

# Default separators: ", " and ": " (with spaces)
print(json.dumps(data))
# {"a": 1, "b": 2, "c": 3}

# No spaces: smallest output for network transfer
print(json.dumps(data, separators=(",", ":")))
# {"a":1,"b":2,"c":3}

# Custom separators (special purposes only; the result is no longer valid JSON)
print(json.dumps(data, separators=("; ", " = ")))
# {"a" = 1; "b" = 2; "c" = 3}

2.6 Putting the Parameters Together

import json

data = {
    "用户名": "zhang_san",
    "姓名": "张三",
    "分数": 98.6,
    "通过": True,
    "备注": None,
    "标签": ["优秀", "进步"],
}

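# Recommended "human-friendly" format (for config and data files)
result = json.dumps(
    data,
    ensure_ascii=False,   # write Chinese as-is
    indent=4,             # 4-space indentation
    sort_keys=True,       # sort keys
)
print(result)

# Recommended "network transfer" format (for APIs, to reduce traffic)
result_compact = json.dumps(
    data,
    ensure_ascii=False,
    separators=(",", ":"),   # drop spaces
)
print(result_compact)

Output (human-friendly format; sort_keys orders keys by Unicode code point, not by pinyin):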

{
    "分数": 98.6,
    "备注": null,
    "姓名": "张三",
    "标签": [
        "优秀",
        "进步"
    ],
    "用户名": "zhang_san",
    "通过": true
}

Part 3: json.loads(), from String to Object

3.1 Basic Usage

import json

# ─── Parse a JSON string ───
json_str = '{"name": "张三", "age": 25, "score": 95.5, "active": true, "note": null}'

data = json.loads(json_str)

print(data)
# {'name': '张三', 'age': 25, 'score': 95.5, 'active': True, 'note': None}

print(type(data))           # <class 'dict'>
print(data["name"])         # 张三
print(data["active"])       # True (note: JSON true → Python True)
print(data["note"])         # None (note: JSON null → Python None)
print(type(data["age"]))    # <class 'int'>
print(type(data["score"]))  # <class 'float'>

3.2 Parsing Different Kinds of JSON

import json

# ─── JSON array → Python list ───
json_array = '[1, 2, 3, "hello", true, null]'
result = json.loads(json_array)
print(result)          # [1, 2, 3, 'hello', True, None]
print(type(result))    # <class 'list'>

# ─── Nested JSON ───
json_nested = '''
{
    "school": "清华大学",
    "students": [
        {"name": "张三", "grades": [90, 85, 92]},
        {"name": "李四", "grades": [78, 82, 88]}
    ],
    "active": true
}
'''
data = json.loads(json_nested)

print(data["school"])                    # 清华大学
print(data["students"][0]["name"])       # 张三
print(data["students"][0]["grades"])     # [90, 85, 92]
print(data["students"][1]["grades"][2])  # 88

# ─── Simple scalars ───
print(json.loads("42"))         # 42 (int)
print(json.loads("3.14"))       # 3.14 (float)
print(json.loads('"hello"'))    # hello (str)
print(json.loads("true"))       # True (bool)
print(json.loads("null"))       # None

3.3 Handling JSON Returned by an API (the Most Common Case)

import json

# Simulated response body (a string) from an HTTP API
api_response = '''
{
    "code": 200,
    "message": "success",
    "data": {
        "users": [
            {"id": 1, "name": "Alice", "email": "alice@example.com", "vip": true},
            {"id": 2, "name": "Bob",   "email": "bob@example.com",   "vip": false},
            {"id": 3, "name": "Carol", "email": null,                "vip": true}
        ],
        "total": 3,
        "page": 1
    }
}
'''

response = json.loads(api_response)

# Check the status code
if response["code"] == 200:
    users = response["data"]["users"]
    print(f"共 {response['data']['total']} 个用户：\n")

    for user in users:
        vip_flag = "⭐VIP" if user["vip"] else "普通"
        email    = user["email"] if user["email"] else "（未填写）"
        print(f"  [{vip_flag}] {user['name']}  邮箱：{email}")

Output:

共 3 个用户：

  [⭐VIP] Alice  邮箱：alice@example.com
  [普通] Bob  邮箱：bob@example.com
  [⭐VIP] Carol  邮箱：（未填写）

Part 4: json.dump(), Writing to a File

4.1 Basic Usage

import json

data = {
    "姓名": "张三",
    "年龄": 25,
    "爱好": ["读书", "旅行", "摄影"],
    "地址": {"城市": "北京", "区": "海淀区"},
}

# Write the JSON file
with open("person.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=4)

print("已写入 person.json")

# Verify: read it back and look at it
with open("person.json", "r", encoding="utf-8") as f:
    content = f.read()
print(content)

Contents of person.json:

{
    "姓名": "张三",
    "年龄": 25,
    "爱好": [
        "读书",
        "旅行",
        "摄影"
    ],
    "地址": {
        "城市": "北京",
        "区": "海淀区"
    }
}

4.2 Parameters of dump() (the Same Keyword Arguments as dumps())

import json

data = {"b": 2, "a": 1, "name": "测试", "active": True}

with open("output.json", "w", encoding="utf-8") as f:
    json.dump(
        data,
        f,
        ensure_ascii=False,   # write Chinese as-is (recommended with a UTF-8 file)
        indent=4,             # number of spaces per indent level
        sort_keys=True,       # sort keys
        separators=None,      # None means use the default separators
    )

4.3 Saving a List (Multiple Records)

import json

# Save a batch of student score records
students = [
    {"id": 1, "name": "张三", "math": 90, "english": 85},
    {"id": 2, "name": "李四", "math": 78, "english": 92},
    {"id": 3, "name": "王五", "math": 88, "english": 79},
    {"id": 4, "name": "赵六", "math": 95, "english": 91},
]

with open("students.json", "w", encoding="utf-8") as f:
    json.dump(students, f, ensure_ascii=False, indent=2)

print(f"已保存 {len(students)} 条学生记录")

# Read the file back and use it
with open("students.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)

# Compute each student's average score
for s in loaded:
    avg = (s["math"] + s["english"]) / 2
    print(f"{s['name']}: 数学{s['math']}，英语{s['english']}，均分{avg:.1f}")

Output:

已保存 4 条学生记录
张三: 数学90，英语85，均分87.5
李四: 数学78，英语92，均分85.0
王五: 数学88，英语79，均分83.5
赵六: 数学95，英语91，均分93.0

4.4 In Practice: JSON as a Lightweight Database

import json
import os

DB_FILE = "contacts.json"

def load_contacts():
    """Load the contact list from the file; return an empty list if the file does not exist"""
    if not os.path.exists(DB_FILE):
        return []
    with open(DB_FILE, "r", encoding="utf-8") as f:
        return json.load(f)

def save_contacts(contacts):
    """Save the contact list to the file"""
    with open(DB_FILE, "w", encoding="utf-8") as f:
        json.dump(contacts, f, ensure_ascii=False, indent=2)

def add_contact(name, phone, email=""):
    """Add a contact"""
    contacts = load_contacts()
    # Check whether the contact already exists
    for c in contacts:
        if c["name"] == name:
            print(f"联系人 '{name}' 已存在！")
            return
    contacts.append({"name": name, "phone": phone, "email": email})
    save_contacts(contacts)
    print(f"已添加：{name} ({phone})")

def find_contact(name):
    """Find contacts whose name contains the given text"""
    contacts = load_contacts()
    results = [c for c in contacts if name in c["name"]]
    if results:
        for c in results:
            print(f"  姓名：{c['name']}，手机：{c['phone']}，邮箱：{c['email'] or '无'}")
    else:
        print(f"未找到包含 '{name}' 的联系人")

def list_all():
    """List all contacts"""
    contacts = load_contacts()
    if not contacts:
        print("联系人列表为空")
        return
    print(f"共 {len(contacts)} 个联系人：")
    for i, c in enumerate(contacts, 1):
        print(f"  {i}. {c['name']}  {c['phone']}")

# Usage example
add_contact("张三", "13812345678", "zhangsan@example.com")
add_contact("李四", "13987654321")
add_contact("王小明", "15011112222", "wang@qq.com")
add_contact("张三", "13800000000")   # duplicate, rejected

print()
list_all()

print()
print('搜索"张"：')
find_contact("张")

Output (first run, when contacts.json does not exist yet):

已添加：张三 (13812345678)
已添加：李四 (13987654321)
已添加：王小明 (15011112222)
联系人 '张三' 已存在！

共 3 个联系人：
  1. 张三  13812345678
  2. 李四  13987654321
  3. 王小明  15011112222

搜索"张"：
  姓名：张三，手机：13812345678，邮箱：zhangsan@example.com

Part 5: json.load(), Reading from a File

5.1 Basic Usage

import json

# Prerequisite: a config.json file exists
# Write a sample file first
config = {
    "app_name": "我的应用",
    "version": "1.2.3",
    "debug": False,
    "database": {
        "host": "localhost",
        "port": 5432,
        "name": "mydb",
    },
    "allowed_hosts": ["127.0.0.1", "192.168.1.0"],
    "max_connections": 100,
}

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=4)

# ─── Read the config file ───
with open("config.json", "r", encoding="utf-8") as f:
    cfg = json.load(f)

print(f"应用名称：{cfg['app_name']}")
print(f"版本号：  {cfg['version']}")
print(f"调试模式：{cfg['debug']}")
print(f"数据库：  {cfg['database']['host']}:{cfg['database']['port']}")
print(f"允许IP：  {cfg['allowed_hosts']}")

Output:

应用名称：我的应用
版本号：  1.2.3
调试模式：False
数据库：  localhost:5432
允许IP：  ['127.0.0.1', '192.168.1.0']

5.2 Reading and Processing Multiple Records

import json

# Sample content, written to products.json below:
sample_json = '''[
    {"product": "苹果", "price": 5.5,  "stock": 100, "on_sale": true},
    {"product": "香蕉", "price": 3.0,  "stock": 0,   "on_sale": false},
    {"product": "橙子", "price": 4.5,  "stock": 50,  "on_sale": true},
    {"product": "葡萄", "price": 12.0, "stock": 30,  "on_sale": false},
    {"product": "草莓", "price": 25.0, "stock": 20,  "on_sale": true}
]'''

with open("products.json", "w", encoding="utf-8") as f:
    f.write(sample_json)

# Read and analyze
with open("products.json", "r", encoding="utf-8") as f:
    products = json.load(f)

print(f"共 {len(products)} 种商品\n")

# Show only products that are in stock and on sale
available = [p for p in products if p["stock"] > 0 and p["on_sale"]]
print("=== 有货在售 ===")
for p in available:
    print(f"  {p['product']:4s}  ¥{p['price']:.1f}  库存：{p['stock']}")

# Compute the average price
avg_price = sum(p["price"] for p in products) / len(products)
print(f"\n所有商品平均价格：¥{avg_price:.2f}")

# Find the most expensive product
most_expensive = max(products, key=lambda p: p["price"])
print(f"最贵商品：{most_expensive['product']}  ¥{most_expensive['price']}")

Output:

共 5 种商品

=== 有货在售 ===
  苹果  ¥5.5  库存：100
  橙子  ¥4.5  库存：50
  草莓  ¥25.0  库存：20

所有商品平均价格：¥10.00
最贵商品：草莓  ¥25.0

Part 6: dumps vs. dump (Summary Comparison)

6.1 The Four Functions Side by Side

import json

data = {"name": "张三", "age": 25}

# ─── dumps: object → string (in memory) ───
s = json.dumps(data, ensure_ascii=False)
print(type(s))   # <class 'str'>
print(s)         # {"name": "张三", "age": 25}

# ─── loads: string → object (in memory) ───
obj = json.loads(s)
print(type(obj))   # <class 'dict'>
print(obj)         # {'name': '张三', 'age': 25}

# ─── dump: object → file ───
with open("test.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=4)
# test.json now contains the formatted JSON

# ─── load: file → object ───
with open("test.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)
print(type(loaded))  # <class 'dict'>
print(loaded)        # {'name': '张三', 'age': 25}

6.2 How dump and dumps Relate

import json

data = {"x": 1}

# dump(data, f) produces the same file content as:
# f.write(dumps(data))
# dump writes straight to the file; dumps returns a string

# Emulating dump by hand with dumps + write:
with open("manual.json", "w", encoding="utf-8") as f:
    f.write(json.dumps(data, ensure_ascii=False, indent=4))

# The resulting file content is identical
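
The equivalence can be checked without touching the disk. This is a small sketch that uses io.StringIO as an in-memory stand-in for a text file; the sample data is illustrative.

```python
import io
import json

data = {"x": 1, "name": "张三"}

# io.StringIO behaves like a text file opened for writing
buffer = io.StringIO()
json.dump(data, buffer, ensure_ascii=False, indent=4)

text = json.dumps(data, ensure_ascii=False, indent=4)

print(buffer.getvalue() == text)  # True
```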

Part 7: Type Conversion in Detail

7.1 Python → JSON Type Mapping

import json

# Check how each type is converted
examples = {
    "整数":   42,
    "浮点数": 3.14,
    "字符串": "hello",
    "布尔真": True,
    "布尔假": False,
    "空值":   None,
    "列表":   [1, 2, 3],
    "元组":   (4, 5, 6),       # tuple → JSON array (same as a list)
    "字典":   {"a": 1},
}

result = json.dumps(examples, ensure_ascii=False, indent=2)
print(result)

Output:

{
  "整数": 42,
  "浮点数": 3.14,
  "字符串": "hello",
  "布尔真": true,
  "布尔假": false,
  "空值": null,
  "列表": [
    1,
    2,
    3
  ],
  "元组": [
    4,
    5,
    6
  ],
  "字典": {
    "a": 1
  }
}

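Full Python → JSON mapping:

Python type	JSON type	Notes
`dict`	object `{}`	Keys of type str, int, float, bool, or None are written as strings; other key types raise TypeError
`list`	array `[]`
`tuple`	array `[]`	The tuple type is lost after conversion
`str`	string `""`
`int`	number
`float`	number	`inf`/`nan` are written as `Infinity`/`NaN` by default (non-standard JSON); with `allow_nan=False` they raise ValueError
`True`	`true`
`False`	`false`
`None`	`null`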

7.2 JSON → Python Type Mapping

import json

json_str = '''
{
    "integer": 42,
    "float":   3.14,
    "string":  "hello",
    "true":    true,
    "false":   false,
    "null":    null,
    "array":   [1, 2, 3],
    "object":  {"a": 1}
}
'''

result = json.loads(json_str)

for key, value in result.items():
    print(f"{key:10s}: {repr(value):20s} → 类型：{type(value).__name__}")

Output:

integer   : 42                   → 类型：int
float     : 3.14                 → 类型：float
string    : 'hello'              → 类型：str
true      : True                 → 类型：bool
false     : False                → 类型：bool
null      : None                 → 类型：NoneType
array     : [1, 2, 3]            → 类型：list
object    : {'a': 1}             → 类型：dict

Note: a JSON array becomes a Python list (not a tuple); a JSON object becomes a Python dict.
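
The default mapping for numbers can be changed when parsing. json.loads accepts a parse_float argument that is called with the text of every JSON float; the sketch below passes decimal.Decimal so that fractional values are read exactly. The sample string is illustrative.

```python
import json
from decimal import Decimal

text = '{"price": 0.1, "count": 3}'

default_result = json.loads(text)
exact_result = json.loads(text, parse_float=Decimal)

print(type(default_result["price"]).__name__)  # float
print(type(exact_result["price"]).__name__)    # Decimal
print(exact_result["price"] + Decimal("0.2"))  # 0.3
print(type(exact_result["count"]).__name__)    # int (integers are not affected)
```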

7.3 Types That Cannot Be Converted Directly (They Raise an Error)

import json
from datetime import datetime, date
import decimal

# ❌ The types below cannot be passed to dumps directly; they raise TypeError

# Date and time
now = datetime.now()
try:
    json.dumps({"time": now})
except TypeError as e:
    print(f"datetime 报错：{e}")
# TypeError: Object of type datetime is not JSON serializable

# Decimal (exact decimal numbers)
price = decimal.Decimal("9.99")
try:
    json.dumps({"price": price})
except TypeError as e:
    print(f"Decimal 报错：{e}")

# Set
s = {1, 2, 3}
try:
    json.dumps({"set": s})
except TypeError as e:
    print(f"set 报错：{e}")

# bytes
b = b"hello"
try:
    json.dumps({"bytes": b})
except TypeError as e:
    print(f"bytes 报错：{e}")

How to fix this? → See Part 8 (custom encoders)

Part 8: Handling Special Values

8.1 Special float Values: NaN and Infinity

import json
import math

# ─── Problem: Python's nan/inf are not legal in standard JSON ───
data = {
    "a": float("nan"),
    "b": float("inf"),
    "c": float("-inf"),
}

# Default behavior: emits non-standard JSON (some parsers reject it)
print(json.dumps(data))
# {"a": NaN, "b": Infinity, "c": -Infinity}   ← non-standard!

# ─── Option 1: allow_nan=False → raise an error instead ───
try:
    json.dumps(data, allow_nan=False)
except ValueError as e:
    print(f"报错：{e}")

# ─── Option 2: clean the data before serializing ───
def clean_floats(obj):
    """Replace nan/inf with None (null)"""
    if isinstance(obj, dict):
        return {k: clean_floats(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [clean_floats(v) for v in obj]
    elif isinstance(obj, float):
        if math.isnan(obj) or math.isinf(obj):
            return None
    return obj

clean_data = clean_floats(data)
print(json.dumps(clean_data))
# {"a": null, "b": null, "c": null}
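
The same issue exists on the reading side: json.loads accepts NaN, Infinity, and -Infinity by default. A minimal sketch of a stricter parser follows, using the parse_constant argument of json.loads; the helper names reject_constant and strict_loads are hypothetical and chosen for this illustration.

```python
import json

def reject_constant(name):
    # json calls this with "NaN", "Infinity", or "-Infinity"
    raise ValueError(f"non-standard JSON constant: {name}")

def strict_loads(text):
    """Parse JSON but refuse NaN / Infinity / -Infinity."""
    return json.loads(text, parse_constant=reject_constant)

print(strict_loads('{"a": 1.5}'))   # {'a': 1.5}

try:
    strict_loads('{"a": NaN}')
except ValueError as e:
    print(f"rejected: {e}")   # rejected: non-standard JSON constant: NaN
```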

8.2 Custom Encoding: Handling datetime, Decimal, and More

import json
from datetime import datetime, date
import decimal

# ─── Option 1: the default parameter (recommended, most concise) ───
def custom_default(obj):
    """Called whenever json meets an object it cannot serialize"""
    if isinstance(obj, datetime):
        return obj.strftime("%Y-%m-%d %H:%M:%S")   # to string
    if isinstance(obj, date):
        return obj.strftime("%Y-%m-%d")
    if isinstance(obj, decimal.Decimal):
        return float(obj)    # to float (may lose precision)
    if isinstance(obj, set):
        return list(obj)     # to list
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    # Any other unsupported type: raise TypeError
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {
    "now":     datetime(2024, 1, 15, 10, 30, 0),
    "today":   date(2024, 1, 15),
    "price":   decimal.Decimal("9.99"),
    "tags":    {"Python", "编程", "数据"},
    "content": b"hello world",
}

result = json.dumps(data, default=custom_default, ensure_ascii=False, indent=2)
print(result)

Output (a set has no fixed order, so the order of the "tags" items may differ between runs):

{
  "now": "2024-01-15 10:30:00",
  "today": "2024-01-15",
  "price": 9.99,
  "tags": [
    "Python",
    "编程",
    "数据"
  ],
  "content": "hello world"
}

8.3 A Custom Encoder Class (a More Complete Approach)

import json
from datetime import datetime, date
import decimal

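class SmartEncoder(json.JSONEncoder):
    """
    Custom JSON encoder. Each special value is wrapped in a tagged dict
    {"__type__": ..., "value": ...} so that it can be restored later:
    - datetime / date → ISO 8601 string
    - Decimal → string
    - set → sorted list
    - bytes → hex string
    """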

    def default(self, obj):
        if isinstance(obj, datetime):
            return {"__type__": "datetime", "value": obj.isoformat()}
        if isinstance(obj, date):
            return {"__type__": "date", "value": obj.isoformat()}
        if isinstance(obj, decimal.Decimal):
            return {"__type__": "decimal", "value": str(obj)}
        if isinstance(obj, set):
            return {"__type__": "set", "value": sorted(obj)}
        if isinstance(obj, bytes):
            return {"__type__": "bytes", "value": obj.hex()}
        # Let the parent class handle everything else (it raises TypeError)
        return super().default(obj)

data = {
    "created_at": datetime(2024, 1, 15, 10, 30),
    "date":       date(2024, 1, 15),
    "price":      decimal.Decimal("99.99"),
    "tags":       {"Python", "JSON"},
}

# Use the custom encoder
result = json.dumps(data, cls=SmartEncoder, ensure_ascii=False, indent=2)
print(result)

Output:

{
  "created_at": {
    "__type__": "datetime",
    "value": "2024-01-15T10:30:00"
  },
  "date": {
    "__type__": "date",
    "value": "2024-01-15"
  },
  "price": {
    "__type__": "decimal",
    "value": "99.99"
  },
  "tags": {
    "__type__": "set",
    "value": [
      "JSON",
      "Python"
    ]
  }
}

8.4 A Custom Decoder: Restoring Special Types

import json
from datetime import datetime, date
import decimal

def smart_decoder(obj):
    """
    object_hook: called for every JSON object (dict) that is parsed.
    Tagged dicts are turned back into their original types.
    """
    if "__type__" not in obj:
        return obj   # ordinary dict, return unchanged

    t     = obj["__type__"]
    value = obj["value"]

    if t == "datetime":
        return datetime.fromisoformat(value)
    if t == "date":
        return date.fromisoformat(value)
    if t == "decimal":
        return decimal.Decimal(value)
    if t == "set":
        return set(value)
    if t == "bytes":
        return bytes.fromhex(value)

    return obj   # unknown tag, return unchanged

# Decode with object_hook
json_str = '''
{
    "created_at": {"__type__": "datetime", "value": "2024-01-15T10:30:00"},
    "price":      {"__type__": "decimal",  "value": "99.99"},
    "tags":       {"__type__": "set",      "value": ["JSON", "Python"]}
}
'''

result = json.loads(json_str, object_hook=smart_decoder)

print(result["created_at"])          # 2024-01-15 10:30:00
print(type(result["created_at"]))    # <class 'datetime.datetime'>
print(result["price"])               # 99.99
print(type(result["price"]))         # <class 'decimal.Decimal'>
print(result["tags"])                # {'JSON', 'Python'}
print(type(result["tags"]))          # <class 'set'>

Part 9: Error Handling

9.1 json.JSONDecodeError: When Parsing Fails

import json

bad_jsons = [
    "{'name': 'Zhang San'}",     # single quotes (Python dict syntax, invalid in JSON)
    '{"name": "Zhang San",}',    # trailing comma (not allowed in JSON)
    '{"name": "Zhang San"',      # missing closing brace
    "undefined",                  # JavaScript's undefined does not exist in JSON
    "",                           # empty string
    "None",                       # Python's None does not exist in JSON (use null)
]

for bad in bad_jsons:
    try:
        result = json.loads(bad)
    except json.JSONDecodeError as e:
        print(f"解析失败：{repr(bad[:30])}")
        print(f"  错误：{e.msg}")
        print(f"  位置：第{e.lineno}行，第{e.colno}列，字符位置{e.pos}")
        print()

Output (the exact wording of a message can vary between Python versions):

解析失败："{'name': 'Zhang San'}"
  错误：Expecting property name enclosed in double quotes
  位置：第1行，第2列，字符位置1

解析失败：'{"name": "Zhang San",}'
  错误：Expecting property name enclosed in double quotes
  位置：第1行，第22列，字符位置21

...

9.2 Parsing JSON Safely

import json

def safe_json_loads(json_str, default=None):
    """Parse JSON safely: return a default value on failure instead of crashing"""
    try:
        return json.loads(json_str)
    except json.JSONDecodeError as e:
        print(f"[警告] JSON 解析失败：{e}")
        return default
    except TypeError as e:
        print(f"[警告] 输入类型错误：{e}")
        return default

# Tests (the two failing calls also print a warning line before the result)
print(safe_json_loads('{"name": "张三"}'))         # {'name': '张三'}
print(safe_json_loads("invalid json", default={})) # {}
print(safe_json_loads(None, default=[]))           # []

9.3 Error Handling When Reading and Writing Files

import json
import os

def load_json_file(filepath, default=None):
    """Safely read a JSON file"""
    if not os.path.exists(filepath):
        print(f"[信息] 文件不存在：{filepath}，使用默认值")
        return default if default is not None else {}

    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return json.load(f)
    except json.JSONDecodeError as e:
        print(f"[错误] JSON 格式有误：{e}")
        return default if default is not None else {}
    except PermissionError:
        print(f"[错误] 无权限读取：{filepath}")
        return default if default is not None else {}
    except UnicodeDecodeError as e:
        print(f"[错误] 编码错误，尝试使用 gbk 重新读取：{e}")
        try:
            with open(filepath, "r", encoding="gbk") as f:
                return json.load(f)
        except Exception:
            return default if default is not None else {}

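def save_json_file(filepath, data):
    """Safely write a JSON file"""
    # Write to a temporary file first and replace the target only on success,
    # so a crash halfway through cannot corrupt the existing data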
    temp_path = filepath + ".tmp"
    try:
        with open(temp_path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        # Atomic replace
        os.replace(temp_path, filepath)
        return True
    except (TypeError, ValueError) as e:
        print(f"[错误] 数据无法序列化：{e}")
        return False
    except OSError as e:
        print(f"[错误] 文件写入失败：{e}")
        return False
    finally:
        # Clean up the temporary file if it is still there
        if os.path.exists(temp_path):
            os.remove(temp_path)

# Usage example
config = load_json_file("config.json", default={"debug": False})
config["last_run"] = "2024-01-15"
save_json_file("config.json", config)

Part 10: Practical Case Studies

10.1 Case 1: Reading and Writing an Application Config File

import json
import os

CONFIG_FILE = "app_config.json"

DEFAULT_CONFIG = {
    "language":    "zh_CN",
    "theme":       "light",
    "font_size":   14,
    "auto_save":   True,
    "recent_files": [],
    "shortcuts": {
        "save":   "Ctrl+S",
        "open":   "Ctrl+O",
        "quit":   "Ctrl+Q",
    },
}

def load_config():
    """Load the config; fall back to the defaults if the file does not exist"""
    if os.path.exists(CONFIG_FILE):
        with open(CONFIG_FILE, "r", encoding="utf-8") as f:
            saved = json.load(f)
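        # Merge: saved values override the defaults, so newly added default keys still get a value
        # (copy() and update() are shallow: a nested dict such as "shortcuts" is replaced whole, not merged)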
        config = DEFAULT_CONFIG.copy()
        config.update(saved)
        return config
    return DEFAULT_CONFIG.copy()

def save_config(config):
    """Save the config to the file"""
    with open(CONFIG_FILE, "w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=4, sort_keys=True)

def update_config(key, value):
    """Change a single config entry"""
    config = load_config()
    config[key] = value
    save_config(config)
    print(f"已更新配置：{key} = {value}")

# Usage example
config = load_config()
print(f"当前主题：{config['theme']}")
print(f"字体大小：{config['font_size']}")

# Change settings
update_config("theme", "dark")
update_config("font_size", 16)

# Add a recent file
config = load_config()
config["recent_files"].append("/home/user/document.txt")
if len(config["recent_files"]) > 5:
    config["recent_files"] = config["recent_files"][-5:]   # keep only the latest 5
save_config(config)

# Show the final config
final = load_config()
print(f"\n最终配置：")
print(json.dumps(final, ensure_ascii=False, indent=2))
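
load_config above merges with dict.update, which replaces nested dicts such as "shortcuts" as a whole. If a saved file may contain only some of the nested keys, a recursive merge keeps the remaining defaults. The following is a sketch under that assumption; deep_merge is a hypothetical helper, not part of the json module, and the sample values are illustrative.

```python
import copy
import json

def deep_merge(defaults, overrides):
    """Return a new dict: defaults updated by overrides, merging nested dicts."""
    result = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result

defaults = {"theme": "light", "shortcuts": {"save": "Ctrl+S", "open": "Ctrl+O"}}
saved = json.loads('{"theme": "dark", "shortcuts": {"save": "Ctrl+Shift+S"}}')

merged = deep_merge(defaults, saved)
print(merged)
# {'theme': 'dark', 'shortcuts': {'save': 'Ctrl+Shift+S', 'open': 'Ctrl+O'}}
```

Because the result is built from deep copies, later changes to it do not leak back into the defaults.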

10.2 Case 2: Processing API Data (a Weather API Example)

import json

# Simulated JSON from a weather API (a real client would fetch it with the requests library)
weather_json = '''
{
    "city": "北京",
    "date": "2024-01-15",
    "current": {
        "temp": -3,
        "feels_like": -8,
        "humidity": 45,
        "wind_speed": 15,
        "wind_dir": "西北",
        "weather": "晴",
        "aqi": 85
    },
    "forecast": [
        {"date": "2024-01-15", "high": 2,  "low": -5, "weather": "晴"},
        {"date": "2024-01-16", "high": 5,  "low": -2, "weather": "多云"},
        {"date": "2024-01-17", "high": 8,  "low": 1,  "weather": "小雪"},
        {"date": "2024-01-18", "high": 3,  "low": -3, "weather": "阴"},
        {"date": "2024-01-19", "high": -1, "low": -8, "weather": "大风"}
    ],
    "alerts": [
        {"level": "蓝色", "type": "大风", "desc": "预计15日夜间到16日大风，阵风7~8级"}
    ]
}
'''

weather = json.loads(weather_json)

# Show the current weather
cur = weather["current"]
print(f"{'='*30}")
print(f"📍 {weather['city']}  {weather['date']}")
print(f"{'='*30}")
print(f"天气：{cur['weather']}")
print(f"气温：{cur['temp']}°C（体感 {cur['feels_like']}°C）")
print(f"湿度：{cur['humidity']}%")
print(f"风向风速：{cur['wind_dir']}风 {cur['wind_speed']}km/h")
print(f"空气质量：AQI {cur['aqi']}（{'优' if cur['aqi']<50 else '良' if cur['aqi']<100 else '轻度污染'}）")

# Show alerts
if weather["alerts"]:
    print(f"\n⚠️  气象预警：")
    for alert in weather["alerts"]:
        print(f"  [{alert['level']}预警·{alert['type']}] {alert['desc']}")

# Show the forecast
print(f"\n📅 未来5天预报：")
for day in weather["forecast"]:
    print(f"  {day['date']}  {day['weather']:4s}  {day['low']}~{day['high']}°C")

# Find the coldest day
coldest = min(weather["forecast"], key=lambda d: d["low"])
print(f"\n最冷的一天：{coldest['date']}，最低气温 {coldest['low']}°C")

# Save to a file for later analysis
with open("weather_history.json", "w", encoding="utf-8") as f:
    json.dump(weather, f, ensure_ascii=False, indent=2)
print("\n已保存天气数据到 weather_history.json")

Output:

==============================
📍 北京  2024-01-15
==============================
天气：晴
气温：-3°C（体感 -8°C）
湿度：45%
风向风速：西北风 15km/h
空气质量：AQI 85（良）

⚠️  气象预警：
  [蓝色预警·大风] 预计15日夜间到16日大风，阵风7~8级

📅 未来5天预报：
  2024-01-15  晴    -5~2°C
  2024-01-16  多云  -2~5°C
  2024-01-17  小雪  1~8°C
  2024-01-18  阴    -3~3°C
  2024-01-19  大风  -8~-1°C

最冷的一天：2024-01-19，最低气温 -8°C

已保存天气数据到 weather_history.json

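10.3 Case 3: The JSON Lines Format (One JSON Document per Line, Suited to Large Data)

import json

# JSON Lines (.jsonl): every line is an independent JSON document
# Good for: log files, large datasets, streaming reads

JSONL_FILE = "logs.jsonl"

# ─── Write JSON Lines ───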
log_entries = [
    {"time": "2024-01-15 09:00:01", "level": "INFO",  "msg": "服务启动"},
    {"time": "2024-01-15 09:00:05", "level": "INFO",  "msg": "数据库连接成功"},
    {"time": "2024-01-15 09:01:23", "level": "WARN",  "msg": "内存使用率超过80%"},
    {"time": "2024-01-15 09:05:44", "level": "ERROR", "msg": "请求超时", "url": "/api/data"},
    {"time": "2024-01-15 09:10:00", "level": "INFO",  "msg": "定时任务执行完毕"},
]

with open(JSONL_FILE, "w", encoding="utf-8") as f:
    for entry in log_entries:
        # One JSON document per line, no indent (it must stay on a single line)
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")

print(f"已写入 {len(log_entries)} 条日志\n")

# ─── Read JSON Lines (line by line, so memory use stays low) ───
errors = []
with open(JSONL_FILE, "r", encoding="utf-8") as f:
    for line_num, line in enumerate(f, 1):
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
            if entry.get("level") == "ERROR":
                errors.append(entry)
        except json.JSONDecodeError as e:
            print(f"第{line_num}行解析失败：{e}")

print(f"发现 {len(errors)} 条错误日志：")
for e in errors:
    print(f"  [{e['time']}] {e['msg']}", end="")
    if "url" in e:
        print(f"  ({e['url']})", end="")
    print()

# ─── Append a new entry (open in append mode; no need to read the whole file) ───
new_entry = {"time": "2024-01-15 09:15:30", "level": "INFO", "msg": "新增日志"}
with open(JSONL_FILE, "a", encoding="utf-8") as f:
    f.write(json.dumps(new_entry, ensure_ascii=False) + "\n")
print(f"\n追加了 1 条新日志")
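
The line-by-line reading pattern above can be wrapped in a generator so that callers never hold the whole file in memory. This is a minimal sketch: iter_jsonl is a hypothetical helper name, and the demo writes a throwaway file so that logs.jsonl is left untouched.

```python
import json
import os
import tempfile

def iter_jsonl(path):
    """Yield one parsed object per non-empty line of a JSON Lines file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Demo file created in a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write('{"level": "INFO", "msg": "服务启动"}\n')
    f.write('{"level": "ERROR", "msg": "请求超时"}\n')

error_msgs = [e["msg"] for e in iter_jsonl(demo_path) if e["level"] == "ERROR"]
print(error_msgs)  # ['请求超时']
```

Unlike the loop in the case study, this version lets json.JSONDecodeError propagate, so the caller decides whether a bad line should be skipped or treated as fatal.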

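Part 11: Common Pitfalls and Caveats

11.1 Pitfall 1: JSON Keys Must Be Strings

import json

# Python dicts may use integer keys, but JSON object keys are always strings
data = {1: "one", 2: "two", 3: "three"}

# dumps does not raise, but the keys are converted to strings
s = json.dumps(data)
print(s)   # {"1": "one", "2": "two", "3": "three"}  ← note: 1 became "1"

# After loads, the keys are strings
back = json.loads(s)
print(back)          # {'1': 'one', '2': 'two', '3': 'three'}
print(back["1"])     # one
try:
    print(back[1])   # ❌ the key is now the string "1", not the integer 1
except KeyError as e:
    print(f"KeyError: {e}")   # KeyError: 1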
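
If the integer keys are needed again after loading, they have to be converted back by hand. A minimal sketch, assuming every key of the object was originally an int:

```python
import json

data = {1: "one", 2: "two", 3: "three"}
text = json.dumps(data)

# Assumption: all keys in this object were ints before serialization
restored = {int(k): v for k, v in json.loads(text).items()}

print(restored == data)  # True
print(restored[1])       # one
```

For data with mixed or nested key types, a list of [key, value] pairs is usually a more robust representation than relying on key conversion.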

11.2 Pitfall 2: Tuples Become Lists

import json

data = {
    "point": (1, 2, 3),   # tuple
    "color": (255, 128, 0),
}

# Serialize
s = json.dumps(data)
print(s)   # {"point": [1, 2, 3], "color": [255, 128, 0]}

# After deserializing, the tuple has become a list
back = json.loads(s)
print(type(back["point"]))   # <class 'list'>, not tuple

# To get a tuple again, convert it back by hand
back["point"] = tuple(back["point"])
print(type(back["point"]))   # <class 'tuple'>

11.3 Pitfall 3: Passing bytes to loads

import json

b = b'{"name": "Zhang San"}'

# Python 3.6+: loads accepts str, bytes, or bytearray
# (bytes input must be encoded as UTF-8, UTF-16, or UTF-32)
print(json.loads(b))   # {'name': 'Zhang San'}

# Python 3.5 and earlier raise TypeError for bytes input. Code that must run
# there, or that receives data in another encoding, should decode explicitly
s = b.decode("utf-8")
print(json.loads(s))   # {'name': 'Zhang San'}

11.4 Pitfall 4: File Encoding

import json

data = {"message": "你好，世界！"}

# ❌ No encoding given: on Windows the locale default (often gbk) may be used
with open("bad.json", "w") as f:         # no encoding!
    json.dump(data, f, ensure_ascii=False)
# A gbk-encoded file is garbled or fails to decode when read as UTF-8

# ✅ Always pass encoding="utf-8"
with open("good.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

# ✅ Or keep ensure_ascii=True (Chinese is escaped), so the output is pure ASCII
with open("safe.json", "w") as f:
    json.dump(data, f)   # Chinese is escaped as \uXXXX; readable under any ASCII-compatible encoding

11.5 Pitfall 5: Precision Loss with Large Numbers

import json
import decimal

# JSON itself puts no limit on number size, but some languages and systems lose precision on large integers
big_int = 9007199254740993   # larger than JavaScript's Number.MAX_SAFE_INTEGER

s = json.dumps({"id": big_int})
print(s)   # {"id": 9007199254740993}  ← fine in Python

# But if JavaScript parses this JSON:
# JSON.parse('{"id": 9007199254740993}').id → 9007199254740992 (precision lost!)

# Fix: store large integers as strings
s = json.dumps({"id": str(big_int)})
print(s)   # {"id": "9007199254740993"}

# Floating-point precision
data = {"price": 0.1 + 0.2}   # binary floating-point rounding in Python
print(json.dumps(data))         # {"price": 0.30000000000000004}

# Use Decimal to keep exact values (requires a custom encoder)
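
One way to keep exact decimal values is to write them as strings and convert them back after loading. The sketch below assumes the reader of the JSON knows that the "price" field holds a decimal string; encode_decimal is a hypothetical helper passed through the default parameter.

```python
import json
from decimal import Decimal

def encode_decimal(obj):
    """default hook: serialize Decimal as a string to keep every digit."""
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {"price": Decimal("0.1") + Decimal("0.2")}

text = json.dumps(data, default=encode_decimal)
print(text)  # {"price": "0.3"}

# Convert back explicitly when reading
restored = Decimal(json.loads(text)["price"])
print(restored == data["price"])  # True
```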

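Part 12: Summary and Quick Reference

12.1 The Four Core Functions at a Glance

import json

# obj, my_default, and my_hook below are placeholders for your own object and functions

# ─── In-memory operations ───
s   = json.dumps(obj)            # object → string (simplest form)
s   = json.dumps(obj,
         ensure_ascii=False,     # allow non-ASCII (Chinese)
         indent=4,               # pretty-print with indentation
         sort_keys=True,         # sort keys
         separators=(",", ":"),  # drop spaces (compact)
         default=my_default)     # handle custom types

obj = json.loads(s)              # string → object (simplest form)
obj = json.loads(s,
         object_hook=my_hook)    # custom decoding

# ─── File operations ───
with open("file.json", "w", encoding="utf-8") as f:
    json.dump(obj, f,            # same parameters as dumps, plus the file f
         ensure_ascii=False,
         indent=4)

with open("file.json", "r", encoding="utf-8") as f:
    obj = json.load(f)           # same parameters as loads, but takes the file f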

12.2 Templates for Common Scenarios

import json

# ─── Scenario 1: saving a config file ───
def save_config(config, path="config.json"):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=4, sort_keys=True)

def load_config(path="config.json", default=None):
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default or {}

# ─── Scenario 2: handling an API response ───
def parse_api_response(response_text):
    try:
        data = json.loads(response_text)
        return data
    except json.JSONDecodeError:
        return None

# ─── Scenario 3: pretty-printing JSON ───
def pretty_print(obj):
    print(json.dumps(obj, ensure_ascii=False, indent=2))

# ─── Scenario 4: data that contains datetime values ───
from datetime import datetime

def json_dumps_with_datetime(obj):
    def default(o):
        if isinstance(o, datetime):
            return o.isoformat()
        raise TypeError(f"Unsupported type: {type(o)}")
    return json.dumps(obj, default=default, ensure_ascii=False, indent=2)

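12.3 Core Concepts at a Glance

Concept	Description	Notes
`dumps`	Object → JSON string	Pass `ensure_ascii=False` for readable Chinese
`loads`	JSON string → object	Input must be valid JSON
`dump`	Object → JSON file	Open the file with `encoding="utf-8"`
`load`	JSON file → object	Open the file with `encoding="utf-8"`
`indent`	Indented formatting	Use when saving files; omit for network transfer
`sort_keys`	Sort keys	Use when stable output is needed
`default`	Custom encoding	Handles datetime, Decimal, and similar types
`object_hook`	Custom decoding	Restores special types
`JSONDecodeError`	Raised when parsing fails	Always catch it when parsing external JSON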
