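Processing JSON Data with Python

25.1 Introduction to JSON

25.1.1 What Is JSON

    JSON stands for JavaScript Object Notation (often rendered in Chinese as "JS标记") and is a lightweight data-interchange format. It is based on a subset of ECMAScript and stores and represents data in a text format that is completely independent of any programming language. Its concise, clearly layered structure makes JSON an ideal data-interchange language. Its main strengths are that it is easy for people to read, easy for machines to generate and parse, and compact enough to reduce the amount of data sent over the network.

25.1.2 The Two JSON Structures

    Put simply, JSON can be understood as the arrays and objects of JavaScript. Combining these two structures makes it possible to represent arbitrarily complex data.

25.1.2.1 Arrays

    In JavaScript, an array is defined with square brackets [ ]. A typical definition looks like this: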

let array=["Surpass","28","Shanghai"];

    To read a value from an array, use its index. Elements may be numbers, strings, arrays, objects, and so on.
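    To make the indexing rule concrete, here is a minimal Python sketch that parses a JSON array with the standard json module and reads elements by zero-based index. The sample values (including the extra city and the "lang" key) are illustrative only.

```python
import json

# A JSON array may mix numbers, strings, nested arrays, and objects.
text = '["Surpass", 28, ["Shanghai", "Nanjing"], {"lang": "Python"}]'
items = json.loads(text)  # a JSON array becomes a Python list

first = items[0]            # first element, index 0
nested_city = items[2][1]   # second element of the nested array
lang = items[3]["lang"]     # value inside the nested object
print(first, nested_city, lang)
```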

25.1.2.2 Objects

    In JavaScript, an object is defined with curly braces { }. A typical definition looks like this:

let personInfo={
    name:"Surpass",
    age:28,
    location:"Shanghai"
}

    An object is a collection of key/value pairs. In JavaScript, reading a value is as simple as variable.key. A value may be a number, string, array, object, and so on.
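    Note that unquoted keys are fine in a JavaScript object literal, but in JSON text every key must be a double-quoted string. The sketch below shows the same record as JSON text parsed in Python, where values are read by subscript rather than with variable.key. The "email" lookup is a made-up example of a missing key.

```python
import json

# The same record as JSON text: keys must be double-quoted strings.
text = '{"name": "Surpass", "age": 28, "location": "Shanghai"}'
person_info = json.loads(text)  # a JSON object becomes a Python dict

# Python dicts use subscript access instead of JavaScript's variable.key.
name = person_info["name"]
age = person_info["age"]
# .get() returns a default instead of raising KeyError for a missing key.
email = person_info.get("email", "unknown")
print(name, age, email)
```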

25.1.3 Supported Data Types

    JSON supports the following main data types:

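Array: written with square brackets
Object: written with curly braces
Integer, floating-point, Boolean, and null
String: must use double quotes; single quotes are not allowed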
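    The double-quote rule is easy to check with Python's standard json module. In this small sketch the variable names are illustrative; the parser accepts the double-quoted text and raises json.JSONDecodeError for the single-quoted one.

```python
import json

valid = '{"name": "Surpass"}'
invalid = "{'name': 'Surpass'}"  # single quotes are not valid JSON

parsed = json.loads(valid)

try:
    json.loads(invalid)
    rejected = False
except json.JSONDecodeError as exc:
    rejected = True
    print(f"Invalid JSON: {exc}")
```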

    Items are separated by commas. The correspondence with Python data types is as follows:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
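    The mapping can be verified directly. The following sketch decodes one value of each JSON type and records the name of the resulting Python type; the key names are arbitrary.

```python
import json

text = (
    '{"obj": {}, "arr": [], "s": "x", "i": 1, '
    '"r": 1.5, "t": true, "f": false, "n": null}'
)
decoded = json.loads(text)

# Map each key to the name of the Python type it was decoded into.
type_names = {key: type(value).__name__ for key, value in decoded.items()}
print(type_names)
```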

25.2 Python Support for JSON

25.2.1 Python and JSON Data Types

    In Python, JSON data is mainly handled with the json module. Import it before use:

import json

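    The json module mainly provides four functions: dump, dumps, load, and loads.

    While JSON is being processed, values are converted between native Python types and JSON types. The conversion tables are as follows:

Python to JSON

Python    JSON
dict      object
list      array
tuple     array
str       string
int       number
float     number
True      true
False     false
None      null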
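    As a quick check of the Python-to-JSON direction, this sketch serializes one value of each kind with json.dumps. The dictionary contents are invented for illustration.

```python
import json

record = {
    "tags": ("a", "b"),   # tuple -> array
    "items": [1, 2],      # list  -> array
    "ratio": 0.5,         # float -> number
    "active": True,       # True  -> true
    "deleted": False,     # False -> false
    "note": None,         # None  -> null
}
encoded = json.dumps(record)
print(encoded)
```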

JSON to Python

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None
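    One consequence of the two tables is that a round trip is not always lossless: tuples come back as lists, and because JSON object keys are always strings, non-string dictionary keys come back as strings. A minimal sketch with made-up data:

```python
import json

original = {"point": (1, 2), 1: "int key"}

# Encode to JSON text, then decode it again.
restored = json.loads(json.dumps(original))
print(restored)
```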

25.2.2 Common json Module Methods

    For details on Python's built-in json module, see my earlier article: https://www.cnblogs.com/surpassme/p/13034972.html
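    As a brief refresher, the sketch below exercises the module's string-based functions (dumps, loads) and its file-based functions (dump, load). Here io.StringIO stands in for a real file, which is an assumption made only to keep the example self-contained.

```python
import io
import json

data = {"name": "Surpass", "age": 28}

# dumps/loads work with strings.
text = json.dumps(data, ensure_ascii=False, indent=2)
from_text = json.loads(text)

# dump/load work with file-like objects; StringIO stands in for an open file.
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)
from_file = json.load(buffer)

print(from_text == data, from_file == data)
```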

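25.3 Processing JSON Data with JSONPath

    The built-in json module is easy and convenient for simple JSON data, but it becomes laborious when the data is large and deeply nested. This section uses a third-party tool, JSONPath, to process such data.

25.3.1 What Is JSONPath

    JSONPath is an expression language for querying JSON data. It is commonly used to parse and process multi-level nested JSON, and its usage closely resembles XPath, the expression language for XML.

25.3.2 Installation

    Install it as follows: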

# pip install -U jsonpath

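25.3.3 JSONPath Syntax

    JSONPath syntax is very similar to XPath. The correspondence is shown below:

XPath    JSONPath            Description
/        $                   Root node/element
.        @                   Current node/element
/        . or []             Child element
..       n/a                 Parent element
//       ..                  Recursive descent through child elements
*        *                   Wildcard; matches all elements
@        n/a                 Attribute access; JSON data has no attributes
[]       []                  Subscript operator (supports simple operations such as indexing or selecting by content)
|        [,]                 Selects multiple items within a subscript
n/a      [start:end:step]    Array slice
[]       ?()                 Filter expression
n/a      ()                  Script expression
()       n/a                 Grouping; not supported by JSONPath

The table above comes from the official documentation: JSONPath - XPath for JSON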
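    To build intuition for the recursive-descent operator (..), here is a hand-written sketch in plain Python. It is not how any JSONPath library is implemented, the function name find_key is hypothetical, and it mimics only expressions of the form $..key on a trimmed-down document.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` at any depth (like $..key)."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the value may contain further matches.
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


doc = {
    "store": {
        "book": [
            {"author": "Nigel Rees", "price": 8.95},
            {"author": "Evelyn Waugh", "price": 12.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}
authors = list(find_key(doc, "author"))
prices = list(find_key(doc, "price"))
print(authors, prices)
```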

    Using the following sample data, we can compare the two:

{
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

XPath                   JSONPath                   Result
/store/book/author      $.store.book[*].author     Authors of all books under the book node
//author                $..author                  All authors
/store/*                $.store.*                  All elements of store, i.e. book and bicycle
/store//price           $.store..price             All prices in store
//book[3]               $..book[2]                 Everything about the third book
//book[last()]          $..book[(@.length-1)]      Everything about the last book
                        $..book[-1:]
//book[position()<3]    $..book[0,1]               The first two books
                        $..book[:2]
//book[isbn]            $..book[?(@.isbn)]         Books that have an isbn
//book[price<10]        $..book[?(@.price<10)]     Books with a price below 10
//*                     $..*                       All elements

Note: XPath indices start at 1, whereas JSONPath indices start at 0.
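    Because JSONPath indices are zero-based and its slice syntax resembles Python's, several of these queries have direct plain-Python analogues. The sketch below uses only list operations on a shortened copy of the book list. It does not call a JSONPath library; the comments merely name the expression each line corresponds to.

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

third = books[2]                                  # $..book[2]  (XPath: //book[3])
last = books[-1:]                                 # $..book[-1:]
first_two = books[:2]                             # $..book[:2]
with_isbn = [b for b in books if "isbn" in b]     # $..book[?(@.isbn)]
cheap = [b for b in books if b["price"] < 10]     # $..book[?(@.price<10)]

print(third["title"], [b["title"] for b in cheap])
```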

Online JSONPath practice site: JSONPath Online Evaluator

25.3.4 JSONPath Usage

    Its basic call form is as follows:

jsonPath(obj, expr [, args])

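    Its parameters are as follows:

obj (object|array):

    The JSON data object.

expr (string):

    The JSONPath expression.

args (object|undefined):

    Changes the output format, for example whether values or paths are returned.

    args.resultType accepts the following output formats: "VALUE", "PATH", "IPATH"

Return type (array|false):

    An array means that data was matched; false means that nothing matched.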
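    Because a failed match yields false rather than an empty array, looping over the result directly can fail. One possible way to guard against this is a small helper that normalizes the return value. The name matches_or_empty is hypothetical, and the sketch passes in stand-in values instead of calling a JSONPath library.

```python
from typing import Any, List, Union


def matches_or_empty(result: Union[List[Any], bool]) -> List[Any]:
    """Turn the False returned for 'no match' into an empty list."""
    return result if isinstance(result, list) else []


# Stand-ins for what a JSONPath call might return.
found = matches_or_empty(["Nigel Rees", "Evelyn Waugh"])
missing = matches_or_empty(False)

# Safe to iterate in both cases.
for author in found + missing:
    print(author)
```

    In practice the helper would wrap the query call itself, so that callers can always iterate over the result.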

25.3.5 Usage in Python

from jsonpath import jsonpath
import json

data = {
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

# Get the author of every book under the book node
getAllBookAuthor=jsonpath(data,"$.store.book[*].author")
print(f"getAllBookAuthor is :{json.dumps(getAllBookAuthor,indent=4)}")

# Get every author anywhere in the document
getAllAuthor=jsonpath(data,"$..author")
print(f"getAllAuthor is {json.dumps(getAllAuthor,indent=4)}")

# Get all elements of store, i.e. book and bicycle
getAllStoreElement=jsonpath(data,"$.store.*")
print(f"getAllStoreElement is {json.dumps(getAllStoreElement,indent=4)}")

# Get all prices in store (bracket and dot notation)
getAllStorePriceA=jsonpath(data,"$[store]..price")
getAllStorePriceB=jsonpath(data,"$.store..price")
print(f"getAllStorePrictA is {getAllStorePriceA}\ngetAllStorePriceB is {getAllStorePriceB}")

# Get everything about the third book
getThirdBookInfo=jsonpath(data,"$..book[2]")
print(f"getThirdBookInfo is {json.dumps(getThirdBookInfo,indent=4)}")

# Get everything about the last book
getLastBookInfo=jsonpath(data,"$..book[-1:]")
print(f"getLastBookInfo is {json.dumps(getLastBookInfo,indent=4)}")

# Get the first two books
getFirstAndSecondBookInfo=jsonpath(data,"$..book[:2]")
print(f"getFirstAndSecondBookInfo is {json.dumps(getFirstAndSecondBookInfo,indent=4)}")

# Filter: books that have an isbn
getWithFilterISBN=jsonpath(data,"$..book[?(@.isbn)]")
print(f"getWithFilterISBN is {json.dumps(getWithFilterISBN,indent=4)}")

# Filter: books with a price below 10
getWithFilterPrice=jsonpath(data,"$..book[?(@.price<10)]")
print(f"getWithFilterPrice is {json.dumps(getWithFilterPrice,indent=4)}")

# All elements
getAllElement=jsonpath(data,"$..*")
print(f"getAllElement is {json.dumps(getAllElement,indent=4)}")

# When nothing matches, the result is False
noMatchElement=jsonpath(data,"$..surpass")
print(f"noMatchElement is {noMatchElement}")

# Change the output format: return paths instead of values
controlleOutput=jsonpath(data,expr="$..author",result_type="PATH")
print(f"controlleOutput is {json.dumps(controlleOutput,indent=4)}")

    The final output is as follows:

getAllBookAuthor is :[
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllAuthor is [
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllStoreElement is [
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    }
]

getAllStorePrictA is [8.95, 12.99, 8.99, 22.99, 19.95]

getAllStorePriceB is [8.95, 12.99, 8.99, 22.99, 19.95]

getThirdBookInfo is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getLastBookInfo is [
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getFirstAndSecondBookInfo is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    }
]
getWithFilterISBN is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getWithFilterPrice is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]

getAllElement is [
    {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    },
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    },
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    },
    "reference",
    "Nigel Rees",
    "Sayings of the Century",
    8.95,
    "fiction",
    "Evelyn Waugh",
    "Sword of Honour",
    12.99,
    "fiction",
    "Herman Melville",
    "Moby Dick",
    "0-553-21311-3",
    8.99,
    "fiction",
    "J. R. R. Tolkien",
    "The Lord of the Rings",
    "0-395-19395-8",
    22.99,
    "red",
    19.95
]

noMatchElement is False

controlleOutput is [
    "$['store']['book'][0]['author']",
    "$['store']['book'][1]['author']",
    "$['store']['book'][2]['author']",
    "$['store']['book'][3]['author']"
]
