We've already learned about pickle, so why do we need another way to (de)serialize Python objects to (from) disk or a network connection? There are several good reasons to prefer JSON over pickle:
- When you unpickle data, you are essentially allowing your data source to execute arbitrary Python commands. If the data is trustworthy (say, stored in a sufficiently protected directory), that may not be a problem, but it is often very easy to accidentally leave a file unprotected (or to read something from the network). In these cases, you want to load data, not execute potentially malicious Python code!
- Pickled data is not easy to read, and virtually impossible for humans to write. For example, the pickled version of
{"reply": [42]}
looks like this:
(dp0
S'reply'
p1
(lp2
I42
as.
In contrast, the JSON representation of {"reply": [42]} is {"reply": [42]}. If you can read Python, you can read JSON, since JSON syntax is very close to Python literals (the main exceptions are true, false, and null).
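To make the first point concrete, here is a minimal and deliberately harmless sketch. The class name Sneaky is made up for illustration; the point is only that the pickled data, not your program, decides which callable runs during loading, whereas json.loads only ever builds plain data.

```python
import json
import pickle

class Sneaky(object):
    # __reduce__ tells pickle how to "rebuild" the object:
    # call the given callable with the given arguments.
    def __reduce__(self):
        return (sorted, ([3, 1, 2],))

payload = pickle.dumps(Sneaky())

# Unpickling calls sorted([3, 1, 2]) instead of rebuilding a Sneaky object.
# An attacker could name a far less friendly callable here.
result = pickle.loads(payload)
print(result)
# output: [1, 2, 3]

# json.loads only produces dicts, lists, strings, numbers, booleans and None.
data = json.loads('{"reply": [42]}')
print(data)
# output: {'reply': [42]}
```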
So how do you get the JSON representation of an object? It's simple, just call json.dumps:
import json
obj = {u"reply": [42.2], u"abs": 42}
print(json.dumps(obj))
# output: {"reply": [42.2], "abs": 42}
Often, you want to write to a file or network stream. In both Python 2.x and 3.x you can call dump to do that, but in 3.x the output must be a character stream, whereas 2.x expects a byte stream.
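As a rough sketch of the Python 3 case, the following writes to an in-memory character stream, and then to a byte stream through a text wrapper. The in-memory streams stand in for a real file or socket, and the UTF-8 encoding is my assumption, not something dump requires.

```python
import io
import json

obj = {"reply": [42.2], "abs": 42}

# In Python 3, json.dump writes str objects, so it needs a text stream.
text_stream = io.StringIO()
json.dump(obj, text_stream)
written = text_stream.getvalue()
print(written)

# A byte stream can be wrapped in a TextIOWrapper to get a text stream.
byte_stream = io.BytesIO()
wrapper = io.TextIOWrapper(byte_stream, encoding="utf-8")
json.dump(obj, wrapper)
wrapper.flush()  # push buffered text through to the underlying bytes
raw = byte_stream.getvalue()
print(type(raw).__name__)
```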
Let's look at how to load what we wrote. Fittingly, the function to load is called loads (to load from a string) or load (to load from a stream):
import json
obj_json = u'{"reply": [42.2], "abs": 42}'
obj = json.loads(obj_json)
print(repr(obj))
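For the stream variant, a small sketch under the assumption of Python 3, with an in-memory io.StringIO standing in for a real file. It also shows that input which is not valid JSON is rejected with a ValueError rather than being interpreted.

```python
import io
import json

obj = {"reply": [42.2], "abs": 42}

# load reads from a stream; here the stream is an in-memory text buffer.
stream = io.StringIO(json.dumps(obj))
loaded = json.load(stream)
print(loaded == obj)
# output: True

# Malformed input raises ValueError (json.JSONDecodeError is a subclass).
error = None
try:
    json.loads("{'reply': [42]}")  # single quotes are not valid JSON
except ValueError as exc:
    error = exc
print(type(error).__name__)
```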
When the objects we load and store grow larger, we puny humans often need some hints on where a new sub-object starts. To get these, simply pass an indent size, like this:
import json
obj = {u"reply": [42.2], u"abs": 42}
print(json.dumps(obj, indent=4))
Now the output is nicely indented (the key order depends on your Python version):
{
    "abs": 42,
    "reply": [
        42.2
    ]
}
I often use this indentation feature to debug complex data structures.
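One possible shape for such a debugging helper is sketched below. The name debug_dump is hypothetical, and falling back to repr for objects JSON cannot store is a choice I made for debugging output only; the result is not meant to be loaded back.

```python
import json

def debug_dump(obj):
    # sort_keys gives a stable key order; default=repr turns anything
    # JSON cannot store into its repr string instead of raising TypeError.
    return json.dumps(obj, indent=2, sort_keys=True, default=repr)

print(debug_dump({"reply": [42.2], "abs": 42, "when": complex(1, 2)}))
```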
The price of JSON's interoperability is that we cannot store arbitrary Python objects. In fact, JSON can only store the following objects:
- character strings
- numbers
- booleans (True/False)
- None
- lists
- dictionaries with character string keys
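A quick sketch of how these types map to JSON and back. Note two conversions that happen silently: tuples are written as lists, and non-string dictionary keys such as integers are written as strings, so a round trip does not always give back exactly what you started with.

```python
import json

print(json.dumps(["text", 1, 2.5, True, None]))
# output: ["text", 1, 2.5, true, null]

# Tuples become lists and the integer key becomes a string.
as_json = json.dumps({1: (1, 2)})
print(as_json)
# output: {"1": [1, 2]}

round_trip = json.loads(as_json)
print(round_trip)
# output: {'1': [1, 2]}
```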
Every object that is not one of these must be converted, and that includes every object of a custom class. Say we have an object alice as follows:
class Person(object):
    def __init__(self, name, password):
        self.name = name
        self.password = password
alice = Person('Alice A. Adams', 'secret')
Then converting this object to JSON will fail:
>>> import json
>>> json.dumps(alice)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/json/__init__.py", line 236, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.3/json/encoder.py", line 191, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.3/json/encoder.py", line 249, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.3/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <__main__.Person object at 0x7f2eccc88150> is not JSON serializable
Fortunately, there is a simple hook for conversion: define a function and pass it as the default argument:
def jdefault(o):
    return o.__dict__
print(json.dumps(alice, default=jdefault))
# outputs: {"password": "secret", "name": "Alice A. Adams"} (key order may vary)
o.__dict__ is a simple catch-all for user-defined objects, but we can also add support for other objects. For example, let's add support for sets by treating them like lists:
def jdefault(o):
    if isinstance(o, set):
        return list(o)
    return o.__dict__
pets = set([u'Tiger', u'Panther', u'Toad'])
print(json.dumps(pets, default=jdefault))
# outputs: ["Tiger", "Panther", "Toad"] (element order may vary)
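Converting to lists or dictionaries loses the original type, so loading gives back plain lists and dicts. If you need the objects back, one possible approach is to tag them on the way out and use the object_hook argument of json.loads on the way in. The tag names "__set__" and "__person__" and the helper names below are made up for this sketch, and it assumes your real data never uses those keys.

```python
import json

class Person(object):
    def __init__(self, name, password):
        self.name = name
        self.password = password

def to_jsonable(o):
    # Wrap unsupported objects in a tagged dictionary.
    if isinstance(o, set):
        return {"__set__": sorted(o)}
    if isinstance(o, Person):
        return {"__person__": o.__dict__}
    raise TypeError(repr(o) + " is not JSON serializable")

def from_jsonable(d):
    # Called for every decoded JSON object (dict), innermost first.
    if "__set__" in d:
        return set(d["__set__"])
    if "__person__" in d:
        return Person(**d["__person__"])
    return d

pets = set([u'Tiger', u'Panther', u'Toad'])
alice = Person('Alice A. Adams', 'secret')

encoded = json.dumps({"pets": pets, "owner": alice}, default=to_jsonable)
decoded = json.loads(encoded, object_hook=from_jsonable)

print(decoded["pets"] == pets)
# output: True
print(decoded["owner"].name)
# output: Alice A. Adams
```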
For more options and details (ensure_ascii and sort_keys may be interesting options to set), have a look at the official documentation for JSON. JSON is available by default in Python 2.6 and newer; before that, you can use simplejson as a fallback.
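To give a feel for the two options mentioned above, here is a short sketch. The sample data is invented; sort_keys orders the keys alphabetically, and ensure_ascii (on by default) escapes every non-ASCII character.

```python
import json

obj = {"name": u"Zo\u00eb", "abs": 42}

# Default ensure_ascii=True: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(obj, sort_keys=True)
print(escaped)
# output: {"abs": 42, "name": "Zo\u00eb"}

# ensure_ascii=False keeps the character itself, giving a shorter string.
unescaped = json.dumps(obj, sort_keys=True, ensure_ascii=False)
print(len(escaped), len(unescaped))
```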