Thursday, March 28, 2024

Storing and Loading Data with JSON


We’ve already learned about pickle, so why do we need another way to (de)serialize Python objects to (or from) disk or a network connection? There are good reasons to prefer JSON over pickle:

  • When you’re unpickling data, you’re essentially allowing your data source to execute arbitrary Python commands. If the data is trustworthy (say, stored in a sufficiently protected directory), that may not be a problem, but it’s often really easy to accidentally leave a file unprotected (or to read something from the network). In these cases, you want to load data, and not execute potentially malicious Python code!
  • Pickled data is not easy to read, and virtually impossible for humans to write. For example, the pickled version of {"answer": [42]} looks like this:
  (dp0
  S'answer'
  p1
  (lp2
  I42
  as.

In contrast, the JSON representation of {"answer": [42]} is {"answer": [42]}. If you can read Python, you can read JSON, since JSON looks almost exactly like Python literals; the main differences are the JSON literals true, false, and null.
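To make the first point concrete, here is a minimal sketch with a deliberately harmless payload. The Payload class is hypothetical and not part of the examples in this article; it uses __reduce__ to tell pickle to call the built-in len when the data is loaded. A real attacker would name a far less friendly callable.

```python
import json
import pickle


class Payload(object):
    """Hypothetical class for illustration only."""

    def __reduce__(self):
        # Instructs pickle to call len("abc") at load time.
        # Any importable callable could be named here instead.
        return (len, ("abc",))


blob = pickle.dumps(Payload())

# Loading runs the callable chosen by the data, so we get its return
# value back rather than a Payload instance.
unpickled = pickle.loads(blob)
print(repr(unpickled))

# json.loads only ever builds plain data (dicts, lists, strings, numbers, ...).
parsed = json.loads('{"answer": [42]}')
print(repr(parsed))
```

The JSON parser has no equivalent of __reduce__: whatever the input says, the result is made of plain data types.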

So how do you get the JSON representation of an object? It’s simple, just call json.dumps:

import json
obj = {u"answer": [42.2], u"abs": 42}
print(json.dumps(obj))
# output:  {"answer": [42.2], "abs": 42}

Often, you want to write to a file or network stream. In both Python 2.x and 3.x you can call dump to do that, but in 3.x the output must be a character stream, whereas 2.x expects a byte stream.
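As a small sketch of dump under Python 3 (an assumption; under 2.x you would need a byte stream instead), the following writes to an in-memory text stream. io.StringIO merely stands in for an open text file here.

```python
import io
import json

obj = {u"answer": [42.2], u"abs": 42}

# io.StringIO is a character stream, which is what json.dump
# expects under Python 3.
stream = io.StringIO()
json.dump(obj, stream)

written = stream.getvalue()
print(written)
```

With a real file, the same call would look like json.dump(obj, f) inside a with open(path, "w") block, where path is whatever file you want to write.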

Let’s look at how to load what we wrote. Fittingly, the function to load is called loads (to load from a string) or load (to load from a stream):

import json
obj_json = u'{"answer": [42.2], "abs": 42}'
obj = json.loads(obj_json)
print(repr(obj))
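The stream variant works the same way. This sketch again assumes Python 3 and uses io.StringIO in place of an open file:

```python
import io
import json

# Any object with a read() method that returns text will do.
stream = io.StringIO(u'{"answer": [42.2], "abs": 42}')
obj = json.load(stream)
print(repr(obj))
```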

When the objects we load and store grow larger, we puny humans often need some hints on where a new sub-object starts. To get these, simply pass an indent size, like this:

import json
obj = {u"answer": [42.2], u"abs": 42}
print(json.dumps(obj, indent=4))

Now, the output will be a beautiful

{
    "answer": [
        42.2
    ],
    "abs": 42
}

I often use this indentation feature to debug complex data structures.

The price of JSON’s interoperability is that we cannot store arbitrary Python objects. In fact, JSON can only store the following objects:

  • character strings
  • numbers
  • booleans (True/False)
  • None
  • lists
  • dictionaries with character string keys
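A quick round trip shows how this restriction plays out in practice. The dictionary below is made up for illustration; note that the json module quietly turns the tuple into a list and the integer key into a string, so what you load is not always identical to what you stored.

```python
import json

# Illustrative data: a boolean, None, a tuple, and a non-string key.
original = {"flag": True, "nothing": None, "pair": (1, 2), 7: "seven"}

text = json.dumps(original)
print(text)

restored = json.loads(text)
print(repr(restored))
```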

Every object that’s not one of these must be converted, and that includes every object of a custom class. Say we have an object alice as follows:

class Person(object):
    def __init__(self, name, password):
        self.name = name
        self.password = password
alice = Person('Alice A. Adams', 'secret')

then converting this object to JSON will fail:

>>> import json
>>> json.dumps(alice)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/json/__init__.py", line 236, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.3/json/encoder.py", line 191, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.3/json/encoder.py", line 249, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.3/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <__main__.Person object at 0x7f2eccc88150> is not JSON serializable

Fortunately, there is a simple hook for conversion: simply define a function and pass it as the default argument:

def jdefault(o):
    return o.__dict__
print(json.dumps(alice, default=jdefault))
# outputs: {"name": "Alice A. Adams", "password": "secret"}
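Keep in mind that the class itself is not stored. The sketch below repeats the Person example and shows one possible way to get an object back after loading; passing the loaded dictionary to the constructor is an assumption that only works because the attribute names match the constructor arguments.

```python
import json


class Person(object):
    def __init__(self, name, password):
        self.name = name
        self.password = password


def jdefault(o):
    return o.__dict__


text = json.dumps(Person('Alice A. Adams', 'secret'), default=jdefault)

# Loading yields a plain dict, not a Person.
data = json.loads(text)
print(type(data).__name__)

# Rebuild the object by hand from the loaded dict.
restored = Person(**data)
print(restored.name)
```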

o.__dict__ is a simple catch-all for user-defined objects, but we can also add support for other objects. For example, let’s add support for sets by treating them like lists:

def jdefault(o):
    if isinstance(o, set):
        return list(o)
    return o.__dict__

pets = set([u'Tiger', u'Panther', u'Toad'])
print(json.dumps(pets, default=jdefault))
# outputs something like: ["Tiger", "Panther", "Toad"]
# (sets are unordered, so the element order may vary)
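One caveat: objects without a __dict__ (a complex number, for instance) make the catch-all fail with an AttributeError. A slightly stricter variant, sketched here under the hypothetical name jdefault_strict, raises the usual TypeError instead and sorts set elements so the output is stable. Sorting assumes the elements can be compared with each other.

```python
import json


def jdefault_strict(o):
    # Hypothetical helper, not part of the json module.
    if isinstance(o, (set, frozenset)):
        return sorted(o)  # stable order; assumes comparable elements
    if hasattr(o, "__dict__"):
        return o.__dict__
    raise TypeError(repr(o) + " is not JSON serializable")


pets = set([u'Tiger', u'Panther', u'Toad'])
pets_json = json.dumps(pets, default=jdefault_strict)
print(pets_json)

try:
    json.dumps(1 + 2j, default=jdefault_strict)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```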

For more options and details (ensure_ascii and sort_keys may be interesting options to set), have a look at the official documentation for JSON. JSON is available by default in Python 2.6 and newer; before that, you can use simplejson as a fallback.
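As a brief illustration of those two options (the extra "name" entry is made up for this sketch): sort_keys=True emits keys in sorted order, and ensure_ascii=False writes non-ASCII characters as they are instead of as \uXXXX escapes.

```python
import json

obj = {u"answer": [42.2], u"abs": 42, u"name": u"Zo\u00eb"}

# Keys sorted alphabetically; non-ASCII characters escaped (the default).
sorted_text = json.dumps(obj, sort_keys=True)
print(sorted_text)

# Same, but non-ASCII characters are kept verbatim in the output.
unicode_text = json.dumps(obj, sort_keys=True, ensure_ascii=False)
print(unicode_text)
```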
