Saturday, May 4, 2024

Getting started with RocksDB and Python


In this post, I’m going to discuss RocksDB.

RocksDB is an embeddable persistent key-value store developed by Facebook. It was originally forked from LevelDB, which was created by Google.

According to Wikipedia:

RocksDB is a high performance embedded database for key-value data. It is a fork of Google’s LevelDB optimized to exploit many CPU cores, and make efficient use of fast storage, such as solid-state drives (SSD), for input/output (I/O) bound workloads. It is based on a log-structured merge-tree (LSM tree) data structure. It is written in C++ and provides official language bindings for C++, C, and Java; alongside many third-party language bindings.

RocksDB has been optimized particularly for flash drives and fast storage, for low-latency data access. Like Redis, RocksDB also keeps data in memory, but unlike Redis it is not a server; it is an embeddable library similar to SQLite. RocksDB is used extensively for storing persistent data on SSD at Facebook and by various services that serve online queries on hard drives.

The possibilities of RocksDB usage are endless. For example, you may use it as a storage engine that holds the data needed to generate a custom home page for each user. Instead of making multiple SELECT queries per user that burden the DB, you may store that data in key/value format, where the UserID serves as the key that holds all the data in a format like JSON.
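To make the home page idea concrete, here is a minimal sketch of keeping one JSON document per user under a key derived from the user ID. The `InMemoryStore` class, the `user:<id>` key layout, and the helper names are all hypothetical; the stand-in only mimics the bytes-based `put`/`get` calls used later in this post, so the snippet runs without RocksDB installed.

```python
import json


class InMemoryStore:
    """Hypothetical stand-in exposing put/get on bytes, like a key-value store."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        # Returns None when the key is absent.
        return self._data.get(key)


def save_home_page(store, user_id: int, page: dict) -> None:
    # Keys and values are bytes, so encode both the ID and the JSON document.
    key = f"user:{user_id}".encode("utf-8")
    store.put(key, json.dumps(page).encode("utf-8"))


def load_home_page(store, user_id: int):
    raw = store.get(f"user:{user_id}".encode("utf-8"))
    return None if raw is None else json.loads(raw.decode("utf-8"))


store = InMemoryStore()
save_home_page(store, 42, {"name": "Alice", "widgets": ["news", "weather"]})
page = load_home_page(store, 42)
print(page["widgets"])
```

One lookup by key replaces the several queries that would otherwise be needed to assemble the page. In principle a real database handle with the same `put`/`get` methods could be passed in place of the stand-in.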

In this post, I’m discussing basic RocksDB usage, independent of any particular use case, and how you can use it in your Python applications.

Installation and Setup

In order to use RocksDB in Python, you must have RocksDB installed on your system; then, with the help of RocksDB’s Python binding, you may access RocksDB in your programs. Since I didn’t want to mess with my Mac environment, I downloaded a Debian-based Python image and installed RocksDB in it. Below are the steps. First, install the required dependencies and RocksDB itself:

apt install rocksdb-tools librocksdb5.17 librocksdb-dev libsnappy-dev liblz4-dev

At the time of writing, librocksdb5.17 was the latest version available to me in Docker.

and then

pip install python-rocksdb

I have connected my VS Code to the remote container so that I can code directly within the container. Use this VSCode extension for this purpose and attach your container. This is how my VSCode looks after attaching it to a remote container and selecting a remote Docker-based Python interpreter.

Sweet, isn’t it?

Development

Let’s import the library and see whether it really works or not

import rocksdb

if __name__ == "__main__":
    print(rocksdb)

If things are properly installed, it will output something like the following:

root@9f7d3fc73b74:/code# /usr/local/bin/python /code/main.py
<module 'rocksdb' from '/usr/local/lib/python3.9/site-packages/rocksdb/__init__.py'>

Let’s move forward

import rocksdb

if __name__ == "__main__":
    db = rocksdb.DB("test.db", rocksdb.Options(create_if_missing=True))
    db.put(b"a", b"ROFL")
    print(db.get(b"a").decode("utf-8"))

The first line opens the DB with certain options. Here, I have set create_if_missing to True to avoid file-not-found errors. Then, I set the key a to the text ROFL. Notice that I’m using the bytes type (the b prefix) for both keys and values: RocksDB works with byte streams rather than strings or other data types. I later converted the value into a str by calling decode('utf-8').
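Since everything has to be bytes, a couple of small conversion helpers can keep the encoding in one place. The following is only a sketch; the helper names and the choice of UTF-8 and 8-byte big-endian integers are my own assumptions, not something the binding requires.

```python
def to_bytes(text: str) -> bytes:
    # Encode a str as UTF-8 bytes, suitable for use as a key or a value.
    return text.encode("utf-8")


def to_text(raw: bytes) -> str:
    # Decode UTF-8 bytes back into a str.
    return raw.decode("utf-8")


def encode_int(number: int) -> bytes:
    # Fixed-width, big-endian encoding; works for non-negative integers only.
    return number.to_bytes(8, "big")


def decode_int(raw: bytes) -> int:
    return int.from_bytes(raw, "big")


key = to_bytes("a")
value = to_bytes("ROFL")
print(key, value, to_text(value))
print(decode_int(encode_int(2024)))
```

With helpers like these, a call such as db.put(to_bytes("a"), to_bytes("ROFL")) reads the same as the version with b"" literals, but also works for strings that are only known at runtime.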

Let’s see what happens in the folder where the DB was created. The first thing I noticed, which surprised me, was that test.db was not actually a file but a folder.

root@9f7d3fc73b74:/code# ls -la
total 16
drwxr-xr-x 3 root root 4096 Oct 3 15:24 .
drwxr-xr-x 1 root root 4096 Oct 3 13:37 ..
-rw-r--r-- 1 root root 180 Oct 3 15:20 main.py
drwxr-xr-x 2 root root 4096 Oct 3 15:24 test.db
root@9f7d3fc73b74:/code#

When you execute cd test.db and list the files, it shows the following:

root@9f7d3fc73b74:/code/test.db# ls -l
total 152
-rw-r--r-- 1 root root 27 Oct 3 15:24 000003.log
-rw-r--r-- 1 root root 16 Oct 3 15:24 CURRENT
-rw-r--r-- 1 root root 37 Oct 3 15:24 IDENTITY
-rw-r--r-- 1 root root 0 Oct 3 15:24 LOCK
-rw-r--r-- 1 root root 15695 Oct 3 15:24 LOG
-rw-r--r-- 1 root root 13 Oct 3 15:24 MANIFEST-000001
-rw-r--r-- 1 root root 4721 Oct 3 15:24 OPTIONS-000005

It contains a log file, an options file, and a few more. Let’s view the content of the 000003.log file.

root@9f7d3fc73b74:/code/test.db# cat 000003.log 
���aROFLroot@9f7d3fc73b74:/code/test.db#

As you can see, it’s not storing data in plain-text format. This file is the write-ahead log, and you can clearly spot a (the key) and ROFL (the value) inside its binary records. CURRENT names the latest manifest file.

root@9f7d3fc73b74:/code/test.db# cat CURRENT 
MANIFEST-000001

IDENTITY holds a unique identifier for the database. In my case it shows:

root@9f7d3fc73b74:/code/test.db# cat IDENTITY 
c55f9d31-f622-4335-8cbf-f3ca9ce324ef

Then there is a 0-byte LOCK file. In RocksDB only a single process can open the database for writing at a time, hence only a single process can write data. The LOG file, as the name suggests, holds RocksDB’s informational log messages. MANIFEST-000001 didn’t have anything readable in it. The next file is OPTIONS-000005, which lists all the available options with their current values. Running grep on it shows the following:

root@9f7d3fc73b74:/code/test.db# cat OPTIONS-000005 | grep missing
  create_missing_column_families=false
  create_if_missing=true

As you can see, create_if_missing is set to true, which is what we passed when opening the DB.
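The same check can be done from Python instead of grep. This is a rough sketch that treats the options file as plain name=value lines; the function name is hypothetical, and the sample text reuses the two lines from the grep output above plus one made-up line (some_other_option) that is there only to show a non-match.

```python
def find_options(options_text: str, needle: str) -> dict:
    """Return {name: value} for every name=value line containing needle."""
    matches = {}
    for line in options_text.splitlines():
        line = line.strip()
        if needle in line and "=" in line:
            name, value = line.split("=", 1)
            matches[name.strip()] = value.strip()
    return matches


# Two lines taken from the grep output, plus one hypothetical line.
sample = """
  create_missing_column_families=false
  create_if_missing=true
  some_other_option=1
"""

found = find_options(sample, "missing")
print(found)
```

To run it against the real file, read test.db/OPTIONS-000005 into a string and pass that in place of sample.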

If you are further interested in the internals, you may visit this Wiki page.

Similarly, you may delete a key.

db.delete(b"a")
print(db.get(b"a").decode("utf-8"))

Running it outputs the following error:

root@9f7d3fc73b74:/code/test.db# /usr/local/bin/python /code/main.py
ROFL
Traceback (most recent call last):
  File "/code/main.py", line 8, in <module>
    print(db.get(b"a").decode("utf-8"))
AttributeError: 'NoneType' object has no attribute 'decode'
root@9f7d3fc73b74:/code/test.db#

Since the key a had already been removed, db.get returned None, and calling decode on None threw the exception. The C++ and Java ports also provide TtlDB, which lets you set an expiry for keys, a feature you can use to turn RocksDB into a web cache. Unfortunately, it is still not available in the Python bindings.
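A small guard avoids this error. The sketch below checks for None before decoding; `FakeDB` is a hypothetical stand-in that, like the behaviour seen in the traceback above, returns None from `get` for a missing key, so the snippet runs without RocksDB. The helper name `get_text` is also my own.

```python
class FakeDB:
    """Hypothetical stand-in: get() returns None for a missing key."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)

    def get(self, key: bytes):
        return self._data.get(key)


def get_text(db, key: bytes, default=None):
    # Decode only when a value was actually found.
    raw = db.get(key)
    if raw is None:
        return default
    return raw.decode("utf-8")


db = FakeDB()
db.put(b"a", b"ROFL")
first = get_text(db, b"a")
db.delete(b"a")
second = get_text(db, b"a", default="<missing>")
print(first, second)
```

Passing an explicit default keeps the calling code free of None checks when a placeholder value is acceptable.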

Conclusion

In this post, I introduced RocksDB as a key/value store. RocksDB is also used as a storage engine by DB systems like ArangoDB, MyRocks (a MySQL storage engine based on RocksDB), CockroachDB, and others. You may use it as a replacement for Memcache if you are using flash storage devices. Here is a comprehensive list of RocksDB usage.
