Series Index
Python and Go: Part I – gRPC
Python and Go: Part II – Extending Python With Go
Python and Go: Part III – Packaging Python Code
Python and Go: Part IV – Using Python in Memory
Introduction
In the previous post we saw how a Go service can call a Python service using gRPC. Using gRPC to connect a Go and a Python program can be a great choice, but there’s a complexity cost that goes with it. You need to manage one more service, deployment becomes more complex, and you need monitoring plus alerting for each service. Compared to a monolithic application, there’s an order of magnitude more complexity.
In this post, we’re going to reduce the complexity of using gRPC by writing a shared library in Go that a Python program can consume directly. With this approach, there’s no networking involved and, depending on the data types, no marshalling either. Out of the several approaches to calling functions from a shared library in Python, we decided to use Python’s ctypes module.
Note: ctypes uses libffi under the hood. If you want to read some really scary C code, head over to the repo and start reading. 🙂
I’ll also show my workflow, which is one of the big factors in my productivity. We’ll first write “pure” Go code, then write code to export it to a shared library. Then we’ll switch to the Python world and use Python’s interactive prompt to play around with the code. Once we’re happy, we’ll use what we’ve learned in the interactive prompt to write a Python module.
Example: Checking the Digital Signature of Several Files in Parallel
Imagine you have a directory with data files, and you need to validate the integrity of these files. The directory contains a sha1sum.txt file with a sha1 digital signature for every file. Go, with its concurrency primitives and ability to use all the cores of your machine, is much better suited to this task than Python.
Listing 1: sha1sum.txt
6659cb84ab403dc85962fc77b9156924bbbaab2c httpd-00.log
5693325790ee53629d6ed3264760c4463a3615ee httpd-01.log
fce486edf5251951c7b92a3d5098ea6400bfd63f httpd-02.log
b5b04eb809e9c737dbb5de76576019e9db1958fd httpd-03.log
ff0e3f644371d0fbce954dace6f678f9f77c3e08 httpd-04.log
c154b2aa27122c07da77b85165036906fb6cbc3c httpd-05.log
28fccd72fb6fe88e1665a15df397c1d207de94ef httpd-06.log
86ed10cd87ac6f9fb62f6c29e82365c614089ae8 httpd-07.log
feaf526473cb2887781f4904bd26f021a91ee9eb httpd-08.log
330d03af58919dd12b32804d9742b55c7ed16038 httpd-09.log
Listing 1 shows an example of a digital signature file. It provides hash codes for all the different log files contained in the directory. This file can be used to verify that a log file was downloaded correctly or has not been tampered with. We’ll write Go code to calculate the hash code of each log file and then match it against the hash code listed in the digital signature file.
To speed this process up, we’ll calculate the digital signature of each file in a separate goroutine, spreading the work across all of the CPUs on our machine.
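To make the check itself concrete, here is a minimal sketch of the per-file comparison, written in Python purely for illustration. The function names (parse_signatures, file_sha1, mismatches) are hypothetical and not part of the project, and the sketch assumes every line of sha1sum.txt is a digest followed by whitespace and a file name, as in the listing above. The Go code in this post does the same comparison, but with one goroutine per file.

```python
import hashlib
from pathlib import Path


def parse_signatures(text):
    """Map file name -> expected sha1 hex digest (one "digest name" pair per line)."""
    sigs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, name = line.split(maxsplit=1)
        sigs[name.strip()] = digest
    return sigs


def file_sha1(path):
    """Compute the sha1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, 'rb') as fp:
        for chunk in iter(lambda: fp.read(64 * 1024), b''):
            h.update(chunk)
    return h.hexdigest()


def mismatches(root_dir):
    """Return the sorted names of files whose digest differs from sha1sum.txt."""
    root = Path(root_dir)
    sigs = parse_signatures((root / 'sha1sum.txt').read_text())
    return sorted(
        name for name, digest in sigs.items()
        if file_sha1(root / name) != digest
    )
```

This version checks the files one after the other; spreading that loop across cores is the part the post hands to Go.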
Architecture Overview & Work Plan
On the Python side of the code, we’re going to write a function named check_signatures and on the Go side, we’re going to write a function (that does the actual work) named CheckSignatures. In between these two functions, we’ll use the ctypes module (on the Python side) and write a verify function (on the Go side) to provide marshaling support.
Figure 1
Figure 1 shows the flow of data from the Python function to the Go function and back.
Here are the steps we’re going to follow for the rest of the post:
- Write Go code (CheckSignatures)
- Export it to the shared library (verify)
- Use ctypes in the Python interactive prompt to call the Go code
- Write and package the Python code (check_signatures). We’ll do this part in the next blog post (this one is already long enough).
Go Code – The “CheckSignatures” Function
I’m not going to break down all of the Go source code here. If you’re curious to see all of it, take a look at this source code file.
The important part of the code to see now is the definition of the CheckSignatures function.
Listing 2: CheckSignatures function definition
// CheckSignatures calculates sha1 signatures for files in rootDir and compares
// them with signatures found at "sha1sum.txt" in the same directory. It will
// return an error if one of the signatures doesn't match
func CheckSignatures(rootDir string) error {
Listing 2 shows the definition of the CheckSignatures function. This function will spin up a goroutine per file to check whether the calculated sha1 signature of that file matches the one in “sha1sum.txt”. If there’s a mismatch in one or more files, the function will return an error.
Exporting Go Code to a Shared Library
With the Go code written and tested, we can move on to exporting it to a shared library.
Here are the steps we’ll follow in order to compile the Go source code into a shared library so Python can call it:
- Import the C package (aka cgo)
- Use the //export directive on every function we need to expose
- Have an empty main function
- Build the source code with the special -buildmode=c-shared flag
Note: Apart from the Go toolchain, we’ll also need a C compiler (such as gcc) on your machine. There’s a good free C compiler for each of the major platforms: gcc for Linux, clang on OSX (via XCode) and Visual Studio for Windows.
Listing 3: export.go
01 package main
02
03 import "C"
04
05 //export verify
06 func verify(root *C.char) *C.char {
07     rootDir := C.GoString(root)
08     if err := CheckSignatures(rootDir); err != nil {
09         return C.CString(err.Error())
10     }
11
12     return nil
13 }
14
15 func main() {}
Listing 3 shows the export.go file from the project. We import “C” on line 03 and then on line 05, the verify function is marked to be exported in the shared library. It’s important that the comment is written exactly as shown. You can see on line 06 that the verify function accepts a C based string pointer using the C package char type. For Go code to work with C strings, the C package provides a GoString function (which is used on line 07) and a CString function (which is used on line 09). Finally, an empty main function is declared at the end.
To build the shared library, you need to run the go build command with a special flag.
Listing 4: Building the Shared Library
$ go build -buildmode=c-shared -o _checksig.so
Listing 4 shows the command to generate the C based shared library, which will be named _checksig.so.
Note: The reason for using _ is to avoid a name collision with the checksig.py Python module that we’ll show later. If the shared library was named checksig.so then executing import checksig in Python would load the shared library instead of the Python file.
Preparing the Test Data
Before we can try calling verify from Python, we need some data. You’ll find a directory called logs in the code repository. This directory contains some log files and a sha1sum.txt file.
Note: The signature for httpd-08.log is intentionally wrong.
On my machine, this directory is located at /tmp/logs.
A Python Session
I love the interactive shell in Python; it lets me play around with code in small chunks. Once I have a working version, I write the Python code in a file.
Listing 5: A Python Session
$ python
Python 3.8.3 (default, May 17 2020, 18:15:42)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
01 >>> import ctypes
02 >>> so = ctypes.cdll.LoadLibrary('./_checksig.so')
03 >>> verify = so.verify
04 >>> verify.argtypes = [ctypes.c_char_p]
05 >>> verify.restype = ctypes.c_void_p
06 >>> free = so.free
07 >>> free.argtypes = [ctypes.c_void_p]
08 >>> ptr = verify('/tmp/logs'.encode('utf-8'))
09 >>> out = ctypes.string_at(ptr)
10 >>> free(ptr)
11 >>> print(out.decode('utf-8'))
12 "/tmp/logs/httpd-08.log" - mismatch
Listing 5 shows an interactive Python session that walks you through testing the shared library we wrote in Go. On line 01 we import the ctypes module to start. Then on line 02, we load the shared library into memory. On lines 03-05 we load the verify function from the shared library and set the input and output types. Lines 06-07 load the free function so we can free the memory allocated by Go (see more below).
Line 08 is the actual function call to verify. We need to convert the directory name to Python’s bytes before passing it to the function. The return value, which is a C string, is stored in ptr. On line 09, we convert the C string to a Python bytes object and on line 10 we free the memory allocated by Go. Finally, on line 11, we convert out from bytes to str before printing it.
Note: Line 02 assumes the shared library, _checksig.so, is in the current directory. If you started the Python session elsewhere, change the path to _checksig.so on line 02.
With very little effort we’re able to call Go code from Python.
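The session only exercises the failing case. When every signature matches, the Go function returns nil, which ctypes reports as None for a c_void_p return type, and that value should not be handed to ctypes.string_at or free. Below is a minimal sketch of the same steps with that case handled. The helper names load_checksig and call_verify are hypothetical, chosen for illustration only, and are not the check_signatures function planned for the next post.

```python
import ctypes
import os


def load_checksig(path='./_checksig.so'):
    """Load the shared library and declare the C signatures used in the session."""
    so = ctypes.cdll.LoadLibrary(path)
    so.verify.argtypes = [ctypes.c_char_p]
    so.verify.restype = ctypes.c_void_p  # raw address, so it can be freed later
    so.free.argtypes = [ctypes.c_void_p]
    so.free.restype = None
    return so


def call_verify(so, root_dir):
    """Return the error message from verify, or None if all signatures match."""
    ptr = so.verify(root_dir.encode('utf-8'))
    if not ptr:  # Go returned nil: no error, nothing to copy or free
        return None
    try:
        # Copy the C string into Python-owned memory before releasing it.
        return ctypes.string_at(ptr).decode('utf-8')
    finally:
        so.free(ptr)  # the C string was allocated with malloc by C.CString


# Only runs where the shared library has already been built.
if os.path.exists('./_checksig.so'):
    print(call_verify(load_checksig(), '/tmp/logs'))
```

Passing the loaded library in as an argument keeps call_verify easy to try out with a stand-in object before the real library is built.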
Intermezzo: Sharing Memory Between Go & Python
Both Python & Go have a garbage collector that will automatically free unused memory. However, having a garbage collector doesn’t mean you can’t leak memory.
Note: You should read Bill’s Garbage Collection In Go blog posts. They will give you a good understanding of garbage collectors in general and of the Go garbage collector in particular.
You need to be extra careful when sharing memory between Go and Python (or C). Sometimes it’s not clear when a memory allocation happens. In export.go on line 09, we have the following code:
Listing 6: Converting Go Error to C String
return C.CString(err.Error())
The documentation for C.CString says:
Listing 7
> // Go string to C string
> // The C string is allocated in the C heap using malloc.
> // **It is the caller's responsibility to arrange for it to be
> // freed**, such as by calling C.free (be sure to include stdlib.h
> // if C.free is needed).
> func C.CString(string) *C.char
To avoid a memory leak in our interactive prompt, we loaded the free function and used it to free the memory allocated by Go.
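This is also why the session sets the return type to ctypes.c_void_p rather than ctypes.c_char_p. With c_char_p, ctypes converts the result to a Python bytes object and the original address is gone, so there is nothing left to pass to free. The sketch below shows the difference without needing the Go library. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and it uses strdup as a stand-in for C.CString, since both return a string allocated with malloc.

```python
import ctypes

# Assumption: POSIX system; CDLL(None) gives access to the C library's symbols.
libc = ctypes.CDLL(None)

libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None
libc.strdup.argtypes = [ctypes.c_char_p]

# With c_char_p, ctypes copies the C string into bytes and drops the address.
libc.strdup.restype = ctypes.c_char_p
lost = libc.strdup(b'mismatch')
print(type(lost).__name__)  # a bytes object: this small allocation can no longer be freed

# With c_void_p, we get the raw address back and stay in control of the memory.
libc.strdup.restype = ctypes.c_void_p
ptr = libc.strdup(b'mismatch')
copied = ctypes.string_at(ptr)  # copy into Python-owned bytes first
libc.free(ptr)                  # then release the C allocation
print(copied)
```

The first call leaks a few bytes on purpose to show the problem; the second is the copy-then-free pattern used in the interactive session.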
Conclusion
With very little code, you can use Go from Python. Unlike the previous installment, there’s no RPC step, meaning you don’t need to marshal and unmarshal parameters on every function call and there’s no network involved either. Calling from Python to C this way is much faster than a gRPC call. On the other hand, you need to be more careful with memory management and the packaging process is more complex.
Note: A simple benchmark on my machine clocks a gRPC function call at 128µs vs a shared library call at 3.61µs, which is about 35 times faster.
I hope that you’ve found this style of writing code appealing: pure Go first, then exporting, and then trying it out in an interactive session. I urge you to try this workflow yourself next time you write some code.
In the next installment, we’ll finish the last step of my workflow and package the Python code as a module.