Friday, April 26, 2024
HomePythonPython dis module and fixed folding

Python dis module and fixed folding


Hello individuals! Just lately, I used to be tremendous confused once I discovered that:

>>> pow(3,89)

runs slower than:

>>> 3**89

I attempted to think about an appropriate reply however couldn’t discover any. I timed the execution of each of those statements utilizing the timeit module in Python3:

$ python3 -m timeit 'pow(3,89)'
500000 loops, better of 5: 688 nsec per loop

$ python3 -m timeit '3**89'
500000 loops, better of 5: 519 nsec per loop

The distinction shouldn’t be huge. It’s only 0.1µs nevertheless it was nonetheless bugging me. If I can’t clarify one thing in programming, I often find yourself having sleepless nights 😅

I discovered the reply by the Python IRC channel on Freenode. The rationale why pow is barely slower is that in CPython there may be a further step of loading pow from the namespace. Whereas, in 3**9 there isn’t any such loading required. This additionally implies that the distinction will stay roughly fixed even when the enter numbers get larger and greater.

The speculation is true:

$ python3 -m timeit 'pow(3,9999)'
5000 loops, better of 5: 58.5 usec per loop

$ python3 -m timeit '3**9999'
5000 loops, better of 5: 57.3 usec per loop

Through the technique of exploring the answer to this query I additionally bought to be taught in regards to the dis module. It lets you decompile the Python Bytecode and examine it. This was a brilliant thrilling discovery primarily as a result of I’m studying extra about reverse engineering binaries now-a-days and this module suits proper in.

I disassembled the bytecode of the above statements like this in Python:

>>> import dis
>>> dis.dis('pow(3,89)')
#  1           0 LOAD_NAME                0 (pow)
#              2 LOAD_CONST               0 (3)
#              4 LOAD_CONST               1 (89)
#              6 CALL_FUNCTION            2
#              8 RETURN_VALUE

>>> dis.dis('3**64')
#  1           0 LOAD_CONST               0 (3433683820292512484657849089281)
#              2 RETURN_VALUE

>>> dis.dis('3**65')
#  1           0 LOAD_CONST               0 (3)
#              2 LOAD_CONST               1 (65)
#              4 BINARY_POWER
#              6 RETURN_VALUE

You may find out about the right way to perceive the output of dis.dis by studying this reply on Stackoverflow

Okay again to the code. The disassembly of pow is smart. It’s loading pow from the namespace after which loading 3 and 89 to registers and at last calling the pow operate. However why does the output of the following two disassemblies differ from one another? The one factor we modified is the exponent worth from 64 to 65!

This query launched me to a different new idea of “fixed folding“. It simply implies that when we’ve a continuing expression Python evaluates the worth of that expression at compile time in order that once you truly run this system it doesn’t take as lengthy to run as a result of Python makes use of the already computed worth. Consider it like this:

def one_plue_one():
    return 1+1

# --vs--

def one_plue_one():
    return 2

Python compiles the primary operate to the second and makes use of that once you run the code. Neat, eh?

So why is fixed folding working for 3**64 and never 3**65? Properly, I don’t know. This most likely has one thing to do with a restrict to what number of powers the system has pre-computed in reminiscence. I may be completely flawed. The subsequent step in my thoughts is to dabble within the Python supply code in my free time and take a look at to determine what is occurring. I’m nonetheless making an attempt to determine a solution for this so in case you have any recommendations please drop them within the feedback part beneath.

What I would like you to remove from this publish is that you need to search options to primary questions. You by no means know the place the solutions will lead you. You would possibly find yourself studying one thing totally new (like I simply did)! I hope you guys preserve this flame of curiosity burning. So long 🙂

PS: If you wish to be taught extra about Python Bytecode and the right way to mess around with it, somebody (njs on #python) instructed me this discuss. I personally haven’t watched it however I most likely will as soon as I get some free time.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments