Wednesday 25 February 2015

Open letter for the sync world

Theses days, I've seen more and more haters about the async community in Python, especially around AsyncIO.
I think this is sad and counter-productive.
I feel that for some people, frustrations or misunderstandings about the place of this new tool might be the cause, so I'd like to share some of my thoughts about it.


Just a proven pattern, not a "who has the biggest d*" contest


Some micro-benchmarks have been published to try to explain that AsyncIO isn't really efficient.
We all know that it is possible to have benchmarks prove about anything, and that the world isn't black or white.
So just for the sake of completeness, here are some macro-benchmarks based on Web applications examples: http://blog.gmludo.eu/2015/02/macro-benchmark-with-django-flask-and-asyncio.html


Now, before starting a ping-pong to try to determine who has the biggest, please read further:

Asynchronous/coroutine pattern isn't a new fancy stuff to decrease developer productivity and performance.
In fact, the idea of asynchrounous, non-blocking IO has been around in many OSes and programming languages for years.
In Linux for example, Asynchronous I/O Support was added to kernel 2.5, back in 2003, you can even find some specifications back in 1997 (http://pubs.opengroup.org/onlinepubs/007908799/xsh/aio.h.html)
It started to gain more visibility with (amongst others) NodeJS a couple of years ago.
This pattern is now included in most new languages (Go...) and is made available in older languages (Python, C#...).

Async isn't a silver bullet, especially for intensive calculations, but for I/O, at least from my experience, it seems to be much more efficient.


The lengthy but successful maturation process of a new standard


In the Python world, a number of alternatives were available (Gevent, Twisted, Eventlet, libevent, stackless,...) each with their own strengths and weaknesses.
Each of them went to a maturation process and could eventually be used on real production environments.

It was really clever for Guido to take all good ideas from all these async frameworks to create AsyncIO.
Instead of having a number of different frameworks, each of them reinventing the wheel on an island,
AsyncIO should help to have a "lingua franca" for doing async in Python.
This is pretty important because once you enter in the async world, all your usual tools and libs (like your favourite DB lib) should also be async compliant.
Because, AsyncIO isn't just a library, it will become the "standard" way to write async code with Python.


If Async means rewriting my perfectly working code, why should I bother ?


To integrate cleanly AsyncIO in your library or your application, you have to rethink the internal architecture.
When you start a new project in "async mode", you can't keep sync for the part of it: to get all async benefits, everything should be async.

But, this isn't mandatory from day 1: you can start simple, and port your code to the async pattern step-by-step.

I can understand some haters reactions: Internet is a big swarm where you have a lot of trends and hype.
Finally, few tools and patterns will really survive to the production's fire.
Meanwhile, you already wrote a lot of perfectly working code, and obviously you really don't want to rewrite that just for the promises of the latest buzz-word.

It's like oriented object programming, years ago, it suddenly became the new "proper" way of writing your code (some said),
and you couldn't be object and procedural in the same time.
Years later, procedural isn't completely dead, because in fact, OO sometimes brings unnecessary overhead.
It really depends on what sort of things you are writing (size matters!).
On the other hand, in 2015, who writes a full-Monty application with procedural only ?

I think one day, it will be the same for the async pattern.
It is always better to driving the change than to endure the change.
Think organic: on the long term, it is not the strongest that survives, nor is it the most intelligent.
It is usually the one being most open and adaptive to changes.


Buzzword, or real paradigm change ?


We don't know for sure if the async pattern is only a temporary fashion buzzword or a real paradigm shift in IT,  just like virtualization has become a de-facto standard over the last few years.

But my feeling is that it is here to stay, even if it won't be relevant for all Python projects.
I think it will become the right way to build efficient and scalable I/O-Bound projects,

For example, in an Internet (network) driven world, I see more and more projects centred around piping between cloud-based services.
For this type of developments, I'm personally convinced a paradigm shift has become unavoidable, and for Pythonists AsyncIO is probably the right horse to bet on.



Does anyone really care or "will I be paid more" ?


Let's face it, beside your geek fellows, nobody cares about the tools you are using:
Your users just want features for yesterday, as few bugs as possible, and they want their application to be fast and responsive.
Who cares if you use async, or some other hoodoo-voodoo-black-magic to reach the goal ?

I think that, by starting a "religious war" between sync and async Python developers, we would all waste our (precious) time.
Instead, we should cultivate emulation between Pythonistas, build solutions to increase real-world performances and stability.
Then let Darwin show us the long term path and adapt to it.

In the end, the whole Python community will benefit if Python is considered as a great language to write business logic with ease AND with brute performance.
We are all tired to hear people in other communities say that Python is slow, we are all convinced this is simply not true.

This is a communication war that the Python community has to win as a team.

PS: Special thanks to Nicolas Stein, aka. Nike, for the review of this text and his precious advices in general to stimulate a scientific approach of problems.