Python Concurrency with asyncio
A week ago, I was working on a project that involved calling a REST API
endpoint 32 million times to retrieve a certain type of document. The input to
the API was a presigned URL that was valid for only a few days, so I did not
have the luxury of doing things sequentially. A rough calculation of the
time taken to perform the task using a simple for loop made me realize that
the task was a good candidate for parallelization. That’s when I started
looking at asyncio. On my first attempt, I took the standard approach of
using Python's multithreading facilities. However, there was always an itch
to see if I could get better performance by combining asyncio with
multithreading. The book “Python Concurrency with asyncio” by Matthew
Fowler helped me understand the basics of concurrent and parallel
computing with asyncio. I then went back and redid the task of calling the
API 32 million times to retrieve 32 million JSON documents using asyncio and
multithreading. In this post, I will summarize a few chapters that I found
useful in getting my work done.
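
To give a flavor of what combining asyncio with multithreading can look like,
here is a minimal sketch. It is not the actual code used for the task: the
names fetch_document, fetch_all and run_demo are hypothetical, and a short
time.sleep stands in for the blocking HTTP call, so no real API is contacted.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_document(doc_id: int) -> dict:
    """Hypothetical stand-in for a blocking HTTP request to the API."""
    time.sleep(0.01)  # simulate network latency (assumption, not a real call)
    return {"id": doc_id}


async def fetch_all(doc_ids, max_workers: int = 32) -> list:
    """Run the blocking fetches on a thread pool and await them together."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each call is handed to a worker thread; the event loop stays free.
        futures = [loop.run_in_executor(pool, fetch_document, d) for d in doc_ids]
        # gather returns results in the same order as the inputs.
        return await asyncio.gather(*futures)


def run_demo(n: int = 100) -> list:
    """Fetch n fake documents concurrently and return them."""
    return asyncio.run(fetch_all(range(n)))
```

Calling run_demo(100) returns 100 small dictionaries, one per document id. In
a real run the thread pool size and any batching would need tuning against
the API's rate limits, which this sketch does not model.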