I might have been a bit unclear. Not every request gets a new process; instead there is a pool of processes. For instance, gunicorn might keep ~20 worker processes ready to serve requests, with a master process managing them (the workers all accept connections from the same shared listening socket). Since they are separate processes, there is very little communication or shared state between them. It's almost as if you ran your main entry point multiple times in different terminals, just with gunicorn managing it for you.
This is different from, say, Java, where there is one process with multiple threads. Because of the GIL in Python, only one thread per process can execute Python bytecode at a time, so to actually use multiple cores you have to launch multiple processes of the application.
That's true, cache utilization declines as a function of your process parallelism (and, in gunicorn/celery, your max-requests-per-process-before-recycle setting). Cached properties can still help a good deal though, especially when coupled with a) careful pre-fork initialization of cached data and b) even more careful use of multiprocessing's little-known shared memory interface, which really does allow caches to be shared, without read locks, between multiple processes.
What web server is this? I’ve never heard about this behavior before.