I might have been a bit unclear. Not every request gets a new process; instead there is a pool of processes. For instance, gunicorn might keep ~20 worker processes ready to serve requests, with a master process managing them (the workers all accept connections from the same shared listening socket). Since they are separate processes, there is very little communication or shared state between them. It's almost as if you ran your main entry point multiple times in different terminals, just with gunicorn managing it for you.
This is different from, say, Java, where there is one process with multiple threads. Because of the GIL in Python, only one thread per process can execute Python bytecode at a time, so to actually use multiple cores you have to launch multiple processes of the application.
That's true, cache utilization declines as a function of your process parallelism (and, in gunicorn/celery, your max-requests-per-process-before-recycle setting). Cached properties can still help a good deal though, especially when coupled with a) careful pre-fork initialization of cached data and b) even more careful use of multiprocessing's little-known shared memory interface, which really does allow caches to be shared, without read locks, between multiple processes.
What web server is this? I’ve never heard about this behavior before.