Trying no GIL on scientific programming

By Cheuk Ting Ho - Thomas Edison room

PyConFr Bordeaux 2023

In this talk, we will look at what no-gil Python is and how it may improve the performance of some scientific calculations. First, we will cover the background of the Python GIL: what it is and why it is needed. We will then look at the flip side: why it prevents multi-threaded, CPU-bound programs from taking advantage of multi-core machines.
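To make the multi-core limitation concrete, here is a minimal sketch that runs the same pure-Python, CPU-bound function once sequentially and once in threads. The function names (`count_primes`, `run_sequential`, `run_threaded`, `timed`) and the workload size are illustrative assumptions, not taken from the talk. On a standard CPython build with the GIL, the threaded run is not expected to be meaningfully faster; actual timings depend on the machine and interpreter.

```python
import threading
import time


def count_primes(limit):
    """Pure-Python, CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limit, repeats):
    """Run the workload `repeats` times, one after the other."""
    return [count_primes(limit) for _ in range(repeats)]


def run_threaded(limit, repeats):
    """Run the workload `repeats` times, each in its own thread."""
    results = [None] * repeats

    def worker(index):
        results[index] = count_primes(limit)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(repeats)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def timed(fn, *args):
    """Return (result, elapsed wall-clock seconds) for fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seq, t_seq = timed(run_sequential, 20_000, 4)
    thr, t_thr = timed(run_threaded, 20_000, 4)
    print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s, same results: {seq == thr}")
```

Running the same script under a no-gil interpreter is one way to see whether the threaded version starts to scale with the number of cores.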

After that, we will look at no-gil Python, a fork of CPython 3.9 by Sam Gross. We will see how it provides a way to use Python without the GIL and demonstrates what future versions of Python could look like. With that, we will try out this version of Python on some popular yet calculation-heavy algorithms in scientific programming and data science, e.g. PCA, clustering, categorization and data manipulation with scikit-learn and pandas. We will compare the performance of this no-gil version with that of the standard CPython distribution.
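A comparison like this needs a script that can be run unchanged under both interpreters. The sketch below is one possible harness, not the speaker's benchmark: `describe_interpreter`, `column_means` and `best_time` are hypothetical names, and `column_means` is a pure-Python stand-in for a real scikit-learn or pandas workload. `sys._is_gil_enabled()` only exists on CPython 3.13 and later, so on interpreters that lack it the GIL status is reported as `None` (unknown).

```python
import platform
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def describe_interpreter():
    """Label a benchmark run with the interpreter that produced it."""
    # sys._is_gil_enabled() is only available on CPython 3.13+.
    probe = getattr(sys, "_is_gil_enabled", None)
    return {
        "implementation": platform.python_implementation(),
        "version": platform.python_version(),
        "gil_enabled": probe() if callable(probe) else None,  # None = unknown
    }


def column_means(rows):
    """Stand-in workload (assumption): per-column means of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def best_time(workload, data, n_threads, repeats=3):
    """Run workload(data) once per thread; return the fastest wall time over `repeats` rounds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            # Each thread performs the same amount of work on the same data.
            list(pool.map(workload, [data] * n_threads))
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    data = [[float(i * j % 7) for j in range(50)] for i in range(20_000)]
    print(describe_interpreter())
    for n in (1, 2, 4):
        print(f"{n} thread(s): {best_time(column_means, data, n):.3f}s")
```

Because every thread does a full copy of the work, a flat time as threads are added would suggest the threads run in parallel, while a time that grows roughly in proportion to the thread count would suggest they are being serialized.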

This talk is for Pythonistas who have intermediate knowledge of Python and are interested in using Python for scientific programming or data science. It may shed some light on more efficient ways of using Python in their tasks and spark interest in trying the no-gil version of Python.


Personal notes

  • GIL
    • Global Interpreter Lock
    • only one thread can run Python code at a time
    • limits access to any object to one thread at a time
    • driver metaphor
    • other programs use multiple locks (more complicated)
  • No GIL?
    • 4 earlier attempts (e.g. Greg Stein 1996, Adam Olsen 2007, Larry Hastings 2016)
    • Sam Gross
    • Why? Ideally, N cores == N x speed
  • Challenges
    • reference counting / biased reference counting
    • make commonly used objects immortal (no ref count)
    • use deferred reference counting for some objects (counts added at GC)
    • thread safety for objects (dict() & list())
    • using small locks
    • lock ordering written manually using the CPython API
    • replace the built-in allocator pymalloc with mimalloc (thread safety)
  • Scientific uses?
  • My attempts
  • Why I didn't see an improvement
    • C extensions already use C-level multi-threading
    • C extensions may expect a GIL
    • Compatibility issues?
    • tried only on a dual-core machine
  • Notes
    • Sam Gross's EuroPython keynote
    • Blog post by Łukasz Langa
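
The "small locks" idea from the Challenges notes can be illustrated at the Python level. In no-gil CPython the per-object locking happens in C inside the interpreter; the sketch below is only an analogy, with a hypothetical `LockedCounter` class that guards its own dict with its own lock instead of relying on one global lock.

```python
import threading


class LockedCounter:
    """A dict of counts guarded by its own small lock (one lock per object)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def increment(self, key):
        # The read-modify-write below must not interleave with other threads.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1

    def snapshot(self):
        # Copy under the lock so callers never see a half-updated dict.
        with self._lock:
            return dict(self._counts)


def hammer(counter, key, times):
    """Increment the same key `times` times."""
    for _ in range(times):
        counter.increment(key)


def run(n_threads=4, times=10_000):
    """Increment one shared counter from several threads and return the final counts."""
    counter = LockedCounter()
    threads = [
        threading.Thread(target=hammer, args=(counter, "hits", times))
        for _ in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.snapshot()


if __name__ == "__main__":
    print(run())
```

Each `LockedCounter` instance has an independent lock, so threads working on different instances never wait for each other, which is the property fine-grained locking is after.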