Shared Memory Parallelism with Python
PyData London 2014, February 22, 2014
London,
Author: Dr.-Ing. Mike Müller
Email: mmueller@python-academy.de
Cython
• Mixture between Python and C
• Standard Python is valid Cython
• Gradually add more C-ish features
• Call existing C/C++ code (source and libs)
• Compile to Python extension (*.so or *.pyd)
Prominent Cython Users
• SciPy
• pandas
• PyTables
• zeromq
• Sage
Cython Workflow
• Write Cython code in *.pyx file
• Compile, i.e., execute your setup.py
• Get extension module (*.so, *.pyd)
• Use your extension from Python
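The suffix of the compiled extension depends on the platform and the interpreter. A minimal sketch, using only the standard library, that lists the suffixes the running interpreter accepts for extension modules (the helper name `extension_suffixes` is made up for this illustration):

```python
import importlib.machinery


def extension_suffixes():
    """Return the file suffixes this interpreter accepts for extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


if __name__ == "__main__":
    # Typically ends in .so on Linux and macOS, and in .pyd on Windows.
    for suffix in extension_suffixes():
        print(suffix)
```

A file produced by the build step must carry one of these suffixes to be importable.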
Example - Cython Code
# file: cy_101.pyx
# could be used in Python directly
def pure_python_func(a, b):
    return a + b

# cannot be called from Python
cdef double cython_func(double a, double b):
    return a + b

# wrapper to call from Python
def typed_python_func(a, b):
    return cython_func(a, b)
Example - Compile
• Use a setup.py file:
# distutils was removed in Python 3.12; setuptools provides the same setup()
from setuptools import setup
from Cython.Build import cythonize

setup(
    name = 'cython101',
    ext_modules = cythonize("cy_101.pyx", annotate=True),
)
• Compile to an extension:
python setup.py build_ext --inplace
Example - Use
# file: cy_101_test.py
import cy_101

a = 10
b = 20
print(cy_101.pure_python_func(a, b))
print(cy_101.typed_python_func(a, b))

python cy_101_test.py
30
30.0
Annotations - Cython at Work
• 1778 lines of C code
• Python (yellow) and pure C (white), click to see C source
Pure Python Mode - Decorators are Great
# file: cy_101_deco.py
import cython

@cython.locals(a=cython.double, b=cython.double)
def add(a, b):
    return a + b
Automate the Automation
• pyximport compiles *.pyx files with Cython on import (and *.py files too, if installed with pyimport=True)
# file: cy_101_pyximport.pyx
cdef double cython_func(double a, double b):
    return a + b

def typed_python_func(a, b):
    return cython_func(a, b)

# in Python
import pyximport
pyximport.install()
import cy_101_pyximport
print(cy_101_pyximport.typed_python_func(3, 4))
The Buffer Interface
• NumPy-inspired standard to access C data structures
• Cython supports it
• Fewer conversions between Python and C data types
typedef struct bufferinfo {
    void *buf;               // buffer memory pointer
    PyObject *obj;           // owning object
    Py_ssize_t len;          // memory buffer length
    Py_ssize_t itemsize;     // byte size of one item
    int readonly;            // read-only flag
    int ndim;                // number of dimensions
    char *format;            // item format description
    Py_ssize_t *shape;       // array[ndim]: length of each dimension
    Py_ssize_t *strides;     // array[ndim]: byte offset to next item in each dimension
    Py_ssize_t *suboffsets;  // array[ndim]: further offset for indirect indexing
    void *internal;          // reserved for owner
} Py_buffer;
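Most of these fields can be inspected from plain Python, because the built-in memoryview exposes them as attributes. A small sketch, assuming a 2 x 3 array of doubles built from the standard-library array module (the names `data` and `view` are only for this illustration):

```python
import array

# Six doubles in one flat, contiguous block of memory.
data = array.array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# View the same memory as a 2 x 3 array without copying.
# memoryview.cast needs a byte format on one side, hence the detour via "B".
view = memoryview(data).cast("B").cast("d", shape=[2, 3])

print(view.obj is data)   # owning object      (obj)
print(view.nbytes)        # length in bytes    (len)
print(view.itemsize)      # bytes per item     (itemsize)
print(view.readonly)      # read-only flag     (readonly)
print(view.ndim)          # dimensions         (ndim)
print(view.format)        # item format        (format)
print(view.shape)         # items per dim      (shape)
print(view.strides)       # byte steps per dim (strides)

# Writing through the view changes the underlying array: no copy was made.
view[0, 0] = 10.0
print(data[0])
```

With 8-byte doubles the strides are (24, 8): one row is three items wide, so the next row starts 24 bytes further on.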
Example
• a and b are 2D NumPy arrays with same shape
• (a + b) * 2 + a * b
• Size: 2000 x 2000
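As a reference for the variants that follow, here is the same expression in pure Python on nested lists. This is only a sketch for checking results on small inputs; the function name `combine` and the use of lists instead of NumPy arrays are assumptions of this illustration.

```python
def combine(a, b):
    """Return (a + b) * 2 + a * b element by element for two 2D lists."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("inputs must have the same shape")
    return [[(x + y) * 2 + x * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result = combine(a, b)
print(result)  # [[17.0, 28.0], [41.0, 56.0]]
```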
With Multiprocessing
import numpy

# test_numpy is assumed to be the plain NumPy version; it is not shown here
def test_multi(a, b, pool):
    assert a.shape == b.shape
    v = a.shape[0] // 2
    h = a.shape[1] // 2
    quads = [(slice(None, v), slice(None, h)),
             (slice(None, v), slice(h, None)),
             (slice(v, None), slice(h, None)),
             (slice(v, None), slice(None, h))]
    results = [pool.apply_async(test_numpy, [a[quad], b[quad]])
               for quad in quads]
    output = numpy.empty_like(a)
    for quad, res in zip(quads, results):
        output[quad] = res.get()
    return output
• multiprocessing solution is 6 times slower than NumPy solution
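The quadrant slices are built from a vertical split point v and a horizontal split point h. A quick sketch to confirm that the four slice pairs cover every cell exactly once, also for odd sizes (the helper names `quadrant_slices` and `covered_cells` are made up for this check):

```python
def quadrant_slices(rows, cols):
    """Return the four (row_slice, col_slice) pairs for a rows x cols array."""
    v = rows // 2
    h = cols // 2
    return [(slice(None, v), slice(None, h)),
            (slice(None, v), slice(h, None)),
            (slice(v, None), slice(h, None)),
            (slice(v, None), slice(None, h))]


def covered_cells(rows, cols):
    """Count how often each (row, col) cell is hit by the four quadrants."""
    counts = {}
    for row_slice, col_slice in quadrant_slices(rows, cols):
        for r in range(rows)[row_slice]:
            for c in range(cols)[col_slice]:
                counts[(r, c)] = counts.get((r, c), 0) + 1
    return counts


counts = covered_cells(5, 7)
print(len(counts), set(counts.values()))  # 35 {1}
```

Because no cell is written twice, the workers never need to coordinate their writes to the output.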
The Buffer Interface From Cython
import numpy
import cython

@cython.boundscheck(False)
@cython.wraparound(False)
def func(object[double, ndim=2] buf1 not None,
         object[double, ndim=2] buf2 not None,
         object[double, ndim=2] output=None,):
    cdef unsigned int x, y, inner, outer
    if buf1.shape != buf2.shape:
        raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                  buf2.shape))
    if output is None:
        output = numpy.empty_like(buf1)
    outer = buf1.shape[0]
    inner = buf1.shape[1]
The Buffer Interface From Cython II
    for x in xrange(outer):
        for y in xrange(inner):
            output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                            buf1[x, y] * buf2[x, y])
    return output
Memory Views - Quadrant
import numpy
import cython

@cython.boundscheck(False)
@cython.wraparound(False)
cdef add_arrays_2d_views(double[:,:] buf1,
                         double[:,:] buf2,
                         double[:,:] output):
    cdef unsigned int x, y, inner, outer
    outer = buf1.shape[0]
    inner = buf1.shape[1]
    for x in xrange(outer):
        for y in xrange(inner):
            output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                            buf1[x, y] * buf2[x, y])
    return output
Memory Views - Whole I
@cython.boundscheck(False)
@cython.wraparound(False)
def add_arrays_2d(object[double, ndim=2] buf1 not None,
                  object[double, ndim=2] buf2 not None,
                  object[double, ndim=2] output=None,):
    cdef unsigned int v, h
    if buf1.size != buf2.size:
        raise TypeError('Arrays have different sizes: %d, %d' % (buf1.size,
                                                                 buf2.size))
    if buf1.shape != buf2.shape:
        raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                  buf2.shape))
    if output is None:
        output = numpy.empty_like(buf1)
Memory Views - Whole II
    v = buf1.shape[0] // 2
    h = buf1.shape[1] // 2
    quad1 = slice(None, v), slice(None, h)
    quad2 = slice(None, v), slice(h, None)
    quad3 = slice(v, None), slice(h, None)
    quad4 = slice(v, None), slice(None, h)
    add_arrays_2d_views(buf1[quad1], buf2[quad1], output[quad1])
    add_arrays_2d_views(buf1[quad2], buf2[quad2], output[quad2])
    add_arrays_2d_views(buf1[quad3], buf2[quad3], output[quad3])
    add_arrays_2d_views(buf1[quad4], buf2[quad4], output[quad4])
    return output
OpenMP
• De-facto standard for shared memory parallel programming
The OpenMP API supports multi-platform shared-memory
parallel programming in C/C++ and Fortran. The OpenMP API
defines a portable, scalable model with a simple and flexible
interface for developing parallel applications on platforms from
the desktop to the supercomputer.
-- from openmp.org
OpenMP with Cython - Threads I
# distutils: extra_compile_args = -fopenmp
# distutils: extra_link_args = -fopenmp
import numpy
import cython
from cython cimport parallel
OpenMP with Cython - Threads II
@cython.boundscheck(False)
@cython.wraparound(False)
cdef int add_arrays_2d_views(double[:,:] buf1, double[:,:] buf2,
                             double[:,:] output) nogil:
    cdef unsigned int x, y, inner, outer
    outer = buf1.shape[0]
    inner = buf1.shape[1]
    for x in xrange(outer):
        for y in xrange(inner):
            output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                            buf1[x, y] * buf2[x, y])
    return 0
OpenMP with Cython - Threads III
@cython.boundscheck(False)
@cython.wraparound(False)
def add_arrays_2d(double[:,:] buf1 not None,
                  double[:,:] buf2 not None,
                  double[:,:] output=None,):
    cdef unsigned int v, h, thread_id
    if buf1.shape[0] != buf2.shape[0] or buf1.shape[1] != buf2.shape[1]:
        # report the shape of buf2 (not buf1 twice) in the message
        raise TypeError('Arrays have different shapes: (%d, %d) (%d, %d)' % (
            buf1.shape[0], buf1.shape[1], buf2.shape[0],
            buf2.shape[1],))
    if output is None:
        output = numpy.zeros_like(buf1)
OpenMP with Cython - Threads IV
    v = buf1.shape[0] // 2
    h = buf1.shape[1] // 2
    ids = []
    with nogil, parallel.parallel(num_threads=4):
        thread_id = parallel.threadid()
        # appending to a Python list needs the GIL
        with gil:
            ids.append(thread_id)
        if thread_id == 0:
            add_arrays_2d_views(buf1[:v,:h], buf2[:v,:h], output[:v,:h])
        elif thread_id == 1:
            add_arrays_2d_views(buf1[:v,h:], buf2[:v,h:], output[:v,h:])
        elif thread_id == 2:
            add_arrays_2d_views(buf1[v:,h:], buf2[v:,h:], output[v:,h:])
        elif thread_id == 3:
            add_arrays_2d_views(buf1[v:,:h], buf2[v:,:h], output[v:,:h])
    print(ids)
    return output
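The work-splitting pattern above (one quadrant per thread, all threads writing into one shared output) can be tried in plain Python. This is only a sketch of the pattern: it swaps in the standard library's ThreadPoolExecutor for OpenMP and nested lists for memory views, and under the GIL these threads do not run the arithmetic in parallel. The names `fill_block` and `add_arrays_threaded` are made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_block(a, b, out, rows, cols):
    """Write (a + b) * 2 + a * b into out for the given row/column ranges."""
    for r in rows:
        for c in cols:
            out[r][c] = (a[r][c] + b[r][c]) * 2 + a[r][c] * b[r][c]


def add_arrays_threaded(a, b):
    """Compute the result quadrant by quadrant, one task per quadrant."""
    n_rows, n_cols = len(a), len(a[0])
    v, h = n_rows // 2, n_cols // 2
    out = [[0.0] * n_cols for _ in range(n_rows)]  # shared by all threads
    blocks = [(range(0, v), range(0, h)),
              (range(0, v), range(h, n_cols)),
              (range(v, n_rows), range(h, n_cols)),
              (range(v, n_rows), range(0, h))]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fill_block, a, b, out, rows, cols)
                   for rows, cols in blocks]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return out
```

No locking is needed because the four blocks are disjoint, which is the same reason the Cython version can release the GIL safely.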
OpenMP with Cython - Parallel Range I
# distutils: extra_compile_args = -fopenmp
# distutils: extra_link_args = -fopenmp
import numpy
import cython
from cython cimport parallel
OpenMP with Cython - Parallel Range II
@cython.boundscheck(False)
@cython.wraparound(False)
def func(object[double, ndim=2] buf1 not None,
         object[double, ndim=2] buf2 not None,
         object[double, ndim=2] output=None,
         int num_threads=2):
    cdef unsigned int x, y, inner, outer
    if buf1.shape != buf2.shape:
        raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                  buf2.shape))
    if output is None:
        output = numpy.empty_like(buf1)
    outer = buf1.shape[0]
    inner = buf1.shape[1]
OpenMP with Cython - Parallel Range III
    inner = buf1.shape[1]
    with nogil, cython.boundscheck(False), cython.wraparound(False):
        for x in parallel.prange(outer, schedule='static',
                                 num_threads=num_threads):
            for y in xrange(inner):
                output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                                buf1[x, y] * buf2[x, y])
    return output
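With schedule='static' and no chunk size, OpenMP divides the iterations of the outer loop into contiguous chunks of approximately equal size, at most one per thread. The following sketch shows one plausible division; the exact assignment is up to the OpenMP runtime, and the function name `static_chunks` is made up for this illustration.

```python
def static_chunks(n_iterations, num_threads):
    """Split range(n_iterations) into contiguous, nearly equal chunks.

    Assumption: the first (n_iterations % num_threads) threads get one
    extra iteration. A real OpenMP runtime may distribute the remainder
    differently.
    """
    base, extra = divmod(n_iterations, num_threads)
    chunks = []
    start = 0
    for thread_id in range(num_threads):
        size = base + (1 if thread_id < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


for thread_id, rows in enumerate(static_chunks(10, 4)):
    print(thread_id, list(rows))
```

For the 2000-row example with 2 threads this gives each thread a block of 1000 consecutive rows, so each thread works on its own region of memory.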
Speedup
Threads  Speedup
1        1.0
2        1.6
3        1.8
4        2.0
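Two quick figures help to read these numbers: the parallel efficiency (speedup divided by thread count) and the parallel fraction implied by Amdahl's law, S = 1 / ((1 - p) + p / n). Applying Amdahl's law here is an assumption of this sketch, not something the slides state; memory bandwidth may just as well be the limit.

```python
# Measured speedups from the table, keyed by thread count.
measured = {1: 1.0, 2: 1.6, 3: 1.8, 4: 2.0}


def efficiency(threads, speedup):
    """Fraction of the ideal linear speedup that was reached."""
    return speedup / threads


def amdahl_parallel_fraction(threads, speedup):
    """Solve S = 1 / ((1 - p) + p / n) for p; only defined for n > 1."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


for n, s in measured.items():
    line = "threads=%d efficiency=%.2f" % (n, efficiency(n, s))
    if n > 1:
        line += " implied_parallel_fraction=%.2f" % amdahl_parallel_fraction(n, s)
    print(line)
```

Efficiency falls from 0.80 with two threads to 0.50 with four, and the implied parallel fraction stays between roughly 0.67 and 0.75.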
Conclusions
• Cython + OpenMP allow you to work without the GIL
• Threads run in parallel for CPU-bound tasks
• There is a price:
• You need to write more code
• You lose part of the Python safety net
• You need to know C and learn Cython
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à GenèveUiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPathCommunity
 
Viam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdfViam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdf
camilalamoratta
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
Bepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firmBepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firm
Benard76
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
The No-Code Way to Build a Marketing Team with One AI Agent (Download the n8n...
SOFTTECHHUB
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Artificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptxArtificial_Intelligence_in_Everyday_Life.pptx
Artificial_Intelligence_in_Everyday_Life.pptx
03ANMOLCHAURASIYA
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdfKit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Kit-Works Team Study_팀스터디_김한솔_nuqs_20250509.pdf
Wonjun Hwang
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à GenèveUiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPath Automation Suite – Cas d'usage d'une NGO internationale basée à Genève
UiPathCommunity
 
Viam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdfViam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdf
camilalamoratta
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 

Shared Memory Parallelism with Python by Dr.-Ing. Mike Müller

  • 1. Shared Memory Parallelism with Python
      PyData London 2014, February 22, 2014, London
      Author: Dr.-Ing. Mike Müller
      Email: mmueller@python-academy.de
  • 2. Cython
      • Mixture between Python and C
      • Standard Python is valid Cython
      • Gradually add more C-ish features
      • Call existing C/C++ code (source and libs)
      • Compile to Python extension (*.so or *.pyd)
  • 3. Prominent Cython Users
      • SciPy
      • pandas
      • PyTables
      • zeromq
      • Sage
  • 4. Cython Workflow
      • Write Cython code in a *.pyx file
      • Compile, i.e., execute your setup.py
      • Get an extension module (*.so, *.pyd)
      • Use your extension from Python
  • 5. Example - Cython Code
      # file: cy_101.pyx
      # could be used in Python directly
      def pure_python_func(a, b):
          return a + b
      # cannot be called from Python
      cdef double cython_func(double a, double b):
          return a + b
      # wrapper to call from Python
      def typed_python_func(a, b):
          return cython_func(a, b)
  • 6. Example - Compile
      • Use a setup.py file:
          from distutils.core import setup
          from Cython.Build import cythonize
          setup(
              name = 'cython101',
              ext_modules = cythonize("cy_101.pyx", annotate=True),
          )
      • Compile to an extension:
          python setup.py build_ext --inplace
  • 7. Example - Use
      # file cy_101_test.py
      import cy_101
      a = 10
      b = 20
      print cy_101.pure_python_func(a, b)
      print cy_101.typed_python_func(a, b)
      Running it:
          python cy_101_test.py
          30
          30.0
  • 8. Annotations - Cython at Work
      • 1778 lines of C code
      • Python (yellow) and pure C (white), click to see C source
  • 9. Pure Python Mode - Decorators are Great
      # file: cy_101_deco.py
      import cython
      @cython.locals(a=cython.double, b=cython.double)
      def add(a, b):
          return a + b
  • 10. Automate the Automation
      • pyximport tries to compile all pyx and py files with Cython
      Cython module:
          cdef double cython_func(double a, double b):
              return a + b
          def typed_python_func(a, b):
              return cython_func(a, b)
      Python script that imports it:
          import pyximport
          pyximport.install()
          import cy_101_pyximport
          print(cy_101_pyximport.typed_python_func(3, 4))
  • 11. The Buffer Interface
      • NumPy-inspired standard to access C data structures
      • Cython supports it
      • Fewer conversions between Python and C data types
      typedef struct bufferinfo {
          void *buf;               // buffer memory pointer
          PyObject *obj;           // owning object
          Py_ssize_t len;          // memory buffer length
          Py_ssize_t itemsize;     // byte size of one item
          int readonly;            // read-only flag
          int ndim;                // number of dimensions
          char *format;            // item format description
          Py_ssize_t *shape;       // array[ndim]: length of each dimension
          Py_ssize_t *strides;     // array[ndim]: byte offset to next item in each d
          Py_ssize_t *suboffsets;  // array[ndim]: further offset for indirect indexi
          void *internal;          // reserved for owner
      } Py_buffer;
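Several of the `Py_buffer` fields listed above can be inspected from plain Python through the built-in `memoryview` type. The following is a minimal sketch, not part of the original slides: `array.array` stands in for a NumPy array, and the `describe` helper and the variable names are made up for illustration.

```python
from array import array

# Six C doubles stored contiguously; array.array exports the buffer protocol.
data = array('d', [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A flat view first, then the same memory reinterpreted as a 2 x 3 block.
# memoryview.cast needs a byte format on one side, hence the detour via 'B'.
flat = memoryview(data)
grid = flat.cast('B').cast('d', shape=[2, 3])


def describe(view):
    """Collect the Py_buffer-like fields that memoryview exposes (hypothetical helper)."""
    return {
        'format': view.format,
        'itemsize': view.itemsize,
        'ndim': view.ndim,
        'shape': view.shape,
        'strides': view.strides,
        'readonly': view.readonly,
        'nbytes': view.nbytes,
    }


flat_info = describe(flat)
grid_info = describe(grid)
print(flat_info)
print(grid_info)
```

No data is copied in either view; only the shape and stride metadata differ between `flat` and `grid`.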
  • 13. Example
      • a and b are 2D NumPy arrays with the same shape
      • (a + b) * 2 + a * b
      • Size: 2000 x 2000
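As a reference point for the benchmark expression, here is a small pure-Python sketch that uses nested lists in place of NumPy arrays, so it runs without any dependency. The function name `combine` and the 2 x 2 inputs are illustrative assumptions; with NumPy arrays the same computation is simply the expression `(a + b) * 2 + a * b`.

```python
def combine(a, b):
    """Element-wise (a + b) * 2 + a * b for two equally shaped 2D lists."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError('inputs have different shapes')
    return [[(x + y) * 2 + x * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


# Tiny stand-in for the 2000 x 2000 arrays used in the talk.
a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result = combine(a, b)
print(result)
```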
  • 14. With Multiprocessing
      def test_multi(a, b, pool):
          assert a.shape == b.shape
          v = a.shape[0] // 2
          h = a.shape[1] // 2
          quads = [(slice(None, v), slice(None, h)),
                   (slice(None, v), slice(h, None)),
                   (slice(v, None), slice(h, None)),
                   (slice(v, None), slice(None, h))]
          results = [pool.apply_async(test_numpy, [a[quad], b[quad]])
                     for quad in quads]
          output = numpy.empty_like(a)
          for quad, res in zip(quads, results):
              output[quad] = res.get()
          return output
      • The multiprocessing solution is 6 times slower than the NumPy solution
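The slide does not say why the multiprocessing version is slower. One likely contributor is that `multiprocessing` passes arguments and results between processes by pickling them, so every quadrant is serialized and copied rather than shared. The sketch below only illustrates that copy; it uses `array.array` and a hypothetical 200 x 200 size instead of the NumPy arrays from the talk.

```python
import pickle
from array import array

# Hypothetical, smaller stand-in for one operand of the benchmark.
rows, cols = 200, 200
a = array('d', range(rows * cols))

# What has to happen, conceptually, to hand the data to a worker process:
payload = pickle.dumps(a, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

raw_bytes = len(a) * a.itemsize
print('raw bytes:    ', raw_bytes)
print('pickled bytes:', len(payload))
print('same object?  ', restored is a)
```

The restored object compares equal to the original but lives in separate memory, which is the opposite of the shared-memory approach the rest of the talk builds towards.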
  • 15. The Buffer Interface From Cython
      import numpy
      import cython
      @cython.boundscheck(False)
      @cython.wraparound(False)
      def func(object[double, ndim=2] buf1 not None,
               object[double, ndim=2] buf2 not None,
               object[double, ndim=2] output=None,):
          cdef unsigned int x, y, inner, outer
          if buf1.shape != buf2.shape:
              raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                         buf2.shape))
          if output is None:
              output = numpy.empty_like(buf1)
          outer = buf1.shape[0]
          inner = buf1.shape[1]
  • 16. The Buffer Interface From Cython II
          for x in xrange(outer):
              for y in xrange(inner):
                  output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                                  buf1[x, y] * buf2[x, y])
          return output
  • 17. Memory Views - Quadrant
      import numpy
      import cython
      @cython.boundscheck(False)
      @cython.wraparound(False)
      cdef add_arrays_2d_views(double[:,:] buf1,
                               double[:,:] buf2,
                               double[:,:] output):
          cdef unsigned int x, y, inner, outer
          outer = buf1.shape[0]
          inner = buf1.shape[1]
          for x in xrange(outer):
              for y in xrange(inner):
                  output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                                  buf1[x, y] * buf2[x, y])
          return output
  • 18. Memory Views - Whole I
      @cython.boundscheck(False)
      @cython.wraparound(False)
      def add_arrays_2d(object[double, ndim=2] buf1 not None,
                        object[double, ndim=2] buf2 not None,
                        object[double, ndim=2] output=None,):
          cdef unsigned int v, h
          if buf1.size != buf2.size:
              raise TypeError('Arrays have different sizes: %d, %d' % (buf1.size,
                                                                        buf2.size))
          if buf1.shape != buf2.shape:
              raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                         buf2.shape))
          if output is None:
              output = numpy.empty_like(buf1)
  • 19. Memory Views - Whole II
          v = buf1.shape[0] // 2
          h = buf1.shape[1] // 2
          quad1 = slice(None, v), slice(None, h)
          quad2 = slice(None, v), slice(h, None)
          quad3 = slice(v, None), slice(h, None)
          quad4 = slice(v, None), slice(None, h)
          add_arrays_2d_views(buf1[quad1], buf2[quad1], output[quad1])
          add_arrays_2d_views(buf1[quad2], buf2[quad2], output[quad2])
          add_arrays_2d_views(buf1[quad3], buf2[quad3], output[quad3])
          add_arrays_2d_views(buf1[quad4], buf2[quad4], output[quad4])
          return output
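The quadrant calls above only work because a slice of the output is a view onto the same memory, not a copy, so writes made inside `add_arrays_2d_views` show up in `output`. The same property can be demonstrated with Python's built-in `memoryview`. This is a one-dimensional sketch with made-up names (`fill`, `first_half`, `second_half`), since the built-in type does not support the 2D slicing that Cython memory views and NumPy offer.

```python
from array import array

# Shared storage: eight doubles, initially zero.
data = array('d', [0.0] * 8)
view = memoryview(data)

# Slicing a memoryview creates new views onto the same memory.
first_half = view[:4]
second_half = view[4:]


def fill(part, value):
    """Write value into every slot of the given view (hypothetical helper)."""
    for i in range(len(part)):
        part[i] = value


fill(first_half, 1.0)
fill(second_half, 2.0)

# The writes went through the views into the original array.
print(data.tolist())
```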
  • 20. OpenMP
      • De-facto standard for shared memory parallel programming
      The OpenMP API supports multi-platform shared-memory parallel programming in C/C++ and Fortran. The OpenMP API defines a portable, scalable model with a simple and flexible interface for developing parallel applications on platforms from the desktop to the supercomputer.
      -- from openmp.org
  • 21. OpenMP with Cython - Threads I
      # distutils: extra_compile_args = -fopenmp
      # distutils: extra_link_args = -fopenmp
      import numpy
      import cython
      from cython cimport parallel
  • 22. OpenMP with Cython - Threads II
      @cython.boundscheck(False)
      @cython.wraparound(False)
      cdef int add_arrays_2d_views(double[:,:] buf1,
                                   double[:,:] buf2,
                                   double[:,:] output) nogil:
          cdef unsigned int x, y, inner, outer
          outer = buf1.shape[0]
          inner = buf1.shape[1]
          for x in xrange(outer):
              for y in xrange(inner):
                  output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                                  buf1[x, y] * buf2[x, y])
          return 0
  • 23. OpenMP with Cython - Threads III
      @cython.boundscheck(False)
      @cython.wraparound(False)
      def add_arrays_2d(double[:,:] buf1 not None,
                        double[:,:] buf2 not None,
                        double[:,:] output=None,):
          cdef unsigned int v, h, thread_id
          if buf1.shape[0] != buf2.shape[0] or buf1.shape[1] != buf2.shape[1]:
              raise TypeError('Arrays have different shapes: (%d, %d) (%d, %d)' % (
                              buf1.shape[0], buf1.shape[1],
                              buf2.shape[0], buf2.shape[1],))
          if output is None:
              output = numpy.zeros_like(buf1)
  • 24. OpenMP with Cython - Threads IV
          v = buf1.shape[0] // 2
          h = buf1.shape[1] // 2
          ids = []
          with nogil, parallel.parallel(num_threads=4):
              thread_id = parallel.threadid()
              with gil:
                  ids.append(thread_id)
              if thread_id == 0:
                  add_arrays_2d_views(buf1[:v,:h], buf2[:v,:h], output[:v,:h])
              elif thread_id == 1:
                  add_arrays_2d_views(buf1[:v,h:], buf2[:v,h:], output[:v,h:])
              elif thread_id == 2:
                  add_arrays_2d_views(buf1[v:,h:], buf2[v:,h:], output[v:,h:])
              elif thread_id == 3:
                  add_arrays_2d_views(buf1[v:,:h], buf2[v:,:h], output[v:,:h])
          print ids
          return output
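The structure of the four-thread version (one quadrant per thread, all threads writing into disjoint parts of a single shared output) can be mimicked with the standard `threading` module. This is only a structural sketch under stated assumptions: nested lists replace the memory views, `combine_block` is a hypothetical helper, and pure-Python threads still hold the GIL, so unlike the Cython/OpenMP code this version does not run the arithmetic in parallel.

```python
import threading


def combine_block(a, b, out, rows, cols):
    """Compute (a + b) * 2 + a * b for one block and write it into the shared out."""
    for x in rows:
        for y in cols:
            out[x][y] = (a[x][y] + b[x][y]) * 2 + a[x][y] * b[x][y]


# Small hypothetical inputs; the talk uses 2000 x 2000 arrays.
n = 4
a = [[float(x + y) for y in range(n)] for x in range(n)]
b = [[float(x * y) for y in range(n)] for x in range(n)]
out = [[0.0] * n for _ in range(n)]

# Same quadrant order as on the slide: top-left, top-right,
# bottom-right, bottom-left.
v = h = n // 2
quads = [(range(0, v), range(0, h)),
         (range(0, v), range(h, n)),
         (range(v, n), range(h, n)),
         (range(v, n), range(0, h))]

threads = [threading.Thread(target=combine_block, args=(a, b, out, rows, cols))
           for rows, cols in quads]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(out)
```

Because the quadrants do not overlap, no locking is needed around the writes to `out`.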
  • 25. OpenMP with Cython - Parallel Range I
      # distutils: extra_compile_args = -fopenmp
      # distutils: extra_link_args = -fopenmp
      import numpy
      import cython
      from cython cimport parallel
  • 26. OpenMP with Cython - Parallel Range II
      @cython.boundscheck(False)
      @cython.wraparound(False)
      def func(object[double, ndim=2] buf1 not None,
               object[double, ndim=2] buf2 not None,
               object[double, ndim=2] output=None,
               int num_threads=2):
          cdef unsigned int x, y, inner, outer
          if buf1.shape != buf2.shape:
              raise TypeError('Arrays have different shapes: %s, %s' % (buf1.shape,
                                                                         buf2.shape))
          if output is None:
              output = numpy.empty_like(buf1)
          outer = buf1.shape[0]
          inner = buf1.shape[1]
  • 27. OpenMP with Cython - Parallel Range III
          inner = buf1.shape[1]
          with nogil, cython.boundscheck(False), cython.wraparound(False):
              for x in parallel.prange(outer, schedule='static',
                                       num_threads=num_threads):
                  for y in xrange(inner):
                      output[x, y] = ((buf1[x, y] + buf2[x, y]) * 2 +
                                      buf1[x, y] * buf2[x, y])
          return output
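With `schedule='static'` and no chunk size, the rows of the outer loop are divided into contiguous blocks of roughly equal size, at most one per thread. The exact split is left to the OpenMP implementation, so the following is just one plausible partitioning, written as a plain-Python sketch; the function name `static_chunks` is an assumption for illustration.

```python
def static_chunks(n_iterations, num_threads):
    """Split range(n_iterations) into num_threads contiguous, near-equal blocks."""
    base, extra = divmod(n_iterations, num_threads)
    chunks = []
    start = 0
    for thread_id in range(num_threads):
        # The first `extra` threads take one additional iteration.
        size = base + (1 if thread_id < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


chunks = static_chunks(10, 4)
for thread_id, chunk in enumerate(chunks):
    print(thread_id, list(chunk))
```

Each row index appears in exactly one block, which is why the threads in the `prange` loop can write to `output` without synchronization.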
  • 29. Conclusions
      • Cython + OpenMP allow you to work without the GIL
      • Threads run in parallel for CPU-bound tasks
      • There is a price:
          • You need to write more code
          • You lose part of the Python safety net
          • You need to know C and learn Cython