Python and Julia

Jochen Schulz, Christoph Rügge

17.05.2016

general-purpose high-level scripting language

  • interpreted bytecode
  • dynamic type system + runtime type inference (late binding)
  • big community, libraries for almost everything
  • mature

high-level technical computing language

  • native just in time compiler (jit)
  • dynamic type system + compile time type inference
  • abstract types, subtyping
  • macros (can change code before it runs)
  • middle sized community
  • relatively young

Differences

Python

  • numerical arrays are fast, loops very slow:
    • C-arrays, runtime type inference on arrays
    • loops: runtime type inference per element
  • extensive library (numpy, scipy, sympy)

Julia

  • numerical computation generally fast:
    • type inference, jit
  • built-in numeric functionality
  • built-in parallelization
  • option to transparently call python-code

Mandelbrot

The set of all points \(c \in \mathbb{C}\) for which the sequence \[ z_0:=c\] \[z_{n+1} = z_n^2 +c, \quad n \in \mathbb{N}\] is bounded.

matrix

def mandelbox (cent, xsize, ysize):
    si = 2000
    [X,Y] = np.mgrid[cent[0]-xsize/2.:cent[0]+xsize/2.:si*1j,
                     cent[1]-ysize/2.:cent[1]+ysize/2.:si*1j]
    C = (X + 1j*Y)
    Z = np.copy(C)
    it_max = 500
    Anz = np.zeros(Z.shape)
    for k in range(1,it_max+1):
        Z = Z**2+C
        Anz += np.isfinite(Z)
function mandelbox(cent, xsize, ysize)
    si = 2000
    x = linspace(cent[1]-xsize/2., cent[1]+xsize/2., si)
    y = linspace(cent[2]-ysize/2., cent[2]+ysize/2., si)
    X, Y = [i for i in x, j in y], [j for i in x, j in y]
    C = (X + Y*im)
    Z = copy(C)
    it_max = 500
    Anz = zeros(si,si)
    for k in range(1,it_max)
        Z = Z.^2 + C
        Anz += isfinite(Z)
    end
end

loop

for i in range(0,si):
    for j in range(0,si):
        Z[i,j] = Z[i,j]**2 + C[i,j]
@inbounds for i in eachindex(C)
    Z[i] = Z[i]^2 + C[i]
end

pixelwise

def mandel(z):
    c = z
    maxiter = 50
    for n in range(0,maxiter):
        if abs(z) > 2:
            return n-1
        z = z**2 + c
    return maxiter
for a in range(0,si):
    for b in range(0,si):
        Anz[a,b] = mandel(C[a,b])
@everywhere function mandel(z)
    c = z
    maxiter = 50
    for n = range(1,maxiter)
        if abs(z) > 2
            return n-1
        end
        z = z^2 + c
    end
    return maxiter
end
for a in eachindex(C)
    Anz[a] = mandel(C[a])
end

parallel

def mandelonep(C, imin, imax):
    si = np.shape(C)
    res = np.zeros((imax-imin,))
    for idx in range(imin,imax):
        res[idx-imin] = mandel(C[np.unravel_index(idx,si)])

p = Pool(2)
slices = distribute(int(si*si),2)

results = [p.apply_async(mandelonep, (C, imin, imax))
    for (imin, imax) in slices]
for r, (imin, imax) in zip(results, slices):
    Anz[np.unravel_index(range(imin,imax),(si,si))] = r.get()
function mandelonep(cent, xsize, ysize, si, C,Anz)
  @sync @parallel for a in eachindex(C)
    Anz[a] = mandel(C[a])
  end

Anz = SharedArray(Float64,(si,si))
mandelonep(cent, xsize, ysize, si, C, Anz)

performance

type python julia
matrix 50s 93s
loop 86m 20s 38s
pixelwise 31m 59s 11s
parallel 23m 22s 6.15s

Note: for matrix: internally parallized through BLAS

Improvement

Python

  • numpy tricks (mainly slices)
  • cython (fixing types at compile time)
  • numba (just in time compiler)
  • shared and distributed parallelization

Julia

  • disable bound-checks (@inbounds)
  • shared and distributed parallelization (@parallel, remotecall)
  • devectorize code (@devec)
  • SIMD (@simd)

multiple dispatch

functions have specific methods for specific argument types

function mandel(z::Complex{Float64})

function mandel(z::Array{Float64,2})
  • different julia implementations for different types
  • different compiled versions for different concrete types

type annotations

python: cython

type annotation + compilation

def cython_mandel(double x,double y):
    cdef np.ndarray[double,ndim=1] x = linspace(-2.1,1.2,pointsx)
    cdef np.ndarray[double,ndim=1] y = linspace(-1.1,1.1,pointsy)
  • can significantly speed up loops

julia

::Int

see types:

@code_warntype somecode
  • type annotations for dispatch
  • often unneccessary for performance
  • useful for error reporting
  • need to write "well-typed" code (type stability)
  • non-obvious subtleties, pitfalls

python interface

can call python transparently (without knowing much of python)

using PyCall
@pyimport mayavi.mlab as m
a = m.surf(Anz)
m.colorbar()
m.show()

Subjective conclusions

How easy?

  • both clean and intuitive language
  • julia: easy access to advanced techniques, more coherent language
  • python: standard array computations are somewhat simpler, more efficient

How fast?

  • both fast in array-operations
  • julia: also fast in loop/elementwise

How practical?

  • python: extensive library
  • julia: numerical calculations
    • needs a little more understanding, potential pitfalls

Thank You!