Making your own modules and some numba tips#

Directory structure:

Consider ./ to be your current directory with the structure:

./                              Current directory (on sys.path in a notebook session)
    mat_utils/                  Top-level directory of your package
      __init__.py               Initializes your package
      module1.py                Submodule #1
      module2.py                Submodule #2

To run this notebook, create all the directories and files listed above.
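If you prefer to create the layout from Python rather than by hand, the following is a minimal sketch. The helper name `create_package_skeleton` is hypothetical (it is not part of the notebook), and it only creates empty files; you still need to fill in their contents.

```python
from pathlib import Path


def create_package_skeleton(root="."):
    """Create mat_utils/ with __init__.py, module1.py and module2.py under root."""
    package_dir = Path(root) / "mat_utils"
    package_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in ("__init__.py", "module1.py", "module2.py"):
        path = package_dir / name
        path.touch(exist_ok=True)  # creates an empty file, keeps existing content
        created.append(path)
    return created
```

Calling `create_package_skeleton()` without arguments creates the package in the current directory.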

What should you put in __init__.py#

The file can be empty. It is still a good idea to add a docstring (marked with """ """) and to define __all__, so that from mat_utils import * works. __init__.py is otherwise a normal Python file, so you can also define functions, variables, and other package attributes (such as __version__) there.
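To see what __all__ changes, here is a small self-contained sketch. It builds a throwaway package in a temporary directory (the names `demo_pkg` and `star_import_names` are made up for illustration) and reports which public names a star import binds, with and without __all__.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def star_import_names(init_source):
    """Build a throwaway package and return the public names `import *` binds."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text(init_source)
        (pkg / "module1.py").write_text("VALUE = 1\n")
        (pkg / "module2.py").write_text("VALUE = 2\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            namespace = {}
            exec("from demo_pkg import *", namespace)
        finally:
            sys.path.remove(tmp)
            # Forget the throwaway package so the next call starts clean.
            for name in [m for m in sys.modules if m.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]
    return sorted(n for n in namespace if not n.startswith("_"))


print(star_import_names('__all__ = ["module1", "module2"]\n'))  # ['module1', 'module2']
print(star_import_names('"""No __all__ here."""\n'))            # []
```

With __all__ listing the submodules, the star import loads them; without it, the submodules are only available after an explicit import such as `from demo_pkg import module1`.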

Contents of __init__.py for the package used here:

# This is a docstring
"""A package with useful functions for my notebooks."""
__all__ = ["module1", "module2"]
#from . import module1, module2

__version__ = "0.0.1"

# In the notebook:
import mat_utils
help(mat_utils)

from mat_utils import module1, module2
# we can also do the following, since we have __all__ in __init__.py
# from mat_utils import *

# this is needed so that the modules are reloaded automatically while you are working on them:
%reload_ext autoreload
%autoreload 2
Help on package mat_utils:

NAME
    mat_utils - A package with useful functions for my notebooks.

PACKAGE CONTENTS
    module1
    module2

DATA
    __all__ = ['module1', 'module2']

VERSION
    0.0.1

FILE
    /home/fc-3auid-3a41f39e02-2da958-2d4d3b-2d84bd-2d4a2f2886a11b/mods_test/mat_utils/__init__.py

Functions in module1:#

If you add a docstring right after a function's definition line, it will be picked up by the help() command. This is quite useful when you need to know, for example, which arguments a function accepts.

"""
Just a module :D
"""

from numba import njit

def my_function():
    """
    function to print from which module it is called
    """
    print(f"funcntion from module:",__name__)


def slow_loop(n):
    """
    this loop goes slow
    Arguments:
        n - (float/int) loop length
    """
    j = 0.0
    for i in range(int(n)):
        j = j + 1.0
    return j

@njit
def fast_loop(n):
    """
    this loop goes slow (precompiled with numba)
    Arguments:
        n - (float/int) loop length
    """
    j = 0.0
    for i in range(int(n)):
        j = j + 1.0
    return j

# Back in the notebook:
help(module1)
help(module1.my_function)
Help on module mat_utils.module1 in mat_utils:

NAME
    mat_utils.module1 - Just a module :D

FUNCTIONS
    fast_loop(n)
        this loop goes slow (precompiled with numba)
        Arguments:
            n - (float/int) loop length
    
    my_function()
        function to print from which module it is called
    
    slow_loop(n)
        this loop goes slow
        Arguments:
            n - (float/int) loop length

FILE
    /home/fc-3auid-3a41f39e02-2da958-2d4d3b-2d84bd-2d4a2f2886a11b/mods_test/mat_utils/module1.py


Help on function my_function in module mat_utils.module1:

my_function()
    function to print from which module it is called
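
The same docstrings can also be read programmatically. The sketch below uses a made-up function `scale` (not part of mat_utils) to show how the standard library's inspect module exposes the docstring text and the signature.

```python
import inspect


def scale(values, factor=2.0):
    """
    Multiply every element of values by factor.

    Arguments:
        values - (list of float) numbers to scale
        factor - (float) multiplier, defaults to 2.0
    """
    return [v * factor for v in values]


print(inspect.getdoc(scale))     # the docstring with common indentation removed
print(inspect.signature(scale))  # (values, factor=2.0)
```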

Updating modules:#

To have the notebook automatically reload the modules you have imported while you are working on them, run:

%reload_ext autoreload
%autoreload 2

With %autoreload 2, all imported modules are reloaded before each cell is executed.

module1.my_function()
funcntion from module: mat_utils.module1

Now you can update the module, for example by adding print("extra stuff") at the end of my_function in module1, and rerun the cell.

module1.my_function()
funcntion from module:mat_utils.module1
extra stuff
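
Outside a notebook, where the autoreload magics are not available, a module can be reloaded by hand with importlib.reload from the standard library. The following is a rough sketch of that manual route; the module name `demo_module` and its function `greet` are invented for the demonstration and written to a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "demo_module.py"  # hypothetical module name
    source.write_text("def greet():\n    return 'hello'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_module

        before = demo_module.greet()
        # Edit the file on disk, as you would in an editor.
        source.write_text("def greet():\n    return 'hello, extra stuff'\n")
        importlib.reload(demo_module)  # re-executes the module's source
        after = demo_module.greet()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_module", None)

print(before)  # hello
print(after)   # hello, extra stuff
```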

Simple numba example in the module:#

help(module1.slow_loop)
Help on function slow_loop in module mat_utils.module1:

slow_loop(n)
    this loop goes slow
    Arguments:
        n - (float/int) loop length
%%time
n=1e9
a1 = module1.slow_loop(n)
a1
CPU times: user 24.6 s, sys: 187 µs, total: 24.6 s
Wall time: 24.7 s
1000000000.0
Note on numba: not all classes and functions can be compiled with @njit. When using xarray, consider using numba's `guvectorize` together with xarray.apply_ufunc.
%%time
a2 = module1.fast_loop(n)
a2
CPU times: user 1.17 s, sys: 37.7 ms, total: 1.2 s
Wall time: 1.21 s
1000000000.0
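
One thing to keep in mind when timing numba code: a function decorated with @njit is compiled the first time it is called with a given argument type, so the first call is usually slower than later ones. The sketch below illustrates this with a hypothetical `count_up` function and a small `timed` helper. As an assumption for portability, it falls back to plain Python if numba is not installed, in which case both calls take about the same time.

```python
import time

try:
    from numba import njit
except ImportError:  # assumption: run uncompiled if numba is missing
    def njit(func):
        return func


@njit
def count_up(n):
    total = 0.0
    for _ in range(n):
        total += 1.0
    return total


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


first_result, first_seconds = timed(count_up, 1_000_000)    # includes compilation
second_result, second_seconds = timed(count_up, 1_000_000)  # reuses compiled code
print(f"first call:  {first_seconds:.4f} s")
print(f"second call: {second_seconds:.4f} s")
```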

When should you use numba?#

Numba knows about numpy arrays and many numpy functions, so you can compile functions that operate on them.
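
As a minimal illustration, the following sketch compiles an explicit loop over a numpy array. The function `sum_of_squares` is a made-up example, and the fallback to plain Python when numba is missing is an assumption added so that the snippet runs anywhere.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: run uncompiled if numba is missing
    def njit(func):
        return func


@njit
def sum_of_squares(values):
    """Sum x*x over a 1-D float array with an explicit loop."""
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total


data = np.arange(5, dtype=np.float64)  # [0., 1., 2., 3., 4.]
print(sum_of_squares(data))            # 30.0
print(np.sum(data ** 2))               # 30.0, the vectorized numpy equivalent
```

Explicit loops like this are slow in plain Python but are the kind of code numba handles well; for operations that numpy already vectorizes, the gain is typically smaller.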

It is also possible to use numba vectorization with xarray.Dataset/DataArray objects and dask arrays. You can see a nice example in the xarray tutorial.