Python Extensions in Rust with Jupyter Notebooks

December 27, 2023
dataframe, python, rust

The Rust programming language has become increasingly popular for writing compiled Python extensions. Currently, wrapping a Rust function and making it callable from Python takes a fair amount of boilerplate. I enjoy exploring and prototyping code in Jupyter Notebooks, so I developed rustimport_jupyter to compile Rust code in Jupyter and make the compiled code available in Python! In this blog post, I will showcase a simple function, a NumPy function, and a Polars expression plugin. This blog post is runnable as a notebook on Google Colab.

Simple Rust Functions

rustimport_jupyter builds on top of rustimport to compile Python extensions written in Rust from Jupyter notebooks. After installing the rustimport_jupyter package from PyPI, we load the magic from within a Jupyter notebook:

%load_ext rustimport_jupyter
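
If %load_ext fails, a quick environment check can help narrow down the cause. The sketch below is not part of rustimport_jupyter: missing_requirements is a hypothetical helper, and it assumes the notebook needs the rustimport_jupyter and rustimport modules to be importable and a cargo executable on the PATH to compile Rust code.

```python
import importlib.util
import shutil


def missing_requirements(modules=("rustimport_jupyter", "rustimport"),
                         executables=("cargo",)):
    """Return the names of modules and executables that cannot be found.

    Hypothetical helper: the default names are assumptions about what the
    notebook needs, not something rustimport_jupyter itself exposes.
    """
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [e for e in executables if shutil.which(e) is None]
    return missing


# An empty list means everything this sketch looks for was found.
print(missing_requirements())
```

Anything the helper reports is worth installing before retrying the magic.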

Next, we define a double function in Rust and prefix the cell with the %%rustimport marker:

%%rustimport
use pyo3::prelude::*;

#[pyfunction]
fn double(x: i32) -> i32 {
    2 * x
}

The %%rustimport marker compiles the Rust code and imports the double function into the Jupyter notebook environment. This means we can call it directly from Python!

double(34)
Out[4]:
68

By default, %%rustimport compiles without Rust optimizations. We can enable these optimizations by adding the --release flag:

%%rustimport --release
use pyo3::prelude::*;

#[pyfunction]
fn triple(x: i32) -> i32 {
    3 * x
}

triple(7)
Out[6]:
21
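
To see whether --release makes a measurable difference for a given function, one option is to time calls from Python. The following is a minimal sketch using the standard library's timeit; best_time_per_call is a hypothetical helper, and py_triple is a pure-Python stand-in so the cell runs on its own.

```python
import timeit


def best_time_per_call(func, *args, number=10_000, repeat=5):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def py_triple(x):
    # Hypothetical pure-Python stand-in so this cell runs on its own.
    return 3 * x


print(f"{best_time_per_call(py_triple, 7):.2e} s per call")
```

In the notebook, passing the compiled double or triple in place of py_triple gives comparable numbers. For functions this small, the cost of crossing the Python/Rust boundary likely dominates, so any difference from --release may only show up with heavier workloads.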

NumPy in Rust

Rust's ecosystem contains many third-party libraries that are useful for writing custom functions. rustimport defines a custom //: comment syntax that we can use to pull in dependencies to write our extensions. In this next example, we use PyO3/rust-numpy to define a NumPy function that computes a*x-y in Rust:

%%rustimport --release
//: [dependencies]
//: pyo3 = { version = "0.20", features = ["extension-module"] }
//: numpy = "0.20"

use pyo3::prelude::*;
use numpy::ndarray::{ArrayD, ArrayViewD};
use numpy::{IntoPyArray, PyArrayDyn, PyReadonlyArrayDyn};

fn axsy(a: f64, x: ArrayViewD<'_, f64>, y: ArrayViewD<'_, f64>) -> ArrayD<f64> {
    a * &x - &y
}

#[pyfunction]
#[pyo3(name = "axsy")]
fn axsy_py<'py>(
    py: Python<'py>,
    a: f64,
    x: PyReadonlyArrayDyn<'py, f64>,
    y: PyReadonlyArrayDyn<'py, f64>,
) -> &'py PyArrayDyn<f64> {
    let x = x.as_array();
    let y = y.as_array();
    let z = axsy(a, x, y);
    z.into_pyarray(py)
}

The #[pyo3(name = "axsy")] attribute exports the compiled function as axsy in Python. We can now use axsy directly in Jupyter:

import numpy as np

a = 2.4
x = np.array([1.0, -3.0, 4.0], dtype=np.float64)
y = np.array([2.1, 1.0, 4.0], dtype=np.float64)

axsy(a, x, y)
Out[8]:
array([ 0.3, -8.2,  5.6])
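
When prototyping a compiled function, it can be reassuring to compare it against a slow but obvious reference. The sketch below is an assumption-laden stand-in rather than part of the extension: axsy_reference is a hypothetical pure-Python helper that only handles one-dimensional sequences of equal length, and it is checked here against the values printed above.

```python
import math


def axsy_reference(a, x, y):
    """Compute a*x - y elementwise for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi - yi for xi, yi in zip(x, y)]


expected = axsy_reference(2.4, [1.0, -3.0, 4.0], [2.1, 1.0, 4.0])
notebook_output = [0.3, -8.2, 5.6]  # values shown in the notebook output

# Compare with a tolerance, since floating-point results are rarely exact.
print(all(math.isclose(e, o, abs_tol=1e-9)
          for e, o in zip(expected, notebook_output)))
```

In the notebook, the same comparison can be made against the compiled function, for example with np.allclose(axsy(a, x, y), expected).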

Polars Expression Plugin

Recently, Polars added support for expression plugins to create user-defined functions. With rustimport_jupyter, we can quickly prototype a Polars expression directly in Jupyter! In this example, we compile a pig-latin expression as seen in Polars' user guide:

%%rustimport --module-path-variable=polars_pig_latin_module
//: [dependencies]
//: polars = { version = "*" }
//: pyo3 = { version = "*", features = ["extension-module"] }
//: pyo3-polars = { version = "0.9", features = ["derive"] }
//: serde = { version = "*", features = ["derive"] }

use pyo3::prelude::*;
use polars::prelude::*;
use pyo3_polars::derive::polars_expr;
use std::fmt::Write;

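fn pig_latin_str(value: &str, output: &mut String) {
    if let Some(first_char) = value.chars().next() {
        // Slice past the first character by its UTF-8 byte length so that
        // a multi-byte first character does not cause a panic.
        write!(output, "{}{}ay", &value[first_char.len_utf8()..], first_char).unwrap()
    }
}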

#[polars_expr(output_type=Utf8)]
fn pig_latinnify(inputs: &[Series]) -> PolarsResult<Series> {
    let ca = inputs[0].utf8()?;
    let out: Utf8Chunked = ca.apply_to_buffer(pig_latin_str);
    Ok(out.into_series())
}

Note that we use --module-path-variable=polars_pig_latin_module, which saves the compiled module path as polars_pig_latin_module. With polars_pig_latin_module defined, we register a language namespace for Polars expressions:

import polars as pl

@pl.api.register_expr_namespace("language")
class Language:
    def __init__(self, expr: pl.Expr):
        self._expr = expr

    def pig_latinnify(self) -> pl.Expr:
        return self._expr.register_plugin(
            lib=polars_pig_latin_module,
            symbol="pig_latinnify",
            is_elementwise=True,
        )

With the language namespace defined, we can now use it with Polars:

df = pl.DataFrame(
    {
        "convert": ["pig", "latin", "is", "silly"],
    }
)

out = df.with_columns(
    pig_latin=pl.col("convert").language.pig_latinnify(),
)
print(out)
shape: (4, 2)
┌─────────┬───────────┐
│ convert ┆ pig_latin │
│ ---     ┆ ---       │
│ str     ┆ str       │
╞═════════╪═══════════╡
│ pig     ┆ igpay     │
│ latin   ┆ atinlay   │
│ is      ┆ siay      │
│ silly   ┆ illysay   │
└─────────┴───────────┘
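
The transformation itself is easy to cross-check outside of Polars. The following is a minimal sketch, assuming the same rule as pig_latin_str in the Rust cell: move the first character to the end and append "ay". pig_latin_reference is a hypothetical helper name, and returning an empty string for empty input is an assumption that mirrors the Rust function writing nothing in that case.

```python
def pig_latin_reference(value: str) -> str:
    """Move the first character to the end and append "ay".

    Assumed to mirror pig_latin_str from the Rust cell; an empty string
    yields an empty string.
    """
    if not value:
        return ""
    return f"{value[1:]}{value[0]}ay"


words = ["pig", "latin", "is", "silly"]
print([pig_latin_reference(w) for w in words])
```

The printed list can be compared with the pig_latin column above, for example via out["pig_latin"].to_list() in the notebook.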

Conclusion

For those who like prototyping and exploring in Jupyter notebooks, rustimport_jupyter enables you to explore the Rust ecosystem while easily connecting it to your Python code. You can try out the library by installing it with: pip install rustimport_jupyter 🚀!
