Adding Jupyter Kernel for SystemDS #1998

Open: wants to merge 4 commits into base: main

88 changes: 88 additions & 0 deletions scripts/jupyterkernel/README.md
@@ -0,0 +1,88 @@

# SystemDS Kernel Setup Guide

This README describes how to set up a Jupyter kernel for SystemDS: building SystemDS from source, building the kernel project against that build, and registering the kernel with Jupyter.

## Prerequisites

- Java JDK 11 or later
- Maven
- Git
- Jupyter Notebook or JupyterLab
Comment on lines +7 to +11
Contributor

Mention the operating system this README is written for.


## Step 1: Clone and Build SystemDS

Clone the Apache SystemDS repository and build it. This step ensures that the SystemDS JAR is locally available.

[How to install SystemDS?](https://apache.github.io/systemds/site/install.html)

After the build, install the JAR into your local Maven repository.

```bash
mvn install:install-file -Dfile=path/to/your/jar-file.jar -DgroupId=org.apache.sds -DartifactId=sds -Dversion=3.2.0 -Dpackaging=jar
```
Note: If you change the groupId, artifactId, or version, mirror the change in the dependencies section of the kernel's pom.xml. A mismatch between these identifiers and the pom.xml leads to build failures or dependency resolution errors.
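
The coordinates passed to `install:install-file` determine where the JAR ends up in the local repository. The following is a minimal sketch for checking that location, assuming the default local repository at `~/.m2/repository`. The function name and the `M2_REPO` override are hypothetical, introduced only for illustration.

```bash
# Hypothetical helper: print where Maven stores an artifact in the local
# repository, given its groupId, artifactId, and version.
# Assumes the default repository location unless M2_REPO (an illustrative
# variable, not a Maven setting) points elsewhere.
maven_local_jar() {
  repo="${M2_REPO:-$HOME/.m2/repository}"
  # Dots in the groupId become directory separators.
  group_path="$(printf '%s' "$1" | tr '.' '/')"
  printf '%s/%s/%s/%s/%s-%s.jar\n' "$repo" "$group_path" "$2" "$3" "$2" "$3"
}

# Usage: check that the SystemDS JAR is where the kernel build will look for it.
# ls -l "$(maven_local_jar org.apache.sds sds 3.2.0)"
```

If the file is missing at the printed path, the kernel build in Step 2 will likely fail to resolve the `org.apache.sds:sds` dependency.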

## Step 2: Set Up Your Repository

Clone the SystemDS kernel repository.

```bash
git clone https://github.com/kubieren/SystemDSKernel.git
cd SystemDSKernel/kernelsds
```
Comment on lines +31 to +33
Contributor

Use the code from the main SystemDS repo, which will be available after this PR is merged.


Build the kernel project with Maven. This step compiles the code and packages it together with the dependencies declared in `pom.xml`.

```bash
mvn clean package
```
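
Step 3 needs the absolute path of the JAR built here. The sketch below locates it, assuming the `artifactId` and `version` from the kernel's `pom.xml` and Maven's default `target` output directory. It also assumes the shade plugin's default behavior of replacing the main artifact with the shaded JAR. The function name is hypothetical.

```bash
# Hypothetical helper: print the absolute path of the kernel JAR produced by
# `mvn clean package`, or fail if it has not been built yet.
# Assumes artifactId "kernelsds" and version "1.0-SNAPSHOT" from pom.xml.
kernel_jar_path() {
  project_dir="${1:-.}"
  jar="$project_dir/target/kernelsds-1.0-SNAPSHOT.jar"
  if [ ! -f "$jar" ]; then
    echo "kernel JAR not found: $jar (run 'mvn clean package' first)" >&2
    return 1
  fi
  # Resolve to an absolute path, since kernel.json should not depend on the
  # directory Jupyter is started from.
  printf '%s/%s\n' "$(cd "$(dirname "$jar")" && pwd)" "$(basename "$jar")"
}

# Usage, from inside SystemDSKernel/kernelsds:
# kernel_jar_path .
```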

## Step 3: Create the Kernel Specification

Navigate to or create a directory where you wish to store your kernel's configuration. For example, create `my_systemds_kernel` in your home directory:

```bash
mkdir -p ~/my_systemds_kernel
```

Within this directory, create a `kernel.json` file with the following content, adjusting the path to your JAR file as necessary:

```json
{
  "argv": ["java", "-jar", "/path/to/your/kernelJarFile/kernelsds-1.0-SNAPSHOT.jar", "{connection_file}"],
  "display_name": "SystemDS Kernel",
  "language": "systemds",
  "interrupt_mode": "message"
}
```
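
As an alternative to editing the file by hand, the same `kernel.json` can be generated from the shell. This is a minimal sketch: the function name is hypothetical, and it assumes the JAR path contains no characters that need JSON escaping, such as quotes or backslashes.

```bash
# Hypothetical helper: write the kernel.json shown above into a kernel
# directory, substituting the given JAR path.
# Assumes the JAR path needs no JSON escaping.
write_kernel_json() {
  kernel_dir="$1"
  kernel_jar="$2"
  mkdir -p "$kernel_dir"
  # "{connection_file}" is left literal; Jupyter replaces it at launch time.
  cat > "$kernel_dir/kernel.json" <<EOF
{
  "argv": ["java", "-jar", "$kernel_jar", "{connection_file}"],
  "display_name": "SystemDS Kernel",
  "language": "systemds",
  "interrupt_mode": "message"
}
EOF
}

# Usage:
# write_kernel_json ~/my_systemds_kernel /path/to/your/kernelJarFile/kernelsds-1.0-SNAPSHOT.jar
```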

## Step 4: Install the Kernel Specification

Install your kernel specification with Jupyter by running:

```bash
jupyter kernelspec install ~/my_systemds_kernel --user
```

This command makes the SystemDS kernel available to Jupyter.
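
To confirm the installation, `jupyter kernelspec list` prints the registered kernels. Without `--name`, the kernel name defaults to the lowercased directory name, so the directory from Step 3 would be registered as `my_systemds_kernel`. The sketch below performs a similar check without calling Jupyter, assuming the Linux default per-user kernels directory. The function name is hypothetical, and on other platforms the directory must be passed explicitly.

```bash
# Hypothetical helper: succeed if a kernelspec with the given name has a
# kernel.json under the per-user kernels directory.
# Assumes the Linux default (~/.local/share/jupyter/kernels); pass another
# directory as the second argument on other platforms.
kernelspec_installed() {
  name="$1"
  kernels_dir="${2:-$HOME/.local/share/jupyter/kernels}"
  [ -f "$kernels_dir/$name/kernel.json" ]
}

# Usage:
# kernelspec_installed my_systemds_kernel && echo "SystemDS kernel is installed"
```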

## Step 5: Launch Jupyter Notebook

Start Jupyter Notebook or JupyterLab:

```bash
jupyter notebook
```

or

```bash
jupyter lab
```

You should now be able to create new notebooks with the "SystemDS Kernel" option.

## Conclusion

These steps integrate SystemDS with Jupyter, so that you can execute SystemDS operations directly from notebooks. Make sure all paths and URLs match your environment, in particular the location of the SystemDS JAR file.
96 changes: 96 additions & 0 deletions scripts/jupyterkernel/kernelsds/pom.xml
@@ -0,0 +1,96 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.kernelsds.app</groupId>
  <artifactId>kernelsds</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.11.0</version>
        <configuration>
          <source>11</source>
          <target>11</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>3.3.0</version>


        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <mainClass>com.kernelsds.app.App</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.4</version>
        <configuration>
          <filters>
            <filter>
              <artifact>*:*</artifact>
              <excludes>
                <exclude>META-INF/*.SF</exclude>
                <exclude>META-INF/*.DSA</exclude>
                <exclude>META-INF/*.RSA</exclude>
              </excludes>
            </filter>
          </filters>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <createDependencyReducedPom>true</createDependencyReducedPom>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>com.kernelsds.app.App</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <name>kernelsds</name>
  <url>http://maven.apache.org</url>
  <repositories>
    <repository>
      <id>oss-sonatype-snapshots</id>
      <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.github.spencerpark</groupId>
      <artifactId>jupyter-jvm-basekernel</artifactId>
      <version>2.3.0</version>
    </dependency>

    <dependency>
      <groupId>org.apache.sds</groupId>
      <artifactId>sds</artifactId>
      <version>3.2.0</version>
    </dependency>
  </dependencies>
</project>
@@ -0,0 +1,42 @@
package com.kernelsds.app;


import io.github.spencerpark.jupyter.channels.JupyterConnection;

import io.github.spencerpark.jupyter.kernel.KernelConnectionProperties;
import io.github.spencerpark.jupyter.channels.JupyterSocket;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.logging.Level;

public class App
{
    public static void main( String[] args ) throws Exception {
        if (args.length < 1)
            throw new IllegalArgumentException("Missing connection file argument");

        Path connectionFile = Paths.get(args[0]);

        if (!Files.isRegularFile(connectionFile))
            throw new IllegalArgumentException("Connection file '" + connectionFile + "' isn't a file.");

        String contents = new String(Files.readAllBytes(connectionFile));
        // Jupyter passes the connection file path via {connection_file} in kernel.json.
        // Log only warnings and errors from the Jupyter socket layer; lower the level when debugging.
        JupyterSocket.JUPYTER_LOGGER.setLevel(Level.WARNING);
        // Parse the connection file contents and create the connection they describe.
        KernelConnectionProperties connProps = KernelConnectionProperties.parse(contents);
        JupyterConnection connection = new JupyterConnection(connProps);
Comment on lines +28 to +32
Contributor

Add detailed comments for readability and debuggability, especially for the code that configures flags.


        ISystemDsKernel kernel = new ISystemDsKernel();
        // Register the kernel as the handler for messages arriving on this connection.
        kernel.becomeHandlerForConnection(connection);
        // Open the sockets, then block until the connection is closed.
        connection.connect();
        connection.waitUntilClose();
    }

}