SKIL Documentation

Skymind Intelligence Layer

The community edition of the Skymind Intelligence Layer (SKIL) is free. It takes data science projects from prototype to production quickly and easily. SKIL bridges the gap between the Python ecosystem and the JVM with a cross-team platform for Data Scientists, Data Engineers, and DevOps/IT. It is an automation tool for machine-learning workflows that enables easy training on Spark-GPU clusters, experiment tracking, one-click deployment of trained models, model performance monitoring and more.

Get Started

Quick Start


In this quick start, you’ll learn how to:

  • Download and install SKIL CE
  • Create a sample workspace notebook and train a model
  • Deploy the model to the SKIL model server
  • Get a prediction from the REST endpoint
    • Via a Web browser
    • Via Java code

Check System Requirements

Ensure that your machine meets the System Requirements before continuing.

If you need to quickly launch a machine that is compatible with SKIL, read our guide to launching an Amazon AMI.

Install SKIL

Prerequisites

Ensure that your machine has Git and Apache Maven installed.

SKIL is distributed in a couple of flavors. This quick start assumes you are using the RedHat/CentOS distribution package.

To install SKIL on your RedHat machine, first add the following contents to a new file at /etc/yum.repos.d/skymind.repo:

[Skymind]
name=Skymind Repository
baseurl=http://packages.skymind.io/rpm/1.0
gpgcheck=0 

Then, run an update of your package manager and enter the installation commands:

sudo yum update
sudo yum install skil-server 
sudo yum install skil-server-miniconda
sudo yum install skil-server-spark
sudo yum install skil-server-interpreter

That's it! SKIL is now installed on your system.

Start SKIL CE

We recommend disabling SELinux, since it can mistakenly terminate processes running on non-standard ports. Run setenforce 0 to disable it temporarily, or follow this guide to disable it permanently.

Make sure that ports 9008 and 8080 are open.
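If a firewall is running, open those ports before starting SKIL. As an illustration, on CentOS 7 with firewalld the commands would look like this (adjust for your own firewall setup):

sudo firewall-cmd --permanent --add-port=9008/tcp
sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --reload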

To start the SKIL CE system locally, use the following command:

[centos@skil ~]$ sudo systemctl start skil

Wait about a minute for SKIL to finish starting. Everything is up and running when the output of jps looks like so:

$ jps
38802 ZeppelinInterpreterMain
43044 Jps
38341 ModelHistoryServer
38295 ProcessLauncherDaemon
38363 ZeppelinMain

Once SKIL CE is running locally, you can log in to build and deploy your first model.

Login to SKIL

To log in to a locally installed version of SKIL CE, visit the address http://[host]:9008 in a Web browser. This will lead to the login screen below.

For the Community Edition of SKIL, there's a single account and login:

Login: “admin”
Password: “admin”

SKIL CE is meant to be used and evaluated by a single data scientist, so in this quick start article, we will ignore role and account management.

Next, this guide will show how to build a notebook and deploy it to the model server.

Taking a Model from Notebook to Deployment

The purpose of SKIL and SKIL CE is to prototype deep learning applications quickly, and deploy the best models to a production-grade AI model server.

Deep learning models can be complex to build and unwieldy to deploy. SKIL addresses these pain points for data scientists and infrastructure engineering teams. Deep learning has a wide domain of applications in predictive analytics and machine perception, and it impacts nearly every industry.

Here are a few examples of deep-learning models that can be integrated into applications:

  • ICU Patient Mortality
  • Computer Vision
  • Anomaly Detection

Beyond building the above models, SKIL enables developers to plug their predictive engines into real-world applications such as:

  • A Tomcat Web application
  • A Wildfly application
  • A mobile application
  • A streaming system (Streamsets)
  • Robotics systems and edge devices (Somatic, Lego Mindstorms)

The SKIL platform helps data scientists build deep learning models and integrate them into applications with a workbench and AI model server.

You can track multiple versions of a model, compare how each performs after training, and deploy the best model to production on SKIL’s AI model server with a single click.

With SKIL CE, developers can start building real, state-of-the-art deep learning applications without worrying about infrastructure and plumbing.

The diagram below provides a general overview of how the SKIL Workspace system and SKIL’s AI model server work together to provide an enterprise-class platform for operationalizing deep learning applications.

Here’s how to build your first SKIL model with SKIL CE.

Create Workspace

To get started, create a new workspace in the SKIL user interface by following these steps:

1) Click on the workspaces tab on the left side of the main SKIL interface to bring up the workspaces screen, as seen in the image below.

Every new workspace is a place to conduct a set of “experiments” centered around a particular project.

“Experiments” are just different configurations of neural net models and data pipelines applied to a given problem, and a workspace is effectively a shared lab that data scientists can use to test which neural networks perform best on that problem. An experiment in SKIL can be contained in a Zeppelin notebook, which allows you to save, clone, and run a specific modeling job as needed.

Each notebook is tracked and indexed by SKIL, and a trained model configured in the notebook can be sent directly to SKIL’s AI model server. A SKIL Workspace can have many different notebooks for different experiments conducted as data scientists seek the best model. Let’s say that after a few experiments, you find a model that performs well and you want to integrate it with an application in production.

2) Create a Workspace

After clicking on the workspaces tab on the left, click on the “Create Workspace” button on the right side of the page (see below).

Clicking on the “Create Workspace” button brings up this dialog window:

In this window, you name the workspace to distinguish it from other workspaces.

For this tutorial, call the workspace “First Sensor Project.” Optionally, you can add a few labels to help identify it later (e.g., “side_project” or “prod_job”). When you’re ready to finalize the new workspace, click “Create Workspace” in the lower right corner of the window.

A new workspace should appear in the list of workspaces:

Clicking on the name of the workspace just created (“First Sensor Project”) will show the workspace details:

Now you are ready to create the first experiment in the new workspace.

Create Experiment

Inside a workspace, you can create experiments contained in Zeppelin notebooks. You and your team can create, run, and clone these notebook-experiments to improve collaboration and speed up time to insight.

By clicking on the “Create New Experiment” button, you’ll bring up the “Create Experiment” dialog window (below).

Give this experiment a unique and descriptive name that will make it easy to find later, using the input box under “Experiment Name.”

Select the only listed option for “Model History Server ID” and “Zeppelin Server ID” (in the present version of SKIL CE, there is only one option for each). You can also provide a distinct notebook name to be used within the Zeppelin notebook storage system.

Once you’re done setting up the experiment and its notebook, click the “Create Experiment Notebook” button. The new experiment should appear in the list of experiments for the current workspace, as seen below.

With the experiment created, you can check out the associated experiment notebook by clicking the “Open Notebook” button for the new experiment. That will bring up the embedded notebook system (below).

Note: the first time you use a notebook in SKIL, you need to log in to the embedded Zeppelin system. Click “Login” within the Zeppelin window and use the username “admin” and password “admin”.

Once logged in, click the “Notebook” dropdown and select your experiment’s notebook.

Each notebook starts out with a generic template containing DL4J code that would serve as the basis of a typical project.

For this example, you’re going to build an LSTM model based on sensor data. The code for this notebook is here:

uci_quickstart_notebook.scala

For this example, delete the blocks of template code and paste in the blocks of code from the GitHub link above. The notebook should autosave, but at any point you can make a specific commit to the Zeppelin version control system with the “version control” button, as shown below.

Click the version control button inside Zeppelin and optionally add a commit message to save the current state. Once the code is entered and saved, you’re ready to execute the notebook and produce the model.

Run Experiment

To run the experiment in the notebook, click on the “play” icon on the top toolbar inside the embedded Zeppelin notebook UI. This will run all of the code paragraphs inside that notebook.

The notebook will take some time to run. Once it’s complete, the output of the notebook will be visible within the notebook itself.

This guide will walk through a few of the notebook's code snippets. The first paragraph in the notebook runs all the needed imports (in Scala) and sets the other paragraphs up for execution. There are four major functional areas within this notebook:

  1. UCI data download and data prep / ETL
  2. Neural network configuration
  3. Network training loop
  4. Registering modeling results with the SKIL model server

The first area of code we’ll highlight downloads the training data and performs some basic ETL on it. In this part of the notebook (sketched below), the raw CSV data is:

  1. Loaded from disk
  2. Converted into sequences
  3. Analyzed to collect statistics
  4. Normalized/standardized using those statistics
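As an illustration, the sequence-loading and normalization steps look roughly like the following DataVec/DL4J sketch. This is a minimal sketch, assuming the UCI data has already been downloaded and split into one CSV file per sequence under the hypothetical featuresDir and labelsDir paths; the notebook linked above is the authoritative version.

import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader
import org.datavec.api.split.NumberedFileInputSplit
import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize

// Hypothetical locations of the prepared per-sequence CSV files
val featuresDir = "/tmp/uci/train/features"
val labelsDir   = "/tmp/uci/train/labels"

// One CSV file per sequence; features and labels are paired by file number
val featureReader = new CSVSequenceRecordReader(0, ",")
featureReader.initialize(new NumberedFileInputSplit(featuresDir + "/%d.csv", 0, 449))
val labelReader = new CSVSequenceRecordReader(0, ",")
labelReader.initialize(new NumberedFileInputSplit(labelsDir + "/%d.csv", 0, 449))

// Mini-batches of 10 sequences, 6 label classes, classification (not regression)
val trainData = new SequenceRecordReaderDataSetIterator(
  featureReader, labelReader, 10, 6, false,
  SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END)

// Collect mean/stddev over the training set, then standardize each batch on the fly
val normalizer = new NormalizerStandardize()
normalizer.fit(trainData)
trainData.reset()
trainData.setPreProcessor(normalizer)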

The next area of code is where we use the DL4J API to configure the neural network. In this example, we’re using a variant of a recurrent neural network called a long short-term memory network (LSTM).

// Configure the network
val conf = new NeuralNetConfiguration.Builder()
  .seed(123)
  .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
  .weightInit(WeightInit.XAVIER)
  .updater(new Nesterovs(0.005, 0.9))
  .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
  .gradientNormalizationThreshold(0.5)
  .graphBuilder()
  .addInputs("input")
  .setInputTypes(InputType.recurrent(1))
  .addLayer("lstm", new GravesLSTM.Builder().activation(Activation.TANH).nIn(1).nOut(10).build(), "input")
  .addVertex("pool", new LastTimeStepVertex("input"), "lstm")
  .addLayer("output", new OutputLayer.Builder(LossFunction.MCXENT)
            .activation(Activation.SOFTMAX).nIn(10).nOut(numLabelClasses).build(), "pool")
  .setOutputs("output")
  .pretrain(false)
  .backprop(true)
  .build()

The network has a single LSTM hidden layer (Graves variant) and a softmax output layer that gives probabilities across the six classes of time series data. The network is trained for 40 epochs (each epoch is a complete pass over all records in the training set).

val nEpochs = 40
for (i <- 0 until nEpochs) {
  network_model.fit(trainData)
  // Evaluate on the test set (evaluate() returns an Evaluation with accuracy, F1, etc.):
  val evaluation = network_model.evaluate(testData)
  val accuracy = evaluation.accuracy()
  val f1 = evaluation.f1()
  println(s"Test set evaluation at epoch $i: Accuracy = $accuracy, F1 = $f1")
  testData.reset()
  trainData.reset()
}

Right below the training loop in the notebook, a few debug lines show how to query the LSTM network:

// Test one record (label should be 1)
val record = Array(Array(Array(
  -1.65, 1.38, 1.37, 2.56, 2.72, 0.64, 0.76, 0.45, -0.28, -2.72, -2.85, -2.27, -1.23, -1.42, 0.90,
  1.81, 2.77, 1.12, 2.25, 1.26, -0.23, -0.27, -1.74, -1.90, -1.56, -1.35, -0.54, 0.41, 1.20, 1.59,
  1.66, 0.75, 0.96, 0.07, -0.70, -0.32, -1.13, -0.77, -0.96, -0.55, 0.39, 0.56, 0.52, 0.98, 0.91,
  0.23, -0.13, -0.31, -0.98, -0.73, -0.85, -0.77, -0.80, -0.04, 0.64, 0.77, 0.50, 0.98, 0.40, 0.24
)))
val flattened = ArrayUtil.flattenDoubleArray(record)
val input = Nd4j.create(flattened, Array(1, 1, 60), 'c')
val output = network_model.output(input)
val label = Nd4j.argMax(output(0), -1)
println(s"Label: $label")

The label returned from the network prediction should be “1”, which we’ll hand check from the client side in a moment. The notebook ends with a block of code that collects the model just trained and catalogs it in the model-history tracking system.

Each experiment needs to send a model and its evaluation metrics to the model server to be registered and archived. Each SKIL notebook must include a small amount of code, explained below, to make sure the model gets stored in the right place.

val modelId = skilContext.addModelToExperiment(z, network_model, "LSTM model")
val evalId = skilContext.addEvaluationToModel(z, modelId, evaluation, "Test set")

In addition to including the correct import headers and creating the skilContext object near the top of the notebook, these lines of code connect the notebook with the rest of the SKIL system.

In the first line, the specific model is attached to the experiment in the SKIL system. In the next line, the evaluation metric results are cataloged with the model ID tag in SKIL, so the model can be evaluated in the UI against other models later.
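For reference, the setup those two lines depend on looks roughly like this near the top of the sample notebook linked above. This is a sketch; the exact package and class names belong to this SKIL CE release, so verify them against the notebook itself.

// Imports and context setup (package paths as used by this SKIL release;
// verify against the sample notebook)
import io.skymind.zeppelin.utils._
import io.skymind.modelproviders.history.client.ModelHistoryClient

// SkilContext links this Zeppelin notebook to SKIL's model-history tracking;
// the "z" passed to its methods is Zeppelin's built-in ZeppelinContext
val skilContext = new SkilContext()
val client = skilContext.client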

Catalog Model

The AI model server allows SKIL to store and integrate deep learning models with AI applications. It stores all of the model revisions for a given experiment, and lets you choose which model you’d like to "deploy" or mark as “active”. Deploying a model means that it will be the model that serves the predictions to any production applications that query the REST endpoint.

After the above notebook example is done running in SKIL, click on the “Models” sub-tab in the experiment page to see the new model listed in the table (as below). This may require a page refresh.

Now that you’ve built a model with the notebook and made sure the model was cataloged in the model server, the next guide will show how to expose the model to the rest of the world through the REST interfaces by deploying the model.

Deploy Model

Once the model is indexed in the model history server, it will show up in the list of models that can be deployed to production to handle new data inference requests.

In the “Models” sub-tab in the experiment page, there’s a list of all models produced by notebook runs for this experiment, as seen above. Clicking on one of the models in the list will bring up specific model details (below).

For each model in this list, two operations can be performed:

  1. Mark Best
  2. Deploy

Marking a model as “Best” will pin it to the top of the model list, as seen above. Clicking the “Deploy Wizard” button brings up a “deploy model” dialog window.

Within SKIL, a “deployment” is a “logical group of models, transforms, and KNN endpoints”. It groups deployed components so you can track what belongs together and manage the system more easily.

As explained in the dialog, this wizard will make your model available via a REST API. Then you’ll be able to expose the ETL process as a transform endpoint, and configure the model. You also have the option to update an existing model in place. Clicking “next” will let you either create a new deployment or replace an existing one, as seen in the dialog below.

Let’s create a new deployment and name it “OurFirstDeployment”. In the dialog window that comes up after pressing the “Next” button, you see the option to “deploy the current ETL JSON as a transform”.

This refers to the vectorization of data for ETL, an explanation of which is beyond the scope of this quick start article. For now, you can leave that checkbox unchecked. Clicking “Next” again takes us to the final deployment wizard screen (below).

The “name” option here is different from “deployment name”, as it distinguishes the model inside the deployment group. It is required. We’ll use the name “lstm_model” for this example. You can also see the static file path for the physical model file in the local filesystem.

The “Scale” option tells the system how many model servers to start for model replication. SKIL CE is limited to 1 model server, so you don’t have to change that parameter.

The next field lets you provide additional JVM arguments. The last option, “Endpoint URLs”, allows housing multiple models under the same URI; we won’t set it in this quick start tutorial. Accept the “Deployment Mode” of “New Model”, and then click “Deploy” to finalize the deployment.

Once the deployment is finalized by clicking the “Deploy” button, the model will be listed in the “Deployments” screen (below).

Clicking on the entry for this newly deployed model brings up the deployment details (below).

NOTE: If the model is not deployed, click the “Start” button on the model:

This deployment includes its respective ETL vectorization transforms, along with the endpoint URLs tied to the model server in the “endpoint” column. The next section shows how to get live predictions from this newly deployed model via its REST interface.

Troubleshooting

You may see the following error at some point when deploying a model: “No JWT present or has expired”.

If so, leave the current browser tab open. Create a new tab and log in again. Close the new tab once you’re logged back in. Try once more to perform the original action that caused the error.

Inference

Get Predictions via Model Server

Once a deployment has been created and launched, you need to go the “last mile” by actually serving live predictions from this newly created AI model to a real application.

In this section, you’ll learn how to set up a sample Java client (which could easily be integrated into a Tomcat or Java application) to query the model you just built with the notebook.

The model server is running locally and it has exposed a REST endpoint at:

http://[host]:9008/endpoints/ourfirstdeployment/model/lstmmodel/default/

Now the client needs to be configured to speak REST to this endpoint and send a query with input data that you select. Let’s take a look at how to get these predictions, using the REST Java client sample code below.

Get Predictions via REST

To get a working SKIL model server client, clone the project at https://github.com/SkymindIO/SKIL_Examples with the following command:

$ git clone git@github.com:SkymindIO/SKIL_Examples.git

Build the client application JAR with the following commands:

$ cd SKIL_Examples
$ cd sample-api
$ mvn clean package

Once you build the client application, use the JAR to make a REST call to SKIL’s AI model server with the following command:

$ java -jar target/skil-ce-examples-1.0.0.jar quickstart http://host:9008/[skil_endpoint_name]

Note: Replace [skil_endpoint_name] with the endpoint to which your model was deployed.

The output of the client example code should look like this:

Inference response: Inference.Response.Classify{results{[1]}, probabilities{[0.9729845]}
  Label expected: 1
Inference response: Inference.Response.Classify{results{[4]}, probabilities{[0.8539419]}
  Label expected: 4
Inference response: Inference.Response.Classify{results{[5]}, probabilities{[0.9414516]}
  Label expected: 5
Inference response: Inference.Response.Classify{results{[5]}, probabilities{[0.94135857]}
  Label expected: 5

The output above shows the inference predictions that the SKIL AI model server returns, referring to specific labels in the classifier.

With that, you’ve built your first deep learning application with SKIL, from notebook to deployed production model. Not so hard, was it?

Watch this space for further tutorials on more applications of deep learning in production.
