SKIL Documentation

Skymind Intelligence Layer

The community edition of the Skymind Intelligence Layer (SKIL) is free. It takes data science projects from prototype to production quickly and easily. SKIL bridges the gap between the Python ecosystem and the JVM with a cross-team platform for data scientists, data engineers, and DevOps/IT. It is an automation tool for machine-learning workflows that enables easy training on Spark-GPU clusters, experiment tracking, one-click deployment of trained models, model performance monitoring, and more.

Get Started

Command Line (CLI)

SKIL comes with a built-in command line interface (CLI) for advanced setup and administrative tasks. Sometimes you will need to manually start a process with special variables, or you may want to write a custom shell script to manage SKIL for your own applications; the CLI is useful for these tasks.

The SKIL Command

The SKIL command is a Python wrapper typically located in /opt/skil/sbin if you have installed using an RPM distribution file or Docker. You can add /opt/skil/sbin to your PATH environment variable and use the skil command from anywhere, or cd into /opt/skil/sbin and run ./skil directly.
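
For example, to make the command available in the current shell session, assuming the default /opt/skil install location:

$ export PATH="$PATH:/opt/skil/sbin"
$ skil    # prints usage and a login reminder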

Executing ./sbin/skil without arguments will give the following output:

bash-4.2$ ./sbin/skil
SKIL_HOME not set. Using directory: /opt/skil
SKIL_CLASS_PATH not set. Using: /opt/skil/lib/*:/opt/skil/native/*:/opt/skil/jackson-2.5.1/*
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/skil/lib/zeppelin-spark_2.10-0.7.3_skil-1.0.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/skil/lib/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/skil/lib/logback-classic-1.1.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Please login first with skil login --userId username --password password
usage: skil.py [-h] [--host HOST] [--port PORT]
              {npm,modelhistory,processes,inference,pkill,agents,nearestneighbor,parameter_server_master,media_driver,zeppelinInterpreter,addplugin,parameter_server_slave,ui,zeppelin,services,datavec,spark,arbiter,parallelwrapper,loadbalancer,login}
               ...
skil.py: error: too few arguments

Note that you will need to authenticate your SKIL client first with the login command. Each SKIL command also accepts optional --host and --port arguments for cases where your SKIL server is not listening at the default location:

$ ./sbin/skil --host 192.168.1.1 --port 9008 {command goes here}

Available Commands

The SKIL CLI exposes commands for managing processes, performing inference, and setting up parallel wrappers (for model training). Most of these commands start a specific service and let you define its port and a custom name.

Note that you can list existing processes and services by using the ./skil processes and ./skil services commands respectively.
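
For example, after logging in:

$ ./skil processes    # JSON description of all running processes
$ ./skil services     # condensed list of running services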

Each entry below lists the command, its flags, and a short description.

modelhistory

[--userName USERNAME]                            
[--modelHistoryPort MODELHISTORYPORT]
[--password PASSWORD] 
[--name NAME]
[--dbPort DBPORT] 
[--jvmArgs JVMARGS]
[--agentId AGENTID] 
[--dbName DBNAME]
[--dbPath DBPATH]

SKIL creates a Default Model History Server on startup. You can create a separate model history server to isolate models between departments or projects.
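
As a sketch, starting a second model history server for another team might look like this; the name and port are hypothetical placeholders:

$ ./skil modelhistory --name team-b-history --modelHistoryPort 9201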

processes

Returns a JSON representation of all the running processes for monitoring purposes.

inference

[--batchLimit BATCHLIMIT]
[--queueLimit QUEUELIMIT] 
[--name NAME]
[--inputNames INPUTNAMES] 
[--jvmArgs JVMARGS]
[--workers WORKERS] 
[--inferenceMode INFERENCEMODE]
[--modelHistoryServerUrl MODELHISTORYSERVERURL]
[--predictServerPort PREDICTSERVERPORT]
[--outputNames OUTPUTNAMES] 
[--agentId AGENTID]
[--modelUri MODELURI]

Creates a stand-alone model server. SKIL's deployments feature uses this command under the hood. You can use this command for advanced deployment scenarios.
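
A minimal sketch of starting a model server by hand; the name, model URI, and port are hypothetical placeholders:

$ ./skil inference --name my-model-server \
    --modelUri hdfs:///models/my-model.zip \
    --predictServerPort 9300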

pkill

[--id [enter a process ID]]

This will delete a process and its configuration from ZooKeeper so it will no longer be restored on start-up. The Default Model History Server and Default Zeppelin processes will still be recreated unless they are disabled. You should only use pkill on processes you've created using the CLI.
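
For example, assuming 1234 is the ID of a process you created (look it up with ./skil processes):

$ ./skil pkill --id 1234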

agents

Returns the list of agents in the cluster.

nearestneighbor

[--name NAME] 
[--jvmArgs JVMARGS]
[--labelsPath LABELSPATH]
[--ndarrayPath NDARRAYPATH]
[--nearestNeighborsPort NEARESTNEIGHBORSPORT]
[--similarityFunction SIMILARITYFUNCTION]
[--agentId AGENTID] [--invert INVERT]

Starts a stand-alone KNN server outside of deployments. Useful for advanced deployment scenarios.
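
A sketch of starting a stand-alone KNN server; every value below, including the similarity function, is a hypothetical placeholder:

$ ./skil nearestneighbor --name my-knn-server \
    --ndarrayPath /data/vectors.bin \
    --labelsPath /data/labels.csv \
    --similarityFunction cosine \
    --nearestNeighborsPort 9400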

zeppelinInterpreter

[--zeppelinInterpreterDir ZEPPELININTERPRETERDIR]
[--name NAME] 
[--jvmArgs JVMARGS]
[--interpreterPort INTERPRETERPORT]
[--zeppelinHome ZEPPELINHOME]
[--agentId AGENTID]

Starts a new Zeppelin interpreter process. SKIL creates a default Zeppelin server and Zeppelin interpreter process, but you can use this command to add interpreters for specific team members or for long-running pre-processing or training jobs.

Zeppelin interpreter processes require a Zeppelin server process to be up and running.
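
For example, a sketch of adding an interpreter for another team; the name and port are hypothetical placeholders:

$ ./skil zeppelinInterpreter --name team-b-interpreter --interpreterPort 9500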

addplugin

[enter path to plugin JAR]

Uploads a plugin JAR file. Used mostly in custom transform process steps or for defining DataSetProvider classes for Spark or Parallel Wrapper jobs.
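
For example, assuming you have built a plugin JAR at the (hypothetical) path shown:

$ ./skil addplugin /tmp/my-transform-plugin.jar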

ui

[--jvmArgs JVMARGS] 
[--enableRemote ENABLEREMOTE]
[--agentId AGENTID] 
[--name NAME] 
[--uiPort UIPORT]

Starts a DL4J UI server for visualizing the model training process. Pass the --uiPort value to a StatsListener inside a notebook to track model performance.
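
A sketch of starting a UI server on a hypothetical port, which you would then reference from a notebook's StatsListener:

$ ./skil ui --name training-ui --uiPort 9100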

zeppelin

[--zeppelinInterpreterDir ZEPPELININTERPRETERDIR]
[--name NAME] [--zeppelinBinDir ZEPPELINBINDIR]
[--jvmArgs JVMARGS]
[--zeppelinPassword ZEPPELINPASSWORD]
[--zeppelinConfDir ZEPPELINCONFDIR]
[--zeppelinPort ZEPPELINPORT]
[--zeppelinMemory ZEPPELINMEMORY]
[--zeppelinWarDir ZEPPELINWARDIR]
[--zeppelinNotebookDirectory ZEPPELINNOTEBOOKDIRECTORY]
[--deleteInterpreterRepoOnStartup DELETEINTERPRETERREPOONSTARTUP]
[--master MASTER]
[--zeppelinUserName ZEPPELINUSERNAME]
[--zeppelinLogFile ZEPPELINLOGFILE]
[--zeppelinHost ZEPPELINHOST]
[--zeppelinHome ZEPPELINHOME]
[--interpreterPort INTERPRETERPORT]
[--agentId AGENTID]
[--zeppelinLocalRepo ZEPPELINLOCALREPO]

Starts a new Zeppelin server process. SKIL creates a default Zeppelin server and Zeppelin interpreter process, but you can use this command to add servers for specific team members or for long-running pre-processing or training jobs. Once you've created a Zeppelin server process, you need to create one or more zeppelinInterpreter processes to evaluate notebooks.
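
A sketch of adding a second Zeppelin server; the name, port, and notebook directory are hypothetical placeholders:

$ ./skil zeppelin --name team-b-zeppelin \
    --zeppelinPort 9600 \
    --zeppelinNotebookDirectory /data/team-b/notebooks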

services

Returns the list of running services. Similar to processes but slightly less verbose.

datavec

[--name NAME] 
[--dataType DATATYPE]
[--jvmArgs JVMARGS] 
[--jsonPath JSONPATH]
[--agentId AGENTID] 
[--dataVecPort DATAVECPORT]

Starts a stand-alone transform process server. Useful for advanced deployment scenarios.
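
A sketch of starting a transform server from a saved transform-process JSON; the path and port are hypothetical placeholders:

$ ./skil datavec --name my-transform-server \
    --jsonPath /data/transform-process.json \
    --dataVecPort 9700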

spark

[--modelHistoryId MODELHISTORYID] 
[--status STATUS]
[--verbose VERBOSE]
[--total-executor-cores TOTAL_EXECUTOR_CORES]
[--driver-class-path DRIVER_CLASS_PATH] 
[--uiUrl UIURL]
[--driver-memory DRIVER_MEMORY] 
[--kill KILL]
[--class CLASS] 
[--multiDataSet MULTIDATASET]
[--principal PRINCIPAL] 
[--agentId AGENTID]
[--numEpochs NUMEPOCHS] 
[--batchSize BATCHSIZE]
[--deploy-mode DEPLOY_MODE]
[--driver-library-path DRIVER_LIBRARY_PATH]
[--outputPath OUTPUTPATH] 
[--num-executors NUM_EXECUTORS]
[--modelPath MODELPATH] 
[--master MASTER]
[--evalDataSetProviderClass EVALDATASETPROVIDERCLASS]
[--driver-cores DRIVER_CORES] 
[--jars JARS]
[--executor-memory EXECUTOR_MEMORY] 
[--files FILES]
[--keytab KEYTAB] 
[--properties-file PROPERTIES_FILE]
[--trainingMasterPath TRAININGMASTERPATH]
[--supervise SUPERVISE] 
[--queue QUEUE]
[--packages PACKAGES]
[--exclude-packages EXCLUDE_PACKAGES]
[--doInference DOINFERENCE]
[--modelInstanceId MODELINSTANCEID] 
[--name NAME]
[--proxy-user PROXY_USER] 
[--jvmArgs JVMARGS]
[--evalType EVALTYPE] 
[--repositories REPOSITORIES]
[--modelHistoryUrl MODELHISTORYURL]
[--dataSetProvider DATASETPROVIDER]
[--driver-java-options DRIVER_JAVA_OPTIONS]

Starts a data-parallel model training job on Spark using DL4J. This command creates an uberjar of all of SKIL's dependencies and launches the job using the specified Spark home. To use it, create a model in a notebook, save it to disk or HDFS, upload a plugin JAR containing a DataSetProvider subclass that fetches and vectorizes your data set, and then run this command, specifying the class name and model path.
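
A minimal sketch of such a launch, assuming the model has been saved to HDFS and a plugin JAR with a DataSetProvider subclass has been uploaded; every value below is a hypothetical placeholder:

$ ./skil spark --master yarn \
    --modelPath hdfs:///models/my-model.zip \
    --dataSetProvider com.example.MyDataSetProvider \
    --numEpochs 10 \
    --outputPath hdfs:///models/my-model-trained.zip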

parallelwrapper

[--modelHistoryId MODELHISTORYID]
[--reportScore REPORTSCORE]
[--multiDataSet MULTIDATASET] 
[--name NAME]
[--modelOutputPath MODELOUTPUTPATH]
[--averagingFrequency AVERAGINGFREQUENCY]
[--jvmArgs JVMARGS] 
[--workers WORKERS]
[--uiUrl UIURL]
[--dataSetIteratorFactoryClazz DATASETITERATORFACTORYCLAZZ]
[--modelHistoryUrl MODELHISTORYURL]
[--averageUpdaters AVERAGEUPDATERS]
[--legacyAveraging LEGACYAVERAGING]
[--modelPath MODELPATH]
[--prefetchSize PREFETCHSIZE]
[--evalDataSetProviderClass EVALDATASETPROVIDERCLASS]
[--agentId AGENTID] 
[--evalType EVALTYPE]
[--multiDataSetIteratorFactoryClazz MULTIDATASETITERATORFACTORYCLAZZ]

Starts a data-parallel model training job on multiple GPUs using DL4J. To use it, create a model in a notebook, save it to disk or HDFS, upload a plugin JAR containing a DataSetIteratorFactory subclass that fetches and vectorizes your data set, and then run this command, specifying the class name and model path.
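
A minimal sketch, again with hypothetical placeholder values, assuming a plugin JAR with a DataSetIteratorFactory subclass has been uploaded:

$ ./skil parallelwrapper --modelPath /models/my-model.zip \
    --dataSetIteratorFactoryClazz com.example.MyIteratorFactory \
    --workers 4 \
    --modelOutputPath /models/my-model-trained.zip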

loadbalancer

[--jvmArgs JVMARGS] 
[--agentId AGENTID]
[--name NAME] 
[--urls URLS]
[--loadBalancePort LOADBALANCEPORT]

Creates a simple load balancer that routes requests to the specified URLs. Useful for advanced deployment scenarios when using stand-alone model, transform, and KNN servers.
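
A sketch of balancing across two stand-alone model servers; the URLs and port are hypothetical, and the comma-separated format for --urls is an assumption:

$ ./skil loadbalancer --name my-balancer \
    --urls http://10.0.0.1:9300,http://10.0.0.2:9300 \
    --loadBalancePort 9090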

login

[--userId USERID] 
[--password PASSWORD]

Logs in to SKIL and saves the token in your home directory.

Leave out the --password argument to enter it interactively.
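
For example (the user ID is a placeholder; you will be prompted for the password):

$ ./skil login --userId admin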

The following commands are deprecated or alpha but are documented here for completeness.

Each entry below lists the command, its status, flags, and a short description.

npm

deprecated

[--jvmArgs JVMARGS] 
[--agentId AGENTID] 
[--name NAME] 
[--moduleName MODULENAME] 
[--npmHome NPMHOME]

This is a command used for development and will be removed in a later release.

parameter_server_master

alpha

[--aeronDirectory AERONDIRECTORY]
[--name NAME] 
[--jvmArgs JVMARGS]
[--parameterServerAeronPort PARAMETERSERVERAERONPORT]
[--shape SHAPE] 
[--streamId STREAMID]
[--parameterServerStatusPort PARAMETERSERVERSTATUSPORT]
[--agentId AGENTID]

Creates a parameter server for accelerating Spark training on very large Hadoop clusters.

media_driver

alpha

[--jvmArgs JVMARGS]
[--aeronDirectory AERONDIRECTORY]
[--agentId AGENTID] 
[--name NAME]

Creates a server for sharing weights inside a Spark training job. Used in very large Hadoop clusters.

parameter_server_slave

alpha

[--aeronDirectory AERONDIRECTORY]
[--name NAME] 
[--jvmArgs JVMARGS]
[--parameterServerAeronPort PARAMETERSERVERAERONPORT]
[--shape SHAPE] 
[--masterUrl MASTERURL]
[--streamId STREAMID]
[--parameterServerStatusPort PARAMETERSERVERSTATUSPORT]
[--agentId AGENTID]

Creates a parameter server for sharing weights in Spark training jobs. Used in very large Spark clusters to optimize top-of-rack network bandwidth.

arbiter

alpha

[--regressionType REGRESSIONTYPE]
[--dataSetIteratorClass DATASETITERATORCLASS]
[--name NAME] 
[--jvmArgs JVMARGS]
[--problemType PROBLEMTYPE]
[--neuralNetType NEURALNETTYPE]
[--modelSavePath MODELSAVEPATH] 
[--agentId AGENTID]
[--optimizationConfigPath OPTIMIZATIONCONFIGPATH]

Starts an Arbiter server for neural network hyperparameter search.


What's Next

For some fun use cases of SKIL's CLI, you can take a look at the following pages:

Adding more Zeppelin Instances
Visualization
Distributed Training in SKIL
Training on Multiple GPUs in SKIL
