Shell interpreter for Apache Zeppelin

Overview

Zeppelin Shell provides two interpreters, %sh and %sh.terminal. The default is the %sh interpreter.

Shell interpreter

The Shell interpreter uses Apache Commons Exec to execute external processes. In a Zeppelin notebook, put %sh at the beginning of a paragraph to invoke the system shell and run commands.

Terminal interpreter

The Terminal interpreter uses hterm and Pty4J to emulate terminal operation.

Note: Currently, each command runs as the same user that runs the Zeppelin server.

Configuration

In the "Interpreters" menu, found in the Zeppelin dropdown menu, you can set the property values for the Shell interpreter.

| Name | Default | Description |
|---|---|---|
| shell.command.timeout.millisecs | 60000 | Shell command timeout in milliseconds |
| shell.working.directory.user.home | false | If set to true, the shell's working directory is set to the user's home directory |
| zeppelin.shell.auth.type | | Authentication type; the supported types are SIMPLE and KERBEROS |
| zeppelin.shell.principal | | The principal name to load from the keytab |
| zeppelin.shell.keytab.location | | The path to the keytab file |
| zeppelin.shell.interpolation | false | Enable ZeppelinContext variable interpolation into paragraph text |
| zeppelin.terminal.ip.mapping | | Internal and external IP mapping of the Zeppelin server |
| zeppelin.concurrency.max | 10 | Maximum concurrency of the shell interpreter |

Example

Shell interpreter

The following example demonstrates the basic usage of Shell in a Zeppelin notebook.
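
As a minimal sketch of such a paragraph (the commands are illustrative and not taken from the original example), the snippet below prints the account the commands run as, prints the working directory, and uses a variable with a pipe. In Zeppelin the paragraph would start with the %sh directive; it is written as a comment here so that the snippet also runs in an ordinary shell. The variable name greeting is made up for this example.

```shell
# %sh   <- in a Zeppelin paragraph this directive is the first line, without the leading "# "

# Print the account the commands run as (expected to be the Zeppelin server's user).
whoami

# Print the working directory of the interpreter process.
pwd

# Ordinary shell constructs such as variables and pipes work as usual.
greeting="hello from zeppelin"
echo "$greeting" | tr '[:lower:]' '[:upper:]'
```

The output of the first two commands depends on the host and on the shell.working.directory.user.home setting.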

For more information about configuring interpreter settings for the Shell interpreter, read the "What is interpreter setting?" section first.

Kerberos refresh interval

To change when the Kerberos ticket is renewed, set the following variables in conf/zeppelin-env.sh.

# Change the Kerberos refresh interval (default value is 1d). Allowed suffixes are ms, s, m, min, h, and d.
export KERBEROS_REFRESH_INTERVAL=4h
# Change the kinit failure threshold (default value is 5): if kinit fails this many times consecutively, the interpreter is closed.
export KINIT_FAIL_THRESHOLD=10

Object Interpolation

The shell interpreter also supports interpolation of ZeppelinContext objects into the paragraph text. The following example shows one use of this facility:

In a Scala cell:

z.put("dataFileName", "members-list-003.parquet")
// ...
// z.get returns an Object, so convert it to a String before using it as a path
val members = spark.read.parquet(z.get("dataFileName").toString)
// ...

In a later Shell cell:

%sh
rm -rf {dataFileName}

Object interpolation is disabled by default. To enable it for the Shell interpreter, set the property zeppelin.shell.interpolation to true (see Configuration above). More details of this feature can be found in Zeppelin-Context.
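
Because interpolation substitutes text into the paragraph before the shell runs it, a value containing spaces could change how the command is parsed. A slightly more defensive variant of the Shell cell, offered only as a sketch, quotes the placeholder and checks that the file exists before removing it. The variable name data_file is introduced here for illustration, and the %sh directive is again shown as a comment so the snippet runs in an ordinary shell.

```shell
# %sh   <- first line of the paragraph in Zeppelin, without the leading "# "

# With interpolation enabled, Zeppelin would replace {dataFileName} with the
# stored value before the shell sees this line. Quoting keeps a value that
# contains spaces together as a single path.
data_file="{dataFileName}"

# Remove the file only if it exists as a regular file; "--" stops option parsing
# so a name starting with "-" is not treated as a flag.
if [ -f "$data_file" ]; then
    rm -f -- "$data_file"
    echo "removed $data_file"
else
    echo "nothing to remove: $data_file"
fi
```

Run outside Zeppelin, the placeholder is not substituted, so the script reports that there is nothing to remove. Quoting helps with spaces but is not a complete defense against values that themselves contain quotes or shell metacharacters.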

Terminal interpreter

The following example demonstrates the basic usage of the Terminal interpreter in a Zeppelin notebook.

%sh.terminal
input any char

zeppelin.terminal.ip.mapping

When the terminal interpreter runs in a notebook, the notebook front end needs the IP address of the server that hosts the terminal interpreter in order to communicate with it.

In a public cloud environment, the cloud host on which the interpreter runs has both an internal IP and an external access IP. If the front end is given the internal IP, it may be unable to connect to the terminal interpreter, which leaves the terminal interpreter unusable.

Solution: Set the mapping between internal IP and external IP in the terminal interpreter settings, so that the notebook front end connects through the external IP of the terminal interpreter.

Example: {"internal-ip1":"external-ip1", "internal-ip2":"external-ip2"}
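
As a small sketch of how such a value might be assembled, the snippet below builds a one-entry mapping in the same JSON shape as the example. Both addresses are hypothetical: 10.0.0.12 stands in for a private internal address, and 203.0.113.45 comes from a range reserved for documentation. The variable names are made up for this example.

```shell
# Hypothetical addresses for illustration only.
internal_ip="10.0.0.12"       # private address of the cloud host
external_ip="203.0.113.45"    # externally reachable address (documentation range)

# Build a JSON object that maps the internal IP to the external IP.
mapping=$(printf '{"%s":"%s"}' "$internal_ip" "$external_ip")
echo "$mapping"
```

The printed string would then be used as the value of zeppelin.terminal.ip.mapping in the interpreter settings.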