Generic JDBC Interpreter for Apache Zeppelin

Overview

This interpreter lets you create a JDBC connection to any data source, by now it has been tested with:

Postgres
MySql
MariaDB
Redshift
Apache Hive
Apache Phoenix
Apache Drill (Details on using Drill JDBC Driver)
Apache Tajo

If someone else used another database please report how it works to improve functionality.

Create Interpreter

When you create a interpreter by default use PostgreSQL with the next properties:

name	value
common.max_count	1000
default.driver	org.postgresql.Driver
default.password	********
default.url	jdbc:postgresql://localhost:5432/
default.user	gpadmin

It is not necessary to add driver jar to the classpath for PostgreSQL as it is included in Zeppelin.

Simple connection

Prior to creating the interpreter it is necessary to add maven coordinate or path of the JDBC driver to the Zeppelin classpath. To do this you must edit dependencies artifact(ex. mysql:mysql-connector-java:5.1.38) in interpreter menu as shown:

To create the interpreter you need to specify connection parameters as shown in the table.

name	value
common.max_count	1000
default.driver	driver name
default.password	********
default.url	jdbc url
default.user	user name

Multiple connections

JDBC interpreter also allows connections to multiple data sources. It is necessary to set a prefix for each connection to reference it in the paragraph in the form of %jdbc(prefix). Before you create the interpreter it is necessary to add each driver's maven coordinates or JDBC driver's jar file path to the Zeppelin classpath. To do this you must edit the dependencies of JDBC interpreter in interpreter menu as following:

You can add all the jars you need to make multiple connections into the same JDBC interpreter. To create the interpreter you must specify the parameters. For example we will create two connections to MySQL and Redshift, the respective prefixes are default and redshift:

name	value
common.max_count	1000
default.driver	com.mysql.jdbc.Driver
default.password	********
default.url	jdbc:mysql://localhost:3306/
default.user	mysql-user
redshift.driver	com.amazon.redshift.jdbc4.Driver
redshift.password	********
redshift.url	jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439
redshift.user	redshift-user

Bind to Notebook

In the Notebook click on the settings icon at the top-right corner. Use select/deselect to specify the interpreters to be used in the Notebook.

More Properties

You can modify the interpreter configuration in the Interpreter section. The most common properties are as follows, but you can specify other properties that need to be connected.

Property Name	Description
{prefix}.url	JDBC URL to connect, the URL must include the name of the database
{prefix}.user	JDBC user name
{prefix}.password	JDBC password
{prefix}.driver	JDBC driver name.
common.max_result	Max number of SQL result to display to prevent the browser overload. This is common properties for all connections

To develop this functionality use this method. For example if a connection needs a schema parameter, it would have to add the property as follows:

name	value
{prefix}.schema	schema_name

Examples

Hive

Properties

Name	Value
hive.driver	org.apache.hive.jdbc.HiveDriver
hive.url	jdbc:hive2://localhost:10000
hive.user	hiveuser
hive.password	hivepassword

Dependencies

Artifact	Excludes
org.apache.hive:hive-jdbc:0.14.0
org.apache.hadoop:hadoop-common:2.6.0

Phoenix

Properties

Name	Value
phoenix.driver	org.apache.phoenix.jdbc.PhoenixDriver
phoenix.url	jdbc:phoenix:localhost:2181:/hbase-unsecure
phoenix.user	phoenixuser
phoenix.password	phoenixpassword

Dependencies

Artifact	Excludes
org.apache.phoenix:phoenix-core:4.4.0-HBase-1.0

Tajo

Properties

Name	Value
tajo.driver	org.apache.tajo.jdbc.TajoDriver
tajo.url	jdbc:tajo://localhost:26002/default

Dependencies

Artifact	Excludes
org.apache.tajo:tajo-jdbc:0.11.0

How to use

Reference in paragraph

Start the paragraphs with the %jdbc, this will use the default prefix for connection. If you want to use other connection you should specify the prefix of it as follows %jdbc(prefix):

%jdbc
SELECT * FROM db_name;

%jdbc(prefix)
SELECT * FROM db_name;

Apply Zeppelin Dynamic Forms

You can leverage Zeppelin Dynamic Form inside your queries. You can use both the text input and select form parametrization features

%jdbc(prefix)
SELECT name, country, performer
FROM demo.performers
WHERE name=''

Bugs & Reporting

If you find a bug for this interpreter, please create a JIRA ticket.