# Java Client for TD-API

Using the Java client for [Treasure Data API](/apis/td-api), you can:

- Submit Hive/Trino(Presto) queries to Treasure Data.
- Check the status of jobs (queries).
- Retrieve query results.
- Check the information of databases and tables.

Note that td-client-java 0.8.0 requires Java 1.8 or higher, and td-client-java 0.7.x requires Java 7.

## Install

You can download a Jar file (td-client-java-(version)-shade.jar) from [Maven Central](https://repo1.maven.org/maven2/com/treasuredata/client/td-client/). For information about older versions, see the [0.7.x branch](https://github.com/treasure-data/td-client-java/tree/0.7.x).

Use the following dependency settings for either Maven or the Standalone Jar file.

**Maven**

```xml
<dependency>
  <groupId>com.treasuredata.client</groupId>
  <artifactId>td-client</artifactId>
  <version>(version)</version>
</dependency>
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <version>1.2.3</version>
</dependency>
```

**Standalone Jar**

```xml
<dependency>
  <groupId>com.treasuredata.client</groupId>
  <artifactId>td-client</artifactId>
  <version>(version)</version>
  <classifier>shade</classifier>
</dependency>
```

## Basic Use

### Set API Key

**Option 1: Config file**

To use td-client-java, set your API key in the `$HOME/.td/td.conf` file.

```
[account]
  user = (your TD account e-mail address)
  apikey = (your API key)
```

**Option 2: Environment variable**

It is also possible to use the `TD_API_KEY` environment variable. Add the following line to your shell configuration file (`.bash_profile`, `.zprofile`, etc.). Note that the shell does not allow spaces around `=`.

```shell
export TD_API_KEY=YOUR_API_KEY
```

For Windows, add the `TD_API_KEY` environment variable in the user preference panel.

### Example Code

```java
import com.treasuredata.client.*;
// Model classes such as TDDatabase, TDTable, TDJob, and TDJobRequest
import com.treasuredata.client.model.*;
import com.google.common.base.Function;
import org.msgpack.core.MessagePack;
import org.msgpack.core.MessageUnpacker;
import org.msgpack.value.ArrayValue;
import java.io.InputStream;
import java.util.List;
import java.util.zip.GZIPInputStream;
...
// Create a new TD client by using configurations in $HOME/.td/td.conf
TDClient client = TDClient.newClient();
try {
  // Retrieve database and table names
  List<TDDatabase> databaseNames = client.listDatabases();
  for(TDDatabase db : databaseNames) {
    System.out.println("database: " + db.getName());
    for(TDTable table : client.listTables(db.getName())) {
      System.out.println(" table: " + table);
    }
  }

  // Submit a new Trino(Presto) query (for Hive, use TDJobRequest.newHiveQuery)
  String jobId = client.submit(TDJobRequest.newTrinoQuery("sample_datasets", "select count(1) from www_access"));

  // Wait until the query finishes
  ExponentialBackOff backoff = new ExponentialBackOff();
  TDJobSummary job = client.jobStatus(jobId);
  while(!job.getStatus().isFinished()) {
    Thread.sleep(backoff.nextWaitTimeMillis());
    job = client.jobStatus(jobId);
  }

  // Read the detailed job information
  TDJob jobInfo = client.jobInfo(jobId);
  System.out.println("log:\n" + jobInfo.getCmdOut());
  System.out.println("error log:\n" + jobInfo.getStdErr());

  // Read the job results in msgpack.gz format
  client.jobResult(jobId, TDResultFormat.MESSAGE_PACK_GZ, new Function<InputStream, Object>() {
    @Override
    public Object apply(InputStream input) {
      try {
        MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(new GZIPInputStream(input));
        while(unpacker.hasNext()) {
          // Each row of the query result is array type value (e.g., [1, "name", ...])
          ArrayValue array = unpacker.unpackValue().asArrayValue();
          int id = array.get(0).asIntegerValue().toInt();
        }
        return null;
      }
      catch (Exception e) {
        // Reading or decompressing the result stream failed
        throw new RuntimeException(e);
      }
    }
  });
  ...
}
finally {
  // Never forget to close the TDClient.
  client.close();
}
```

### Bulk upload

```java
// Create a new TD client by using configurations in $HOME/.td/td.conf
TDClient client = TDClient.newClient();

File f = new File("./sess/part01.msgpack.gz");

TDBulkImportSession session = client.createBulkImportSession("session_name", "database_name", "table_name");
client.uploadBulkImportPart(session.getName(), "session_part01", f);
```

### Data Connector Bulk Loading

```java
// Create a new TD client by using configurations in $HOME/.td/td.conf
TDClient client = TDClient.newClient();

client.startBulkLoadSession("session_name");
```

## Advanced Use

### Proxy Server

If you need to access the web through a proxy, add the following configuration to the `$HOME/.td/td.conf` file:

```
[account]
  user = (your TD account e-mail address)
  apikey = (your API key)
  td.client.proxy.host = (optional: proxy host name)
  td.client.proxy.port = (optional: proxy port number)
  td.client.proxy.user = (optional: proxy user name)
  td.client.proxy.password = (optional: proxy password)
```

### Configuring TDClient

To configure TDClient, use `TDClient.newBuilder()`:

```java
TDClient client = TDClient
  .newBuilder()
  .setApiKey("(your api key)")
  .setEndpoint("api.ybi.idcfcloud.net") // For using a non-default endpoint
  .build();
```

It is also possible to set the configuration with a `Properties` object:

```java
Properties prop = new Properties();
// Set your own properties
prop.setProperty("td.client.retry.limit", "10");
...

// This overrides the default configuration parameters with the given Properties
TDClient client = TDClient.newBuilder().setProperties(prop).build();
```

### Configuration Parameters

The precedence of the configuration parameters is as follows:

1. Properties object passed to `TDClient.newBuilder().setProperties(Properties p)`
2. Parameters written in `$HOME/.td/td.conf`
3. System properties (passed with the `-D` option when launching the JVM)
4. Environment variable (only for the `TD_API_KEY` parameter)

| Key | Default Value | Description |
| --- | --- | --- |
| `apikey` | | API key to access Treasure Data. You can also set this via the `TD_API_KEY` environment variable. |
| `user` | | Account e-mail address (unnecessary if `apikey` is set) |
| `password` | | Account password (unnecessary if `apikey` is set) |
| `td.client.proxy.host` | | (optional) Proxy host, e.g., "myproxy.com" |
| `td.client.proxy.port` | | (optional) Proxy port, e.g., "80" |
| `td.client.proxy.user` | | (optional) Proxy user |
| `td.client.proxy.password` | | (optional) Proxy password |
| `td.client.usessl` | true | (optional) Use SSL encryption |
| `td.client.retry.limit` | 7 | (optional) Maximum number of API request retries |
| `td.client.retry.initial-interval` | 500 | (optional) Initial retry interval; backoff retry interval = (interval) * (multiplier) ^ (retry count) |
| `td.client.retry.max-interval` | 60000 | (optional) Maximum retry interval |
| `td.client.retry.multiplier` | 2.0 | (optional) Retry interval multiplier |
| `td.client.connect-timeout` | 15000 | (optional) Connection timeout before reaching the API |
| `td.client.read-timeout` | 60000 | (optional) Timeout when no data is coming from the API |
| `td.client.connection-pool-size` | 64 | (optional) Connection pool size |
| `td.client.endpoint` | `api.treasuredata.com` | (optional) TD REST API endpoint name |
| `td.client.port` | 80 for non-SSL, 443 for SSL connection | (optional) TD API port number |

## Further Reading

- [GitHub Source Code](https://github.com/treasure-data/td-client-java)
- [Version information](https://github.com/treasure-data/td-client-java/blob/master/CHANGES.txt)
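
## Appendix: Retry Interval Sketch

The retry parameters in the Configuration Parameters table describe the backoff interval as (interval) * (multiplier) ^ (retry count). The following is a minimal sketch of that formula with the documented defaults (500, 2.0, 60000). It is not the client's actual implementation: the class name `RetryIntervalSketch`, the method `intervalMillis`, the choice to start the retry count at 0, the assumption that `td.client.retry.max-interval` acts as an upper cap, and the assumption that the values are milliseconds are all illustrative.

```java
public class RetryIntervalSketch {
    /**
     * Hypothetical helper, not part of td-client-java.
     * Assumed formula: min(maxInterval, initialInterval * multiplier ^ retryCount).
     */
    public static long intervalMillis(long initialIntervalMillis, double multiplier,
                                      long maxIntervalMillis, int retryCount) {
        // Exponential growth of the wait time with each retry
        double raw = initialIntervalMillis * Math.pow(multiplier, retryCount);
        // Assumption: the max interval caps the computed wait time
        return (long) Math.min(raw, (double) maxIntervalMillis);
    }

    public static void main(String[] args) {
        // Print the wait time for each retry, using the default values from the table
        for (int retry = 0; retry <= 7; retry++) {
            System.out.println("retry " + retry + ": "
                + intervalMillis(500, 2.0, 60000, retry) + " ms");
        }
    }
}
```

Under these assumptions the wait time doubles from 500 on each retry and is capped once the computed value (64000 at retry count 7) exceeds the 60000 maximum. Lowering `td.client.retry.multiplier` or `td.client.retry.max-interval` would shorten the total time spent waiting before the client gives up.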