# TD Toolbelt Reference

You can run Treasure Data from the command line using these commands.

| Command | Example |
| --- | --- |
| [Basic Commands](#basic-commands) | `td` |
| [Database Commands](#database-commands) | `td db:create <db_name>` |
| [Table Commands](#table-commands) | `td table:list [db]` |
| [Query Commands](#query-commands) | `td query [sql]` |
| [Import Commands](#import-commands) | `td import:list` |
| [Bulk Import Commands](#bulk-import-commands) | `td bulk_import:list` |
| [Result Commands](#result-commands) | `td result:list` |
| [Schedule Commands](#schedule-commands) | `td sched:list` |
| [Schema Commands](#schema-commands) | `td schema:show <db_name> <table_name>` |
| [Connector Commands](#connector-commands) | `td connector:guess [config]` |
| [User Commands](#user-commands) | `td user:list` |
| [Workflow Commands](#workflow-commands) | `td workflow init` |
| [Job Commands](#job-commands) | `td job:show <job_id>` |

## Basic Commands

You can use the following commands to enable basic functions in Treasure Data.

+ [td](#td)
+ [Additional commands](#additional-commands)

### td

Show the list of options in Treasure Data.

**Usage**

```
td
```

| Options | Description |
| --- | --- |
| `-c, --config PATH` | path to the configuration file (default: `~/.td/td.conf`) |
| `-k, --apikey KEY` | use this API key instead of reading the config file |
| `-e, --endpoint API_SERVER` | specify the URL of the API server to use (default: https://api.treasuredata.com). The URL must contain a scheme (http:// or https:// prefix) to be valid. |
| `--insecure` | insecure access: disable SSL (enabled by default) |
| `-v, --verbose` | verbose mode |
| `-r, --retry-post-requests` | retry failed POST requests. Warning: can cause resource duplication, such as duplicated job submissions. |
| `--version` | show version |

### Additional Commands

**Usage**

```
td <command>
```

| Command | Description |
| --- | --- |
| `db` | create/delete/list databases |
| `table` | create/delete/list/import/export/tail tables |
| `query` | issue a query |
| `job` | show/kill/list jobs |
| `import` | manage bulk import sessions (Java-based fast processing) |
| `bulk_import` | manage bulk import sessions (old Ruby-based implementation) |
| `result` | create/delete/list result URLs |
| `sched` | create/delete/list schedules that run a query periodically |
| `schema` | create/delete/modify schemas of tables |
| `connector` | manage connectors |
| `workflow` | manage workflows |
| `status` | show schedules, jobs, tables, and results |
| `apikey` | show/set API key |
| `server` | show status of the Treasure Data server |
| `sample` | create a sample log file |
| `help` | show help messages |

## Database Commands

You can create, delete, and view lists of databases from the command line.

+ [td db:create](#td-db-create)
+ [td db:delete](#td-db-delete)
+ [td db:list](#td-db-list)

### td db create

Create a database.

**Usage**

```
td db:create <db_name>
```

**Example**

```
td db:create example_db
```

### td db delete

Delete a database.

**Usage**

```
td db:delete <db_name>
```

| Options | Description |
| --- | --- |
| `-f, --force` | clear tables and delete the database |

**Example**

```
td db:delete example_db
```

### td db list

Show the list of databases.

**Usage**

```
td db:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td db:list
td dbs
```
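Putting the database commands together, a minimal sketch of creating a scratch database and removing it non-interactively (the database name is a placeholder):

```bash
# Create a scratch database, confirm it exists, then delete it
# without a prompt (-f also clears any tables it contains).
td db:create scratch_db
td db:list
td db:delete -f scratch_db
```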
## Table Commands

You can create, list, show, and organize table structure using the command line.

+ [td table:list](#td-table-list)
+ [td table:show](#td-table-show)
+ [td table:create](#td-table-create)
+ [td table:delete](#td-table-delete)
+ [td table:import](#td-table-import)
+ [td table:export](#td-table-export)
+ [td table:swap](#td-table-swap)
+ [td table:rename](#td-table-rename)
+ [td table:tail](#td-table-tail)
+ [td table:expire](#td-table-expire)

### td table list

Show the list of tables.

**Usage**

```
td table:list [db]
```

| Options | Description |
| --- | --- |
| `-n, --num_threads VAL` | number of threads used to get the list in parallel |
| `--show-bytes` | show estimated table size in bytes |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td table:list
td table:list example_db
td tables
```

### td table show

Describe the information in a table.

**Usage**

```
td table:show <db_name> <table_name>
```

| Options | Description |
| --- | --- |
| `-v` | show more attributes |

**Example**

```
td table:show example_db table1
```
### td table create

Create a table.

**Usage**

```
td table:create <db_name> <table_name>
```

| Options | Description |
| --- | --- |
| `-T, --type TYPE` | set table type (log) |
| `--expire-days DAYS` | set table expire days |
| `--include-v BOOLEAN` | set include_v flag |
| `--detect-schema BOOLEAN` | set detect schema flag |

**Example**

```
td table:create example_db table1
```
### td table delete

Delete a table.

**Usage**

```
td table:delete <db_name> <table_name>
```

| Options | Description |
| --- | --- |
| `-f, --force` | never prompt |

**Example**

```
td table:delete example_db table1
```
### td table import

Parse and import files to a table.

**Usage**

```
td table:import <db_name> <table_name> <files...>
```

| Options | Description |
| --- | --- |
| `--format FORMAT` | file format (default: apache) |
| `--apache` | same as --format apache; Apache common log format |
| `--syslog` | same as --format syslog; syslog |
| `--msgpack` | same as --format msgpack; msgpack stream format |
| `--json` | same as --format json; LF-separated JSON format |
| `-t, --time-key COL_NAME` | time key name for the json and msgpack formats (e.g. 'created_at') |
| `--auto-create-table` | create the table and database if they do not exist |

**Example**

```
td table:import example_db table1 --apache access.log
td table:import example_db table1 --json -t time - < test.json
```
#### How is the import command's time format set in a Windows batch file?

In a batch file, '%' marks an environment variable, so you must escape each '%' as '%%'.

```
td import:prepare --format csv --column-header --time-column 'date' --time-format '%%Y-%%m-%%d' test.csv
```
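As a minimal sketch, assuming the same `test.csv` layout as above, a batch file wrapping the prepare step might look like this:

```bat
@echo off
rem In batch files '%' introduces a variable, so the strftime format
rem %Y-%m-%d must be written with doubled percent signs: %%Y-%%m-%%d.
td import:prepare --format csv --column-header --time-column "date" --time-format "%%Y-%%m-%%d" test.csv
```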
### td table export

Dump the logs in a table to the specified storage.

**Usage**

```
td table:export <db_name> <table_name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait until the job is completed |
| `-f, --from TIME` | export data that is newer than or the same as TIME |
| `-t, --to TIME` | export data that is older than TIME |
| `-b, --s3-bucket NAME` | name of the destination S3 bucket (required) |
| `-p, --prefix PATH` | path prefix of the file on S3 |
| `-k, --aws-key-id KEY_ID` | AWS access key ID used to export data (required) |
| `-s, --aws-secret-key SECRET_KEY` | AWS secret access key used to export data (required) |
| `-F, --file-format FILE_FORMAT` | file format for exported data. Available formats are tsv.gz (tab-separated values per line) and jsonl.gz (JSON record per line). The json.gz and line-json.gz formats are the default and remain available, but only for backward compatibility; their use is discouraged because they have far lower performance. |
| `-O, --pool-name NAME` | specify resource pool by name |
| `-e, --encryption ENCRYPT_METHOD` | export with server-side encryption using ENCRYPT_METHOD |
| `-a, --assume-role ASSUME_ROLE_ARN` | export with an assumed role, using ASSUME_ROLE_ARN as the role ARN |

**Example**

```
td table:export example_db table1 --s3-bucket mybucket -k KEY_ID -s SECRET_KEY
```

### td table swap

Swap the names of two tables.

**Usage**

```
td table:swap <db_name> <table_name_1> <table_name_2>
```

**Example**

```
td table:swap example_db table1 table2
```

### td table rename

Rename an existing table.

**Usage**

```
td table:rename <db_name> <from_table_name> <dest_table_name>
```

| Options | Description |
| --- | --- |
| `--overwrite` | replace the existing destination table |

**Example**

```
td table:rename example_db table1 table2
```

### td table tail

Get recently imported logs.

**Usage**

```
td table:tail <db_name> <table_name>
```
| Options | Description |
| --- | --- |
| `-n, --count N` | number of logs to get |
| `-P, --pretty` | pretty print |

**Example**

```
td table:tail example_db table1
td table:tail example_db table1 -n 30
```

### td table expire

Expire data in a table after the specified number of days. Set to 0 to disable expiration.

**Usage**

```
td table:expire <db_name> <table_name> <days>
```

**Example**

```
td table:expire example_db table1 30
```
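Several of these commands combine naturally. A hedged sketch of rebuilding a table offline and swapping it into place atomically (the table and file names are placeholders):

```bash
# Stage fresh data in a scratch table, swap it with the live table
# in one step, then drop the table now holding the old data.
td table:create example_db table1_staging
td table:import example_db table1_staging --json -t time - < new_data.json
td table:swap example_db table1 table1_staging
td table:delete -f example_db table1_staging
```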
## Query Commands

You can issue queries from the command line.

+ [td query](#td-query)

### td query

Issue a query.

**Usage**

```
td query [sql]
```

| Options | Description |
| --- | --- |
| `-d, --database DB_NAME` | use the database (required) |
| `-w, --wait[=SECONDS]` | wait for the job to finish (up to SECONDS) |
| `-G, --vertical` | use a vertical table to show results |
| `-o, --output PATH` | write the result to the file |
| `-f, --format FORMAT` | format of the result written to the file (tsv, csv, json, msgpack, and msgpack.gz) |
| `-r, --result RESULT_URL` | write the result to the URL (see also the result:create subcommand). It is suggested that this option be used with -x / --exclude to suppress printing of the query result to stdout, or with -o / --output to dump the query result to a file. |
| `-u, --user NAME` | set the user name for the result URL |
| `-p, --password` | ask for the password for the result URL |
| `-P, --priority PRIORITY` | set priority |
| `-R, --retry COUNT` | automatic retry count |
| `-q, --query PATH` | use a file instead of an inline query |
| `-T, --type TYPE` | set query type (hive, trino(presto)) |
| `--sampling DENOMINATOR` | OBSOLETE - enable random sampling to reduce records to 1/DENOMINATOR |
| `-l, --limit ROWS` | limit the number of result rows shown when not outputting to a file |
| `-c, --column-header` | output the column headers when the schema is available for the table (only applies to json, tsv, and csv formats) |
| `-x, --exclude` | do not automatically retrieve the job result |
| `-O, --pool-name NAME` | specify resource pool by name |
| `--domain-key DOMAIN_KEY` | optional user-provided unique ID. You can include this ID with your `create` request to ensure idempotence. |
| `--engine-version ENGINE_VERSION` | specify query engine version by name |

**Example**

```
td query -d example_db -w -r rset1 "select count(*) from table1"
td query -d example_db -w -r rset1 -q query.txt
```
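For instance, a minimal sketch of running a query and saving the result locally as CSV with a header row instead of printing it (the database, table, and file names are placeholders):

```bash
# -w waits for the job, -x skips printing to stdout, and the result
# is written to result.csv in CSV format with column headers (-c).
td query -d example_db -w -x \
  -o result.csv -f csv -c \
  "select count(*) from table1"
```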
## Import Commands

You can import and organize data from the command line using these commands.

+ [td import:list](#td-import-list)
+ [td import:show](#td-import-show)
+ [td import:create](#td-import-create)
+ [td import:jar_version](#td-import-jar-version)
+ [td import:jar_update](#td-import-jar-update)
+ [td import:prepare](#td-import-prepare)
+ [td import:upload](#td-import-upload)
+ [td import:auto](#td-import-auto)
+ [td import:perform](#td-import-perform)
+ [td import:error_records](#td-import-error-records)
+ [td import:commit](#td-import-commit)
+ [td import:delete](#td-import-delete)
+ [td import:freeze](#td-import-freeze)
+ [td import:unfreeze](#td-import-unfreeze)
+ [td import:config](#td-import-config)

### td import list

List bulk import sessions.

**Usage**

```
td import:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td import:list
```

### td import show

Show the list of uploaded parts.

**Usage**

```
td import:show <session_name>
```

**Example**

```
td import:show
```

### td import create

Create a new bulk import session for the table.

**Usage**

```
td import:create <session_name> <db_name> <table_name>
```

**Example**

```
td import:create logs_201201 example_db event_logs
```

### td import jar version

Show the import jar version.

**Usage**

```
td import:jar_version
```

**Example**

```
td import:jar_version
```

### td import jar update

Update the import jar to the latest version.

**Usage**

```
td import:jar_update
```

**Example**

```
td import:jar_update
```

### td import prepare

Convert files into the part file format.

**Usage**

```
td import:prepare <files...>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory; the default directory is 'out' |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data does not have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME for every record (e.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported). Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode assigns incremental timestamps to subsequent records, incrementing by 1 second per record; if the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, and so on up to 1394445599 for the 36000th record; the 36001st record restarts at 1394409600. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | include only the listed columns |
| `--exclude-columns NAME,NAME,...` | exclude the listed columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; the default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | interpret empty string values as null when the column type is numeric |
**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse each line. This option is required when 'regex' is used as the source file format. |

**Example**

```
td import:prepare logs/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/
td import:prepare logs/*.csv --format csv --columns date_code,uid,price,count --time-value 1394409600,10 -o parts/
td import:prepare mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:prepare "s3://:@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/
```

### td import upload

Upload or re-upload files into a bulk import session.

**Usage**

```
td import:upload <session_name> <files...>
```

| Options | Description |
| --- | --- |
| `--retry-count NUM` | the upload process automatically retries up to NUM times; default: 10 |
| `--auto-create DATABASE.TABLE` | automatically create a bulk import session for the specified database and table names. If you use the 'auto-create' option, you must not pass a session name as the first argument. |
| `--auto-perform` | perform the bulk import job automatically |
| `--auto-commit` | commit the bulk import job automatically |
| `--auto-delete` | delete the bulk import session automatically |
| `--parallel NUM` | upload in parallel (default: 2; max 8) |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory; the default directory is 'out' |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data does not have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME for every record (e.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported). Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode assigns incremental timestamps to subsequent records, incrementing by 1 second per record; if the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, and so on up to 1394445599 for the 36000th record; the 36001st record restarts at 1394409600. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | include only the listed columns |
| `--exclude-columns NAME,NAME,...` | exclude the listed columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; the default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | interpret empty string values as null when the column type is numeric |
**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse each line. This option is required when 'regex' is used as the source file format. |
**Example**

```
td import:upload mysess parts/* --parallel 4
td import:upload mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/
td import:upload parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/
td import:upload mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:upload "s3://:@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/
```

### td import auto

Automatically upload or re-upload files into a bulk import session. It is the functional equivalent of the 'upload' command with the 'auto-perform', 'auto-commit', and 'auto-delete' options enabled. It does not enable the 'auto-create' option by default; if you want 'auto-create', you must pass it explicitly.

**Usage**

```
td import:auto <session_name> <files...>
```

| Options | Description |
| --- | --- |
| `--retry-count NUM` | the upload process automatically retries up to NUM times; default: 10 |
| `--auto-create DATABASE.TABLE` | automatically create a bulk import session for the specified database and table names. If you use the 'auto-create' option, you must not pass a session name as the first argument. |
| `--parallel NUM` | upload in parallel (default: 2; max 8) |
| `-f, --format FORMAT` | source file format [csv, tsv, json, msgpack, apache, regex, mysql]; default=csv |
| `-C, --compress TYPE` | compression type [gzip, none, auto]; default=auto detect |
| `-T, --time-format FORMAT` | specifies the strftime format of the time column. The format differs slightly from Ruby's Time#strftime format in that the '%:z' and '%::z' timezone options are not supported. |
| `-e, --encoding TYPE` | encoding type [UTF-8, etc.] |
| `-o, --output DIR` | output directory; the default directory is 'out' |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-t, --time-column NAME` | name of the time column |
| `--time-value TIME,HOURS` | time column's value. If the data does not have a time column, users can auto-generate the time column's value in two ways. Fixed time value with `--time-value TIME`: TIME is a Unix time in seconds since the Epoch, and the time column value is constant and equal to TIME for every record (e.g. '--time-value 1394409600' assigns the equivalent of timestamp 2014-03-10T00:00:00 to all records imported). Incremental time value with `--time-value TIME,HOURS`: TIME is the Unix time in seconds since the Epoch and HOURS is the maximum range of the timestamps in hours. This mode assigns incremental timestamps to subsequent records, incrementing by 1 second per record; if the number of records causes the timestamp to overflow the range (timestamp >= TIME + HOURS * 3600), the next timestamp restarts at TIME and continues from there. E.g. '--time-value 1394409600,10' assigns timestamp 1394409600 to the first record, 1394409601 to the second, and so on up to 1394445599 for the 36000th record; the 36001st record restarts at 1394409600. |
| `--primary-key NAME:TYPE` | pair of name and type of the primary key declared in your item table |
| `--prepare-parallel NUM` | prepare in parallel (default: 2; max 96) |
| `--only-columns NAME,NAME,...` | include only the listed columns |
| `--exclude-columns NAME,NAME,...` | exclude the listed columns |
| `--error-records-handling MODE` | error records handling mode [skip, abort]; default=skip |
| `--invalid-columns-handling MODE` | invalid columns handling mode [autofix, warn]; default=warn |
| `--error-records-output DIR` | write error records; the default directory is 'error-records' |
| `--columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `--column-types TYPE,TYPE,...` | column types [string, int, long, double] |
| `--column-type NAME:TYPE` | column type [string, int, long, double]. A pair of column name and type can be specified like 'age:int' |
| `-S, --all-string` | disable automatic type conversion |
| `--empty-as-null-if-numeric` | interpret empty string values as null when the column type is numeric |
**CSV/TSV Specific Options**

| Options | Description |
| --- | --- |
| `--column-header` | first line includes column names |
| `--delimiter CHAR` | delimiter CHAR; default="," for csv, "\t" for tsv |
| `--escape CHAR` | escape CHAR; default=\ |
| `--newline TYPE` | newline [CRLF, LF, CR]; default=CRLF |
| `--quote CHAR` | quote [DOUBLE, SINGLE, NONE]; default=DOUBLE for csv, NONE for tsv |

**MySQL Specific Options**

| Options | Description |
| --- | --- |
| `--db-url URL` | JDBC connection URL |
| `--db-user NAME` | user name for the MySQL account |
| `--db-password PASSWORD` | password for the MySQL account |

**REGEX Specific Options**

| Options | Description |
| --- | --- |
| `--regex-pattern PATTERN` | pattern used to parse each line. This option is required when 'regex' is used as the source file format. |
**Example**

```
td import:auto mysess parts/* --parallel 4
td import:auto mysess parts/*.csv --format csv --columns time,uid,price,count --time-column time -o parts/
td import:auto parts/*.csv --auto-create mydb.mytbl --format csv --columns time,uid,price,count --time-column time -o parts/
td import:auto mysess mytable --format mysql --db-url jdbc:mysql://localhost/mydb --db-user myuser --db-password mypass
td import:auto "s3://:@/my_bucket/path/to/*.csv" --format csv --column-header --time-column date_time -o parts/
```

### td import perform

Start validating and converting uploaded files.

**Usage**

```
td import:perform <session_name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the job to finish |
| `-f, --force` | force the perform to start |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```
td import:perform logs_201201
```

### td import error records

Show records that did not pass validation.

**Usage**

```
td import:error_records <session_name>
```

**Example**

```
td import:error_records logs_201201
```

### td import commit

Start committing a performed bulk import session.

**Usage**

```
td import:commit <session_name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the commit to finish |

**Example**

```
td import:commit logs_201201
```

### td import delete

Delete a bulk import session.

**Usage**

```
td import:delete <session_name>
```

**Example**

```
td import:delete logs_201201
```

### td import freeze

Reject further uploads to a bulk import session.

**Usage**

```
td import:freeze <session_name>
```

**Example**

```
td import:freeze logs_201201
```

### td import unfreeze

Unfreeze a bulk import session.

**Usage**

```
td import:unfreeze <session_name>
```

**Example**

```
td import:unfreeze logs_201201
```

### td import config

Create a guess config from arguments.

**Usage**

```
td import:config <files...>
```

| Options | Description |
| --- | --- |
| `-o, --out FILE_NAME` | output file name for connector:guess |
| `-f, --format FORMAT` | source file format [csv, tsv, mysql]; default=csv |
| `--db-url URL` | database connection URL |
| `--db-user NAME` | user name for the database |
| `--db-password PASSWORD` | password for the database |
| `--columns COLUMNS` | not supported |
| `--column-header COLUMN-HEADER` | not supported |
| `--time-column TIME-COLUMN` | not supported |
| `--time-format TIME-FORMAT` | not supported |

**Example**

```
td import:config "s3://:@/my_bucket/path/to/*.csv" -o seed.
```
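Putting the import commands together, a hedged sketch of a full session lifecycle (the session, database, table, and path names are placeholders):

```bash
# Create a session, prepare local CSVs into part files, upload them,
# validate/convert, inspect any rejected records, then commit and
# remove the finished session.
td import:create logs_201201 example_db event_logs
td import:prepare logs/*.csv --format csv --column-header --time-column time -o parts/
td import:upload logs_201201 parts/* --parallel 4
td import:perform -w logs_201201
td import:error_records logs_201201
td import:commit -w logs_201201
td import:delete logs_201201
```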
## Bulk Import Commands

You can create and organize bulk imports from the command line.

+ [td bulk_import:list](#td-bulk-import-list)
+ [td bulk_import:show](#td-bulk-import-show)
+ [td bulk_import:create](#td-bulk-import-create)
+ [td bulk_import:prepare_parts](#td-bulk-import-prepare-parts)
+ [td bulk_import:upload_parts](#td-bulk-import-upload-parts)
+ [td bulk_import:delete_parts](#td-bulk-import-delete-parts)
+ [td bulk_import:perform](#td-bulk-import-perform)
+ [td bulk_import:error_records](#td-bulk-import-error-records)
+ [td bulk_import:commit](#td-bulk-import-commit)
+ [td bulk_import:delete](#td-bulk-import-delete)
+ [td bulk_import:freeze](#td-bulk-import-freeze)
+ [td bulk_import:unfreeze](#td-bulk-import-unfreeze)

For instructions on how to use the bulk import commands, refer to the [Bulk Import API Tutorial](https://api-docs.treasuredata.com/en/api/td-api/bulk-import-tutorial/#bulk-import-api-tutorial).

### td bulk import list

List bulk import sessions.

**Usage**

```
td bulk_import:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td bulk_import:list
```

### td bulk import show

Show the list of uploaded parts.

**Usage**

```
td bulk_import:show <session_name>
```

**Example**

```
td bulk_import:show logs_201201
```

### td bulk import create

Create a new bulk import session for the table.

**Usage**

```
td bulk_import:create <session_name> <db_name> <table_name>
```

**Example**

```
td bulk_import:create logs_201201 example_db event_logs
```
### td bulk import prepare parts

Convert files into the part file format.

**Usage**

```
td bulk_import:prepare_parts <files...>
```

| Options | Description |
| --- | --- |
| `-f, --format NAME` | source file format [csv, tsv, msgpack, json] |
| `-h, --columns NAME,NAME,...` | column names (use --column-header instead if the first line has column names) |
| `-H, --column-header` | first line includes column names |
| `-d, --delimiter REGEX` | delimiter between columns (default: (?-mix:\t\|,)) |
| `--null REGEX` | null expression for the automatic type conversion (default: (?i-mx:\A(?:null\|\|\-\|\\N)\z)) |
| `--true REGEX` | true expression for the automatic type conversion (default: (?i-mx:\A(?:true)\z)) |
| `--false REGEX` | false expression for the automatic type conversion (default: (?i-mx:\A(?:false)\z)) |
| `-S, --all-string` | disable automatic type conversion |
| `-t, --time-column NAME` | name of the time column |
| `-T, --time-format FORMAT` | strftime(3) format of the time column |
| `--time-value TIME` | value of the time column |
| `-e, --encoding NAME` | text encoding |
| `-C, --compress NAME` | compression format name [plain, gzip] (default: auto detect) |
| `-s, --split-size SIZE_IN_KB` | size of each part (default: 16384) |
| `-o, --output DIR` | output directory |

**Example**

```
td bulk_import:prepare_parts logs/*.csv --format csv --columns time,uid,price,count --time-column "time" -o parts/
```

### td bulk import upload parts

Upload or re-upload files into a bulk import session.

**Usage**

```
td bulk_import:upload_parts <session_name> <files...>
```

| Options | Description |
| --- | --- |
| `-P, --prefix NAME` | add a prefix to the part names |
| `-s, --use-suffix COUNT` | use COUNT number of dots ('.') in the source file name for the part names |
| `--auto-perform` | perform the bulk import job automatically |
| `--parallel NUM` | upload in parallel (default: 2; max 8) |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```
td bulk_import:upload_parts parts/* --parallel 4
```
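For instance, a minimal sketch of preparing local CSVs and uploading the resulting parts to an existing session (the session and path names are placeholders):

```bash
# Convert raw CSVs into part files, then upload them to the
# previously created session logs_201201, four parts at a time.
td bulk_import:prepare_parts logs/*.csv --format csv \
  --columns time,uid,price,count --time-column time -o parts/
td bulk_import:upload_parts logs_201201 parts/* --parallel 4
```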
### td bulk import delete parts

Delete uploaded files from a bulk import session.

**Usage**

```
td bulk_import:delete_parts <session_name> <part_names...>
```

| Options | Description |
| --- | --- |
| `-P, --prefix NAME` | add a prefix to the part names |

**Example**

```
td bulk_import:delete_parts logs_201201 01h 02h 03h
```

### td bulk import perform

Start validating and converting uploaded files.

**Usage**

```
td bulk_import:perform <session_name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the job to finish |
| `-f, --force` | force the perform to start |
| `-O, --pool-name NAME` | specify resource pool by name |

**Example**

```
td bulk_import:perform logs_201201
```

### td bulk import error records

Show records that did not pass validation.

**Usage**

```
td bulk_import:error_records <session_name>
```

**Example**

```
td bulk_import:error_records logs_201201
```

### td bulk import commit

Start committing a performed bulk import session.

**Usage**

```
td bulk_import:commit <session_name>
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the commit to finish |

**Example**

```
td bulk_import:commit logs_201201
```

### td bulk import delete

Delete a bulk import session.

**Usage**

```
td bulk_import:delete <session_name>
```

**Example**

```
td bulk_import:delete logs_201201
```

### td bulk import freeze

Block uploads to a bulk import session.

**Usage**

```
td bulk_import:freeze <session_name>
```

**Example**

```
td bulk_import:freeze logs_201201
```

### td bulk import unfreeze

Unfreeze a frozen bulk import session.

**Usage**

```
td bulk_import:unfreeze <session_name>
```

**Example**

```
td bulk_import:unfreeze logs_201201
```

## Result Commands

You can use the command line to list, create, show, and delete results.

+ [td result:list](#td-result-list)
+ [td result:show](#td-result-show)
+ [td result:create](#td-result-create)
+ [td result:delete](#td-result-delete)

### td result list

Show the list of result URLs.

**Usage**

```
td result:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td result:list
td results
```

### td result show

Describe the information of a result URL.

**Usage**

```
td result:show <name>
```

**Example**

```
td result:show name
```

### td result create

Create a result URL.

**Usage**

```
td result:create <name> <URL>
```

| Options | Description |
| --- | --- |
| `-u, --user NAME` | set the user name for authentication |
| `-p, --password` | ask for the password for authentication |

**Example**

```
td result:create name mysql://my-server/mydb
```

### td result delete

Delete a result URL.

**Usage**

```
td result:delete <name>
```

**Example**

```
td result:delete name
```
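A hedged sketch of registering a result target once and then routing query output to it by name (the server, database, and credentials are placeholders):

```bash
# Register a MySQL result URL under the name mydb_out, prompting for
# the password, then send a query's output to it instead of stdout.
td result:create mydb_out mysql://my-server/mydb -u myuser -p
td query -d example_db -w -x -r mydb_out "select count(*) from table1"
```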
## Schedule Commands

You can use the command line to schedule, update, delete, and list queries.

+ [td sched:list](#td-sched-list)
+ [td sched:create](#td-sched-create)
+ [td sched:delete](#td-sched-delete)
+ [td sched:update](#td-sched-update)
+ [td sched:history](#td-sched-history)
+ [td sched:run](#td-sched-run)
+ [td sched:result](#td-sched-result)

### td sched list

Show the list of schedules.

**Usage**

```
td sched:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td sched:list
td scheds
```

### td sched create

Create a schedule.

**Usage**

```
td sched:create <name> <cron> [sql]
```

| Options | Description |
| --- | --- |
| `-d, --database DB_NAME` | use the database (required) |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. cron schedule '0 12 * * *' executes daily at 5 AM Los Angeles time without the timezone option, and at 12 PM with the -t / --timezone 'America/Los_Angeles' option. |
| `-D, --delay SECONDS` | delay time of the schedule |
| `-r, --result RESULT_URL` | write the result to the URL (see also the result:create subcommand) |
| `-u, --user NAME` | set the user name for the result URL |
| `-p, --password` | ask for the password for the result URL |
| `-P, --priority PRIORITY` | set priority |
| `-q, --query PATH` | use a file instead of an inline query |
| `-R, --retry COUNT` | automatic retry count |
| `-T, --type TYPE` | set query type (hive) |

**Example**

```
td sched:create sched1 "0 * * * *" -d example_db "select count(*) from table1" -r rset1
td sched:create sched1 "0 * * * *" -d example_db -q query.txt -r rset2
```

### td sched delete

Delete a schedule.

**Usage**

```
td sched:delete <name>
```

**Example**

```
td sched:delete sched1
```

### td sched update

Modify a schedule.

**Usage**

```
td sched:update <name>
```

| Options | Description |
| --- | --- |
| `-n, --newname NAME` | change the schedule's name |
| `-s, --schedule CRON` | change the schedule |
| `-q, --query SQL` | change the query |
| `-d, --database DB_NAME` | change the database |
| `-r, --result RESULT_URL` | change the result target (see also the result:create subcommand) |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. cron schedule '0 12 * * *' executes daily at 5 AM Los Angeles time without the timezone option, and at 12 PM with the -t / --timezone 'America/Los_Angeles' option. |
| `-D, --delay SECONDS` | change the delay time of the schedule |
| `-P, --priority PRIORITY` | set priority |
| `-R, --retry COUNT` | automatic retry count |
| `-T, --type TYPE` | set query type (hive) |
| `--engine-version ENGINE_VERSION` | specify query engine version by name |

**Example**

```
td sched:update sched1 -s "0 */2 * * *" -d my_db -t "Asia/Tokyo" -D 3600
```

### td sched history

Show the history of scheduled queries.

**Usage**

```
td sched:history <name> [max]
```

| Options | Description |
| --- | --- |
| `-p, --page PAGE` | skip N pages |
| `-s, --skip N` | skip N schedules |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td sched:history sched1 --page 1
```

### td sched run

Run scheduled queries for the specified time.

**Usage**

```
td sched:run <name> <time>
```
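Putting the schedule options together, a hedged sketch of a daily job pinned to a timezone and routed to a result target (the names are placeholders):

```bash
# Run the query every day at 09:00 Tokyo time and write the output
# to the previously registered result target rset1.
td sched:create daily_counts "0 9 * * *" \
  -d example_db -t "Asia/Tokyo" -r rset1 \
  "select count(*) from table1"
```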
## Schema Commands

You can show, set, add, and remove table schemas from the command line.

+ [td schema:show](#td-schema-show)
+ [td schema:set](#td-schema-set)
+ [td schema:add](#td-schema-add)
+ [td schema:remove](#td-schema-remove)

### td schema show

Show the schema of a table.

**Usage**

```
td schema:show <db_name> <table_name>
```

**Example**

```
td schema example_db table1
```

### td schema set

Set a new schema on a table.

**Usage**

```
td schema:set <db_name> <table_name> [columns...]
```

**Example**

```
td schema:set example_db table1 user:string size:int
```
### td schema add

Add new columns to a table.

**Usage**

```
td schema:add <db_name> <table_name> <columns...>
```

**Example**

```
td schema:add example_db table1 user:string size:int
```
### td schema remove

Remove columns from a table.

**Usage**

```
td schema:remove <db_name> <table_name> <columns...>
```

**Example**

```
td schema:remove example_db table1 user size
```
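For instance, a minimal sketch of evolving a table's schema over time (the column names are placeholders):

```bash
# Define an initial schema, extend it with a new column, then
# drop a column that is no longer needed.
td schema:set example_db table1 user:string size:int
td schema:add example_db table1 price:double
td schema:remove example_db table1 size
```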
## Connector Commands

You can use the command line to control several elements related to connectors.

+ [td connector:guess](#td-connector-guess)
+ [td connector:preview](#td-connector-preview)
+ [td connector:issue](#td-connector-issue)
+ [td connector:list](#td-connector-list)
+ [td connector:create](#td-connector-create)
+ [td connector:show](#td-connector-show)
+ [td connector:update](#td-connector-update)
+ [td connector:delete](#td-connector-delete)
+ [td connector:history](#td-connector-history)
+ [td connector:run](#td-connector-run)

### td connector guess

Run `guess` to generate a connector configuration file. Using the connector's credentials, this command examines the data and attempts to determine the file type, delimiter character, and column names. This "guess" is then written to the configuration file for the connector. This command is useful for file-based connectors.

**Usage**

```
td connector:guess [config]
```

| Options | Description |
| --- | --- |
| `--type[=TYPE]` | (obsoleted) |
| `--access-id ID` | (obsoleted) |
| `--access-secret SECRET` | (obsoleted) |
| `--source SOURCE` | (obsoleted) |
| `-o, --out FILE_NAME` | output file name for connector:preview |
| `-g, --guess NAME,NAME,...` | specify the list of guess plugins to use |

**Example**

```
td connector:guess seed.yml -o config.yml
```

**Example seed.yml**

```yaml
in:
  type: s3
  bucket: my-s3-bucket
  endpoint: s3-us-west-1.amazonaws.com
  path_prefix: path/prefix/to/import/
  access_key_id: ABCXYZ123ABCXYZ123
  secret_access_key: AbCxYz123aBcXyZ123
out:
  mode: append
```

### td connector preview

Show a subset of the data that the data connector will fetch.

**Usage**

```
td connector:preview <config>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td connector:preview td-load.yml
```

### td connector issue

Run a connector execution one time only.

**Usage**

```
td connector:issue <config>
```

| Options | Description |
| --- | --- |
| `--database DB_NAME` | destination database |
| `--table TABLE_NAME` | destination table |
| `--time-column COLUMN_NAME` | data partitioning key |
| `-w, --wait` | wait for the job to finish |
| `--auto-create-table` | create the table and database if they do not exist |

**Example**

```
td connector:issue td-load.yml
```
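Taken together, a hedged sketch of the typical one-off load flow (the file, database, and table names are placeholders):

```bash
# Generate a full config from the seed, preview a sample of what
# would be fetched, then run the load once into example_db.
td connector:guess seed.yml -o config.yml
td connector:preview config.yml
td connector:issue config.yml --database example_db --table access_logs --auto-create-table
```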
### td connector list

Show the list of connector sessions.

**Usage**

```
td connector:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td connector:list
```

### td connector create

Create a new connector session.

**Usage**

```
td connector:create <name> <cron> <database> <table> <config>
```

| Options | Description |
| --- | --- |
| `--time-column COLUMN_NAME` | data partitioning key |
| `-t, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. cron schedule '0 12 * * *' executes daily at 5 AM Los Angeles time without the timezone option, and at 12 PM with the -t / --timezone 'America/Los_Angeles' option. |
| `-D, --delay SECONDS` | delay time of the schedule |

**Example**

```
td connector:create connector1 "0 * * * *" connector_database connector_table td-load.yml
```

### td connector show

Show the execution settings of a connector session, such as name, timezone, delay, database, and table.

**Usage**

```
td connector:show <name>
```

**Example**

```
td connector:show connector1
```

### td connector update

Modify a connector session.

**Usage**

```
td connector:update <name> [config]
```

| Options | Description |
| --- | --- |
| `-n, --newname NAME` | change the schedule's name |
| `-d, --database DB_NAME` | change the database |
| `-t, --table TABLE_NAME` | change the table |
| `-s, --schedule [CRON]` | change the schedule, or leave blank to remove the schedule |
| `-z, --timezone TZ` | name of the timezone. Only extended timezones like 'Asia/Tokyo' and 'America/Los_Angeles' are supported (no 'PST', 'PDT', etc.). When a timezone is specified, the cron schedule is interpreted in that timezone; otherwise it is interpreted in UTC. E.g. cron schedule '0 12 * * *' executes daily at 5 AM Los Angeles time without the timezone option, and at 12 PM with the -t / --timezone 'America/Los_Angeles' option. |
| `-D, --delay SECONDS` | change the delay time of the schedule |
| `-T, --time-column COLUMN_NAME` | change the name of the time column |
| `-c, --config CONFIG_FILE` | update the connector configuration |
| `--config-diff CONFIG_DIFF_FILE` | update the connector config_diff |

**Example**

```
td connector:update connector1 -c td-bulkload.yml -s '@daily' ...
```

### td connector delete

Delete a connector session.

**Usage**

```
td connector:delete <name>
```

**Example**

```
td connector:delete connector1
```

### td connector history

Show the job history of a connector session.

**Usage**

```
td connector:history <name>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td connector:history connector1
```

### td connector run

Run a connector session for the specified time option.

**Usage**

```
td connector:run <name> [time]
```

| Options | Description |
| --- | --- |
| `-w, --wait` | wait for the job to finish |

**Example**

```
td connector:run connector1 "2016-01-01 00:00:00"
```
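And a short sketch of registering the same load on an hourly schedule and checking on it later (the names are placeholders):

```bash
# Register config.yml to run at the top of every hour, then list
# the jobs it has executed so far.
td connector:create hourly_s3 "0 * * * *" example_db access_logs config.yml
td connector:history hourly_s3
```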
## User Commands

You can use the command line to control several elements related to users.

+ [td user:list](#td-user-list)
+ [td user:show](#td-user-show)
+ [td user:create](#td-user-create)
+ [td user:delete](#td-user-delete)
+ [td user:apikey:list](#td-user-apikey-list)
+ [td user:apikey:add](#td-user-apikey-add)
+ [td user:apikey:remove](#td-user-apikey-remove)

### td user list

Show a list of users.

**Usage**

```
td user:list
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td user:list
td user:list -f csv
```

### td user show

Show a user.

**Usage**

```
td user:show <name>
```

**Example**

```
td user:show "Roberta Smith"
```

### td user create

Create a user. As part of the user creation process, you will be prompted to provide a password for the user.

**Usage**

```
td user:create <name> --email <email_address>
```

**Example**

```
td user:create "Roberta" --email "roberta.smith@acme.com"
```

### td user delete

Delete a user.

**Usage**

```
td user:delete <email_address>
```

**Example**

```
td user:delete roberta.smith@acme.com
```

### td user apikey list

Show the API keys of a user.

**Usage**

```
td user:apikey:list <email_address>
```

| Options | Description |
| --- | --- |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td user:apikey:list roberta.smith@acme.com
td user:apikey:list roberta.smith@acme.com -f csv
```

### td user apikey add

Add an API key to a user.

**Usage**

```
td user:apikey:add <email_address>
```

**Example**

```
td user:apikey:add roberta.smith@acme.com
```

### td user apikey remove

Remove an API key from a user.

**Usage**

```
td user:apikey:remove <email_address> <apikey>
```

**Example**

```
td user:apikey:remove roberta.smith@acme.com 1234565/abcdefg
```

## Workflow Commands

You can create or modify workflows from the CLI using the following commands. The command `wf` can be used interchangeably with `workflow`.

+ [Basic Workflow Commands](#basic-workflow-commands)
+ [Local-mode commands](#local-mode-commands)
+ [Server-mode commands](#server-mode-commands)
+ [Client-mode commands](#client-mode-commands)

### Basic Workflow Commands

#### td workflow reset

Reset the workflow module.

**Usage**

```
td workflow:reset
```

#### td workflow update

Update the workflow module.

**Usage**

```
td workflow:update [version]
```

#### td workflow version

Show the workflow module version.

**Usage**

```
td workflow:version
```

### Local-mode commands

You can use the following commands to locally initiate changes to workflows.

**Usage**

```
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `init <dir>` | create a new workflow project |
| `r[un] <workflow.dig>` | run a workflow |
| `c[heck]` | show workflow definitions |
| `sched[uler]` | run a scheduler server |
| `migrate (run\|check)` | migrate the database |
| `selfupdate` | update the CLI to the latest version |

**Info:** For secrets in local mode, use the following command: `td workflow secrets --local`
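As a quick sketch of the local-mode flow, assuming `init` scaffolds a `.dig` file named after the project, as Digdag does (the project name is a placeholder):

```bash
# Scaffold a new workflow project, validate its definitions, and
# run it locally.
td workflow init my_project
cd my_project
td workflow check
td workflow run my_project.dig
```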
### Server-mode commands

You can use the following commands to initiate changes to workflows from the server.

**Usage**

```
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `server` | start server |

### Client-mode commands

You can use the following commands to initiate changes to workflows from the client.

**Usage**

```
td workflow <command> [options...]
```

| Options | Description |
| --- | --- |
| `push <project-name>` | create and upload a new revision |
| `download <project-name>` | pull an uploaded revision |
| `start <project-name> <name>` | start a new session attempt of a workflow |
| `retry <attempt-id>` | retry a session |
| `kill <attempt-id>` | kill a running session attempt |
| `backfill <schedule-id>` | start sessions of a schedule for past times |
| `backfill <project-name> <name>` | start sessions of a schedule for past times |
| `reschedule <schedule-id>` | skip sessions of a schedule to a future time |
| `reschedule <project-name> <name>` | skip sessions of a schedule to a future time |
| `projects [name]` | show projects |
| `workflows [project-name] [name]` | show registered workflow definitions |
| `schedules` | show registered schedules |
| `disable <schedule-id>` | disable a workflow schedule |
| `disable <project-name>` | disable all workflow schedules in a project |
| `disable <project-name> <name>` | disable a workflow schedule |
| `enable <schedule-id>` | enable a workflow schedule |
| `enable <project-name>` | enable all workflow schedules in a project |
| `enable <project-name> <name>` | enable a workflow schedule |
| `sessions` | show sessions for all workflows |
| `sessions <project-name>` | show sessions for all workflows in a project |
| `sessions <project-name> <name>` | show sessions for a workflow |
| `session <session-id>` | show a single session |
| `attempts` | show attempts for all sessions |
| `attempts <session-id>` | show attempts for a session |
| `attempt <attempt-id>` | show a single attempt |
| `tasks <attempt-id>` | show tasks of a session attempt |
| `delete <project-name>` | delete a project |
| `secrets --project <project-name>` | manage secrets |
| `version` | show client and server version |

| Parameter | Description |
| --- | --- |
| `-L, --log PATH` | output log messages to a file (default: -) |
| `-l, --log-level LEVEL` | log level (error, warn, info, debug, or trace) |
| `-X KEY=VALUE` | add a performance system config |
| `-c, --config PATH.properties` | configuration file (default: /Users/<user_name>/.config/digdag/config) |
| `--version` | show client version |

Client options:

| Parameter | Description |
| --- | --- |
| `-e, --endpoint URL` | server endpoint |
| `-H, --header KEY=VALUE` | additional headers |
| `--disable-version-check` | disable server version check |
| `--disable-cert-validation` | disable certificate verification |
| `--basic-auth <user:pass>` | add an Authorization header with the provided username and password |
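For instance, a minimal sketch of publishing a project and inspecting it on the server (the names are placeholders):

```bash
# Upload a new revision of the project, then list its registered
# workflow definitions and recent sessions.
td workflow push my_project
td workflow workflows my_project
td workflow sessions my_project
```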
## Job Commands

You can view the status and results of jobs, view lists of jobs, and kill jobs using the CLI.

+ [td job:show](#td-job-show)
+ [td job:status](#td-job-status)
+ [td job:list](#td-job-list)
+ [td job:kill](#td-job-kill)

### td job show

Show the status and results of a job.

**Usage**

```
td job:show <job_id>
```

| Options | Description |
| --- | --- |
| `-v, --verbose` | show logs |
| `-w, --wait` | wait for the job to finish |
| `-G, --vertical` | use a vertical table to show results |
| `-o, --output PATH` | write results to the file |
| `-l, --limit ROWS` | limit the number of result rows shown when not outputting to a file |
| `-c, --column-header` | output the column headers when the schema is available for the table (only applies to tsv and csv formats) |
| `-x, --exclude` | do not automatically retrieve the job result |
| `--null STRING` | null expression in csv or tsv |
| `-f, --format FORMAT` | format of the result written to the file (tsv, csv, json, msgpack, and msgpack.gz) |

**Example**

```
td job:show 1461
```

### td job status

Show the status progress of a job.

**Usage**

```
td job:status <job_id>
```

**Example**

```
td job:status 1461
```

### td job list

Show the list of jobs. [max] is the number of jobs to show.

**Usage**

```
td job:list [max]
```

| Options | Description |
| --- | --- |
| `-p, --page PAGE` | skip N pages |
| `-s, --skip N` | skip N jobs |
| `-R, --running` | show only running jobs |
| `-S, --success` | show only succeeded jobs |
| `-E, --error` | show only failed jobs |
| `--slow [SECONDS]` | show slow queries (default threshold: 3600 seconds) |
| `-f, --format FORMAT` | format of the result rendering (tsv, csv, json, or table; default is table) |

**Example**

```
td jobs --page 1
```

### td job kill

Kill or cancel a job.

**Usage**

```
td job:kill <job_id>
```

**Example**

```
td job:kill 1461
```
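Finally, a short sketch of day-to-day job monitoring (the job ID is a placeholder):

```bash
# List the 20 most recent failed jobs as CSV, check one job's
# progress, then save its result to a TSV file once it finishes.
td job:list 20 -E -f csv
td job:status 1461
td job:show 1461 -w -o result.tsv -f tsv
```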