# Working with DB-API ## Intro `pytd` implements [Python Database API Specification v2.0](https://www.python.org/dev/peps/pep-0249/) with the help of [prestodb/presto-python-client](https://github.com/prestodb/presto-python-client). ## Setup To connect to the API declare a `connect` object. The default engine is Trino(Presto). You can manually specify Hive via the `default_engine` argument. ```python from pytd.dbapi import connect conn = connect(pytd.Client(database='sample_datasets')) # or, connect with Hive: conn = connect(pytd.Client(database='sample_datasets', default_engine='hive')) ``` ## Query The `cursor` function allows you to flexibly fetch query results from a custom function: ```python def query(sql, connection): cur = connection.cursor() cur.execute(sql) rows = cur.fetchall() columns = [desc[0] for desc in cur.description] return {'data': rows, 'columns': columns} query('select symbol, count(1) as cnt from nasdaq group by 1 order by 1', conn) ``` ## Example Below is an example of generator-based iterative retrieval, just like [pandas.DataFrame.iterrows](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html): ```python def iterrows(sql, connection): cur = connection.cursor() cur.execute(sql) index = 0 columns = None while True: row = cur.fetchone() if row is None: break if columns is None: columns = [desc[0] for desc in cur.description] yield index, dict(zip(columns, row)) index += 1 for index, row in iterrows('select symbol, count(1) as cnt from nasdaq group by 1 order by 1', conn): print(index, row) # 0 {'cnt': 590, 'symbol': 'AAIT'} # 1 {'cnt': 82, 'symbol': 'AAL'} # 2 {'cnt': 9252, 'symbol': 'AAME'} # 3 {'cnt': 253, 'symbol': 'AAOI'} # 4 {'cnt': 5980, 'symbol': 'AAON'} # ... ``` ## Further Reading - [pytd DB-API Reference](https://pytd-doc.readthedocs.io/en/latest/dbapi.html)