This repo contains useful Python code samples. Feel free to use them as you need. Thanks.
- impala.py -- A simple example of how to connect to Impala and run SQL distributed over a Hadoop cluster.
- mysql_etl.py -- How to connect to a MySQL database from Python.
- mysql_etl_longblob.py -- How to convert the "longblob" columns of any MySQL table to a text data type so the table can be imported into the Hive metastore easily.
- mysql_mastertables_list.py -- How to connect to a MySQL database from Python and build a master list of the tables on which an action needs to be performed.
- mysql_sqoop_hive.py -- How to import tables from MySQL into the Hive metastore using the Sqoop command.
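
As a rough illustration of the last item, the sketch below assembles a `sqoop import` command line in Python. It is not the contents of mysql_sqoop_hive.py: the function name `build_sqoop_import`, the JDBC URL, the user name, the password-file path, and the table and database names are all hypothetical placeholders. It only builds and prints the command, assuming a Sqoop 1 client would be on the PATH when you actually run it.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, hive_db, num_mappers=1):
    """Return the argument list for a Sqoop 1 import of one MySQL table into Hive.

    All parameter values are supplied by the caller; nothing here is taken
    from the scripts in this repo.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source MySQL database
        "--username", username,
        "--password-file", password_file,   # file holding the password, instead of passing it inline
        "--table", table,                   # source table in MySQL
        "--hive-import",                    # load the imported data into Hive
        "--hive-table", f"{hive_db}.{table}",
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# Hypothetical example values, for illustration only.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://mysql.example.com:3306/sales",
    username="etl_user",
    password_file="/user/etl/.mysql_password",
    table="customers",
    hive_db="analytics",
)

# Print a shell-safe version of the command for review.
print(shlex.join(cmd))

# To execute it on a machine with Sqoop installed, something like this should work:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell-quoting problems when table or database names contain unusual characters.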