This project continues the previous Hive project, "Tough engineering choices with large datasets in Hive Part - 1", and focuses on processing large datasets with Hive.
- Common misuse/abuse of Hive
- How to use and interpret Hive's explain command
- File formats and their relative performance (Text, JSON, SequenceFile, Avro, ORC and Parquet)
- Spark and Hive for transformation
- and more...