Apache Hive is the primary data-warehousing tool we use at our workplace for querying and data analysis. On top of that, we use Apache Oozie to schedule our workflows of Sqoop and Pig jobs alongside Hive …
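Scheduling recurring workflows in Oozie is typically done with a coordinator application that triggers a workflow on a time frequency. A minimal sketch, assuming a daily run; the app name, dates, and path are illustrative placeholders, not taken from the post:

```xml
<!-- coordinator.xml: triggers the ETL workflow once a day -->
<coordinator-app name="daily-etl" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2016-01-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- HDFS path of the workflow.xml to run; placeholder path -->
      <app-path>${nameNode}/user/hive/workflows/etl</app-path>
    </workflow>
  </action>
</coordinator-app>
```

The coordinator only points at a workflow; the Sqoop, Pig, and Hive actions themselves live in that workflow's own `workflow.xml`.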
Heartbeat in Hadoop
Failed to receive heartbeat
Adding a new host to the Cloudera Hadoop cluster fails with no heartbeat from the agent
Cloudera (http://www.cloudera.com/) provides a Hadoop distribution and helps you handle the complexity of …
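When a newly added host shows no heartbeat, a common first check is whether the Cloudera Manager agent on that host is running and pointed at the right server. A hedged sketch of the relevant agent config; the hostname below is a placeholder:

```ini
# /etc/cloudera-scm-agent/config.ini on the new host
# server_host must resolve to the Cloudera Manager server from this host;
# "cm-host.example.com" is a placeholder, not the actual hostname
[General]
server_host=cm-host.example.com
server_port=7182
```

After correcting `server_host`, restarting the agent (`sudo service cloudera-scm-agent restart`) usually makes the heartbeat appear in Cloudera Manager within a minute or so.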
Apache Sentry is a big-data tool used to enforce fine-grained, role-based authorization to data and metadata on your Hadoop clusters. Recently I was playing around with Sentry, and from the configuration manual …
How to run Sqoop jobs from Oozie
Sqoop is a tool to import/export data between a relational database and HDFS. It is super easy to use and runs MapReduce jobs behind the scenes to …
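A basic Sqoop import pulls a table into HDFS in parallel. A minimal sketch; the JDBC URL, username, table, and target directory are placeholders:

```shell
# Import the "orders" table into HDFS with 4 parallel map tasks;
# -P prompts for the database password instead of putting it on the command line
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /user/hive/warehouse/orders \
  --num-mappers 4
```

The matching `sqoop export` takes the same connection arguments and an `--export-dir` to push HDFS files back into a table.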
How to set up Sqoop incremental imports?
Here’s a step-by-step guide to Sqoop incremental imports, and since it says step-by-step, it’s going to be only that 😉 .
Hive, a data …
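An incremental import fetches only rows added since the previous run, keyed on a monotonically increasing column. A minimal sketch in append mode; the connection details, check column, and last value are placeholders:

```shell
# Only rows where id > 1000 are imported on this run;
# Sqoop prints the new --last-value to use next time
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --incremental append \
  --check-column id \
  --last-value 1000
```

Saving this command as a Sqoop job lets the Sqoop metastore track `--last-value` automatically between runs instead of you updating it by hand; `--incremental lastmodified` with a timestamp column covers updated rows as well.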
Illegal partition exception in Sqoop for incremental imports
Sqoop is an amazing tool from Apache that is widely used to import/export data between Hadoop and relational databases. I have particularly used it for developing a …
Hortonworks Data Platform, abbreviated as HDP, is a completely open-source distribution of Hadoop. As much of a breeze as it is to provision, manage, and monitor multi-node clusters through Ambari, it’s a …
How to run a Sqoop job through Oozie successfully
Sqoop jobs are used to create and save import and export commands. They help automate Sqoop tasks and re-execute Sqoop actions. I …
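In Oozie, a Sqoop command runs inside a `sqoop` action in the workflow, with the command line (minus the leading `sqoop`) placed in a `<command>` element. A minimal sketch; the connection string, table, paths, and schema versions are illustrative assumptions:

```xml
<workflow-app name="sqoop-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="sqoop-import"/>
  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- the full Sqoop command line, without the leading "sqoop" -->
      <command>import --connect jdbc:mysql://db.example.com/sales --table orders --target-dir /tmp/orders</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Sqoop action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

One common pitfall is the JDBC driver: the database driver jar must be available to the action, typically by dropping it in the workflow's `lib/` directory on HDFS.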