What is JobTracker in Hadoop?

JobTracker is the Hadoop service that accepts MapReduce jobs from clients and runs them on the cluster.

JobTracker was a critical component of Hadoop 1.x. In Hadoop 2.x and later it was replaced by YARN, where the ResourceManager and per-application ApplicationMasters share the work that the JobTracker used to do alone.

In Hadoop 1.x:

  • JobTracker: It was a daemon that managed and monitored MapReduce jobs submitted to the Hadoop cluster. It was responsible for dividing the job into tasks, scheduling these tasks on TaskTrackers, and monitoring their execution. The JobTracker kept track of the overall progress of the job.
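
The JobTracker's role can be illustrated with a small toy model. This is only a sketch, not the Hadoop API: the class `ToyJobTracker`, its methods, and the tracker names `tt1` and `tt2` are hypothetical, and it assumes one map task per input split and a fixed number of task slots per TaskTracker.

```python
from collections import deque


class ToyJobTracker:
    """Toy model of the Hadoop 1.x JobTracker (hypothetical, not the real API)."""

    def __init__(self, slots_per_tracker):
        # Free task slots per TaskTracker, e.g. {"tt1": 2, "tt2": 1}.
        self.free_slots = dict(slots_per_tracker)
        self.pending = deque()   # tasks waiting for a free slot
        self.running = {}        # task id -> TaskTracker name
        self.completed = set()
        self.total = 0

    def submit_job(self, job_id, num_splits):
        # Divide the job into tasks: here, one map task per input split.
        tasks = [f"{job_id}_m_{i}" for i in range(num_splits)]
        self.pending.extend(tasks)
        self.total += len(tasks)
        return tasks

    def heartbeat(self, tracker):
        # A TaskTracker reports in; assign tasks while it has free slots.
        assigned = []
        while self.free_slots[tracker] > 0 and self.pending:
            task = self.pending.popleft()
            self.free_slots[tracker] -= 1
            self.running[task] = tracker
            assigned.append(task)
        return assigned

    def task_finished(self, task):
        # Free the slot and record the task as done.
        tracker = self.running.pop(task)
        self.free_slots[tracker] += 1
        self.completed.add(task)

    def progress(self):
        # Overall progress as the fraction of completed tasks.
        return len(self.completed) / self.total if self.total else 0.0


jt = ToyJobTracker({"tt1": 2, "tt2": 1})
jt.submit_job("job1", 4)
first = jt.heartbeat("tt1")    # tt1 has two slots, so it takes two tasks
second = jt.heartbeat("tt2")   # tt2 has one slot, so it takes one task
jt.task_finished(first[0])
print(jt.progress())
```

Note that a single object holds both the cluster's slot bookkeeping and every job's task state. That coupling is what the Hadoop 2.x design separates.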

Hadoop 2.x changed this architecture significantly by introducing YARN (Yet Another Resource Negotiator). The JobTracker’s responsibilities were split into two separate components:

  1. ResourceManager (RM): Manages the global assignment of resources in the cluster. It keeps track of available resources and allocates them, as containers, to running applications. It does not schedule or monitor individual tasks.
  2. ApplicationMaster (AM): Manages the execution of a specific job or application. Each application has its own ApplicationMaster, which negotiates resources with the ResourceManager and works with the NodeManagers (the per-node daemons that replaced TaskTrackers) to execute and monitor tasks.
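
The split described above can be sketched with another toy model. The classes `ToyResourceManager` and `ToyApplicationMaster`, their methods, and the node names `nm1` and `nm2` are hypothetical and do not reflect the YARN API. The sketch assumes containers are interchangeable and are granted first come, first served.

```python
class ToyResourceManager:
    """Cluster-wide: tracks free containers per node and grants them on request."""

    def __init__(self, containers_per_node):
        self.free = dict(containers_per_node)

    def allocate(self, count):
        # Grant up to `count` containers; return the node name of each grant.
        granted = []
        for node in self.free:
            while self.free[node] > 0 and len(granted) < count:
                self.free[node] -= 1
                granted.append(node)
        return granted

    def release(self, node):
        # A container on `node` is free again.
        self.free[node] += 1


class ToyApplicationMaster:
    """Per-application: requests containers and tracks only its own tasks."""

    def __init__(self, app_id, num_tasks, rm):
        self.rm = rm
        self.pending = [f"{app_id}_task_{i}" for i in range(num_tasks)]
        self.running = {}   # task id -> node name
        self.done = []

    def request_and_launch(self):
        # Negotiate containers with the RM, then place pending tasks in them.
        for node in self.rm.allocate(len(self.pending)):
            task = self.pending.pop(0)
            self.running[task] = node
        return dict(self.running)

    def finish(self, task):
        # Record completion and hand the container back to the RM.
        node = self.running.pop(task)
        self.rm.release(node)
        self.done.append(task)


rm = ToyResourceManager({"nm1": 2, "nm2": 1})
am_a = ToyApplicationMaster("appA", 2, rm)
am_b = ToyApplicationMaster("appB", 2, rm)
placed_a = am_a.request_and_launch()   # appA gets both containers on nm1
placed_b = am_b.request_and_launch()   # appB gets the one container left on nm2
am_a.finish("appA_task_0")             # frees a container on nm1
placed_b = am_b.request_and_launch()   # appB's remaining task can now run
```

Here the resource manager knows nothing about tasks, and each application master knows nothing about other applications. Per-job state therefore grows with the number of application masters rather than inside one central daemon.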

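The ResourceManager and ApplicationMaster architecture provides better scalability, resource utilization, and fault tolerance than the JobTracker model. A single JobTracker had to handle both resource management and the lifecycle of every job, which made it a bottleneck and a single point of failure. Under YARN, per-job work is distributed across ApplicationMasters, and the cluster can also run frameworks other than MapReduce.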

To summarize, Hadoop 2.x and later no longer include the JobTracker. The ResourceManager and ApplicationMaster have taken over its roles in managing resources and job execution.