In a distributed cluster environment, bugs in application code (or in Hadoop itself), unbalanced load, or uneven resource distribution can cause tasks of the same job to run at very different speeds. Some tasks may run significantly slower than the rest (for example, one task of a job is only 50% complete while all the others have finished), and these stragglers slow down the job's overall execution. To mitigate this, Hadoop provides a speculative execution (Speculative Execution) mechanism: it identifies "straggling" tasks according to certain rules, launches a backup task for each one that processes the same data as the original, and takes the result of whichever copy finishes first as the final result.
Hive can also benefit from speculative execution. The relevant parameters are enabled in Hadoop's mapred-site.xml configuration file:
<property>
  <name>mapreduce.map.speculative</name>
  <value>true</value>
  <description>If true, then multiple instances of some map tasks may be executed in parallel.</description>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>true</value>
  <description>If true, then multiple instances of some reduce tasks may be executed in parallel.</description>
</property>
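If you cannot (or prefer not to) edit mapred-site.xml cluster-wide, the same properties can usually be overridden per session from the Hive CLI. A minimal sketch, assuming the property names shown above:

```sql
-- Enable speculative execution for the current Hive session only,
-- overriding the cluster-wide mapred-site.xml defaults.
SET mapreduce.map.speculative=true;
SET mapreduce.reduce.speculative=true;
```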
In addition, Hive provides its own configuration property to control speculative execution on the reduce side:
<property>
  <name>hive.mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
  <description>Whether speculative execution for reducers should be turned on.</description>
</property>
It is hard to give a one-size-fits-all recommendation for tuning these speculative execution settings. If you are sensitive to run-time deviations caused by stragglers, you may want to turn these features on. On the other hand, if your map or reduce tasks run for a long time because of large input volumes, the waste caused by launching speculative duplicates can be huge, and you may prefer to turn them off.
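Following that advice, a session that runs a small number of very long tasks over large inputs might disable speculation to avoid duplicated work. A session-level sketch, assuming the default property names shown earlier:

```sql
-- Disable speculative execution for long-running tasks, where a backup
-- copy would just redo expensive work without finishing meaningfully sooner.
SET mapreduce.map.speculative=false;
SET mapreduce.reduce.speculative=false;
SET hive.mapred.reduce.tasks.speculative.execution=false;
```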
This concludes the Hive performance tuning series. If you have anything to add, feel free to leave a comment below; good suggestions will be selected and adopted. Thank you for your support!