Split MAP and REDUCE tasks into individual mesos tasks #60
base: master
correct spelling mistake propertios->properties
Update README.md
Warning: MESOS_NATIVE_LIBRARY is deprecated, use MESOS_NATIVE_JAVA_LIBRARY instead. Future releases will not support JNI bindings via MESOS_NATIVE_LIBRARY.
Use MESOS_NATIVE_JAVA_LIBRARY in README.md
…through tasks, then decide whether they are a managed TaskTracker
* ResourcePolicy is abstract for all intents, but it could be instantiated. Make it literally abstract
* A bunch of lint warnings removed
* Rearrange code to be easier to read -- interface implementations commented and methods in order, update some docs to JavaDoc format.
Modified `MesosScheduler.java` and `configuration.md`. `mapred.mesos.framework.principal`, `mapred.mesos.framework.secretfile`, `mapred.mesos.framework.user`, and `mapred.mesos.framework.name` are now configurable options. Addresses issue mesos#53. Added support for framework authentication.
Added Framework Authentication (Issue mesos#53).
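As a rough illustration of how the four options named above might be read, here is a minimal sketch. It uses a plain `java.util.Properties` as a stand-in for the Hadoop configuration object. The `FrameworkSettings` class, its field names, and the fallback values are hypothetical and are not taken from `MesosScheduler.java`.

```java
import java.util.Properties;

// Hypothetical holder for the four framework options named in the commit.
final class FrameworkSettings {
    final String name;
    final String user;
    final String principal;   // null when no principal is configured
    final String secretFile;  // path to a file holding the secret; may be null

    private FrameworkSettings(String name, String user, String principal, String secretFile) {
        this.name = name;
        this.user = user;
        this.principal = principal;
        this.secretFile = secretFile;
    }

    static FrameworkSettings fromProperties(Properties conf) {
        return new FrameworkSettings(
            // The fallback values below are assumptions for illustration only.
            conf.getProperty("mapred.mesos.framework.name", "hadoop"),
            conf.getProperty("mapred.mesos.framework.user", ""),
            conf.getProperty("mapred.mesos.framework.principal"),
            conf.getProperty("mapred.mesos.framework.secretfile"));
    }

    // Authentication is only attempted when a principal has been set.
    boolean authenticationRequested() {
        return principal != null && !principal.isEmpty();
    }
}
```

With no principal set, `authenticationRequested()` returns false, so a scheduler built this way would register without credentials.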
Previously the "idle check" would be run against all task trackers regardless of whether they had any jobs assigned to them. The main MesosScheduler is responsible for cleaning up task trackers once jobs have *finished*, so this change stops us performing idle checks on trackers that have no jobs. This change fixes the observed behaviour of task trackers being killed and respawning continuously when they're waiting for jobs (e.g. with the min map/reduce slot config option, or the fixed resource policy).
Avoid respawning task trackers constantly when they are idle
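The filtering described in this commit can be sketched roughly as follows. This is not the project's code: the `IdleCheckFilter` class and the map of tracker IDs to assigned job counts are hypothetical stand-ins for whatever bookkeeping the scheduler actually keeps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: picks which task trackers the idle check should look at.
final class IdleCheckFilter {
    // Only trackers with at least one assigned job are idle-checked.
    // Trackers with no jobs are left to the scheduler's normal cleanup,
    // so they are not killed and respawned while waiting for work.
    static List<String> trackersToIdleCheck(Map<String, Integer> assignedJobsByTracker) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : assignedJobsByTracker.entrySet()) {
            if (entry.getValue() > 0) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```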
This commit splits out the resources for MAP and REDUCE slots into two Mesos tasks instead of one. This allows the idle-slot tracking to operate on MAP and REDUCE slots individually, further increasing our ability to release idle resources faster.
@Override
public void run() {
  try {
    taskTracker.run();
Interesting. It's safe to reuse the same object across different threads?
I guess it was like this before, but the code was rearranged. I'll take it that it is safe.
Yeah, I just moved the code around. I think it is safe so long as access is properly `synchronized`, which I think it is. It's perhaps worth going over all the code to make sure any access to the `taskTracker` field is properly thread-safe.
Looks pretty good to me. If you've tested it, merge away. Might be worth bumping the version in the pom too.
Thanks for the quick review! I've just rolled this out on one of our clusters so I want to let things settle a bit first, and get some serious traffic through the JT/TTs before saying it's good to go.
In one of the code paths (related to flaky trackers) we were using synchronized() in nested function calls, against the same object. This is not needed, and causes a deadlock.
If we synchronize against the scheduler here and grab hold of the lock while, in another thread, a callback from Mesos comes in and also tries to synchronize on the scheduler, the Mesos scheduler driver locks up because it is single-threaded. If the former thread then decides to kill a Mesos task, we'll see a deadlock, because the killTask() message can't be sent while the driver is waiting on another callback.
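One common way to avoid this class of deadlock is to decide what to kill while holding the lock, then call the driver only after the lock is released. The sketch below shows that pattern. It is not necessarily what this commit does, and the `KillSink` interface and `FlakyTrackerReaper` class are hypothetical stand-ins for the Mesos scheduler driver and the scheduler's flaky-tracker handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the part of the driver that sends kill messages.
interface KillSink {
    void killTask(String taskId);
}

// Hypothetical sketch: collect work under the lock, call the driver outside it.
final class FlakyTrackerReaper {
    private final Object schedulerLock;
    private final KillSink driver;
    private final List<String> flakyTaskIds = new ArrayList<>();

    FlakyTrackerReaper(Object schedulerLock, KillSink driver) {
        this.schedulerLock = schedulerLock;
        this.driver = driver;
    }

    void markFlaky(String taskId) {
        synchronized (schedulerLock) {
            flakyTaskIds.add(taskId);
        }
    }

    void reap() {
        List<String> toKill;
        synchronized (schedulerLock) {
            // Snapshot and clear shared state while holding the lock.
            toKill = new ArrayList<>(flakyTaskIds);
            flakyTaskIds.clear();
        }
        // The lock is released here, so a driver callback that needs the
        // same lock can proceed while the kill messages are being sent.
        for (String taskId : toKill) {
            driver.killTask(taskId);
        }
    }
}
```

The design choice is that no call into the driver happens while the scheduler lock is held, so a callback blocked on that lock cannot in turn block the kill.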
This commit splits out the resources for MAP and REDUCE slots into two Mesos tasks instead of one, while still using a single TaskTracker JVM. This allows the idle-slot tracking to operate on MAP and REDUCE slots individually, further increasing our ability to release idle resources faster.
This is an implementation of #47.
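To make the split concrete, here is a minimal sketch of dividing one tracker's slots into separate MAP and REDUCE task descriptions. It deliberately avoids the real Mesos protobuf types. The `SlotTaskSpec` and `SlotSplitter` classes and the per-slot CPU and memory figures are hypothetical, and the sketch ignores any resources the TaskTracker JVM itself needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of one Mesos task's share of a tracker's slots.
final class SlotTaskSpec {
    final String kind;  // "MAP" or "REDUCE"
    final int slots;
    final double cpus;
    final int memMb;

    SlotTaskSpec(String kind, int slots, double cpus, int memMb) {
        this.kind = kind;
        this.slots = slots;
        this.cpus = cpus;
        this.memMb = memMb;
    }
}

final class SlotSplitter {
    // Produces up to two task specs, one per slot type, so that each can be
    // tracked (and released when idle) independently of the other.
    static List<SlotTaskSpec> split(int mapSlots, int reduceSlots,
                                    double cpusPerSlot, int memPerSlotMb) {
        List<SlotTaskSpec> tasks = new ArrayList<>();
        if (mapSlots > 0) {
            tasks.add(new SlotTaskSpec("MAP", mapSlots,
                    mapSlots * cpusPerSlot, mapSlots * memPerSlotMb));
        }
        if (reduceSlots > 0) {
            tasks.add(new SlotTaskSpec("REDUCE", reduceSlots,
                    reduceSlots * cpusPerSlot, reduceSlots * memPerSlotMb));
        }
        return tasks;
    }
}
```

Because each slot type has its own task, idle REDUCE resources could be given back while the MAP task keeps running, and vice versa.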