Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRIUSupport: Add multi-threaded hooks #18058

Closed
tajila opened this issue Aug 31, 2023 · 7 comments · Fixed by #18107
Closed

CRIUSupport: Add multi-threaded hooks #18058

tajila opened this issue Aug 31, 2023 · 7 comments · Fixed by #18107
Assignees
Labels
comp:vm criu Used to track CRIU snapshot related work

Comments

@tajila
Copy link
Contributor

tajila commented Aug 31, 2023

	enum HookMode {
		SINGLE_THREAD,
		MULTI_THREAD
	}
	/**
	 * User hook that is run before checkpointing the JVM.
	 *
	 * If single-threaded mode is requested, no other application threads
	 * will be active. Users should avoid synchronization of objects that are not owned
	 * by the thread, terminally blocking operations and launching new threads in the hook.
	 * If the thread attempts to acquire a lock that it doesn't own, an exception will
	 * be thrown.
	 *
	 * If multi-threaded mode is requested, the hook will be run alongside all other active Java threads.
	 * 
	 * High priority hooks are run last before checkpoint, and vice-versa for low priority hooks.
	 * The priority of the hook is with respect to the other hooks run within that mode. Multi-threaded
	 * hooks are implicitly lower priority than single-threaded hooks. Ie. the lowest priority single-threaded
	 * hook is higher priority than the highest priority multi-threaded hook. 
	 * 
	 * @param hook user hook
	 * @param hookMode mode in which the hook is run, either multi-threaded or single-threaded
	 * @param priority the priority of the hook, between 0 - 99. Throws UnsupportedOperationException otherwise.
	 * @return this
	 */
	public CRIUSupport registerPreCheckpointHook(Runnable hook, HookMode mode, int priority) throws UnsupportedOperationException
	/**
	 * User hook that is run after restoring a checkpoint image.
	 *
	 * If single-threaded mode is requested, no other application threads
	 * will be active. Users should avoid synchronization of objects that are not owned
	 * by the thread, terminally blocking operations and launching new threads in the hook.
	 * If the thread attempts to acquire a lock that it doesn't own, an exception will
	 * be thrown.
	 *
	 * If multi-threaded mode is requested, the hook will be run alongside all other active Java threads.
	 * 
	 * High priority hooks are run first after restore, and vice-versa for low priority hooks.
	 * The priority of the hook is with respect to the other hooks run within that mode. Multi-threaded
	 * hooks are implicitly lower priority than single-threaded hooks. Ie. the lowest priority single-threaded
	 * hook is higher priority than the highest priority multi-threaded hook. 
	 * 
	 * @param hook user hook
	 * @param hookMode mode in which the hook is run, either multi-threaded or single-threaded
	 * @param priority the priority of the hook, between 0 - 99. Throws UnsupportedOperationException otherwise.
	 * @return this
	 */
	public CRIUSupport registerPostRestoreHook(Runnable hook, HookMode mode, int priority) throws UnsupportedOperationException
@tajila tajila added criu Used to track CRIU snapshot related work comp:vm labels Aug 31, 2023
@tajila
Copy link
Contributor Author

tajila commented Aug 31, 2023

@JasonFengJ9 Please take a look at this

@tajila
Copy link
Contributor Author

tajila commented Aug 31, 2023

The multi-threaded hooks should be called in checkpointJVM but outside of checkpointJVMImpl.

We can bump up the priorities for below to make room for the user priorities

	private static final int RESTORE_CLEAR_INETADDRESS_CACHE_PRIORITY = 100;
	private static final int RESTORE_ENVIRONMENT_VARIABLES_PRIORITY = 100;
	private static final int RESTORE_SYSTEM_PROPERTIES_PRIORITY = 100;
	static final int RESET_CRIUSEC_PRIORITY = 100;
	static final int RESTORE_SECURITY_PROVIDERS_PRIORITY = 100;

@tajila
Copy link
Contributor Author

tajila commented Aug 31, 2023

We need to add two new lists to J9InternalCheckpointHookAPI for the multi-threaded variants.

@tajila
Copy link
Contributor Author

tajila commented Aug 31, 2023

The existing APIs can redirect to the new ones with USER_HOOKS_PRIORITY and HookMode.SINGLE_THREAD as defaults.

@JasonFengJ9
Copy link
Member

Will look into this.

@JasonFengJ9
Copy link
Member

JasonFengJ9 commented Sep 6, 2023

For multi-thread hooks, they might require Thread.join() before/after invoking checkpointJVMImpl().

for (Thread hook : threads) {
     hook.join();
}

In that case, the multi-thread hook registration API should take Thread instead of Runnable as an incoming parameter, and we might have two sets of hook registration APIs such as

public CRIUSupport registerPreCheckpointHookSingleThread(Runnable hook, int priority)
public CRIUSupport registerPostRestoreHookSingleThread(Runnable hook, int priority)
public CRIUSupport registerPreCheckpointHookMultiThread(Thread hook, int priority)
public CRIUSupport registerPostRestoreHookMultiThread(Thread hook, int priority)

Note: there is no need for HookMode.

Alternatively, single-thread hook API could be passed w/ a Thread (implements Runnable) as well and rely on HookMode to differentiate two thread modes.

@TobiAjila any preference?

Edited: A draft change - https://github.com/eclipse-openj9/openj9/compare/master...JasonFengJ9:criumultithreadhook?expand=1

@tajila
Copy link
Contributor Author

tajila commented Sep 7, 2023

For multi-thread hooks, they might require Thread.join() before/after invoking checkpointJVMImpl().

Multithread doesnt necesarrily mean run in parallel; I guess concurrent is a better definition. It simply means we don't run them in single thread mode. In any case, they would need to be run serially in order to respect the priorities.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:vm criu Used to track CRIU snapshot related work
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants