Hi,
I've seen this type of question a lot:
http://stackoverflow.com/questions/20050927/how-to-get-the-ipython-notebook-title-associated-with-the-currently-running-ipyt?rq=1
It makes sense to me that the kernel should not know what it's talking to from a design perspective.
However, I'm currently working through a Jupyter High Availability scenario. Our goal is to have two Jupyter instances running in two different VMs and to fail over from one to the other if one of the VMs goes down for some reason, without losing the kernel state.
We have control over the kernels we are running (see https://github.com/jupyter-incubator/sparkmagic/blob/master/remotespark/wrapperkernel/sparkkernelbase.py), and we'd like to be able to tie some state (a session number) to a particular kernel instance.
It seems to me like I'd need some things to achieve this, but maybe you have better ideas:
- Fire some piece of code automatically every time a notebook starts: this could be the `__init__` method in my kernel or some other piece of code that is triggered every time a kernel gets started (some Javascript code in the notebook maybe? I know this wouldn't apply for other clients but it's a start).
- This previous bit of code that gets fired would need to always be run with the same ID to be able to identify the state it needs to reconstruct (i.e. it would need to know that for this particular kernel we had X particular state).
- Some persistent storage that both Jupyter instances could have access to.
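To make the three requirements above concrete, here is a minimal sketch of the lookup I have in mind. Everything in it is an assumption for illustration: `SessionStore`, `get_or_create`, and the JSON file standing in for the shared persistent storage are hypothetical names, not anything that exists in Jupyter or sparkmagic. It also does no cross-process locking, which a real deployment with two servers would need.

```python
import json
import os
import tempfile


class SessionStore:
    """Hypothetical store mapping a stable notebook ID to a session number.

    A JSON file stands in for storage that both Jupyter instances can reach.
    """

    def __init__(self, path):
        self.path = path

    def _load(self):
        # An absent file simply means no session has been recorded yet.
        if not os.path.exists(self.path):
            return {}
        with open(self.path, "r", encoding="utf-8") as f:
            return json.load(f)

    def _save(self, data):
        # Write to a temp file and rename it so readers never see a partial file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp, self.path)

    def get_or_create(self, notebook_id, create_session):
        """Return the stored session for notebook_id, creating one if absent."""
        data = self._load()
        if notebook_id not in data:
            data[notebook_id] = create_session()
            self._save(data)
        return data[notebook_id]
```

The code fired at kernel start would call `get_or_create` with the stable ID; after a failover, the kernel on the second VM would make the same call and get the existing session number back instead of creating a new one.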
I thought of a concrete implementation and I'd like to hear some feedback on it if possible:
A Notebook extension reads some ID from the notebook's page DOM and issues a request to the kernel with this ID so the kernel can restore its state. I need help knowing what this ID would be: e.g. the notebook name including its path relative to the root folder, or a GUID in some hidden cell in the notebook file. The kernel would then take this ID and look up the session ID in cloud storage. If the ID is embedded in Javascript, both Jupyter servers would need to trust the notebook from the start.
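For the GUID option, one variation I can imagine is keeping the GUID in the notebook file's top-level `metadata` object rather than in a hidden cell, so that it travels with the file if the notebook is renamed or moved. The sketch below is only an illustration under that assumption: `ensure_notebook_id` and the `ha_notebook_id` key are hypothetical names, and it edits the `.ipynb` JSON directly rather than going through any Jupyter API.

```python
import json
import uuid


def ensure_notebook_id(notebook_path, key="ha_notebook_id"):
    """Return a GUID kept in the notebook's top-level metadata.

    The GUID is generated and written back to the file the first time;
    later calls return the same value. The key name is hypothetical.
    """
    with open(notebook_path, "r", encoding="utf-8") as f:
        nb = json.load(f)

    metadata = nb.setdefault("metadata", {})
    if key not in metadata:
        metadata[key] = str(uuid.uuid4())
        # Persist the new ID so every later reader sees the same value.
        with open(notebook_path, "w", encoding="utf-8") as f:
            json.dump(nb, f, indent=1)
    return metadata[key]
```

A path-based ID needs no change to the file but breaks on rename; a metadata GUID survives renames but is duplicated if the notebook file is copied, so two copies would resolve to the same session.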
Thanks for any help or pointers you may have!
(cc. @msftristew, @MohamedElKamhawy, @ellisonbg)