Create a JupyterHub project on Google Cloud — step by step.

Recently I had the opportunity to implement a JupyterHub solution in the Google Cloud environment. I went through all the stages of this small project, adding more advanced features step by step.

You can find most of this information on the Zero to JupyterHub website, which this guide is based on and which covers these topics in much more detail.

We’ll use Google Cloud for this project, so let’s start with something very basic: create a Google Cloud account if you don’t have one yet.

Now it is useful to create a new project for our JupyterHub.

Create a new project in Google Cloud

and add our Project Name.

Once the new project is created, we need to enable Kubernetes Engine. The easiest way to do this is to open Kubernetes Engine in the console:

then go to Clusters; if the service is not enabled yet, you will be prompted to enable it there.

OK — we are ready to create our Kubernetes cluster. We can do this manually, via the Create Cluster button.

But to ensure repeatability, I recommend doing it from the command line.
For this we will use gcloud commands. Instructions for installing the Google Cloud SDK in your environment can be found on the official website.

Once installed, run the command:

gcloud init

and authenticate, so we can manage our Google Cloud account from our computer.

In the next step, choose the newly created project from the list. You can also pick a default zone if you want; I skip it for now and will define it later.
Now we are ready to start.

As the basis of our project I chose two e2-standard-2 machines in the central US region and named the cluster jhubmedium. Each machine costs about 50 USD/month, so this project costs about 100 USD/month. If you need to calculate your own prices, use the Google Cloud pricing calculator. Let’s execute this command:
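The back-of-the-envelope math above can be sketched in the shell. The per-node price is an assumption based on the rough list price at the time of writing; check the Google Cloud pricing calculator for current figures in your region.

```shell
# Assumed approximate list price per e2-standard-2 node, in USD/month
# (an estimate only; real prices vary by region and over time).
price_per_node=50
num_nodes=2

echo "Estimated cost: $(( price_per_node * num_nodes )) USD/month"
# prints: Estimated cost: 100 USD/month
```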

gcloud container clusters create \
--machine-type e2-standard-2 \
--num-nodes 2 \
--zone us-central1-a \
--cluster-version latest \
jhubmedium

After about 3 minutes, we can see that our cluster is up and running.

Creating cluster jhubmedium in us-central1-a... Cluster is being health-checked (master is healthy)...done.
Created [https://container.googleapis.com/v1/projects/jupyterhubmedium/zones/us-central1-a/clusters/jhubmedium].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/jhubmedium?project=jupyterhubmedium
kubeconfig entry generated for jhubmedium.
NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
jhubmedium us-central1-a 1.18.14-gke.1200 34.70.XX.XX e2-standard-2 1.18.14-gke.1200 2 RUNNING

And we can also see it in the console panel.

Kubectl

kubectl ships as an optional Cloud SDK component; if it is not installed yet, `gcloud components install kubectl` adds it. Once we have this tool operational, we can check our nodes.

$ kubectl get node
NAME STATUS ROLES AGE VERSION
gke-jhubmedium-default-pool-9967fb45-7lh2 Ready <none> 8m v1.18.14-gke.1200
gke-jhubmedium-default-pool-9967fb45-hwjh Ready <none> 8m v1.18.14-gke.1200

Give your account the permissions needed to perform all administrative actions:

kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole=cluster-admin \
--user=<GOOGLE-EMAIL-ACCOUNT>

At this point we have a Google account set up and a Kubernetes cluster running, and we are ready to deploy the basic version of JupyterHub.

Helm

Once Helm is installed (the official get-helm-3 install script is the quickest route), we can add the JupyterHub Helm chart repository. We do it like this:

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

This should show output like:

Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "stable" chart repository
...Successfully got an update from the "jupyterhub" chart repository
Update Complete. ⎈ Happy Helming!⎈

For the Helm chart to work we need a configuration file with minimal data. Let’s call this file config.yaml. All we need at this point is a secret token.

Let’s generate this token

openssl rand -hex 32

and insert that token into config.yaml, so the file looks like this:

# config.yaml
proxy:
  secretToken: "e8147687dde6......."
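As a quick sanity check (a small sketch, assuming a POSIX shell with openssl and grep available), you can verify that the generated value really is a 64-character hex string before pasting it into config.yaml:

```shell
# Generate a 32-byte secret token, encoded as 64 hex characters.
token=$(openssl rand -hex 32)

# The proxy token must be exactly 64 lowercase hex characters.
if echo "$token" | grep -Eq '^[0-9a-f]{64}$'; then
  echo "token looks valid"
else
  echo "unexpected token format" >&2
fi
```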

Next, I define some variables that I will use later. On Linux we do:

export RELEASE=jhub
export NAMESPACE=jhub

These variables will be important for managing the project later. Now we can install JupyterHub:

helm upgrade --cleanup-on-fail \
--install $RELEASE jupyterhub/jupyterhub \
--namespace $NAMESPACE \
--create-namespace \
--version=0.10.6 \
--values config.yaml

After a few minutes we can see our release is ready:

Release "jhub" does not exist. Installing it now.
NAME: jhub
LAST DEPLOYED: Mon Jan 25 15:22:38 2021
NAMESPACE: jhub
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing JupyterHub!
Your release is named jhub and installed into the namespace jhub.

You can find if the hub and proxy is ready by doing:

 kubectl --namespace=jhub get pod

and watching for both those pods to be in status 'Running'.

You can find the public IP of the JupyterHub by doing:

 kubectl --namespace=jhub get svc proxy-public

It might take a few minutes for it to appear!

Note that this is still an alpha release! If you have questions, feel free to
1. Read the guide at https://z2jh.jupyter.org
2. Chat with us at https://gitter.im/jupyterhub/jupyterhub
3. File issues at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues

We can list all the created pods:

$ kubectl --namespace=jhub get pod
NAME READY STATUS RESTARTS AGE
continuous-image-puller-4jpft 1/1 Running 0 52s
continuous-image-puller-z5zbw 1/1 Running 0 52s
hub-8dfb7797f-pt5ft 1/1 Running 0 52s
proxy-79b56996cf-zk9dn 1/1 Running 0 52s
user-scheduler-599dd58d74-4c6vb 1/1 Running 0 51s
user-scheduler-599dd58d74-7b7cq 1/1 Running 0 52s

and we can see which external IP our proxy has (repeat the command if EXTERNAL-IP still shows pending):

$ kubectl --namespace=jhub get svc proxy-public
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
proxy-public LoadBalancer 10.3.242.214 34.121.XX.XX 80:30738/TCP 43s

Now our basic jupyterhub is ready. If you go to the External-IP in your browser, you will see:

At this moment, username and password can be anything you want. So let’s log in with user user and password pass, and our system will start creating resources
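Any credentials work here because the 0.x versions of this Helm chart default to the dummy authenticator. As an optional sketch (the password shown is a placeholder you should change), you can pin a single shared password in config.yaml so that not literally any password is accepted:

```yaml
# config.yaml — optional: require one shared password for all users
# (configuration schema of the 0.x jupyterhub Helm chart)
auth:
  type: dummy
  dummy:
    password: 'change-me-shared-password'
```

Re-run the same helm upgrade command after editing config.yaml for the change to take effect.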

and finally:

Let’s create notebook:

If we want to see what is inside our cluster, we can also do that through the console panel. Workloads shows all the pods created before, plus one new one: jupyter-<userName> is created for every user that logs in to our system.

Under Storage there are two volume claims: one for the hub itself and, at this moment, a claim-<userName> for every user that has logged in to our system.

In Compute Engine we can see the “physical” machines (VM instances) used for our cluster.

And in the Disks menu we see the disks used for the cluster nodes, for the hub, and the persistent disks for the users that have logged in to our system.

That’s all if you want to try creating a basic JupyterHub system and see that it works. Now comes the last part: how to delete it to free up the resources, so that later we can create a new, improved, more practical version.

We can delete the whole cluster using a gcloud command:

gcloud container clusters delete jhubmedium --zone us-central1-a

or using the Delete option in the console.

In the next part we will see how to add some useful components like shared drives or basic authentication. We’ll also see how to add some interesting configuration options to our JupyterHub project.

