JupyterHub Training Guide - Disasters Hub

Introduction
Getting Started
JupyterHub Interface Overview
Working with Jupyter Notebooks
Data Management
Environment and Package Management
Shutting Down Properly

Introduction

What is JupyterHub?

JupyterHub is a multi-user server that manages and provides web-based Jupyter notebook environments for multiple users. It allows you to:

Access powerful computing resources through your web browser
Write and execute code in Python, R, Julia, and other languages
Visualize data with interactive plots and charts
Collaborate with team members on shared projects
Work from anywhere without local setup requirements

The Disasters Hub

The Disasters Hub (https://hub.disasters.2i2c.cloud/) is a specialized JupyterHub instance designed for disaster response and analysis work. It provides:

Pre-configured environments for geospatial analysis
Access to disaster-related datasets
Collaboration tools for response teams
Integration with cloud storage services
Scalable computing resources

Key Benefits

✅ No Installation Required - Everything runs in your browser
✅ Pre-configured Environments - Common packages already installed
✅ Persistent Storage - Your work is saved between sessions
✅ Collaboration Ready - Share notebooks with team members

Getting Started

Accessing the Disasters Hub

Navigate to the Hub
- Open your web browser (Chrome, Firefox, Safari, or Edge recommended)
- Go to: https://hub.disasters.2i2c.cloud/
- Bookmark this URL for easy access
First-Time Login
- You must sign in through Keycloak (CILogon)
- Once your Keycloak sign-in is complete, request to be added to the Disasters JupyterHub account
Authentication
- You’ll see a login screen with authentication options
- Common authentication methods:
  - GitHub: Use your GitHub credentials
  - Google: Use your Google account
  - Institutional Login: Use your organization’s credentials
- Select your authentication method and follow the prompts

Server Selection

After login, you may be presented with server options:

Server Options:
┌─────────────────────────────────────┐
│ • Small (4 CPU, 4 GB RAM)           │
│ • Medium (4 CPU, 7 GB RAM)          │
│ • Large (4 CPU, 15 GB RAM)          │
│ • Additional resources if needed    │
└─────────────────────────────────────┘
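
To confirm what the selected server provides, you can inspect it from a notebook. The sketch below is a minimal example that assumes a Linux container exposing its memory limit through the standard cgroup files; that is an assumption about the hub's setup, not something this guide documents. `memory_limit_bytes` is a hypothetical helper name.

```python
import os
from pathlib import Path

# Standard cgroup locations for the container memory limit (v2 first, then v1).
# Whether either file exists on the hub is an assumption.
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),
)

def memory_limit_bytes(candidates=CGROUP_LIMIT_FILES):
    """Return the memory limit in bytes, or None if no numeric limit is found."""
    for path in candidates:
        try:
            text = path.read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next candidate
        if text.isdigit():
            return int(text)
    return None  # cgroup v2 reports "max" when no limit is set

# Note: os.cpu_count() may report the host's CPUs rather than the container's share.
print("CPUs visible to Python:", os.cpu_count())
limit = memory_limit_bytes()
print("Memory limit:", f"{limit / 1024**3:.1f} GiB" if limit else "not reported")
```

If the reported limit is far below what you selected, stop the server and choose a larger option.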

JupyterHub Interface Overview

The JupyterLab Interface

Once logged in, you’ll see the JupyterLab interface:

┌──────────────────────────────────────────────────────────┐
│ [File] [Edit] [View] [Run] [Kernel] [Tabs] [Settings]    │
├──────────────────────────────────────────────────────────┤
│ 📁 File Browser │          Main Work Area                │
│ ├── 📂 data     │                                        │
│ ├── 📂 notebooks│      [Launcher Tab]                    │
│ ├── 📂 scripts  │      • Notebook (Python 3)             │
│ └── 📄 README   │      • Console                         │
│                 │      • Terminal                        │
│ [+] New         │      • Text File                       │
└──────────────────────────────────────────────────────────┘

Key Interface Components

Top Menu Bar
- File operations, editing, running code
- Kernel management
- View options and settings
Left Sidebar
- File Browser (📁): Navigate and manage files
- Running Terminals and Kernels (▶): Monitor active sessions
- Command Palette (🔧): Access all commands
- Extension Manager (🧩): Add functionality
Main Work Area
- Multiple tabs for notebooks, terminals, and files
- Drag tabs to rearrange or create split views
- Right-click tabs for additional options
Status Bar
- Current kernel status
- Line/column position
- File encoding and type

Working with Jupyter Notebooks

Notebook Basics

A Jupyter notebook consists of cells that can contain:

Code: Executable Python (or other language) code
Markdown: Formatted text, equations, and images
Raw: Unformatted text

Cell Operations

Running Cells

Run current cell: Shift + Enter (run and move to next)
Run current cell in place: Ctrl + Enter (stay in cell)
Run all cells: Menu → Run → Run All Cells

Cell Types

# Code Cell Example
import pandas as pd
import numpy as np
data = pd.read_csv('data.csv')
data.head()

# Markdown Cell Example
## Analysis Results
- **Finding 1**: Data shows increasing trend
- **Finding 2**: Correlation coefficient: 0.85

$$E = mc^2$$  # LaTeX equation

Cell Management

Insert cell above: A (in command mode)
Insert cell below: B (in command mode)
Delete cell: DD (press D twice in command mode)
Copy cell: C (in command mode)
Paste cell: V (in command mode)
Undo deletion: Z (in command mode)

Working with Kernels

The kernel is the computational engine that executes your code.

Kernel Operations

Restart kernel: Kernel → Restart
Restart and clear output: Kernel → Restart & Clear Output
Restart and run all: Kernel → Restart & Run All
Interrupt execution: Kernel → Interrupt (or I,I in command mode)
Change kernel: Kernel → Change Kernel

Kernel Status Indicators

○: Kernel idle
●: Kernel busy
[*]: Cell currently executing
[1]: Cell execution number

Data Management

File Upload/Download

Uploading Files

Drag and drop files directly into the file browser
Upload button: Click the ⬆ button in the file browser toolbar

Downloading Files

Right-click file in browser → Download
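
In a default JupyterLab setup, the right-click Download action applies to individual files. To download a whole folder, one approach is to bundle it into a single archive first and download that. A minimal sketch using Python's standard library; `zip_folder` and the folder name `results` are hypothetical.

```python
import shutil
from pathlib import Path

def zip_folder(folder, archive_stem=None):
    """Zip the contents of `folder` and return the path of the new archive.

    By default the archive is created next to the folder, e.g. results -> results.zip.
    """
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(str(folder))
    stem = archive_stem or str(folder)
    return Path(shutil.make_archive(stem, "zip", root_dir=folder))

# Example: only runs if a folder named "results" exists in the working directory.
if Path("results").is_dir():
    print("Created", zip_folder("results"))
```

The resulting `.zip` file appears in the file browser and can be downloaded like any other file.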

Working with Cloud Storage

S3 read credentials are preconfigured on the Disasters Hub, so you do not need to supply access keys in your code.

AWS S3 Integration

import pandas as pd

# Read a CSV directly from S3 (pandas needs the s3fs package for s3:// URLs)
df = pd.read_csv('s3://bucket-name/path/to/file.csv')

# Write to S3 (requires write permission on the target bucket)
df.to_csv('s3://bucket-name/output/results.csv', index=False)
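
pandas accepts a full `s3://` URL, while boto3 calls take the bucket and key as separate arguments. The helper below is a small sketch for converting between the two forms; `split_s3_url` is a hypothetical name, and `bucket-name` is the same placeholder used in the example above.

```python
from urllib.parse import urlparse

def split_s3_url(url):
    """Split 's3://bucket/key/path' into a (bucket, key) tuple."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    # The key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")

bucket, key = split_s3_url("s3://bucket-name/path/to/file.csv")
print(bucket, key)
```

The two values can then be passed to a boto3 call such as `boto3.client("s3").get_object(Bucket=bucket, Key=key)`.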

Data Persistence

⚠️ Important: Your home directory is persistent, but it has storage limits:

Home directory: 100 GB/user (persistent)
Shared data: Read-only datasets available to all users
Temporary storage: /tmp cleared on restart
Best practice: Store large datasets in cloud storage, not home directory
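
To see how close you are to the home directory limit, you can total file sizes from a notebook. This is a rough sketch: `directory_size_bytes` is a hypothetical helper, the 100 GB figure is the quota stated above, and the hub may account for usage slightly differently.

```python
import os
from pathlib import Path

def directory_size_bytes(root):
    """Sum the sizes of all regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable: ignore it
    return total

used_gb = directory_size_bytes(Path.home()) / 1e9
print(f"Home directory uses about {used_gb:.2f} GB of the 100 GB quota")
```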

Environment and Package Management

Installing Packages

Using pip (Python packages)

# In a notebook cell (%pip installs into the environment of the running kernel)
%pip install package_name

# Install a specific version
%pip install pandas==1.3.0

# Install from a requirements file
%pip install -r requirements.txt

# Install into your user directory (if you lack write permission to the environment)
%pip install --user package_name
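
After installing, it can help to confirm which version the running kernel sees. A minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the package names are only examples.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

for name in ("pandas", "numpy"):
    print(name, installed_version(name) or "not installed")
```

If you upgraded a package that was already imported, restart the kernel so the new version is loaded.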

Using conda

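# In a notebook cell
!conda install -c conda-forge package_name -y

# Install multiple packages
!conda install numpy pandas matplotlib -y

# Create a new environment
!conda create -n myenv python=3.9 -y
# Note: `!conda activate myenv` has no effect in a notebook, because each `!` command
# runs in its own subshell. Activate the environment in a terminal, or register it
# as a kernel and select it from the Kernel menu.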

Managing Python Environments

Check current environment

import sys
print(sys.executable)  # Python interpreter path
print(sys.version)     # Python version

# List installed packages
!pip list
!conda list
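
The interpreter path alone does not always make it obvious which environment is active. The sketch below reports whether the kernel runs inside a virtual environment or a conda environment. It relies on comparing `sys.prefix` with `sys.base_prefix` and on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; `describe_environment` is a hypothetical helper name.

```python
import os
import sys

def describe_environment(environ=os.environ):
    """Return a dict describing the Python environment of the current process."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # In a venv, sys.prefix points at the venv and differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by `conda activate`; may be absent if the kernel was started another way.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }

for key, value in describe_environment().items():
    print(f"{key}: {value}")
```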

Creating isolated environments

# In terminal
python -m venv myproject
source myproject/bin/activate  # Linux/Mac
pip install -r requirements.txt

Using Different Kernels

Register the current environment as a Jupyter kernel (the ipykernel package must be installed in it):

python -m ipykernel install --user --name mykernel --display-name "My Kernel"

List available kernels:
```
jupyter kernelspec list
```
Remove a kernel:
```
jupyter kernelspec uninstall mykernel
```
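
The kernel listing is also available in machine-readable form through `jupyter kernelspec list --json`. As a sketch, the function below extracts kernel names and display names from that JSON. `kernel_display_names` is a hypothetical helper, and the sample document only mimics the shape of real output with made-up paths.

```python
import json

def kernel_display_names(kernelspec_json):
    """Map kernel name -> display name from `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in specs.items()
    }

# Illustrative sample in the shape the command prints (paths are made up).
sample = json.dumps({
    "kernelspecs": {
        "python3": {"resource_dir": "/example/kernels/python3",
                    "spec": {"display_name": "Python 3", "language": "python"}},
        "mykernel": {"resource_dir": "/example/kernels/mykernel",
                     "spec": {"display_name": "My Kernel", "language": "python"}},
    }
})
print(kernel_display_names(sample))
```

In a notebook, the real input can be captured with `subprocess.run(["jupyter", "kernelspec", "list", "--json"], capture_output=True, text=True, check=True).stdout`.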

Shutting Down Properly

Always shut down kernels and terminals when done:

Shut down kernel: Kernel → Shut Down Kernel
Close terminals: type exit or press Ctrl+D
Hub Control Panel: File → Hub Control Panel → Stop My Server
Log out: File → Log Out

⚠️ Important: Idle servers will be automatically culled after a period of inactivity (usually 1-2 hours).
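
When a server is stopped, anything held only in kernel memory is lost, so long computations are safer when they write intermediate results to the persistent home directory. A minimal checkpointing sketch; the file name `checkpoint.json`, the helper names, and the workload are all hypothetical.

```python
import json
from pathlib import Path

def load_checkpoint(path):
    """Return previously saved results, or an empty dict if no checkpoint exists."""
    path = Path(path)
    return json.loads(path.read_text()) if path.exists() else {}

def save_checkpoint(path, results):
    """Write results to a temporary file, then rename it over the target."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(results))
    tmp.replace(path)  # rename, so a crash never leaves a half-written checkpoint

def run_with_checkpoints(items, work, path="checkpoint.json"):
    """Apply `work` to each item, skipping items already stored in the checkpoint."""
    results = load_checkpoint(path)
    for item in items:
        key = str(item)  # JSON object keys must be strings
        if key not in results:
            results[key] = work(item)
            save_checkpoint(path, results)
    return results
```

For example, `run_with_checkpoints(range(100), slow_function)` can be rerun after a restart and will resume with the first item that has no saved result, where `slow_function` stands for your own computation.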

Last Updated: 2025
Version: 1.0

For additional assistance, contact your hub administrator or visit the 2i2c support portal.