From Handmade Ceph on ODROID (2018) to Rook on Kubernetes (2025)
In late 2018, I built a tiny Ceph cluster on ODROID boards running Ubuntu 18.04. It was practical, educational, slightly fragile, and deeply satisfying.
By mid-2025, I retired that setup and moved the same storage ideas into a Kubernetes-native model using Rook. This post is a look back at what the original build looked like and how the operating model changed once orchestration moved into Kubernetes.
2018: The original ODROID Ceph howto
The old cluster was explicit and manual by design:
- 4 ODROID-HC1 nodes for OSD-heavy work
- 1 ODROID-C2 for monitor, MDS, and manager duties
- Ubuntu 18.04 minimal images everywhere
- Static IPs and /etc/hosts entries on every node
- cephuser with passwordless sudo, SSH key distribution, NTP setup
- Disk prep by hand (parted, mkfs.xfs, ceph-deploy disk zap)
- Cluster bootstrap with ceph-deploy
The flow looked like this:
- Prepare every board and align hostnames/networking.
- Build trust paths (ssh-keyscan, ssh-copy-id) from ceph-admin.
- Manually format /dev/sda devices on all OSD nodes.
- Run ceph-deploy new, edit ceph.conf, install daemons.
- Create monitor, OSDs, MDS, and manager.
- Check health (ceph -s) and mount CephFS on a client.
It worked and it taught me a lot. It also made every rebuild feel like a ritual.
What eventually hurt
For a home-lab scale cluster, the setup was fine. Over time, these pain points dominated:
- Node lifecycle was mostly imperative shell history.
- Host-level drift was easy (packages, configs, SSH trust, clock sync).
- Recovery steps lived in memory and old notes.
- Operational intent was not declarative; state had to be reconstructed.
- Cluster management remained separate from application scheduling.
The cluster was stable enough, but not composable enough.
Why Rook + Kubernetes made sense
Rook turned Ceph operations into Kubernetes resources:
- Desired state moved into YAML (CephCluster, CephBlockPool, CephFilesystem, StorageClass).
- Reconciliation replaced one-off command runs.
- Day-2 ops integrated with the same control plane as workloads.
- Storage consumers (PVCs) became first-class and self-service.
- Upgrades and component placement became easier to reason about.
In short: from hand-crafted pets to declarative cattle, while still keeping Ceph’s storage semantics.
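For example, the block-storage side reduces to two small resources: a CephBlockPool plus a StorageClass that points at it. The following is only a minimal sketch, assuming the default rook-ceph namespace; replicapool and rook-ceph-block are placeholder names, and the CSI provisioner and secret names follow the defaults from the Rook example manifests.

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool          # placeholder pool name
  namespace: rook-ceph
spec:
  failureDomain: host        # spread replicas across hosts
  replicated:
    size: 3                  # three copies, one per OSD node
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block      # placeholder StorageClass name
provisioner: rook-ceph.rbd.csi.ceph.com   # RBD CSI driver, prefixed with the operator namespace
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
  # Secret references as in the Rook example manifests (defaults assumed):
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete

Once applied, any namespace can request a volume from this class without anyone touching a disk by hand.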
Mapping the old world to the new one
The easiest way to explain the transformation is side-by-side:
- ceph-deploy new mon1 -> CephCluster CR applied to the cluster
- Manual monitor/manager placement -> mon.count and placement rules in the CR
- ceph-deploy osd create ... --data /dev/sda -> storage.nodes[].devices[] in the Rook config
- ceph -s over SSH -> kubectl -n rook-ceph get cephcluster + toolbox checks
- Manual CephFS mount command -> PVC backed by a CephFS StorageClass
The mental model changed from "run this command on that node" to "declare what the cluster should look like, then reconcile."
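To make the last row of that mapping concrete: what used to be a manual CephFS mount on a client becomes a claim that any workload can request. A minimal sketch, assuming a CephFS-backed StorageClass named rook-cephfs exists and using a hypothetical claim name:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data               # hypothetical claim name
spec:
  accessModes:
    - ReadWriteMany               # CephFS allows shared access across pods
  resources:
    requests:
      storage: 10Gi
  storageClassName: rook-cephfs   # assumed name of the CephFS-backed StorageClass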
A practical migration arc (what worked for me)
I did not "in-place convert" the old cluster. I treated it as a controlled replacement:
- Stand up Kubernetes on ODROID-capable nodes.
- Deploy Rook-Ceph operator and a fresh Ceph cluster definition.
- Create new pools/filesystems and corresponding StorageClass objects.
- Migrate workloads and data gradually (app-by-app).
- Validate durability and performance under normal and failure conditions.
- Decommission the legacy 2018 cluster in mid-2025.
This reduced risk and let me test each application with explicit rollback points.
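The app-by-app part mostly meant repointing each workload at a claim on the new cluster and syncing its data. A hedged sketch of the workload side, reusing the hypothetical shared-data claim from above; the wiki name and nginx image are stand-ins, not anything from the original setup:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wiki                      # hypothetical workload being migrated
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wiki
  template:
    metadata:
      labels:
        app: wiki
    spec:
      containers:
        - name: wiki
          image: nginx:1.27       # stand-in image for the example
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-data   # the CephFS-backed claim on the new cluster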
Example: tiny Rook CephCluster skeleton
This is the shape of the new world, replacing pages of host prep:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.1
  mon:
    count: 3
  storage:
    useAllNodes: false
    nodes:
      - name: odroid-1
        devices:
          - name: /dev/sda
      - name: odroid-2
        devices:
          - name: /dev/sda
      - name: odroid-3
        devices:
          - name: /dev/sda
Even this minimal declaration communicates more operational intent than ad-hoc shell steps ever did.
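The CephFS side of the old build, the MDS plus the manual client mounts, collapses into a similar pair of declarations. Again a sketch under assumptions: myfs, the pool layout, and the rook-cephfs name are placeholders, and the CSI provisioner and secret names follow the defaults from the Rook example manifests.

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs                  # placeholder filesystem name
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - name: replicated
      replicated:
        size: 3
  metadataServer:
    activeCount: 1            # one active MDS, roughly the old ODROID-C2's job
    activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs           # assumed name, matching the PVC example earlier
provisioner: rook-ceph.cephfs.csi.ceph.com   # CephFS CSI driver, prefixed with the operator namespace
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-replicated       # data pool name is <fsName>-<dataPool name>
  # Secret references as in the Rook example manifests (defaults assumed):
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete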
Looking back
The 2018 ODROID Ceph cluster was a great era: raw Linux, direct control, and hard-earned understanding of monitors, OSDs, and CephFS internals.
The Rook/Kubernetes era traded some of that tactile control for consistency, repeatability, and cleaner operations. For me, that was the right trade, especially once the cluster became infrastructure for other systems instead of the main experiment itself.
I still keep the old howto PDF around. Not as a runbook anymore, but as a snapshot of how much hands-on storage engineering I learned before letting controllers take the wheel.
If you’re setting up a storage engine for your Kubernetes platform and want to avoid the common pitfalls, we can help you design and deploy it the right way.