
state, it's what's happening

Josh Berkus

Red Hat OSAS

SCALE 15x 2017

1 / 55

projectatomic logo

2 / 55

kube logo

3 / 55

Kubernetes StatefulSet

A set of APIs for running stateful applications in a Linux container cloud.

(aka PetSet)

4 / 55

Code

github.com/jberkus/atomicdb

5 / 55

docker logo

6 / 55

automated elephant

7 / 55

The way containers are designed, and particularly the way Docker is designed, the assumption is that the container is stateless. -- Mark Davis, ClusterHQ

8 / 55

Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database. -- 12factor.net

9 / 55

"backing service"

10 / 55

amazon RDS logo

11 / 55

giant crap

12 / 55

house no foundation

13 / 55

facebook no data

14 / 55

wikipedia no data

15 / 55

State What?

16 / 55

What is State?

the difference between code and running applications

17 / 55

Minimal State

  • current task
  • memory cache
  • cluster deployment
18 / 55

state scale

Scale Of State

19 / 55

switching cost diagram

Switching Cost

20 / 55

Four Stateful Qualities

  1. Storage
21 / 55

Four Stateful Qualities

  1. Storage
  2. Node Identity
  3. Cluster Role
  4. Session State
22 / 55

giant man

1. Storage

23 / 55

Storage Requirements

  • goes with the container
  • exclusive write access
24 / 55

Storage Solutions

  1. automated dir naming
    in app initialization
  2. automated data migration
    Flocker Volumes
  3. network storage
    (allocated per container)
    Kube StatefulSet PVT
25 / 55

StatefulSet PVT

  • Persistent Volume Template
  • Requests new network storage for each pod
  • Storage associated with the pod (container)
  • Replacements get same storage
26 / 55

StatefulSet PVT

volumeClaimTemplates:
- metadata:
    name: pgdata
    annotations:
      volume.alpha.kubernetes.io/storage-class: anything
  spec:
    accessModes: [ ReadWriteOnce ]
27 / 55

StatefulSet PVT

[root@ip-172-31-47-16 ~]# kubectl get pvc
NAME               STATUS   CAPACITY   AGE
pgdata-patroni-0   Bound    25Gi       59m
pgdata-patroni-1   Bound    25Gi       59m
pgdata-patroni-2   Bound    25Gi       59m
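
The claim names in the listing follow a fixed pattern: template name, StatefulSet name, pod ordinal. A minimal Python sketch of that naming rule, useful for predicting which claim a replacement pod will reattach to. The helper name `pvc_names` is hypothetical; only the `pgdata` and `patroni` names come from the slides.

```python
def pvc_names(template, statefulset, replicas):
    """Return the claim names a StatefulSet creates from one volume claim template.

    Pattern: <template name>-<StatefulSet name>-<pod ordinal>
    """
    return ["{}-{}-{}".format(template, statefulset, i) for i in range(replicas)]


if __name__ == "__main__":
    for name in pvc_names("pgdata", "patroni", 3):
        print(name)
```

Because the name depends only on the ordinal, a replacement for patroni-1 binds to the same pgdata-patroni-1 claim.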
28 / 55

StatefulSet PVT Limitations

  • no local storage (yet)
  • recovery logic
  • garbage collection
29 / 55

batman revealed

2. Identity

30 / 55

multiple man

31 / 55

spiderman

32 / 55

Identity Needs

  • peering nodes
    etcd, cassandra
    replication slots
  • special nodes
    replication master
    reporting nodes
    spares/shadows
33 / 55

Identity Attributes

  1. individual
  2. durable
  3. predictable
  4. addressable
34 / 55

StatefulSet Identity

[centos@ip-172-31-45-224 ~]$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
etcd-0      1/1     Running   0          5d
etcd-1      1/1     Running   0          5d
etcd-2      1/1     Running   0          5d
patroni-0   1/1     Running   0          2d
patroni-1   1/1     Running   0          2d
patroni-2   1/1     Running   0          2d
patroni-3   1/1     Running   0          2d
patroni-4   1/1     Running   0          2d
35 / 55

StatefulSet Identity

  • pods start with -0 and increment
  • pods start in order
  • routable as pod-0.service.namespace.svc.cluster.local
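
The identity rules above are predictable enough to compute. A rough Python sketch, assuming the default `cluster.local` cluster domain and a headless service governing the set; note that Kubernetes places the namespace between the service name and `svc`. The function names are hypothetical.

```python
def pod_dns_names(statefulset, service, replicas,
                  namespace="default", cluster_domain="cluster.local"):
    """Stable DNS names for the pods of a StatefulSet behind a headless service."""
    return [
        "{}-{}.{}.{}.svc.{}".format(statefulset, i, service, namespace, cluster_domain)
        for i in range(replicas)
    ]


def ordinal_of(pod_name):
    """Recover the ordinal from a pod name such as 'patroni-2'."""
    return int(pod_name.rsplit("-", 1)[1])


if __name__ == "__main__":
    for name in pod_dns_names("patroni", "patroni", 3):
        print(name)
    print(ordinal_of("patroni-2"))
```

A container can call something like `ordinal_of` on its own hostname to decide, for example, whether it is the bootstrap node.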
36 / 55

StatefulSet WIP

  • replacement
  • promotion
  • namespacing
  • shadow nodes
  • federation
37 / 55

avengers team

3. Cluster Role

38 / 55

Cluster Role

what is my job in the cluster?

  • replication master
  • shard X
  • storage bucket Y
  • bootstrap node
39 / 55

Cluster Role

  • can (does) change
  • sometimes exclusive
  • leader elections
40 / 55

Cluster Role

DCS to the rescue!

  1. DCS shared config
  2. DCS leader elections
  3. annotations/labels
41 / 55

DCS?

distributed consensus store

  • etcd
  • consul
  • zookeeper
  • embedded RAFT

consistent, fault-tolerant

42 / 55

DCS vs. Embedded

DCS: scale, manage separately. Proven software.

Embedded: Just One Image. Fewer cluster fail scenarios.

43 / 55

Bot/Autopilot

bot working

44 / 55

Bot Code

Code in the application container which automatically manages its Cluster Role using a simple state machine and access to a consensus store.

  • automatic
  • autonomous
  • correct
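
A toy Python sketch of the idea, not the code from the atomicdb repository. It assumes the consensus store offers an atomic "take or renew the leader key with a TTL" operation; `FakeConsensusStore` is an in-memory stand-in for etcd, consul, or zookeeper, and every class and method name here is hypothetical.

```python
class FakeConsensusStore:
    """In-memory stand-in for a consensus store holding one leader key with a TTL."""

    def __init__(self):
        self.leader = None
        self.expires = 0.0

    def acquire_or_renew(self, member, ttl, now):
        """Take the leader key if it is free, expired, or already held by member."""
        if self.leader is None or now >= self.expires or self.leader == member:
            self.leader = member
            self.expires = now + ttl
            return True
        return False


class Bot:
    """Two-state machine: 'master' while holding the leader key, else 'replica'."""

    def __init__(self, name, store, ttl=30.0):
        self.name = name
        self.store = store
        self.ttl = ttl
        self.role = "replica"

    def tick(self, now):
        """Run one pass of the control loop and return the resulting role."""
        holds_key = self.store.acquire_or_renew(self.name, self.ttl, now)
        new_role = "master" if holds_key else "replica"
        if new_role != self.role:
            # A real bot would promote or demote the database at this point.
            self.role = new_role
        return self.role


if __name__ == "__main__":
    store = FakeConsensusStore()
    bots = [Bot("patroni-{}".format(i), store) for i in range(3)]
    print([b.tick(now=0.0) for b in bots])   # patroni-0 takes the key first
    # patroni-0 stops ticking; its key expires after 30 seconds.
    print(bots[1].tick(now=31.0))
```

Each bot acts only on what the store tells it, which is what makes the role change automatic and autonomous; correctness rests on the store's atomicity, which the fake only imitates in a single process.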
45 / 55

Cluster Role

[root@psql-3615923279-dvli8 /]# curl -L etcd:2379/v2/keys/service/patroni01/members/patroni_2
{
...
  {role:master,
   state:running,
   conn_url:postgres://patroni-2.patroni:5432/postgres,
   api_url:http://patroni-2.patroni:8008/patroni,
   xlog_location:67109184}
 }
}
46 / 55

Cluster Role

kubectl get pods -l patroni-role -L patroni-role
NAME        READY   STATUS    PATRONI-ROLE
patroni-0   1/1     Running   replica
patroni-1   1/1     Running   replica
patroni-2   1/1     Running   master
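
The same label can be read programmatically. A small Python sketch that filters pod records by the `patroni-role` label; it assumes input shaped like the `items` array of `kubectl get pods -o json` (each item carrying `metadata.name` and `metadata.labels`). The function name and the sample data are illustrative only.

```python
def pods_with_role(pods, role, label="patroni-role"):
    """Return the names of pods whose role label equals the requested role."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod["metadata"].get("labels", {}).get(label) == role
    ]


if __name__ == "__main__":
    # Sample records mirroring the listing above.
    sample = [
        {"metadata": {"name": "patroni-0", "labels": {"patroni-role": "replica"}}},
        {"metadata": {"name": "patroni-1", "labels": {"patroni-role": "replica"}}},
        {"metadata": {"name": "patroni-2", "labels": {"patroni-role": "master"}}},
    ]
    print(pods_with_role(sample, "master"))
    print(pods_with_role(sample, "replica"))
```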
47 / 55

Bot/DCS Issues

  • switching asynchronous
  • inconsistency
  • failover data loss
  • cluster failure
  • MOAR TESTING!
48 / 55

professor x orders pizza

4. Sessions

49 / 55

Session State

not everything is a REST request

  • downloads
  • database transactions
  • data state
  • auth server sessions
50 / 55

Session State

solutions?

  • discovery DNS
    ... don't follow failover
  • smart proxies
    ... not done yet
51 / 55

Discovery DNS

psql -h pgwrite.patroni.default.svc.cluster.local
psql -h pgread.patroni.default.svc.cluster.local
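
On the client side this comes down to choosing a hostname by intent. A minimal Python sketch, assuming the `pgwrite` and `pgread` names shown above resolve inside the cluster; `pick_host` is a hypothetical helper, not part of any library.

```python
def pick_host(read_only, service="patroni", namespace="default"):
    """Choose the discovery DNS name for a read-only or a read-write connection."""
    endpoint = "pgread" if read_only else "pgwrite"
    return "{}.{}.{}.svc.cluster.local".format(endpoint, service, namespace)


if __name__ == "__main__":
    print(pick_host(read_only=False))
    print(pick_host(read_only=True))
```

The name is resolved when the connection opens, so a session that is already established does not follow a failover.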
52 / 55

Session State Future

Develop smart proxies which read Cluster Role events and automatically reconfigure and fail over connections according to an autonomous rules system.
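
One way the routing core of such a proxy might look, as a speculative Python sketch only: it assumes role changes arrive as (member, role) events, and it leaves out connection handling entirely. `RoleRouter` and its event vocabulary ("master", "replica", "gone") are invented for illustration.

```python
class RoleRouter:
    """Tracks Cluster Role events and answers: where should writes and reads go?"""

    def __init__(self):
        self.master = None
        self.replicas = set()

    def on_event(self, member, role):
        """Apply one role event: 'master', 'replica', or 'gone'."""
        if role == "master":
            self.master = member
            self.replicas.discard(member)
        elif role == "replica":
            self.replicas.add(member)
            if self.master == member:
                self.master = None
        else:  # 'gone': the member left the cluster
            self.replicas.discard(member)
            if self.master == member:
                self.master = None

    def write_target(self):
        """Return the current master, or None while no master is known."""
        return self.master

    def read_targets(self):
        """Return the known replicas in a stable order."""
        return sorted(self.replicas)


if __name__ == "__main__":
    router = RoleRouter()
    for event in [("patroni-2", "master"), ("patroni-0", "replica"),
                  ("patroni-1", "replica"), ("patroni-2", "gone"),
                  ("patroni-0", "master")]:
        router.on_event(*event)
    print(router.write_target(), router.read_targets())
```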

53 / 55

Stateful Solutions

  1. Storage: StatefulSet PVT (70%)
  2. Identity: StatefulSet (90%)
  3. Cluster Role: DCS + Bot (70%)
  4. Session: Discovery DNS (50%)
54 / 55

¿questions?

more jberkus:
www.projectatomic.io
@fuzzychef
jberkus.github.io

more events:
Kubecon EU, March 27, Berlin
Cloud Native PDX Meetup
55 / 55
