Full Auto Cars
Full
Auto
Database

Josh Berkus

Red Hat Project Atomic

KubeCon.EU 2016

1 / 57

auto battle game shot

2 / 57

google car

3 / 57

WIP: waiting for Kubernetes 1.2/1.3

under construction

4 / 57

yak shaving

5 / 57

Demo

6 / 57

Single Master DBs: Problem

  • low availability
  • unidirectional replication
  • very manual HA solutions
7 / 57

Why not multi-master DBs?

just moving the problem around

  • "eventual" consistency
  • network lag
  • maturity issues
  • feature poverty
  • app compatibility
8 / 57

But PG Replication is Awesome!

  • Easy to set up
  • Guaranteed
  • Corruption-free
  • Anti-footgun
  • Combines with DR
9 / 57

y u no guy

Y U No Failover?

10 / 57

"Automated failover is too complicated. You don't want it."

11 / 57

NO!

12 / 57

Hard != Impossible

google car

13 / 57

Hard != Impossible

general autofailover is prohibitive

but ... we can implement common use cases

14 / 57

The 80% Solution

  1. Pool of async replicas
  2. Cheap/replaceable nodes
    Containers
  3. Watchdog service
  4. Auto-promote one replica
  5. Other nodes remaster
  6. Update routing
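
The six steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not how any real tool does it: the `Node` class, the `wal_position` field, the `fail_over` function, and the `routing` dictionary are all hypothetical names, and "most advanced replica" is assumed to mean the one that has replayed the most of the master's log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    # Hypothetical model of one cheap, replaceable database node.
    name: str
    role: str                      # "master", "replica", or "failed"
    alive: bool = True
    wal_position: int = 0          # assumed measure of replication progress
    follows: Optional[str] = None  # node this replica streams from


def fail_over(nodes: List[Node], routing: Dict[str, str]) -> Optional[str]:
    """Watchdog step: promote one replica if the master is down."""
    master = next((n for n in nodes if n.role == "master"), None)
    if master is not None and master.alive:
        return None  # healthy master, nothing to do

    candidates = [n for n in nodes if n.role == "replica" and n.alive]
    if not candidates:
        raise RuntimeError("no live replica to promote")

    # Auto-promote the replica that has replayed the most log.
    new_master = max(candidates, key=lambda n: n.wal_position)
    if master is not None:
        master.role = "failed"
    new_master.role = "master"
    new_master.follows = None

    # The other replicas remaster to the promoted node.
    for n in candidates:
        if n is not new_master:
            n.follows = new_master.name

    # Update routing so clients reach the new master.
    routing["master"] = new_master.name
    return new_master.name


nodes = [
    Node("pg0", "master", alive=False, wal_position=120),
    Node("pg1", "replica", wal_position=118, follows="pg0"),
    Node("pg2", "replica", wal_position=120, follows="pg0"),
]
routing = {"master": "pg0"}
promoted = fail_over(nodes, routing)
```

In this run the dead master `pg0` is replaced by `pg2`, the replica with the highest assumed log position, and `pg1` is repointed at it. The sketch ignores the hard parts the later slides discuss, such as false failover and races between several watchdogs.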
15 / 57

Now, a little history ...

handyrep logo

16 / 57

Handyrep

  • master-controller architecture
  • based on Python Fabric + SSH
  • worked in production
  • worked with any Postgres config
  • pluggable

www.handyrep.org

17 / 57

Handyrep: too general

  • Difficult to install
  • Difficult to debug
  • Over 100 configuration options
  • Scaled poorly
  • Handyrep server was a single point of failure
18 / 57

Zalando

  • No. 1 European online fashion retailer
  • 15m customers
  • 150 databases
  • 24/7/365 operation

... needed automated, decentralized HA

19 / 57

Failover Failure

crushed elephant
  • False failover
  • Misfires
  • Race conditions
20 / 57

split brain

21 / 57

Split Brain and S-M DBs

  • worst possible outcome
  • automated recovery impossible
  • manual recovery painful
22 / 57

St. Francis feeding the flying elephants

Patroni

23 / 57

compose.io announcement

24 / 57
  1. Postgres is a poor store of its own replication state
  2. Smart agents > top-down controllers
25 / 57

Compose Governor

  • Containers
  • Etcd-based consensus
  • Simple PostgreSQL controller

... so we forked it.

26 / 57

How it works

27 / 57

failover in three parts

failover est omnis divisa in partes tres

28 / 57

failover in three parts 2

failover est omnis divisa in partes tres

29 / 57

The Patroni Controller

patroni controller

30 / 57

Patroni controller

  • Python daemon
  • Runs in each container as PID 1
  • Controls Postgres startup/shutdown/config
  • Provides external REST API
  • Enforces opinionated config
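
To make the "smart agent" idea concrete, here is a minimal sketch of the kind of decision a per-node controller might take on each cycle. It is an assumption-laden illustration, not Patroni's actual logic: the function `next_action`, its parameters, and the action names are hypothetical, and the rule that a master without the leader lock always demotes is a deliberately conservative simplification.

```python
from typing import Optional


def next_action(me: str, postgres_running: bool, my_role: str,
                lock_holder: Optional[str]) -> str:
    """Decide what this node's agent should do next (hypothetical rules).

    me           -- this node's name
    my_role      -- "master" or "replica", as Postgres is currently running
    lock_holder  -- name of the node holding the leader lock, or None
    """
    if not postgres_running:
        return "start_postgres"
    if lock_holder == me:
        # We own the lock: keep it alive, or catch up to it by promoting.
        return "renew_lock" if my_role == "master" else "promote"
    if my_role == "master":
        # Running as master without the lock risks split brain.
        return "demote"
    if lock_holder is None:
        return "try_acquire_lock"
    return "follow_leader"


decisions = [
    next_action("pg1", True, "master", "pg1"),
    next_action("pg1", True, "master", "pg2"),
    next_action("pg2", True, "replica", None),
    next_action("pg2", True, "replica", "pg2"),
    next_action("pg2", True, "replica", "pg1"),
]
```

Because every node runs the same small rule set against shared state, no central controller has to be reachable for a failover to proceed.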
31 / 57

Patroni Failover

how patroni works animation leader

32 / 57

What about split-brain?

split brain

47 / 57

Etcd

  • distributed consensus HTTP data store
  • Raft algorithm
  • implements CP (in CAP terms)
  • great for config + metadata
    • not for data data
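
One way a consensus store can guard against split brain is an expiring leader key that only one node can create and only its holder can renew. The sketch below imitates that idea with a single in-process object; it is not the etcd API. The class `LeaderLock`, its methods, and the explicit `now` argument are hypothetical, and the atomicity that a real cluster gets from Raft is simply assumed here because everything runs in one thread.

```python
from typing import Optional


class LeaderLock:
    """In-memory stand-in for one expiring key in a consensus store."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self._holder: Optional[str] = None
        self._expires_at = 0.0

    def holder(self, now: float) -> Optional[str]:
        # An expired key behaves as if it had been deleted.
        if self._holder is not None and now >= self._expires_at:
            self._holder = None
        return self._holder

    def acquire(self, node: str, now: float) -> bool:
        """Create-if-absent: succeeds only when nobody holds a live lock."""
        if self.holder(now) is not None:
            return False
        self._holder = node
        self._expires_at = now + self.ttl
        return True

    def renew(self, node: str, now: float) -> bool:
        """Compare-and-swap: only the current holder may extend the TTL."""
        if self.holder(now) != node:
            return False
        self._expires_at = now + self.ttl
        return True


lock = LeaderLock(ttl=30.0)
first = lock.acquire("pg1", now=0.0)       # pg1 becomes leader
second = lock.acquire("pg2", now=1.0)      # pg2 loses the race
renewed = lock.renew("pg1", now=20.0)      # lock now expires at t=50
stale = lock.acquire("pg2", now=45.0)      # still held by pg1
takeover = lock.acquire("pg2", now=51.0)   # pg1 stopped renewing
late_renew = lock.renew("pg1", now=52.0)   # old leader cannot renew
```

The last call is the important one: a former master that comes back after its key expired finds it cannot renew, so it knows it must not keep accepting writes.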
48 / 57

Etcd Alternatives

  • Zookeeper
    • larger scale
    • supported
  • Consul
    • integrates discovery
    • not (yet) supported
49 / 57

What's AtomicDB?

WIP project

  • PostgreSQL
  • Patroni
  • Atomic Host
  • Kubernetes
  • Dynamic proxy (dev)
  • Cockpit UI (dev)
50 / 57

Let's see that again

51 / 57

The Proxy Problem

  • differentiate master and read-only connections
  • master service needs to follow failover
  • failover logic too complex for Kubernetes (1.1)
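
The routing requirement in the bullets above can be sketched as a function that rebuilds two endpoint lists from the roles the nodes currently report. This is a hypothetical illustration, not the pgbouncer or Kubernetes configuration: the `build_routes` function, the role strings, and the addresses are all assumed for the example.

```python
from typing import Dict, List


def build_routes(members: Dict[str, str]) -> Dict[str, List[str]]:
    """Split nodes into read-write and read-only pools.

    members maps a node address to its reported role
    ("master", "replica", or anything else for unusable nodes).
    """
    masters = sorted(addr for addr, role in members.items() if role == "master")
    if len(masters) > 1:
        # Two self-declared masters looks like split brain: do not guess.
        raise ValueError("more than one master reported")
    replicas = sorted(addr for addr, role in members.items() if role == "replica")
    return {"read_write": masters, "read_only": replicas}


before = build_routes({
    "10.0.0.1:5432": "master",
    "10.0.0.2:5432": "replica",
    "10.0.0.3:5432": "replica",
})
after = build_routes({
    "10.0.0.1:5432": "failed",
    "10.0.0.2:5432": "master",
    "10.0.0.3:5432": "replica",
})
```

Re-running the function after a failover moves the read-write pool to the promoted node, which is the "master service needs to follow failover" behavior; a real proxy would also have to drain or reset existing connections.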
52 / 57

pgbouncer?

  • current implementation in pgbouncer
  • master, read slaves separate services/ports
  • depends on flannel LB

Not good enough. Waiting for Kubernetes 1.2/1.3!

53 / 57

More features

  • pg_rewind support (9.4+)
  • configurable node imaging
    • WAL-E
    • PITR
  • synchronous replication
  • non-failover replicas
54 / 57

More Stuff Under Development

  • cascading replication
  • integrated proxy
  • BDR support?

fork us on Github!

55 / 57

Resources

  • This Presentation:
    jberkus.github.io/full_auto_db
  • Patroni Project:
    github.com/zalando/patroni
  • AtomicDB Project:
    github.com/jberkus/atomicdb
56 / 57

¿questions?

more
jberkus: @fuzzychef, www.databasesoup.com
project atomic: www.projectatomic.io (Red Hat booth for Cockpit Kube demo)

57 / 57
