Date: 2016-11-02
Time: 11:10–12:00
Room: Omega
Level: Intermediate
Yammer, the leading enterprise social network, was established in 2008 and acquired by Microsoft in 2012. PostgreSQL has been Yammer's main datastore since its inception, originally as part of a traditional Rails monolith. Over the years, as we introduced microservices, we have broken PostgreSQL up into many smaller clusters. Even so, we are a small ops team with no dedicated DBA or PostgreSQL DevOps person. I'm going to share how we manage over 10 PostgreSQL clusters in production, spanning 2 datacenters and handling over 40k QPS at peak with an SLA of well over 99.95%, how we collect and monitor metrics, how we keep databases in good shape with compaction and reindexing, and how we do backup and recovery. We also built an in-house tool to assist with replication chain management, replica lag monitoring, load balancing, and automatic failover of read replicas. At the moment, we are working towards moving our PostgreSQL clusters from dedicated datacenters to the Microsoft Azure cloud, with the goal of automating master failover as well.
About the author: Chinh Nguyen is a senior software engineer at Yammer/Microsoft and the current lead for PostgreSQL (and a bunch of other datastores, including HBase).